Soft-clustering Technique for Fingerprint-based Localization

In this paper, the soft-clustering algorithm for the fingerprint-based localization technique is proposed. In an indoor environment, the fingerprint-based localization technique is usually employed since it can deal with signal fluctuation. Its basic principle is to find the target location by comparing its signal parameters with a previously recorded database of known-location-signal parameters. Here, the received signal strength indicator (RSSI) provided by the wireless sensor network (WSN) is used as the signal parameter. The high accuracy of location estimation requires a very fine spatial resolution of the database, corresponding to the time consumed for pattern matching. To reduce the calculation time, clustering can be applied because it can reduce the database size by grouping similar data in the same cluster. The accuracy of the algorithm to cluster the target location and fingerprint locations is the main concern. The result shows that the clustering technique used can successfully cluster the target sensing node into an appropriate cluster. This implies that, by using soft clustering with the fingerprint technique, the target location can be estimated faster than by using classical fingerprint techniques since the target location can be estimated within a small set of fingerprints in the cluster, not with all fingerprints in the database.


Introduction
(3) Localization techniques can be classified into two main types: i) range-based and ii) fingerprint-based techniques. For the former, the target location is estimated by triangulation using the received signal parameters. Therefore, the accuracy of location estimation depends on the quality of the received signal. Consequently, the range-based technique is suited to an open-space environment where there is little or no signal fluctuation. For the fingerprint-based technique, the target location is estimated by comparing its signal parameters with reference signal parameters obtained from known reference locations previously recorded in the database. The known reference locations are called the fingerprint locations, and the signal parameter obtained from a fingerprint location is called the fingerprint information, or "fingerprint" for short. The reference location providing the best match of the signal parameters to those from the target is returned as the target location.
(4) The key advantage of this technique is that it can be utilized in a multipath environment, but the disadvantage is that it is time-consuming because it is a two-phase process. Moreover, the accuracy of the location estimation depends on the spatial resolution of the database. However, the higher the spatial resolution of the database, the higher the time consumption. In this paper, we focus on location estimation in an indoor environment because of its various applications. Therefore, the fingerprint-based localization technique is used because it can deal with the multipath effect in an indoor environment. (7)(8)(9)(10) It is supported by the advancement of micro-mechanical systems and the development of digital electronic technology. The ZigBee module, which is based on the IEEE 802.15.4 standard, (11) is widely used because of its various advantages, i.e., cost effectiveness, low power consumption, security, robustness, and reliability. Moreover, since ZigBee provides a useful signal parameter for location estimation, i.e., the received signal strength indicator (RSSI), it is employed in this work.
One of the challenges of fingerprint-based localization is to reduce the computation time. Clustering algorithms have been used in many research studies, particularly to reduce the computation time of the positioning phase. The aim of clustering is to partition the location database by grouping similar data in the same cluster. This reduces the pattern-matching time in the positioning phase. In general, there are two main types of clustering algorithm: hard clustering, which assigns each data point to exactly one cluster, and soft clustering, which allows a data point to belong to more than one cluster to some degree.
(12)(15) Although the k-means algorithm is fast and simple, it often fails in cluster selection. Since signals always fluctuate in an indoor environment, cluster selection may fail, resulting in a location estimation error. Therefore, to deal with this problem, a soft-clustering algorithm that allows overlapping is used. Kuo et al. proposed two clustering techniques extended from the k-means algorithm that allow overlap. (16) These techniques differ in the way they define the similarity of feature vectors. Simulation results show that these techniques are more efficient than the traditional k-means algorithm. (19)(20)(21) We proposed the fingerprint-based localization technique using fuzzy c-means (FCM) in WSN in our previous study. (22) A reduction in pattern-matching time was expected, but the time consumption with and without FCM differed only very slightly because the number of fingerprint locations in the experiment was very small. Nevertheless, we investigated the clustering accuracy and found that FCM can cluster the target location in the appropriate cluster. That experiment was conducted in an indoor open area, where the effects of multipath fading are low. However, the clustering accuracy for indoor environments with high multipath effects should also be investigated.
This paper presents work extended from our previous study. FCM as the soft-clustering algorithm is applied with the fingerprint-based localization technique in WSN. To investigate the accuracy of clustering in an environment where signals markedly fluctuate because of severe multipath effects, a warehouse of a supplier of construction materials is chosen for study in this work. More fingerprint locations are assigned, and the data measured from those locations, together with interpolation, are used to validate the reduction in pattern-matching time.
This paper is organized as follows. Section 2 briefly explains the materials and methods used, including the implementation of the soft-clustering algorithm (FCM) for the fingerprint-based localization technique, as well as the experimental system and setup. Section 3 shows the results and discussion. Finally, the conclusions are given in Sect. 4.

Fundamental concept of fingerprint-based localization technique
The fingerprint-based localization technique involves a two-phase process. In the first phase, called the offline or training phase, the received signal information at each selected reference location is recorded in the database. The received signal information is referred to as the fingerprint information (or fingerprint for short) and the selected reference location is referred to as the fingerprint location. Then, in the second phase, called the online or positioning phase, a pattern-matching algorithm is used to infer the location of the target by comparing the currently observed signal information with the prerecorded fingerprint information in the database. (3,4,14) In this work, the indoor localization system is implemented in a WSN. The location of the target node is estimated within the network with respect to some of the reference sensor nodes. ZigBee, which is based on the IEEE 802.15.4 standard, is employed in this work because of its various advantages and especially because it provides a useful parameter for location estimation, i.e., RSSI. Let $R$ be the number of reference sensor nodes and $L$ the number of fingerprint locations. The fingerprint information $d_l$ at the $l$th fingerprint location with the coordinates $(x_l, y_l)$ can be represented as a row vector of RSSIs received from all reference nodes:
$$d_l = [\mathrm{rssi}_{l1}, \mathrm{rssi}_{l2}, ..., \mathrm{rssi}_{lr}, ..., \mathrm{rssi}_{lR}],$$
where $\mathrm{rssi}_{lr}$ is the RSSI obtained from the $r$th reference node ($r = 1, 2, ..., R$) at the $l$th fingerprint location ($l = 1, 2, ..., L$). The concept of fingerprint-based localization in WSN is shown in Fig. 1.
When the target node is in the fingerprint area, it collects RSSIs from all reference nodes as a row vector of the same form as $d_l$. The location of the target node can then be estimated using a pattern-matching algorithm: the current signal information of the target node is compared with the fingerprints previously stored in the database. The nearest-neighbor algorithm (NNA) (4) is one well-known pattern-matching algorithm. Because of its simplicity, NNA is applied in this work. The fingerprint location providing the best match to the target information is returned as the estimated location of the target node. As mentioned, the finer the spatial resolution of the database, the more accurate the estimate of the target location. Nevertheless, with a fine spatial resolution of the database, such an algorithm is time-consuming. In the following section, the soft-clustering algorithm is applied to the fingerprint-based technique to reduce the set of fingerprint information that must be searched, leading to faster calculation of the target location.
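The pattern-matching step described above can be illustrated with a minimal sketch. It assumes that the "best match" is the smallest Euclidean distance between RSSI vectors, which the text does not state explicitly; the function name `nearest_fingerprint`, the dictionary layout of the database, and all RSSI values are hypothetical.

```python
import math

def nearest_fingerprint(target_rssi, database):
    """Return the (x, y) of the fingerprint whose RSSI vector is closest
    to the target's RSSI vector (Euclidean distance in signal space,
    which is an assumption of this sketch)."""
    best_location, best_distance = None, math.inf
    for location, rssi in database.items():
        distance = math.dist(target_rssi, rssi)
        if distance < best_distance:
            best_location, best_distance = location, distance
    return best_location

# Hypothetical database: 3 fingerprint locations, R = 3 reference nodes (dBm).
database = {
    (0.5, 1.5): [-45.0, -60.0, -72.0],
    (0.5, 2.5): [-50.0, -57.0, -70.0],
    (0.5, 3.5): [-56.0, -52.0, -66.0],
}
target = [-51.0, -58.0, -69.0]
estimated = nearest_fingerprint(target, database)
print(estimated)
```

Every fingerprint is visited once per query, so the matching cost grows linearly with the number of fingerprint locations; this is the cost that clustering is intended to reduce.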

Soft clustering for fingerprint-based localization
As mentioned, a soft-clustering algorithm such as FCM is a method of data clustering that allows one piece of data to belong to two or more clusters. The FCM algorithm minimizes the objective function $J_m(U, V)$ for the partition of the data set $D$ of $N$ samples into $C$ clusters:
$$J_m(U, V) = \sum_{n=1}^{N} \sum_{c=1}^{C} (u_{c,n})^m \, \|d_n - v_c\|^2,$$
where $d_n$ is the feature vector containing $R$ features, i.e., $d_n = [d_{n1}, d_{n2}, ..., d_{nr}, ..., d_{nR}]$, $v_c$ is the center of the $c$th cluster, $u_{c,n}$ is the membership of $d_n$ in the $c$th cluster, and $m$ is the weighting exponent ($1 \le m < \infty$) controlling the relative weighting of the membership of the clusters. In other words, it controls the fuzzy overlap between the clusters, i.e., the higher the value of $m$, the higher the degree of overlap. Bezdek et al. (17) reported that, although there is no theoretical basis for choosing the optimum value of $m$, $1.5 \le m \le 3$ provides good results for most data, and Cannon et al. (18) mentioned a useful range of $1.1 \le m \le 5$. In many research works using the FCM algorithm, $m = 2$ is selected, such as in the works of Sayadi et al. (19) and Das. (20)
The FCM algorithm can be summarized as follows. In this work, the data set of $N$ samples comprises $L$ feature vectors from the fingerprint locations and one from the target location ($N = L + 1$). The feature vector $d_n$ contains the RSSI values received from all reference nodes and can be written as $d_n = [\mathrm{rssi}_{n1}, \mathrm{rssi}_{n2}, ..., \mathrm{rssi}_{nr}, ..., \mathrm{rssi}_{nR}]$, where $\mathrm{rssi}_{nr}$ is the $r$th feature of the $n$th feature vector and $R$ is the number of reference nodes.
Step 1: Fix the number of clusters to $C = 10$, the weighting exponent to $m = 2$, and the threshold to $\varepsilon = 0.00001$. Then randomly initialize the membership matrix $U^{(0)}$. After that, start from the iteration $t = 1$ and proceed to the end of the algorithm.
Step 2: Calculate the cluster centers
$$v_c = \frac{\sum_{n=1}^{N} (u_{c,n})^m \, d_n}{\sum_{n=1}^{N} (u_{c,n})^m}.$$
Step 3: Update the membership matrix $U^{(t)}$ containing the membership function $u_{c,n}$ using
$$u_{c,n} = \frac{1}{\sum_{j=1}^{C} \left( \dfrac{\|d_n - v_c\|}{\|d_n - v_j\|} \right)^{2/(m-1)}}.$$
Step 4: Calculate the objective function $J_m$.
Step 5: Compare $U^{(t)}$ and $U^{(t-1)}$. If $\|U^{(t)} - U^{(t-1)}\| < \varepsilon$ (where $\varepsilon$ is a specified minimum threshold), stop. Otherwise, increase the iteration and return to Step 2.
As previously mentioned, the FCM algorithm has many benefits, including its applicability to clustering multichannel data. In this work, it is deployed to cluster the fingerprint information and the target information, each of which contains RSSI values from $R$ reference sensor nodes. For the estimation of the target location by the common fingerprint-based technique, all recorded fingerprints in the database are compared with the current signal information of the target node. Here, clustering is used to group similar fingerprints into clusters, one of which also contains the signal information of the target node. This is beneficial when there are many fingerprints, since pattern matching is then restricted to a single cluster.
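The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `fcm`, the `max_iter` cap, the fixed random seed, the use of the largest absolute membership change as the stopping norm, and the toy RSSI values are all hypothetical. Step 4 is omitted because the stopping test in this sketch uses only the membership matrix, and $m > 1$ is required by the update formula.

```python
import math
import random

def fcm(data, C, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means on a list of equal-length feature vectors (m > 1).
    Returns (U, V): U[c][n] is the membership of sample n in cluster c,
    V[c] is the centre of cluster c from the last centre update."""
    rng = random.Random(seed)
    N, R = len(data), len(data[0])
    # Step 1: random membership matrix whose columns each sum to 1.
    U = [[rng.random() for _ in range(N)] for _ in range(C)]
    for n in range(N):
        total = sum(U[c][n] for c in range(C))
        for c in range(C):
            U[c][n] /= total
    for _ in range(max_iter):
        # Step 2: cluster centres as membership-weighted means.
        V = []
        for c in range(C):
            w = [U[c][n] ** m for n in range(N)]
            sw = sum(w)
            V.append([sum(w[n] * data[n][r] for n in range(N)) / sw
                      for r in range(R)])
        # Step 3: update memberships from distances to the centres.
        U_new = [[0.0] * N for _ in range(C)]
        for n in range(N):
            d = [math.dist(data[n], V[c]) for c in range(C)]
            zeros = [c for c in range(C) if d[c] == 0.0]
            if zeros:
                # Sample coincides with a centre: split membership there.
                for c in zeros:
                    U_new[c][n] = 1.0 / len(zeros)
            else:
                for c in range(C):
                    U_new[c][n] = 1.0 / sum(
                        (d[c] / d[j]) ** (2.0 / (m - 1.0)) for j in range(C))
        # Step 5: stop when the largest membership change is below eps.
        change = max(abs(U_new[c][n] - U[c][n])
                     for c in range(C) for n in range(N))
        U = U_new
        if change < eps:
            break
    return U, V

# Toy data: two well-separated groups of RSSI vectors (hypothetical, dBm).
data = [[-40.0, -70.0], [-42.0, -69.0], [-41.0, -71.0],
        [-70.0, -40.0], [-69.0, -42.0], [-71.0, -41.0]]
U, V = fcm(data, C=2)
labels = [max(range(2), key=lambda c: U[c][n]) for n in range(len(data))]
print(labels)
```

Taking the cluster with the highest membership for each sample, as in the last line, turns the soft partition into a single label per sample; the membership values themselves remain available when overlap between clusters is of interest.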

Material
In this work, the ZigBee node, which is based on the IEEE 802.15.4 standard, is used as an active electronic complex material since it provides a useful signal parameter for location estimation, i.e., RSSI. It uses the 2.4 GHz ISM band with a maximum data rate of 250 kbps. The ZigBee module used is the XBee 2mW Wire Antenna-Series 2 (ZB) with a transmit power of 2 mW (+3 dBm) and a receiver sensitivity of −96 dBm. Its operating current is 40 mA at 3.3 V in the boost mode. Its coverage range in an indoor environment is up to 40 m. (23) The network consists of 11 nodes: 10 reference nodes placed at fixed, known locations and 1 observed (target) node whose location can be changed for the measurement testing. Figure 2 shows the ZigBee model that can be employed for the reference nodes and the target node.

Experimental setup
Measured data are needed to demonstrate that this method can be applied in practice. In this work, a warehouse of a supplier of construction materials is chosen for study. In the experiment, 10 sensor nodes are used as reference nodes and placed on the lowest shelves at a height of 1 m. Figure 3 shows the experimental layout, where the triangles represent reference sensor nodes. The 68 circled numbers represent fingerprint locations assigned along the aisles with the coordinates shown in Fig. 3(a). Figure 3(b) shows part of the experimental area, in which the width of each aisle is 1 m; it also shows the placement of one of the reference sensor nodes (R 3 in this figure). For the measurement in the offline phase, the RSSIs received from the 10 reference sensor nodes are measured at each fingerprint location. The measurement is repeated 5 times at each fingerprint location, and then the RSSIs received from the reference sensor nodes are averaged and stored in the database.
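The construction of one database entry from the repeated measurements can be sketched as follows. The sketch assumes a plain arithmetic mean of the RSSI readings per reference node; the function name `build_fingerprint` and the readings (shown for 3 reference nodes rather than 10, for brevity) are hypothetical.

```python
from statistics import mean

def build_fingerprint(readings):
    """Average repeated RSSI readings per reference node.
    `readings` holds one list per repetition, one value per reference node."""
    return [mean(column) for column in zip(*readings)]

# 5 repetitions x 3 reference nodes (hypothetical values, dBm).
readings = [
    [-50, -61, -70],
    [-52, -60, -71],
    [-51, -62, -69],
    [-49, -61, -70],
    [-53, -61, -70],
]
fingerprint = build_fingerprint(readings)
print(fingerprint)
```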

Signal profile
Before estimating the target location, the signal propagation in the experimental area is observed by measuring the received signal power every 1 m as the receiver node is moved away from the transmitter node. The experiment is conducted in the first aisle, where the transmitter node is placed at the coordinates (0.5, 1.5). The measurement begins when the receiver node is at the coordinates (0.5, 2.5). Then, the same measurement is repeated as the receiver node is moved in 1 m steps from the coordinates (0.5, 2.5) to (0.5, 10.5). As is usual in an indoor environment, the received signal fluctuates, as shown in Fig. 4. However, it can be easily observed that the signal power decreases as the distance between the transmitter and the receiver increases, i.e., as the receiver moves away from the transmitter.

Location estimation
To estimate the location of the target node using the fingerprint-based localization technique, both with and without FCM, 30 observed target locations are selected and represented by the ovals with numbers inside, as shown in Fig. 5. Some of the observed target locations are at the exact fingerprint locations and some are between fingerprint locations. At each observed location, the target node is stopped to measure the RSSIs from all reference sensor nodes, with the measurement repeated 5 times. Then, the target node

Results and Discussion
This section shows the results of clustering and the accuracy of the estimated target location, as well as the investigation of the reduction in computation time.

Cluster results
According to the experimental setup, the location of the target is known and the cluster for each target location can be expected. In the experiments, the number of clusters C for the FCM algorithm is set to 10, i.e., there are 10 clusters or groups. As previously mentioned, the C value must not exceed the number of samples, N, in the data set D. Therefore, the C value of 10 is acceptable since the number of samples is 69 (68 fingerprints stored in the database and 1 target node).
Because of page limitations, 6 observed locations are selected from the 30 observed target locations to show the capability of FCM to cluster the target node in the desired cluster of fingerprints. The examples of clustering results are illustrated in Fig. 6. The desired cluster is expected to consist of the target location and the fingerprint locations within 3 m of the target location, which corresponds to a received power of −55 dBm (see Fig. 4). This range covers just the next aisle from the target, and the received power drops significantly beyond 3 m according to the signal propagation in Fig. 4. Clustering examples of 6 observed locations, i.e., location numbers 3, 8, 16, 19, 24, and 27, are shown in Fig. 6, where the desired clusters are represented by the blocks outlined with the dash-dotted line and the estimated clusters are represented by the blocks outlined with the dotted line. For the estimated clusters, the 10 clusters of each data set can overlap each other in accordance with the degree of membership; nevertheless, only the cluster in which the target data have the highest membership value is shown in Fig. 6. It can be seen that the target location is clustered successfully even though the members of the estimated cluster are not completely the same as those of the desired cluster.
On the basis of the results, for all observed target locations, FCM is able to assign the target location into the appropriate cluster.This implies that the fingerprint-based localization with FCM is able to calculate the target location faster than that without FCM, since the target location can be calculated by comparing the signal parameters of the target with the fingerprints only within this cluster, not with all fingerprints in the database.
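Restricting the pattern matching to the target's cluster can be sketched as follows. The sketch assumes that the target's cluster is the one in which it has the highest membership, that a fingerprint is a candidate when its own highest membership falls in that same cluster, and that matching uses the Euclidean distance between RSSI vectors; the function name `match_within_cluster`, the membership values, and the RSSI values are hypothetical.

```python
import math

def match_within_cluster(U, fingerprints, locations, target_rssi):
    """U is a C x (L + 1) membership matrix whose last column is the target.
    Returns (estimated location, number of fingerprints compared)."""
    C, L = len(U), len(fingerprints)
    # Cluster in which the target has its highest membership.
    target_cluster = max(range(C), key=lambda c: U[c][L])
    # Fingerprints whose highest membership is in that same cluster.
    candidates = [n for n in range(L)
                  if max(range(C), key=lambda c: U[c][n]) == target_cluster]
    if not candidates:
        # Fallback: search the whole database if the cluster is empty.
        candidates = list(range(L))
    best = min(candidates,
               key=lambda n: math.dist(fingerprints[n], target_rssi))
    return locations[best], len(candidates)

# Hypothetical example: 4 fingerprints, 2 clusters, target in the last column.
fingerprints = [[-45.0, -70.0], [-47.0, -68.0], [-70.0, -45.0], [-68.0, -47.0]]
locations = [(0.5, 1.5), (0.5, 2.5), (4.5, 1.5), (4.5, 2.5)]
target = [-46.5, -68.5]
U = [[0.95, 0.90, 0.05, 0.10, 0.92],
     [0.05, 0.10, 0.95, 0.90, 0.08]]
result = match_within_cluster(U, fingerprints, locations, target)
print(result)
```

In this toy case only 2 of the 4 fingerprints are compared with the target, which is the source of the expected saving in pattern-matching time.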

Location estimation results
The estimated location errors at all observed locations of the target node are presented in this subsection. The estimated location errors with and without FCM are shown in Fig. 7. The experiment was conducted in the warehouse, in which the fingerprint locations and observed locations of the target node are along the aisles. As mentioned, experiments were conducted for 30 observed target locations, some of which were at the exact fingerprint locations and some of which were midway between fingerprint locations. Zero error is expected for the observed target locations at the exact fingerprint locations, for which these fingerprint locations are returned as the estimated locations of the targets. For an observed target location between two fingerprints, an error of 0.5 m is expected. Nevertheless, since the spacing between fingerprint locations is 1 m, an estimated location error of up to 1 m is acceptable. In this case, the nearest fingerprint location is returned as the estimated location of the target. It can be seen that the results with and without FCM are the same. Moreover, the results for all observed target locations are satisfactory. Specifically, there are no errors in the estimated target locations at the exact fingerprint locations, i.e., location numbers 1, 6, 8, 11, 14, 16, 18, 21, 24, 28, and 30. Moreover, there are 0.5 m errors for almost all estimated target locations between two fingerprints, except location numbers 7 and 20, whose estimated location errors are 1 m. These two locations are in the aisle parallel to the shelves along the wall, on which metals are kept. Therefore, the estimated location errors may come from the mirrorlike reflection from the metals. Nevertheless, an error of 1 m is still acceptable, as mentioned above.
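The expected errors of 0 m and 0.5 m can be checked with a short calculation. The sketch assumes that the location error is the Euclidean distance between the true and estimated coordinates; the function name `location_error` and the coordinates are hypothetical, chosen on a 1 m fingerprint grid.

```python
import math

def location_error(true_xy, estimated_xy):
    """Euclidean distance (in metres) between true and estimated locations."""
    return math.dist(true_xy, estimated_xy)

# Target exactly at a fingerprint location: that fingerprint is returned.
on_grid = location_error((0.5, 2.5), (0.5, 2.5))
# Target midway between fingerprints at (0.5, 2.5) and (0.5, 3.5):
# either neighbour is 0.5 m away.
midway = location_error((0.5, 3.0), (0.5, 3.5))
# Target midway, but the estimate is one fingerprint further along the aisle.
one_off = location_error((0.5, 3.0), (0.5, 4.5))
print(on_grid, midway, one_off)
```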

Investigation of computation-time reduction
Regarding the computation time, FCM is expected to reduce the time needed to estimate the target location, since the number of fingerprints to be compared is smaller than that without FCM. Nevertheless, in our work, only 68 measured fingerprint locations are used and, even when FCM is applied, the computation time for estimating the target location is almost the same as that without FCM. This is because the number of fingerprint locations after clustering is not significantly different from that before clustering. To obtain more fingerprint locations, the measured data are interpolated. (24) These interpolated data are further used to investigate how much FCM can help reduce the computation time for estimating the target location. In this test, all parameters used for the FCM algorithm are the same as those described in Sect. 2.2, i.e., C = 10, m = 2, and ε = 0.00001. It is found that, for hundreds of data, the computation times obtained with and without FCM are almost the same. For thousands of data, the computation time with FCM is shorter than that without FCM by seconds. Moreover, for tens of thousands of data, the computation time with FCM is clearly shorter than that without FCM, by minutes. Nevertheless, it should be noted that the computation time depends on the CPU and the parameters assigned for FCM and thus may differ from ours. Therefore, FCM can support fingerprint-based localization by shortening the computation time, especially for a large number of fingerprints.
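One way to obtain additional fingerprints from measured ones is sketched below. The interpolation method used in this work is the one given in the cited reference and is not described here, so the sketch simply assumes linear interpolation of both the coordinates and the RSSI values between two adjacent fingerprints; the function name `interpolate_fingerprints` and all values are hypothetical.

```python
def interpolate_fingerprints(loc_a, rssi_a, loc_b, rssi_b, steps):
    """Return `steps` evenly spaced synthetic fingerprints strictly between
    fingerprints a and b, using linear interpolation (an assumption)."""
    synthetic = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)  # fraction of the way from a to b
        loc = tuple(a + t * (b - a) for a, b in zip(loc_a, loc_b))
        rssi = [a + t * (b - a) for a, b in zip(rssi_a, rssi_b)]
        synthetic.append((loc, rssi))
    return synthetic

# Two measured fingerprints 1 m apart (hypothetical values, dBm).
extra = interpolate_fingerprints((0.5, 2.5), [-50.0, -60.0],
                                 (0.5, 3.5), [-54.0, -58.0], steps=3)
for loc, rssi in extra:
    print(loc, rssi)
```

Repeating this between every pair of adjacent fingerprints multiplies the database size, which makes it possible to compare matching times with and without clustering for much larger numbers of fingerprints than were measured.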

Conclusions
We presented the soft-clustering algorithm, i.e., FCM, for indoor fingerprint-based localization in WSN. To investigate the accuracy of clustering in an indoor environment with severe multipath effects, the experiment was conducted in the warehouse of a supplier of construction materials. The clustering results showed that the target location can be successfully grouped into the appropriate cluster. Therefore, the location estimation results with and without FCM are the same. Nevertheless, at two observed locations near the shelves along the wall, estimated location errors of 1 m were found. This is not because of a failure of clustering, since the estimation with and without FCM gave the same results. Instead, it may be because of the mirrorlike reflection from the metals kept on the shelves. The identification of the mirrorlike reflection requires further investigation. To investigate the reduction in pattern-matching time with FCM, measured data were interpolated to obtain more fingerprints. The results show that FCM can help reduce the calculation time.
Notation for the FCM objective function $J_m(U, V)$: $d_n = [d_{n1}, d_{n2}, ..., d_{nr}, ..., d_{nR}]$, where $d_{nr}$ is the $r$th feature of the $n$th feature vector. $v_c$ is the center of the $c$th cluster, and the matrix $U$ of size $(C \times N)$ is the fuzzy C-partition matrix of $D$ containing the membership function $u_{c,n}$, which can be defined by $u_{c,n} = u_c(d_n)$ in the range of $[0, 1]$ such that $\sum_{c=1}^{C} u_{c,n} = 1$ for all $n$ and $0 < \sum_{n=1}^{N} u_{c,n} < N$ for all $c$. $V$ is the matrix containing all cluster centers ($V = [v_1, v_2, ..., v_c, ..., v_C]$). The norm $\|d_n - v_c\|$ is the Euclidean distance between the sample $d_n$ and the cluster center $v_c$.

Fig. 2. (Color online) ZigBee model used for reference sensor nodes and target sensor node.

Fig. 3. (Color online) (a) Experimental layout of locations of fingerprints and reference sensor nodes, and (b) reference node No. 3 placed at the lowest level of the shelf with a height of 1 m in the warehouse.

Fig. 4. Received power as a function of distance from the transmitter node.

Fig. 5. (Color online) Experimental layout of all observed locations of the target node.

Fig. 7. Estimated location errors of all observed target nodes.