Artificial Beacons with RGB-D Environment Mapping for Indoor Mobile Robot Localization

The most important issue in intelligent mobile robot development is the ability to navigate autonomously in an environment to complete certain tasks. Thus, the indoor localization problem of a mobile robot has become a key component of real applications. In general, mobile robot localization techniques fall into two categories: robot pose tracking and robot global localization. In pose tracking problems, such as the simultaneous localization and mapping (SLAM) process, the robot finds its new pose by associating observed landmarks or features with the knowledge base of its previous observations. In global localization, a robot has no knowledge of its previous pose; it must find its pose directly in the environment, such as by using a global positioning system (GPS). An artificial beacon-based localization technique, such as one based on the received signal strength indicator (RSSI), suffers from higher pose uncertainty; however, an artificial beacon can provide a good initial reference for robot position mapping. The gyrocompass of a mobile robot is suited to short-term dead reckoning, and its RGB-D camera can record meaningful features or landmarks in 3D space. The purpose of this work is to fuse the advantages of these sensors, via strategy control with a particle filter, to enhance the estimation accuracy of indoor mobile robot localization.


Introduction
The issue of intelligent mobile robot self-localization can be divided into two categories: robot pose tracking and robot global localization. In the pose tracking problem, such as the simultaneous localization and mapping (SLAM) process, the robot has to find its new pose using knowledge of its previous pose, i.e., by associating the observed landmarks with those in its knowledge base. In the global localization problem, a robot does not have knowledge of its previous pose; it has to find its new pose directly in the environment, such as by using a global positioning system (GPS). Furthermore, in real mobile robot applications, the kidnapping problem arises, which is a combination of these two localization problems. For example, while the mobile robot is performing extended Kalman filter (EKF)-SLAM, it may suddenly be kidnapped, i.e., moved to another pose, after which it can no longer associate the landmarks in its knowledge base. Localization then fails, and the robot may behave dangerously.
In this work, a fused concept combining RGB-D visual landmark extraction, a gyro-odometer, and artificial beacons for mobile robot indoor localization is proposed, as shown in Fig. 1. First, a SLAM process with an RGB-D camera is executed to construct an environment map with visual features. Ideally, the RGB-D camera captures all significant features in the indoor environment, but in practice, visual feature extraction is constrained by the light, shapes, and colors of the environment. Hence, in real applications, we added artificial beacons, such as infrared (IR) or radio beacons, to the environment and marked them on the map to help indoor mobile robot localization. Finally, the experimental results show that the proposed concept actually increases the accuracy of robot pose association and localization.

Localization and mapping development of a mobile robot
An intelligent mobile robot is an artificial system that perceives the environment and its own status through sensors in order to navigate in an uncertain environment and complete certain tasks. Thus, the SLAM problem, also known as concurrent mapping and localization (CML), is one of the fundamental challenges of intelligent mobile robot development. The SLAM problem deals with the uncertainty in the pose of a robot when the environment is only partially known or completely unknown, and uses sensor measurements to estimate the robot pose and construct an incremental environment map simultaneously. Many studies and technologies have been developed in this field. (1) To date, SLAM techniques have been classified into two categories: feature-based SLAM (2) and graph-based SLAM. (3) Feature-based SLAM applies estimation methods from Bayesian probability, and graph-based SLAM uses globally optimal estimation techniques for alignment based on relative observations.

RGB-D environment SLAM
Visual recognition is a critical capability for a robot to behave autonomously. Since the RGB-D sensor of the Microsoft Kinect, (4) as shown in Fig. 2(a), was announced for the Xbox 360 console in 2009, it has been possible to capture color images and depth maps easily and at low cost, and therefore to use the Kinect for various visual recognition studies, including SLAM and the recognition of objects, human gestures, and actions. Figure 2(b) shows results from RGB-D SLAM.

Visual landmark and feature extraction
Numerous computer vision techniques have been proposed to model landmarks for a robot navigating in indoor environments. They all rely on two assumptions: (1) landmarks have to be easily detected in the image signal; and (2) landmarks must be locally characterized to distinguish them from other features. Most recent work also uses points to define landmarks, taking advantage of new, powerful interest point detection and characterization algorithms such as the scale-invariant feature transform (SIFT), (5) shown in Fig. 3, which has emerged as an effective methodology for general object recognition as well as for other machine vision applications. An important aspect of this approach is that it generates a large number of features, each recording the location, scale, orientation, and gradient magnitude of a keypoint within a local region.
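The core idea behind such descriptors can be illustrated with a toy example. The following sketch (an illustration only, not the SIFT implementation) describes a local image patch by a magnitude-weighted histogram of gradient orientations; real SIFT computes a 4x4 grid of such histograms with Gaussian weighting over a scale-space keypoint.

```python
import numpy as np

def orientation_histogram(patch, bins=8):
    """Toy SIFT-style descriptor: a histogram of gradient orientations
    over a local patch, weighted by gradient magnitude.
    (Real SIFT uses a 4x4 grid of such histograms with Gaussian weighting.)"""
    gy, gx = np.gradient(patch.astype(float))       # image gradients (rows, cols)
    mag = np.hypot(gx, gy)                          # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)     # orientation in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist              # normalize for illumination invariance
```

For a patch containing a single horizontal intensity ramp, all gradient energy falls into the first orientation bin, which is the locally distinctive signature such descriptors rely on.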

Bayesian filtering SLAM
SLAM using Bayesian filtering involves finding appropriate representations for modeling both an observation and motion, as shown in Fig. 4.
The observation model describes the probability of making an observation z_k at time k, given the vehicle location x_k and a map m (including the landmark locations). It is generally described as P(z_k | x_k, m). The motion model describes the probability distribution of transitions in the robot's state as P(x_k | x_{k-1}, u_k). The state transition is assumed to be a Markov process in which the next pose x_k depends only on the previous state x_{k-1} and the applied control input u_k. The SLAM algorithm is implemented as a standard two-step recursive (sequential) prediction and correction process.
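Written out in the standard Bayes filter form (a reconstruction consistent with the observation model P(z_k | x_k, m) and motion model P(x_k | x_{k-1}, u_k) above, with m denoting the map), the two steps are:

```latex
\text{Prediction:}\quad
P(x_k \mid z_{1:k-1}, u_{1:k}, m) =
\int P(x_k \mid x_{k-1}, u_k)\, P(x_{k-1} \mid z_{1:k-1}, u_{1:k-1}, m)\,\mathrm{d}x_{k-1}

\text{Correction:}\quad
P(x_k \mid z_{1:k}, u_{1:k}, m) \propto
P(z_k \mid x_k, m)\, P(x_k \mid z_{1:k-1}, u_{1:k}, m)
```

The prediction step propagates the previous belief through the motion model, and the correction step reweights it by the likelihood of the new observation.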
The solution involves an appropriate representation for both the motion model and the observation model that allows efficient and consistent computation of the prior and posterior distributions. The most popular of the state-of-the-art SLAM methods is the application of the EKF. The EKF linearizes the nonlinear motion model at an estimated linearization point, uses a first-order approximation to represent the state, and involves a Jacobian matrix calculation. The SIFT RGB-D feature map can be constructed directly with the EKF-SLAM, as shown in Fig. 5.
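As a concrete illustration of the EKF's linearized prediction and correction (a minimal sketch, not the toolbox implementation used in this work), the following shows one predict/correct cycle for a planar robot observing a known point landmark with a range-bearing sensor; the unicycle motion model and noise covariances are illustrative assumptions.

```python
import numpy as np

def ekf_predict(mu, Sigma, u, dt, R):
    """Prediction: propagate pose mean and covariance through the motion model."""
    x, y, th = mu
    v, w = u
    # Nonlinear unicycle motion model
    mu_bar = np.array([x + v * dt * np.cos(th),
                       y + v * dt * np.sin(th),
                       th + w * dt])
    # Jacobian of the motion model w.r.t. the state (first-order linearization)
    G = np.array([[1.0, 0.0, -v * dt * np.sin(th)],
                  [0.0, 1.0,  v * dt * np.cos(th)],
                  [0.0, 0.0,  1.0]])
    return mu_bar, G @ Sigma @ G.T + R

def ekf_correct(mu, Sigma, z, landmark, Q):
    """Correction: fuse a range-bearing observation of a known landmark."""
    dx, dy = landmark[0] - mu[0], landmark[1] - mu[1]
    q = dx**2 + dy**2
    z_hat = np.array([np.sqrt(q), np.arctan2(dy, dx) - mu[2]])  # expected measurement
    # Jacobian of the observation model w.r.t. the state
    H = np.array([[-dx / np.sqrt(q), -dy / np.sqrt(q),  0.0],
                  [ dy / q,          -dx / q,          -1.0]])
    K = Sigma @ H.T @ np.linalg.inv(H @ Sigma @ H.T + Q)        # Kalman gain
    innov = z - z_hat
    innov[1] = (innov[1] + np.pi) % (2 * np.pi) - np.pi         # wrap bearing to [-pi, pi]
    return mu + K @ innov, (np.eye(3) - K @ H) @ Sigma
```

A landmark observation that matches the prediction leaves the mean unchanged while shrinking the covariance, which is exactly the behavior that makes EKF-SLAM maps consistent.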

EKF construction of an indoor environment and beacons post processing
There are a number of SLAM application programming interfaces (APIs), such as the mobile robot programming toolkit (MRPT), for real mobile robot SLAM implementation. In this work, the EKF-SLAM toolbox using the classical EKF implementation was applied in advance to simulate a Kinect RGB-D camera to construct an environmental feature map. (6) Figure 6(a) shows the EKF-SLAM process with point landmarks in a global view, and Fig. 6(b) shows the point landmarks captured in the image of an RGB sensor on a mobile robot.
Ideally, the RGB-D camera is expected to capture all significant features of the indoor environment, but in practice, visual feature extraction is constrained by light, shapes, and colors. Hence, in real applications, it is expected that artificial beacons, such as IR or radio beacons, will be added to the environment and marked on the map to help indoor localization. The idea is shown in Fig. 7, where the diamond shapes on the map are mapped to the visual features in the environment, and the blue circles are man-made beacons placed at corresponding positions in the environment.

Monte Carlo particle filter (PF)
A PF, also called CONDENSATION (conditional density propagation), (7) is based on Monte Carlo and Bayesian methods. The PF uses random sampling, and each particle represents a hypothesis of the location (x, y) and orientation (θ) of the robot. For example, 1000 green particles are initialized with a uniform distribution, as shown in Fig. 8, where the z-axis represents the robot heading angle in radians.
The advantage of the PF is that it can reject background noise. For mobile robot applications, two types of data are distinguished: perceptual data, such as landmark measurements, and odometer or control data, which carry information about robot motion. The PF algorithm for mobile robot self-localization is shown in Table 1, in which x_t is the robot pose at time t, u_t is the robot motion command, and z_t is the measurement made by the robot.
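One PF cycle of the kind summarized in Table 1 can be sketched as follows (a minimal illustration, not the implementation used in the experiments): each particle is moved with a noisy copy of the control u_t, weighted by the likelihood of the measurement z_t, and then resampled. The range-only sensor model, noise levels, and single known landmark are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, u, z, landmark, dt=1.0,
            motion_noise=(0.05, 0.02), meas_noise=0.1):
    """One particle filter cycle: sample motion, weight by measurement, resample.

    particles: (N, 3) array of [x, y, theta] pose hypotheses; u = (v, w);
    z is the measured range to a known landmark (illustrative sensor model).
    """
    N = len(particles)
    v, w = u
    # 1. Motion update: propagate every particle with a noisy control input
    v_n = v + rng.normal(0.0, motion_noise[0], N)
    w_n = w + rng.normal(0.0, motion_noise[1], N)
    particles[:, 0] += v_n * dt * np.cos(particles[:, 2])
    particles[:, 1] += v_n * dt * np.sin(particles[:, 2])
    particles[:, 2] += w_n * dt
    # 2. Measurement update: weight by the Gaussian likelihood of the observed range
    d = np.hypot(landmark[0] - particles[:, 0], landmark[1] - particles[:, 1])
    weights = np.exp(-0.5 * ((z - d) / meas_noise) ** 2)
    weights /= weights.sum()
    # 3. Resample particles in proportion to their weights
    idx = rng.choice(N, size=N, p=weights)
    return particles[idx], np.full(N, 1.0 / N)
```

After a few cycles the surviving particles concentrate on poses consistent with both the odometry and the landmark measurements, which is the convergence behavior reported in the experiments below.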

Strategy control with beacons
The strategy is to control the sampling field when the PF is initialized for robot pose estimation. The main idea is that the wireless beacon's radio signal strength or the infrared beacon's transmission scope can be pre-measured or pre-determined as a circular field. Figure 9(a) shows the green particles sampled uniformly over the entire map, because there are no cues for guessing the initial pose of the robot in the map during PF initialization. However, if a beacon is mounted on the ceiling above the robot, as shown in Fig. 9(b), then all the green particles can be constrained to a uniform distribution inside the circular boundary, which speeds up PF initialization.
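The beacon-constrained initialization described above can be sketched as follows (an illustration; the map size, beacon position, and scope radius are assumed values, not those of the experiments). Without a beacon, particles are drawn uniformly over the whole map; with a detected beacon, they are drawn area-uniformly inside its circular signal field.

```python
import numpy as np

rng = np.random.default_rng(1)

def init_particles(n, map_size, beacon=None):
    """Uniformly initialize particle poses [x, y, theta].

    Without a beacon, sample over the whole map; with a detected beacon
    ((cx, cy), radius), sample only inside its circular signal field.
    """
    theta = rng.uniform(-np.pi, np.pi, n)
    if beacon is None:
        x = rng.uniform(0.0, map_size[0], n)
        y = rng.uniform(0.0, map_size[1], n)
    else:
        (cx, cy), r = beacon
        # Area-uniform sampling inside the circle: radius ~ r * sqrt(U)
        rho = r * np.sqrt(rng.uniform(0.0, 1.0, n))
        phi = rng.uniform(0.0, 2 * np.pi, n)
        x, y = cx + rho * np.cos(phi), cy + rho * np.sin(phi)
    return np.column_stack([x, y, theta])
```

Shrinking the initial sampling region this way concentrates all particles near the true pose from the start, which is why PF initialization converges faster with a beacon.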
Furthermore, for a consistent PF estimate, we extended the concept of Fig. 9(b): when the standard deviation of all particles in the robot pose estimate exceeds a threshold, e.g., half the radius of the beacon's scope, as shown in Fig. 10(a), and the nearest beacon is identified, as shown in Fig. 10(b), the particles are re-sampled within that scope with the new mean position kept the same as the last mean position. The complete flow diagram for the re-sampling control is shown in Fig. 10(c).
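One reading of this re-sampling control can be sketched as follows (an interpretation for illustration, not the flow diagram of Fig. 10(c) itself): if the particle spread exceeds half the beacon radius and a beacon covers the current estimate, the particles are redrawn inside a circle of the beacon's scope radius centered on the last mean, so the new mean position stays the same as the last mean position. The beacon list and radius are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

def resample_control(particles, beacons, radius):
    """Re-sampling control: if the particle spread exceeds half the beacon
    radius and the nearest beacon covers the estimate, redraw the particles
    inside the beacon's scope, keeping the last estimated mean position."""
    mean = particles[:, :2].mean(axis=0)
    spread = particles[:, :2].std(axis=0).max()
    if spread <= radius / 2:
        return particles                 # estimate is consistent; keep it
    # Identify the beacon nearest to the current mean estimate
    nearest = min(beacons, key=lambda b: np.hypot(b[0] - mean[0], b[1] - mean[1]))
    if np.hypot(nearest[0] - mean[0], nearest[1] - mean[1]) > radius:
        return particles                 # no beacon covers the estimate
    n = len(particles)
    rho = radius * np.sqrt(rng.uniform(0.0, 1.0, n))   # area-uniform in the scope
    phi = rng.uniform(0.0, 2 * np.pi, n)
    out = particles.copy()
    out[:, 0] = mean[0] + rho * np.cos(phi)
    out[:, 1] = mean[1] + rho * np.sin(phi)
    return out
```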

Comparison of number of landmarks for PF localization
The first experiment indicated that the number of landmarks/features and the number of particles affect the accuracy of the estimated pose. In the first configuration, there are 100 landmarks and 2000 particles on the map for mobile robot self-localization, as shown in Fig. 11(a). The red line represents the PF's estimate and the blue line represents the actual trajectory of the robot for a 2500-time-step (250 s) simulation. Figure 11(b) shows the standard deviation of the robot pose estimation (blue, x; green, y; red, θ); the PF achieved excellent pose estimation over the time steps. Figure 11(c) is simulated with 100 landmarks and 1000 particles; the initial estimate could not locate the actual robot pose, but after 80 time steps, the particles converged to the robot's actual trajectory.

Estimation with initial beacon
The experiment in § 5.1 showed that the number of landmarks is key to increasing the accuracy of the PF estimation. In real applications, however, visual landmarks are constrained by the light, shapes, and colors in the environment. Thus, artificial beacons were added to assist robot self-localization. The concept and comparison are shown in Figs. 9 and 12. Without artificial beacons, as shown in Fig. 9(a), the initial estimate of the robot position is a uniform distribution over the map, and it is difficult to converge to the actual pose in the beginning phase when fewer visual landmarks are observed. When more landmarks were observed in the middle time steps, the PF converged to the actual robot trajectory, as shown in Fig. 12(a). As shown in Fig. 9(b), an artificial beacon was added, and it restricted the initial particle sampling field. Figure 12(b) shows an acceptable estimation result matching the actual trajectory.

Multiple beacons with strategy control
The experiment in § 5.2 showed that an initial beacon restricts the initial particle sampling field and yields an acceptable estimate of the actual trajectory. In this experiment, multiple beacons were simulated with the strategy control to assist mobile robot localization in an environment with fewer visual landmarks (landmarks = 20). Figure 13 shows the comparisons for landmarks = 20 and particles = 800. Without artificial beacons to assist mobile robot localization [Fig. 13(a)], the mean position error was 3.5510 m; with artificial beacons [Fig. 13(b)], the mean position error was 0.4248 m, a substantial improvement. Figure 14 shows a repeat of the experiment with particles = 2000. This experiment indicated that more particles enhance the estimation accuracy. In Fig. 14(a), without beacon control, the mean position error decreased to 1.0791 m; in Fig. 14(b), with beacon control, the mean position error decreased to 0.2589 m. All the comparisons are shown in Table 2.

Conclusions
In this work, a fused concept of RGB-D visual landmarks, a gyro-odometer, and artificial beacons was proposed to increase the accuracy of mobile robot indoor localization. First, an EKF-SLAM process using an RGB-D camera was carried out to construct an environment map with visual features. Ideally, the RGB-D camera would capture all significant features in the indoor environment, but in practice, visual feature extraction is constrained by the light, shapes, and colors of the environment. Hence, in real applications, we added artificial beacons, such as IR or radio beacons, to the environment and marked them on the map to help indoor mobile robot self-localization. In addition, for consistent estimation of the robot trajectory, a re-sampling control strategy using a PF was proposed. Finally, the experimental results show that the proposed method actually increases the accuracy of mobile robot pose association and localization when fewer visual features are present in the environment.