Using an Evolutionary Fuzzy Neural Network for Sensor-based Wall-following Control of a Mobile Robot

We propose an efficient evolutionary fuzzy neural network (EFNN) for mobile robot control. The proposed EFNN combines a fuzzy neural network (FNN) and an improved artificial bee colony (IABC) algorithm to implement the wall-following control of a mobile robot. To evaluate the wall-following control performance of the FNN, an efficient fitness function is defined. The three control factors (CFs) in the fitness function are the maintenance of the robot–wall distance, the avoidance of robot–wall collision, and the successful movement of the robot along a wall to travel around a stadium. The traditional ABC emulates the intelligent foraging behavior of honey bee swarms, but this algorithm performs favorably at exploration and poorly at exploitation. Therefore, the proposed IABC algorithm uses mutation strategies to balance exploration and exploitation. Furthermore, a new reward-based roulette wheel selection (RRWS) mechanism is adopted to obtain a more favorable solution during the learning process. Experimental results demonstrate that the proposed IABC obtains a smaller root mean square error (RMSE) than other methods in wall-following control.


Introduction
The navigation control, (1,2) wall-following behavior control, (3,4) parallel parking control, (5) and path tracking control (6,7) of mobile robots are essential issues for implementing behavior-based control in unknown environments. Among these, wall-following behavior control is particularly critical for a mobile robot. In traditional control methods, the control performance depends on the accuracy of the robot's sensors, which are affected by noise interference.
Fuzzy logic was developed in 1965 by Zadeh (8) to handle the complexity, uncertainty, and nonlinearity of systems. It is therefore useful for addressing uncertainty in real problems by encoding human experience in fuzzy logic rules. Fuzzy logic controllers (FLCs) have been used by numerous researchers in mobile robot wall-following tasks (9,10) and obstacle avoidance. (11) To improve the performance of FLCs, many optimization approaches, such as supervised learning, (12,13) population-based learning, (14,15) and reinforcement learning, (16) have been proposed. Supervised learning generally trains an FLC by using input and output training data. However, in wall-following tasks, the collection of training data is difficult. Therefore, in this study, we propose a new fitness function that evaluates the performance of a controller. (17,18) The fitness function is computed from data generated online during the learning process of the mobile robot; no training data need be collected in advance. Therefore, the training method can be extended to a real-world environment. In addition, many researchers have proposed population-based learning algorithms, such as particle swarm optimization (PSO), (19) differential evolution (DE), (20) and artificial bee colony (ABC) (21)(22)(23) algorithms.
The traditional ABC algorithm contains three essential groups of bees: employed bees, onlookers, and scout bees. Employed bees search for and exploit food sources while imparting food source information to the onlookers. The onlookers then select food sources according to this information. Scout bees perform a random search of the search space to find new food sources. The traditional ABC algorithm performs favorably at exploration but poorly at exploitation. (24) A new combinatorial solution search stage has been proposed to balance exploration and exploitation during the learning stage. (25)(26)(27) Furthermore, an onlooker bee in the traditional ABC algorithm measures the nectar information of all employed bees and selects a food source with a probability related to the amount of nectar at each site, similar to the "roulette wheel selection" in a genetic algorithm. (28) Under roulette wheel selection, some suboptimal food sources may persist. (29,30) We propose an efficient evolutionary fuzzy neural network (EFNN) for mobile robot control. The proposed EFNN combines a fuzzy neural network (FNN) and an improved artificial bee colony (IABC) algorithm to implement the wall-following control of a mobile robot. The IABC is used to adjust the parameters of the FNN. The three control factors (CFs) in the fitness function are the maintenance of the robot–wall distance, the avoidance of robot–wall collision, and the ability of the robot to move along a wall to travel around a stadium. To improve the control performance of the FNN, a mutation strategy is introduced in the IABC algorithm to balance exploration and exploitation during the learning process. Moreover, a new reward-based roulette wheel selection (RRWS) mechanism in the IABC algorithm is also proposed, through which a favorable solution can be obtained on the basis of a reward concept.
The results are compared with those of FNNs based on the ABC and DE algorithms for wall-following control.
The rest of this study comprises six sections. Section 2 presents a description of the mobile robot and associated experiments; Sect. 3 introduces the design of the FNN; Sect. 4 introduces the proposed IABC algorithm; Sect. 5 evaluates the controller's performance and the training environment; Sect. 6 describes the simulations and experiments for wall-following robot control; and Sect. 7 presents the conclusions of this study.

Description of Mobile Robot
Figure 1 shows the Pioneer 3-DX, a small, lightweight, two-wheel, two-motor differential-drive robot. Eight ultrasonic sensors were incorporated into the robot to measure the distances between the robot and obstacles so that wall-following control could be achieved. The ultrasonic sensor positions on the Pioneer 3-DX robot were fixed, with two on the sides and six facing outward at 20° intervals to provide 180° forward coverage. Each sensor measured distances in the range of 0.15 to 4.75 m. To prevent collisions between the robot and a wall or an obstacle, only the three ultrasonic sensors on the right (or left), namely S_1, S_3, and S_4 (or S_5, S_7, and S_8), were used to evaluate the distance between the robot and the wall during a right (or left) wall-following task. The original sensor values were limited to the range of 0.2–0.74 m in the simulations and experiments because a larger range was unnecessary for these wall-following tasks.
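The limiting of raw sensor values to the 0.2–0.74 m working range described above can be sketched as a simple clamp; the function name and defaults here are illustrative, not from the paper.

```python
def clip_sensor(reading_m, lo=0.2, hi=0.74):
    """Clamp a raw ultrasonic reading (metres) to the working range
    used for the wall-following task (illustrative bounds)."""
    return max(lo, min(hi, reading_m))
```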

Design of an FNN
This section illustrates the design of the FNN. Figure 2 shows the structure of the FNN. During right (or left) wall-following control, only three ultrasonic sensors, S_1, S_3, and S_4 (or S_5, S_7, and S_8), on the right (or left) estimate the distance between the robot and the wall, and these distances are the inputs of the FNN in this study. The outputs of the FNN control the left- and right-wheel speeds of the robot. The jth rule of the FNN (31) can be expressed as

Rule j: IF x_1 is A_1j and x_2 is A_2j and x_3 is A_3j
THEN y_l is w_j and y_r is v_j,

where x_1, x_2, and x_3 are respectively the distances between the ultrasonic sensors S_4, S_3, and S_1 and the wall; A_ij is the linguistic term of the precondition part; γ_j ∈ [0, 1] is the compensatory factor; y_l and y_r are the left- and right-wheel speeds of the robot; and w_j and v_j are the weights of the consequent part, respectively. In the fuzzification operation, the Gaussian membership function

μ_ij(x_i) = exp[−(x_i − m_ij)^2 / σ_ij^2]

is used, where m_ij and σ_ij respectively represent the mean and variance of the Gaussian function of the fuzzy set.
In the fuzzy implication operation using the product operation, the firing strength of each rule is evaluated with the compensatory operation as

μ_j = [ ∏_{i=1}^{3} μ_ij(x_i) ]^{(1 − γ_j) + γ_j/3},

where γ_j = c_j^2 / (c_j^2 + d_j^2) is the compensatory degree and c_j, d_j ∈ [−1, 1] are the pessimistic and optimistic parameters, respectively.
In the defuzzification operation, the center of area is used in this study and is described by

y_l = Σ_j μ_j w_j / Σ_j μ_j,  y_r = Σ_j μ_j v_j / Σ_j μ_j.
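A minimal forward pass of such an FNN can be sketched as follows. The compensatory exponent (1 − γ)/1 + γ/n form of the firing strength is an assumption based on the usual compensatory operator, and all names and data layouts are illustrative.

```python
import math

def fnn_forward(x, rules):
    """One forward pass of a compensatory fuzzy neural network (sketch).

    x     : list of 3 sensor distances [x1, x2, x3]
    rules : list of dicts with keys 'm' (means), 's' (spreads),
            'gamma' (compensatory degree), 'w', 'v' (consequent weights).
    Returns (left_speed, right_speed) via center-of-area defuzzification.
    """
    n = len(x)
    num_l = num_r = den = 0.0
    for r in rules:
        # Gaussian membership degrees, multiplied across all inputs
        prod = 1.0
        for xi, m, s in zip(x, r['m'], r['s']):
            prod *= math.exp(-((xi - m) ** 2) / (s ** 2))
        # compensatory firing strength (assumed operator form)
        mu = prod ** ((1.0 - r['gamma']) + r['gamma'] / n)
        num_l += mu * r['w']
        num_r += mu * r['v']
        den += mu
    return num_l / den, num_r / den
```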

Review of ABC algorithm
The ABC algorithm was inspired by the intelligent behavior of honey bees. The honey bees in the ABC algorithm are classified into three groups: onlookers, employed bees, and scout bees. Bees that discover food source positions (i.e., solutions) and randomly search their vicinity are called employed bees. They return to the hive and perform a waggle dance to share information regarding the locations of newly available food sources with the bees in the dance area of the hive. The onlooker bees watch the dances, select a food source among those found by the employed bees, and conduct a further random search after reaching the vicinity of the selected food source. The onlookers choose a food source with a probability proportional to the amount of nectar (fitness value) of the food source. Scout bees randomly search the environment to find new food sources. When the food source of an employed bee has been exhausted, the employed bee becomes a scout bee. The steps of the ABC algorithm are explained as follows:
Step 1) Initialize SN population solutions x_i, i = 1, 2, ..., SN, where each x_i is a food source represented by a D-dimensional real-valued vector.
Step 2) Evaluate the fitness function value of each solution.
Step 3) Each employed bee generates a new solution v_i as

v_{i,j} = x_{i,j} + φ_{i,j}(x_{i,j} − x_{k,j}),  (1)

where t is the number of generations; φ_{i,j} is a random value in the interval [−1, 1]; and k ∈ {1, 2, ..., SN} with k ≠ i and j ∈ {1, 2, ..., D} are randomly chosen indices. Thereafter, the fitness value of the new solution is evaluated.
Step 4) Apply a greedy selection mechanism to compare a current solution x_i with a new solution v_i.
Step 5) Calculate the probability value for each solution. The onlooker bees use the roulette wheel selection scheme to choose a solution. The probability value is calculated as

p_i = fit_i / Σ_{n=1}^{SN} fit_n,

where fit_i represents the fitness value of solution i and SN represents the total number of solutions.
Step 6) Each onlooker bee produces a new solution that is in the neighborhood of its current solution by using Eq. (1) and evaluates it.
Step 7) Repeat Step 2 and use the greedy selection process to compare a current solution x_i with a new solution v_i.
Step 8) If solution x_i has not improved after a certain number of trials, a better solution cannot be found in its vicinity; the solution is therefore abandoned, and the corresponding bee becomes a scout bee. The new scout bee is randomly initialized in the search space as

x_{i,j} = x_{min,j} + rand(0, 1)(x_{max,j} − x_{min,j}),

where x_{min,j} and x_{max,j} are the lower and upper bounds in dimension j, respectively, and rand(0, 1) represents a random value between 0 and 1.
Step 9) Remember the best solution found thus far.
Step 10) Check for termination. If the generation value is larger than the predefined maximum number of generations, stop and print the result; otherwise, return to Step 3 and continue performing the algorithm.
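The ten steps above can be sketched as a compact ABC loop. The toy objective, parameter defaults, and function names below are illustrative only; in the paper each solution would encode an FNN and the fitness would be the wall-following score.

```python
import random

def abc_optimize(fitness, dim, lo, hi, sn=20, limit=30, max_gen=100):
    """Minimal ABC sketch that MAXIMIZES a positive `fitness` function."""
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(sn)]
    fit = [fitness(x) for x in pop]
    trials = [0] * sn
    best, best_fit = pop[0][:], fit[0]

    def neighbor(i):
        # v_ij = x_ij + phi * (x_ij - x_kj), one random dimension j, k != i
        k = random.choice([n for n in range(sn) if n != i])
        j = random.randrange(dim)
        v = pop[i][:]
        v[j] += random.uniform(-1, 1) * (pop[i][j] - pop[k][j])
        v[j] = min(max(v[j], lo), hi)
        return v

    def greedy(i, v):
        fv = fitness(v)
        if fv > fit[i]:
            pop[i], fit[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(max_gen):
        for i in range(sn):                      # employed-bee phase
            greedy(i, neighbor(i))
        total = sum(fit)
        probs = [f / total for f in fit]         # roulette-wheel probabilities
        for _ in range(sn):                      # onlooker-bee phase
            i = random.choices(range(sn), weights=probs)[0]
            greedy(i, neighbor(i))
        for i in range(sn):                      # remember the best so far
            if fit[i] > best_fit:
                best, best_fit = pop[i][:], fit[i]
        for i in range(sn):                      # scout-bee phase
            if trials[i] > limit:
                pop[i] = [random.uniform(lo, hi) for _ in range(dim)]
                fit[i], trials[i] = fitness(pop[i]), 0
    return best
```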

Population-based DE
DE is a population-based, directed search method. Similar to other evolutionary algorithms, DE begins by generating an initial population of NP D-dimensional parameter vectors (at t = 0), randomly chosen within the search space boundaries. Thereafter, DE seeks the global optimal solution by iterating the population through three major operations: mutation, crossover, and selection. The basic strategy of DE is described in further detail as follows:
1) Mutation
The mutation operation generates a mutant vector V_i^t. The mutation process is expressed as

V_i^t = X_{r0}^t + F(X_{r1}^t − X_{r2}^t),

where t is the current generation, i and j denote the ith vector and the dimension of the vectors, respectively, and F is a scaling factor. The indices r_0, r_1, and r_2 are randomly selected from {1, 2, ..., NP} such that they are mutually distinct and differ from i.
2) Crossover
The crossover operation uses the crossover rate to generate a trial vector U_i^t from each target vector and its corresponding mutant vector after the mutation phase:

U_{i,j}^t = V_{i,j}^t if rand_j(0, 1) ≤ CR; otherwise, U_{i,j}^t = X_{i,j}^t,

where CR ∈ [0, 1] is a predefined value and rand_j(0, 1) is a random value between 0 and 1.

3) Selection
If the fitness of the new trial vector U_i^t is superior to that of its corresponding target vector X_i^t, the target vector is replaced by the trial vector in the next generation. The operation is represented as

X_i^{t+1} = U_i^t if f(U_i^t) is superior to f(X_i^t); otherwise, X_i^{t+1} = X_i^t.

The previous steps are repeated until the maximum evolutionary generation is reached or the best solution is found.
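The three DE operations can be sketched as a DE/rand/1/bin loop. The objective, parameter defaults, and names below are illustrative (the scaling factor F and crossover rate CR are the standard DE controls).

```python
import random

def de_optimize(fobj, dim, lo, hi, np_=20, F=0.5, CR=0.9, max_gen=100):
    """Minimal DE/rand/1/bin sketch that MINIMIZES `fobj`."""
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(np_)]
    cost = [fobj(x) for x in pop]
    for _ in range(max_gen):
        for i in range(np_):
            # mutation: V = X_r0 + F * (X_r1 - X_r2), indices distinct from i
            r0, r1, r2 = random.sample([n for n in range(np_) if n != i], 3)
            mutant = [pop[r0][j] + F * (pop[r1][j] - pop[r2][j])
                      for j in range(dim)]
            # binomial crossover; j_rand guarantees one mutant component
            j_rand = random.randrange(dim)
            trial = [min(max(mutant[j], lo), hi)
                     if (random.random() < CR or j == j_rand) else pop[i][j]
                     for j in range(dim)]
            # greedy selection
            c = fobj(trial)
            if c <= cost[i]:
                pop[i], cost[i] = trial, c
    return pop[cost.index(min(cost))]
```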

Proposed IABC
This subsection illustrates the proposed IABC. The traditional ABC performs favorably at exploration but poorly at exploitation. (24) To achieve both exploration and exploitation during the learning process, a new search process for obtaining a combinatorial solution is proposed. In the ABC, onlookers measure the nectar information obtained from all employed bees and use roulette wheel selection (28) to select food source locations with a given probability; under this selection scheme, some suboptimal food sources may persist. (29,30) An efficient RRWS mechanism is therefore proposed to improve the probability values. Figure 3 shows a flow chart of the proposed IABC algorithm. The steps of the IABC are explained as follows:
Step 1) Initialize the population of solutions x_i, i = 1, 2, ..., SN. Each food source position (solution) x_i encodes an FNN, and each FNN consists of multiple fuzzy rules. Figure 4 shows the coding of an FNN (solution) in the IABC algorithm. In the FNN, the sensor signals S_1, S_3, and S_4 are used as the three inputs, and the left- and right-wheel speeds are used as the outputs. All the control parameters must be defined in advance. In this study, a uniform random distribution is used to generate each parameter within its boundary conditions: the antecedent parameters m_ij and σ_ij are initialized according to the sensor reading range [0.2, 0.74], and the consequent weights w_j and v_j are initialized within [0, 10], where each ultrasonic sensor (S_1, S_3, or S_4) has a reading range of 0.2–0.74 m and the left- and right-wheel speeds have a range of 0–10 m/s in the simulations.
Step 2) Evaluate the FNN. In traditional supervised learning, (12,13) training data are required in the learning process to optimize an FLC; however, such data are difficult to collect for wall-following tasks. Therefore, a fitness function (17,18) is presented to evaluate the performance of the FNN during wall-following control. We propose a fitness function C together with three robot stop conditions. All the control parameters must be defined before the training stage. The fitness function C, which is to be maximized, comprises three CFs and three robot stop conditions, as described in Sect. 6.
Step 3) Generate and evaluate the new solutions u_i for the employed bees. Using the mutation strategy, each employed bee generates a new solution u_i, and the greedy selection operation is applied. The bees thereby search a wider field, making the controller more adaptive. The new solutions u_i are generated as

u_{i,j} = x_{best,j} + F(x_{r1,j} − x_{r2,j}),

where x_best is the best individual, x_{r1} and x_{r2} are randomly selected individuals with r_1 ≠ r_2 ≠ i, and F is a scaling factor.
Step 4) Calculate the probability values p_i for the solutions x_i. An RRWS is adopted to calculate the probability values for the solutions.
Step 5) Generate and evaluate the new solutions u_i for the onlookers. In this phase, using the mutation strategy, each onlooker generates new solutions u_i by DE, dependent on the RRWS.
Step 6) Confirm the status of a scout bee; if a solution is abandoned, the scout bee replaces it with a new randomly generated solution x_i.
If a solution of the IABC algorithm is generated outside the specified range, we discard this solution and generate a new one. This operation is described as

x_{i,j}^G = x_{min,j}^G + rand(0, 1)(x_{max,j}^G − x_{min,j}^G),

where i is the ith solution of the population, j represents the jth dimension of the solution, G is the generation number, and x_{min,j}^G and x_{max,j}^G are the lower and upper bounds of the jth dimension, respectively.
Step 7) Remember the best solution. In this final phase, if the fitness of the current best solution is superior to the memorized best fitness, the current solution replaces the previously memorized best solution.
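The best-guided candidate generation described in Step 3 is only partially recoverable from the text; under the assumption that it takes the common form u = x_best + F(x_r1 − x_r2) with boundary clamping, it can be sketched as follows (all names are illustrative).

```python
import random

def best_guided_candidate(pop, best, i, F=0.5, bounds=(0.0, 1.0)):
    """Generate one candidate solution around the best individual (sketch).

    pop   : list of solution vectors
    best  : best solution vector found so far
    i     : index of the solution being perturbed (excluded from sampling)
    Returns u = best + F * (pop[r1] - pop[r2]), clamped to `bounds`.
    """
    lo, hi = bounds
    r1, r2 = random.sample([n for n in range(len(pop)) if n != i], 2)
    return [min(max(best[j] + F * (pop[r1][j] - pop[r2][j]), lo), hi)
            for j in range(len(best))]
```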

Reinforcement Learning of Mobile Robot Wall-following Control
In the training process, three ultrasonic sensor inputs, S_1, S_3, and S_4, in the FNN estimate the distance between the robot and the wall. The outputs of the FNN are the left- and right-wheel speeds of the mobile robot. The FNN is optimized using the reinforcement-learning-based IABC in a training environment. The predefined training environment is shown in Fig. 5. Figure 6 displays the learning architecture of the wall-following control using the FNN during the training process.
Traditional evolutionary algorithms use input–output training data to train a controller. In this study, the fitness function is instead designed to assess the FNN performance in wall-following control. The proposed fitness function comprises three CFs and three stop conditions to perform the wall-following control using reinforcement learning. The stop conditions are as follows: (1) the robot collides with the wall (obstacle); (2) the robot travels away from the wall (i.e., S_4 ≥ 0.74 m); and (3) the robot successfully moves along the wall for at least one complete circuit (i.e., T_dis ≥ T_stop), where T_dis is the distance moved by the robot within the environment and T_stop is the maximum distance that can be travelled by the robot, which is user-defined according to the scale of the environment. The three CFs of the fitness function are defined as follows:
A. CF_1: The robot keeps a predefined distance from the wall. According to sensor S_4, CF_1 evaluates the right-side distance error RD_1(t) between the robot and the wall at each time step and is averaged over all time steps as

CF_1 = (1/T_total) Σ_{t=1}^{T_total} RD_1(t).

RD_1(t) = 0 indicates that the robot maintains the desired right-side distance from the wall; d_wall represents the required wall–robot distance, which is set to 0.3 m (Fig. 7); and T_total is the number of time steps.
B. CF_2: The mobile robot avoids obstacles in a complex environment. According to sensors S_1 and S_3, CF_2 evaluates the distance error RD_2(t) between the robot and the front-right wall at each time step and is averaged over all time steps as

CF_2 = (1/T_total) Σ_{t=1}^{T_total} RD_2(t).

RD_2(t) = 0 represents the state wherein no obstacles are in front of the robot, and d_Limit(t) is the desired distance between the robot and the front-right wall, which is set to 0.5 m (Fig. 8).
C. CF_3: The robot moves along the wall to travel around the stadium successfully. CF_3 compares the distance T_dis that the robot moves within the environment with T_stop, the distance the robot travels if it takes the optimal route around the circuit, and is defined as

CF_3 = T_dis / T_stop.

When T_dis ≥ T_stop, the robot successfully moves along the wall to travel around the stadium. After calculating CF_1, CF_2, and CF_3, a normalization operation is used to adjust all the CFs, and the adjusted values are denoted F_1, F_2, and F_3. The three normalized CFs are used to maximize the fitness function C, which is expressed as

C = α_1 F_1 + α_2 F_2 + α_3 F_3,

where α_1, α_2, and α_3 are weighting coefficients set to 0.4, 0.05, and 0.55, respectively, in these experiments.
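The combined fitness can be sketched as follows. Only the weighting coefficients (0.4, 0.05, 0.55) and the fact that C is maximized come from the text; the per-step error traces, the normalization, and the progress term are assumptions, and all names are illustrative.

```python
def fitness_c(rd1_trace, rd2_trace, t_dis, t_stop, alphas=(0.4, 0.05, 0.55)):
    """Sketch of the fitness C = a1*F1 + a2*F2 + a3*F3 (maximized).

    rd1_trace, rd2_trace : per-time-step error values RD1(t), RD2(t)
    t_dis, t_stop        : travelled distance and target circuit distance
    """
    t_total = len(rd1_trace)
    cf1 = sum(rd1_trace) / t_total        # mean distance-keeping error
    cf2 = sum(rd2_trace) / t_total        # mean front-obstacle error
    cf3 = min(t_dis / t_stop, 1.0)        # progress around the circuit
    f1 = 1.0 / (1.0 + cf1)                # assumed normalization: smaller
    f2 = 1.0 / (1.0 + cf2)                # error -> larger reward in [0, 1]
    a1, a2, a3 = alphas
    return a1 * f1 + a2 * f2 + a3 * cf3
```

A perfect run (zero error traces and a completed circuit) yields the maximum value a1 + a2 + a3 = 1.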

Experimental Results
To demonstrate the proposed FNN based on the IABC algorithm, wall-following control was performed using the Pioneer 3-DX robot and the results were compared with those of other algorithms. The training environment in Fig. 5 was used. Table 1 shows all the initial parameters set before the training process in the IABC algorithm. The experiment was repeated 30 times to demonstrate the stability of the proposed IABC algorithm.

Experimental results in a training environment
As described in this subsection, we designed and analyzed an FNN for wall-following control. The maximum distance of the robot was set to 15 m. Figure 9(a) shows that the robot could successfully move along the wall to travel around the stadium using the FNN based on the IABC algorithm. Figure 9(b) shows the distances between the wall and the ultrasonic sensors S_1, S_3, and S_4 over one complete circuit, in addition to the left- and right-wheel speeds of the robot. When the robot moved along the wall to point A, it slowly turned left in a straight area. At this moment, the ultrasonic sensors S_1, S_3, and S_4 registered distances from the wall of 0.74, 0.39, and 0.38 m, and the left- and right-wheel speeds were 2.67 and 2.92 m/s, respectively. Then, the robot encountered an inside corner at point B. To avoid a collision with the wall, the robot quickly turned left; the ultrasonic sensors S_1, S_3, and S_4 registered distances of 0.42, 0.35, and 0.54 m, and the left- and right-wheel speeds were 2.78 and 5.88 m/s, respectively. At points C, E, and F, the robot encountered outside corners and turned right. In this case, the ultrasonic sensors S_1, S_3, and S_4 registered distances of 0.74, 0.74, and 0.42 m, and the left- and right-wheel speeds were 6.6 and 5.03 m/s, respectively. Finally, when the robot entered a straight area, the ultrasonic sensors S_1, S_3, and S_4 registered distances of 0.74, 0.41, and 0.3 m, and the left- and right-wheel speeds were 2.85 and 2.85 m/s, respectively. The trajectory obtained using the proposed FNN based on the IABC algorithm and those obtained using other population-based algorithms are compared in Fig. 10. Figure 11 shows a plot of the average fitness values for the proposed IABC design at different evaluation points and a comparison of these values with the corresponding values for the ABC and DE algorithms.
Figure 11 and Table 2 demonstrate that the proposed IABC algorithm performed more favorably than the ABC and DE algorithms in wall-following control.

Experimental results in two testing environments
To further demonstrate the method's performance, two complex testing environments were created for the wall-following task. Figures 12(a) and 13(a) show the robot trajectories obtained using the IABC algorithm in the two complex testing environments.
When the robot encountered an outside corner at point A in testing environment 1 [see Fig. 12(a)], it turned right to follow the wall; the corresponding results are summarized in Table 3. These results demonstrate that the mobile robot successfully achieved wall-following control in the two testing environments while keeping a fixed distance from the wall. In addition, the proposed IABC algorithm performed more favorably than the ABC and DE algorithms, as illustrated in Figs. 14 and 15 by the behavior of the robot when it encountered a corner.

Analysis of S_4
In this subsection, CF_1 is analyzed by applying the root mean square error (RMSE). CF_1 ensures that the robot can maintain a predefined wall–robot distance. In other words, it ensures that the right-side distance between the robot and the wall according to sensor S_4 can be kept constant. The RMSE is used to measure the performance of the FNN in wall-following control. A comparison of the RMSE values obtained using the IABC, ABC, and DE methods in different environments is given in Table 4.
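The RMSE of the S_4 wall distance against the target distance d_wall = 0.3 m can be computed as follows; the function name and argument layout are illustrative.

```python
import math

def rmse_s4(s4_trace, d_wall=0.3):
    """RMSE between the recorded S4 wall distances and the target d_wall."""
    return math.sqrt(sum((s - d_wall) ** 2 for s in s4_trace) / len(s4_trace))
```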

Experimental results in a real environment
This subsection describes the actual wall-following control of the Pioneer 3-DX mobile robot using the FNN based on the IABC algorithm. To demonstrate the system's feasibility, a real environment was created for testing the performance of the robot in actual wall-following control. Figure 16 shows the wall-following control results obtained using the proposed approach. The Pioneer 3-DX robot not only moves along the wall (obstacle) but also maintains a user-defined distance from the wall.

Conclusions
We proposed an EFNN to execute mobile robot wall-following control. The EFNN comprises an FNN and a reinforcement-learning-based IABC algorithm. The proposed IABC algorithm adopts a mutation strategy and a new RRWS mechanism for optimizing the FNN parameters. A fitness function with three CFs and three stop conditions is proposed to evaluate the FNN's performance; therefore, the learning process of the mobile robot control in this study does not require any training data. Experimental results show that the average fitness value of the proposed IABC algorithm is superior to those of the ABC and DE optimization algorithms. The RMSE values of the proposed IABC, ABC, and DE algorithms in the testing environment are 0.034, 0.042, and 0.042, respectively. In addition, the actual wall-following control of a Pioneer 3-DX mobile robot applying an FNN based on the IABC algorithm was also performed successfully. To achieve high-speed operation in real-time applications, the FNN will also be implemented on a field-programmable gate array in a future study.