Using Interval Type-2 Recurrent Fuzzy Cerebellar Model Articulation Controller Based on Improved Differential Evolution for Cooperative Carrying Control of Mobile Robots

In this study, we propose an effective cooperative carrying method for mobile robots in an unknown environment. During the carrying process, the state manager (SM) switches between wall-following carrying (WFC) and toward-goal carrying (TGC) to avoid obstacles and prevent the carried object from dropping. An interval type-2 recurrent fuzzy cerebellar model articulation controller (IT2RFCMAC) based on dynamic group differential evolution (DGDE) is proposed for implementing the WFC and TGC of mobile robots. Adaptive wall-following control is developed using a reinforcement learning strategy to realize cooperative carrying control for mobile robots. The experimental results indicate that the proposed DGDE is superior to other algorithms and that the mobile robots can complete cooperative carrying and reach the goal location.


Introduction
(3) During the navigation process in an unknown environment, a robot has to avoid colliding with obstacles and move toward a goal. In the cooperative carrying process by multiple robots, the mobile robots assist each other to prevent an object from dropping or colliding until they successfully complete the cooperative carrying task. Therefore, developing appropriate decision-making for mobile robots to avoid obstacles is an important topic. Zhao designed a fuzzy controller using infrared sensors to keep robots away from obstacles.(4) Gavrilut and Tiponut proposed an expert fuzzy system to replace the complex mathematical model and define the fuzzy rules, but specialists in the field are required to establish the fuzzy rules on the basis of their expertise.(5) Zhu and Yang combined fuzzy logic with artificial neural networks (ANNs) and adjusted the controller parameters through a neural fuzzy network so that the mobile robot can complete a navigation task.(6) However, these methods only consider the navigation of a single mobile robot.
Sakuyama et al. presented a methodology for the transport of a large object by mobile robots using small hand carts.(9) The object is placed on top of the mobile robots, which carry it toward a destination. Pereira et al. divided the robots into a leader and a follower to complete the cooperative carrying.(10) However, when the robots encounter obstacles, they must turn back and move in another direction, which reduces the efficiency of navigation. Yamashita et al. adopted a path planning method to calculate an optimized route in a known environment.(11) In a real environment, the input signal contains uncertainties due to noise interference from the sensors. Therefore, Baklouti et al. used an interval type-2 fuzzy neural network to solve such uncertain problems.(12) In this study, we present an effective cooperative carrying and navigation control method for mobile robots in an unknown environment. A state manager switches between two behavioral control modes, wall-following carrying (WFC) and toward-goal carrying (TGC), on the basis of the relationship between the mobile robot and the unknown environment. To design a robust controller with antinoise capability, an efficient interval type-2 recurrent fuzzy cerebellar model articulation controller (IT2RFCMAC) based on dynamic group differential evolution (DGDE) is proposed to realize the carrying control and WFC of mobile robots. The experimental results demonstrate that the proposed method enables the mobile robots to complete the task of cooperative carrying.

Description of Mobile Robot
The e-puck mobile robot developed by Ecole Polytechnique Fédérale de Lausanne was adopted in this study, as shown in Fig. 1(a). The mobile robot has been applied in various studies, such as signal processing, robot control, swarm intelligence, coordinated motion, and human-computer interaction. The e-puck is a two-wheeled mobile robot with an axle diameter of 4 cm, a height of 5 cm, and a maximum speed of 15 cm per second. The robot contains eight infrared sensors S0–S7 distributed around its body, as shown in Fig. 1(b). The sensors on the right side of the robot are mounted at 10, 45, 90, and 135°. Each infrared sensor can detect a distance between 1 and 5 cm.

Wall-following Control of Mobile Robot
In this section, the proposed IT2RFCMAC based on DGDE is demonstrated to realize wall-following control. The DGDE algorithm is used to adjust the parameters of the IT2RFCMAC during the learning process.
Figure 2 shows the structure of the IT2RFCMAC. X_n represents the input of the IT2RFCMAC, whereas Y_L and Y_R represent the left and right wheel speeds of the robot, respectively. To reduce the computational complexity during defuzzification, in this study we adopted the centers of sets (COS) method to implement the reduction process.(21) The fuzzy if-then rule can be expressed as

Rule j: IF X_1 is Ã_1,j and X_2 is Ã_2,j and … and X_n is Ã_n,j, THEN y_j = a_j0 + Σ_{i=1}^{n} a_i,j X_i,

where j is the rule number of the fuzzy hypercube, X_i denotes the ith input, Ã_i,j represents the interval type-2 fuzzy set, and y_j denotes a Takagi-Sugeno-Kang (TSK) linear function in the consequent layer.
The operations of the seven-layer IT2RFCMAC are described as follows.

Layer 1 (input layer): The input data set X = [X_1, X_2, …, X_n] is passed directly to the next layer.

Layer 2 (fuzzification layer): Each node fuzzifies an input with an interval type-2 Gaussian membership function with an uncertain mean [m¹_i,j, m²_i,j] and standard deviation σ_i,j, producing an upper membership grade μ̄_i,j(X_i) and a lower membership grade μ_i,j(X_i).

Layer 3 (firing layer): Each node is a fuzzy hypercube and uses an algebraic product operation to calculate the firing strength F_j, defined as

F̄_j = ∏_{i=1}^{n} μ̄_i,j(X_i),  F_j = ∏_{i=1}^{n} μ_i,j(X_i),

where F̄_j and F_j represent the firing strengths of the fuzzy hypercube's upper bound and lower bound, respectively.

Layer 4 (recurrent layer): In this layer, feedback connections are added to embed temporal relations in the network. The output O_j^(4) combines the current firing strength with the previous output of the fuzzy hypercube and is expressed as

Ō_j^(4)(t) = F̄_j(t) + R_k,j Ō_j^(4)(t−1),  O_j^(4)(t) = F_j(t) + R_k,j O_j^(4)(t−1),

where R_k,j is the recurrent weight.

Layer 5 (consequent layer): This layer adopts the TSK function instead of the traditional fuzzy inference of the consequent, and the output of layer 5 is defined as

A_j^(5) = O_j^(4) · (a_j0 + Σ_{i=1}^{n} a_i,j X_i),

where a_j0 and a_i,j represent the constants of the TSK linear function weights, and the output A_j^(5) is expressed as the upper bound Ā_j^(5) and the lower bound A_j^(5).

Layer 6 (defuzzification layer): The traditional type-2 order reduction is a highly complex calculation. Thus, the type-2 fuzzy sets are converted to type-1 fuzzy sets by the type-reduction method.(21) The crisp output values [y_r^(6), y_l^(6)] are obtained using a center-of-gravity defuzzification method:

y_r^(6) = Σ_j Ā_j^(5) / Σ_j Ō_j^(4),  y_l^(6) = Σ_j A_j^(5) / Σ_j O_j^(4).

Layer 7 (output processing layer): The output is defuzzified by computing the average of y_r^(6) and y_l^(6), and the crisp value y^(7) is obtained as

y^(7) = (y_r^(6) + y_l^(6)) / 2.
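As a concrete illustration of the layer operations above, the following Python sketch traces one forward pass through a simplified, single-output IT2RFCMAC. The parameter names (m1, m2, sigma, r, a0, a) and the single-output restriction are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import math

def it2_gaussian(x, m1, m2, sigma):
    """Upper/lower membership grades of an interval type-2 Gaussian set
    with uncertain mean [m1, m2] and fixed standard deviation sigma."""
    g = lambda m: math.exp(-0.5 * ((x - m) / sigma) ** 2)
    upper = 1.0 if m1 <= x <= m2 else g(m1 if x < m1 else m2)
    lower = min(g(m1), g(m2))
    return upper, lower

def it2rfcmac_forward(x, rules, prev_out):
    """One forward pass (layers 1-7) for a single crisp output.
    prev_out holds each rule's previous layer-4 output (upper, lower)."""
    num_u = num_l = den_u = den_l = 0.0
    new_out = []
    for j, p in enumerate(rules):
        # Layer 3: algebraic product of the membership grades.
        fu = fl = 1.0
        for i, xi in enumerate(x):
            mu_u, mu_l = it2_gaussian(xi, p["m1"][i], p["m2"][i], p["sigma"][i])
            fu *= mu_u
            fl *= mu_l
        # Layer 4: recurrent feedback of the previous layer-4 output.
        ou = fu + p["r"] * prev_out[j][0]
        ol = fl + p["r"] * prev_out[j][1]
        new_out.append((ou, ol))
        # Layer 5: TSK linear consequent.
        yj = p["a0"] + sum(a * xi for a, xi in zip(p["a"], x))
        num_u += ou * yj
        den_u += ou
        num_l += ol * yj
        den_l += ol
    # Layers 6-7: center-of-gravity reduction, then average the two bounds.
    return 0.5 * (num_u / den_u + num_l / den_l), new_out
```

With all recurrent weights set to zero, this reduces to a static interval type-2 TSK network; the feedback term in layer 4 is what lets the controller exploit the previous firing history.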

Proposed DGDE algorithm
Evolutionary algorithms can solve optimization problems by imitating some aspects of natural evolution; examples include ant colony optimization (ACO),(22) particle swarm optimization (PSO),(23) differential evolution (DE),(24) and the artificial bee colony (ABC) algorithm,(25) as well as several improved DE variants.(28–30) However, traditional DE still has several disadvantages, such as poor accuracy and easily becoming trapped in a local optimal solution. To eliminate these disadvantages, an efficient DGDE algorithm is proposed in this study to overcome the shortcomings of traditional DE. The steps of DGDE are described in detail below.
Step 1 (initialization and coding): All the parameters of each IT2RFCMAC are coded into one vector. The coding format is presented in Fig. 3. The adjustable parameters in each fuzzy hypercube consist of a Gaussian uncertainty mean (m¹_i,j), standard deviation (σ_i,j), displacement value of the uncertainty mean (d_i,j), recurrent weight (R_k,j), and TSK linear function weights (a_j0 and a_i,j).
Step 2 (ranking the fitness values of all vectors): After the fitness value of each vector X_i is calculated, the vectors are sorted according to their fitness values from best to worst. The initial group number of all vectors is set to zero.
Step 3 (vector grouping): The vector with the highest fitness value is set as the new group leader, and its group number is updated to one. Then, the average distance difference and the average fitness difference between the ungrouped vectors and the group leader are calculated as

ADIS_g = (1/NI) Σ_{i: group_i = 0} Dis_i,  AFIT_g = (1/NI) Σ_{i: group_i = 0} Fit_i,

where NP represents the number of parameter vectors, D represents the encoded dimension, L_g,j denotes the jth dimension of the gth group leader, NI is the total number of ungrouped vectors (group number 0), and ADIS_g and AFIT_g represent the distance threshold value and fitness threshold value, respectively, of the gth group. The distance difference (Dis_i) and fitness difference (Fit_i) are calculated to determine whether the ungrouped vectors are similar to the leader vector:

Dis_i = sqrt( Σ_{j=1}^{D} (X_i,j − L_g,j)² ),  Fit_i = |Fitness(X_i) − Fitness(L_g)|.
If the conditions Dis_i < ADIS_g and Fit_i < AFIT_g are satisfied for a vector, then the vector is similar to the gth group leader. These vectors are grouped, and their group number is updated to g.
If any ungrouped vectors remain, the grouping procedure in Step 3 is repeated: the ungrouped vector with the highest fitness value is set as the leader of a new group (i.e., the second group leader). The grouping process is completed when no ungrouped vectors remain.
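The grouping procedure above can be sketched as follows; the use of Euclidean distance and strict inequalities are assumptions where the text is not explicit.

```python
import math

def group_vectors(pop, fitness):
    """Dynamic grouping sketch: repeatedly take the best ungrouped vector
    as a new group leader, then absorb ungrouped vectors whose distance
    and fitness differences fall below the group-average thresholds."""
    n = len(pop)
    group = [0] * n                      # 0 = ungrouped
    order = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    g = 0
    while any(group[i] == 0 for i in range(n)):
        g += 1
        leader = next(i for i in order if group[i] == 0)
        group[leader] = g
        ungrouped = [i for i in range(n) if group[i] == 0]
        if not ungrouped:
            break
        dis = {i: math.dist(pop[i], pop[leader]) for i in ungrouped}
        fit = {i: abs(fitness[i] - fitness[leader]) for i in ungrouped}
        adis = sum(dis.values()) / len(ungrouped)   # distance threshold ADIS_g
        afit = sum(fit.values()) / len(ungrouped)   # fitness threshold AFIT_g
        for i in ungrouped:
            if dis[i] < adis and fit[i] < afit:
                group[i] = g
    return group
```

Vectors that are both spatially close to a leader and similar in fitness join its group; outliers eventually become leaders of their own groups.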
Step 4 (mutation): In this study, a dynamic grouping strategy and a new mutation method are proposed to improve the traditional DE algorithm. The modified mutation formula is expressed as

V_i,G = X_rL,G + F · (X_best,G − X_i,G),    (20)

where F is the mutation weight factor, X_best,G is the vector with the best fitness, and X_rL,G is a random leader selected from all the group leaders.
The traditional DE method is easily trapped in a local optimum. Therefore, using a random leader as the base vector is proposed to effectively increase the search ability. The mutated vector in Eq. (20) revolves around the best vector and enhances the search ability in the solution space.
Steps 5 and 6 (recombination and selection) are the same as those in the traditional DE method. The pseudocode of DGDE is shown in Fig. 4.
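A minimal sketch of the mutation step, assuming the simple random-leader-to-best form suggested by the text (the paper's exact Eq. (20) may differ):

```python
import random

def dgde_mutate(pop, fitness, leaders, F=0.5):
    """DGDE mutation sketch: the base vector is a randomly chosen group
    leader, and the difference term pulls each mutant toward the best
    vector in the population."""
    best = max(range(len(pop)), key=lambda i: fitness[i])
    mutants = []
    for x in pop:
        rl = pop[random.choice(leaders)]   # random group leader X_rL,G
        v = [rl[d] + F * (pop[best][d] - x[d]) for d in range(len(x))]
        mutants.append(v)
    return mutants
```

Using a leader rather than a random population member as the base vector biases the search toward promising regions while the group structure preserves diversity.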

Wall-following control of mobile robots
A reinforcement learning strategy is utilized by the IT2RFCMAC to realize wall-following control of the mobile robot. The IT2RFCMAC has four inputs (S_0, S_1, S_2, and S_3) and two outputs. The input S_i is the distance measured by the corresponding infrared sensor. The outputs are the rotational speeds V_L and V_R of the two wheels. The block diagram for the wall-following control of the mobile robot is shown in Fig. 5. To avoid collision with obstacles and deviation from the wall during the wall-following learning process, three terminal conditions are adopted:
1. A total moving distance of the mobile robot larger than the maximum permitted distance of the training environment indicates that the mobile robot successfully moved in a circular path in an unknown environment.
2. The mobile robot is defined as having collided with the wall when the distance measured by any infrared sensor is less than D_1 (D_1 is set to 1 cm).
3. The mobile robot is defined as having deviated from the wall when the distance measured by sensor S_2 is greater than D_2 (D_2 is set to 1 cm).
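The three terminal conditions can be expressed as a simple state check (distances in cm; the thresholds follow the text, and the sensor indexing and function interface are assumptions of this sketch):

```python
def training_terminated(sensors, total_dist, max_dist, d1=1.0, d2=1.0):
    """Evaluate the three terminal conditions of a wall-following trial.
    sensors: current infrared readings; sensors[2] is the side sensor S2."""
    if total_dist > max_dist:
        return "success"      # completed a full circuit of the environment
    if min(sensors) < d1:
        return "collision"    # some sensor reports the wall too close
    if sensors[2] > d2:
        return "deviation"    # side sensor S2 has drifted from the wall
    return "running"
```

A trial thus ends in exactly one of three outcomes, and only the "success" outcome contributes a full moving-distance term to the fitness function described next.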
In this study, a fitness function that combines three subfitness functions is proposed to evaluate the performance of mobile robot wall-following control. The three subfitness functions are for the total moving distance (SF_1), the distance between the robot and the wall (SF_2), and the degree of parallelism between the robot and the wall (SF_3).
(1) SF_1: When the robot's moving distance R_distance is greater than the default value R_stop, R_distance is set to R_stop. This indicates that the robot has successfully moved in a circular path in the training environment.
(2) SF_2: To maintain a fixed distance between the robot and the wall in the wall-following process, SF_2 is defined as the average of W_d(t), where W_d(t) denotes the distance between the robot and the wall at each time step.
(3) SF_3: According to the law of cosines, x(t) is equal to RS_2 when the robot sensor S_2 is parallel to the wall and the angle between the robot and the wall is 90°, as shown in Fig. 7.
Here, r is the radius of the robot, and D_1 and D_2 denote the distances measured by sensors S_1 and S_2. SF_3 is defined in terms of the difference between x(t) and the S_2 reading. The fitness function for evaluating the wall-following learning performance is a combination of the three subfitness functions (SF_1, SF_2, and SF_3).
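Since the exact subfitness equations are not reproduced above, the following sketch only illustrates how the three terms could be combined; the normalizations, the target-distance term, and the multiplicative combination are all assumptions of this sketch.

```python
def wall_following_fitness(path_dist, r_stop, wall_dists, target, x_vals, s2_vals):
    """Hedged sketch of a combined wall-following fitness with three terms:
    moving distance (SF1), wall distance (SF2), and parallelism (SF3)."""
    # SF1: total moving distance, capped at R_stop and normalized to [0, 1].
    sf1 = min(path_dist, r_stop) / r_stop
    # SF2: reward keeping the average robot-wall distance W_d near a target.
    avg_wd = sum(wall_dists) / len(wall_dists)
    sf2 = 1.0 / (1.0 + abs(avg_wd - target))
    # SF3: reward x(t) staying close to the S2 reading (robot parallel to wall).
    err = sum(abs(x - s2) for x, s2 in zip(x_vals, s2_vals)) / len(x_vals)
    sf3 = 1.0 / (1.0 + err)
    return sf1 * sf2 * sf3
```

A multiplicative combination forces a controller to do well on all three criteria at once: a long path that hugs the wall too closely, or a parallel path that stops early, both score poorly.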

Experimental results of wall-following control
In this section, the performance of the proposed DGDE method is compared with those of other evolutionary algorithms. The initialization parameters of the DGDE algorithm consist of the number of vectors (NP), crossover rate (CR), mutation weighting factor (F), number of generations, and number of fuzzy hypercubes, as presented in Table 1.
Moreover, different numbers of fuzzy hypercubes were considered in the performance evaluation. The IT2RFCMAC with six fuzzy hypercubes was more efficient than that with five or seven fuzzy hypercubes, as shown in Table 2. The number of successful runs represents the number of times that the robot moved successfully in a circular path in the training environment. Table 3 shows the performance of the different algorithms. The proposed DGDE achieved superior fitness values and more successful runs than the other algorithms. The paths of the robot moving under wall-following control using the various evolutionary algorithms are shown in Fig. 8.

Cooperative Carrying by Multi-evolutionary Mobile Robots
In this section, we introduce the method of cooperative carrying for mobile robots. The training environment consists of a leader robot and a follower robot; the distance between the two robots is set to 15 cm, and a rectangular object is placed on the two robots. During the cooperative carrying process, the leader explores the environment ahead and the follower assists the leader in achieving obstacle avoidance, as shown in Fig. 9.

Algorithm      Best    Worst   Average  STD     Successful runs
DE (24)        0.889   0.835   0.868    0.014   8
JADE (29)      0.914   0.861   0.889    0.013   10
Rank-DE (30)   0.910   0.857   0.878    0.012   10
ABC (25)       0.824   0.774   0.803    0.016   7

Cooperative carrying method of WFC
A dual controller for cooperative carrying by two mobile robots is proposed. A cooperation controller, which contains five input signals and two output signals, is added for the follower robot to learn WFC. The inputs are the distances sensed by the follower robot's sensors (S_0, S_1, S_2, and S_3) and the distance R_d between the two robots. The outputs are the rotational speeds V_L and V_R of the two wheels. Figure 10 presents the block diagram of cooperative carrying by two mobile robots.
The training environment, which consists of straight lines, smooth curves, continuous curves, and U-shaped curves, was established to train the follower's cooperation controller, as shown in Fig. 11.

Experimental results of the wall-following carrying
The performances of WFC control using the proposed DGDE method and other methods were compared. Each method was evaluated 10 times to verify the stability of each algorithm. The initial parameters of WFC control using the DGDE algorithm are presented in Table 4.
Table 5 shows the performance evaluations of the different algorithms. The proposed WFC control using the DGDE method achieved superior fitness values and more successful runs than the other methods. A training environment was created to verify the WFC control performance of the different learning algorithms; the paths of the robot moving under WFC control using the various evolutionary algorithms are shown in Fig. 12.

Experimental results of cooperative carrying control
The proposed method was used to verify the performance of navigation control. Two different test environments were created to test whether the robots successfully complete cooperative carrying and navigation control. Experimental results for the two test environments are shown in Fig. 13. The effectiveness of cooperative carrying control was evaluated on the basis of the average distance (RD) between the two robots and the average distance (AD) between the follower robot and the wall. If RD is large, the two robots do not remain at a suitable distance from each other during cooperative carrying control, and the object falls easily. On the other hand, if AD is too small or too large, the robots pass the curves with poor efficiency and the object falls easily. The performance evaluation results of cooperative carrying control are shown in Table 6.

Algorithm      Best    Worst   Average  STD     Successful runs
DE (24)        0.438   0.221   0.358    0.046   5
JADE (29)      0.707   0.529   0.590    0.038   8
Rank-DE (30)   0.721   0.553   0.643    0.036   8
ABC (25)       0.399   0.242   0.322    0.054   4
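The RD and AD criteria reduce to simple averages over a run; the following sketch also reports RD's deviation from the nominal 15 cm spacing of the carried object described earlier (the function interface is an assumption).

```python
def carrying_metrics(robot_dists, wall_dists, nominal_rd=15.0):
    """Average inter-robot distance (RD), average follower-wall distance
    (AD), and RD's deviation from the nominal object spacing; a large
    deviation means the carried object is likely to drop."""
    rd = sum(robot_dists) / len(robot_dists)
    ad = sum(wall_dists) / len(wall_dists)
    return rd, ad, abs(rd - nominal_rd)
```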

Conclusion
We proposed an effective IT2RFCMAC controller for cooperative carrying control in an unknown environment. The parameters of the IT2RFCMAC are trained in an unknown environment through reinforcement learning. The proposed DGDE learning algorithm uses dynamic grouping and local search methods to improve the convergence stability of the traditional DE method. In addition, the proposed state manager automatically switches between WFC and TGC during navigation control in cooperative carrying. Experimental results demonstrated that the control performance of the proposed method is superior to those of the other methods in WFC, and cooperative carrying control of mobile robots in unknown environments was successfully accomplished.
Here, q_k,j and λ_k,j are the recurrent weights of the current and previous firing strengths of each rule, respectively; M is the number of fuzzy hypercubes; the recurrent weights are random numbers in [0, 1]; and Ō_j^(4)(t−1) and O_j^(4)(t−1) represent the upper and lower bounds of the previous output O_j^(4).

Figure 6
Figure 6 presents the 1.7 × 1.4 m² training environment. To allow the mobile robots to experience various situations, the training environment consists of straight lines, corners, right-angled corners, and a U-shaped curve.

Fig. 13. (Color online) Navigation control of cooperative carrying in (a) test environment 1 and (b) test environment 2.

Table 1
Initial parameters of DGDE.

Table 2
Performance with different numbers of fuzzy hypercubes.

Table 3
Performances of various algorithms.

Table 4
Initial parameters of WFC control.

Table 5
Fitness value of various algorithms in the test environment.

Table 6
Performance evaluation results of cooperative carrying control.