Natural Hand Gesture Recognition with an Electronic Textile Goniometer

1 Department of Healthcare Engineering, Chonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju-si, Jeollabuk-do 561-756, Republic of Korea
2 Division of Biomedical Engineering, Chonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju-si, Jeollabuk-do 561-756, Republic of Korea
3 Research Center of Healthcare & Welfare Instrument for the Aged, 567 Baekje-daero, Deokjin-gu, Jeonju-si, Jeollabuk-do 561-756, Republic of Korea
4 CAMTIC Advanced Mechatronics Technology Institute for Commercialization, 67 Yu-Sang-ro, Deokjin-gu, Jeonju-si, Jeollabuk-do 561-844, Republic of Korea

Gesture recognition distinguishes specific user motions that are intended to express a message. The recognized gestures can be used in various applications such as human-computer interfaces (HCI), clinical practice including rehabilitation, and personal identification. We propose a method of recognizing upper-limb motion gestures for HCI using electronic textile sensors, which consist of a double-layered structure with complementary resistance characteristics. For gesture recognition, we apply dynamic time warping (DTW), as it exhibits high performance with simple computations for dynamic signals. We verified the functional feasibility of the proposed method using data from 10 subjects performing six HCI gestures. The gesture classification accuracy over all subjects was 85.4%, although each subject separately achieved a higher performance: six subjects achieved perfect recognition (100% accuracy), three subjects achieved an accuracy of 98.6%, and one achieved an accuracy of 97.2%.
Various kinds of sensors can be used for gesture recognition, with image and depth sensors being the mainstream. (1,2,8,11) For instance, reliable data can be affordably obtained using sensors such as Kinect (Microsoft Co., United States). (2,8) Lahamy and Lichti recognized hand shape and sign language from a depth camera using an approach robust to the user's direction. (1) Affordable and small inertial sensors based on MEMS are also widely used for gesture recognition. (3–7,9,15) Kim et al. constructed an inertial measurement unit comprising an accelerometer, a gyroscope, and a magnetometer, forming a data glove to identify the flexion and extension of fingers. (6) Lee et al. recognized gestures corresponding to mouse operation by attaching a similar inertial sensor to the wrist. (7)
Electronic textiles (e-textiles) that can be embedded in clothing are being increasingly applied for gesture recognition. (9–15) The electrical properties (e.g., resistance) of flexible e-textiles vary with bending, stretching, and shape, which enables gesture recognition. Moreover, e-textiles provide greater wearing comfort and flexibility than sensors implemented on solid electronics. Bobin et al. constructed a sensor using conductive threads and mounted it on the elbow; the acquired resistance signals were used to identify five levels of elbow flexion and extension with a support vector machine. (9) Gibbs and Asada estimated the angles of knee and hip joints through linear regression using e-textiles mounted on pants. (12) Tognetti et al. fabricated a double-layered sensor by attaching two pieces of e-textile so that they express a complementary pattern of resistance during flexion and extension. (13) Specifically, the resistance increases for one sensor layer and decreases for the other. The angle was then estimated using the difference between the signals from the two sensor layers. This sensor constitutes a new concept for goniometers, and Santos et al. used a double-layered goniometer to recognize hand movements for supporting laparoscopic surgery using robots. (14) The authors used rule-based classification to heuristically analyze the sensor patterns for recognition.
In this paper, we propose a hand gesture recognition method for HCI using e-textiles. The sensor used has a double-layered structure to exhibit the above-mentioned complementary resistance patterns. Although various recognition methods, such as dynamic time warping (DTW), (2,4) a genetic algorithm, (11) a hidden Markov model, (14) and rule-based classification, (7,13) can be used with this sensor, we selected DTW given its high performance obtained from simple computations on dynamic signals. We validated the functional feasibility of the proposed approach from the data of 10 subjects performing six HCI gestures. (7)

E-textile sensor and data acquisition
A conductive e-textile (0.80 mm thickness, EeonTex™ NW170-PI-20, Eeonyx Corp., United States) was cut into rectangles of 20 × 120 mm², with the short side cut along the weft. Two conductive stainless-steel threads (28 Ω/ft, DEV-11791, Sparkfun Electronics, United States) were sewn 10 mm from the long end at 2 mm intervals to connect the wires, as shown in Fig. 1. The stitched parts were heated and pressed using an impulse heat sealer to improve the contact between the e-textile and the conductive thread. We prepared the two sensor layers using this procedure. Then, a double-coated foam tape (cat. #2240, 24 mm width, 2 mm thickness, 3M, United States) was cut to the length of the sensor, and one layer was attached on each side of the tape, resulting in the double-layered sensor (Fig. 1). We prepared three double-layered sensors for the experiments and attached them to the shoulder (two sensors) and elbow (one sensor) of each participant, as detailed in Sect. 2.3.
The proposed data acquisition system is illustrated in Fig. 2 and consists of the double-layered e-textile sensor, a constant current source supplying the sensor, a buffer (voltage follower) to obtain a low-impedance sensor output, an analog-to-digital converter (ADC) to obtain digital measurement signals, and a microcontroller that expresses the sensor voltage as resistance and transmits it to a PC for real-time data acquisition and processing.
The constant current source allows the resistance variation of the sensor to be converted into corresponding voltage signals. This source was implemented using an adjustable current source IC (LM334, Texas Instruments, Inc., United States). The output resistance of the e-textile sensor can affect the input impedance of the ADC by about 200 kΩ. Hence, we inserted a buffer (TLV2462, Texas Instruments, Inc.) to reduce the impedance at the ADC input for accurate voltage measurements. A 16-bit delta-sigma ADC (ADS1115, Texas Instruments, Inc.) quantized the measured voltage from each sensor to calculate the resistance in the microcontroller, which wirelessly transmitted the resistance signals to a PC through a Bluetooth module at a rate of 50 Hz.
Figure 3 shows the measured sensor resistance signals for the various gestures, where the rows and columns correspond to different gestures and sensor outputs, respectively. The sensor mounting and evaluated gestures are detailed in Sect. 2.3. The output of the first sensor (first column in Fig. 3) shows the difference among patterns for the six gestures. Specifically, in the first row ('up' gesture), the signal increases and then decreases. In the second row ('down' gesture), the signal decreases and then increases, showing a pattern opposite to that of the 'up' gesture. In the fifth row ('click' gesture), two up and down patterns appear, and in the sixth row ('double-click' gesture), the pattern from the fifth row is repeated twice. Although the third row ('left' gesture) pattern is similar to the first row ('up' gesture) pattern in the first sensor signal, these patterns can be distinguished using data from the second sensor (second column in Fig. 3), which suggests the usefulness of patterns from multiple complementary sensors.
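The voltage-to-resistance conversion performed in the microcontroller can be sketched with Ohm's law, R = V / I. The following minimal Python example is an illustration only: the source current (10 µA) and the ADC full-scale range (±4.096 V, one of the programmable ranges of the ADS1115) are assumed values that are not stated in the text, and the function name is hypothetical.

```python
# Assumed values for illustration; neither is given in the paper.
FULL_SCALE_V = 4.096   # assumed ADC full-scale range (+/- 4.096 V)
I_SOURCE_A = 10e-6     # assumed constant source current (10 uA)


def adc_code_to_resistance(code, full_scale_v=FULL_SCALE_V, i_source_a=I_SOURCE_A):
    """Convert a signed 16-bit ADC code to a sensor resistance in ohms."""
    # A signed 16-bit converter maps +full scale to code 32768 (2**15).
    volts = code * full_scale_v / 32768.0
    # Ohm's law with a constant current through the sensor layer.
    return volts / i_source_a


if __name__ == "__main__":
    # With the assumed values, a code of 16000 corresponds to 2.0 V.
    print(adc_code_to_resistance(16000))
```

Under these assumptions, a sensor layer of about 200 kΩ produces roughly 2 V, which stays inside the assumed input range.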

Gesture recognition using DTW
As mentioned above, the gesture types can be better classified by considering the complementary sensor signals. However, several issues should be addressed. First, the extent of motion can vary across trials of the same gesture. Therefore, the signals should be appropriately scaled for pattern comparison. Likewise, the speed and gesture patterns may vary among subjects and even for the same subject over different executions.
We use DTW to handle these variations of gesture data by warping two time series with respect to time. Specifically, two signals are nonlinearly extended or shortened in the time domain, with the warping aiming to minimize the sum of distances between the signals. The sum of distances can then be used to quantify the similarity between the two signals at an equal warped data length. Furthermore, DTW is a dynamic programming technique that uses recursive calculations and has been reported to provide a computational efficiency superior to those of other statistical classification techniques. (2,4)
To distinguish gestures using DTW, we perform template matching using previously prepared gesture patterns. The motion data to be classified is compared with these patterns to determine the class with the best fit. Template matching is applied by comparing the similarity distances obtained using DTW, and the template yielding the smallest sum of distances is taken as the classification result. In this study, the gesture templates are obtained from each subject independently. Therefore, gesture classification results can vary across subjects. Figure 4 shows the gesture patterns used for template matching, which were generated by averaging multiple time-warped motion data of the same gestures.
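DTW-based template matching of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study (which relied on the DTW function in MATLAB). It assumes an absolute-difference local cost and sums the DTW distances of the individual sensor channels; both choices, and all function names, are assumptions made for the example.

```python
def dtw_distance(a, b):
    """Accumulated DTW distance between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest summed DTW distance.

    sample: list of channels, each a list of floats.
    templates: dict mapping a gesture label to a list of channels.
    Summing per-channel distances is an assumption of this sketch.
    """
    def total_distance(label):
        return sum(dtw_distance(s, t) for s, t in zip(sample, templates[label]))

    return min(templates, key=total_distance)


if __name__ == "__main__":
    templates = {"up": [[0.0, 1.0, 0.0]], "down": [[1.0, 0.0, 1.0]]}
    slow_up = [[0.0, 0.5, 1.0, 1.0, 0.0]]  # same shape as 'up', executed more slowly
    print(classify(slow_up, templates))
```

Because the warping absorbs differences in execution speed, the slower rise-and-fall sequence in the usage example still matches the 'up' template rather than the 'down' template.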

Experimental protocol
Three double-layered e-textile sensors were attached at the positions shown in Fig. 5 to efficiently track joint angles during the execution of upper-limb gestures. Specifically, the sensors registered motions from the elbow and shoulder of subjects during flexion and extension, arm raising and falling, and forward and sideways motions. Sensor 1 was mounted such that its center was placed over the olecranon and along the arm length. Sensor 2 was mounted longitudinally at the deltoid-medial center and humerus center so that the shoulder joint lay at the center of the sensor. Sensor 3 was mounted along the dorsal boundary of the posterior deltoid, with one end positioned on the longitudinal centerline of the humerus. The sensors we fabricated did not elongate enough to follow the joint flexion. (10) Hence, we attached one side of each sensor to the garment using a rubber band.
Ten healthy right-handed subjects (five females and five males) around twenty years old participated in the study. The subjects used their own clothing during the experiments and wore a rash guard with two of the double-layered sensors and an elbow brace with the remaining double-layered sensor. Each subject sequentially performed motions corresponding to six gestures with the right hand, namely, 'up', 'down', 'left', 'right', 'click', and 'double-click'. A trial consisted of performing the six gestures twice. Before measurements, we confirmed that every subject understood the experimental procedure and performed the gestures correctly.
Before the first and third trials, the sensors were calibrated by having the subject perform, three times each, flexion and extension of the forearm joint, abduction and adduction of the shoulder, and horizontal adduction and horizontal abduction. The sensor positioning was verified at the end of each set of experiments to correct the position of any sensor that might have been displaced during gesture execution.
The execution of each gesture took approximately 1.5 s. The subjects held a switch in their right hand and kept it pressed from just before the gesture onset until the gesture was completed. The switch signal was stored synchronously with the e-textile sensor signals to enable the segmentation of the whole motion sequence into separate gestures. There was a resting period of approximately 1 s between gestures and a longer period of approximately 5 s before the execution of the second gesture sequence. Each subject performed six trials with two repetitions of every gesture per trial, so 12 motions were obtained for each of the six gestures.
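The switch-based segmentation can be illustrated with a short Python sketch that returns the sample ranges during which the switch is pressed. The threshold, the encoding of the switch signal as 0 (released) and 1 (pressed), and the function name are assumptions made for this example.

```python
def segment_by_switch(switch, threshold=0.5):
    """Return (start, end) index pairs, end exclusive, where the switch is pressed."""
    segments = []
    start = None
    for i, value in enumerate(switch):
        pressed = value > threshold
        if pressed and start is None:
            start = i                      # rising edge: gesture onset
        elif not pressed and start is not None:
            segments.append((start, i))    # falling edge: gesture end
            start = None
    if start is not None:
        # The recording ended while the switch was still pressed.
        segments.append((start, len(switch)))
    return segments


if __name__ == "__main__":
    switch = [0, 1, 1, 0, 0, 1, 0]
    print(segment_by_switch(switch))
```

At the 50 Hz transmission rate, a gesture of approximately 1.5 s corresponds to about 75 samples per segment.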
The resistance time series of the six sensor layers and the switch signal were transmitted to a PC and saved as text files for subsequent processing in MATLAB 9.4 (MathWorks, United States). The gesture motions segmented with the switch signal were identified by inspection, considering the sequence of gesture motions. Then, we low-pass-filtered the sensor signals using a 4th-order Butterworth filter with a cutoff frequency of 2 Hz to suppress high-frequency noise and information not related to motion. In addition, we normalized the six sensor-layer signals from each gesture between zero and one to magnify the signal variation. Then, we employed the DTW function available in MATLAB.
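The normalization step can be sketched as a min-max scaling applied to each sensor-layer signal of a segmented gesture. The following Python example is a minimal sketch; the function name and the handling of a constant signal (mapped to zeros) are assumptions, and the low-pass filtering step is not shown.

```python
def min_max_normalize(signal):
    """Scale a 1-D signal linearly so that its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A constant signal carries no variation; return zeros (assumed convention).
        return [0.0 for _ in signal]
    return [(x - lo) / (hi - lo) for x in signal]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 3.0]))
```

Scaling every layer to the same range keeps a layer with a large absolute resistance change from dominating the distance computed in the subsequent comparison.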

Results and Discussion
The gesture classification results for the 10 subjects are shown as a confusion matrix in Fig. 6, where the rows and columns correspond to the classified and real gesture classes, respectively. Therefore, the cells along the diagonal correspond to correctly recognized gestures. The additional seventh row shows the true positive rate, or recall (top), and false negative rate (bottom) for each gesture class, and the bottom-right cell shows the total accuracy (top) and overall error (bottom).
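The quantities reported in the confusion matrix can be computed as in the following Python sketch, which follows the layout described above (rows: classified class, columns: real class). The 2 × 2 matrix in the usage example is made-up illustrative data, not a result from this study, and the function names are hypothetical.

```python
def recall_per_class(matrix):
    """True positive rate per real class; matrix[i][j] counts real class j classified as i."""
    n = len(matrix)
    recalls = []
    for j in range(n):
        # All motions whose real class is j lie in column j.
        column_total = sum(matrix[i][j] for i in range(n))
        recalls.append(matrix[j][j] / column_total if column_total else 0.0)
    return recalls


def overall_accuracy(matrix):
    """Fraction of all motions that lie on the diagonal (correctly recognized)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    example = [[8, 1],
               [2, 9]]  # illustrative counts only
    print(recall_per_class(example), overall_accuracy(example))
```

The false negative rate of each class is one minus its recall, and the overall error is one minus the overall accuracy.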
Although the classification performance pooled over all subjects was not satisfactory, the results for each subject show a higher performance, as seen in the classification accuracy listed in Table 1. For six subjects, all the gestures were correctly recognized (100% accuracy). For three subjects, the accuracy was 98.6%, indicating the misidentification of only one gesture, and the lowest accuracy of 97.2% was obtained from subject 2. For subject 2, whose confusion matrix is shown in Fig. 7, one 'up' gesture was recognized as 'down', and one 'click' gesture was recognized as 'up'.
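The per-subject accuracies are consistent with the number of motions recorded per subject (six gestures with 12 motions each, i.e., 72 motions). The short Python check below assumes that all 72 motions of a subject were scored; the function name is hypothetical.

```python
GESTURES = 6
MOTIONS_PER_GESTURE = 12


def accuracy_percent(errors, total=GESTURES * MOTIONS_PER_GESTURE):
    """Accuracy in percent for a given number of misclassified motions."""
    return 100.0 * (total - errors) / total


if __name__ == "__main__":
    for errors in (0, 1, 2):
        print(errors, round(accuracy_percent(errors), 1))
```

One error out of 72 motions gives 98.6%, and the two errors of subject 2 give 97.2%.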

Conclusions
We propose a method of recognizing hand gestures for HCI using e-textile sensors resembling a goniometer. Each sensor consists of a double-layered structure with complementary resistance characteristics. For gesture classification, template matching was applied to the DTW results, which provides high performance with inexpensive computations for dynamic signals. The functional feasibility of the proposed method was verified using data from 10 subjects performing six gestures.
The overall gesture recognition accuracy for all subjects was 85.4%, with the 'click' gesture showing the highest accuracy (95.8%) and the 'up' gesture the lowest accuracy (64.2%). As these results may be related to the interindividual variation in anatomical shape, future studies will be devoted to mitigating this variation through a calibration process.
Unlike the overall performance, the accuracy for each subject was high: six subjects achieved perfect classification (100% accuracy), three subjects achieved 98.6% accuracy, and one subject achieved 97.2% accuracy. In future studies, we intend to further improve the performance of the method by using more sophisticated, albeit complex, classifiers such as statistical classifiers and artificial neural networks.
In this study, the hand gesture recognition method was tested on healthy subjects. The recognition of various other gestures would also be possible based on other joints, including the finger, knee, and hip joints. In addition, the gesture recognition method is expected to be used not only for HCI but also for various applications such as upper-limb rehabilitation and gait analysis. For example, it can be used for daily life support and rehabilitation exercises by classifying daily activities such as drinking and reaching tasks.