Strike Activity Detection and Recognition Using Inertial Measurement Unit towards Kendo Skill Improvement Support System

1 Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
2 JSPS Research Fellowships for Young Scientists, Tokyo 102-0083, Japan
3 Department of Advanced Information Technology, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, Fukuoka 819-0395, Japan
4 JST Precursory Research Embryonic Science and Technology (PRESTO), Tokyo 102-0076, Japan


Introduction
In recent years, the use of information and communication technology (ICT) in sports has been actively promoted. ICT enables athletes and coaches to quantitatively analyze numerical data on game content and training performance. The utilization of ICT is therefore expected to enhance the training process, increase tactical patterns in games, and improve athletes' performance in various sports. (1) In rugby, inertial measurement units (IMUs) and ...
In this paper, we propose a new method for detecting and recognizing strike activities in practice swings using IMUs. We increased the number of participants to 14 (seven kendo-experienced and seven inexperienced persons) and used a sensor data set of strike activities in practice swings obtained from these 14 participants. We attached four IMUs to the participants' body (at the right wrist and waist) and shinai (at the sword guard, called the tsuba, and at the tip, called the saki-gawa) during our experiment and obtained sensor data for each position. To detect the strike activity, we calculated the dynamic time warping (DTW) distance between presegmented training data and the time series data, and detected the strike activity sections. Furthermore, we classified five types of strike activity (Center-Men, Right-Men, Left-Men, Dō, and Kote) using a machine learning method.
The rest of this paper is organized as follows. In Sect. 2, we review the existing work related to this paper. In Sect. 3, we present our proposed detection and recognition methods. In Sect. 4, we describe the experiment overview. The evaluation and experimental results are described in Sect. 5. Finally, Sect. 6 concludes this paper.

Related Work
In various sports, activity analysis and monitoring systems using ICT are expected to have many benefits, such as improving the performance of athletes and reducing the risk of injury. Several methods have been proposed for this purpose, and they can be divided into three categories: 1) camera-based methods that analyze images of sports activity; 2) global navigation satellite system (GNSS)-based methods that analyze tracking information obtained by attaching a GNSS device to a player's body; and 3) sensor-based methods that analyze sensor data obtained by attaching IMUs to the body or sports equipment. (14)(15)(16)
The camera-based methods do not require the attachment of devices to the body, so sports activity can be measured without the player feeling any discomfort. For this reason, camera-based methods are the most frequently used in sports analysis. In particular, optical motion capture systems are used to measure high-speed sports activity, as is the direct linear transformation (DLT) method using a digital video camera. (17,18) These methods are frequently used for more accurate measurement. However, a system using multiple cameras becomes large-scale and expensive, whereas a single camera has a limited measurement range and therefore leaves blind spots that cannot be measured.
GNSS-based methods require the attachment of a device to the body to track the player. These methods are frequently used in sports where games are played in large outdoor fields such as American football (19) and soccer. (20) Catapult PLAYERTEK (21) can analyze players' mileage, play area, and accuracy of formation from the data obtained by tracking their location information in the game, and it can also be used for tactics. However, tracking is limited to outdoor sports because tracking accuracy is significantly reduced indoors.
Sensor-based methods require the attachment of IMUs to the body to obtain the sensor data of player's motion. The IMUs can obtain the motion information of players both indoors and outdoors with high accuracy. However, in these methods, it is difficult to intuitively understand the player's motion only by using the obtained data. Thus, there are many studies that analyze activity during sports by using IMUs and provide effective feedback to the players so that they can practice efficiently. Blank et al. attached inertial sensors to table tennis rackets and recognized eight different basic stroke types using data collected from 10 amateur and professional players. (22) The SONY Smart Tennis Sensor (23) can measure the hitting point, shot type, swing speed of the racket, and so forth by attaching the sensor device to the racket. James et al. analyzed Men, which is the most basic strike activity of kendo, by attaching an accelerometer at the tip of a wooden sword. (24) As a result, they reported that the accelerometer can quantitatively evaluate the difference in swing characteristics between beginners and professionals. However, they did not measure other strike activities and did not recognize activities using machine learning.
In our proposed method, the strike activities are analyzed using the sensor data obtained from IMUs. There are three reasons for employing this approach. First, this support system assumes that the player uses it alone. A camera-based method using a video camera or a smartphone camera is not easy for players to use because it takes time to set the camera and adjust the angle of view. Second, kendo is an indoor sport. Therefore, GNSS-based methods, such as using GPS, are not practical. Lastly, with the development of wearable computing in recent years, various wearable sensors such as wristwatch, glasses, and belt types have been developed. Moreover, since many of them are equipped with an IMU, we can expect further expansion of the support system in the future.

Kendo skill improvement support system
In this section, we describe the kendo skill improvement support system that enables kendo players to practice effectively even alone.

Overview
The final goal of this support system is to recognize strike activities from practice swings and to provide feedback, using only a few IMUs attached to the body and shinai. With this, we aim to improve the correctness of players' strike activities. As shown in Fig. 1, the support system is therefore composed of the following three steps: 1) detecting a strike activity section from the obtained sensor data; 2) recognizing the type of strike activity; 3) evaluating the strike activity and providing feedback. In this paper, we focus on the detection and recognition of strike activities from practice swings.

Target activities
Strike activities are only made towards specified target positions, all of which are protected by armor. Figure 2 shows the striking positions of five strike activities, the target activities in this study. The names of the strike activities are Men (Center-Men, Right-Men, and Left-Men), Dō, and Kote, which are explained as follows.

• Men (Center-Men, Right-Men, and Left-Men)
Men is the most basic strike activity whereby the player strikes the opponent's head with the shinai. Basically, the player strikes the center of the head (Center-Men), but there is also a strike to the upper right (Right-Men) or upper left (Left-Men) of the head.

• Dō
Dō is the strike activity whereby the player strikes the opponent's torso with the shinai. In principle, striking the right side of the torso is important in kendo, so we exclude strikes to the left side of the torso.

• Kote
Kote is the strike activity whereby the player strikes the opponent's wrist with the shinai. In principle, striking the right wrist is important in kendo, so we exclude strikes to the left wrist.

Tsuki is also one of the basic strike activities, whereby the player thrusts at the opponent's throat with the shinai. However, since an incorrectly performed Tsuki could cause serious injury to the opponent's neck, Tsuki in practice and competitions is often restricted to senior graded kendo players. Therefore, we exclude it from this study. Figure 3 shows the movements of the strike activities Center-Men, Dō, and Kote. The movements of Men and Kote have some similarity.

Sensor type and position
The IMUs used in this study are MPU-9250 units, a popular IMU by InvenSense, embedded on SenStick. (25) SenStick is a tiny multi-sensing board developed for recognizing human activities, as shown in Fig. 4, which has eight types of typical sensor (accelerometer, gyroscope, magnetic, temperature, humidity, pressure, light, and UV). It can record all the sensing data to on-board memory at a sampling rate of up to 100 Hz, and it can send data to a smartphone or a PC via Bluetooth Low Energy. SenStick has been used in various studies on activity recognition. (8,26,27) We use the inertial sensor (three-axis accelerometer and gyroscope) data from the IMUs. We set the sampling rate of the IMUs to 100 Hz to accurately measure the ...
Figure 6 shows an example of the acceleration and gyroscope waveforms measured by SenStick attached to the right wrist.

Strike activity detection
We detect sections of the five types of strike activity from the acceleration time series data obtained from the IMUs attached to the participant on the basis of the DTW distance. (28) DTW is an algorithm for calculating the similarity (distance) between two time series. The DTW distance $D(T_1, T_2)$ of time series data $T_1 = (a_1, \ldots, a_n)$ and $T_2 = (b_1, \ldots, b_m)$ of lengths $n$ and $m$ is defined by the standard recurrence
$$D(T_1, T_2) = f(n, m) \tag{1}$$
$$f(i, j) = |a_i - b_j| + \min\{f(i-1, j),\ f(i, j-1),\ f(i-1, j-1)\} \tag{2}$$
$$f(0, 0) = 0, \quad f(i, 0) = f(0, j) = \infty \tag{3}$$
In our method, the DTW distances between subsequences of the acceleration time series data containing strike activities and the presegmented training data of the strike activity are calculated. We detect the strike activity sections from the time series data on the basis of these DTW distances. In kendo, the length and partial speed of the strike movement can differ between competitors, depending on, for example, the number of years of experience and gender. DTW can calculate the distance between two time series while taking such differences into account.
We first calculate the composite acceleration from the three-axis time series data using Eq. (4), where $x$, $y$, and $z$ are the accelerations along the three axes.
$$\mathrm{Composite} = \sqrt{x^2 + y^2 + z^2} \tag{4}$$
The composite acceleration data are divided into subsequences using a sliding window of 1.28 s (128 samples) with 75% overlap. Next, the DTW distances between all subsequences and the training data are calculated. Here, the training data is the composite acceleration during a strike activity, presegmented manually. After the DTW distances are normalized, local minimum points with a distance less than or equal to ε are detected. The subsequences corresponding to these local minimum points are extracted, and each is extended by 0.64 s before and after, so that a total of 2.56 s of time series data is detected as a strike activity section. Figure 7 shows a summary of the detection method and the waveform of the DTW distances between the composite acceleration data including nine strike activities of Center-Men and the training data of the same player's Center-Men.
Figure 8 shows the composite acceleration data and the detection result. The ranges in blue in Fig. 8 indicate the strike activity sections segmented manually, and those in red indicate the subsequences detected as the strike activity sections by the proposed method.
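The detection steps described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names are hypothetical, the local cost is the absolute difference, the normalization is assumed to be min-max scaling (the text does not specify it), and the default threshold `eps=0.3` is an arbitrary placeholder for ε.

```python
import math

def composite(ax, ay, az):
    """Composite acceleration sqrt(x^2 + y^2 + z^2), sample by sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]

def dtw_distance(a, b):
    """Textbook O(n*m) DTW distance with |a_i - b_j| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)      # row 0: f(0,0) = 0, f(0,j) = inf
    for x in a:
        cur = [inf] * (len(b) + 1)     # f(i,0) = inf
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[-1]

def detect_strikes(signal, template, win=128, overlap=0.75, eps=0.3, pad=64):
    """Return (start, end) sample ranges detected as strike sections.

    win=128 and overlap=0.75 follow the text (1.28 s at 100 Hz);
    pad=64 extends each hit by 0.64 s on both sides (2.56 s in total).
    Min-max normalization and eps are assumptions of this sketch.
    """
    inf = float("inf")
    step = int(win * (1 - overlap))    # 32 samples
    starts = list(range(0, len(signal) - win + 1, step))
    if not starts:
        return []
    dists = [dtw_distance(signal[s:s + win], template) for s in starts]
    lo, hi = min(dists), max(dists)
    norm = [(d - lo) / (hi - lo) if hi > lo else 0.0 for d in dists]
    sections = []
    for k, d in enumerate(norm):
        left = norm[k - 1] if k > 0 else inf
        right = norm[k + 1] if k + 1 < len(norm) else inf
        # Local minimum at or below the threshold.
        if d <= eps and d < left and d <= right:
            s = starts[k]
            sections.append((max(0, s - pad), min(len(signal), s + win + pad)))
    return sections
```

The plain recurrence costs O(nm) per window, which is consistent with the computational cost noted later in the paper as an obstacle to real-time detection.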

Strike activity recognition
We recognized five types of activity using the inertial sensor data obtained from the IMUs attached to the participant: 1) Center-Men, 2) Right-Men, 3) Left-Men, 4) Dō, and 5) Kote.
In the data preprocessing step, a median filter and a third-order Butterworth low-pass filter with a 20 Hz cutoff are first applied to the three-axis acceleration (Acc-XYZ) and gyroscope (Gyro-XYZ) signals to remove noise such as spikes. Since gravity and motion components are mixed in the denoised acceleration signal (Acc-XYZ), the gravity component (GravityAcc-XYZ) and motion ... Table 1 are extracted.
In the feature extraction process, the 17 signals obtained in the preprocessing step are divided using a sliding window of 2.56 s (256 samples) with 25% overlap. The time-domain and frequency-domain features shown in Table 2 are calculated from each 2.56 s window. As a result, 561 features are calculated from the data of each window. To the best of our knowledge, these cover all the kinds of features that can be generated from the acceleration and gyroscope signals of the IMU, considering that the strike activity is fast and that an IMU is also attached to the shinai. (29)(30)(31)
In the strike activity recognition model generation process, the 561 features are standardized and a feature selection method is applied to reduce the number of features and optimize the ...
Strike activity recognition models are generated on the basis of a machine learning algorithm using the selected features as input data. In this process, Scikit-learn, (32) a machine learning library, was used for standardizing and selecting features and generating strike activity recognition models.
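The windowing and standardization steps above can be illustrated with a small standard-library Python sketch. It is only an illustration: the function names are hypothetical, and the five statistics below are example time-domain features chosen for this sketch, not the feature set of Table 2 (which is not reproduced here).

```python
import statistics

def sliding_windows(signal, win=256, overlap=0.25):
    """Split a signal into 2.56 s windows (256 samples at 100 Hz), 25% overlap."""
    step = int(win * (1 - overlap))    # 192 samples between window starts
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]

def time_features(window):
    """A few illustrative time-domain features for one window."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
        "energy": sum(v * v for v in window) / len(window),
    }

def standardize(values):
    """Scale one feature column to zero mean and unit variance."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
```

In practice each feature column would be standardized with statistics computed on the training folds only, so that no information from the test data leaks into the model.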

Experiment
In this section, we describe the experiment conducted to collect the actual activity data from participants to evaluate the proposed method. We recruited 14 participants (13 male and one female, age 18.9 ± 3.9), which include seven kendo-experienced and seven inexperienced persons. There are four 1st dan and two 2nd dan grade holders in the group of experienced persons, where the dan grade implies that the basic skills have been mastered.
As target labels for strike activity detection and recognition, we selected the following five types of activity: 1) Center-Men, 2) Right-Men, 3) Left-Men, 4) Dō, and 5) Kote.

Experimental setup for data collection
Four IMUs were attached to the participants' right wrist, waist, shinai tsuba, and shinai saki-gawa, and inertial sensor (three-axis accelerometer and gyroscope) data were recorded at a sampling frequency of 100 Hz during the experiment. Kendo-inexperienced participants received brief guidance from an experienced person before carrying out the five types of strike activity. All the participants performed 10 sets of each type of strike activity after holding a static posture; that is, for each participant we collected five time series, one per type, each including 10 sets of strike activities. As a result, we created a data set comprising 700 sets of strike activities (14 participants × 5 types × 10 sets). In addition to recording the participants' sensor data, all strike activities were captured on video, and the training data were manually segmented on the basis of the video.

Strike activity detection
In this experiment, we considered two cases for the detection method of strike activity: 1) comparison of similarity measurement methods and 2) comparison of detection of target activities. The following subsections describe each case.

Comparison of similarity measurement methods
To evaluate the performance of the proposed method, we compared the method using DTW with other similarity measurement methods, namely, Euclidean distance and cosine similarity. Given two time series $T_1 = (a_1, \ldots, a_n)$ and $T_2 = (b_1, \ldots, b_n)$ of equal length $n$, these are defined in the standard way as
$$d_{\mathrm{Euclid}}(T_1, T_2) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$
$$\cos(T_1, T_2) = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
We used the acceleration time series data of the five types of strike activity of each participant obtained from the IMU on their right wrist. Each time series includes the 10 sets of the same type of strike activity. One set was extracted from the time series as training data. The subsequences of the same participant's time series including the remaining nine sets were used as input data. Each set served as the training data in turn (cross-validation). In the methods using Euclidean distance and cosine similarity, the two time series need to have the same length. Therefore, the length of both the training data and the subsequences was set to 1.28 s (128 samples).
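For reference, the two baseline measures compared above can be written as follows. This is a minimal standard-library sketch with hypothetical function names, using the textbook definitions; it is not the code used in the experiments.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
```

Both measures compare samples index by index, so a small time shift is penalized: `euclidean_distance([0, 1, 0, 0], [0, 0, 1, 0])` is √2, whereas a DTW alignment with absolute-difference cost can warp one peak onto the other at zero cost. This is one plausible reason for the advantage of DTW reported in the results.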

Comparison of detection of target activities
In order to confirm the versatility of the proposed method, we considered four types of evaluation method: A) detection of the same activity for the same participant, B) detection of a different activity for the same participant, C) detection of the same activity for a different participant, and D) detection of a different activity for a different participant. The lengths of the training data and subsequences were respectively set to 3.01 s (301 samples) and 1.28 s (128 samples) in each case. A) to D) are explained as follows.

A) Detection of the same activity for the same participant
Here, the same type of strike activity of the same participant as the training data was used as the input data. Out of the 10 sets of strike activities for each participant, one set was extracted as the training data from the time series data. The subsequences of time series data including the remaining nine sets of the same participant were used as the input data. Each set was used as the training data by cross-validation. Furthermore, we compared the dependence of the detection accuracy on the IMU attachment position.

B) Detection of a different activity for the same participant
Here, a different type of strike activity of the same participant from the training data was used as the input data. Out of the 10 sets of strike activities, one set was extracted as the training data from the time series data. The subsequences of time series data including 10 sets of the different type of strike activity of the same participant from the training data were used as the input data.

C) Detection of the same activity for a different participant
Here, the same type of strike activity of a different participant from the training data was used as the input data. Two base participants (one kendo-experienced and one inexperienced) were chosen, and out of the 10 sets of strike activities, one set was extracted as the training data from the time series data. The subsequences of time series data including 10 sets of the same type of strike activity of the different participant from the training data were used as the input data.

D) Detection of a different activity for a different participant
Here, a different type of strike activity of a different participant from the training data was used as the input data. Two base participants (one kendo-experienced and one inexperienced) were chosen, and out of their 10 sets of strike activities, one set was extracted as the training data from the time series data. The subsequences of time series data including 10 sets of the different type of strike activity of the different participant from the training data were used as the input data.

Strike activity recognition
In this experiment, we considered three cases for the recognition method of strike activity: 1) comparison of three machine learning algorithms, 2) comparison of each combination of IMU positions, and 3) evaluation of the generalization performance of the models.
We use machine learning to recognize strike activities. To compare the accuracy, we adopted three machine learning algorithms: random forest (RF), support vector machine (SVM), and neural network (NN).
Furthermore, we compared the dependence of the accuracy on the IMU attachment position. Our final goal is to support kendo players with a small number of IMUs attached to the wrist and/or shinai. Therefore, we compared the accuracy of combinations of the IMUs attached to the right wrist and shinai. The data of one IMU has up to 561 features, and a feature selection method was applied to optimize model performance.
To confirm the generalization performance, we verified two cases: person-dependent (PD) and person-independent (PI). In the PD case, nine sets of data recorded from a particular participant were used for training, and the remaining set was used for testing; each set served as the test data once in cross-validation. In the PI case, we performed leave-one-person-out cross-validation, where in each fold, 13 persons were used for training and the remaining one was used for testing.
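The two evaluation protocols can be made concrete with a short sketch of how the folds are formed. The function names and the integer participant and set identifiers are hypothetical; only the fold structure (nine-versus-one sets in the PD case, 13-versus-one persons in the PI case) is taken from the text.

```python
def person_dependent_folds(n_sets=10):
    """PD case: for one participant, hold out each set once, train on the rest."""
    for test_set in range(n_sets):
        train_sets = [s for s in range(n_sets) if s != test_set]
        yield train_sets, [test_set]

def leave_one_person_out_folds(participants):
    """PI case: hold out every participant once, train on all the others."""
    for held_out in participants:
        train_people = [p for p in participants if p != held_out]
        yield train_people, [held_out]

# Example with 14 hypothetical participant IDs.
pi_folds = list(leave_one_person_out_folds(list(range(14))))
pd_folds = list(person_dependent_folds())
```

The key difference is that a PI test fold contains no data from any person seen in training, so it measures how well a model transfers to a new player.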

Strike activity detection
We discuss the detection results through 1) a comparison of similarity measurement methods and 2) a comparison of detection target activities. For all detection results, we also compared the detection accuracy for participants with and without kendo experience.

Comparison of similarity measurement methods
Table 3 shows the detection accuracy for each type of strike activity using DTW and the other similarity measurement methods. Figure 9 shows the detection accuracy for participants with and without kendo experience. The horizontal axis shows average accuracy based on participants' kendo experience. Similarity measurement methods are differentiated by color. DTW achieved the highest accuracy, whereas Euclidean distance gave the lowest. These results confirm that DTW is effective as a similarity measurement method for detecting strike activities accurately.

Comparison of detection target activities
In the case of detecting the same activity for the same participant, the detection accuracy results are shown in Table 4. We compared the detection accuracy by IMU attachment position for participants with and without kendo experience. First, looking at the average over all participants, the highest detection accuracy was obtained when the IMU was attached to the right wrist, where our proposed method achieved 94.2% accuracy (F-measure). When the IMU was attached to the waist, we achieved 93.0% accuracy. The lowest detection accuracy was 86.8% when the IMU was attached to the shinai tsuba and saki-gawa.
The highest accuracy among the kendo-experienced participants was obtained when the IMU was attached to the shinai saki-gawa, where we achieved 99.1% accuracy. The accuracy decreased in the order of shinai tsuba, waist, and right wrist, and all the accuracies exceeded 95.0% (F-measure). For the inexperienced participants, the highest accuracy, 92.8%, was obtained when the IMU was attached to the right wrist. The accuracy decreased in the order of waist, shinai tsuba, and shinai saki-gawa, which is the reverse of the order for the experienced participants. For the shinai saki-gawa, there was a 24.5% difference in detection accuracy (F-measure) between the experienced and inexperienced participants. Comparing Figs. 10(a) and 10(b), we confirmed that the nine peaks are clear in the DTW distance waveform of the experienced participant, whereas the waveform of the inexperienced participant is distorted. This result shows that the experienced participants can perform 10 sets of strike activities with almost constant movements, whereas the inexperienced participants perform different movements for each strike activity. For the right wrist, the difference in detection accuracy between the experienced and inexperienced participants was the smallest, only 2.9%. Therefore, we consider the right wrist to be the most suitable attachment position for detecting the strike activities in our system. On the basis of these results, the following discussion evaluates the accuracy using data from the IMU attached to the right wrist.
For the case of detecting a different activity for the same participant, the results of the detection accuracy are shown in Table 5. For the results of all participants, the average detection accuracy was 89.9%, and it was confirmed that the accuracy decreased slightly when the data of the different type of strike activity was used as input data.
For the case of detecting the same activity for a different participant, the results of the detection accuracy are shown in Table 6, where the data of the kendo-experienced and kendo-inexperienced persons are used as training data. It was confirmed that when the data of the experienced person was used as training data, high accuracy was obtained, but when the data of the inexperienced person was used as training data, the accuracy decreased considerably.
For the case of detecting a different activity for a different participant, the results of the detection accuracy are shown in Table 7. We confirmed that when using the kendo-experienced person's data as training data, even different types of strike activity of different participants were detected with high accuracy. We therefore consider that training data taken from kendo-experienced persons has high generalization performance in our proposed method.
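The detection accuracies above are reported as F-measures. As a reminder of how such a score is obtained from detected and manually segmented sections, the following sketch counts matches and computes the F-measure. The matching rule (a detection is correct if it overlaps a not-yet-matched ground-truth section) is an assumption of this sketch, since the text does not state the criterion used; the function names are hypothetical.

```python
def count_matches(detected, truth):
    """Greedy one-to-one matching of (start, end) sections by overlap.

    Returns (true positives, false positives, false negatives).
    The overlap criterion is an assumption of this sketch.
    """
    unmatched = list(truth)
    tp = 0
    for d_start, d_end in detected:
        for t in unmatched:
            if d_start < t[1] and t[0] < d_end:   # the two sections overlap
                unmatched.remove(t)
                tp += 1
                break
    return tp, len(detected) - tp, len(unmatched)

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, if eight of nine strikes are found and one spurious section is reported, precision and recall are both 8/9 and the F-measure is about 0.889.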

Strike activity recognition
We discuss recognition results through 1) a comparison of three machine learning algorithms, 2) a comparison of combinations of IMU positions, and 3) the evaluation of the generalization performance characteristics of the models.

Comparison of three machine learning algorithms
We compared the performance characteristics of three machine learning algorithms: RF, SVM, and NN. Figure 11 shows the classification accuracy (F-measure) of strike activities in the PD case using RF, SVM, and NN, averaged over the 14 participants. The result indicates that RF achieved the highest accuracy, whereas NN showed the lowest. Therefore, in the PD case, we confirmed that RF is effective as a machine learning algorithm for recognizing strike activities accurately.

Comparison of combinations of IMU positions
Figure 11 shows that when only one IMU was attached, an accuracy of 89.5% (F-measure) was achieved with the right wrist (RW) and an accuracy of 82.3% was achieved with the shinai tsuba (ST). For the combination of two IMUs, the highest recognition accuracy of 91.8% was achieved with the combination of RW and ST. For the combinations of three or more IMUs, the recognition accuracy was highest, at 93.1%, when IMUs were attached to all four positions [RW, the waist (W), ST, and the shinai saki-gawa (SS)]. Therefore, in the PD case, when the player wears only one IMU, the wrist is optimal, and combining sensor data from multiple positions further improves the recognition accuracy.
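To make the size of this comparison concrete, the candidate position combinations can be enumerated as below. This is only a sketch: it lists every non-empty subset of the four positions, which may be more than the subsets actually evaluated, and it assumes that the per-IMU features (up to 561 each) are simply concatenated before feature selection.

```python
from itertools import combinations

POSITIONS = ["RW", "W", "ST", "SS"]   # right wrist, waist, shinai tsuba, saki-gawa
FEATURES_PER_IMU = 561                # upper bound per IMU, from the text

def position_combinations(positions=POSITIONS):
    """All non-empty subsets of the attachment positions, smallest first."""
    combos = []
    for k in range(1, len(positions) + 1):
        combos.extend(combinations(positions, k))
    return combos

def feature_count(combo):
    """Feature-vector length before selection, assuming concatenation."""
    return FEATURES_PER_IMU * len(combo)

combos = position_combinations()
```

With four positions there are 15 candidate combinations, and the all-position case has up to 2244 features before selection, which is one reason feature selection matters here.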

Evaluation of generalization performance of the models
We evaluated the recognition accuracy by leave-one-person-out cross-validation to confirm the generalization performance. Figure 12 shows the classification accuracy (F-measure) of strike activities based on leave-one-person-out cross-validation using the three machine learning algorithms, averaged over the 14 participants. The horizontal axis shows combinations of IMU positions. Machine learning algorithms are differentiated by color. The accuracy of NN was often higher than those of the other algorithms. When only one IMU was attached, an accuracy of 43.2% was achieved with the RW and an accuracy of 52.8% was achieved with the ST. The combination of RW, ST, and SS achieved the highest accuracy of 54.9%. Figures 13(a) and 13(b) respectively show the recognition accuracy as a function of the number of features and the five most important features at each position. Figure 13(a) shows that the number of effective features depends on the position. Furthermore, from Fig. 13(b), we confirmed that the features related to the gravity components in the X-axis direction are important overall.
Moreover, when three-type classification was performed by unifying Center-Men, Right-Men, and Left-Men into Men, as shown in Fig. 14, the accuracy of the combination of RW and SS increased to 69.3%. Figure 15 shows both the five-type and three-type confusion matrices for the combination of RW and ST. From these results, we confirmed that many false recognitions occurred between Men and Kote. We believe that the reason for the decrease in accuracy was individual differences in the strike activities of the participants. In the future, we plan to refine the proposed method to improve accuracy.
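The unification of the three Men variants and the resulting confusion matrix can be sketched as follows. The helper names and the toy labels are hypothetical and serve only to show the mechanics; the same label mapping applies whether labels are merged before training or predictions are merged afterwards.

```python
MERGE = {
    "Center-Men": "Men", "Right-Men": "Men", "Left-Men": "Men",
    "Dō": "Dō", "Kote": "Kote",
}

def merge_labels(labels):
    """Map the five strike types onto the three unified types."""
    return [MERGE[label] for label in labels]

def confusion_matrix(true, pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Toy example (made-up labels, not experimental data).
true5 = ["Center-Men", "Right-Men", "Kote", "Dō"]
pred5 = ["Left-Men", "Right-Men", "Center-Men", "Dō"]
cm3 = confusion_matrix(merge_labels(true5), merge_labels(pred5),
                       ["Men", "Dō", "Kote"])
```

In this toy case the Center-Men sample predicted as Left-Men becomes correct after merging, while the Kote sample predicted as a Men variant remains an error, mirroring the Men and Kote confusion described above.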

Conclusion
In this paper, we focused on kendo, a martial art in Japan, and proposed strike activity detection and recognition methods using IMUs towards a kendo skill improvement support system. To confirm the effectiveness of the proposed method, we collected inertial sensor data of strike activities from participants including kendo-experienced and inexperienced people with four IMUs attached to the right wrist, waist, shinai tsuba, and shinai saki-gawa.
First, we detected five strike activities based on the DTW distance from the acceleration time series data. When we used the training data of the same participant, the strike activities were detected with an accuracy of 89.9% (F-measure). When we used the training data of a participant with a lot of kendo experience, the strike activities were detected with an accuracy of 95.0% even from the input data of other participants. However, the conventional DTW requires a large amount of calculation, and it is difficult to detect in real time with the proposed method. Therefore, it is necessary to speed up the calculation process of the method without lowering the accuracy.
Next, we recognized five types (Center-Men, Right-Men, Left-Men, Dō, and Kote) and three types (Men, Dō, and Kote) of strike activity. In the PD case, we achieved an accuracy of 89.5% with five types when training and testing on the same participant's data from the right wrist only. We achieved an accuracy of 91.8% when IMUs were attached to the right wrist and shinai tsuba. We clarified that the classification accuracy was further improved by combining sensor data from multiple positions. In leave-one-person-out cross-validation over the 14 participants, performed to confirm the generalization performance, we achieved an accuracy of 54.9% using three IMUs (right wrist, shinai tsuba, and shinai saki-gawa). In the three-type recognition, the accuracy of the combination of right wrist and shinai saki-gawa increased to 69.3%. Thus, on the basis of only personal data, we can recognize the strike activities with high accuracy. However, owing to individual differences in activity, the accuracy of leave-one-person-out cross-validation decreases.
The attached IMU is expected to have little effect on the swing movement of the bamboo sword owing to its small mass. Since smartwatches have already become widespread, we consider that a kendo improvement support system can be realized if high detection and recognition accuracies can be obtained only with the IMU on the wrist.
As part of future research, we aim to detect strike activities in real time and improve the accuracy of recognition in the PI cases by improving the proposed method. Furthermore, we aim to realize the kendo skill improvement support system by implementing the evaluation function of the strike activity and feedback mechanism.