Smart Device Monitoring System Based on Multi-type Inertial Sensor Machine Learning

1 Intelligence and Automation in Construction Fujian Province Higher-educational Engineering Research Centre, College of Civil Engineering, Huaqiao University, Xiamen 361021, China
2 Department of Automatic Control Engineering, Feng Chia University, Taichung 40724, Taiwan
3 Department of Aeronautical Engineering, Chaoyang University of Technology, Taichung 413310, Taiwan
4 School of Computers, Jiang Xi University of Traditional Chinese Medicine, Nan Chang, Jiang Xi 330004, China
5 College of Literature, Jimei University, Xiamen 361021, China

Construction activity recognition can be improved using data fusion from multiple inertial sensors such as accelerometers and gyroscopes, yet the number of accelerometers and gyroscopes and their optimal placement for combination need empirical determination. We considered the optimal combination of these two types of sensors placed on different parts of a construction worker for identifying construction activities through machine learning. The waist, arm, and wrist were equipped with data acquisition units to simultaneously acquire acceleration and angular velocity data for multiple sensor locations. A system for recognizing complex construction activities was developed on the basis of an accelerometer and gyroscope (A+G) synergy at multiple sensor locations. Results show that the A+G combination dataset at the wrist had the best activity recognition among the sensor configurations when the raw data came from a single sensor location. The results of comparing a single sensor location, two sensor locations, and three sensor locations indicate that combination with three sensor locations produced the best accuracy.

Introduction
Traditional activity recognition technologies use computer vision approaches to collect human patterns through image acquisition devices and use image processing techniques to analyze the information. (1) Accelerometers and gyroscopes have been widely used for human activity recognition, including the recognition of common daily activities such as lying down, sitting, standing, and walking. (2) Low productivity and high costs have undermined the development of the construction industry. The average annual growth rate of labor productivity in construction has been lower than the national average for labor productivity across all industries in China over the past ten years. (3) An effective way to manage and improve worker performance is to recognize and monitor workers' activities, analyze operations in real time, and dynamically optimize workflows on site. (4) The implementation of each construction task depends on the collaboration of various body parts of construction workers. All worker activities on a construction site can be decomposed into collective movements of multiple limbs over time, revealing the complexity of construction activities. (4) An inertial sensor used for human activity recognition works on the principle of inertia, but the signal data generated by the two types of inertial sensors, accelerometers and gyroscopes, are different. (5) Accelerometers measure acceleration caused by motion or gravity, and gyroscopes measure the rate of rotation of a device by detecting the roll, pitch, and yaw about the X, Y, and Z axes of the device in motion. (6) Human activity recognition techniques can be used to measure actual human activity through data differences between accelerometers and gyroscopes, and are thus beneficial for the recognition of complex construction activities, and the motions and locations of different body parts can be recognized by inertial sensors. (7)
The aim of this study is to recognize complex construction activities based on a combination of accelerometers and gyroscopes through a series of experiments on construction tasks. Considering the difference and commonality between activities incorporating body parts and inertial sensors, the waist, arm, and wrist were equipped with data acquisition units to simultaneously acquire acceleration and angular velocity data for multiple sensor locations. We propose the use of a combination of acceleration and angular velocity data at multiple sensor locations to increase the performance of complex activity recognition and thus provide technical support for the automated management of worker performance. We constructed a classifier based on the characteristics of the input feature vectors and mapped an unknown activity sample to a class in a given activity category using classification algorithms. Four machine learning algorithms, namely, k nearest neighbor (KNN), support vector machine (SVM), neural network (NN), and random forest (RF) algorithms, were used in this study to train the classifier model, where the raw data came from inertial sensor data acquired using a 50 Hz sampling frequency. We experimentally investigated which machine learning algorithms provided the most accurate classification of complex construction activities through consistent validation methods. All model training and predictions were performed in the R language. The ultimate goal of applying a variety of classification algorithms was to explore the best performance of the classifier model and the best configuration of sensor types and locations in practice.

Latest Developments in Machine Learning for Human Activity Monitoring
Gong et al. (8) developed a video interpretation model based on computer vision technology to automatically recognize worker activities. Gong et al. (9) combined the bag-of-video-feature-words model with the Bayesian learning method to automatically classify the construction workers and equipment from a video. Han and Lee (10) developed a framework for extracting three-dimensional human skeletons to detect unsafe predefined motion templates from captured video. Khosrowpour et al. (4) proposed the use of inexpensive RGB-D sensors to evaluate the activity of indoor construction workers using visual images. Park and Brilakis (11) combined cameras and computer-vision-based tracking and detection methods for on-site tracking of construction workers. Yang et al. (12) used machine learning and computer vision techniques to propose vision-based motion recognition for construction workers using dense trajectories, although the use of the camera in this method was limited by factors such as lighting conditions and installation locations. In addition, owing to the complexity and dynamic characteristics of the construction site, problems included a decrease in accuracy due to a high noise level, a failure to record moving objects over long distances, and occlusion in a cluttered environment. (12) The most obvious deficiencies of image processing are relatively high computational and storage costs. (8) With the advent of inertial sensors and their advantages, different research directions have been explored for human activity recognition, leading to a large number of applications and innovative developments in the field of human activity recognition. (13) The detection and classification of human activities have attracted the attention of researchers in various fields such as healthcare, (14)(15)(16) education, (17) and sports. (18) Parkka et al. (19) automatically classified daily activities such as lying down, boating, walking, sitting or standing, cycling, running, and Nordic walking to promote healthy activities and a healthier lifestyle.
Chiang et al. (20) developed a portable activity pattern recognition system to automatically identify the user's daily activities, where medical professionals could use the data to help patients solve health problems caused by obesity or metabolic syndrome and delay the development of diabetes, cardiovascular disease, and other complications. Attal et al. (21) discussed human activity recognition using wearable sensors to remotely monitor the status of older people to help healthcare providers monitor their movements in daily activities to detect unpredictable events and assist in a timely manner. Data collection is the foundation of the human activity recognition process, and the signal quality is the most important factor that directly affects the ability of the extracted feature values to characterize the activity. (22) The most commonly used sensors for human activity recognition technology are inertial sensors such as accelerometers and gyroscopes. Many researchers have used accelerometers for daily activity recognition. (23)(24)(25)(26) Gyroscopes are often used with accelerometers for data acquisition. (27,28) Shoaib et al. (1) used a smartphone as a data collection device to investigate the activity recognition performance of different motion sensors including accelerometers, a linear acceleration sensor (derived from an accelerometer by removing the gravity component), gyroscopes, and magnetometers under different conditions, and these inertial sensors were able to recognize activities independently in addition to a magnetometer. The recognition also depends on the type of activity identified, the data feature used, and the classification technique employed. Some researchers used a single accelerometer (29)(30)(31) while other researchers used multiple accelerometers to collect data, noting that using only a single accelerometer always resulted in lower quality of activity identification. (32)(33)(34)
Sensors in different locations may have different sensitivities for activity recognition. In view of this, Cleland et al. (35) conducted a survey to determine the optimal position of an accelerometer to facilitate detection. Human activity recognition can be improved by data fusion from multiple inertial sensors comprising accelerometers and gyroscopes. (36)(37)(38)(39)(40)

Experimental Design and Methods
In our experiment, the subject used a cart to load and unload sand. This task involved four basic activities: shoveling, pulling, pouring, and pushing. Shoveling refers to a shovel being used to load sand into an empty cart. Pulling involves pulling the cart containing sand to the designated location. Pouring refers to manually emptying the cart filled with sand. Finally, pushing involves pushing the empty cart back to the starting point. The data acquisition experiments were conducted in an outdoor environment, and the research flowchart is presented in Fig. 1. Straps and common sport armbands were used to bind accelerometers and gyroscopes to the participants' bodies. Considering the ease of movement when performing the four types of activities, three locations on the human body, the waist, wrist, and upper arm, were equipped with device units as shown in Fig. 2. The device unit placed at the waist recorded the amplitude of acceleration of the torso. Figure 3 shows the shoveling, pulling, pouring, and pushing activities performed by the participants wearing the device units. We employed Witt Smart BWT901BCL equipment units. Three Bluetooth-enabled device units were used for the data acquisition experiments, where each device unit (36 × 51.3 × 15 mm³) had a built-in three-axis accelerometer and a three-axis gyroscope to capture motion data in the x, y, and z directions with a maximum sampling rate of 200 Hz. The range of the accelerometers was ±16 g and the angular velocity range of the gyroscopes was ±2000 °/s. In the outdoor experimental environment, the inertial sensor data were collected from six male participants aged 25 ± 2 years who wore the same sets of device units and performed the same sets of construction activities at the same location. Each participant performed activities for a period of time during the experiment to ensure the collection of sufficient data, and all participants performed the construction tasks in their own way without guidance. 
A sampling rate of 50 Hz was set while participants performed the construction activities to ensure the validity of the data, in accordance with Shoaib et al. (1) and Kwapisz et al. (41) Data generated by all device units were transmitted to a laptop via Bluetooth.
The data generated by the three equipment units were collected simultaneously throughout the experimental process, and thus 3 × 237250 data records were collected. An activity category annotation process was performed after feature extraction. Figure 4 is a plot of the acceleration data collected from the device unit at the wrist. The raw signal data of the inertial sensor cannot be applied directly to the classifier; instead, extracting features from signal segments is an effective way to maintain class separability.

Signal segmentation
The feature vector extracted from the original inertial sensor data was used as the classifier input after adding another dimension to the inertial sensor data to minimize the influence of the sensor orientation. (42) This additional dimension, the signal magnitude, was calculated as $m = \sqrt{x^2 + y^2 + z^2}$ so that each inertial sensor had four dimensions: x, y, z, and m. Signal segmentation was conducted on the stream of sensor data using a fixed-size sliding window before the feature calculation. Each window should contain enough samples (at least one cycle of an activity) to distinguish similar movements. Considering that the data sampling frequency is 50 Hz and the accelerations of pulling and pushing shown in Fig. 3 are small and frequently repeating, the window size was set to 1 s (50 samples) with a 50% overlap between adjacent windows. We selected the most commonly used features that best characterize the activities of construction workers. The feature vector of each segment contained 66 features scaled into the interval [0, 1] using max-min normalization to be used for classification. Table 1 lists the parts of the source code compiled for feature extraction, signal segmentation, and model training and recognition in this study. Time-domain features were used in the inertial sensor data classification owing to their low cost and high discernibility. Features 1-16 in the Appendix, which belonged to the time-domain class, included standard statistical metrics such as the mean, standard deviation, and maximum. To better characterize the construction activities, the correlation among the x, y, and z axes was taken into consideration for the activity recognition. Specifically, the correlations between each axis and the pitch and roll tilt angles were calculated using Eqs. (1) and (2). Features 17 and 18 were the frequency-domain features extracted from the inertial sensor data after a fast Fourier transform (FFT). 
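The magnitude channel, the 1 s sliding window with 50% overlap, and the max-min normalization described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' R code from Table 1; the function names (`magnitude`, `sliding_windows`, `min_max_scale`, `window_features`) are hypothetical, the synthetic signal is invented for the demonstration, and only three of the time-domain statistics named in the text are computed.

```python
import math


def magnitude(x, y, z):
    """Orientation-independent magnitude m = sqrt(x^2 + y^2 + z^2)."""
    return math.sqrt(x * x + y * y + z * z)


def sliding_windows(samples, window_size=50, overlap=0.5):
    """Split a sample list into fixed-size windows with fractional overlap.

    With 50 Hz data, window_size=50 is a 1 s window and overlap=0.5
    moves the window forward by 25 samples each step.
    """
    step = max(1, int(window_size * (1.0 - overlap)))
    last_start = len(samples) - window_size
    return [samples[i:i + window_size] for i in range(0, last_start + 1, step)]


def window_features(window):
    """Mean, population standard deviation, and maximum of one window."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    return [mean, math.sqrt(variance), max(window)]


def min_max_scale(values):
    """Scale one feature column into [0, 1] (max-min normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


# Synthetic 3 s recording at 50 Hz (illustrative values, not study data).
raw = [(math.sin(0.1 * i), math.cos(0.1 * i), 0.02 * i) for i in range(150)]
m_channel = [magnitude(x, y, z) for x, y, z in raw]
windows = sliding_windows(m_channel)
features = [window_features(w) for w in windows]
# Normalize each feature column across all windows.
columns = [min_max_scale(list(col)) for col in zip(*features)]
```

Normalizing per feature column, across windows, is one common reading of max-min normalization; the paper does not state whether scaling was fitted on the training set only, so that detail is left open here.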
Power spectral density (PSD), which is an important frequency-domain feature for human activity recognition, was computed as the square of the sum of the spectral coefficients normalized by the length of the sliding window using Eq. (3). Here, $a_i = x_i \cos(2\pi f i / N)$ and $b_i = x_i \sin(2\pi f i / N)$ are the real and imaginary parts of the FFT, respectively, $x$ is the discrete inertial sensor signal, $f$ is the index of the $f$th Fourier coefficient in the frequency domain, and $N$ is the length of the sliding window. The entropy matrix was used

Table 1 Parts of source code for feature extraction, signal segmentation, and model training.
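Because Eq. (3) is not reproduced in the text, the following Python sketch shows one reading of the verbal description, under the assumption that the PSD at coefficient $f$ is the periodogram value $\left[(\sum_i a_i)^2 + (\sum_i b_i)^2\right]/N$ built from the $a_i$ and $b_i$ defined above. The function names and the test signal are hypothetical, and a direct sum is used instead of an FFT routine to keep the example dependency-free.

```python
import math


def psd_coefficient(window, f):
    """Periodogram value at Fourier index f for one window.

    a_i = x_i * cos(2*pi*f*i/N) and b_i = x_i * sin(2*pi*f*i/N) are summed
    over the window, squared, added, and divided by the window length N.
    """
    n = len(window)
    real = sum(x * math.cos(2 * math.pi * f * i / n) for i, x in enumerate(window))
    imag = sum(x * math.sin(2 * math.pi * f * i / n) for i, x in enumerate(window))
    return (real * real + imag * imag) / n


def psd(window):
    """PSD for the non-negative frequency indices 0 .. N//2."""
    return [psd_coefficient(window, f) for f in range(len(window) // 2 + 1)]


def dominant_frequency_hz(window, sampling_rate=50.0):
    """Frequency (Hz) of the strongest non-DC component of the window."""
    spectrum = psd(window)
    strongest = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    return strongest * sampling_rate / len(window)


# A 1 s window (50 samples at 50 Hz) holding a pure 5 Hz sine wave.
sine_window = [math.sin(2 * math.pi * 5 * i / 50) for i in range(50)]
spectrum = psd(sine_window)
peak_hz = dominant_frequency_hz(sine_window)
```

For a 50-sample window at 50 Hz the frequency resolution is 1 Hz, so slow, repetitive motions such as pulling and pushing are resolved only coarsely in this representation.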

Training classifier through machine learning
The classification algorithms commonly used in activity recognition include the KNN, SVM, RF, decision tree (DT), NN, and naive Bayes (NB) algorithms. (1,30,43) We constructed a classifier based on the characteristics of the input feature vectors and mapped the unknown activity sample to a class in a given activity category using classification algorithms. Four machine learning algorithms, KNN, SVM, NN, and RF, were used in this study to train the classifier model, where the raw data came from the inertial sensor data acquired using a 50 Hz sampling frequency. The feature extracted by segmenting the original data using a sliding window with a fixed size (window size 1 s, overlap 50%) constituted a feature vector, which was used as an input to the classifier. Our experiment investigated which machine learning algorithms provided the most accurate classification of complex construction activities through consistent validation methods. The parameters for each classification method were configured by a set of parameters corresponding to the maximum accuracy in 10-fold cross-validation. In k-fold cross-validation, each fold contains all categories of the same scale to ensure balance. All model training and predictions were performed in R language. The ultimate goal of applying a variety of classification algorithms was to explore the best performance of the classifier model and the optimal configuration of sensor types and locations in practice. Accuracy metrics and F-measure (F-score) metrics were used to fairly measure the performance of each classifier model using four key terms: a) true positive (TP): correctly classified as the class of interest; b) true negative (TN): correctly classified as not the class of interest; c) false positive (FP): incorrectly classified as the class of interest; and d) false negative (FN): incorrectly classified as not the class of interest. 
The parameters in each classification model were determined by 10-fold cross-validation, and the selected parameters were the combination giving the maximum accuracy. Accuracy was used as the standard measure of the overall classification performance of a human activity recognition method, as defined in Eq. (5). The recognition ability for all activity categories was measured to identify the construction worker's activities. To test the practical performance of each classifier model, a comprehensive metric (F-measure) was needed to evaluate the activity recognition performance. The F-measure combines precision and recall and thus reflects the overall behavior of the model. Precision, sensitivity (recall), and specificity were calculated using Eqs. (6)-(8), respectively. The F-measure was calculated using Eq. (9) as a model performance metric combining precision and recall into a single number, and the magnitude of the F-measure represented the actual performance of the classifier model. To simplify the subsequent notation, classifier input data obtained from the accelerometer alone and from the gyroscope alone are denoted by A and G, respectively; A+G denotes the scenario where the two were used together. The performance improvement rate (PIR) was defined as the activity recognition accuracy (ARA, in percent) of the A+G data at a given sensor location minus max (ARA of A data, ARA of G data) at the same location. The average PIR for each sensor location was obtained from the PIRs of the four classification algorithms after excluding the maximum and minimum values.
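The evaluation quantities described above can be written out as a short Python sketch. Eqs. (5)-(9) are not reproduced in the text, so the standard definitions in terms of TP, TN, FP, and FN are assumed here; the function names are hypothetical, and the numbers in the usage lines are illustrative placeholders rather than results from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Sensitivity: share of actual positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Share of actual negatives that were rejected correctly."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def performance_improvement_rate(ara_ag, ara_a, ara_g):
    """PIR = ARA(A+G) - max(ARA(A), ARA(G)), in the units of the inputs."""
    return ara_ag - max(ara_a, ara_g)


def trimmed_mean(values):
    """Average after dropping one minimum and one maximum value."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    kept = sorted(values)[1:-1]
    return sum(kept) / len(kept)


# Illustrative placeholder accuracies for one location and four algorithms.
pirs = [
    performance_improvement_rate(0.90, 0.85, 0.80),
    performance_improvement_rate(0.88, 0.86, 0.81),
    performance_improvement_rate(0.92, 0.83, 0.79),
    performance_improvement_rate(0.89, 0.88, 0.84),
]
average_pir = trimmed_mean(pirs)
```

For the four-class problem in this study, the counts would be taken per activity in a one-versus-rest fashion, which is an assumption about how the per-activity F-measure was obtained.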

Data Interpretation and Experimental Results
The construction task using carts to transport materials was broken down into four activities: shoveling, pulling, pouring, and pushing. Understanding the effect of the interaction between the position and category of the inertial sensors on the activity mode could help optimize the activity recognition performance. Figure 5 presents the acceleration data of the four activities collected by the device units worn on the subject's wrist, waist, and upper arm, where the activities of shoveling and pouring were completely different from those of pulling and pushing.
The data on each axis fluctuated strongly for the former pair and only slightly for the latter. Consistent with the actual wrist movements, the data fluctuations of shoveling and pouring were large relative to those of pulling and pushing. Shoveling and pouring exhibited a high probability of recognition. Pulling and pushing were quasi-static processes in which the acceleration data of the upper arm fluctuated similarly without large undulations. The states of the inertial sensors on the upper arm differed between the two activities owing to the different directions of the force (pulling is tension, whereas pushing is thrust), and the required forces differed in the actual operations of the two types of activities. Owing to the similarity of the two construction activities, the fluctuation of the acceleration data of the two types of activities was small, which made the recognition of the two activities involving the cart using the data from the accelerometer more complicated and difficult. Therefore, the differentiation of pulling and pushing based on accelerometer data needed improvement. The similarities among construction activities made them prone to classification difficulties. Figure 5 also plots 20 s of angular velocity data for pulling and pushing at the waist sensor location, where the angular velocity patterns of these two construction activities at the same sensor location showed a large difference. The angular velocity data improved the difference and classifiability of the activities to a certain extent when the similarity of the acceleration patterns was high. Comparing the acceleration graphs of the upper arm with the angular velocity graphs of the waist, we found that the angular velocity patterns of these two construction activities at the waist sensor location were less similar than their acceleration patterns. 
However, this only shows that the angular velocity was conducive to distinguishing pulling from pushing. Recognizing all four construction activities required balancing the overall recognition performance against the recognition of individual activities. At the data level, it was necessary to optimize the combination of inertial sensors. The acceleration and angular velocity patterns at different sensor locations differed greatly for the same activity; thus, the combination of inertial sensors was not limited to the same sensor location, and the combined configuration of multiple sensor locations was more favorable for construction activity recognition. Differences in the location and category of inertial sensors affected the recognition of construction activities. The next section examines and discusses in detail how to optimize the position and category of inertial sensors to improve the performance of construction activity recognition.

Different classification methods for improving recognition performance with single sensor location
The training and test data for the classifier model were derived from 70% and 30% of the total number of feature vector sets, respectively. A 10-fold cross-validation method was applied to the training data set to generate a classifier model for each classification algorithm. The numbers of feature vectors for each construction activity in the study are shown in Table 2. The inertial sensor data of each construction activity were collected through the controlled outdoor experiment for the recognition of complex construction activities, and the feature vector generated by feature extraction was used as the classifier input. To determine whether the differences in the inertial sensor categories at the same location affected the recognition of complex activities, the performances of accelerometers and gyroscopes at the same location were analyzed. Figure 7 plots the accuracy values of the classifier model for each location and inertial sensor configuration under the four machine learning methods. The activity recognition performance of the A+G data of the waist, wrist, and upper arm was superior to those of the A data and G data. When the classifier model used A+G data, the three locations of the waist, wrist, and upper arm had the highest accuracy in activity recognition using the NN classification method with values of 0.896, 0.971, and 0.923, respectively. The recognition accuracy of the wrist was much higher than that of the waist and upper arm. When the classifier model only used A data, the best classification methods for the waist, wrist, and upper arm were RF, KNN, and NN and the accuracy values were 0.849, 0.964, and 0.885, respectively, i.e., the activity recognition of the wrist was still better than that of the waist and upper arm. 
When only G data were used, however, the upper-arm RF classifier model performed best with an accuracy of 0.867, while the RF classifiers for the waist and wrist reached accuracies of 0.829 and 0.862, respectively. Comparing the construction activity recognition performance of the inertial sensor category configuration among the waist, wrist, and upper arm, we found that the activity recognition performance of the wrist was highest. To maximize the construction activity recognition performance, we suggest the combination of acceleration and angular velocity data in the classifier model and the use of the NN classification method. Comparing the activity recognition performance of the wrist for A+G data and A data, we found that the A+G data gave only slightly better performance than the A data. Table 3 shows the classification recognition PIR and average PIR for the inertial sensor category configuration of the waist, upper arm, and wrist. The performance improvement of the A+G data for the waist was much higher than that for the wrist and upper arm, indicating that a device worn on the waist benefits most from combining the accelerometer and the gyroscope. Although the PIR of the wrist was the smallest, the overall recognition performance of the wrist in Fig. 7 shows that the wrist achieved the best recognition performance even when only the acceleration data were used. Table 3 shows that the SVM classification method had the highest PIR for all three sensor locations, indicating that the single inertial sensor data were not applicable to the SVM classifier model for construction activity recognition. 
Table 4 shows, for the classification model performance of A, G, and A+G data at a single sensor location, that the data collected by the accelerometer and gyroscope placed on the wrist maximized the recognition performance of the complex construction activities when the device was placed at a single sensor location for data acquisition. However, the ability of the classification model generated by the inertial sensor category configuration to detect each activity remained to be analyzed. To this end, the F-measure was used to quantify the classification capability for each activity. Figure 6 shows that the NN classification method using A+G data produced the highest recognition accuracy.
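The 70%/30% split and the class-balanced 10-fold partition used in this subsection can be sketched in Python as follows. This is an illustrative standard-library sketch, not the authors' R procedure; the function names, the random seed, and the synthetic label list (100 windows per activity) are assumptions made for the example, and the actual window counts are those reported in Table 2.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices) with per-class proportions kept."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def stratified_folds(indices, labels, k=10, seed=0):
    """Deal the given indices into k folds so each fold holds every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index in indices:
        by_class[labels[index]].append(index)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)  # round-robin per class
    return folds


# Synthetic labels: 100 feature vectors per activity (illustrative only).
labels = ["shoveling", "pulling", "pouring", "pushing"] * 100
train_idx, test_idx = stratified_split(labels)
folds = stratified_folds(train_idx, labels, k=10)

# Each fold serves once as the validation set while the other nine train.
validation_sets = [
    (sorted(i for j, fold in enumerate(folds) if j != held_out for i in fold),
     sorted(folds[held_out]))
    for held_out in range(len(folds))
]
```

Hyperparameters would then be chosen as the setting with the highest mean validation accuracy over the ten folds, and the held-out 30% would be used only once for the final accuracy figure.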

Activity recognition performance of various combinations of sensors
Table 4 Activity recognition performance of different combinations of single inertial sensor data.
(Columns: Waist, Wrist, Upper arm, Accuracy.)

There were 20 single inertial sensor configurations for multiple sensor locations: 12 for two sensor locations and eight for three sensor locations. The activity recognition performance of each combination of sensor locations was measured using the NN classification method. Table 4 shows the activity recognition performance results for different combinations of single inertial sensor data at two sensor locations and three locations. The device units contained accelerometers and gyroscopes at the waist, wrist, and upper arm to obtain data for loading and unloading tasks. The maximum ARA in Table 4 was 0.979, for which the data combination was the angular velocity at the waist and the acceleration at the wrist, and the inertial sensor data categories in the waist, wrist, and upper arm were G, A, and A, respectively. The best results for the inertial sensor configuration with two locations indicated that the angular velocity data of the waist helped improve the activity recognition performance of the acceleration data at the wrist. The addition of the upper arm acceleration data further enhanced the performance of activity recognition when using inertial sensors in three sensor locations. Including the acceleration data for the wrist maximized the recognition of the complex activities when a single inertial sensor was deployed in multiple sensor locations. Table 5 shows the activity recognition results for the combination of single inertial sensor data at the waist and upper arm and the A+G data for the wrist. The minimum accuracy value in Table 5 was 0.854, indicating that the angular velocity values of the wrist and upper arm were highly detrimental to the recognition of construction activities. According to Fig. 6, the ARA in the case of the combination of the acceleration and angular velocity data collected at the wrist was 0.971, which was higher than that of 0.942 obtained using only the acceleration of the wrist. The A+G data of the wrist were set as the basic data to determine the effectiveness of construction activity recognition of combinations of data obtained from multiple sensor locations. The maximum recognition accuracy in Table 5 was 0.983 when the data consisted of A+G at the wrist, G at the waist, and A at the upper arm. The recognition accuracy for the combination of wrist A+G and the waist G was 0.980 when using the data of two sensor locations, and the accuracy for the combination of wrist A+G and the upper-arm A data reached 0.981. The addition of the angular velocity of the waist and the acceleration of the upper arm improved the recognition performance when the wrist A+G data were used as the activity recognition data. Combined with the results in Table 5, the maximum recognition accuracy of 0.984 for the waist G, wrist A, and upper-arm A further verified the best combination of the position and the inertial sensor category. The results of the combinations of two sensor locations in Table 5 were further compared with the results of the activity recognition of the wrist using the acceleration data in Table 4, and it was found that the activity recognition results in Table 5 were slightly superior to those in Table 4. Comparing the activity recognition results for the three sensor locations in Tables 4 and 5, we found that the accuracy with A+G data at the wrist was slightly lower than that with A data at the wrist.
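The count of 20 single inertial sensor configurations stated at the start of this subsection can be checked by enumeration: each chosen location carries either an accelerometer (A) or a gyroscope (G). The Python sketch below is only an illustration of that counting argument; the function name and the dictionary representation are hypothetical.

```python
from itertools import combinations, product

LOCATIONS = ("waist", "wrist", "upper arm")
SENSOR_TYPES = ("A", "G")  # accelerometer or gyroscope, one per location


def single_sensor_configurations(n_locations):
    """All ways to pick n locations and assign one sensor type to each."""
    configurations = []
    for chosen in combinations(LOCATIONS, n_locations):
        for assignment in product(SENSOR_TYPES, repeat=n_locations):
            configurations.append(dict(zip(chosen, assignment)))
    return configurations


two_location = single_sensor_configurations(2)    # 3 location pairs x 2^2 types
three_location = single_sensor_configurations(3)  # 1 location triple x 2^3 types
total = len(two_location) + len(three_location)
```

The configuration reported as best in the text, G at the waist with A at the wrist and upper arm, is one of the eight three-location entries.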
Table 5 Activity recognition performance of wrist A+G data combined with single inertial sensor data at waist and upper arm.

Our results indicated that more data fusion of inertial sensors did not necessarily lead to an increase in activity recognition performance and that the fusion of useless data only interfered with the learning of the classifier model. In summary, a suitable combination of the inertial sensor position and the category contributed to activity recognition. The A+G data for all three sensor locations were superior to single inertial sensor data; thus, the A+G data can be used to analyze the construction activity classification performance of different combinations of data obtained from the waist, wrist, and upper arm. Table 6 shows the ARA of different A+G data combinations at the three sensor locations, where the A+G combination of the waist and wrist contributed to the performance improvement when including A+G combination data for the wrist. The classification accuracy of all A+G combination data using the three sensor locations reached 0.971 but the classification result of the A+G data using the wrist decreased slightly. The maximum accuracy values in Tables 4-6 were 0.979, 0.984, and 0.983, respectively, indicating that it is necessary to determine the best sensor location to optimize the arrangement of different types of sensors. The performance results of all sensor location combination models revealed that for the actual construction activity recognition under the cooperation of multiple types of inertial sensors, the model generated by the fusion of the angular velocity of the waist and the acceleration data of the wrist and upper arm is most beneficial for the recognition of the construction activities. Table 7 shows the detailed recognition results of the Waist-G + Wrist-A + Upper arm-A model for various construction activities. 
Among the construction activities, the highest recognition accuracy was obtained for shoveling and the lowest for pulling. The results show that recognition accuracy differs between construction activities, which is determined by the nature of each activity. Therefore, the construction activity recognition model should be further improved.
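Per-activity results of this kind are commonly derived from a confusion matrix. The following is a minimal sketch of that calculation, assuming that the per-activity accuracy is the fraction of samples of each activity that were classified correctly (per-class recall); the exact definition used for Table 7 is not stated here. The label "other" and all counts are hypothetical placeholders for illustration and are not values from Table 7.

```python
def per_activity_accuracy(confusion, labels):
    """Return {label: correct / total} for each row of a confusion matrix.

    confusion[i][j] counts the samples whose true activity is labels[i]
    and whose predicted activity is labels[j].
    """
    result = {}
    for i, label in enumerate(labels):
        total = sum(confusion[i])
        # Guard against activities with no samples.
        result[label] = confusion[i][i] / total if total else 0.0
    return result


# Hypothetical counts for illustration only (not taken from Table 7).
labels = ["shoveling", "pulling", "other"]
confusion = [
    [98, 1, 1],   # true shoveling
    [3, 90, 7],   # true pulling
    [2, 3, 95],   # true other
]

accuracy = per_activity_accuracy(confusion, labels)
easiest = max(accuracy, key=accuracy.get)
hardest = min(accuracy, key=accuracy.get)

for label in labels:
    print(f"{label:<10s} {accuracy[label]:.3f}")
print("easiest:", easiest, "| hardest:", hardest)
```

Inspecting the off-diagonal entries of the row with the lowest accuracy indicates which activities are confused with one another, which is one way to decide where the model needs improvement.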
A system for recognizing complex construction activities through machine learning, based on the combination of accelerometer and gyroscope data obtained at multiple sensor locations, was developed, and its interface is shown in Fig. 8.

Novelty and discussion
The novelty of this study is that we considered the optimal combination of multiple types of sensors for the recognition of construction task activities and investigated the optimal combination of positions to maximize recognition performance under the cooperation of inertial sensors, thus addressing the shortcomings of previous research. For instance, Joshua and Varghese (39) placed accelerometers on the waists of workers on construction sites to collect data and investigated accelerometer-based activity classification for automating the work-sampling process. Nonetheless, they used only accelerometers as the source of construction activity data, and only the performance of accelerometers on workers' waists was studied. Most studies did not consider a construction task as a complex activity. Akhavian and Behzadan (40) simulated various types of construction activities (loading a wheelbarrow, pushing the loaded wheelbarrow, dumping material from the wheelbarrow, and returning the empty wheelbarrow) using accelerometers and gyroscopes integrated with a smartphone worn on the upper right arm to collect data for worker activity recognition. Their research combined accelerometers and gyroscopes for activity recognition, but only considered the combined data of two inertial sensors at one body location and failed to take advantage of the gyroscope as the dominant sensor for activity recognition.
Table 6 Activity recognition performance of A+G data combinations at three sensor locations (columns: Waist/A+G, Wrist/A+G, Upper arm/A+G, Accuracy).
Lee et al. (44) examined the reliability and usability of wearable sensors for monitoring roofing workers during on-duty and off-duty activities. The results demonstrated the usability of these sensors, and they recommended a data collection period of three consecutive days for obtaining an intraclass correlation coefficient for heart rate, energy expenditure, metabolic equivalents, and sleep efficiency. The participants exhibited significant variations in their physical responses, health statuses, and safety behaviors. However, the study of Lee et al. (44) had two limitations. First, it did not provide occupation-related physical activity data owing to the placement of the sensor on the wrist rather than the hip or waist. Second, the performance of the model developed by Lee et al. (44) was not validated using experimental data, thus limiting the reliability and validity of the results. Yan et al. (45) developed warning personal protective equipment (PPE) based on wearable inertial measurement units (WIMUs) that enabled workers' self-awareness and self-management of ergonomically hazardous operational patterns. They proposed data processing and real-time warning algorithms for the automatic assessment and warning of hazardous postures through a connected smartphone application as soon as dangerous operational patterns were detected. However, they did not study multiple types of inertial sensors in different locations on the human body, even though sensors in different locations may have different sensitivities for activity recognition. Chen et al.
(46) proposed a tensor decomposition approach to compress and reorganize motion data; however, it could only examine two sample activities composed of sequences of postures. A comparison of the present study with the above studies is given in Table 8. Akhavian and Behzadan (40) used inertial sensor data from only one sensor location for the recognition of similar construction activities. Although our study was applied to the construction field, the activity recognition accuracy reached 0.98. Nevertheless, the pursuit of more reliable and efficient activity recognition performance remains a basic requirement for future automation and the real-time recognition of worker activities.

Conclusions
Different types of inertial sensor data and sensor locations can have a significant influence on activity recognition performance. The experiments on sensor locations and categories in this study helped improve the performance of construction activity recognition for practical applications. Through these experiments, we developed a system for recognizing complex construction activities based on the combination of accelerometer and gyroscope data obtained from multiple sensor locations. All activity data were derived from the acceleration and angular velocity data simultaneously recorded at three body locations (waist, wrist, and upper arm). Results show that the A+G combination dataset obtained from the wrist had the best activity recognition among the sensor category configurations when the raw data came from a single sensor location. The NN algorithm exhibited the highest classification accuracy among the machine learning algorithms. The comparison of single inertial sensor configurations at multiple locations indicated that the wrist was the most suitable location for sensor placement. The best results for activity recognition with the combination of two and three sensor locations were obtained with Waist-G + Wrist-A and Waist-G + Wrist-A + Upper arm-A, respectively. By comparing the results for a single sensor location, two sensor locations, and three sensor locations, we found that the use of three sensor locations produced the best accuracy. The use of a gyroscope at the waist and an accelerometer at the upper arm improved the activity recognition performance obtained with the wrist A+G data. Since placing sensors at three locations on the human body could cause some inconvenience, further studies are recommended to search for ways of improving the accuracy of construction activity recognition with two sensor locations.