Traffic Census Sensor Using Vibration Caused by Passing Vehicles

Traffic census data are essential for investigating traffic volumes and vehicle movements, and count mechanization is currently the most efficient way to obtain and utilize advanced traffic census data. However, efforts to mechanize traffic censuses have not progressed significantly in Japan owing to the price of such systems, the size of the necessary equipment, and privacy issues. In this paper, we propose a novel vehicle-counting sensor system that is inexpensive and easy to set up. Our system is based on a piezoelectric vibration sensor that senses road vibrations from passing vehicles. More specifically, the system consists of (i) a vibration sensor device that we designed and prototyped in-house and (ii) a passing vehicle estimation method that determines the number of passing vehicles from the vibration sensor data. Our system, which achieves high accuracy owing to the use of machine learning (ML), makes it possible to conduct traffic censuses by simply placing the sensor on sidewalks next to the road that is being surveyed. To demonstrate the utility of our system, we conducted an experiment in which the vibration sensor was placed on a sidewalk, and then linear discriminant analysis (LDA) was used to estimate the number of vehicles that were traveling on the adjacent road using only the data collected from the vibration sensor. Our results showed that the number of passing vehicles could be estimated with an accuracy of 98.3%.


Introduction
Traffic censuses, which investigate traffic volumes and vehicle movements, provide essential data that will allow planners to develop more efficient road improvement plans and thus help reduce carbon dioxide (CO2) emissions. In Japan, a national road/street traffic situation survey known as the "Traffic Census" is conducted once every five years. In this census, the traffic volumes of Japan's highways, national roads, and prefectural roads are investigated.
However, as of the latest (2015) road traffic census report, more than 50% of the censuses were still being counted manually. (1) Figure 1 shows the rates of the different vehicle-counting processes used in the last three censuses. Here, it can be seen that the rate of the manual counting process declined by only 13% in the last 10 years, and that the contributions of mechanized counting processes remain low, primarily because such vehicle counters are expensive, require large machines, and have long setup times. Image-based vehicle-counting methods that use existing cameras, including surveillance cameras, have been attracting significant attention recently. (2,3) However, although some parties feel it would be ideal to count vehicles via such existing cameras, others have voiced concerns about the need to protect drivers' privacy. Moreover, it is difficult for camera systems to count vehicles correctly in dark locations or at night.
Other vehicle-monitoring methods include recording and analyzing the noise of moving vehicles via microphone arrays. Methods involving the generation of sound maps from the time differences of sounds recorded by vertical and horizontal array microphones have been studied. (4) For example, Ishida and coworkers examined the use of microphone arrays and realized a workable vehicle counting system. (5-9) However, although the cost of the sound map and microphone array system was not high, it required the use of a pole to mount the vertical and horizontal array microphones, which significantly increased its deployment cost. Furthermore, the high volumes of automotive exhaust gases and the large amounts of tire shavings encountered on roads with heavy traffic can reduce the service life of the microphones, while strong winds can produce noise levels that exceed the dynamic range of the system. Taken together, these factors indicate that the system is not sufficiently robust for practical use. Infrared-type traffic counters have also been put into practical use, but these counters are not widely used because they require the construction of installation structures, such as standing poles, on roads.
In this paper, we propose an inexpensive, easy-to-use vehicle-counting sensor system that can be employed simply by placing the sensor on a sidewalk adjacent to the road under observation. As shown in Fig. 2, our system uses a piezoelectric sensor to collect roadway vibrations caused by passing vehicles. (10-12) More specifically, the piezoelectric device converts vibrations to electric signals, after which the system extracts the characteristics of those electric signals as features based on mel-frequency cepstral coefficients (MFCCs). The number of vehicles is then counted by using machine learning (ML) to identify each vehicle recorded by the sensor. The benefits of using our vibration sensor system are as follows: (1) the system can be set up nondisruptively, (2) it is robust against weather conditions because the sensor is connected to the vibration source mechanically rather than through the air, and (3) the sensor is based on emission-free passive monitoring technology.
In previous studies of vehicle detection, multiple vibration sensors and a combination of a geomagnetic sensor and a vibration sensor were used. (13,14) In this research, we measure traffic volume with a single sensor: segments of the vibration waveform itself are cut out and learned, so that information that depends on the road surface condition is learned as well.
Similar studies using in-house-based vibration sensor systems were conducted previously. (15-19) For example, Kashimoto et al. proposed a method for locating the position of a user by analyzing the vibration generated by a pedestrian. (20) Additionally, a method using a vibration sensor to sense walking behavior within a room, including the directions and number of multiple persons crossing the measurement area, has been proposed. (21) In this study, we apply similar technology to vehicle counting and investigate the most efficient way to achieve the sensitivity necessary to detect the weak roadway vibration signals generated by passing vehicles, and then use that information to count those vehicles. We evaluated only one lane of two-lane roads, because the purpose of this research is to realize an easy, low-cost alternative to an infrared traffic counter, which counts only one lane.
The remainder of this paper is organized as follows. We explain the details of our in-house designed and developed vibration sensor in Sect. 2, and we then present a method for detecting passing vehicles from the obtained sensor signals in Sect. 3. We describe our vehicle-counting system in Sect. 4 and provide details of our field test and results in Sect. 5. Finally, we conclude the paper in Sect. 6.

Piezoelectric device
As explained above, our system is based on an in-house designed and developed piezoelectric vibration sensor. Piezoelectric devices are made from crystals such as potassium sodium tartrate and lead zirconate titanate. In such crystals, the cations shift at temperatures below the Curie point, and polarization occurs in specific but scattered directions. However, when a strong electric field is applied to a crystal in this state, the polarization directions are aligned and remain so even after the electric field is removed. The polarization of the piezoelectric device is then neutralized by floating charges in the air. When pressure is applied to such a piezoelectric device, each crystal shrinks, thus reducing the polarization of the piezoelectric element. As a result, the neutralizing electric charges become excessive and generate a voltage.
Figure 3 shows a basic diagram of our piezoelectric vibration sensor unit. This structure was designed both to transmit vibrations from the road to the piezoelectric device and to be placed easily on tilted surfaces. A paper-insulated metal weight (hereinafter called a floating weight) rests on the sensor tray in direct contact with the piezoelectric element, and the floating weight is allowed to move freely inside the guide housing. The sensor tray under the piezoelectric element is electrically insulated from the road by the bottom of the sensor tray, and it is pressed against the ground by the weight of the surrounding guide housing. Figure 4 shows a simplified sketch of Fig. 3, focusing on the force applied to the piezoelectric device. The piezoelectric elements are polarized and the applied force is perpendicular to the electric field axis. Therefore, according to the piezoelectric equation, the relationship between the force applied to the element and the voltage can be expressed as

Voltage generated in piezoelectric device
$$Q = dF, \tag{1}$$
where the force applied to the piezoelectric element is F [N], the generated charge is Q [C], the mass of the floating weight is m [kg], the acceleration is a [m/s^2], and the equivalent piezoelectric constant is d. If the capacitance of the piezoelectric element is C [F] and is constant, the relationship among the generated voltage V [V], capacitance, and charge is
$$V = \frac{Q}{C}. \tag{2}$$
Then, by substituting Eq. (1) into Eq. (2), we obtain
$$V = \frac{dF}{C}, \tag{3}$$
where the longitudinal equivalent piezoelectric constant d is constant. The relationship between force and acceleration is given by
$$F = ma, \tag{4}$$
where the mass m of the weight is constant. By substituting Eq. (4) into Eq. (3), we obtain
$$V = \frac{dma}{C}. \tag{5}$$
As shown by Eq. (5), the generated voltage is proportional to the acceleration of the vibration applied to the piezoelectric element.
Figure 5 shows a block diagram of the prototype measurement system, which consists of four blocks. The first block is the piezoelectric sensor unit, which converts vibrations into electrical voltage signals, as explained in Sect. 2.3. We used a 7BB-41-2L0 sensor (Murata Manufacturing, Kyoto, Japan) for this system. The second block is the amplifier, which has an impedance conversion circuit to improve the signal-to-noise ratio (SNR) of the input signal. Figure 6 shows a photograph of the amplifier. The third block is a UA-25EX Universal Serial Bus (USB) audio interface (Roland DG Corp., Shizuoka, Japan) that converts analog signals from the amplifier into pulse code modulation (PCM) signals (16 bit, 44.1 kHz). To avoid clipping the input signal, the gain of the USB audio interface is adjusted so that a −15 dBV sinusoidal 1 kHz analog signal is recorded as 0 decibels relative to full scale (dBFS). Finally, the signal data from the vibration sensor are recorded on a PC as audio data.
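As a quick numerical illustration of the proportionality stated by Eq. (5), the following minimal sketch evaluates the generated voltage for a few accelerations, assuming the relation takes the form V = dma/C (which follows from Q = dF, V = Q/C, and F = ma). The values of d and C are placeholders chosen only for illustration and are not taken from the sensor's datasheet; the 0.3 kg mass corresponds to the approximately 300 g floating weight of the improved sensor.

```python
def piezo_voltage(accel_m_s2, mass_kg, d_c_per_n, capacitance_f):
    """Voltage from V = d * m * a / C (Q = d*F, V = Q/C, F = m*a)."""
    force_n = mass_kg * accel_m_s2      # F = m * a
    charge_c = d_c_per_n * force_n      # Q = d * F
    return charge_c / capacitance_f     # V = Q / C


# Hypothetical constants, for illustration only (not measured values).
D_EQUIV = 300e-12    # equivalent piezoelectric constant [C/N] (assumed)
CAPACITANCE = 20e-9  # capacitance of the element [F] (assumed)
MASS = 0.3           # floating weight [kg]

for accel in (0.01, 0.02, 0.04):
    v = piezo_voltage(accel, MASS, D_EQUIV, CAPACITANCE)
    print(f"a = {accel:.2f} m/s^2 -> V = {v * 1e6:.1f} uV")
```

Doubling the acceleration doubles the output voltage, which is why the recorded waveform can be treated as a scaled copy of the acceleration of the road surface.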

Vibration sensor voltage output when a vehicle passes
In our preliminary experiments, as shown in Fig. 7, we placed our prototype vibration sensor system on a sidewalk adjacent to a roadway on the grounds of Nara Institute of Science and Technology and recorded the vibration signals generated when a vehicle passed on the measurement lane. Figure 8 shows the waveforms and spectrograms of vibration signals recorded when a vehicle passed at speeds of (a) 5 km/h (very low speed), (b) 10 km/h (low speed), and (c) 30 km/h (moderate speed). The main component of the signal band when the vehicle passed was from 1 to 2 kHz. From these results, we found that our sensor could detect a passing vehicle even at the very low speed of 5 km/h.
The waveforms and spectrograms of the vibration signals also show that higher vehicle speeds result in larger, but shorter-duration, high-frequency signal levels. Conversely, lower vehicle speeds produce smaller and longer-lasting high-frequency signal levels. Generally, noise is generated by the impact and friction of moving vehicle tires with the ground surface, and the vibration frequencies of such generated noise are high. (22) We confirmed these phenomena in our preliminary experiment.

Vehicle Counting System
On the basis of the results of the preliminary experiments mentioned above, we improved the structure of the sensor unit to make it capable of data collection on general roads. The target of this system is the number of vehicles passing through the measurement lane of a two-lane road, as shown in Fig. 9. We then constructed a data collection and analysis system similar to that shown in Fig. 5. Then, using our improved sensor system, we conducted a vehicle counting experiment as follows:
1) The vibration sensor was placed on the sidewalk adjacent to the road being surveyed in order to record vibration levels.
2) To obtain training data, we also set up a camera on the same sidewalk and recorded a video of passing vehicles.
3) The times at which vehicles passed on the measurement lane were recorded from the peak energy of recorded audio data.
4) The audio data from 1 s before to 1 s after each peak were extracted and labeled, thus producing 2 s segments.
5) The data were classified using the ML algorithm according to whether or not a vehicle passed on the measurement lane.
Figure 10 shows various views of the vibration sensor used in our later experiments. This device was improved in several ways from the sensor unit introduced in Fig. 3. For example, the housing of the sensor unit shown in Fig. 3 was susceptible to water penetration, so the improved sensor was provided with a new watertight structure. Additionally, the improved sensor was equipped with two weights, as well as a leveling pad so that it receives vibrations more efficiently. Furthermore, to facilitate the use of the system on inclined roadways, the sensor tray was modified to have an inverted cone-shaped structure so that it makes contact efficiently with various ground surfaces.

Tray-holding weight
The improved sensor is equipped with a weight attached to the outside of the sensor tray, called the tray-holding weight, which was not used in the prototype sensor. The purpose of this 500 g weight, the mass of which was carefully determined by empirical observations, is to ensure that the sensor tray is kept in firm contact with the ground surface. The position of the tray-holding weight also protects the piezoelectric sensor element inside the sensor tray from rain and roadside dust.

Floating weight
As in the prototype sensor, the floating weight rests on the top of the piezoelectric element inside the sensor tray, and the floating weight moves freely in the tray. When nearby moving vehicles produce vibration, the ground and sensor tray also move. However, the floating weight tends to stay in the same position because of the law of inertia. As a result, pressure is produced by the acceleration of the sensor tray. Since the floating weight is not locked mechanically to the sensor tray and tray-holding weight, it can vibrate freely. As with the tray-holding weight, the mass of the floating weight was carefully determined. This is because, if the floating weight is too heavy, it exerts too much pressure on the sensor tray. In this device, the mass of the floating weight was set at approximately 300 g on the basis of empirical observations made during the preliminary experiments.

Extraction of recorded data when a vehicle passes
As shown in Fig. 11, to use ML to count the number of passing vehicles, we prepared vibration sound data that had been recorded by the sensor over a long period of time, detected the vehicle passage times, and then extracted the relevant recorded data. In addition, labeling was performed using the correct data obtained from the video recordings taken simultaneously. We used the following processes for our ML analysis.

Marking of recorded data
We marked points where many high-frequency peaks were observed using the following process because, as described in Sect. 3.2, numerous high-frequency signals were detected when vehicles passed in front of the sensor (Fig. 11):
1) The recorded signals were converted to spectrograms by short-time Fourier transform (STFT) using a Hamming window in each segment with 16384 samples (0.37 s).
2) Next, we calculated the trend of energy versus time, which we refer to as energy trend data, by summing the power spectrogram over frequencies higher than 300 Hz. Lower frequencies were excluded because the recorded data included humming noise (50 to 60 Hz).
3) Finally, we marked peaks on the energy trend data that were higher than an empirically determined threshold.
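The marking steps can be sketched as follows. This is a minimal illustration using NumPy rather than the implementation used in the experiments: the 50% frame overlap, the local-maximum rule, the relative threshold, and the synthetic test signal are all assumptions introduced here, whereas the window type, segment length, and 300 Hz cutoff follow the description above.

```python
import numpy as np


def energy_trend(signal, fs=44100, nperseg=16384, hop=None, f_min=300.0):
    """Sum the power spectrum above f_min for each Hamming-windowed frame."""
    hop = hop or nperseg // 2                 # 50% overlap (assumed)
    window = np.hamming(nperseg)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    band = freqs > f_min                      # excludes the 50-60 Hz hum
    times, energy = [], []
    for start in range(0, len(signal) - nperseg + 1, hop):
        frame = signal[start:start + nperseg] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        energy.append(power[band].sum())
        times.append((start + nperseg / 2) / fs)  # frame center [s]
    return np.array(times), np.array(energy)


def mark_peaks(times, energy, threshold):
    """Return the times of local maxima of the energy trend above threshold."""
    marks = []
    for i in range(1, len(energy) - 1):
        is_peak = energy[i] >= energy[i - 1] and energy[i] > energy[i + 1]
        if is_peak and energy[i] > threshold:
            marks.append(float(times[i]))
    return marks


# Synthetic check: 50 Hz hum plus a 1.5 kHz burst centered at t = 5 s.
fs = 44100
t = np.arange(10 * fs) / fs
hum = 0.5 * np.sin(2 * np.pi * 50 * t)
burst = np.exp(-((t - 5.0) ** 2) / 0.5) * np.sin(2 * np.pi * 1500 * t)
times, energy = energy_trend(hum + burst, fs)
marks = mark_peaks(times, energy, threshold=0.1 * energy.max())
print(marks)  # a single mark close to t = 5 s
```

Because only bins above 300 Hz are summed, the constant hum contributes almost nothing to the energy trend, and the burst alone determines where the mark is placed.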

Preventing double marking
In some cases, two or more signal peaks can result from the noise of one passing vehicle because of widely separated vehicle axles or other reasons. In such cases, the same vehicle could be counted more than once. We prevented such double marking by using the following empirically determined method. The maximum speed limit on a local road in Japan is 60 km/h, the average length of a vehicle is approximately 4 m, and the average distance between vehicles is empirically taken to be 4 m, equal to the average vehicle length. Therefore, the minimum time interval t between two successive vehicles is calculated as
$$t = \frac{4\ \mathrm{m} + 4\ \mathrm{m}}{60\ \mathrm{km/h}} = \frac{8\ \mathrm{m}}{16.7\ \mathrm{m/s}} \approx 0.48\ \mathrm{s} \approx 0.5\ \mathrm{s}. \tag{6}$$
Since it is clear that two or more vehicles cannot pass within a 0.5 s period, we can discard the additional peaks that fall within such a period and thus reduce the number of peaks, as shown in Fig. 12.
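The interval calculation and the resulting suppression of double marks can be illustrated with a short sketch. The rule for choosing which of several nearby peaks to keep is not stated in the text, so keeping the strongest one is an assumption made here, and the mark times and energies are made-up example values.

```python
def min_interval_s(speed_kmh=60.0, vehicle_len_m=4.0, gap_m=4.0):
    """Shortest time between two successive vehicles at the given speed."""
    speed_m_s = speed_kmh * 1000.0 / 3600.0
    return (vehicle_len_m + gap_m) / speed_m_s


def suppress_double_marks(marks, energies, min_gap_s=0.5):
    """Keep one mark per min_gap_s neighborhood (the strongest; assumed rule)."""
    kept = []  # (time, energy) pairs in time order
    for t, e in sorted(zip(marks, energies)):
        if kept and t - kept[-1][0] < min_gap_s:
            if e > kept[-1][1]:
                kept[-1] = (t, e)  # replace the weaker neighboring mark
        else:
            kept.append((t, e))
    return [t for t, _ in kept]


print(round(min_interval_s(), 2))  # 0.48
# Hypothetical mark times [s] and energies, for illustration only.
print(suppress_double_marks([1.0, 1.3, 4.0, 4.2, 9.0], [5, 8, 7, 3, 6]))
# [1.3, 4.0, 9.0]
```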

Extraction of analysis data
As explained above, we extracted 2 s segments of recorded data for each marked point (1 s before and after the mark) because the duration of the high-frequency component accompanying a passing vehicle was approximately 2-5 s. This value was determined by empirical measurements during the preliminary experiment described in Sect. 3.
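The extraction of a 2 s segment around each mark amounts to simple index arithmetic at the 44.1 kHz sampling rate. The sketch below is illustrative only; in particular, returning None for marks that lie too close to either end of the recording is an assumption, since the text does not say how such cases were handled.

```python
def extract_segment(samples, mark_s, fs=44100, half_width_s=1.0):
    """Return the samples from 1 s before to 1 s after a marked time.

    Returns None if the window would run past either end of the recording
    (assumed behavior).
    """
    center = int(round(mark_s * fs))
    half = int(round(half_width_s * fs))
    start, stop = center - half, center + half
    if start < 0 or stop > len(samples):
        return None
    return samples[start:stop]


recording = [0.0] * (10 * 44100)               # 10 s of placeholder audio
segment = extract_segment(recording, mark_s=5.0)
print(len(segment))                            # 88200 samples = 2 s
print(extract_segment(recording, mark_s=0.3))  # None: too close to the start
```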

ML estimations
The sounds produced by a moving vehicle can provide important clues related to its operation. The same applies to vehicle-generated vibrations. By applying ML techniques, we can count the number of passing vehicles from vibration data. Figure 13 shows a block diagram of our ML estimation process.

Preprocessing
We extracted eight features from the 2 s data segments and then performed ML preprocessing. MFCCs were selected on the basis of the finding that the audio characteristics of vehicle sound data occupy a band very close to that of human voice signals, which means that vehicle vibrations can be handled similarly. The process reshapes the matrix (8 × 173) to a vector (1 × 1384), which is used as the input for linear discriminant analysis (LDA). Figure 14 shows an example of the results obtained when MFCCs are calculated for 2 s of audio data.
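The matrix and vector sizes quoted above can be checked with a little arithmetic. The sketch below assumes that the 173 columns are analysis frames obtained with a hop of 512 samples and centered framing; this hop size is not stated in the text, but it is a common default in audio libraries and reproduces the 173 frames for a 2 s segment at 44.1 kHz. The MFCC values themselves are replaced by placeholders.

```python
def n_frames(n_samples, hop=512):
    """Frame count for centered framing: one frame per hop, plus one."""
    return 1 + n_samples // hop


def flatten(matrix):
    """Row-major flattening of a list-of-rows matrix into a single vector."""
    return [value for row in matrix for value in row]


fs, duration_s, n_mfcc = 44100, 2, 8
frames = n_frames(fs * duration_s)              # 1 + 88200 // 512
mfcc = [[0.0] * frames for _ in range(n_mfcc)]  # placeholder 8 x 173 matrix
vector = flatten(mfcc)
print(frames, len(vector))                      # 173 1384
```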

Dimensionality reduction and LDA classification
To apply LDA, we assigned one of two label classes to each extracted data segment. Labeling was accomplished by watching the video footage of all recorded data, selecting the correct answer, and then extracting the 2 s segment containing the audio peak. The two label classes are "1" (a vehicle passed through the measurement lane) and not "1" (all other events). Figure 15 shows a diagram of our LDA process. This LDA process generates a new axis onto which it projects the data in a way that minimizes the within-class variance and maximizes the distance between the means of the classes.
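The idea of projecting onto a single discriminating axis can be illustrated with a minimal two-class Fisher LDA written in NumPy. This is a sketch rather than the implementation used in this study: the label value 0 for the non-vehicle class, the small ridge term added to the scatter matrix, the midpoint threshold (which assumes equal class priors), and the synthetic five-dimensional data are all assumptions made for illustration.

```python
import numpy as np


def fit_lda(X, y):
    """Two-class Fisher LDA: return the projection vector w and a threshold."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter; the ridge term (assumed) keeps it invertible.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)     # axis that best separates the means
    threshold = 0.5 * (w @ m0 + w @ m1)  # midpoint of the projected means
    return w, threshold


def predict_lda(X, w, threshold):
    """Label 1 where the projection falls on the class-1 side of the threshold."""
    return (X @ w > threshold).astype(int)


# Synthetic data standing in for flattened MFCC vectors (illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)),   # class 0: other events
               rng.normal(3.0, 1.0, (100, 5))])  # class 1: vehicle passed
y = np.array([0] * 100 + [1] * 100)
w, thr = fit_lda(X, y)
accuracy = (predict_lda(X, w, thr) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

With 1384-dimensional inputs and fewer than 200 segments, the within-class scatter matrix would be singular, so some form of regularization or prior dimensionality reduction is needed in practice.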

Field testing
We tested our system on a public road located at 2-2 Katamachi, Miyakojima-ku, Osaka City, Osaka Prefecture, Japan, and recorded data for 15 min. Figure 16 shows the field test measurement setup. The sensor unit was placed on the sidewalk adjacent to the measurement lane, and it counted the number of passing vehicles in the measurement lane. We also recorded a video of the road where vehicles passed and treated the recorded video as ground truth data.

Checking the correct vehicle count number
Using our proposed method, we extracted 173 markings. When these markings were checked against the recorded video, 60 of the 173 peaks corresponded to vehicles passing in the measurement lane, and 113 peaks corresponded to noise. The vehicles that passed through the measurement lane included large vehicles, microcars, trucks, and motorcycles, but no tank trucks or large trucks. A breakdown of the noise revealed that it was attributable to vehicles passing in the opposite lane, the footsteps of pedestrians, bicycles, and motorcycles.

ML evaluation
Next, we analyzed the test data using the LDA algorithm and evaluated the results in terms of accuracy, precision, recall, and F-measure. These metrics were calculated from the numbers of prediction results using Eqs. (7)-(10). The terms used in these formulas are as follows:
• TP: Test data item is correctly predicted as "1" (passed through the measurement lane).
• TN: Test data item is correctly predicted as not "1".
• FP: Test data item is incorrectly predicted as "1".
• FN: Test data item is incorrectly predicted as not "1".
With these counts, the metrics are defined as
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{7}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{8}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{9}$$
$$F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{10}$$
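The four metrics follow from the TP, TN, FP, and FN counts via their standard definitions, as the small helper below shows. The counts used in the example are hypothetical and are not the results of this study.

```python
def evaluate(tp, tn, fp, fn):
    """Return accuracy, precision, recall, and F-measure from the four counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure


# Hypothetical counts for illustration only (not the paper's results).
acc, prec, rec, f1 = evaluate(tp=40, tn=50, fp=5, fn=5)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F={f1:.3f}")
# accuracy=0.900 precision=0.889 recall=0.889 F=0.889
```

Precision and recall are reported alongside accuracy because the two classes are unbalanced, so accuracy alone could hide a poor detection rate for passing vehicles.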

Evaluation of optimal number of MFCC dimensions
We determined the optimal number of MFCC dimensions using the above evaluation parameters. As shown in Fig. 17, the accuracy, precision, recall, and F-measure were evaluated using LDA at various numbers of MFCC dimensions. Since we found that the accuracy saturates at around eight dimensions, we used eight MFCC dimensions in our ML process.
Table 1 shows the results of 10-fold cross-validation obtained after applying ML using our LDA process. These results show that ML-based LDA made it possible to count passing vehicles with high accuracy. The accuracy of the proposed system is about 98%, which is higher than that of the infrared traffic counters currently in common use in Japan (95%). Therefore, our proposed system is considered to have adequate performance.
Figure 18 shows a comparison of our method with conventional methods, which count traffic using air pipes, air pressure sensors, and other counting devices. The cost of air pressure and pipe sensors is high because they need to be embedded in the ground. They also require time to install and are not suitable for temporary installation, such as for a traffic census. Infrared-type traffic counters do not require embedding, but they require the construction of installation structures, such as standing poles, on a road. On the other hand, piezoelectric sensors cost only a few dollars each, which makes our system cheaper than existing systems, even after considering the cost of the other components, the cover, and assembly. The total cost is also low because the system does not require any installation work.

Comparison with existing systems
In addition, our system only needs to be placed on the ground and requires no further setup. In the case of an infrared sensor, the distance from the roadway, the angle to it, and any obstacles on it must be taken into consideration during setup. On the other hand, in the case of a piezoelectric sensor, the vibration travels through the ground and reaches the sensor even if there is an obstacle. Therefore, our system is easy to set up without considering the distance, angle, or obstacles.

Conclusion
In this paper, we proposed a vibration-sensor-based vehicle counting system that is inexpensive and easy to set up and use. In our proposed system, the vibration sensor is placed on the sidewalk next to the observed road, where it measures and records vibrations caused by passing vehicles and transmitted through the road. This is accomplished by converting the output voltage of a piezoelectric sensor unit into an audio signal, and then extracting features from the measured data using MFCCs. Finally, the system determines whether a vehicle has passed on the measurement lane by applying LDA-based ML. From our experimental results, we confirmed that our proposed system could count the number of vehicles traveling on a measurement lane with an accuracy of 98.3%. Although our present method only counts the number of passing vehicles, the recorded signal data include unique information on each passing vehicle. Therefore, as part of our future work, we will evaluate whether it is possible to judge correctly even when vehicles of different sizes and weights pass by. We are also considering ways to identify the type of passing vehicles by using vibration data and ways to count vehicles passing on multiple lanes with a single sensor.