Odor Recognition System Using Embedded Learning Vector Quantization Circuit

In this paper, we propose a system that can recognize odors in real time even in a highly fluctuating environment. We used quartz crystal microbalance (QCM) sensors to detect odors. We adopted the learning vector quantization (LVQ) algorithm because it can classify complicated data with a small amount of computational resources. Moreover, we extracted the time constant of the QCM sensor with a short-time Fourier transform (STFT) unit to improve the identification rate. We then performed identification experiments using pseudo QCM sensor signals that faithfully reproduced previously measured data. In experiments on identifying the smells of apple, muscat, banana, and pineapple, we obtained an identification rate of about 90% despite the high fluctuation of odor concentration.


Introduction
An inspection device for food is currently desired to ensure food safety and security. Since smell detection is useful for this purpose, an apparatus that identifies abnormal flavors of food in real time should be established.
Odor recognition systems have already been reported. For example, toxic gases were identified using a system composed of conductive polymer sensors, application-specific integrated circuits (ASICs) for a sensor-signal compensation circuit, and a k-nearest neighbor algorithm running on an Intel 8051 processor. (1) However, a stable and ideal environment was assumed in that study. Since sensor responses are disturbed significantly in a highly fluctuating environment, odor discrimination is not easy in the real world. In this work, we built an odor recognition system that can identify smells even in a highly fluctuating environment.
We have reported that the identification ability is increased by extracting the time constants of the sensors using the short-time Fourier transform (STFT) algorithm. (2) Although this method was investigated using an off-line system, a real-time odor recognition system is required in an actual environment. Thus, we have developed a frequency measurement unit and an STFT unit (3) as well as a learning vector quantization (LVQ) (4) circuit. All circuits were embedded into a single field-programmable gate array (FPGA) to make the real-time system small. The purpose of this study is to realize a compact odor recognition system that works in real time in a highly fluctuating environment.

Structure of Odor Recognition System
The odor recognition system consists of quartz crystal microbalance (QCM) sensors, a frequency measurement unit, an LVQ unit, an STFT unit, and a CPU core (Fig. 1). To incorporate these units into an FPGA, all the units except the QCM sensors are composed of digital circuits.

QCM sensor
In this study, we used QCM sensors coated with sensing films. The resonance frequency of the QCM sensor is approximately 20 MHz. We used four types of coating film, namely, Apiezon (Ap-L), polyethylene glycol 1000 (PEG1k), OV17, and tricresyl phosphate (TCP). The QCM sensors were connected to one-chip oscillation circuits, and their frequencies were measured using the circuits described below.

Frequency measurement unit
This unit measures the frequency shifts of the QCM sensors. Frequency is often measured by the direct counting technique, in which the number of signal pulses within a reference time is counted. However, this technique is difficult to use here owing to the tradeoff between frequency resolution and time resolution.
When the time resolution is enhanced, the maximum signal frequency that can be captured increases according to the sampling theorem. In the spectrum of the measured signal, components above 0.5 Hz still contribute, whereas components above 4 Hz contribute little. Since a 4 Hz bandwidth is therefore necessary, our system requires a time resolution of 1/8 s (a sampling frequency of 8 Hz), whereas the frequency resolution should be kept within 1 Hz. Direct counting cannot satisfy both requirements, because its frequency resolution is the reciprocal of the gate time, that is, 8 Hz for a 1/8 s gate. Thus, we adopted the reciprocal counting technique, in which the number of clock pulses within the measurement signal cycle is counted. (5) The block diagram of this unit is shown in Fig. 2; it has both a QCM sensor circuit for actual measurement and a built-in test signal generator.
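The tradeoff described above can be illustrated with a rough calculation. The following is a minimal sketch, not the circuit of Fig. 2: it assumes a quantization error of one count in both techniques, and the 50 MHz reference clock is a hypothetical value chosen only for illustration (the counter clock is not stated here).

```python
def direct_count_resolution(gate_time_s):
    """Direct counting: one signal pulse more or less within the gate time."""
    return 1.0 / gate_time_s


def reciprocal_count_resolution(f_in_hz, f_clk_hz, gate_time_s):
    """Reciprocal counting: one clock pulse more or less within the gate time.

    The counter accumulates about f_clk * gate_time clock pulses, so a
    one-count error changes the estimated frequency by f_in / (f_clk * gate_time).
    """
    clock_counts = f_clk_hz * gate_time_s
    return f_in_hz / clock_counts


GATE_TIME_S = 1.0 / 8.0   # time resolution required in the text
F_IN_HZ = 1_000.0         # signal in the 1 kHz range
F_CLK_HZ = 50e6           # ASSUMPTION: hypothetical reference clock

direct = direct_count_resolution(GATE_TIME_S)
reciprocal = reciprocal_count_resolution(F_IN_HZ, F_CLK_HZ, GATE_TIME_S)
print(f"direct counting:     {direct:.3g} Hz")
print(f"reciprocal counting: {reciprocal:.3g} Hz")
```

Under these assumptions, direct counting is limited to 8 Hz at a 1/8 s gate, whereas reciprocal counting of a low-frequency signal resolves a small fraction of 1 Hz in the same time.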
If we use the reciprocal counting technique, we must decrease the signal frequency, since the original QCM frequency of 20 MHz is too high to obtain sufficient frequency resolution. Although a mixer is typically used to convert the signal to the low-frequency range, a 20 MHz signal is too fast for our digital mixer. Thus, the 20 MHz signal is first converted to the 2.5 MHz range using a divider and is then converted, using a digital mixer, to the 1 kHz range, where a sufficient frequency resolution is obtained. A 2.501 MHz signal works as the local oscillator to obtain the 1-kHz-range signal from the 2.5-MHz-range signal. Here, 8 Hz denotes the sampling frequency of the measurement data.
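The two-stage frequency conversion can be checked with simple arithmetic. The sketch below assumes an ideal mixer whose output is the difference between the local oscillator and the divided signal, a nominal QCM frequency of exactly 20 MHz, and a division ratio of 8 inferred from the 20 MHz and 2.5 MHz figures; the function names are hypothetical.

```python
DIVISION_RATIO = 8            # ASSUMPTION: inferred from 20 MHz -> 2.5 MHz
LOCAL_OSC_HZ = 2_501_000.0    # local oscillator frequency stated in the text


def mixer_output_hz(qcm_hz):
    """Divide the QCM signal, then take the difference from the local oscillator."""
    divided_hz = qcm_hz / DIVISION_RATIO
    return abs(LOCAL_OSC_HZ - divided_hz)


def qcm_shift_hz(mixer_hz, baseline_mixer_hz):
    """Refer a change at the mixer output back to the QCM frequency.

    The local oscillator sits above the divided signal, so a drop in the QCM
    frequency raises the mixer output; the divider scales the shift by 1/8.
    """
    return -(mixer_hz - baseline_mixer_hz) * DIVISION_RATIO


baseline = mixer_output_hz(20_000_000.0)   # unloaded sensor (nominal value)
loaded = mixer_output_hz(19_999_600.0)     # sensor frequency lowered by 400 Hz
print(baseline, loaded, qcm_shift_hz(loaded, baseline))
```

With these numbers the unloaded sensor gives a 1 kHz mixer output, and a 400 Hz decrease at the sensor appears as a 50 Hz increase at the mixer output.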
The frequency resolution Δf can be calculated using the following equation, where f_in represents the digital mixer output frequency, f_clk the clock frequency, and M the division ratio.
However, the test signal generator was mainly used in this study, since data previously measured using the QCM sensors must be reproduced accurately during the debug and evaluation phases of the odor recognition system.
The test signal generator includes a direct digital synthesizer (DDS) with a ROM in which previously measured QCM sensor data are stored, as shown in Fig. 2. The pseudo QCM signal created using the DDS is input to the reciprocal counter through a divider. Since we have a large amount of data previously measured using the odor generator (2) under its programmed concentration sequences, we put these data into the ROM.
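The principle of a DDS can be sketched in a few lines: a phase accumulator is advanced by a tuning word on every clock, and its upper bits address a waveform table. The code below is only an illustration of that principle. The accumulator width, table size, 80 MHz clock, and the two stored frequencies are hypothetical, and how the actual ROM data are mapped to tuning words is not described here.

```python
import math

ACC_BITS = 32          # ASSUMPTION: phase accumulator width
LUT_BITS = 8           # ASSUMPTION: waveform table address width
F_CLK_HZ = 80e6        # ASSUMPTION: hypothetical DDS clock

SINE_LUT = [math.sin(2.0 * math.pi * k / (1 << LUT_BITS))
            for k in range(1 << LUT_BITS)]


def tuning_word(f_out_hz, f_clk_hz=F_CLK_HZ):
    """Phase increment per clock that yields f_out at the DDS output."""
    return round(f_out_hz * (1 << ACC_BITS) / f_clk_hz)


def dds_samples(word, n_samples):
    """Generate n_samples by stepping the phase accumulator by the tuning word."""
    phase = 0
    samples = []
    for _ in range(n_samples):
        samples.append(SINE_LUT[phase >> (ACC_BITS - LUT_BITS)])
        phase = (phase + word) & ((1 << ACC_BITS) - 1)   # wrap like a register
    return samples


# Hypothetical stored sensor frequencies, one per measurement interval.
stored_frequencies_hz = [20_000_000.0, 19_999_900.0]
words = [tuning_word(f) for f in stored_frequencies_hz]
print(words)
print(dds_samples(words[0], 4))
```

Replaying a stored frequency sequence then amounts to loading a new tuning word at each measurement interval, which is what makes a recorded experiment repeatable.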

LVQ unit
We adopted LVQ as the classifier since it is suitable for hardware implementation: its algorithm is quite simple despite its high recognition capability. The LVQ unit is a digital circuit based on the LVQ1 method. (4) Figure 3 shows the structure of the LVQ unit, and Table 1 shows its basic specifications. The distance calculation circuit (DCC) outputs the distance between an input vector and a reference vector. Sixty-four DCCs, one per reference vector, are implemented in parallel. For simplicity of calculation, the DCC uses the Manhattan distance (city block distance), which can be realized using simple adders and absolute value calculators, rather than the Euclidean distance, which requires a large number of multiplier circuits. The winner-take-all (WTA) circuit compares the outputs of the DCCs and outputs the address of the DCC with the smallest output. The CONTROL circuit controls the sequence of operations and allows the CPU core to access the RAM holding the reference vectors and the input vectors.
The LVQ unit outputs the address of the nearest-neighbor reference vector. In learning mode, the CPU core calculates the updated reference vector and then stores it in the RAM of the LVQ unit.
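The division of labor between the LVQ unit and the CPU core can be mirrored in software. The following is a minimal sketch of the standard LVQ1 rule with a Manhattan distance, not the circuit itself: `winner` plays the role of the DCCs and the WTA circuit, and `lvq1_update` plays the role of the update computed by the CPU core. The two-dimensional vectors, labels, and learning coefficient are hypothetical.

```python
def manhattan(a, b):
    """City block distance: only subtractions, absolute values, and additions."""
    return sum(abs(x - y) for x, y in zip(a, b))


def winner(reference_vectors, x):
    """Winner-take-all: index of the reference vector nearest to x."""
    distances = [manhattan(r, x) for r in reference_vectors]
    return min(range(len(distances)), key=distances.__getitem__)


def lvq1_update(reference_vectors, labels, x, x_label, alpha):
    """LVQ1: pull the winner toward x if the labels match, push it away otherwise."""
    w = winner(reference_vectors, x)
    sign = 1.0 if labels[w] == x_label else -1.0
    reference_vectors[w] = [r + sign * alpha * (xi - r)
                            for r, xi in zip(reference_vectors[w], x)]
    return w


# Hypothetical two-dimensional example with one reference vector per category.
refs = [[0.0, 0.0], [4.0, 4.0]]
labels = ["apple", "muscat"]

first = lvq1_update(refs, labels, [1.0, 1.0], "apple", alpha=0.5)   # attracted
second = lvq1_update(refs, labels, [3.0, 3.0], "apple", alpha=0.5)  # repelled
print(first, second, refs)
```

In the second call the nearest reference vector belongs to the wrong category, so it is moved away from the input, which is how the category boundary is sharpened during learning.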

STFT unit
An odor vector consists of 8-dimensional data: 4-dimensional data directly come from the frequency measurement unit, and the remaining 4-dimensional data come from the STFT unit. The identification rate of odor is improved by the STFT unit.
The STFT unit carries out a discrete Fourier transform of the latest 32 points (32 points/8 Hz = 4 s) using the Hann window. It calculates the amplitude of the 0.5 Hz component because a previous study found that the 0.5 Hz component was useful for discrimination. (2) Although the STFT unit outputs the DC and 0.5 Hz components, the 0.5 Hz component normalized by the DC component is actually used as the STFT output for the neural network. The input data of the neural network are obtained every 1/8 s: although the STFT unit needs 4 s of data to obtain the spectrum, it outputs data every 1/8 s since it always uses the latest 32 points. Figure 4 shows scatter diagrams from the principal component analysis of the sensor responses to apple and muscat in a highly fluctuating environment, where the odor concentrations were changed at random. Figure 4(a) reveals that the smell of apple cannot be distinguished from that of muscat using only the magnitudes of the sensor responses. Thus, we focused on the information contained in the time constant of the sensor response. A careful look at the temporal data in Figs. 5(a) and 5(b) shows that the time constants of the sensor responses enable us to distinguish apple from muscat. Since the STFT data capture the difference in the time constant, the apple data were separated from the muscat data using the STFT data, as shown in Fig. 4(b).
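The feature computed by the STFT unit can be sketched as follows. With 32 points sampled at 8 Hz, the bin spacing is 8/32 = 0.25 Hz, so the 0.5 Hz component is bin 2. This is a minimal floating-point illustration rather than the fixed-point circuit; the periodic form of the Hann window and the synthetic test signal are assumptions.

```python
import cmath
import math

FS_HZ = 8.0       # sampling frequency stated in the text
N_POINTS = 32     # window length stated in the text (4 s)
TARGET_HZ = 0.5   # component used for discrimination


def hann_window(n):
    # ASSUMPTION: periodic Hann window; the variant used is not stated.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def dft_amplitude(samples, bin_index):
    """Magnitude of a single DFT bin, computed directly from the definition."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * bin_index * k / n)
              for k, s in enumerate(samples))
    return abs(acc)


def stft_feature(latest_samples):
    """0.5 Hz amplitude normalized by the DC amplitude of the windowed data."""
    if len(latest_samples) != N_POINTS:
        raise ValueError("expected the latest 32 samples")
    windowed = [s * w for s, w in zip(latest_samples, hann_window(N_POINTS))]
    target_bin = round(TARGET_HZ * N_POINTS / FS_HZ)   # bin 2
    return dft_amplitude(windowed, target_bin) / dft_amplitude(windowed, 0)


# Synthetic response: a constant level of 10 plus a unit-amplitude 0.5 Hz ripple.
signal = [10.0 + math.cos(2.0 * math.pi * TARGET_HZ * k / FS_HZ)
          for k in range(N_POINTS)]
feature = stft_feature(signal)
print(feature)
```

Normalizing by the DC component makes the feature depend on the relative size of the 0.5 Hz ripple rather than on the overall response magnitude, which is consistent with its use as an indicator of the time constant. Calling `stft_feature` on a sliding buffer of the latest 32 samples yields one value per sample, that is, every 1/8 s.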

CPU core
The role of the CPU core is to control the peripheral units described above and to send the calculation results to the PC. We designed the CPU core using a CPU development tool (SOPC Builder) included in the FPGA development software (Quartus II Version 11.1sp).
The clock frequency of the CPU core was 30 MHz, and the size of the allocated program memory was about 130 kB. The program code was written in C. The CPU core works together with the LVQ unit and the STFT unit. It also executes the following algorithms.

Odor existence decision
A QCM sensor does not show a sharp response to odor. Particularly in the recovery phase, the sensor response remains even after the odor exposure stops (Fig. 6). Since the sensor response in the recovery phase differs from that during the odor exposure, these data often cause misidentification. The odor existence decision algorithm removes sensor responses obtained without smells. Odor existence was judged using a threshold updated adaptively to match the response characteristic of each QCM sensor.
The transition from the no-odor state to the odor-existence state is determined using two statistically determined thresholds, one for the original signal and one for the transient response detection. The median of the multiple sensor responses is used to determine the odor existence, and the logarithm of that median is used to suppress the sensor response variation. The transition from the odor-existence state to the no-odor state is determined when the current sensor response becomes much smaller than the latest data. Another threshold is used for that purpose and is updated according to the data obtained during the period of odor existence.
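The two-state decision can be pictured as a small state machine. The sketch below is a strongly simplified illustration and not the algorithm used in the system: it omits the transient response detection, works on magnitudes of the frequency shifts, and uses a hypothetical onset threshold and a hypothetical release ratio applied to the largest response seen during the odor period.

```python
import math
from statistics import median

ONSET_LOG_THRESHOLD = math.log(50.0)  # ASSUMPTION: onset above a 50 Hz median shift
RELEASE_RATIO = 0.3                   # ASSUMPTION: release below 30 % of the peak


class OdorExistenceDetector:
    """Tracks whether an odor is judged to be present."""

    def __init__(self):
        self.present = False
        self.peak = 0.0

    def step(self, shift_magnitudes_hz):
        """Update the state from one set of sensor responses (one per sensor)."""
        m = median(shift_magnitudes_hz)
        if not self.present:
            # Onset: logarithm of the median exceeds the onset threshold.
            if m > 0.0 and math.log(m) > ONSET_LOG_THRESHOLD:
                self.present = True
                self.peak = m
        else:
            # The release threshold follows the data seen while the odor exists.
            self.peak = max(self.peak, m)
            if m < RELEASE_RATIO * self.peak:
                self.present = False
        return self.present


detector = OdorExistenceDetector()
sequence = [
    [10.0, 12.0, 9.0, 11.0],       # baseline
    [100.0, 120.0, 90.0, 110.0],   # odor arrives
    [200.0, 210.0, 190.0, 205.0],  # odor continues
    [40.0, 50.0, 45.0, 55.0],      # recovery phase
]
states = [detector.step(s) for s in sequence]
print(states)
```

In this sketch the recovery-phase sample is rejected because it falls well below the peak recorded during exposure, even though it is still above the baseline.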

Majority voting
The odor recognition system outputs an identification result every 1/8 s. In a real application, a time resolution coarser than this period poses no problem. Thus, we improved the identification rate by adopting a majority voting method. The category that forms the majority within the latest 17 points, made up of the current data point and the 16 preceding points, is selected, so this system has a time resolution of about 2 s (17 points/8 Hz). A delay of 2 s is acceptable in real-time processing.
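Majority voting over the latest 17 results can be sketched with a fixed-length buffer. This is an illustration under assumptions: the class name is hypothetical, the behavior before the buffer is full is a choice made here, and ties (possible with four categories) are resolved in favor of the category that entered the buffer first, which the text does not specify.

```python
from collections import Counter, deque

WINDOW = 17   # current point plus the 16 preceding points


class MajorityVoter:
    """Returns the most frequent category among the latest WINDOW results."""

    def __init__(self, window=WINDOW):
        self.history = deque(maxlen=window)   # oldest result drops out automatically

    def vote(self, category):
        self.history.append(category)
        # ASSUMPTION: ties go to the category seen earliest in the buffer.
        return Counter(self.history).most_common(1)[0][0]


voter = MajorityVoter()
for _ in range(16):
    voter.vote("apple")

outputs = [voter.vote("muscat") for _ in range(9)]
print(outputs)
```

After 16 "apple" results, isolated "muscat" results are suppressed, and the output switches only once "muscat" holds 9 of the 17 slots. This is the source of the roughly 2 s time resolution mentioned above.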

Experimental Method
An experiment to identify four smells, namely, apple, muscat, banana, and pineapple, was performed using the odor recognition system.

Experimental setup
In an actual environment, we measure odors whose concentration fluctuates. However, since the odor concentration in air fluctuates irregularly, we cannot reproduce the concentration change faithfully. Therefore, we measured the odor response using an odor generator described elsewhere. (2) We placed the odor generator in a temperature-controlled room. Moreover, humidity was controlled by flowing the carrier gas through several vials filled with water. Temperature and humidity were measured using a temperature and humidity sensor (EK-H2, SENSIRION).
In this study, we used the test signal generator instead of the actual QCM sensors. Moreover, we used a small FPGA board, ACM022 (Humandata, FPGA: EP3C120F780C8N). Table 2 shows the resources used by the odor recognition system embedded in the FPGA. The LVQ unit consumed the most logic elements, mainly in its DCCs. Nevertheless, this FPGA, with a moderate number of logic elements, is sufficient for implementing the odor recognition system.

Experimental data
In this experiment, we use four odor samples: apple, muscat, banana, and pineapple. The following four parameters were controlled.
• Temperature (°C)
• Relative humidity (%RH)
• Relative odor concentration (%RC)
• Concentration profile No. x
Among these parameters, the relative humidity was fixed at 50 %RH, and the temperature was fixed at 25 °C. The relative odor concentration means the concentration relative to the full scale. Since the odor concentration is programmed to change every second, this concentration indicates the maximum one during one concentration sequence. Furthermore, each concentration profile No. x is determined using a random number, and x indicates the ID of the random number sequence.
When we evaluate the pattern recognition capability, the data for evaluation should be different from those for training. Thus, the following two experiments based on a cross-validation test were performed.

Experiment 1
Learning was performed using the data of the relative odor concentration of 20 and 80 %RC. The data of 20, 40, and 80 %RC not used for training were used for evaluation. Concentration profile No. 3 was used.

Experiment 2
Learning was performed using the data of odor concentration profiles Nos. 1 and 3. The data of concentration profiles Nos. 1, 2, and 3 not used for training were used for evaluation. The relative concentration was 80 %RC.
These sensor response data were accumulated in the memory. The number of concentration sequences per odor sample was 10.

Evaluation of System Robustness against Concentration Fluctuation
Before performing the identification experiment, we observed the movement of the reference vectors that are important in the LVQ method. This evaluation uses apple and muscat odors. Figure 7 shows the result of principal component analysis.
In Fig. 7(a), the initial reference vectors were extracted from the input vector data of the same category. The learning operation moved the reference vectors, as shown in Fig. 7(b). Comparing Fig. 7(b) with Fig. 7(a) shows that, after learning, the reference vectors are located so as to reflect the distribution of the input data of the same category.
Next, we performed the identification experiment. Learning and identification conditions are as shown in the previous section. The experimental conditions of LVQ are tabulated in Table 3. The sampling rate of the sensor data was 8 points/s. The learning coefficient monotonically decreased, and it became 1/(27 + i) during the i-th epoch.
Then, we conducted two experiments under the conditions described above. First, we performed two-category classification, in which the data are classified into one of two categories. There are two cases: "apple versus muscat" and "banana versus pineapple". The response patterns of apple and muscat were similar, as were those of banana and pineapple. Then, we performed four-category classification (apple, muscat, banana, and pineapple). The details of the experimental data were described in the previous section. Tables 4 and 5 show the identification rates for experiments 1 and 2, respectively. The number in brackets is the correct answer rate without the majority voting method; the number without brackets is the correct answer rate with the majority voting method. The four-category classification is indicated as "ALL" in Tables 4 and 5.
The identification rate was almost 90% in both tables. This result shows that the proposed odor recognition system is sufficiently robust to keep up with rapid concentration changes. Moreover, the identification rate was improved by 9.2% when we adopted the majority voting method. In the two-category identification experiment, the identification rate of the apple-muscat set was better than that of the banana-pineapple set in most cases. Since the difference in the sensor's time constant between banana and pineapple was not clear, the contribution of the STFT was small in the banana-pineapple case.