Investigation of Relationship Between Eye Gaze and Brain Waves towards Smart Sensing for E-learning

In recent years, we have witnessed a new trend of learning called e-learning, in which learners take courses using electronic devices with an Internet connection. Although e-learning is convenient because it removes temporal and spatial limitations, it is difficult to know whether the learner is really paying attention to the learning materials. To address this problem, we used electroencephalography (EEG) to investigate a learner’s concentration in our previous study. However, that study relied on subjective evaluation, and there was no objective observation relating the EEG signals to the learner’s concentration. Given this background, in this study we compared eye gaze and EEG results to find the appropriate electrode positions and frequency bands of EEG for e-learning. We compared the EEG results obtained while the subjects were watching a video lecture with those obtained while they were not watching, and determined that the viewing state could be predicted from EEG using logistic regression and a support vector machine (SVM). The results suggested that measuring beta and gamma waves and examining the parietal and occipital regions are both effective.


Introduction
In recent years, education using electronic devices has become widespread owing to the spread of portable devices. E-learning can be used with few restrictions on place and time as long as a device is available, and it is expected to develop further in the future. However, because e-learning can be used anytime and anywhere, the means of monitoring students are insufficient: it is difficult for students to maintain concentration and for teachers to alert students who are not concentrating. Knowing the learner's status from human biological information is considered an important means of solving these problems. In our previous study, we tried to measure the effectiveness of learning materials in terms of learner concentration using electroencephalography (EEG) results as an indicator. In that study, we used alpha waves as the index of a learner's status. (1) Among biometric information, EEG has been used to study not only the state of learners but also the state of drivers while driving and the emotions of people. (2)(3)(4)(5)(6) Some EEG devices have many electrodes and can acquire information from the whole brain, whereas those with few electrodes can measure only a specific part.
EEG devices equipped with a large number of electrodes may impose a large burden on wearers. On the other hand, simple EEG devices with a small number of electrodes have limitations in their use. When EEG is used in education, the burden on learners must be kept low; therefore, simple EEG devices are suitable in that case. In the field of emotion recognition, research has been conducted to reduce the number of electrodes by identifying the sites that are effective for measuring emotion with EEG. (7) If the effective frequency bands and electrode positions for measuring the status of learners can be specified, the status of learners can be appropriately measured even with simple EEG devices.
In our previous work, the effectiveness of changing the contents of learning materials was measured by alpha waves in the occipital lobe. However, there was no basis for proving that alpha waves are an effective way to measure concentration. (1) To identify the necessary electrodes and frequency bands, an index that gives an objective true value of the learner's state during learning is necessary. We paid attention to eye gaze when humans watch a lecture video. It is said that humans obtain over 80% of the information that can be obtained from the five senses through vision. (8) There is an indicator called PERCLOS, obtained from the eye, that evaluates the human arousal level. (9) PERCLOS is calculated as the eye closure time within a certain time window and is used as an index of arousal level. In particular, it is used to evaluate a driver's arousal level while driving a car. (10,11) There are also studies that use eye gaze for learner status estimation and feedback. (12)(13)(14) Therefore, we considered that gaze could be used as the true value of a learner's state. In this study, we adopted a value similar to PERCLOS, namely, how much of a certain time window learners spent watching a video lecture, and aimed to identify the EEG positions and frequency bands that should be acquired when examining the status of learners during a video lecture using simple EEG devices.
The rest of this paper is organized as follows. In Sect. 2, we explain the materials and methods. In Sect. 3, we describe the results of our experiments. In Sects. 4 and 5, we give a discussion and conclusions.

EEG
We used g.Nautilus EEG devices manufactured by g.tec in the system. g.Nautilus has 16 electrodes, namely, Fp1, Fp2, F3, F4, Fz, C3, Cz, C4, T3, T4, P3, P4, Pz, O1, O2, and Oz, in accordance with the international 10-20 system, as shown in Fig. 1. A Fourier transform of the EEG results was performed and, as shown in Table 1, the results were classified into five frequency bands: delta, theta, alpha, beta, and gamma waves.
In addition, the spectrum in each frequency band was divided by the sum of the spectrum over the entire frequency range for each electrode to obtain the ratio of each frequency band. Let T be the sum of the frequency spectra over all frequency bands (1 to 65 Hz) and S be the sum of the frequency spectra within a given frequency band. For delta waves, the ratio is

$$\eta_\delta = \frac{S_\delta}{T}. \tag{1}$$

In Eq. (1), the ratio of the frequency band of delta waves is expressed as $\eta_\delta$. In this case, $S_\delta$ is the sum of the absolute values of the frequency spectra from 1 to 3 Hz.
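The band ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes NumPy, a 1 s analysis window at the 250 Hz sampling rate, and a hypothetical function name `band_ratios`. Only the delta range (1 to 3 Hz) and the overall range (1 to 65 Hz) are taken from the text; the other band edges are placeholders standing in for Table 1.

```python
import numpy as np

FS = 250  # EEG sampling rate in Hz

# Only "delta" (1-3 Hz) and the overall 1-65 Hz range come from the text.
# The remaining band edges are ASSUMED placeholders for Table 1.
BANDS = {
    "delta": (1, 3),
    "theta": (4, 7),
    "alpha": (8, 13),
    "beta": (14, 30),
    "gamma": (31, 65),
}

def band_ratios(window, fs=FS, bands=BANDS, f_min=1.0, f_max=65.0):
    """Return eta = S_band / T for each band, for one electrode and one window."""
    window = np.asarray(window, dtype=float)
    magnitude = np.abs(np.fft.rfft(window))           # absolute values of the spectrum
    freqs = np.fft.rfftfreq(window.size, d=1.0 / fs)  # frequency of each bin in Hz
    eps = 1e-9                                         # tolerance for rounding at band edges
    total = magnitude[(freqs >= f_min - eps) & (freqs <= f_max + eps)].sum()  # T
    ratios = {}
    for name, (lo, hi) in bands.items():
        in_band = (freqs >= lo - eps) & (freqs <= hi + eps)
        ratios[name] = magnitude[in_band].sum() / total  # S_band / T
    return ratios

# Toy input: a pure 2 Hz sine lasting 1 s, which falls entirely in the delta band.
t = np.arange(FS) / FS
eta = band_ratios(np.sin(2 * np.pi * 2.0 * t))
print(eta)
```

With a 1 s window the frequency bins fall on integer frequencies, so the five assumed bands cover 1 to 65 Hz without gaps and the ratios sum to 1. With other window lengths, bins lying between two bands would count toward T but toward no band.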

Eye gaze
Using a Tobii Eye Tracker X60, we performed eye gaze recognition at a sampling frequency of 60 Hz and recorded eye gaze information while the subjects were watching a video lecture. In this study, we defined viewing and nonviewing states. The nonviewing state covers the periods when the tracker cannot recognize the eye gaze and those when the gaze is not directed at the video. It has been suggested that there is a relationship between an increase in the number of blinks and a decrease in concentration. (14) When the gaze is not directed at the video lecture, users are looking at another part of the monitor, such as the time bar, or away from the screen. In this study, the subjects were prohibited from taking notes. Therefore, a subject could not be looking away from the video while still concentrating on the task.

Learning materials
In this study, we focused on the use of a video lecture in e-learning. Video lectures are a familiar means of e-learning that can be viewed on video posting sites. In addition, learners can study without items such as desks and pens. We used approximately 15 min of a lecture video related to neuron movement. This lecture video was used in our previous work. (1) The lecture video is a teaching material based on still images and illustrations. It does not introduce difficult content that requires calculation; however, technical terms such as neurotransmitter sometimes appear. An example slide is shown in Fig. 2. This slide explains neuron connections. Since the explanation proceeds step by step from the introduction, subjects who do not listen to an earlier part will not understand the later content. However, if they listen carefully, they should be able to understand the content.

Environment
This experiment was carried out with the approval of the Life Science Committee of the Faculty of Science and Engineering, Aoyama Gakuin University. The research manual and consent form were prepared on the basis of compliance with the Declaration of Helsinki.
Eight college students participated in the experiment as subjects. Each subject was asked to sit down in front of a monitor equipped with a Tobii Eye Tracker. After that, a g.Nautilus EEG device was attached to the subject, and the eye gaze was calibrated using Tobii Studio. We then asked the subject to watch the lecture video. None of the subjects was acquainted with the content of the lecture beforehand.
While the subjects were watching the lecture video, brain waves were recorded at 250 Hz and eye gaze was recorded simultaneously at 60 Hz. In addition, for reference, we recorded their state when watching movies, and at the end, we asked the subjects to answer a questionnaire about the content of the lecture.

Definition of viewing condition
To examine the relationship between eye gaze and EEG results, a time window was set. According to a previous study, a time window of 6 s is optimal for emotional analysis using brain waves. (15) However, this may not apply to measuring concentration during video lectures. Therefore, it is essential to compare different window sizes. Against this background, we compared time windows of 6, 4, and 2 s.
We also defined the viewing state from the measured eye gaze data, assuming that the subject's eyes are directed at the target video content. We defined this metric as the ratio of viewing within a monitoring period, and we compared threshold ratios of 50, 60, and 70%.
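As an illustration of this definition, the following sketch turns per-sample gaze flags into one label per time window. It is not the authors' code: the function name `label_windows`, the use of non-overlapping windows, and the choice to treat a ratio equal to the threshold as viewing are all assumptions made for the example. The label coding (0 for viewing, 1 for nonviewing) follows the dummy variable used later in the paper.

```python
def label_windows(on_video, fs=60, window_s=2.0, threshold=0.5):
    """Split per-sample gaze flags into windows and label each window.

    on_video: sequence of booleans, one per eye-tracker sample. False covers
              both "gaze not recognized" and "gaze not directed at the video".
    Returns a list of (viewing_ratio, label), with label 0 = viewing and
    1 = nonviewing.
    """
    samples_per_window = int(round(fs * window_s))
    labelled = []
    # ASSUMPTION: consecutive, non-overlapping windows; a trailing partial
    # window is dropped.
    for start in range(0, len(on_video) - samples_per_window + 1, samples_per_window):
        chunk = on_video[start:start + samples_per_window]
        ratio = sum(1 for flag in chunk if flag) / samples_per_window
        # ASSUMPTION: a ratio exactly at the threshold counts as viewing.
        label = 0 if ratio >= threshold else 1
        labelled.append((ratio, label))
    return labelled

# Toy gaze trace at 60 Hz: 2 s on the video, then 0.5 s on and 1.5 s off, then 2 s off.
gaze = [True] * 120 + [True] * 30 + [False] * 90 + [False] * 120
print(label_windows(gaze))
```

Changing `window_s` to 4 or 6 and `threshold` to 0.6 or 0.7 gives the other combinations compared in the study.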

Preprocessing of EEG signals
EEG results were divided into five frequency bands using a short-time Fourier transform (STFT), and the ratio η was calculated. The STFT was performed on 1 s sliding windows with a 0.5 s overlap using the Hamming window. We calculated the average and variance of η within each time window. We standardized the data of the eight subjects so that the average is 0 and the variance is 1, and then integrated them into one dataset. When the average of the original data is μ and the standard deviation is σ, a value $x$ of the original data is converted to $x'$ using the equation

$$x' = \frac{x - \mu}{\sigma}. \tag{2}$$

We considered that a ten-dimensional vector (the average and variance of each of the five frequency bands) is obtained from the i-th electrode:

$$x_i = \left(\mathrm{ave}(\eta_{\delta i}), \ldots, \mathrm{ave}(\eta_{\gamma i}), \mathrm{var}(\eta_{\delta i}), \ldots, \mathrm{var}(\eta_{\gamma i})\right). \tag{3}$$

In Eq. (3), $\mathrm{ave}(\eta_{\delta i})$ and $\mathrm{var}(\eta_{\beta i})$ are, for example, the average of delta waves and the variance of beta waves while watching videos, respectively.
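The feature construction and standardization described above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' pipeline: it assumes NumPy, the hypothetical helper names `window_features` and `standardize`, a feature order of five averages followed by five variances, and the population variance (divisor n). The random numbers only stand in for real band ratios.

```python
import numpy as np

def window_features(eta_frames):
    """Build the 10 features of one electrode for one time window.

    eta_frames: array of shape (n_frames, 5) holding the band ratios
                (delta, theta, alpha, beta, gamma) of each STFT frame
                that falls inside the window.
    Returns 5 averages followed by 5 variances (ASSUMED ordering).
    """
    eta_frames = np.asarray(eta_frames, dtype=float)
    return np.concatenate([eta_frames.mean(axis=0), eta_frames.var(axis=0)])

def standardize(features):
    """Column-wise z-score (x - mu) / sigma, giving mean 0 and variance 1."""
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)  # population standard deviation (ASSUMED)
    return (features - mu) / sigma

# Toy data: 50 windows, each with 3 STFT frames (1 s frames, 0.5 s hop, 2 s window).
rng = np.random.default_rng(0)
raw = np.stack([window_features(rng.random((3, 5))) for _ in range(50)])
z = standardize(raw)
print(z.shape)
```

If the standardization is applied per subject, as the wording "integrated after standardization" suggests, `standardize` would be called once on each subject's feature matrix before the matrices are concatenated.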

Calculation of pseudo-coefficient of determination
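The pseudo-coefficient of determination of logistic regression was calculated for each window size to investigate the most appropriate time window. The logistic regression model is

$$z = \sum_i a_i^{\top} x_i + b, \tag{4}$$

$$p = \frac{1}{1 + e^{-z}}. \tag{5}$$

In Eq. (4), $a_i$ and $b$ are parameters, where $a_i$ is the coefficient vector for $x_i$, and Eq. (5) is the formula used in the logistic regression, with $p$ the predicted probability of the objective variable being 1.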
The objective variable is a dummy variable {0: viewing, 1: nonviewing} obtained by monitoring eye gaze. Before the classification, we performed undersampling so that the numbers of samples with objective variable 0 and 1 are the same.
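A minimal sketch of such class balancing is shown below. The paper does not state how the samples to keep were chosen, so random selection with a fixed seed is an assumption, as is the function name `undersample`.

```python
import random

def undersample(features, labels, seed=0):
    """Randomly drop samples of the larger class until both classes have equal size.

    ASSUMPTION: samples are chosen uniformly at random; the seed only makes
    the example reproducible.
    """
    rng = random.Random(seed)
    idx0 = [i for i, y in enumerate(labels) if y == 0]  # viewing
    idx1 = [i for i, y in enumerate(labels) if y == 1]  # nonviewing
    n = min(len(idx0), len(idx1))
    keep = sorted(rng.sample(idx0, n) + rng.sample(idx1, n))  # keep original order
    return [features[i] for i in keep], [labels[i] for i in keep]

# Toy data: 8 viewing windows and 3 nonviewing windows.
xs, ys = undersample(list(range(11)), [0] * 8 + [1] * 3)
print(xs, ys)
```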
In this case, McFadden's pseudo-coefficient of determination is used as the evaluation indicator. (16) When the likelihood of the logit model is $L_1$ and the likelihood when all the parameters are 0 is $L_0$, McFadden's pseudo-coefficient of determination is expressed as

$$R^2 = 1 - \frac{\ln L_1}{\ln L_0}. \tag{6}$$

Its value is between 0 and 1, and the closer it is to 1, the better the fit. For each window size, only the 10 explanatory variables obtained from one electrode, out of the 160 explanatory variables obtained from all 16 electrodes, were used, and the pseudo-coefficient of determination for the dummy variable of the eye gaze was determined for each electrode. We compared 2, 4, and 6 s windows.
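The pseudo-coefficient of determination can be computed directly from the labels and the probabilities predicted by a fitted model. The sketch below follows the definition in the text, where the reference likelihood uses all parameters set to 0, so that every sample gets a predicted probability of 0.5. The function names are hypothetical and the probabilities are toy values, not results from the study.

```python
import math

def log_likelihood(labels, probs):
    """Bernoulli log-likelihood; probs[i] is the predicted probability that labels[i] == 1."""
    return sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(labels, probs))

def mcfadden_r2(labels, probs):
    """McFadden's pseudo R^2 = 1 - ln(L1) / ln(L0)."""
    ln_l1 = log_likelihood(labels, probs)
    # All parameters equal to 0 gives sigmoid(0) = 0.5 for every sample.
    ln_l0 = log_likelihood(labels, [0.5] * len(labels))
    return 1.0 - ln_l1 / ln_l0

labels = [0, 1, 0, 1]
print(mcfadden_r2(labels, [0.5, 0.5, 0.5, 0.5]))  # uninformative model
print(mcfadden_r2(labels, [0.1, 0.9, 0.2, 0.8]))  # informative model
```

Predicted probabilities of exactly 0 or 1 would make the logarithm undefined, so in practice they are clipped to a small margin before this calculation.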

Comparison of window size
To decide the period of monitoring and the viewing state, we varied the period of monitoring among 2, 4, and 6 s and the threshold of the viewing state derived from the eye gaze among 50, 60, and 70%.
First, we calculated the pseudo-coefficient of determination of logistic regression for all the above cases.
Figure 3 shows the average pseudo-coefficient of determination over all electrodes for each period of monitoring and each ratio of the viewing state. The viewing ratio of 50% gives the highest value for each of the 2, 4, and 6 s windows. Furthermore, the highest value overall is obtained for the pair of a 2 s window and a 50% viewing ratio. Thus, we used this pair in the subsequent analysis.

Comparison of average EEG results by states
We compared the averages $\mu_\delta$, $\mu_\theta$, $\mu_\alpha$, $\mu_\beta$, and $\mu_\gamma$ of the viewing and nonviewing states and performed a t-test.
Results in Table 2 showed that for the beta waves of Cz, T3, C4, P3, Pz, P4, and O1 and the gamma waves of Cz, T3, C3, C4, P3, Pz, P4, and O1, the p value was less than 0.05. However, other frequency bands showed no significant difference. Table 3 shows the average η of the significant electrodes.
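For reference, a two-sample comparison of this kind can be sketched as follows. The paper does not state which variant of the t-test was used, so Welch's unequal-variance form is an assumption here, and the function name `welch_t` and the input values are purely illustrative.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's t-test.

    ASSUMPTION: unequal variances; the source only says "a t-test".
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Toy values standing in for the band ratio of one electrode in the two states.
viewing = [1.0, 2.0, 3.0, 4.0, 5.0]
nonviewing = [2.0, 3.0, 4.0, 5.0, 6.0]
print(welch_t(viewing, nonviewing))
```

The p value follows from the t distribution with the returned degrees of freedom; a library routine such as `scipy.stats.ttest_ind` with `equal_var=False` performs the same test and returns the p value directly.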

Identification of electrodes
We identified the electrodes with a high contribution rate to the eye gaze classification from the 16 electrodes. Cross-validation with 10 verifications was performed on the data sampled with the 2 s window, and the accuracy was calculated. Tables 4 and 5 show the results of the classification. As the number of electrodes was increased up to eight, the highest accuracy of 76.2% was obtained when the three electrodes Oz, Pz, and O2 were selected.
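One way to search for a small electrode subset is greedy forward selection, sketched below. The paper does not describe its search strategy, so this procedure is an assumption rather than the authors' method. The function `forward_select` is hypothetical, and `toy_score` is a made-up stand-in for the cross-validated accuracy of a classifier trained on the features of the given electrodes; its numbers are not results from the study.

```python
ELECTRODES = ["Fp1", "Fp2", "F3", "F4", "Fz", "C3", "Cz", "C4",
              "T3", "T4", "P3", "P4", "Pz", "O1", "O2", "Oz"]

def forward_select(candidates, score, max_size=8):
    """Greedily add the electrode that gives the highest score(subset).

    score: callable taking a tuple of electrode names and returning a number,
           e.g. a cross-validated accuracy.
    Returns (best_subset, best_score) over all subset sizes up to max_size.
    """
    selected, best_subset, best_score = [], (), float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_size:
        # Score every one-electrode extension of the current subset.
        trials = [(score(tuple(selected + [e])), e) for e in remaining]
        top_score, top_electrode = max(trials)
        selected.append(top_electrode)
        remaining.remove(top_electrode)
        if top_score > best_score:
            best_subset, best_score = tuple(selected), top_score
    return best_subset, best_score

# Made-up gains for the demonstration only; a real score would train and
# cross-validate a classifier on the selected electrodes.
TOY_GAINS = {"Oz": 0.10, "Pz": 0.08, "O2": 0.06}

def toy_score(subset):
    return 0.5 + sum(TOY_GAINS.get(e, -0.01) for e in subset)

best, value = forward_select(ELECTRODES, toy_score)
print(best, value)
```

Greedy selection evaluates far fewer subsets than an exhaustive search over all combinations, at the cost of possibly missing the best subset.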

Discussion
When the average values of η were compared, significant differences were obtained in beta and gamma waves, which originated from the parietal to occipital lobes and the left temporal lobe.
When the viewing state was classified into two classes from EEG, an accuracy of 76.2% was obtained by logistic regression with three electrodes. In addition, the results obtained with a support vector machine (SVM) suggested that the maximum accuracy could be obtained with four electrodes. In each case, many of the selected positions were in the occipital and parietal lobes. This tendency indicates that the activated brain positions vary when only one frequency band is considered, whereas when all frequency bands are considered, neighboring sites tend to be selected, and these sites are in the occipital and parietal lobes. Therefore, it is necessary to consider not only a specific frequency band but the entire brain wave when identifying the brain regions activated during e-learning.
Alpha waves in the occipital lobe are known to be activated at rest. (17) Although similar results were obtained in this study, the results suggest that factors other than rest must also be considered when measuring the learner's state during e-learning.

Conclusions
By using a classifier, we were able to investigate the relationship between eye gaze and EEG results. By using 16 electrodes and five frequency bands, we showed that electrodes over the parietal and occipital lobes and the beta and gamma bands are candidates for classifying concentration from brain waves. This suggests that even if simple EEG devices with a small number of electrodes are used, the state of learners can be properly monitored.
In this study, we had eight subjects; it is necessary to generalize our results by increasing the number of subjects with various characteristics and to improve the accuracy by increasing the amount of data. In addition, we focused only on video lectures in e-learning, so we have to verify that our results are applicable to other learning methods. Finally, we would like to examine the usefulness of learning materials using EEG by actually preparing e-learning material that provides feedback based on brain waves.