Mood Prediction in Consideration of Certainty Factor Using Multilayer Deep Neural Network and Storage-Type Prediction Models

In this paper, we propose a multilayer deep neural network (M-DNN) and storage-type prediction models (STPMs) to predict mood two weeks in advance with high accuracy. The M-DNN outputs predictions as well as unpredictable data using a deep neural network and threshold optimization in each prediction layer. The threshold optimization determines the threshold that maximizes a certainty factor, which is calculated from the predictive accuracy of the M-DNN and the amount of unpredictable data. The STPMs interpolate the unpredictable data by accumulating the predictions output from the M-DNN, thereby decreasing the amount of unpredictable data output from the M-DNN. Experiments show that M-DNN and STPMs can predict mood two weeks in advance with 70% accuracy. The predictive accuracy of M-DNN+STPMs is 11% higher than that of DNN alone. Hence, M-DNN+STPMs is an effective method for mood prediction.


Introduction
Depression has become a social problem in Japan. Depressed patients are difficult to cure because they think only of bad things and fall into a vicious cycle. Therefore, depression needs to be prevented. To prevent depression, people need to be aware of their mental health in daily life. However, they cannot correctly recognize their own mental health.
Previous research (1)(2)(3)(4) supports mental health care by estimating current moods and predicting future moods. C. T. Eagle (5) defines a mood as an emotional state. Moods carry positive or negative valence over long periods and are changed by body condition and environment. Valenza et al. (1) estimated bipolar disorder with 95% accuracy using the heart rate of patients. Yoshino and Matsuoka (2) and Kinase and Venture (3) estimated the current mood in daily life from the heart rate and muscle potential during walking. On the other hand, Kajiwara et al. (4) predicted tomorrow's mood with 73% accuracy using yesterday's mood, weather information, and biological information. However, moods 2 d later and beyond could not be predicted with this system. Thus, the prediction system needs improved predictive accuracy and an extended prediction period for practical use in daily life.
In this study, we propose a multilayer deep neural network (M-DNN) and storage-type prediction models (STPMs) to predict mood over a long period with high accuracy using biological information and weather information. The prediction error is composed of bias, variance, and irreducible error. Bias is the error between the target outputs and the expected value of the estimate. Variance is the error due to fluctuations in the training set. The irreducible error is composed of observation noise and unseen information. A machine learning algorithm can reduce bias and variance; however, it cannot reduce the irreducible error, because machine learning cannot improve the accuracy of observations and cannot obtain unseen information.
Mood is decided by a person's subjective evaluation; hence, mood data include much observation noise, and forcing a prediction from such noisy data decreases the predictive accuracy. M-DNN therefore does not necessarily predict the future mood from noisy data. M-DNN is composed of prediction layers, each consisting of a deep neural network (6) and threshold optimization. M-DNN outputs a prediction and unpredictable data in each prediction layer. The deep neural network outputs a class probability, which is an index of prediction difficulty. Hence, if the class probability is less than the threshold, M-DNN outputs unpredictable data; otherwise, M-DNN outputs the prediction. The threshold optimization determines the threshold that maximizes the certainty factor, which is calculated from the predictive accuracy and the amount of unpredictable data. The STPMs decrease the amount of unpredictable data output from the M-DNN by interpolating it with the accumulated predictions output from the M-DNN.
A person who has symptoms such as depressive mood, anorexia, and lack of sleep is likely to be depressed for a long time. Therefore, M-DNN and STPMs predict the mood two weeks later or beyond. Biological information and behavioral information are obtained with a body composition monitor, a blood pressure monitor, and a pedometer. Weather information is obtained online for free.
We explain M-DNN and STPMs in § 2. In § 3, M-DNN and STPMs are compared with existing machine learning algorithms such as the deep neural network, the random forest, the linear kernel support vector machine, and the radial basis function kernel support vector machine. In § 4, conclusions and suggestions for future work are provided.

Figure 1 shows the processing flowchart of the mood prediction system. First, a user uploads biological information, the current mood, and environmental information; the environmental information is the user's place of residence. Second, the user assesses his or her current mental state on the basis of the biological information and current mood. Third, a prediction system such as M-DNN obtains the biological information, current mood, and weather information. Fourth, M-DNN and STPMs output the prediction and unpredictable data from the biological information, current mood, and weather information. Fifth, the user takes precautionary measures against depression on the basis of the predicted future mood. The user can preserve a healthy mental state by taking precautionary measures and receiving treatment. The mood prediction system is a personal adaptive system, as in previous research, (1)(2)(3)(4) and it cooperates with the mood estimation systems. (1)(2)(3)

Figure 2 shows a block diagram of M-DNN and STPMs. M-DNN and STPMs predict the future mood and output unpredictable data. The mood types are positive mood and negative mood. First, the deep neural network in the first prediction layer outputs the class probability of the future mood from the biological information, current mood, and weather information. The deep neural network in the ith layer is defined as DNN(i). Second, the threshold optimization determines the threshold that maximizes the certainty factor of DNN(i). The certainty factor of DNN(i) is calculated from the predictive accuracy of DNN(i) and the amount of unpredictable data. The threshold in the ith layer is defined as T_c(i).
Third, if the class probability of the future mood is less than T_c(i), M-DNN outputs unpredictable data; otherwise, M-DNN outputs the prediction. Fourth, the deep neural network in the next prediction layer outputs the class probability of the future mood from the unpredictable data. M-DNN repeats steps 2 to 4 until the amount of unpredictable data is less than 10 or the predictive accuracy of DNN(i) falls below the random (chance) probability.
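The layered routing in the steps above can be sketched as follows. This is a minimal sketch: the class `StubDNN`, the method name `predict_proba`, and the fixed probability tables are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class StubDNN:
    """Stand-in for a trained DNN(i); any model exposing
    predict_proba(sample_ids) -> (n_samples, n_classes) would work."""
    def __init__(self, probs_by_sample):
        self.probs = {k: np.asarray(v) for k, v in probs_by_sample.items()}
    def predict_proba(self, sample_ids):
        return np.stack([self.probs[i] for i in sample_ids])

def m_dnn_predict(layers, thresholds, sample_ids):
    """Route samples through prediction layers: a layer emits a prediction
    only when its maximum class probability reaches T_c(i); all other
    samples are passed to the next layer as unpredictable data."""
    predictions = {}
    pending = list(sample_ids)
    for dnn, t_c in zip(layers, thresholds):
        if not pending:
            break
        probs = dnn.predict_proba(pending)
        still_pending = []
        for idx, row in zip(pending, probs):
            if row.max() >= t_c:                  # confident enough: predict
                predictions[idx] = int(row.argmax())
            else:                                 # below T_c(i): defer
                still_pending.append(idx)
        pending = still_pending
    return predictions, pending                   # pending = unpredictable data

# Two layers with illustrative class probabilities for samples 0-2.
dnn1 = StubDNN({0: [0.9, 0.1], 1: [0.6, 0.4], 2: [0.5, 0.5]})
dnn2 = StubDNN({1: [0.3, 0.7], 2: [0.55, 0.45]})
preds, unpredictable = m_dnn_predict([dnn1, dnn2], [0.8, 0.7], [0, 1, 2])
```

Here sample 0 is resolved by the first layer, sample 1 by the second, and sample 2 remains unpredictable and would be handed to the STPMs.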

Mood Prediction in Consideration of the Certainty Factor by M-DNN and STPMs
The unpredictable data are interpolated using the STPMs. First, the accuracy probability is calculated from the precision of M-DNN; it represents the reliability of the accumulated predictions. Second, the threshold optimization determines the threshold that maximizes the certainty factor of the STPMs. The certainty factor of the STPMs is calculated from the predictive accuracy of the STPMs and the amount of unpredictable data. This threshold is defined as T_a. Third, if the accuracy probability of the future mood is less than T_a, the STPMs output the unpredictable data; otherwise, the STPMs output the prediction.
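The interpolation steps above can be sketched as follows. This is a hedged sketch: weighting each accumulated vote by its per-class precision is one plausible way to form the accuracy probability, and the names `stored`, `precision`, and `t_a` are illustrative assumptions.

```python
from collections import defaultdict

def stpm_interpolate(stored, precision, t_a):
    """Interpolate one unpredictable cell from predictions accumulated on
    earlier days for the same target date.

    stored:    list of predicted classes for that target date
    precision: per-class precision of M-DNN (reliability of each vote)
    t_a:       accuracy probability threshold T_a
    """
    votes = defaultdict(float)
    for c in stored:
        votes[c] += precision[c]        # weight each vote by its reliability
    if not votes:
        return None                     # nothing accumulated yet
    best = max(votes, key=votes.get)
    a_max = votes[best] / sum(votes.values())   # accuracy probability
    return best if a_max >= t_a else None       # None = still unpredictable
```

For example, with precision {0: 0.8, 1: 0.6} and stored votes [0, 0, 1], the accuracy probability of class 0 is 1.6 / 2.2 ≈ 0.73, so the cell is interpolated when T_a = 0.7 but stays unpredictable when T_a = 0.8.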

M-DNN
M-DNN is composed of prediction layers, each consisting of a deep neural network and threshold optimization. The deep neural network outputs a class probability, which is an index of the difficulty of the prediction. The deep neural network consists of hidden layers, each with several units.
First, the deep neural network learns the explanation variables by unsupervised learning in pretraining as follows. The first hidden layer learns the explanation variables using an autoencoder. (7) The zth hidden layer (z = 2, 3, …, Z) learns the output of the (z−1)th hidden layer with an autoencoder. The Zth layer outputs a feature vector. Second, the networks of hidden layers are adjusted by fine-tuning, which learns the feature vectors output from the Zth layer and recognizes the objective variable by supervised learning.

The deep neural network outputs the class probability c(y, b, q) of each objective variable, where Y is the objective value set, y is an objective value, b is the current day, and q is a future day. The maximum class probability is

c_max(b, q) = max_{y∈Y} c(y, b, q).

If c_max(b, q) is less than T_c(i), the deep neural network in the ith layer outputs unpredictable data; otherwise, it outputs the prediction

y_est(b, q) = argmax_{y∈Y} c(y, b, q),

where T_c(i) is the class probability threshold and y_est(b, q) is the predicted mood q d after day b. T_c(i) is determined as the threshold t_c that maximizes the certainty factor E_c(i, t_c) of DNN(i); the certainty factor increases with the predictive accuracy of DNN(i) and decreases with the amount of unpredictable data. The term f_c(i, t_c) is the f-measure obtained when applying t_c,

f_c(i, t_c) = 2 p_c(i, t_c) r_c(i, t_c) / (p_c(i, t_c) + r_c(i, t_c)),

where p_c(i, t_c) is the precision and r_c(i, t_c) is the recall when applying t_c. The term u_c(i, t_c) is the proportion of unpredictable data, i.e., the ratio of d_u(i, t_c) to d_c(i), where d_c(i) is the amount of input data in DNN(i) and d_u(i, t_c) is the amount of unpredictable data when applying t_c.

STPMs

Figure 3 shows the outline of the STPMs. The term b is the current date and q is the day being predicted. In Fig. 3, the mood at (b, q) equals the mood at (b−β, q+β) {β = 1, 2, ..., Q−q}. Hence, the STPMs interpolate the unpredictable data at (b, q) from y_est(b−β, q+β). If a_max(b, q) for the future mood is less than T_a, the STPMs output the unpredictable data; otherwise, the STPMs output the prediction, where a_max(b, q) is the maximum accuracy probability, Q is the set of future days, M(b, q) is the M-DNN that predicts the mood q d after date b, and p_m(M(q), y) is the precision of M(q) for each objective value. The term T_a is determined as the threshold t_a that maximizes the certainty factor E_a(t_a) of the STPMs; the certainty factor increases with the predictive accuracy and decreases with the amount of unpredictable data. The term f_a(t_a) is the f-measure obtained when applying t_a,

f_a(t_a) = 2 p_a(t_a) r_a(t_a) / (p_a(t_a) + r_a(t_a)),

where p_a(t_a) is the precision and r_a(t_a) is the recall when applying t_a. The term u_a(t_a) is the ratio of d_u(t_a) to d_a, where d_a is the amount of input data in the STPMs and d_u(t_a) is the amount of unpredictable data when applying t_a.
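The threshold optimization can be sketched as a grid search over candidate thresholds. Because the exact combination of the f-measure and the unpredictable-data term in the certainty factor is given by the paper's equations, the product form f · (1 − u) below is only one plausible instantiation, and all names are illustrative.

```python
def f_measure(tp, fp, fn):
    """Standard f-measure from true positives, false positives, false negatives."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def optimize_threshold(probs, labels, candidates):
    """Pick the threshold maximizing a certainty factor that rewards the
    f-measure and penalizes unpredictable data (assumed product form)."""
    best_t, best_e = None, -1.0
    n = len(labels)
    for t in candidates:
        tp = fp = fn = unpred = 0
        for row, y in zip(probs, labels):
            c_max = max(row)
            if c_max < t:
                unpred += 1              # below threshold: unpredictable
                continue
            y_hat = row.index(c_max)
            # Binary case: treat class 1 as the "positive" class.
            if y_hat == 1 and y == 1:
                tp += 1
            elif y_hat == 1 and y == 0:
                fp += 1
            elif y_hat == 0 and y == 1:
                fn += 1
        e = f_measure(tp, fp, fn) * (1 - unpred / n)   # assumed certainty factor
        if e > best_e:
            best_t, best_e = t, e
    return best_t, best_e

best_t, best_e = optimize_threshold(
    probs=[[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.45, 0.55]],
    labels=[1, 0, 1, 0],
    candidates=[0.5, 0.7],
)
```

In this toy example the stricter threshold 0.7 yields a perfect f-measure but discards half the data, so the lower threshold wins under the combined criterion, illustrating the trade-off the certainty factor encodes.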

Experimental environment
The subject was a healthy 64-year-old male office worker. The measurement period was between 1 July 2013 and 30 June 2015. Table 1 shows the sample size of each mood. The location was Kanazawa City. The subject's biological information was measured with a body composition monitor (Tanita Company, BC-503), a blood pressure monitor (Tanita Company, BP-301), and a pedometer (Tanita Company, FB-723). The BC-503 measured the subject's weight (WT), body fat (BF), body mass index (BMI), and muscle mass (MM). The BP-301 measured the subject's systolic blood pressure (SBP), diastolic blood pressure (DBP), and pulse (PS). The FB-723 measured the subject's number of walking steps (WS). Weather information was obtained for free from the Japan Meteorological Agency website. We obtained the maximum temperature (MXT), minimum temperature (MNT), precipitation (PN), snowfall (SF), snow depth (SD), sunshine hours (SH), maximum wind speed (MW), relative humidity (RH), mean cloudiness (MC), air pressure (AP), and vapor pressure (VP). The mood (MD) was determined on a scale of one to five using questionnaires. Answers of 1 or 2 on the questionnaire are defined as a negative mood, answers of 3 as neutral, and answers of 4 or 5 as a positive mood. The mood and the biological information were measured once a day between 15:00 and 18:00.
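The questionnaire-to-class mapping described above can be written directly; the function name is hypothetical.

```python
def mood_label(score):
    """Map a 1-5 questionnaire answer to the mood classes used in the paper:
    1-2 -> negative, 3 -> neutral, 4-5 -> positive."""
    if score in (1, 2):
        return "negative"
    if score == 3:
        return "neutral"
    if score in (4, 5):
        return "positive"
    raise ValueError("questionnaire answers must be 1-5")
```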

Experimental methods
We compared M-DNN and STPMs with each of the following classifiers: the deep neural network (DNN), naïve Bayes (NB), (8) random forest (RF), (9) a linear kernel support vector machine (SL), (10) and a radial basis function kernel support vector machine (SR), implemented in the R library for statistical computing. The DNN parameters were the number of units, the number of hidden layers, and the activation function. The number of units was sixteen, the number of hidden layers was two, and the activation function was the hyperbolic tangent function. The parameters of the other classifiers were the defaults in the R library. The explanation variables were today's biological information, today's mood, and the weather information up to q d later. The objective variable was the mood q d later. Positive and negative moods were predicted up to 14 d later. Evaluation was conducted by leave-one-out cross-validation (LOOCV), (11) in which the f-measure was calculated.

The percentage of unpredictable data decreased as the number of prediction layers increased. DNN(1) learns common patterns in the input data from the majority of all input data to maximize predictive accuracy. If input data follow a majority pattern, DNN(1) outputs a high class probability, and the first prediction layer outputs the prediction result. On the other hand, if input data follow a minority pattern, DNN(1) outputs a low class probability, and the first prediction layer outputs the unpredictable data. DNN(2) learns common patterns in the unpredictable data from the majority of all unpredictable data output from DNN(1) to maximize the predictive accuracy. Thus, DNN(i) can focus more on predicting the unpredictable data than DNN(i−1) can. Hence, M-DNN maintains the predictive accuracy and decreases the percentage of unpredictable data as the number of layers increases.

Figure 7 shows the f-measure of each method. If the f-measure of a prediction system decreases, the reliability of the prediction system decreases.
In addition, if a prediction system outputs a large amount of unpredictable data, the practicability of the prediction system decreases. Thus, a prediction system should be evaluated by both the f-measure and the percentage of unpredictable data. From Fig. 7, the difference in the f-measure between M-DNN+STPMs and M-DNN is −0.03. From Fig. 8, the difference in the percentage of unpredictable data between M-DNN+STPMs and M-DNN is 0.27. Thus, the sum of the difference values is 0.24. As a result, M-DNN+STPMs is better than M-DNN. Moreover, the experimental results show that M-DNN+STPMs can predict mood over a long period better than the existing system. (4)
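The LOOCV protocol used for evaluation can be sketched generically. This is a minimal sketch: `fit` and `predict` stand for any classifier interface and are assumptions, not the paper's implementation.

```python
def leave_one_out(fit, predict, xs, ys):
    """Leave-one-out cross-validation: hold out each sample once, train on
    the remaining samples, and predict the held-out sample."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        preds.append(predict(model, xs[i]))
    return preds

# Illustrative classifier: always predict the majority class of the training set.
majority_fit = lambda X, y: max(set(y), key=y.count)
majority_predict = lambda model, x: model
preds = leave_one_out(majority_fit, majority_predict, [1, 2, 3, 4], [1, 1, 1, 0])
```

The per-class precision and recall over `preds` versus the true labels then give the f-measure reported in the experiments.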

Conclusion and Future Work
We proposed a predictive method combining M-DNN and STPMs. Experiments show that M-DNN and STPMs can predict mood up to two weeks in advance with 70% accuracy. The predictive accuracy of M-DNN+STPMs is 11% higher than that of DNN. Hence, M-DNN+STPMs is effective for personalized mood prediction. However, the number of subjects in this study was small; thus, we will apply M-DNN+STPMs to many subjects and verify its efficacy. In addition, the mood prediction system needs to function without biological information to predict mood over a long period. The prediction system used weather information but did not use weather forecasts. In future work, we will predict future moods using weather forecasts. Moreover, we would like to improve the predictive accuracy.