NIR Spectroscopic Determination of Polyphenol Content in Teas and Tea Extract at 2142 nm

various teas and tea extracts based on regression coefficient values, selectivity ratio (SR) values of the best PLS model [correlation coefficient (R = 0.98, n = 69)], correlation coefficients (a correlation spectrum) between the reference TPP content and Savitzky–Golay smoothing with the second derivatives (SG2D) with standard normal variate (SNV) preprocessing of raw spectra, and the peaks of the raw spectra and their SG2D with SNV. The R value between SG2D with SNV at 2142 nm and reference TPP content was 0.96 (n = 69).


Introduction
Spectroscopy, the measurement of light intensity absorbed by a sample as a function of wavelength, is an analytical tool for identifying a wide array of compounds in agricultural products. In particular, near infrared (NIR, 800-2500 nm) spectroscopy is a new technique that exploits the region neighboring 2500 to 25000 nm (4000 to 400 cm−1) used in IR spectroscopy, and can provide rapid, accurate, and non-destructive analyses of agricultural products on site or online without wet chemical analysis. NIR spectroscopy, also named overtone vibrational spectroscopy, is especially suitable for rapidly measuring total amounts of water, protein, lipid, and so on, in food. (1) However, it is very difficult to determine their total amounts using only one NIR wavelength as an independent variable, because single regression analyses, in which only one independent variable is applied, have low accuracy. Therefore, multivariate analyses, in which many independent variables are applied, have been used for NIR data to obtain high accuracy for their determination.
This version has been created for advance publication by formatting the accepted manuscript. Some editorial changes may be made to this version.
Concrete examples are given below. The health benefits of catechins or polyphenols in teas have been reported in numerous publications. (2)(3)(4)(5) Catechins are the main components of the total polyphenol (TPP) in teas. (6) Schulz et al. reported the determination of TPP in green teas using NIR spectroscopy. (7) Other studies have also shown the use of NIR spectroscopy for determining TPP or catechins in green (color) teas. (7)(8)(9)(10)(11)(12)(13)(14) Ren et al. and Panigrahi et al. reported the use of NIR spectroscopy for determining TPP in black teas. (15,16) Schulz et al., Chen et al., Sinija and Mishra, and Wang et al. used wavelength ranges of 1108 to 2490, 909 to 2632, 800 to 2500, and 1000 to 2500 nm, respectively, for TPP determination in green teas. (7)(8)(9)(12) Bian et al. identified 1131, 1654, 1666, 1738, and 1752 nm as bands related to the absorption of TPP for leaf powders as well as fresh leaves. (10) Moreover, Bian et al. reported that 1648 nm can be linked to the absorption feature of phenolic acid for tea powder and leaves. (11) Schulz et al., Luypaert et al., and Lee et al. used wavelength ranges of 1108 to 2490, 1100 to 2500, and 1100 to 2500 nm, respectively, for catechin determination in green teas. (7,13,14) Ren et al. and Panigrahi et al. used wavelength ranges of 800 to 2500 and 400 to 2447 nm, respectively, for TPP determination in black teas. (15,16)
Thus, many wavelengths can already be used for TPP or catechin determination in green or black teas. However, because these studies do not identify a small number of wavelengths that can serve as independent variables for TPP determination in teas, the NIR methods have low potential for practical use; a few independent variables would have to be specified with pinpoint accuracy. Although Panigrahi et al. achieved high accuracy (R 2 = 0.96) by using 400 to 2447 nm for calibrating TPP in black teas, their accuracy was low (R 2 = 0.79) for TPP validation. (16) Moreover, Schulz et al. reported a lower correlation (R 2 = 0.67) for NIR spectroscopic measurements (1108 to 2490 nm) of TPP in dried green tea leaves. (7) On the other hand, the soluble solids of tomatoes and strawberries, which have high water content, can be non-destructively determined using only three wavelengths in the short NIR range. (17) If samples are dry, the accuracy could be higher because no large peaks associated with water absorption will appear in their NIR spectra.
As mentioned above, black and green teas have been examined separately, and different models have been developed with many independent variables. Also, the visible (Vis) region has been evaluated for the rapid analysis of green and black teas, (10,16) which may be effective for determining TPP in teas because the Vis region is sensitive to TPP, as shown by Pan et al. with a black tea infusion. (18) Moreover, tea extract with a high TPP content may help to provide better results for the determination of TPP in a new transmittance-reflectance mode using visible and NIR (Vis-NIR) spectroscopy.
Regarding multivariate analysis, the partial least squares (PLS) algorithm has been used with many independent variables without specifying a few wavelengths for the determination. (7)(8)(9)(10)(11)(12)(13)(14)(15)(16) PLS regression analyses can be performed in combination with different wavelength ranges and spectral pretreatments to specify an NIR range for TPP determination. (19) Also, the combination of new PLS analyses and the correlation coefficients between reference TPP values and Vis-NIR spectra (a correlation spectrum) may specify a wavelength for TPP determination. Therefore, we measured various teas and tea extracts in a new transmittance-reflectance mode using a Vis-NIR spectrophotometer and analyzed the data using the new PLS analyses and correlation spectrum (Fig. 1). As a result, we were unexpectedly able to specify a new key wavelength (2142 nm) for TPP determination.

Samples
A total of 69 samples were used, including black teas from the Tea Research Institute of Sri Lanka (TRISL), commercial black, green, oolong, Kamairi, Pu'er, Houji, and Sunrouge teas, tea extracts, and a tea extract containing dextrin. The samples are given in Table 1.

Sample preparation
About 10 g of each sample was ground using a tea grinder (National MX-X57; Panasonic Corporation, Osaka, Japan) for 30 s and passed through a 35-mesh sieve (pore size of 500 µm) using a mechanical sieve shaker (MVS-1; As One, Osaka, Japan). The tea extracts and tea extract containing dextrin, which have fine particles, were used after sieving without grinding to measure the Vis-NIR spectrum and determine the reference TPP content.

Measurement of Vis-NIR spectra
The temperature in the laboratory was maintained near 24 °C. Vis-NIR spectra (400-2498 nm) were acquired for 0.3 g of each sample and a catechin mixture reagent (green tea; Wako Pure Chemical Industries Ltd., Osaka, Japan) in transmittance-reflectance mode with a sample cell, in which a gold reflection plate was set with a 0.4 mm total path length, using a Foss XDS Rapid Content Analyzer (Foss Analytical Solutions, Hillerod, Denmark). For each sample, 32 scans were averaged at 2 nm intervals. WinISI and ISIscan software were used (Foss Analytical AB, Hoganas, Sweden).

Reference TPP content
Reference values of TPP content were determined using the colorimetric ferrous tartrate method described by Iwasa and Torii. (20) About 50 mg of each sample was extracted in 40 mL of distilled water for 30 min using a water bath set at 80 ± 2 °C (Elma 30 H; Elma Hans Schmidbauer GmbH & Co, Singen, Germany). Then, the extract was mixed well for 5 min. The extract was cooled at room temperature and filtered through a disposable filter with a 0.45 µm pore size (Advantec Dismic 25 hp; Toyo Roshi Kaisha Ltd., Tokyo, Japan). Gallic acid (gallic acid monohydrate; Wako) was used to prepare a series of standard solutions (10, 20, 30, 40, and 50 µg mL−1). A dye solution was prepared by dissolving 100 mg of ferrous sulphate heptahydrate (Wako) and 500 mg of potassium sodium tartrate tetrahydrate (special grade; Kanto Chemicals Co. Inc., Tokyo, Japan) in 100 mL of distilled water. Phosphate buffer was prepared with a concentration of 66 mM at pH 7.5 by mixing disodium hydrogen phosphate (special grade; Kanto Chemicals) and potassium dihydrogen phosphate (special grade; Kokusan Kagaku Chemicals Co., Ltd.). Filtered tea extract (5 mL) was mixed with dye solution (5 mL) and phosphate buffer (15 mL). Sample absorbance was measured at 540 nm using a UV-NIR spectrophotometer (Solid Lambda CCD; Spectra Co-op, Tokyo, Japan), and the absorbance of each sample blank, in which the dye solution was replaced with distilled water, was subtracted. TPP content is expressed as w/w%.
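The arithmetic behind the reference value can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of Iwasa and Torii: it assumes a linear gallic acid calibration, assumes that the standard concentrations refer to the measured mixture (so that the 5 mL of extract in a 25 mL mixture gives a dilution factor of 5), and uses hypothetical absorbance readings. The function names fit_line and tpp_percent are introduced here for illustration only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tpp_percent(abs_sample, abs_blank, slope, intercept,
                dilution_factor=5.0, extract_ml=40.0, sample_mg=50.0):
    """Convert a blank-corrected absorbance at 540 nm to TPP in w/w%.

    dilution_factor=5.0 is an assumption (5 mL extract in 25 mL of mixture);
    extract_ml and sample_mg follow the amounts given in the text.
    """
    net_abs = abs_sample - abs_blank                  # subtract the sample blank
    conc_measured = (net_abs - intercept) / slope     # ug/mL in the measured mixture
    conc_extract = conc_measured * dilution_factor    # ug/mL in the filtered extract
    tpp_mg = conc_extract * extract_ml / 1000.0       # mg of gallic acid equivalents
    return 100.0 * tpp_mg / sample_mg


# Hypothetical absorbances for the 10-50 ug/mL gallic acid standards.
standards_ug_ml = [10.0, 20.0, 30.0, 40.0, 50.0]
standards_abs = [0.12, 0.22, 0.32, 0.42, 0.52]
slope, intercept = fit_line(standards_ug_ml, standards_abs)

# Hypothetical sample: absorbance 0.30 with a sample blank of 0.03.
example_tpp = tpp_percent(0.30, 0.03, slope, intercept)
print(f"TPP = {example_tpp:.1f} w/w%")
```

If the standards were instead prepared at the concentration of the undiluted extract, dilution_factor would be set to 1.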

Data analysis of reference TPP content
For the 69 samples of teas and tea extract, descriptive statistical parameters (mean, standard error, standard deviation, and coefficient of variation) of their reference TPP content were calculated.
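The four descriptive parameters can be computed as in the following minimal sketch. The list of TPP values is hypothetical, and the sample (n − 1) standard deviation is an assumption, since the text does not state which form was used.

```python
import math
import statistics


def describe(values):
    """Mean, standard deviation, standard error, and coefficient of variation."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD with n - 1 (assumed)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "se": sd / math.sqrt(n),           # standard error of the mean
        "cv_percent": 100.0 * sd / mean,   # coefficient of variation in %
    }


# Hypothetical reference TPP values in w/w%.
tpp_values = [3.5, 8.0, 10.2, 12.4, 32.1]
stats = describe(tpp_values)
print(stats)
```

As a check on the definition of the coefficient of variation, a standard deviation of 5.0% and a mean of 10.2% give 100 × 5.0 / 10.2 ≈ 49.0%.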

Preprocessing spectra and multivariate analysis
Spectral data were acquired in the range of 400-2498 nm at 2 nm intervals. To reduce the systematic noise in the spectra and enhance the contribution of the chemical components, different spectral preprocessing methods were applied, including the standard normal variate (SNV), (21) autoscaling (AS), first and second derivatives (1D, 2D), and Savitzky-Golay smoothing (SG) with a window size of 11 points and a polynomial of order 2. (22) Mean centering was always applied to each model. The following wavelength ranges were used for the analyses: 400-800, 400-1100, 400-1700, 400-2200, 400-2498, 800-1100, 800-2498, 900-1700, 1100-1700, 1100-2498, 1700-2200, 1700-2498, 1850-2350, 2000-2450, 2000-2498, and 2100-2450 nm. Spectral models were developed to predict TPP content using PLS regression analyses, and the number of latent variables required for each PLS regression model was determined by the leave-one-out cross-validation approach. Thus, PLS regression analyses were performed in combination with each wavelength range and different spectral pretreatments using the 'plsropt' package in R. (19) The best PLS model was selected using the correlation coefficient (R), with the ratio of performance to deviation (RPD) used for validation, where RPD is the standard deviation of the reference TPP content divided by the standard error of cross-validation (SECV).
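The preprocessing and the RPD statistic described above can be illustrated with a short standard-library sketch. It is not the 'plsropt' implementation. The order of operations (SNV first, then the SG second derivative), the bias-corrected form of SECV, and the function names snv, sg_second_derivative, and rpd are assumptions made for this example.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def sg_second_derivative(spectrum, window=11, step_nm=2.0):
    """Savitzky-Golay second derivative from a local quadratic fit.

    window must be odd; points without a full window are returned as None.
    """
    m = window // 2
    offsets = range(-m, m + 1)
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = window * s4 - s2 ** 2
    # Each weight contributes to 2 * a2, the second derivative of the fitted quadratic.
    weights = [2.0 * (window * i ** 2 - s2) / denom for i in offsets]
    out = [None] * len(spectrum)
    for k in range(m, len(spectrum) - m):
        acc = sum(w * spectrum[k + i] for w, i in zip(weights, offsets))
        out[k] = acc / step_nm ** 2        # convert from per-point to per-nm units
    return out


def rpd(reference, cv_predicted):
    """RPD = SD of the reference values / SECV (bias-corrected form assumed)."""
    n = len(reference)
    mean_ref = sum(reference) / n
    sd_ref = math.sqrt(sum((y - mean_ref) ** 2 for y in reference) / (n - 1))
    residuals = [p - y for p, y in zip(cv_predicted, reference)]
    bias = sum(residuals) / n
    secv = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    return sd_ref / secv


# Synthetic check: a pure quadratic in wavelength has a constant second derivative of 2.
x = [2.0 * k for k in range(21)]           # 2 nm sampling interval
quadratic = [xi ** 2 for xi in x]
d2 = sg_second_derivative(quadratic)
scaled = snv(quadratic)
example_rpd = rpd([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 1.5, 3.5, 3.5, 5.5])
```

For an 11-point window, the weights generated here reduce to (i² − 10)/429 for offsets i = −5 to 5, which agrees with the tabulated Savitzky-Golay coefficients for a quadratic second derivative.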

Identification of optimal wavelength for predicting TPP
An optimal wavelength for determining TPP was identified using the regression coefficients, the selectivity ratio (SR) values of the best PLS model, the R values between the reference TPP content and SG2D with SNV preprocessing of raw spectra (the correlation spectrum), and the peaks of the raw spectra and SG2D with SNV. SR is the ratio of the explained variance of each variable to the residual variance.

Table 2 gives the descriptive statistics of the reference TPP content in the sample set (n = 69). Reference TPP content ranged from 3.5 to 32.1% with a mean value of 10.2%, standard deviation of 5.0%, and coefficient of variation of 49.0%. Figure 2 shows averaged raw spectra of black, green, oolong, Pu'er, Kamairi, Houji, and Sunrouge teas, tea extract, tea extract containing dextrin, and catechin reagent. Black tea showed higher absorption than green tea. Oolong tea showed absorption between those of green and black teas but higher absorption at long wavelengths. Clear absorption peaks are observed at 476, 672, 1460, 1930, and 2140 nm. The peak at 2140 nm was broad, spanning 2080-2210 nm.

PLS modeling in combination with wavelength ranges and preprocessing spectra
The models used to determine TPP content via Vis-NIR spectroscopy were based on PLS regression analyses. Different combinations of wavelength ranges and spectral preprocessing methods were evaluated (Table 3). Figure 5 shows the relationship between the reference and NIR-calculated TPP values for the best PLS model. The regression coefficients of the best PLS model are shown in Fig. 6(a). The lowest regression coefficient value (−444.6) was observed at 2144 nm, corresponding to SG2D with SNV. Figure 6(b) shows the SR values of the best PLS model. The highest SR value of 11.1 was observed at 2144 nm. Figure 7 shows a correlation spectrum for the reference TPP content and SG2D with SNV preprocessing of raw spectra. The lowest negative R value (−0.96, n = 69) was observed at 2142 nm. Figure 8 shows plots of reference TPP vs TPP calibrated at 2142 nm for SG2D with SNV.
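A correlation spectrum of this kind can be sketched as follows: at each wavelength, the Pearson correlation coefficient is computed between the reference TPP values and the preprocessed spectral values across samples, and the wavelength with the most negative R is then located. The data below are a hypothetical toy set of four samples and three wavelengths, and the function names are introduced for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)      # undefined if either input is constant


def correlation_spectrum(spectra, reference):
    """R between the reference values and the spectral value at each wavelength.

    spectra is a list of samples; each sample is a list over wavelengths.
    """
    n_wavelengths = len(spectra[0])
    return [pearson_r([s[j] for s in spectra], reference)
            for j in range(n_wavelengths)]


def most_negative(wavelengths, r_values):
    """Return the wavelength with the lowest (most negative) R and that R."""
    j = min(range(len(r_values)), key=lambda k: r_values[k])
    return wavelengths[j], r_values[j]


# Hypothetical preprocessed (second-derivative) values and reference TPP in w/w%.
wavelengths_nm = [2140, 2142, 2144]
spectra = [
    [0.3, -1.0, 0.1],
    [0.1, -2.0, 0.2],
    [0.4, -3.0, 0.3],
    [0.2, -4.0, 0.4],
]
reference_tpp = [5.0, 10.0, 15.0, 20.0]

r_spectrum = correlation_spectrum(spectra, reference_tpp)
best_wavelength, best_r = most_negative(wavelengths_nm, r_spectrum)
```

A negative R is expected for a second-derivative spectrum, because an absorption peak in the raw spectrum appears as a minimum after second differentiation.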

Reference TPP content of samples
The Folin-Ciocalteu and ferrous tartrate methods have been used to determine reference TPP content in tea. (7)(8)(9)(10)(11)(12)15,16,18) However, a reducing substance, L-ascorbic acid, reacts with the dye solution when TPP content is quantified by the Folin-Ciocalteu method. (23,24) Bian et al. used ferrous tartrate colorimetry to determine the reference TPP concentration, and the Vis-NIR results showed a correlation of R 2 = 0.97 between the reference and calculated TPP concentrations (n = 56, 146.62 to 294.98 mg g −1 ) in young green tea powder. (10) Schulz et al. reported a lower correlation (R 2 = 0.67) for NIR spectroscopic measurements of TPP (60.8 to 199.8 mg g −1 ) in dried green tea leaves, which might be due to the lack of specificity in the colorimetric reference method. They used the Folin-Ciocalteu method. (7) Therefore, we used the ferrous tartrate method.
Studies conducted using black teas have reported lower TPP ranges than those using green teas. Ren et al. and Panigrahi et al. reported TPP ranges of 4.2-20.5 and 7.6-17.2%, respectively, for black tea samples. (15,16) By contrast, higher TPP ranges of 19.2-30.2 and 22.0-30.7% were reported for green tea samples by Chen et al. and Wang et al., respectively. (8,12) By including black, green, and other teas in a single sample set, we were able to obtain a wide range of reference TPP contents for tea samples. In addition, the tea extract samples covered high concentrations of TPP (Tables 1 and 2).

Vis-NIR spectra
The tea extract and tea extract containing dextrin had relatively sharp absorption peaks at 2140 nm; both extracts, as well as the catechin mixture reagent, showed high reference TPP concentrations (Fig. 2). Thus, SG2D with SNV of the raw spectra may reduce the systematic noise in the spectra and enhance the contribution of the chemical components with lower 2D values. The relative order of their values (absorption) at 1660 and 2142 nm is Houji < Kamairi < Sunrouge < tea extract containing dextrin < tea extract, corresponding to the reference TPP concentrations [ Figs. 3(b) and 4(b)].

Optimal wavelength for predicting TPP content
We carried out this study on the basis of the new ideas mentioned above. We have developed numerous determination methods using NIR spectroscopy. As described in the Introduction, calibration with various types of samples and a wide range of reference values leads to more universal and accurate results, (17,25) and may also reveal the limits of the method. In addition, we measured Vis-NIR spectra not in the reflectance (7,8,10,11,14) or diffuse reflectance mode (9,12,13,15,16) but in the transmittance-reflectance mode. Also, the older PLS analyses were simple in that they did not involve SR or exhaustive analyses in combination with wavelength regions and preprocessing methods, so it was difficult to specify a few wavelengths. (7)(8)(9)(10)(11)(12)(13)(14)(15)(16) In this study, the NIR wavelength range of 2000-2498 nm, which has high sensitivity to TPP content, was selected for the best PLS model for predicting TPP content in various teas and tea extract (Table 3). Then, the raw spectra were preprocessed using SG2D with SNV. As described in the results, the regression coefficients and SR values of the best PLS model [Figs. 6(a) and 6(b)] suggest that the intensity around 2144 nm is almost proportional to TPP content, corresponding to lower 2D values. Therefore, the correlation coefficients between the reference TPP content and SG2D with SNV were investigated. SG2D with SNV gives a negative correlation coefficient when the systematic noise of the spectra is reduced and the contribution of the chemical component is enhanced. Then, the lowest negative R value (−0.96, n = 69) was observed at 2142 nm in the wavelength range of 400 to 2498 nm (Fig. 7).
As mentioned above, an optimal wavelength of 2142 nm was identified for predicting TPP content in various teas and tea extract based on the regression coefficients, the SR values of the best PLS model, the peaks of the raw spectra and their SG2D with SNV, and the R values between the reference TPP content and SG2D with SNV of the raw spectra (Table 4).
Bian et al. reported that the intensities at 1131, 1654, 1666, 1738, and 1752 nm are related to TPP absorption. (10) Moreover, they reported that the wavelength of 1648 nm can be linked to the absorption feature of phenolic acid for tea powder and leaves. (11) In Fig. 7, our results also showed a high correlation (−0.94) at 1134 nm and a lower correlation (−0.76) at 1664 nm between reference TPP and SG2D with SNV. Thus, a wavelength of around 1134 nm could also be used for TPP determination. However, in the NIR region, the sensitivity to TPP content increased toward longer wavelengths, as shown in Fig. 2, and we can observe only small absorbance at 1134 nm not only in the raw spectra (Fig. 2) but also in SG2D with SNV [Figs. 3(b) and 4(b)]. By contrast, the wavelength of 2142 nm selected in the present study shows the highest R and has not been reported in previous studies. In addition, we show in Fig. 2(b) that the strongest absorption of the catechin mixture reagent (powder) is around 2142 nm in the NIR region. The powder of a standard reagent can be measured to visualize the absorption band; sucrose, glucose, and fructose have been measured for the non-destructive determination of soluble solids in some fruits (melons, tomatoes, compact watermelons, strawberries) using NIR spectroscopy. (17,26) The absorption band of the three sugars can be measured from 900 to 925 nm in the second derivative spectra, (26) and the combination of wavelengths around 902 nm and several other wavelengths can be used for the practical and non-destructive determination of soluble solids in fruits. (17) The main components of the soluble solids are these sugars. The absorption between 880 and 915 nm is generally assigned to the third overtone of the CH stretching vibration mode. (27)
Combination bands appear in the NIR region of 1900 to 2500 nm. (27) According to a quantum chemical calculation method, methanol in CCl4 has a combination of the OH stretching mode with the CO stretching mode at around 2151 nm (4650 cm−1), although the peak is weak. (28) This may support assigning the band at around 2142 nm to a combination of OH stretching and CO stretching, because both vibration modes can originate from alcohols or phenols and have strong absorption. The wavenumber range of 3420-3250 cm−1 is assigned to OH stretching, having broad bands due to hydrogen bonds in alcohols and phenols (solids and liquids), and the wavenumber range of 1340-1160 cm−1 is assigned to CO stretching, having broad bands in phenols as shown in Fig. 2. On the other hand, according to the absorption of phenol in CCl4, the wavelengths around 1134 nm correspond to the second overtone of the aromatic CH stretching vibration mode, although the intensity is weak as shown in Figs. 2, 3(b), and 4(b). (27)
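The unit conversions in this argument can be checked with a few lines of arithmetic. The sketch below converts between wavelength (nm) and wavenumber (cm−1) and estimates where an OH stretching plus CO stretching combination band would fall if the two fundamentals were simply added. That simple sum ignores anharmonicity and is an assumption made for illustration; it is not the quantum chemical calculation cited above.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert wavelength in nm to wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm1):
    """Convert wavenumber in cm^-1 to wavelength in nm."""
    return 1.0e7 / wavenumber_cm1


def combination_band_nm(fundamentals_cm1):
    """First-order estimate of a combination band: sum of the fundamentals.

    Anharmonicity is ignored, so this is only a rough position.
    """
    return wavenumber_to_nm(sum(fundamentals_cm1))


# Fundamental ranges quoted in the text (cm^-1).
oh_stretch = (3420.0, 3250.0)
co_stretch = (1340.0, 1160.0)

shortest_nm = combination_band_nm([oh_stretch[0], co_stretch[0]])  # 4760 cm^-1, roughly 2101 nm
longest_nm = combination_band_nm([oh_stretch[1], co_stretch[1]])   # 4410 cm^-1, roughly 2268 nm

print(f"OH + CO combination estimate: {shortest_nm:.0f} to {longest_nm:.0f} nm")
print(f"2142 nm corresponds to {nm_to_wavenumber(2142.0):.0f} cm^-1")
```

Under this rough estimate, 2142 nm lies inside the interval spanned by the summed fundamentals, which is consistent with the proposed assignment but does not prove it.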

Conclusions
As a reference method for determining TPP content, the colorimetric ferrous tartrate method was applied and the absorbance of each sample blank was subtracted. Vis-NIR spectra of various teas and tea extract were measured in a new transmittance-reflectance mode. Also, the spectrum of a catechin mixture reagent was measured. PLS regression analyses were performed in combination with different wavelength ranges and spectral pretreatments. The best PLS model (2000-2498 nm) for predicting TPP content in dry teas and tea extract covered a wide range of TPP concentrations with high accuracy (R = 0.98). Moreover, absorption at 2142 nm was identified as the optimal independent variable based on regression coefficients, SRs of the best model, Vis-NIR spectra, and a correlation spectrum. This wavelength can be recommended for the rapid and accurate quantification of TPP content in teas and tea extract with high sensitivity in the NIR region.