Colorimetric Characterization of Color Image Sensors Based on Convolutional Neural Network Modeling

The colorimetric characterization of a color image sensor was developed and modeled using a convolutional neural network (CNN), an innovative approach. Color image sensors can be incorporated into compact devices to detect the color of objects under a wide range of light sources and brightness levels, and they must be colorimetrically characterized to be suitable for smart industrial colorimeters or light detection. Color image sensors can also be incorporated into machine vision systems for various industrial applications. However, the red, green, and blue (RGB) signals generated by a color image sensor are device-dependent: different image sensors produce different RGB spectral responses under the same conditions. Moreover, the signals are not colorimetric; that is, the output RGB signals do not correspond directly to device-independent tristimulus values such as CIE XYZ or CIELAB. In this study, the colorimetric mapping between RGB signals and CIELAB tristimulus values by CNN modeling was proposed. After digitizing the response of an RGB image sensor, characterizing the RGB colors in the CIE color space, and training the CNN model for high accuracy, the colorimetric characterization of color image sensors based on CNN modeling was shown to be superior to that based on 3 × N polynomial regression, with ΔE*ab of less than 0.5.


Introduction
Digital color cameras are widely used for capturing color images. The red, green, and blue (RGB) spectral responses of CMOS image sensors (1) are neither colorimetric nor linear transforms of the device-independent tristimulus values defined by the CIE color-matching functions. (2) Color-matching transform functions, which map RGB signals to CIE XYZ or CIELAB, determine the precision of color reproduction. The process of deriving this transform is known as image sensor colorimetric characterization.
The International Organization for Standardization (ISO) has developed a standard (ISO 17321-4:2016) (3) for digital camera color characterization, which is mainly used by camera manufacturers and testing laboratories. The ISO procedure requires sophisticated, expensive equipment and unrendered camera data. Therefore, the ISO 17321 standard is difficult to apply to the color characterization and calibration of generic image-sensor-based devices. In this paper, a practical color-target-based method is proposed. The simplified method uses affordable IT8.7/4 (ISO 12642-2) color calibration targets, which are captured by the image sensor and measured with a spectrophotometer, to obtain RGB values and their corresponding XYZ values. Among color-matching transform methods, polynomial regression, (4,5) 3D lookup tables with interpolation and extrapolation, (6) and various neural networks, (7,8) particularly convolutional neural networks (CNNs), (9) have inspired considerable academic and industrial research. (10,11) However, no published research has used CNNs for the colorimetric characterization of image sensors. (12) Deep learning with CNNs promises an intelligent solution for image sensor colorimetric characterization. A CNN is a deep machine learning algorithm that automatically learns features, in this case color characterization features, instead of relying on features extracted by experts through polynomial regression. The CNN is a bio-inspired trainable architecture that can learn invariant features from an input data set such as the IT8.7/4 color samples. It also offers more accessible and smarter deep learning behavior than manual trial-and-error testing by end users, and color science specialists can deploy machine learning to adjust input vectors and optimize color-matching transform functions efficiently.
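As a point of reference for the polynomial baseline, a 3 × N regression of this kind can be sketched as a least-squares fit from an expanded RGB basis to measured tristimulus values. The 11-term basis below is a common choice in the literature but is an assumption here, not necessarily the exact term set used in this study:

```python
import numpy as np

def poly_expand(rgb):
    """Expand one RGB triplet into an 11-term polynomial basis.

    Assumed terms (a common 3 x 11 choice): R, G, B, RG, RB, GB,
    R^2, G^2, B^2, RGB, 1.
    """
    r, g, b = rgb
    return np.array([r, g, b, r*g, r*b, g*b, r*r, g*g, b*b, r*g*b, 1.0])

def fit_3xN(rgb_samples, target_samples):
    """Least-squares fit of a 3 x 11 matrix M so that
    M @ poly_expand(rgb) approximates the measured target (e.g., XYZ)."""
    A = np.array([poly_expand(p) for p in rgb_samples])        # (n, 11)
    M, *_ = np.linalg.lstsq(A, np.array(target_samples), rcond=None)
    return M.T                                                 # (3, 11)

def apply_3xN(M, rgb):
    """Map one RGB triplet through the fitted 3 x 11 transform."""
    return M @ poly_expand(rgb)
```

With the 1617 IT8.7/4 samples as `rgb_samples` and spectrophotometer measurements as `target_samples`, this yields the fixed transform that the CNN approach is compared against.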
In this study, experiments were performed to investigate the following: (1) the number of IT8.7/4 color samples needed in the CNN training data to build a model with high accuracy, (2) the characterization performance of CNNs, (3) the characterization accuracy when using ΔE*ab as the CNN loss function, and (4) a comparison between CNN modeling and 3 × 11 polynomial regression.

Equipment and Materials
This study employed the following equipment and materials: 1. Printer and color reference targets: an EPSON Stylus Pro 9900 printer for the characterization of 4-color process printing, with 1617 color reference samples (Fig. 1).

CNN Characterization
Deep learning algorithms have equaled or surpassed human perception in some respects, (14,15) enabling their deployment in real-world applications. In this study, a CNN algorithm was employed to achieve near-human perception in colorimetric characterization. The CNNs used were trainable architectures composed of five stages (Fig. 3). Batch normalization with a minibatch of 1300 color samples was used to accelerate training, and the test error rate was associated with the expected validation error rate after 100,000 training epochs. The 3 × 3 convolution masks were initialized using a stochastic normal distribution; for example, C3 is a 3 × 3 ReLU convolution layer, as shown by the 3 × 3 mask in Fig. 4.
The loss function was defined in the CIELAB color space. An X-Rite i1 iO spectrodensitometer measured (L1*, a1*, b1*) and the CNN inferred (L2*, a2*, b2*); these two colors were compared in the CIELAB color space, which expresses colors as three numerical values: L* for lightness, and a* and b* for the green–red and blue–yellow color components. CIELAB was designed to be perceptually uniform with respect to human color vision, meaning that the same numerical change in these values corresponds to approximately the same visually perceived difference. In related work, the loss function could be based on other ΔE formulas, such as CMC and CIE94, (16,17) for various industrial applications. The loss function was minimized using adaptive moment estimation (Adam), the most widely used optimizer in CNN-related studies.
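The ΔE*ab loss described above (CIE76) is simply the Euclidean distance between the measured and inferred CIELAB triplets; a minimal sketch, with a batched mean version of the kind a training loop would minimize:

```python
import numpy as np

def delta_e_ab(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two CIELAB triplets."""
    return float(np.linalg.norm(np.asarray(lab1, float) - np.asarray(lab2, float)))

def mean_delta_e(pred_lab, target_lab):
    """Mean CIE76 difference over a batch of (n, 3) CIELAB arrays,
    usable as a training loss for the characterization model."""
    diff = np.asarray(pred_lab, float) - np.asarray(target_lab, float)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))
```

Swapping in CMC or CIE94, as the text suggests, would only change the per-pair distance function; the batched-mean structure stays the same.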
There were four colors for which ΔE*ab exceeded 6, but the color differences were barely recognizable by the human eye (Fig. 9). A standard CIELAB digital color image (Fig. 10) was applied to visualize and evaluate the color characterization. The CNN-characterized CIELAB image displayed in Fig. 11 and the 3 × 11-polynomial-regression-characterized CIELAB image presented in Fig. 12 were compared with the original CIELAB image.
In summary, this paper has particular implications for the development of CNN algorithms, which may offer a smarter, more straightforward, and easier-to-use color reproduction system than 3 × N polynomial regression. As a new color-space-mapping technology, the CNN transforms a color-image-sensor-based device (i.e., a camera) into a spectrometer-like device that is reliable, capable, and responsive, and can thus serve as a foundation for future applications.
However, some problems remain to be solved. Training the CNN model takes more than 100 h per 1000 color patches on a PC with an Nvidia GTX 1080 Ti GPU. Validation of the CNN model takes 68 ms for 8 × 8 pixels, making the method difficult to use in some real-time (<20 ms) industrial applications. Neural networks typically take longer to run as the number of features, the number of hidden layers, and the number of columns in the dataset or the image resolution increase. In this study, the CNN model was run on a GPU, which primarily exploits parallel programming capabilities for mathematical operations. Training a model is required only once. During validation or testing, the CNN model takes 68 ms for 8 × 8 pixels, whereas 3 × 11 polynomial regression needs less than 1 ms; polynomial regression is thus slightly faster than the CNN model, but the accuracy of CNN modeling is much higher. The processing time may not satisfy some industrial applications, but we believe that the processing power of computer hardware will progress rapidly, as Moore's law indicated that "the power of computers would double every 12 months, while the cost of that technology would fall by 50% over the same time". Moreover, IT8.7/4 provides only 1617 color reference samples for color patch training, which may lead to a CNN model that is slightly overfit, although this does not markedly affect CNN performance. In future work, we propose to capture more color patch training data and use data augmentation to reduce overfitting; we will also use the dropout mechanism and batch normalization in the CNN programs to help prevent overfitting. To speed up CNN computing, optimized CNN architectures, distributed CNN computing, and FPGA-based CNN accelerators will be used to meet the requirements of real-world applications.
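The proposed data augmentation could, for example, jitter the captured RGB sensor readings with small noise while keeping the measured CIELAB references fixed, multiplying the 1617 IT8.7/4 samples. A hypothetical sketch (the Gaussian noise level `noise_sd` is an assumption, not a measured sensor characteristic):

```python
import numpy as np

def augment_patches(rgb, lab, copies=4, noise_sd=0.5, seed=0):
    """Augment color-patch training data: each RGB reading gets `copies`
    noisy duplicates (simulated sensor noise), while the measured CIELAB
    reference for each patch is kept unchanged."""
    rng = np.random.default_rng(seed)
    rgb = np.asarray(rgb, float)
    lab = np.asarray(lab, float)
    aug_rgb, aug_lab = [rgb], [lab]
    for _ in range(copies):
        noisy = rgb + rng.normal(0.0, noise_sd, rgb.shape)
        aug_rgb.append(np.clip(noisy, 0.0, 255.0))  # stay in 8-bit range
        aug_lab.append(lab)
    return np.concatenate(aug_rgb), np.concatenate(aug_lab)
```

With `copies=4`, the 1617 reference samples would become 8085 training pairs; dropout and batch normalization would then act as complementary regularizers inside the network itself.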

Conclusions
In this study, modeling for the colorimetric characterization of a color image sensor was developed and validated using a CNN. The CNN represents a new method of extracting color features automatically. High characterization accuracy was demonstrated for sequential characterization of 8 × 8 pixel color images without requiring any greyscale balancing or image preprocessing. A satisfactory mean color difference ΔE*ab as small as 0.48 was obtained.