Integrated Image Sensor and Deep Learning Network for Fabric Pilling Classification

Fabrics are tested for abrasion resistance before leaving the factory and are graded by manual visual inspection to ensure that there are no defects. However, manual visual classification consumes considerable human resources, and prolonged visual inspection often results in occupational injuries. As a result, the overall efficiency is reduced. To avoid these problems, we devised an image preprocessing technology and deep learning network for classifying the pilling level of knitted fabrics. In the first step, fabric images are collected using an image optical sensor. The fast Fourier transform (FFT) and a Gaussian filter are used for image preprocessing to strengthen the pilling characteristics in the fabric images. In the second step, the pilling features are automatically extracted and the pilling level is identified using a deep learning network. The experimental results show that the average accuracy of the proposed method for pilling level classification is 100%. The average accuracy of the proposed method is 0.3 and 2.7 percentage points higher than those of deep-principal-component-analysis-based neural networks (DPCANN) and the type-2 fuzzy cerebellar model articulation controller (T2FCMAC), respectively, demonstrating the superiority of the proposed model.


Introduction
Consumers are very particular about the quality of fabrics, and defects in a fabric may result in the fabric being returned, causing heavy losses to the producer. Among these defects, fabric pilling is a common phenomenon in many fabrics and clothes. It refers to the appearance of bundles or balls of tangled fibers on the fabric surface. Fabric pilling not only degrades the aesthetic quality of the fabric but also reduces its reliability and durability. To ensure sufficient quality, fabrics must pass the Société Générale de Surveillance SA (SGS) International Standard abrasion resistance test before shipment. The fabric level is judged by manual visual inspection, which may result in judgment errors. Reasons for such errors include insufficient experience or eye fatigue of the inspector. In addition, manual visual inspection is subjective, so its results are not always convincing. Although accumulated experience can reduce errors in inspection, inspection staff require considerable training and time to become inspection experts. Classifying fabric levels by eye not only requires a large amount of human resources, but prolonged visual observation may also cause occupational harm, which decreases the efficiency of the detection process.
In recent years, many researchers of fabric defect detection have used machine vision technology to process fabric images. They used experience to capture the characteristics of fabric images and then perform fabric classification. Deng et al. (1) proposed a comprehensive method for fabric pilling feature extraction and objective evaluation. They used a multiscale 2D dual tree complex wavelet transform to extract fabric pilling features from fabric information. Six parameters were extracted at different scales: energy ratio of fabric pilling quality, fabric pilling area, total number of fabric pillings, standard deviation of fabric pilling area, standard deviation of pilling height, and coefficient of deviation of fabric pilling position. Then a backpropagation (BP) neural network was used as a classifier to distinguish the pilling levels. Saharkhiz and Abdorazaghi (2) proposed and compared different clustering methods for the pilling classification of knitted fabrics. They adopted the fast Fourier transform (FFT) and lowpass filtering to remove the texture of the fabric surface. Then, the number, volume, and area of the fabric pillings were extracted from the fabric surface, and these three parameters were used in clustering algorithms to evaluate the pilling of the fabric. Yun et al. (3) used the FFT and fast wavelet filtering methods to capture the characteristics of a fabric. They used three parameters for feature extraction: total number of pixels, number of fabric pillings, and grayscale values of all the fabric pilling images. Finally, a statistical method was used to establish a rule base for grayscale images of fabric pilling. Furferi et al. (4) used a digital camera to capture images, which were processed by light source adjustment and binarization. Then five parameters were proposed as fabric features: total kurtosis, total skewness, entropy curve, coefficient of variation, and brightness. Finally, an artificial neural network was adopted for clustering to evaluate the fabric level.
Eldessouki et al. (5) proposed the use of four strategies, namely, binarization, cutting, quantification, and classification, for fabric pilling evaluation. Four parameters were extracted, namely, average area of fabric pillings, number of fabric pillings, density of fabric pillings, and ratio of fabric pilling area, as features of fabric pilling. Artificial neural networks were also used as classifiers. Kayseri and Kirtay (6) proposed the use of artificial neural networks to predict the pilling tendency of cotton interlocked knitted fabrics. They used the degree of fabric pilling, average area of fabric pillings, total number of fabric pillings, total weight of fabric pillings, average height of fabric pillings, fabric cover factor, and short-fiber content as fabric parameters to derive a pilling level model. Finally, a sensitivity analysis was performed to demonstrate that the fabric cover factor and short-fiber content in the model are the most suitable parameters for regression. Techniková et al. (7,8) captured images through different light sources and a camera lens, where an algorithm was used to reconstruct the images and create 3D images of the fabric.
Huang and Fu (9) used two image preprocessing methods and two machine learning methods for fabric pilling classification. One of the two image preprocessing methods combined the discrete Fourier transform and Gaussian filtering, and the other used the Daubechies wavelet transform. The two machine learning methods, i.e., artificial neural networks and support vector machines, were used to solve fabric grading classification problems. Lee and Lin (10) proposed the type-2 fuzzy cerebellar model articulation controller (T2FCMAC) for evaluating the pilling level of knitted fabrics and a method to adjust the parameters of the T2FCMAC classifier. However, in the above-mentioned traditional image processing methods, it is necessary to define the relevant features of fabric pilling in advance.
Recently, deep learning technology has been booming. An increasing number of researchers have studied deep learning networks, and their application to image recognition has become increasingly widespread. The origin of convolutional neural networks (CNNs) can be traced to LeCun et al., (11) who proposed the LeNet-5 model, which uses BP to adjust the parameters in the network. This was a reasonably successful early CNN. Krizhevsky et al. (12) proposed the AlexNet model. They greatly deepened the network architecture, adopted dropout technology, and used a rectified linear unit (ReLU) as the activation function. The convolutional part of AlexNet is much deeper than that of LeNet, and the complete model has 60 million parameters. Lin et al. (13) presented the Network in Network (NIN) model, in which a multilayer perceptron replaces the traditional convolutional layer. A fully connected layer was placed between the convolutional layers to enhance the characterization of each convolutional layer. Szegedy et al. (14) presented the GoogLeNet model. The convolutional layer in the GoogLeNet model was modified to achieve cross-channel information exchange and reduce dimensionality. As a network becomes deeper, performance degradation becomes increasingly obvious. That is, the deepened network has difficulty learning the correct features, and its accuracy is reduced. To solve this problem, the deep residual network (ResNet) (15) was proposed, in which low-level features are directly mapped to high-level networks. Thus, the higher-level networks do not learn from scratch and have at least the same representation capabilities as the previous layers. However, deepening the network structure also greatly increases the amount of computation, so that graphics processing unit (GPU) hardware with a large amount of computing resources is required.
Therefore, to overcome the above-mentioned problems, we use image preprocessing technology and a deep learning network in this study for the pilling classification of knitted fabrics. In the image preprocessing, the FFT is applied to the obtained fabric image to convert it into a frequency signal, and then a Gaussian filter is used to remove noise. The adopted deep learning network is the LeNet-5 model, which has automatic feature extraction and classification capabilities and does not require a large amount of hardware resources. In the deep learning network, the convolution layers and pooling layers are used to automatically extract the features of the fabric pilling, and the fully connected layers are used to classify the fabric pilling. The learning process is the same as that of a traditional neural network: the error between the predicted output of the network and the desired output is calculated, and the weights in the convolutional layers and the fully connected layers are repeatedly updated through the BP learning algorithm. The experimental results show that the proposed deep learning network has high performance in fabric pilling classification.
The remainder of the paper is divided into the following sections. In Sect. 2, the image preprocessing technology and deep learning network are introduced. Experimental results and comparisons are given in Sect. 3. Finally, Sect. 4 gives conclusions and future work.

Methodology
This section introduces the use of a CNN as a deep learning network to perform the pilling classification of knitted fabrics. The proposed method is divided into two steps. First, the fabric images obtained using an optical sensor are subjected to an FFT and a Gaussian filter to fade the background texture of the fabric. Second, a deep learning network is used for feature extraction and fabric pilling classification. The overall steps of the proposed method are shown in Fig. 1.

FFT and Gaussian filter
The fabric image obtained in this study is preprocessed to improve the image quality. The FFT and a Gaussian filter are used as the image preprocessing methods. The FFT converts the image into a frequency signal, and the Gaussian filter (i.e., a low-pass filter) is used to control the smoothness and remove noise. The sigma (σ) parameter of the Gaussian filter is very important. Because the filter is applied to the frequency signal, a larger sigma value retains more high-frequency components and the filtered image is clearer, whereas a smaller sigma value results in a more blurred filtered image. Figure 2 illustrates the results obtained using different σ values.
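The preprocessing step can be illustrated with a minimal NumPy sketch. It assumes that the Gaussian filter is applied as a mask on the centered frequency spectrum and that σ is measured in frequency bins; the paper does not state either detail. The function name `gaussian_lowpass_fft` and the synthetic test image are hypothetical and serve only as a stand-in for a real fabric image.

```python
import numpy as np

def gaussian_lowpass_fft(image, sigma):
    """Low-pass filter a 2D grayscale image in the frequency domain.

    sigma is the standard deviation of the Gaussian mask in frequency
    bins (an assumption; the paper does not give the units of sigma).
    """
    rows, cols = image.shape
    # Forward FFT, with the zero-frequency term shifted to the array centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Squared distance of every frequency bin from the centre.
    v = np.arange(rows) - rows // 2
    u = np.arange(cols) - cols // 2
    dist2 = v[:, None] ** 2 + u[None, :] ** 2
    # Gaussian mask: 1 at the centre, decaying towards high frequencies.
    mask = np.exp(-dist2 / (2.0 * sigma ** 2))
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# Synthetic stand-in for a fabric image: a fine periodic texture plus one blob.
y, x = np.mgrid[0:64, 0:64]
weave = np.sin(2 * np.pi * x / 4.0)                      # high-frequency texture
pill = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 50.0)   # low-frequency blob
image = weave + pill

smooth = gaussian_lowpass_fft(image, sigma=3.0)   # small sigma: texture removed
sharp = gaussian_lowpass_fft(image, sigma=30.0)   # large sigma: texture kept
```

Under these assumptions, the small σ suppresses the periodic texture while keeping the blob, and the large σ leaves most of the texture in place, which matches the behavior described for Fig. 2.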

Pilling level classification using CNN
The CNN is a very popular neural network architecture in deep learning. As shown in Fig. 3, the process of extracting features is carried out by machine learning, and the features of interest are obtained through the learning process. This method is considered to be effective and can replace manual feature selection. The CNN performs feature extraction by constructing a multilayer network structure. The lower network layers are expected to extract rough local features. As the number of layers increases, these are integrated into global features to achieve a robust feature expression.
The CNN is mainly composed of four types of network layers: convolution, pooling, activation, and fully connected layers. The learning process is the same as that of traditional neural networks. The CNN also uses a BP learning algorithm (16) to update the parameters in the convolutional layer and the fully connected network layer. By calculating the error between the network output value and the real target value, the error is minimized, and the parameters of the entire network can be determined. Each layer in the CNN is described below.

Convolution layer
In the CNN, each convolutional layer is composed of several convolution kernels with different sizes to perform inner product operations on the image. According to a set stride value, a feature map is generated by moving the kernel from left to right and from top to bottom. This step is called convolution, as shown in Fig. 4. The weights of the convolution kernels in each layer are continuously adjusted through the learning process. Thus, the image can highlight more robust features through the convolution process. In Fig. 4, the input image has a length and width of 5, the length and width of the convolution kernel are both 3, and the stride is 1. The size of the output matrix is given by

$$W_o = \frac{W_i - K_w + 2p}{s} + 1, \qquad H_o = \frac{H_i - K_h + 2p}{s} + 1,$$

where $W_o$ and $H_o$ are the width and height of the output matrix, respectively; $W_i$ and $H_i$ are the width and height of the input matrix, respectively; $p$ is the padding size; and $s$ is the stride size. For the example in Fig. 4, assuming no padding ($p = 0$), $W_o = H_o = (5 - 3)/1 + 1 = 3$. The output matrix ($O_{RC}$) of the convolution operation is

$$O_{RC} = \sum_{i=0}^{K_h - 1} \sum_{j=0}^{K_w - 1} I_{sR+i,\, sC+j}\, K_{ij},$$

where $K_h$ and $K_w$ are the height and width of the convolution kernel, respectively; $I$ is the padded input matrix; and $K_{ij}$ is the kernel weight in row $i$ and column $j$.
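As a worked illustration of the convolution step, the following NumPy sketch uses the standard output-size relation (input size minus kernel size plus twice the padding, divided by the stride, plus one) and a direct sliding-window sum. The function names and the test values are hypothetical; the 5 × 5 input, 3 × 3 kernel, and stride of 1 follow the example of Fig. 4, and zero padding of size 0 is an assumption.

```python
import numpy as np

def conv_output_size(w_in, h_in, k_w, k_h, p=0, s=1):
    """Width and height of the feature map for padding p and stride s."""
    w_out = (w_in - k_w + 2 * p) // s + 1
    h_out = (h_in - k_h + 2 * p) // s + 1
    return w_out, h_out

def conv2d(image, kernel, p=0, s=1):
    """Slide the kernel over the zero-padded image and sum the products."""
    padded = np.pad(image, p)  # zero padding of p pixels on every side
    k_h, k_w = kernel.shape
    w_out, h_out = conv_output_size(image.shape[1], image.shape[0],
                                    k_w, k_h, p, s)
    out = np.zeros((h_out, w_out))
    for r in range(h_out):          # top to bottom
        for c in range(w_out):      # left to right
            window = padded[r * s:r * s + k_h, c * s:c * s + k_w]
            out[r, c] = np.sum(window * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # 5 x 5 input, values 0..24
kernel = np.ones((3, 3))                          # 3 x 3 kernel of ones
feature_map = conv2d(image, kernel)               # stride 1, no padding
```

With these values the feature map is 3 × 3, and its top-left entry is the sum of the top-left 3 × 3 window of the input.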

Pooling layer
The main purpose of the pooling layer in the CNN is to reduce the size of the feature map while retaining its important information. That is, the feature map after pooling can still retain the advantages of rotation and translation invariance, anti-aliasing, and avoidance of overfitting. Two pooling operations are commonly used. In the maximum pooling operation, the image is divided into several regions, and the maximum value in each region is selected. In the average pooling operation, the average of all the values in each region is calculated. Figure 5 schematically illustrates the maximum pooling operation and the average pooling operation.
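The two pooling operations can be sketched in a few lines of NumPy. This is a minimal illustration, assuming non-overlapping square regions; the function name `pool2d`, the region size of 2, and the 4 × 4 test matrix are hypothetical and are not taken from Fig. 5.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size regions of a 2D feature map."""
    h, w = feature_map.shape
    h_out, w_out = h // size, w // size
    # Crop to a multiple of the region size, then give each region its own axes.
    blocks = feature_map[:h_out * size, :w_out * size]
    blocks = blocks.reshape(h_out, size, w_out, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

feature_map = np.array([[1.0, 3.0, 2.0, 4.0],
                        [5.0, 7.0, 6.0, 8.0],
                        [9.0, 11.0, 10.0, 12.0],
                        [13.0, 15.0, 14.0, 16.0]])

max_pooled = pool2d(feature_map, size=2, mode="max")
avg_pooled = pool2d(feature_map, size=2, mode="average")
```

Both results are 2 × 2, so the feature map is reduced to a quarter of its original number of elements.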

Activation layer
The activation layer plays an important role in the neural network. It is mainly used to simulate the transfer operation of the neural network. Because the activation function used in the activation layer is often nonlinear, it is also called a nonlinear transfer function. Adding activation functions allows neural networks to solve more complex nonlinear problems. Common activation functions include the sigmoid function, tanh function, and ReLU. (17) In the CNN, ReLU is often used as the activation function, as shown in Fig. 6. ReLU is defined as $f(x) = \max(0, x)$: if the input value $x$ is less than 0, the output is equal to 0; otherwise, the original input value is output. The main purpose of the ReLU function is to mitigate the problem that gradients vanish easily when they are calculated in multilayer neural networks.
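The following sketch shows ReLU and its derivative in NumPy, together with a deliberately simplified illustration of why its gradient vanishes less easily than that of the sigmoid function. The illustration multiplies only the activation derivatives over a hypothetical depth of 10 layers and ignores the weights, so it is an intuition aid rather than a full BP computation.

```python
import numpy as np

def relu(x):
    """Output 0 for negative inputs and the input itself otherwise."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (np.asarray(x) > 0).astype(float)

def sigmoid_grad(x):
    """Derivative of the sigmoid function; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))
    return s * (1.0 - s)

x = np.array([-2.0, 0.0, 3.0])
activated = relu(x)

# Product of activation derivatives across a hypothetical 10-layer chain.
depth = 10
relu_chain = relu_grad(3.0) ** depth        # stays at 1 for an active unit
sigmoid_chain = sigmoid_grad(0.0) ** depth  # at most 0.25 ** 10
```

Even in the best case for the sigmoid function, the product of its derivatives shrinks geometrically with depth, whereas an active ReLU unit passes the gradient through unchanged.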

Fully connected layer
In a CNN, the first two types of layers are the convolutional layer and the pooling layer. Their purpose is to capture the image features. The last two layers of the CNN use a multilayer perceptron and Softmax to obtain prediction results. As shown in Fig. 7, the architecture of a fully connected layer is similar to that of a multilayer perceptron. Before being input into the multilayer neural network, all feature maps must be converted to a 1D array as the input data of the fully connected network, and then the final output is obtained through Softmax.
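The conversion of the feature maps to a 1D array and the subsequent fully connected computation can be sketched as follows. The sizes are assumptions made for illustration: 16 feature maps of 5 × 5 correspond to the last pooling layer of the classic LeNet-5, and four outputs correspond to pilling levels 2 to 5 in the dataset of this study. The random values stand in for learned features and weights, and a single layer is shown instead of a full multilayer perceptron.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last pooling layer: 16 feature maps of 5 x 5.
feature_maps = rng.standard_normal((16, 5, 5))

# Flatten all feature maps into one 1D array (16 * 5 * 5 = 400 values).
flat = feature_maps.reshape(-1)

# One fully connected layer mapping the 400 inputs to 4 class scores.
n_classes = 4  # pilling levels 2 to 5
weights = rng.standard_normal((n_classes, flat.size)) * 0.01  # placeholder weights
bias = np.zeros(n_classes)
scores = weights @ flat + bias  # these scores are then passed to Softmax
```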
Softmax is used so that the output of the network presents a probability distribution over the categories. Each output lies between 0 and 1, and the sum of the probabilities of all categories equals 1. Because Softmax uses the exponential operation, large values are emphasized, and the gap between large and small values is widened. The output $y_i$ of Softmax is given by

$$y_i = \frac{e^{z_i}}{\sum_{j=1}^{c} e^{z_j}},$$

where $z_i$ denotes the $i$th output of the fully connected network and $c$ represents the number of categories.
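A minimal NumPy sketch of Softmax is given below. Subtracting the maximum score before the exponential is a common numerical safeguard that does not change the result; it is an implementation choice and is not described in the paper. The four example scores are hypothetical.

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()  # avoids overflow in exp; result unchanged
    exps = np.exp(shifted)
    return exps / exps.sum()

scores = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical fully connected outputs
probs = softmax(scores)
```

The ratio between neighboring probabilities here is e, which is larger than the ratio between the raw scores, illustrating how the exponential widens the gap between large and small values.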
In the training process, a loss function, i.e., the cross entropy, is used to calculate the error between the real value $\hat{y}_i$ and the Softmax output $y_i$. Then the CNN carries out learning through the BP learning algorithm. The cross entropy is defined as

$$E = -\sum_{i=1}^{c} \hat{y}_i \log y_i,$$

where $c$ is the number of categories.
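As a minimal sketch of this loss, the code below evaluates the standard cross entropy, i.e., the negative sum over categories of the real value times the logarithm of the Softmax output, for a one-hot target. The small constant `eps` guards against taking the logarithm of zero and is an implementation assumption; the target and predicted vectors are hypothetical.

```python
import numpy as np

def cross_entropy(target, predicted, eps=1e-12):
    """Cross entropy between a one-hot target and predicted probabilities."""
    target = np.asarray(target, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(-np.sum(target * np.log(predicted + eps)))

target = np.array([0.0, 0.0, 1.0, 0.0])     # true class is the third category
predicted = np.array([0.1, 0.2, 0.6, 0.1])  # hypothetical Softmax output

loss = cross_entropy(target, predicted)          # equals -log(0.6)
perfect = cross_entropy(target, target)          # close to 0 for a perfect match
```

The loss decreases toward 0 as the probability assigned to the true category approaches 1, which is the quantity the BP learning algorithm minimizes.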

Experiments
In this section, the proposed method is applied to the classification of pilling levels. First, the obtained fabric images are introduced. Then, the classification of pilling levels using deep learning networks is described, and its performance is compared with those of various existing methods.

Pilling level of knitted fabric
To establish a database of the pilling levels of clothes, we collected fabric samples and distinguished the levels of fabric pilling as preliminary work. First, the fabric was placed in a Martindale abrasion tester in accordance with the SGS International Standard Inspection. After the fabric was continuously rolled in the tester, the condition of fabric pilling on the surface was detected, and then the pilling level of the fabric was evaluated by manual visual inspection. In the SGS International Standard Test, fabric pilling is divided into five levels: level 1 is very serious fabric pilling, level 2 is serious fabric pilling, level 3 is moderate fabric pilling, level 4 is slight fabric pilling, and level 5 is no fabric pilling, as shown in Table 1.

Classification results using deep learning networks
In this subsection, we discuss the classification results obtained using the deep learning network. The fabric samples were provided by a Taiwanese manufacturer that does not produce fabrics with very serious fabric pilling. Therefore, the fabric data only contained levels 2 to 5 of pilling. In this experiment, four types of fabrics were evaluated using an image optical sensor and their pilling levels were identified. The dataset we collected had a total of 320 images with 80 images for each level of pilling. For each level, 80% of the data was randomly selected as training samples and the remaining 20% was used as testing samples. Then deep learning networks were used to identify fabric pilling. This random selection was performed 10 times for cross-validation, giving a total of 10 different training and testing datasets. The inspection results obtained using the proposed method are shown in Table 2. The average accuracy was defined as the average of the accuracy rates of the 10 testing datasets. The average accuracy was 100%; thus, the performance meets the needs of the industry.
Recently, Yang et al. (18) proposed deep-principal-component-analysis-based neural networks (DPCANN) for the pilling classification of knitted fabrics. In their study, DPCANN had the same automatic feature extraction function as the CNN; deep principal component analysis automatically extracted the characteristics of the fabric pilling, and neural networks and support vector machine classifiers were used to identify the pilling level of the knitted fabric. The 10 different training and testing datasets used to evaluate the performance of DPCANN in this study were the same as those in Table 2. The experimental results show that the average accuracies obtained using DPCANN and the support vector machine classifiers were 98.6% and 99.7%, respectively.
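The evaluation protocol can be sketched with the Python standard library as follows. The sketch assumes that the 80%/20% selection is made independently for each pilling level and repeated with 10 different random seeds; the function names, the seeds, and the accuracy values in the usage example are hypothetical and do not reproduce the entries of Table 2.

```python
import random

LEVELS = [2, 3, 4, 5]    # pilling levels present in the dataset
IMAGES_PER_LEVEL = 80    # 320 images in total

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Randomly split image indices per level into training and testing sets."""
    rng = random.Random(seed)
    train, test = [], []
    for level in sorted(set(labels)):
        indices = [i for i, lab in enumerate(labels) if lab == level]
        rng.shuffle(indices)
        cut = int(round(train_fraction * len(indices)))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test

def average_accuracy(accuracies):
    """Average of the accuracy rates over all testing datasets."""
    return sum(accuracies) / len(accuracies)

# One label per image; image i has pilling level labels[i].
labels = [level for level in LEVELS for _ in range(IMAGES_PER_LEVEL)]

# Ten different training and testing datasets, one per random seed.
splits = [stratified_split(labels, seed=run) for run in range(10)]

# Placeholder accuracies for illustration only (not measured values).
example_average = average_accuracy([1.0] * 10)
```

Each of the 10 splits contains 256 training images and 64 testing images, with 64 and 16 images per level, respectively.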

Conclusions and Future Works
In this study, image preprocessing technology and a deep learning network were proposed to classify the pilling level of knitted fabrics. Fabric images were obtained using an optical sensor and then subjected to preprocessing, where a hybrid of the FFT and a Gaussian filter was used to strengthen the characteristics of the pilling in the images. The pilling features of the knitted fabric were automatically extracted and the pilling level was identified using a deep learning network. In our experiment, the average accuracy of the proposed method in pilling level classification was 100%, which is 0.3 and 2.7 percentage points higher than the average accuracies of DPCANN (18) and T2FCMAC, (10) respectively.