Classification of Hyperspectral Images of Small Samples Based on Support Vector Machine and Back Propagation Neural Network

High-precision classification of hyperspectral images from small numbers of training samples is a focus of research in hyperspectral remote sensing. At present, few studies have examined the effect of training set size on classification accuracy. To determine this effect, the hyperspectral image of the Indian Pines farmland is used as the data source. In this work, we compare the classification accuracies of a support vector machine (SVM) and a back propagation (BP) neural network when the training set contains 1, 2, 5, 10, and 20% of the samples. Simulation results show that the overall accuracy (OA), average accuracy (AA), and Kappa coefficients of SVM and BP increase continuously with the number of training samples. At every training set size, the classification accuracy of BP is greater than that of SVM. When the training set contains 20% of the samples, the recognition accuracy of the BP classifier for seven classes (Grass-pasture, Grass-trees, Hay-windrowed, Oats, Wheat, Woods, and Stone-Steel-Towers) exceeds 90%, and the recognition accuracy for the Hay-windrowed class is 93.97%.


Introduction
Hyperspectral imagers acquire spectral imaging data from hundreds of narrow, continuous bands in the visible-to-infrared spectral region. Each pixel of a hyperspectral image is a vector whose elements correspond to spectral response values at different wavelength bands. Because different substances reflect electromagnetic energy differently in specific wavelength bands, they can be distinguished on the basis of their spectral characteristics. Hyperspectral images are characterized by high spectral resolution, wide spectral range, and strong spectral correlation. (1)(2)(3) They can distinguish ground-object categories that multispectral images cannot and are widely used in environmental detection, military security, astronomy, forestry protection, mineralogy, and other fields.
Classifying each pixel of a hyperspectral image is a problem that must be solved in practical applications. Among current approaches, hyperspectral image classification based on machine learning is the mainstream method of feature classification; it mainly uses a pattern recognition classifier to classify ground objects. For example, Zhang et al. determined the spatial information in the homogeneous region using the relative homogeneity coefficient through a Markov random field and combined the spectral features with the spatial information, effectively improving the classification performance of the support vector machine (SVM) classifier. (4) Chen et al. used a window method to introduce spatial information through the joint sparse representation of pixels in the window, minimizing the reconstruction error to obtain the classification result. (5) Guo et al. adopted the support tensor machine (STM) method for hyperspectral classification and introduced spatial information through the structure of the tensor feature representation to realize joint spectral-spatial classification. (6) Gurram and Kwon proposed a contextual SVM classification method based on kernel space embedding, introducing spatial information by averaging it in the reproducing kernel Hilbert space. (7) Small training sets are a key issue in research on hyperspectral image classification, yet the effect of training set size on classification accuracy has rarely been studied. The SVM maps the original feature vectors into a high-dimensional space through a kernel function and performs classification by constructing a decision surface, which gives it a strong ability to classify from small training samples.
The back propagation (BP) neural network offers high classification accuracy, strong parallel processing ability, robustness and fault tolerance to noise, and the ability to approximate complex nonlinear relationships. Therefore, to determine the effect of training set size on the classification accuracy of a hyperspectral image, the hyperspectral image of the Indian Pines farmland is used as the data source. In this paper, the classification accuracies of SVM and BP are compared and analyzed when the training set contains 1, 2, 5, 10, and 20% of the samples.

Data source
The Indian Pines experimental dataset is an image of farmland in Indiana, USA, acquired in 1992 with the AVIRIS sensor. The image size is 145 × 145 pixels, the spatial resolution is 20 m, the spectral range is 400-2500 nm, and the spectral resolution is 9.7-12 nm. After removing the bands severely affected by atmospheric absorption and noise, the remaining 200 bands were used for the experiments. The image contains 10,249 labeled samples in total, covering 16 ground-object classes. The real object image is shown in Fig. 1. The specific category names and the number of pixels in each class are listed in Table 1.
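The per-class sampling scheme used throughout the experiments can be sketched as follows. This is a minimal illustration in numpy, not the authors' code; the synthetic label array merely stands in for the ground-truth map.

```python
import numpy as np

def per_class_split(labels, fraction, rng):
    """Sample `fraction` of each class for training; the rest is the test set."""
    train, test = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_train = max(1, int(round(fraction * idx.size)))  # keep >= 1 per class
        train.append(idx[:n_train])
        test.append(idx[n_train:])
    return np.concatenate(train), np.concatenate(test)

# Synthetic stand-in for the ground-truth map (class ids 1..16, 10249 pixels)
rng = np.random.default_rng(0)
labels = rng.integers(1, 17, size=10249)
train_idx, test_idx = per_class_split(labels, 0.05, rng)
```

With the real data, `labels` would come from the Indian Pines ground-truth file (commonly distributed as a MATLAB `.mat` file; the exact file and variable names depend on the download source), with label 0 (unlabeled background) removed first.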

SVM and BP classifier
SVM is a typical supervised classification model. (8) Its basic principle is to find a classification hyperplane that separates the two classes of sample points; in essence, this is a convex optimization problem:

min_{w,b,ζ} (1/2)‖w‖² + C Σ_{i=1}^{n} ζ_i, subject to y_i(w^T Φ(x_i) + b) ≥ 1 − ζ_i, ζ_i ≥ 0,

where w is the weight vector, b is the intercept of the decision function, ζ_i is the slack variable, C is the penalty coefficient, and Φ(·) is the nonlinear mapping function.

The BP neural network is a model for dealing with nonlinear problems. (9) It is generally composed of input, hidden, and output layers. Its basic principle is to propagate the learning error of the network back to the hidden layer and adjust the weights and thresholds so that the total error is minimized and the expected learning requirements are met. The specific calculation steps of BP are as follows.

Step 1: Calculate the middle (hidden) layer output:

h_j^{(k)} = f( Σ_{i=1}^{m} W_ij^{(k)} x_i − θ_j^{(k)} ), j = 1, 2, …, q,

where q is the number of middle-layer nodes, m is the number of input-layer nodes, k is the learning iteration, W_ij is the connection weight from the input layer to the middle layer, θ_j is the threshold of the middle layer, and f(·) is the activation function.

Step 2: Calculate the output layer output:

y_t^{(k)} = f( Σ_{j=1}^{q} V_jt^{(k)} h_j^{(k)} − θ_t^{(k)} ),

where V_jt is the connection weight from the middle layer to the output layer and θ_t is the threshold of the output layer.

Step 3: Reverse transmission. Calculate the error between the network output and the expected value, and continuously update the weights and thresholds of the network so that the output of the output layer approaches the expected output as closely as possible.
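The two classifiers can be sketched with scikit-learn stand-ins: `SVC` for the kernel SVM and `MLPClassifier` for a one-hidden-layer network trained by back propagation. The kernel choice, hidden-layer size, and penalty coefficient below are illustrative assumptions, not the paper's settings, and toy Gaussian "spectra" stand in for real pixels.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Toy "spectra": 3 well-separated classes, 100 samples x 200 bands each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, size=(100, 200)) for c in range(3)])
y = np.repeat([0, 1, 2], 100)

# RBF-kernel SVM: the kernel implicitly maps features to a high-dimensional space
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))

# One hidden layer of 64 nodes, weights updated by back propagation
bp = make_pipeline(StandardScaler(),
                   MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                 random_state=0))

svm.fit(X, y)
bp.fit(X, y)
```

Standardizing the bands before either classifier matters in practice, since raw radiance values vary widely in scale across the spectral range.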

Classification accuracy evaluation index
After hyperspectral images are classified, the classification results must be evaluated objectively. Four indicators are commonly used for this evaluation: class accuracy (CA), overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. CA indicates the proportion of correctly classified samples of each feature class within the corresponding sample set. OA indicates the proportion of correctly classified samples among all test samples. AA is the average of the proportions in which each category is correctly classified, taking into account the classification of every category. The Kappa coefficient is a ratio that accounts for the effect of chance agreement on the classification results and is used for consistency testing. The four evaluation indicators are calculated as shown in Eqs. (4)-(7), respectively:

CA_i = R_ii / Σ_{j=1}^{k} R_ji, (4)

OA = (1/T) Σ_{i=1}^{k} R_ii, (5)

AA = (1/k) Σ_{i=1}^{k} CA_i, (6)

Kappa = ( T Σ_{i=1}^{k} R_ii − Σ_{i=1}^{k} R_{i+} R_{+i} ) / ( T² − Σ_{i=1}^{k} R_{i+} R_{+i} ). (7)

Here, in practical applications, it is assumed that there are k categories of features. Comparing the classification results with the reference results yields the k × k confusion matrix R, in which the element R_ij represents the number of samples of category j recognized as category i, R_{i+} and R_{+i} denote the sums of the i-th row and i-th column of R, respectively, and T denotes the total number of test samples.
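The four indices are straightforward to compute from the confusion matrix. The following minimal numpy sketch follows the convention stated above, where R[i, j] counts samples of true category j recognized as category i; the example matrix is illustrative.

```python
import numpy as np

def accuracy_indices(R):
    """Return (CA, OA, AA, kappa) for a k x k confusion matrix R,
    where R[i, j] counts samples of true class j predicted as class i."""
    R = np.asarray(R, dtype=float)
    T = R.sum()                      # total number of test samples
    diag = np.diag(R)
    CA = diag / R.sum(axis=0)        # per-class accuracy (column-wise)
    OA = diag.sum() / T              # overall accuracy
    AA = CA.mean()                   # average accuracy
    pe = (R.sum(axis=1) * R.sum(axis=0)).sum() / T**2   # chance agreement
    kappa = (OA - pe) / (1.0 - pe)
    return CA, OA, AA, kappa

# Illustrative 2-class confusion matrix: 90 of 100 samples correct
R = np.array([[45,  5],
              [ 5, 45]])
CA, OA, AA, kappa = accuracy_indices(R)   # OA = 0.9, kappa = 0.8
```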

Results and Discussion
From each ground-object class, 1, 2, 5, 10, and 20% of the samples are selected as the training set, and the remaining samples are used as the test set to verify the classification accuracies of the SVM and BP classifiers at different training set ratios. The results are shown in Table 2. It can be seen from Table 2 that as the number of training samples increases, the OA, AA, and Kappa coefficients of SVM and BP continue to increase. This is because a larger training set carries richer label information and local structure information, so the selected features preserve more of the global similarity or local geometric properties, and the classification accuracy is correspondingly higher. Moreover, the classification accuracy of BP is higher than that of SVM at every training set size.
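The experimental loop described above can be sketched as follows. This is a hypothetical reconstruction on synthetic 3-class spectra, not the authors' experiment; classifier settings are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 3 well-separated classes, 400 samples x 50 bands each
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=2 * c, size=(400, 50)) for c in range(3)])
y = np.repeat([0, 1, 2], 400)

results = {}
for frac in (0.01, 0.02, 0.05, 0.10, 0.20):
    # Per-class split: `frac` of each class for training, the rest for testing
    tr, te = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        rng.shuffle(idx)
        n = max(1, int(round(frac * idx.size)))
        tr.append(idx[:n])
        te.append(idx[n:])
    tr, te = np.concatenate(tr), np.concatenate(te)

    for name, clf in (("SVM", SVC(kernel="rbf")),
                      ("BP", MLPClassifier(hidden_layer_sizes=(32,),
                                           max_iter=500, random_state=0))):
        clf.fit(X[tr], y[tr])
        results[(name, frac)] = clf.score(X[te], y[te])   # overall accuracy
```

On the real data, `results` would be tabulated into OA/AA/Kappa per ratio, as in Table 2.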
When the number of samples in the training set is 1%, the OA accuracy of both SVM and BP is about 60%. When the number of samples in the training set is 2%, the OA accuracy of SVM is about 67% and that of BP is about 70%. When the number of samples in the training set is 5%, the OA accuracy of both SVM and BP is about 75%. When the number of samples in the training set reaches 10 and 20%, the OA accuracy of both SVM and BP is more than 80%.
In addition, when the training set contains 20% of the samples, the BP classifier's recognition accuracy for seven objects (Grass-pasture, Grass-trees, Hay-windrowed, Oats, Wheat, Woods, and Stone-Steel-Towers) exceeds 90%, and the recognition accuracy for the Hay-windrowed object is 93.97%. The recognition accuracy for the Soybean-clean object is about 85%. The recognition accuracies for four objects (Corn-notill, Soybean-notill, Soybean-mintill, and Buildings-Grass-Trees-Drives) are between 80 and 85%. The recognition accuracy for the Corn-mintill object is 78.74%, and that for the Corn object is 67.34%. The recognition accuracy for the Alfalfa object is the lowest (58.97%). Figure 2 shows the classification maps of SVM and BP at different training set ratios. When the number of training samples is very small, salt-and-pepper noise is clearly visible in the classification maps of both SVM and BP. As the number of training samples increases, the salt-and-pepper noise in the classification maps of SVM and BP gradually decreases. Moreover, the BP classification map is smoother than the SVM classification map at every training set ratio.

Conclusions
In this paper, the Indian Pines dataset is used as the experimental hyperspectral image data; 1, 2, 5, 10, and 20% of the samples in each of the 16 ground-object classes are selected as the training set, and the remaining samples are used as test samples. The simulation results reveal the classification accuracies of the SVM and BP classifiers at different training set ratios, providing a reference for selecting suitable training set sizes.