Sensing Analysis of Feature Extraction Types for Handwritten Character Recognition

The line density direction (LDD) feature, gradient direction (GD) feature, and deep convolution neural network-based (convNet-based) feature are widely employed in handwritten character sensing recognition and achieve acceptable accuracies. The convNet-based method determines its feature expression not only from a raw pattern image but also from domain-specific LDD and GD knowledge; these variants are named convNet-based-Raw, convNet-based-LDD, and convNet-based-GD, respectively. In this paper, we present an independent sensing analysis of the five features under identical working conditions, including preprocessing and algorithm implementation, on two handwritten character databases: CASIA-HWDB1.0 (Chinese) and TUAT HANDS (Japanese). The experimental results demonstrate that convNet-based feature extraction is more robust and discriminating than the two traditional methods, LDD and GD, for both handwritten Chinese character recognition (HCCR) and handwritten Japanese character recognition (HJCR). Furthermore, convNet-based-GD has the highest accuracy for both HCCR and HJCR among the three convNet-based feature extraction methods. Compared with the traditional LDD and GD methods, the best accuracies when using convNet-based-GD are improved by 3.04 and 2.31% for HCCR, and 3.15 and 1.54% for HJCR, respectively. Similarly, compared with the two other convNet-based methods, convNet-based-Raw and convNet-based-LDD, the best accuracies are improved by 0.44 and 0.25% for HCCR, and 0.65 and 0.08% for HJCR, respectively. The experimental comparisons show that the sensing analysis yields consistent and valuable results.


Introduction
With the extensive use of electronic pen input devices, personal digital assistants (PDAs), smartphones, and handheld devices, human-like freehand input is in strong demand by users. The touch sensor of a handheld device detects the moving path of the finger or pen-based input device to obtain the coordinate sequence of a handwritten input character. The handwritten character pattern is then recognized as the corresponding character by a trained character classifier. Feature extraction from the handwritten character pattern during character recognition not only improves the recognition accuracy but also reduces the complexity of recognition. Nowadays, methods extracting the line density direction (LDD) feature, gradient direction (GD) feature, and convolution neural network-based (convNet-based) feature are the leading techniques for handwritten character recognition. To obtain a robust feature expression that exploits the advantage of each feature extraction type, it is necessary to study the sensing sensitivity of each type for handwritten character recognition. In this paper, we focus on the sensing analysis of feature extraction for handwritten Chinese and Japanese character recognition.
Handwritten Chinese character recognition (HCCR) has been studied for more than fifty years. It remains a challenge to deal with the large number of character classes, the confusion between similar characters, and the distinct handwriting styles across individuals. (1) The same holds for handwritten Japanese character recognition (HJCR). On the basis of the type of input data, handwriting recognition can be divided into online and offline recognition. Offline HCCR and HJCR have found many applications, such as mail sorting and bank check reading. Online HCCR and HJCR have been widely applied to, for example, pen input devices, PDAs, smartphones, and computer-aided education. Moreover, HCCR and HJCR are also critical and integral parts of handwritten Chinese and Japanese text recognition, respectively, in which segmentation and recognition are considered simultaneously; high character recognition accuracy is essential for the success of handwritten text/string recognition. (2) The online method operates on an online input handwritten character pattern, i.e., a time sequence of pen-tip coordinates, and although it can easily be made robust to stroke connection and deformation, it is sensitive to stroke order variation. In contrast, the offline method compares the handwritten character with an offline pattern, i.e., a character pattern image, and although it is insensitive to stroke order variation and duplicated strokes, it is not very robust to stroke connection and deformation. The offline recognition method can be combined with the online process to form a combined recognizer for handwritten character recognition. (3) Although the accuracy of handwritten character recognition depends on the combined contributions of pattern normalization, feature extraction, and recognizer training, a strongly discriminating feature expression (feature extraction) is crucial to handwritten character classification.
(4) In online handwritten character recognition, it is more effective and efficient to use extracted feature points instead of all the pen-tip points to express character patterns. The direction feature, inspired by the stroke segments of Chinese characters, which can be approximated by four orientations (horizontal, vertical, left-diagonal, and right-diagonal), was further employed to significantly improve the accuracy of online handwritten character recognition. In offline handwritten character recognition, wavelet transforms and Gabor filters have been successively applied to offline handwritten English, Indian, and Chinese character recognition. (5) Similar to the online method, the trajectory-based LDD feature and the GD feature also showed superiority in offline handwritten character recognition, especially in HCCR and HJCR. (6) The LDD and GD features are extracted from online written character patterns for online handwritten character recognition and from handwritten character pattern images for offline handwritten character recognition, respectively.
As for online recognition, dynamic programming (DP) matching was introduced to achieve elastic matching between feature points. (7) Its greedy variant, linear-time flexible matching, was later combined with a structured character representation, reducing memory consumption and improving consistency against style variations. Following DP matching, the hidden Markov model (HMM) and Markov random field (MRF) were successively applied in online handwritten character recognition. (8) Historically, a handwritten character recognizer employing the nearest-neighbor method was first used for optical character recognition (OCR) in the offline process but was replaced by the modified quadratic discriminant function (MQDF). Discriminative classifiers such as the neural network (NN) and the support vector machine (SVM) are also candidates, but statistical classifiers can generalize better than discriminative ones when the number of sample patterns is limited. (9) Both Chinese and Japanese have thousands of character categories; thus, the statistical MQDF, rather than a discriminative classifier such as the SVM, is the preferred choice.
In recent years, the deep convolution neural network (convNet) has been successfully used in HCCR and HJCR. Since it treats feature extraction and classifier training as a single system whose parameters are optimized simultaneously, its recognition performance is further improved. Although the deep convNet can directly learn discriminative representations from a raw pattern image, (10) well-studied domain-specific knowledge has been shown to also help improve the performance of HCCR. (11) A direction feature, such as LDD or GD, is the most critical domain-specific knowledge. (12) However, these previous results were obtained under different frameworks, each using only one evaluation database and without considering the characteristics of the handwritten character database. Moreover, although combining domain-specific knowledge with a deep convNet set a new benchmark in HCCR, similar research has not been carried out in HJCR. Previous works mainly reported the progress in the accuracy of handwritten character recognition systems, but comparisons of the performances of different feature extractions are seldom made. Therefore, more comprehensive comparative experiments on different feature extractions, i.e., the LDD, GD, and convNet-based methods and their variations, on different evaluation databases under the same framework are necessary and valuable.
The rest of this paper is organized as follows. In Sect. 2, we depict the evaluation databases. In Sects. 3 and 4, we describe the procedures of the three types of feature extraction and feature reduction. Then, two types of handwritten character recognizer are introduced in Sect. 5. In Sects. 6 and 7, we show some experimental results with further analytical experiments. Finally, some conclusions are drawn in Sect. 8.

Overall System and Evaluation Database
The handwritten character recognition system consists of two phases: classifier training and testing. The process of classifier training involves four steps, as shown in Fig. 1. The left side of Fig. 1 shows the dataflow in the training of the classifier, whereas the right side shows the corresponding modules. In the first step, the input pattern is normalized by nonlinear modified centroid-boundary alignment (NLNMCBA). (5) After that, the five types of feature vector are extracted from the normalized pattern. In the third step, the extracted feature vector is compressed by Fisher linear discriminant analysis (FLDA). (13) Finally, the compressed feature vector is employed to train MQDF. After being trained, the classifier is used to assign the input handwritten character pattern to the corresponding character class.
In HCCR, the CASIA-OLHWDB1 database is employed to evaluate the performance of the different feature expressions. (14) This online database of handwritten Chinese characters and texts, collected with an Anoto pen on paper, contains unconstrained handwritten characters of 4037 categories (3866 Chinese characters and 171 symbols) written by 420 people, with 1694741 handwritten samples in total.
In HJCR, the TUAT HANDS database, which consists of the Kuchibue and Nakayosi databases of online handwritten Japanese characters, is employed as the training and testing databases to evaluate the performance of the different feature extractions. (15) The Kuchibue database contains online handwritten character patterns of 120 writers, i.e., 11962 patterns per writer covering 3356 character classes. The 11962 patterns per writer comprise two parts: a sentential part of 10154 characters, often used for text recognition testing, and 1808 single characters that appear less often. The Nakayosi database contains the samples of 163 writers, with 10403 patterns covering 4438 character classes per writer. In total, Kuchibue thus contains 11962 × 120 = 1435440 character patterns and Nakayosi contains 10403 × 163 = 1695689 patterns. For both Kuchibue and Nakayosi, each character pattern was written in an individual writing box.

Feature Extractions
The LDD and GD feature extractions are quite similar and contain three steps: character pattern or character image normalization, directional decomposition, and feature sampling. The slight variation is in the directional elements of directional decomposition in the second step, i.e., LDD employs trajectory-based elements, whereas GD employs gradient-based elements. The convNet-based feature extraction directly uses the deep convNet. The feature extraction is described in detail below.

LDD feature extraction
In the LDD feature extraction, we assume that the online character C is composed of n strokes, denoted as C = {S 1 , S 2 , ..., S n }. P ij ( j = 1, ..., m) is the j-th point on the stroke S i (i = 1, ..., n), and the trajectory vector A ij is the vector from P ij to P ij+1 , i.e., A ij = P ij+1 − P ij . All the trajectory vectors A ij ( j = 1, ..., m − 1) of the online pattern are then employed to extract the trajectory-based feature vector. To obtain a robust feature vector, each trajectory vector must be decomposed into the major directions. Let e 1 , e 2 , ..., e 8 be the elementary vectors of the 8 directions, as shown in Fig. 2, so that the angle range is divided into 8 subranges. Rather than assigning each vector to its single nearest direction, the trajectory vector is more effectively decomposed into two components along the standard directions between which it lies. (16) The length of each component is assigned to the corresponding direction plane at the corresponding pixel. To extract feature vectors of moderate dimensionality, each direction plane must be blurred and resampled; the total or average value of each zone is then taken as the feature value. For blurring, the convolution with the Gaussian filter is given in Eq. (2), and the directional plane is operated on using Eq. (4) to obtain the feature value of each filter window. The variance parameter is related to the sampling frequency computed using Eq. (3), where t x is the sampling interval.
From the above descriptions, the feature extraction process of LDD is shown in Fig. 3. We apply NLNMCBA to the character patterns. The local stroke direction is decomposed into eight direction planes, and 64 feature elements are then extracted from each directional plane using the Gaussian filter with a filter window of 8 × 8 pixels; thus, the dimensionality of the feature vector is 512. To improve the Gaussianity of the feature distribution, each of the 512 feature values is transformed by the Box-Cox transformation. (17)
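As an illustration, the decomposition and sampling steps above can be sketched as follows. This is a minimal sketch, not the paper's implementation: zone averaging stands in for the Gaussian-filter sampling of Eqs. (2)-(4), and the function name and data layout are our own illustrative choices.

```python
import numpy as np

def ldd_features(strokes, size=64, n_dir=8, interval=8):
    """Sketch of LDD extraction: decompose trajectory vectors A_ij onto
    n_dir direction planes, then subsample each plane to an 8x8 grid,
    giving n_dir * 64 = 512 feature values.
    """
    planes = np.zeros((n_dir, size, size))
    step = 2 * np.pi / n_dir                      # width of one angle subrange
    for stroke in strokes:                        # stroke = list of (x, y)
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            dx, dy = x1 - x0, y1 - y0
            length = np.hypot(dx, dy)
            if length == 0:
                continue
            angle = np.arctan2(dy, dx) % (2 * np.pi)
            lo = int(angle // step) % n_dir       # lower standard direction
            frac = (angle - lo * step) / step     # split between the two
            planes[lo, int(y0), int(x0)] += length * (1 - frac)
            planes[(lo + 1) % n_dir, int(y0), int(x0)] += length * frac
    # zone averaging as a stand-in for Gaussian blurring + resampling
    zones = planes.reshape(n_dir, size // interval, interval,
                           size // interval, interval).mean(axis=(2, 4))
    return zones.reshape(-1)                      # 512-dimensional vector
```

A single horizontal segment, for instance, contributes only to direction plane e 1 (angle 0), with its full length split across the zones it starts in.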

Gradient feature extraction
Since the GD feature cannot be directly extracted from an online pattern, a pattern image converted from the corresponding online pattern is needed. The gradient vector g(x, y) = [g x , g y ] T at pixel (x, y) in a normalized pattern image is computed using Eq. (5). The gradient strength and direction can be computed from the gradient vector [g x , g y ] T . (18) Then the remaining steps, i.e., directional decomposition and feature sampling, are the same as in the feature extraction of LDD.
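A minimal sketch of this gradient decomposition follows, assuming Sobel masks for the gradient operator of Eq. (5) (the text does not name the operator) and the same two-direction splitting scheme as in the LDD decomposition:

```python
import numpy as np

def gradient_planes(img, n_dir=8):
    """Decompose Sobel gradients of a normalized pattern image into
    n_dir direction planes (GD directional decomposition sketch).

    Each gradient vector's strength is split between the two standard
    directions it lies between; only the valid (unpadded) region is kept
    to keep the sketch short.
    """
    # Sobel responses: g_x (horizontal) and g_y (vertical)
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    strength = np.hypot(gx, gy)                   # gradient strength
    angle = np.mod(np.arctan2(gy, gx), 2 * np.pi) # gradient direction
    step = 2 * np.pi / n_dir
    lo = (angle // step).astype(int) % n_dir      # lower standard direction
    frac = (angle - lo * step) / step             # split between the two
    planes = np.zeros((n_dir,) + gx.shape)
    rows, cols = np.indices(gx.shape)
    planes[lo, rows, cols] += strength * (1 - frac)
    planes[(lo + 1) % n_dir, rows, cols] += strength * frac
    return planes
```

The resulting planes would then be blurred and resampled exactly as in the LDD feature sampling step.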

Deep convNet-based feature extraction
Generally, deep convNet-based feature extraction directly learns a discriminative representation from a raw pattern image. However, well-studied domain-specific knowledge, i.e., the directional feature map, has been shown to still be helpful for further improving the performance of handwritten character recognition. In this research, the domain-specific knowledge focuses on the LDD and GD features. Therefore, the convNet-based feature extraction method yields three types of feature vector, depending on whether the raw pattern image, the LDD feature maps, or the GD feature maps are used as the input to the deep convNet; these are denoted as the convNet-based-Raw, convNet-based-LDD, and convNet-based-GD features, respectively.
The architecture of the deep convNet employed to extract the three convNet-based features is shown in Fig. 4. It comprises six convolution layers and three fully connected layers. The character pattern is first transformed into a binary image and normalized to 64 × 64 pixels; then the normalized pattern image for convNet-based-Raw, the LDD feature maps for convNet-based-LDD, and the GD feature maps for convNet-based-GD, acting as input layers, are passed separately through a stack of convolution layers to capture the notions of left/right, up/down, and center. (19) The convolution stride is fixed at one, and the number of feature maps is gradually increased from 60 (layer-1) to 360 (layer-6). In our architecture, spatial pooling is implemented after every two convolution layers by max-pooling over a 2 × 2 window with stride 2.
After the stack of six convolution layers and three max-pool layers, the feature maps are flattened and concatenated into a vector with a dimensionality of 9000. Two fully connected (FC) layers (with 1500 and 200 hidden units) are then added. Finally, the softMax layer performs the N-way classification; note that N differs between HCCR and HJCR, i.e., 4037 for HCCR and 4438 for HJCR. The prediction is made by taking the class with the maximal probability for the given test pattern. The network's input is the 64 × 64 image with a dimensionality of 4096 for convNet-based-Raw, whereas the input dimensionality is d × n × n for convNet-based-LDD and convNet-based-GD, where d = 8 is the number of quantized directions and n = 64 is the size of each map, giving a dimensionality of 32768. The numbers of neurons in the network's remaining layers are the same for all three methods, as given by 216000-403680-100920-131220-150000-40560-36300-29160-9000-N.
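The per-layer neuron counts above can be reproduced by tracing feature-map sizes through the stack. Note that the kernel sizes (5 × 5 for layer-1, 3 × 3 afterwards) and ceil-mode pooling used below are assumptions inferred from the quoted counts; they are not stated in the text.

```python
import math

def convnet_neuron_counts(size=64):
    """Trace feature-map sizes through the 6-conv/3-pool stack and
    return (per-layer neuron counts, flattened dimensionality)."""
    maps = [60, 120, 180, 240, 300, 360]   # feature maps, layer-1..layer-6
    kernels = [5, 3, 3, 3, 3, 3]           # assumed kernel sizes
    counts = []
    for i, (m, k) in enumerate(zip(maps, kernels)):
        size = size - k + 1                # valid (unpadded) convolution
        counts.append(m * size * size)
        if i % 2 == 1:                     # pool after every two conv layers
            size = math.ceil(size / 2)     # 2x2 max-pool, stride 2, ceil mode
            counts.append(m * size * size)
    return counts, maps[-1] * size * size

counts, flat = convnet_neuron_counts()
# flat equals 9000, matching the flattened dimensionality quoted in the text
```

Under these assumptions the trace reproduces 216000-403680-100920-131220-150000-40560-36300-29160-9000 exactly, including the final 360 × 5 × 5 = 9000 flattened vector.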

FLDA
FLDA is well known and widely used in pattern recognition for feature reduction. We assume that there are c known character classes w 1 , w 2 , ..., w c , and that the i-th class has N i training samples among the total of N training samples.
The transformation matrix is obtained by maximizing Eq. (8), called the Fisher discriminant criterion, (20) where {w i | i = 1, 2, ..., m} are the m n-dimensional column vectors of the transformation matrix.
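A minimal numpy sketch of the FLDA compression step, with the within-class and between-class scatter matrices written out explicitly; the small regularization term is our own addition for numerical stability, not part of the original formulation:

```python
import numpy as np

def flda(X, y, m):
    """Reduce X (N x n) to m dimensions by maximizing the Fisher
    discriminant criterion, i.e., between-class over within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    n = X.shape[1]
    Sw = np.zeros((n, n))                      # within-class scatter
    Sb = np.zeros((n, n))                      # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean_all)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # generalized eigenproblem Sb w = lambda Sw w (regularized for stability)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(n), Sb))
    order = np.argsort(-evals.real)[:m]        # keep m largest eigenvalues
    W = evecs[:, order].real                   # n x m transformation matrix
    return X @ W
```

For the 512-dimensional LDD/GD vectors and the 9000-dimensional convNet-based vectors used later, m is the reduced dimensionality d chosen per experiment.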

Handwritten Character Classifier
To comprehensively compare the feature extraction methods, we employ the statistical classifier MQDF and the discriminative classifier, the multilayer perceptron (MLP), for HCCR and HJCR.

MQDF
We adopt the MQDF since it has been most often employed for HCCR and HJCR and is superior to the quadratic discriminant function (QDF) in terms of recognition rate, speed, and memory size. Kimura et al. proposed two types of MQDF: MQDF1 and MQDF2. (21) MQDF1 employs a Bayesian estimate of the covariance matrix instead of the maximum likelihood estimate, whereas MQDF2 is a version of the QDF smoothed by replacing the minor eigenvalues with a larger constant. In our system, MQDF2 is used for character classification. Given a d-dimensional feature vector denoted as x, the MQDF2 of class ω i is computed using Eq. (9), where μ i is the mean vector of class ω i , λ ij and ϕ ij ( j = 1, ..., d) are the eigenvalues and eigenvectors of the covariance matrix of class ω i , respectively, κ denotes the number of principal axes, and the minor eigenvalues are replaced with a constant δ i . The parameter δ i is set as a class-independent constant computed using Eq. (10), where tr(Σ i ) denotes the trace of the covariance matrix Σ i .
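Eq. (9) can be sketched for one class as follows, assuming the class eigenvalues are sorted in descending order; smaller values indicate a closer class. In practice δ i would be derived from tr(Σ i ) via Eq. (10).

```python
import numpy as np

def mqdf2(x, mu, eigvals, eigvecs, kappa, delta):
    """MQDF2 value of feature vector x for one class (smaller = closer).

    eigvals/eigvecs: eigen-decomposition of the class covariance matrix,
    sorted in descending order of eigenvalue; the minor eigenvalues beyond
    the kappa principal axes are replaced by the constant delta.
    """
    d = x.size
    diff = x - mu
    proj = eigvecs[:, :kappa].T @ diff         # projections on principal axes
    maha = np.sum(proj**2 / eigvals[:kappa])   # Mahalanobis term
    residual = diff @ diff - np.sum(proj**2)   # energy outside the axes
    return (maha + residual / delta
            + np.sum(np.log(eigvals[:kappa]))
            + (d - kappa) * np.log(delta))
```

Classification then amounts to evaluating this quantity for every class and choosing the minimum.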
MLP
On a training set of N samples (x n , c n ), n = 1, ..., N (c n is the class label of x n ), the connecting weights of the MLP are adjusted to minimize the regularized squared error given by Eq. (13), where β is a coefficient of weight decay that excludes the biases, and t nk is the target value of class k for sample n, equal to 1 for the genuine class and 0 otherwise. The weights and biases are initialized to small random values and iteratively updated on the training samples by stochastic gradient descent (SGD) until the squared error is minimized.
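A one-layer sketch of this update rule follows (the full d-1500-200-N MLP applies the same pattern layer by layer through backpropagation); the sigmoid output and learning-rate value are illustrative assumptions:

```python
import numpy as np

def mlp_sgd_step(W, b, x, t, lr=0.01, beta=1e-4):
    """One SGD update of a single sigmoid layer under the regularized
    squared error E = ||y - t||^2 + beta * ||W||^2 (weight decay
    excludes the biases, as in the text).
    """
    z = W @ x + b
    y = 1.0 / (1.0 + np.exp(-z))               # sigmoid outputs
    err = y - t                                # t is 1 for the genuine class
    loss = float(err @ err + beta * np.sum(W * W))
    grad_z = 2 * err * y * (1 - y)             # d(squared error)/dz
    W = W - lr * (np.outer(grad_z, x) + 2 * beta * W)  # decay on weights only
    b = b - lr * grad_z                        # no decay on biases
    return W, b, loss
```

Iterating this step over the training samples drives the regularized squared error down toward its minimum.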

Experiments and Analysis
To evaluate the performance of feature extraction for HJCR, we use the Nakayosi database as the training data and the Kuchibue database as the testing data. For the evaluation of each feature extraction method for HCCR, we employ 5-fold cross-validation on the CASIA-OLHWDB1 database and report only the mean over the five folds. Considering that a statistical classifier can generalize better than a discriminative classifier in a large-category classification task, the MQDF is used in this section; the MLP is used only in Sect. 7 for comparison. The evaluation is implemented on a computer with an Intel Core 2 Duo E8500 CPU of 3.16 GHz and 8 GB of RAM.

LDD feature extraction
As for the LDD feature extraction, we set the size of the normalized pattern and direction plane to 64 × 64 pixels and the sampling interval to 8. As a result, we obtained 64 feature values from each direction plane and 512 feature values in total. We set the power of variable transformation to 0.5 to improve the Gaussianity of feature distribution.
To reduce the complexity of the MQDF classifier and improve the classification accuracy, the extracted feature vector is compressed to a lower dimensionality by FLDA. We set the reduced dimensionality d in the range from 100 to 160 in steps of 20, and the number of principal axes k of the MQDF classifier takes values from 10 to 60 in steps of 10. Table 1 shows the accuracies of character recognition based on the LDD feature extraction for HCCR and HJCR. The accuracies of Chinese character recognition range from 90.72 to 91.93%, as shown in Table 1(a), and the accuracies of Japanese character recognition range from 90.64 to 91.82%, as shown in Table 1(b). Therefore, the accuracy variations of HCCR and HJCR based on the LDD feature extraction fall into similar ranges.

GD feature extraction
The GD feature extraction is almost the same as that of LDD, except that the gradient vector at each pixel of the normalized pattern image, instead of the trajectory vector of the online character pattern, is decomposed onto the directional planes. To enable a fair comparison, the size of the normalized pattern image, the number of directional planes, and the sampling interval are set to the same values as those used in the LDD feature extraction described in Sect. 6.1. Similarly, to train the MQDF classifier, the 512-dimensional gradient feature vector is reduced to the same lower dimensionality d ranging from 100 to 160 in steps of 20. For each dimensionality, we take a different number of principal components from 10 to 60 in steps of 10, producing 24 different offline recognizers. Table 2 shows the accuracies of character recognition based on the GD feature extraction; the best accuracy for HCCR is 92.66%, as shown in Table 2(a), and the accuracies for HJCR vary from 92.37 to 93.43%, as shown in Table 2(b). The best accuracy based on the LDD feature extraction for HCCR, 91.93%, is slightly higher than that for HJCR, 91.82%, as depicted in Sect. 6.1. However, the best accuracy based on the GD feature extraction for HJCR, 93.43%, surpassed that for HCCR, 92.66%, by 0.77%.

Deep convNet-based feature extraction
For the three feature extraction methods, convNet-based-Raw, convNet-based-LDD, and convNet-based-GD, the same learning strategy and parameter settings are used for HCCR and HJCR. We minimized the cross-entropy loss on all the training data sets to optimize the deep convNet using SGD with a momentum of 0.9. The mini-batch size is 512, and the base learning rate is initialized at 0.005 for all trainable parameters of the entire deep convNet. The networks were trained for 5000 epochs, and the results obtained at every 100 training epochs are presented in Fig. 5. The best accuracies of the three convNet-based methods for HCCR and HJCR after 5000 epochs are given in Table 3. From Table 3 and Fig. 5, the deep convNets of the three methods converge around their best recognition accuracies: those of convNet-based-Raw are 95.87 and 95.45%, those of convNet-based-LDD are 96.12 and 95.83%, and those of convNet-based-GD are 96.22 and 95.83% for HCCR and HJCR, respectively. The deep convNet is an integration of feature extraction with recognition. For further evaluation, the feature extraction part of the optimized deep convNet, i.e., the first six convolution layers, is employed to extract a 9000-dimensional feature vector for each of the three convNet-based feature extractions for HCCR and HJCR. For each convNet-based feature extraction, similar to the steps described in Sect. 6.1, the 9000-dimensional feature vector is reduced to lower dimensionalities, and each dimensionality takes a different number of principal components to generate 24 different offline MQDF recognizers in total. Tables 4-6 show the accuracies of HCCR and HJCR when using the convNet-based-Raw, convNet-based-LDD, and convNet-based-GD feature extractions, respectively.
Tables 4(a), 5(a), and 6(a) show that the convNet-based-GD method for HCCR obtains the best accuracy of 94.97%, which is higher than those of convNet-based-Raw (94.53%) and convNet-based-LDD (94.73%) by 0.44 and 0.24 percentage points, respectively. Similarly, the convNet-based-GD method for HJCR also gains the best accuracy of 95.11%, outperforming those of convNet-based-Raw and convNet-based-LDD by 0.79 and 0.22 percentage points, respectively, as illustrated in Tables 4(b), 5(b), and 6(b).

Results and Discussion
In this section, we first analyze the impact of the characteristics of the database and different recognizers on the recognition accuracy. Then, we compare the results with those of related works.

Impact of characteristics of database on recognition accuracy
From Tables 1(a), 2(a), 4(a), 5(a), and 6(a), the LDD, GD, and three convNet-based feature extraction methods for HCCR with the CASIA-OLHWDB1 database achieved the best accuracies of 91.93, 92.66, 94.53, 94.73, and 94.97%, respectively. Similarly, the best accuracies when using the above five feature extraction methods for HJCR are 91.82, 93.43, 94.32, 94.89, and 95.11%, as given in Tables 1(b), 2(b), 4(b), 5(b), and 6(b), respectively. Comparisons of the best recognition accuracies among the five feature extraction methods for HCCR and HJCR are shown in Fig. 6, which reveals that convNet-based-GD is the most robust for handwritten character recognition among the five feature types. While the difference in the best recognition accuracy between HCCR and HJCR is 0.77% for the GD feature extraction, the corresponding differences for LDD and the three convNet-based feature extractions are only 0.11, 0.21, 0.16, and 0.14%, respectively; the gap for GD thus exceeds those of the other four methods by 0.66, 0.56, 0.61, and 0.63%.
To further determine why the difference in accuracy between HCCR and HJCR is apparently enlarged only for GD, we investigate the character patterns in the handwritten Chinese database (CASIA-HWDB1.0) and the handwritten Japanese database (TUAT HANDS). We find that the average length between two neighboring points is 18.5 pixels over all Chinese testing patterns and 10.3 pixels over the Japanese testing patterns. Then, on the basis of the average interval length between two neighboring points of each handwritten pattern, denoted as len, the testing character patterns of both Chinese and Japanese are divided into four groups: len < 5, 5 ≤ len < 10, 10 ≤ len < 15, and len ≥ 15. We used the best GD-based configurations of HCCR and HJCR (92.66% for HCCR and 93.43% for HJCR) to determine the recognition accuracy of each group, and the results are shown in Fig. 7. They reveal that the larger the average interval length between two neighboring points, the more seriously the accuracy is degraded, which indicates that the GD feature is sensitive to the sampling interval of handwritten patterns.
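The grouping statistic can be computed directly from an online pattern; the function names below are our own illustrative choices, and the group boundaries follow the four ranges above:

```python
from math import hypot

def mean_interval(strokes):
    """Average distance between neighboring points of an online pattern,
    the 'len' statistic used to group the test patterns."""
    d = [hypot(x1 - x0, y1 - y0)
         for s in strokes                       # stroke = list of (x, y)
         for (x0, y0), (x1, y1) in zip(s, s[1:])]
    return sum(d) / len(d)

def interval_group(length):
    """Map len to one of the four groups: <5, 5-10, 10-15, >=15."""
    return min(int(length // 5), 3)
```

For example, the average Chinese interval of 18.5 pixels falls into the last group, while the average Japanese interval of 10.3 pixels falls into the third.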

MLP
The MLP is a natural alternative to the MQDF in handwritten character recognition. Considering that the recognition part of the deep convNet given in Fig. 4 is composed of two FC layers (with 1500 and 200 hidden units) and a softMax layer, we construct an MLP of four layers identical to the recognition part of the deep convNet. The number of neurons of each layer of the MLP is given by d-1500-200-N, where d is the dimensionality of the feature vector and N is the number of classes, i.e., 4037 for HCCR and 4438 for HJCR.
To enable a fair comparison with the MQDF, the same feature vectors as those of LDD, GD, convNet-based-Raw, convNet-based-LDD, and convNet-based-GD used for the MQDF are employed. The 512-dimensional feature vectors of the LDD and GD features and the 9000-dimensional feature vectors of the three convNet-based methods are reduced by FLDA to lower dimensionalities from 80 to 160 in steps of 20. The accuracy trends of the MLP for HCCR and HJCR are shown in Figs. 8 and 9, respectively. For each feature extraction method, the highest accuracy when using the MLP is slightly lower than that when using the MQDF for both HCCR and HJCR. As with the MQDF, the convNet-based-GD feature extraction again shows the highest robustness and discrimination among the five methods when the MLP is used as the classifier for HCCR and HJCR.

Comparison with related works
To enable a comparison with related works, the above experiments are repeated employing the test data set used in the ICDAR 2010 and 2013 competitions for HCCR. (23) In this paper, we report the results of HCCR using this test data set, targeting the 3755 Chinese characters of the GB2312-80 first-level set. Note that our evaluation for HCCR in Sect. 6 included not only these 3755 Chinese character classes but also numbers and symbols, for a total of 4037 categories. The comparison with related works on HCCR is given in Table 7. The first to third methods used multiple strategies to improve their performance, such as a multiple supervised training strategy and a deep convNet; however, our method of using a deep convNet integrated with domain-specific knowledge still outperforms them. Although domain-specific knowledge is already used in the fourth method, our convNet-based-GD method outperforms it as well. The comparison with related works on HJCR is given in Table 8; our convNet-based-GD method also outperforms them in HJCR.
Therefore, our proposed method of treating the LDD and GD features as domain-specific knowledge and combining them with the convNet outperforms traditional feature expressions for handwritten character recognition and further improves the recognition accuracy in both HCCR and HJCR.

Conclusions
For HCCR, the three convNet-based methods obtained the highest accuracies of 94.53, 94.73, and 94.97%, whereas the accuracies of the two other types, LDD and GD, were 91.93 and 92.66%, respectively. Similarly, for HJCR, the accuracies of the three convNet-based methods were 94.32, 94.89, and 95.11%, whereas those of LDD and GD were 91.82 and 93.43%, respectively. Therefore, convNet-based feature extraction is the most robust and discriminating compared with the two traditional methods, LDD and GD, for both HCCR and HJCR. The two convNet-based methods with domain-specific knowledge, convNet-based-LDD and convNet-based-GD, obtained the best accuracies of 94.73 and 94.97% for HCCR, and 94.89 and 95.11% for HJCR, respectively, whereas the accuracies of convNet-based-Raw without domain-specific knowledge were 94.53 and 94.32% for HCCR and HJCR, respectively. These results demonstrate that domain-specific knowledge makes the convNet-based method more efficient and effective for HCCR and HJCR.