Pyramidal Image Segmentation Based on U-Net for Automatic Multiscale Crater Extraction

To extract craters with a radius greater than 10 km more effectively from lunar digital elevation maps, pyramidal image segmentation based on the U-Net model is proposed, and the conversion relationship between the multilayer image pyramid and the geographic coordinates of the crater is established. The crater image pyramid method ensures the full coverage of the study area with a small number of images and that each crater exists in several images with different resolutions. The proposed method can effectively improve the detection performance of large-scale craters and solve the migration problem when stitching together craters from largescale images. This method recovered 85.48% of the craters with a radius greater than 10 km in an artificially annotated dataset, found 1044 new craters, and extended the maximum radius of detected craters from 72 km in randomly cropped image segmentation to 200 km. It was estimated by visual interpretation that approximately 82.09% of these new craters are real. Also, the recall reaches 90.17% when the new real craters are added to the true craters.


Introduction
Craters formed by impact activity are an important visual feature of many celestial bodies, providing a reference for stellar evolution and age assessment, as well as for the backward reasoning of the state of crater descent. (1)(2)(3) With the development of deep space exploration activities, craters with different shapes and ages densely distributed on the surfaces of celestia bodies have become natural landmarks for probe landings. Studying the size and distribution of craters is of great significance to ensure that landing vehicles avoid such obstacles. (4)(5)(6) Furthermore, the digital elevation maps (DEMs) generated by the fusion of remote sensing images from the SELENE (Kaguya) terrain camera and topographic height data from the Lunar Orbiter Laser Altimeter (LOLA) mounted on the Lunar Reconnaissance Orbiter (LRO) have a wide coverage, laying the foundation for a more effective extraction of craters on the rugged lunar surface. (7) In traditional methods, visual interpretation based on visual characteristics is mainly used for crater identification. This method is time-and energy-consuming, and the dataset produced cannot cover all craters on the moon. (3,8,9) Moreover, different researchers have different evaluation standards for craters, which has an impact on the data. (1,2,10) Many crater detection algorithms (CDAs) have been proposed for the automation of crater recognition, mainly (1) edge detection, such as the Canny operator; (11,12) (2) classifiers, such as support vector machines (13) and the AdaBoost classifier; (8) (3) convolutional neural networks (CNNs), including object detection and semantic segmentation, where models used in object detection include Faster-RCNN (14,15) and ZF-net, (14) and models used in semantic segmentation include U-Net. (2,16) Compared with traditional visual interpretation, these CDAs greatly improve detection efficiency. However, the effectiveness of the extraction of the first two algorithms is easily affected by non-crater edges, illumination, and overlap, and they have low detection performance for complex craters, especially those that have severely degraded over time. The above CNNs reduce the impact of these factors and have achieved good performance in the field of crater extraction owing to their ability to automatically extract features through iterative training. However, object detection is judged on the basis of many features of craters, and it is not conducive to the accurate extraction of the center and radius. Semantic segmentation can directly determine the location and size of a crater by identifying the edge pixels. In semantic segmentation, the generation of input images in the U-Net model (2) adopts the random cropping method, which selects a random position and random size between 500 and 6500 pixels to crop the image. This method can cover multiscale craters but requires many cut images to cover the entire area and all craters, and the effectiveness of extraction by these methods gradually decreases for craters with a radius greater than 10 km. In Ref. 16, the study area was regularly divided into images of the same scale. This reduced the number of cropped images required to cover the entire area, but the uniform-scale images limited the complete segmentation of large craters, making it difficult to effectively extract multiscale craters.
Image pyramids can process multiscale targets, and the effect of detecting the features of local targets is mainly enhanced by reducing the resolution of large-scale images. (17) In combination with the features of a multilevel image pyramid, (18) the effect of noise caused by a scale change is reduced and the detection effect is enhanced. (19,20) Toward solving the above-mentioned multiscale crater detection problem, we propose a pyramidal image segmentation method based on U-Net to automate multiscale crater extraction. This method first builds an image pyramid and integrates the craters of each layer of the image pyramid extracted by U-Net to reduce the impact of the scale on the extraction of craters; the conversion relationship between the segmented images of the multilayer image pyramid and the geographic coordinates of the craters is established to solve the problem of non-offset splicing of craters in large areas.
The rest of the paper is organized as follows. Section 2 introduces the data sources, proposed method, and performance metrics. The experimental results are presented in Sect. 3, and the factors affecting the results are discussed. Finally, some conclusions are drawn in Sect. 4.

Methods
The proposed pyramidal image segmentation based on U-Net is divided into four steps. First, the image pyramid is constructed by cropping the lunar DEM. Second, edge pixel segmentation is performed on the image pyramid by the trained U-Net model to determine whether each pixel in the image is the edge of a crater. The basic structure of U-Net is shown at the top of Fig. 2. U-Net is mainly divided into three parts: the contraction path, expansion path, and jump connection. The contraction path, used to extract features at different levels, contains three groups of operations, each of which includes two convolution layers and a max-pooling layer. The extended path, used for feature fusion, consists of three groups of operations, each of which includes an upsampling layer, a merge layer, a dropout layer, and two convolutional layers. The jump connection is composed of two convolutional layers, which are used to connect the contraction and expansion paths. Third, the image with the edge features of the crater uses the template matching algorithm for circle matching. If the matching is successful, the center position and radius of the circle are saved. Finally, the craters in each layer of the image pyramid are merged by establishing the conversion relationship between the image and the geographic coordinates of the crater, and then the duplicate craters are removed. The framework of pyramidal image segmentation based on U-Net for automatic multiscale crater extraction is shown in Fig. 2.

Image pyramid
As shown in Fig. 3(a), random image cropping is used in the original lunar crater identification method with U-Net. (2) A certain coordinate (a range in the image area) and a size (between 500 and 6500 pixels) are randomly selected from the 30720 × 30720 pixel lunar DEM for image cropping. Then, the scale of the cropped images is adjusted to 256 × 256 pixels. However, this cropping method covers a small area and often requires many images to cover the test area and craters of all scales.
In this paper, to detect more multiscale craters using fewer cropped images, we propose a crater image pyramid method to ensure that a small number of images can cover the original DEM image, and each crater object exists in multiple images with different resolutions to increase the probability of extraction. The operation of the crater image pyramid method is as follows [ Fig. 3(b)]. First, the 30720 × 30720 pixel lunar DEM test area is sequentially divided into several pyramid images of different scales, and the original size of the cropped image and the pixel coordinates ( ) 0 0 , X Y of the upper left corner in this image are saved. In the crater image pyramid, the first layer is the complete 30720 × 30720 pixel image, and the second and third layers are four 15360 × 15360 pixel images and sixteen 7680 × 7680 pixel images, which are regularly cut from the 30720 × 30720 pixel image, respectively. These images with different scales are then downsampled to a uniform size of 256 × 256 pixels. The final generated pyramid images are sent to the deep learning network model for edge pixel segmentation.

Establishment of the conversion relationship
A template matching algorithm is required for the images predicted by U-Net to extract the predicted location and size of each crater. (2) As shown in Fig. 4, in the template matching process, a template circle with radius r is defined, and the circle slides from left to right and from top to bottom in the CNN prediction image until the entire image is traversed. If the matching is successful, the center pixel coordinates (x pix , y pix ) of the current template circle are saved, and the pixel radius r pix of the matched crater is the radius r of the current template circle.  To adapt the coordinates of the current image relative to the lunar DEM, the pixel coordinates and pixel radius are converted into geographic coordinates and the radius in km by considering the dimensions of the entire DEM image. Note that there is an offset problem when using the original method to restore the position of the crater. This is because the image is distorted by the conversion from the PlateCarree projection to the orthographic projection. To change the distorted image to a uniform size, the edges of these images are filled. However, the coordinates of the four end points of the image are not defined as changed points. When the original size of the cropped image is larger, the distortion coefficient is larger after projection conversion, and the error of coordinate conversion will be larger. Hence, the error will eventually affect the correct restoration of the location of the crater.
To eliminate the effect of graphic distortion and solve the offset problem of coordinate conversion, we establish the conversion relationship between the multilayer image pyramid and the geographic coordinates of the crater. The projection transformation of the image is removed, the coordinates of the upper left corner of the current image matrix are used as the benchmark, the pixel matrix coordinates of the crater are converted to the pixel coordinates by considering the entire DEM image, and then the pixel coordinates are converted to geographic coordinates. The radius is restored to the global pixel coordinates and converted to km. The formulas used to establish the transformation relationship are as follows.
Here, (X 0 ,Y 0 ) and size are the pixel coordinates of the upper left corner and the size of the pyramid image, respectively. R moon = 1737.4 km. (x pix , y pix , r pix ) denotes the pixel coordinates and the radius of the crater for the current image matrix. (X degree , Y degree , R km ) denotes the latitude and longitude coordinates, and the radius in km in the original image. α and β are the parameters used to convert pixels to degrees and pixels to km, respectively.
There is a duplication phenomenon after fusing the craters extracted from the multilayer image pyramid owing to the same crater target existing in the multilayer image pyramid. Therefore, repeated craters should be removed after the crater splicing. The formulas used to identify repeated craters are as follows. ( Here, Lo, La, and R are the longitude, latitude, and radius, respectively. i and j represent two craters, and γ is the parameter used to convert km to degrees. D Lo,La and D R respectively represent the longitude and latitude thresholds and the radius threshold for judging whether the two circles are duplicate, where D Lo,La = 2.6 and D R = 1.8.

Metrics
To verify the accuracy of the extracted craters, we compare them with the ground-truth craters. Precision and recall are used to determine the performance of the method.
Here, TP is the number of true positives, i.e., predicted craters that match one of the ground-truth craters. FP is the number of false positives, i.e., predicted craters that do not exist in the groundtruth dataset. FN is the number of false negatives, i.e., real craters that the model mistakenly identifies as not existing.

Results and Discussion
We applied the proposed method to the testing area with the goal of extracting more craters with a radius of more than 10 km, and this extraction step was mainly completed in the testing phase.
Firstly, 30000 randomly cropped training data were input to train the U-Net network with 20 epochs to learn the crater edge features from them, and the weight files were retained. Then, 21 pyramid images in the test area of the lunar DEM were obtained, where the first layer has one image of 30720 × 30720 pixels, the second layer has four images of 15360 × 15360 pixels, and the third layer has sixteen images of 7680 × 7680 pixels. These pyramid images were scaled down to 256 × 256 pixels. Thirdly, the U-Net model loaded with weight files was used to segment the pixels of the crater edge in these pyramid images. Subsequently, the location and size of each crater were extracted by a template matching algorithm, and the transformation relationship was established and all the crater directories were fused.
We performed two experiments to verify the effective performance of the proposed method. In one experiment, the projected pyramid image and the original crater extraction method were used to determine the global position and size, and in the other experiment, 21 randomly cropped images obtained from the lunar DEM in the same area as the pyramid image were used, i.e., the input images were different.
As shown in Fig. 5, the conversion relationship established in this paper makes the extracted and true craters approximately coincide. However, some extracted crater locations recovered using the original method deviated from the true locations, with low or even no coverage of some craters. Thus, it is proved that the proposed coordinate transformation method is capable of more accurately recovering the coordinates of extracted craters. In the original coordinate conversion method of projected images, the distortion after projection is the main factor affecting its accuracy. In the distorted and edge-filled image, the farther the coordinates are from the center, the greater the distortion. The latitude and longitude coordinates, and pixel coordinates of the upper left and lower right corners are not defined as changed values but do actually shift. In the coordinate conversion, these coordinates and distortion factors are introduced into the calculation, resulting in errors. The method proposed in this paper does not need to perform projection or add distortion factors, and the coordinates of the endpoints introduced during the calculation are the actual coordinates. This method greatly reduces the deviation of the crater position. The slight deviation observed in this method is affected by the use of the mean radius of the irregular moon. As shown in Table 1, random cropping extracts fewer craters than ground-truth craters, while the image pyramid method extracts more craters than ground-truth craters. Compared with the original method, the recall of the image pyramid method is increased by 6%, whereas the precision is reduced. This is because the 1044 new craters discovered by the method are regarded as false positives, which affects the precision. It is estimated by visual inspection that 82.09% of the new craters are actual craters according to the standards of ground-truth craters, while for random cropping, it is only 56.60%. After recording these new real craters (TN) as TP, the recall of the image pyramid is increased by 9%, and the precision is slightly higher than that of the original method. This demonstrates that the image pyramid method is effective for detecting new craters. In addition, the image pyramid method expands the range of the extraction radius from 10-72 to 10-200 km, which demonstrates that it effectively improves the performance of extracting craters with a radius greater than 10 km. Figure 6 shows the craters extracted by the two methods in the same area. It is clear that the number of craters extracted by the image pyramid method more closely matches the number of ground-truth craters than that extracted by the original method. In particular, the number of new craters detected is about nine times that for the original method. This is attributed to the combination of three images of different scales in the same area in the image pyramid method, so as to extract craters of different scales in different layers (Fig. 7), where the first, second, and third layers can extract craters with a radius of 42 km or more, 21 km and more, and 10 km and more, respectively. The extraction results clearly reflect the effectiveness of using the pyramid image method on the same area.  The craters extracted by pyramidal image segmentation based on U-Net are compared with ground-truth craters (Fig. 8). The proposed method can only recover craters with a radius of less than 200 km because the edges of ground-truth craters with a radius of more than 200 km are not obvious. When the radius is within 200 km, except for the radius ranges of 25-35 and 75-100 km, the number of craters recovered by the pyramid image method is greater than or equal to the number of artificially labeled craters. However, some detected craters with slight deviations from the actual size are counted in other ranges, which affects the statistics of craters in this range. The proposed method finds many new craters in general. In particular, the number of craters detected by the proposed method is twice that of ground-truth craters when the radius is 10-15 km. For some undetected craters, fuzzy edges and irregularly shaped craters will affect the effectiveness of model prediction. For misdetected craters, the edge of an almost circular adjacent crater is regarded as a crater by the proposed method.

Conclusions
In this work, pyramidal image segmentation based on U-Net for automatic multiscale crater extraction was proposed, and the effective performance of the method was verified by comparison experiments. 1) It was demonstrated that the accuracy of multiscale crater extraction is improved by pyramidal image segmentation based on U-Net. The problem of large craters being cropped into multiple image blocks caused by random image cropping is reduced though the crater image pyramid method, and the extraction performance of multiscale craters is enhanced by fusing multilayer pyramid images. This method recovered 85.48% of the craters with a radius greater than 10 km from an artificially labeled dataset, and the recall was increased by 6% compared with the original method. The proposed method extends the radius of craters that can be extracted to 200 km on an equal number of images, compared with a maximum of 72 km for the original method. This method discovered 1044 new craters; in particular, the number of extracted craters with a radius of 10-15 km was about twice as large as that of artificially labeled craters. It was estimated that 82.09% of the craters were recovered, and recall reached 90.17% when the real new craters were included. 2) The conversion relationship between the multilayer image pyramid and the geographic coordinates of the crater was established to solve the problem of coordinate offset caused by the projection error in the fusion of craters extracted from different images.