Point Cloud Registration Using Intensity Features

1Dept. of Computer Science and Information Engineering, National Yunlin University of Science & Technology, No. 123, University Rd., Section 3, Douliou, Yunlin 64002, Taiwan, R.O.C. 2Intelligence Recognition Industry Service Research Center, National Yunlin University of Science and Technology, No. 123, University Rd., Section 3, Douliou, Yunlin 64002, Taiwan, R.O.C. 3Dept. of Electrical Engineering and Graduate School of Engineering Science and Technology, National Yunlin University of Science & Technology, No. 123, University Rd., Section 3, Douliou, Yunlin 64002, Taiwan, R.O.C.


Introduction
Light detection and ranging (LiDAR) is an important method for 3D object reconstruction. With the drop in price, LiDARs are becoming widely used sensors and the point cloud obtained by LiDAR is also widely used in many state-of-the-art approaches. Recently, the reconstruction of 3D objects/scenes has been an important research topic in computer vision. In general, applications of 3D reconstruction can be divided into two categories: one is reconstructing scene environments, e.g., building interiors, street views, or natural scenes. The other is constructing a single object, e.g., an item of furniture, a vehicle, or a factory product. For digital archive application, the details of the interior and exterior of historical buildings have to be scanned and modeled for repairing, maintenance, and restoration. Usually, 3D scene reconstruction involves combining a stream of point clouds to establish a city scene. The stream of point clouds is captured by a LiDAR device mounted on a mobile platform or installed on a flight device to build city models and rugged ground surfaces.
A LiDAR device is an optical remote sensing device that uses laser light to measure the distance of a target. LiDAR can measure distances with high accuracy, identify the shape of an object, and establish the surrounding 3D geographic information model. It has the advantages of high-volume range measurement, high accuracy, high discrimination, and robustness to varying illumination. Generally, the horizontal field of view (FOV) is 360 degrees, but the vertical FOV depends on the type of LiDAR device. A LiDAR device with a wide vertical FOV and high vertical angle resolution is very expensive. For example, the cost of a 64-layer LiDAR device is ten times that of a 16-layer LiDAR device. Usually, applications of a LiDAR device depend on its vertical FOV and vertical angle resolution since the distance between the two scanning layers is large at long distances. A sparse point cloud cannot completely present the appearance of an object, resulting in a decrease in the degree of recognition and an increase in the difficulty of subsequent data processing and analysis.
To increase the vertical FOV of a LiDAR device for scanning the interior of a historical building, several point clouds captured at a fixed position and varying heights are aligned and merged together to form a point cloud with a wider vertical FOV.
In this paper, the proposed registration method only uses intensity features to improve the accuracy of extending point clouds by the iterative closest point (ICP). The proposed algorithm scans the source and the target with a fixed position and varying height, extracts the intensity features from both point clouds and then derives a better approximate transformation between the two point clouds by ICP. This paper consists of five sections. First, the motivation of this research is introduced. In Sect. 2, the related works are reviewed. The detail of the proposed method is given in Sect. 3. A simulation and conclusion are given in Sects. 4 and 5, respectively.

Related Works
Many algorithms (1)(2)(3)(4)(5) for 3D reconstruction have been proposed. The key to 3D reconstruction is aligning several point clouds with partial overlapping into a common coordinate system. The major task of alignment is to find the optimal transformation between the point clouds. By merging several aligned point clouds, the 3D environment or the 3D object can be reconstructed. Depending on the searching schemes used in alignment, the existing methods can be categorized into global-registration-based methods and local-registrationbased methods. The global-registration-based methods use all points in the source to find the corresponding points in the target and then compute the approximate transformation matrix. In Ref. 1, Cordón et al. used the evolution algorithm to find the transformation and the gene algorithm used by Silva et al. (2) The disadvantage of this type of method is that its computational cost is too high. Another kind of local registration method is more commonly used in most studies and it usually uses the features of a 3D model, e.g., curves or planes, to find the corresponding features. (3)(4)(5) However, the most popular method for alignment, the ICP algorithm, has several varieties. Some ICP-based approaches (6,7) use corresponding point pairs to find the transformation. Others (8)(9)(10)(11)(12)(13)(14)(15)(16)(17) use corresponding features, e.g., planes or curves, to find the transformation of 3D scenes or other features between the reference and the source. Thus, ICP can be categorized into both the above types.
ICP is reviewed as follows. The ICP algorithm, (6) proposed in 1992 by Besl and McKay, consists of several steps: finding the closest point in the data shape for each point in the model shape, then estimating the rotation and translation by a mean squared error (MSE) cost function and transforming the data shape with respect to the obtained rotation and translation. Finally, the best result of alignment is obtained by iterating the above process until the transformation is negligible.
Let A(x j , y j , z j ) be a point in A with m points and B(x k , y k , z k ) be a point in B with n points. The MSE is defined as the summation of the Euclidean distances of the closest point with respect to all points in A.
When deriving the transformation, the cross-covariance matrix (CCM), the mean product of the closest point vectors, is used. CCM is defined as follows: where is the center point of A. P, a 3 × 3 matrix representing the difference between CCM and the its transpose, can be written as A quaternion matrix Q is defined as Therefore, the translation vector is After an iteration of ICP, MSE is recalculated to determine whether the next iteration is necessary. If the change of MSE is negligible, the optimal transformation is obtained and then ICP stops.

Scanning the source point cloud and the target point cloud
The goal of this paper is to extend the vertical FOV of a point cloud by merging several point clouds. The source point cloud and target point cloud are individually captured in the first and second steps. In the second step, the position of the LiDAR device is fixed but its height is increased to scan the extra area. Since the laser of the LiDAR device is invisible, the exact scanned position is unknown. Therefore, two black patterns are put up on two non-coplanar walls for separate calibration. In contrast to the white walls, for which the intensity of reflected light is strong, the black patterns made of a nonwoven fabric have low reflection and are easy to extract in point clouds with intensity information. The first scanned point cloud is shown in Fig. 2, in which blue represents stronger intensity and red represents weaker intensity.
Note that if the two patterns are put on the same wall, the obtained transformation is singular. In other words, the translation vector and rotation matrix between the two point clouds are uncertain. Thus, the feature blocks are put up on two non-coplanar walls.

Extracting the intensity features
In general, when a LIDAR device captures a scanned scene, the raw data includes the distances and signal intensity values of scanned points. The reflection intensity will depend on the material of the object. When the laser light hits a smooth and nontransparent plane, the intensity of the reflection is greater than that on a plane with a matte surface. Therefore, if some blocks with low-reflectivity materials are set in one scene, the characteristics of the scanned data are used as features for better calibration.
In our experiments, the non-woven fabric is used as the feature block when scanning and the range of reflection intensity values is measured in advance. The feature size used for scene scanning is 30 × 30 cm 2 . Because the intensity value of the feature points is much lower than that of the surrounding points, the calibration pattern can be easily extracted by the predetermined thresholds. The classification of points can be expressed as where S CP is the point set of the calibration pattern and I max and I min are the predetermined intensity thresholds. The point cloud data filtered by the intensity threshold will leave some noise points with similar intensity values. The noise points are removed by an outlier removal method. The result is shown in Fig. 3.

Deriving the transformation by ICP
Let the extracted calibration pattern of the source point cloud be S CP1 with m points and the extracted calibration pattern of the target point cloud be S CP2 with n points. These two point sets are the input of ICP instead of two whole point clouds. After carrying out the procedure introduced in Sect. 2.3, the optimal transformation translation vector T and rotation matrix R can be derived. Trapping at local minima is a big issue in ICP and it happens when the initial point clouds are markedly different. If some known information can be used to roughly adjust the two point clouds first, a better result can be achieved. Therefore, the offset of height can be added to the points of the target point cloud before ICP to obtain a suitable initial position.

Transforming and extending point clouds
After obtaining the translation vector and rotation matrix between the two point clouds by ICP in the previous subsection, the target point cloud P T can be transformed into the same coordinate system as the source. The new position of P T can be calculated by where P T (i) is a point of the target point cloud and ( ) ( ) T is the new position of the point in the coordinate system of the source point cloud. Then, the extended point cloud is obtained by deleting the overlapping points of the transformed point clouds.

Experimental Results and Analysis
The proposed algorithm is simulated by using data captured by a Velodyne VLP-16 LiDAR device mounted on a tripod, as shown in Fig. 4. A Bosch DLE 40 laser rangefinder is also used to measure the differences in height. The test environment is a meeting room, which is shown in Figs. 5(a) and 5(b).
In Fig. 6, the points marked in white are the source point cloud and the points marked in red are the target point cloud. To show the true relationship between the two point clouds, the measured height is added to the origin of the target. Thus, the red point cloud is considered as the ground truth. In the first experiment, ICP is used for aligning the two whole point clouds. In Fig. 7, since the ground and the wall dominate the alignment, the extra part of the target is misaligned with the top of the source point cloud. The result of the proposed algorithm using ICP with only the intensity feature block is shown in Fig. 8. The position of the extra part of the target is maintained when aligning. Another simulation was carried out using the pre-known height, which was added as compensation to the coordinates of the points of the target point cloud before ICP. The result is shown in Fig. 9.

Conclusions
Since ICP depends on the similarity of the larger parts, e.g., the ground and the wall, the alignment sometimes is incorrect because of the tyranny of the majority. In this paper, an ICPbased registration method only using intensity features is proposed for increasing the vertical FOV of point clouds. The proposed algorithm scans the source and the target with fixed position and varying height, extracts the intensity features from both point clouds, and then derives a better approximate transformation between the two point clouds by ICP.
In simulations, the average error of alignment of the proposed system is less than 16 cm in a 6 × 6 m 2 room. An improved method using the premeasured height is also tested, where the premeasured height is added to the coordinates of the points of the target point cloud before ICP to improve the performance of the proposed algorithm. The average error of alignment of the proposed system using the premeasured height is less than 12 cm in a 6 × 6 m 2 room.