Constrained Building Boundary Simplification Based on Partial Total Least Squares Method

Maps obtained from various sensors or surveying instruments provide important geographic information for users. Buildings are critical components of various city-area maps. In this paper, we present a constrained building boundary simplification method based on the partial total least squares (PTLS) method. Simplification of building boundaries may cause some data quality problems, such as geometric displacements, right angles changed into non-right angles, and area inconsistency at different scales. Therefore, in this paper, a linear fitting model based on the PTLS method is constructed for building boundaries with the aim of reducing the total positional difference. Furthermore, end-point, right-angle, and area constraints are constructed at the same time to maintain the right angles and areas of the building. The simplification results are obtained by solving the proposed constrained model by using the PTLS method with constraints. The proposed method was applied in a building boundary simplification experiment. The results showed that the proposed method maintains the right angles and areas of buildings and reduces the geometric displacements of buildings after the simplification.


Introduction
Various remote sensing images or surveyed data provide substantial information for different kinds of maps. Map generalization is used to produce small-scale maps from large-scale maps. The simplification of buildings, which are critical components in city-area maps, is one of the important operations in map generalization. However, the simplification of building boundaries may cause some data quality problems such as changes in angles, areas, and geometric displacements, resulting in large data inconsistencies in the geometric information of buildings at different scales. Therefore, in this paper, we focus on building boundary simplification with data quality constraints.
Generally, buildings can be regarded in one way as polygons and in another way as closed polyline objects. In the case of buildings regarded as polygons, significant works have been carried out on building polygon simplification methods, such as those based on mathematical morphology, pattern recognition, (1)(2)(3) and building skeleton lines, (4) and approaches that establish the shape templates of buildings. (5) To model various data quality constraints in the process of building simplification, here we adopt the second way, which is to regard buildings as polyline features. There have been many classic methods of line simplification, such as the Douglas-Peucker (6) and Li-Openshaw (7) methods. In recent years, some scholars have proposed algorithms for simplification based on the Delaunay triangulation model to establish the spatial relationship between points on line elements, (8) as well as line simplification algorithms based on machine learning. (9,10) However, the boundaries of buildings have specific geometric characteristics. The simplification of building boundaries will cause some problems such as changes in the right angles of buildings, changes in areas, and geometric displacements. An increasing amount of research has paid attention to the effects of boundary simplification on data quality. (11) Bose et al. proposed a polygon simplification method with an additional area change threshold to reduce the area changes caused by line element simplification. (12) Buchin et al. proposed a polyline feature simplification method based on moving edge technology to maintain an area. (13) The least squares (LS) method and its extensions, which are often used to derive optimization results under geometric constraints, are also widely used in the simplification of polyline features. (14)
Sester proposed a method of building simplification based on the LS model, which enhanced the geometric characteristics of simplified buildings, such as rectangularity, by transforming building polygons into a parametric solution model. (15) Liu et al. proposed a building simplification method based on the LS linear fitting model with rectangularity constraints, which effectively maintained the rectangularity of buildings and reduced point displacements after the simplification. (16) Jiang proposed a building simplification method based on LS adjustment with rectangularity and area constraints. (17) The classic LS method generally only considers the errors in the observation vector. However, in some cases, the design matrix of the functional model also contains errors. For example, in the linear fitting model, the coordinates of points that contain errors also exist in the design matrix. Therefore, the total least squares (TLS) method, which simultaneously considers the errors of all variables in the model, (18) has attracted much research attention recently. Tong et al. proposed an area-preservation approach for polygonal boundary simplification based on the partial total least squares (PTLS) method. (19) In this paper, constrained building boundary simplification based on the PTLS method is proposed. In the proposed method, a linear fitting model is adopted to reduce the total positional displacements of building points after the simplification. In addition, end-point, right-angle, and area constraints for the boundary of the building are constructed. This partial total least squares method with constraints (PTLSC) is used to solve constrained building simplification problems. Therefore, the purpose of this study is to develop a constrained building simplification method that maintains the rectangularity and areas of buildings after simplification and reduces the total positional changes of all the points on building boundaries.
The rest of this paper is organized as follows. In Sect. 2, detailed descriptions of the proposed constrained building boundary simplification method based on the PTLSC are presented. Experiments and discussion are presented in Sect. 3. Finally, in Sect. 4 we conclude this paper.
Figure 1 illustrates the framework of the proposed building simplification approach in this paper. The proposed approach consists of three main parts. (1) Establishment of the PTLS fitting model with the aim of reducing the total positional changes after the simplification: The building boundary is first divided into sub-polylines considering geometric characteristics using the method in Ref. 17. Then the PTLS linear fitting model is established for each sub-polyline. (2) Construction of the data quality constraints: End-point, right-angle, and area constraints are formulated to preserve the geometric characteristics of the buildings. (3) Derivation of the solution of the constrained building simplification model using the PTLSC: The aim is to derive the optimal simplification result that both meets the data quality constraints and minimizes the total positional changes of the buildings.

Constraint models
At a specific map scale, building points can be generally classified into key points that control the major building outline and other intermediate points. Therefore, it is necessary to first detect these key points. The building boundary can then be divided into segments using these key points. Building points in each segment constitute a sub-polyline.
In this paper, the method in Ref. 17 is used to segment building boundaries, as illustrated in Fig. 2.

End-point constraints
After segmenting a building boundary using the key points, a PTLS linear fitting model is constructed for each sub-polyline to reduce the overall geometric displacements of all the points after the simplification. (19) However, the segmented fitting model may cause the adjacent fitted line segments at the key points not to meet with each other [as shown in Fig. 3(a)]. If two neighboring fitted lines are extended to intersect with each other, it may cause geometric distortions. (19) Therefore, it is necessary to construct end-point constraints at the key points to achieve tight closure of adjacent fitted line segments under the condition of minimum geometric displacements. Figure 3(b) illustrates the simplification result with end-point constraints.
Fig. 2. Segmentation of a building boundary.
Assuming that point $(x, y)$ is the end point of the preceding sub-polyline and the starting point of the subsequent sub-polyline, the end-point constraint at this point can be expressed as (19)
$$a_1 (x - v_x) + b_1 = a_2 (x - v_x) + b_2, \tag{1}$$
where $(a_1, b_1)$ and $(a_2, b_2)$ are the parameters of the two neighboring fitted line segments, and $v_x$ is the residual of the x-coordinate, so that $x - v_x$ is the adjusted x-coordinate.
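As a minimal sketch of what an end-point constraint checks, the following Python snippet evaluates the misclosure between two neighboring fitted lines at a shared adjusted x-coordinate. The function name `end_point_misclosure` and the sample parameters are illustrative assumptions, not part of the paper, and the lines are assumed to have the form y = a x + b.

```python
def end_point_misclosure(x_adj, a1, b1, a2, b2):
    """Difference in y between two fitted lines at the adjusted x-coordinate.

    The end-point constraint holds when this value is zero, i.e. both lines
    pass through the same point at x_adj.
    """
    return (a1 * x_adj + b1) - (a2 * x_adj + b2)


# Hypothetical parameters: y = x and y = -x + 2 meet at (1, 1).
closed = end_point_misclosure(1.0, 1.0, 0.0, -1.0, 2.0)    # lines meet here
open_gap = end_point_misclosure(1.2, 1.0, 0.0, -1.0, 2.0)  # lines do not meet here
```

In the constrained adjustment, this misclosure is driven to zero simultaneously with the fit, rather than by extending the two lines afterwards.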

Right-angle constraints
Generally, most of the interior angles of buildings are right angles. After simplifying a building, right angles may change into non-right angles, as shown in Fig. 4(a). Therefore, right-angle constraints need to be constructed to ensure the rectangularity of buildings after the simplification. Figure 4(b) illustrates the simplification result with right-angle constraints.
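Suppose that the coordinates of three neighboring end points that form a right angle are $(x_s, y_s)$, $(x_t, y_t)$, and $(x_u, y_u)$, with the right angle at the middle point $(x_t, y_t)$. Then the right-angle constraint of the building can be expressed as
$$(x_s - x_t)(x_u - x_t) + (y_s - y_t)(y_u - y_t) = 0. \tag{2}$$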
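One common way to state a right-angle condition on three consecutive end points is that the two edges meeting at the middle point have a zero dot product. The sketch below assumes this form and assumes that the right angle sits at the middle point; the function name and the sample coordinates are hypothetical.

```python
def right_angle_residual(ps, pt, pu):
    """Dot product of the edges pt->ps and pt->pu.

    The value is zero exactly when the two edges are perpendicular at pt.
    """
    return (ps[0] - pt[0]) * (pu[0] - pt[0]) + (ps[1] - pt[1]) * (pu[1] - pt[1])


# A perfect corner and a slightly skewed one (hypothetical coordinates).
square_corner = right_angle_residual((0.0, 1.0), (0.0, 0.0), (1.0, 0.0))
skewed_corner = right_angle_residual((0.0, 1.0), (0.0, 0.0), (1.0, 0.1))
```

A nonzero residual, as in the skewed corner, is what a right-angle constraint would force to zero during the adjustment.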

Area constraints
Area is an important statistical property of buildings. The simplification of buildings may cause changes in their areas, resulting in inconsistent building areas at different scales. To maintain the areas of buildings after the simplification, area constraints need to be introduced. It is assumed that the boundary of a building contains $n$ points with coordinates $(x_i, y_i)$ $(i = 1, 2, 3, ..., n)$. The area $S_0$ of the building before the simplification can be obtained as
$$S_0 = \frac{1}{2} \left| \sum_{i=1}^{n} (x_{i+1} - x_i)(y_{i+1} + y_i) \right|, \quad (x_{n+1}, y_{n+1}) = (x_1, y_1). \tag{3}$$
It is supposed that the building boundary is divided into $k$ sub-polylines. The starting point of the $j$th sub-polyline is $(x_s, y_s)$ and the end point is $(x_t, y_t)$. Thus, the signed area of the trapezoid formed by the fitted sub-polyline and the coordinate axes can be obtained as
$$S_j = \frac{1}{2} (x_t - x_s)(y_t + y_s). \tag{4}$$
The area constraint equation can be further expressed as (19)
$$S = \left| \sum_{j=1}^{k} S_j \right| = S_0, \tag{5}$$
where $S$ is the area of the building after the simplification.
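The area bookkeeping behind an area constraint can be sketched as follows. The snippet sums signed trapezoid areas between each boundary edge and the x-axis, which is one standard way to compute a polygon area, and compares the areas before and after a simplification. The function names and both outlines are hypothetical.

```python
def polygon_area(points):
    """Area of a closed polygon from signed trapezoid areas under each edge.

    points: list of (x, y) vertices in boundary order, first vertex not repeated.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the boundary
        total += (x1 - x0) * (y1 + y0)
    return abs(total) / 2.0


def area_constraint_residual(original, simplified):
    """Area after simplification minus area before; zero if the area is kept."""
    return polygon_area(simplified) - polygon_area(original)


# Hypothetical outline with a 1 m jog, and a rectangle of equal area.
original = [(0, 0), (0, 10), (10, 10), (10, 9), (20, 9), (20, 0)]
simplified = [(0, 0), (0, 9.5), (20, 9.5), (20, 0)]
residual = area_constraint_residual(original, simplified)
```

Here the jog is replaced by a single edge at a height chosen so that the residual vanishes, which is the behavior an area constraint is meant to enforce.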

PTLSC model
The PTLS method is a special case of the TLS method. In the PTLS model, only some of the elements in the design matrix contain errors. (20) In this paper, only the x-coordinates in the design matrix of the linear fitting model contain errors. Therefore, the PTLSC (19) is used to derive the optimal solution of the building simplification model with data quality constraints.
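Suppose that a building boundary is divided into $k$ sub-polylines and that the $i$th sub-polyline has $n_i$ $(i = 1, 2, 3, ..., k)$ points $(x_{i,j}, y_{i,j})$ $(j = 1, 2, 3, ..., n_i)$. The PTLS fitting model for all sub-polylines in the building can then be written point by point as
$$y_{i,j} - v_{y_{i,j}} = a_i (x_{i,j} - v_{x_{i,j}}) + b_i,$$
where the y-coordinates are collected in the $n \times 1$ observation vector $Y$, and $\xi = [a_1, b_1, ..., a_k, b_k]^T$ is the $2k \times 1$ parameter vector of the fitted lines.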
Assuming that there are $z$ constraint equations in total, the PTLS fitting model with constraints is expressed in linearized form, in which $I_n$ is an $n \times n$ identity matrix, $\Lambda$ is the $n \times n$ design matrix, $\Delta\beta$ and $\Delta\xi$ are small changes in $\beta$ and $\xi$, respectively, $C$ is the $z \times (n + 2k)$ design matrix constructed from the coefficients of the constraint equations, and $C_0$ is a constant vector. A corresponding stochastic model and the structure of $\Lambda$ complete the model. The above constrained building boundary simplification model can be solved by the PTLSC. (19)
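To illustrate why treating the x-coordinates as erroneous matters, the sketch below compares a classic LS line fit with an unconstrained, equal-weight TLS line fit (orthogonal regression) on one sub-polyline. This is a simplified stand-in and not the PTLSC itself: it assumes equal, uncorrelated errors in x and y, applies no end-point, right-angle, or area constraints, and uses hypothetical function names and sample points.

```python
import math


def _moments(points):
    """Centroid and centered second moments of a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return mx, my, sxx, syy, sxy


def fit_line_ls(points):
    """Classic LS fit of y = a*x + b: only the y-coordinates carry errors."""
    mx, my, sxx, _, sxy = _moments(points)
    a = sxy / sxx
    return a, my - a * mx


def fit_line_tls(points):
    """Equal-weight TLS fit of y = a*x + b: x and y both carry errors."""
    mx, my, sxx, syy, sxy = _moments(points)
    if sxy == 0.0:
        if sxx < syy:
            raise ValueError("best-fit line is vertical; y = a*x + b cannot represent it")
        a = 0.0
    else:
        # Slope of the principal axis of the point scatter.
        a = (syy - sxx + math.hypot(syy - sxx, 2.0 * sxy)) / (2.0 * sxy)
    return a, my - a * mx


def sum_sq_orthogonal(points, a, b):
    """Sum of squared perpendicular distances from the points to y = a*x + b."""
    return sum((a * x + b - y) ** 2 for x, y in points) / (1.0 + a * a)


# Hypothetical noisy sub-polyline.
pts = [(0.0, 0.2), (1.0, 0.9), (2.0, 2.3), (3.0, 2.8)]
a_ls, b_ls = fit_line_ls(pts)
a_tls, b_tls = fit_line_tls(pts)
```

Under these assumptions the TLS line never has a larger sum of squared perpendicular distances than the LS line, which is the intuition behind using a TLS-type model to reduce positional displacements.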

Experiment and Discussion
We conducted a building simplification experiment to test the feasibility of the proposed method. The data used in the experiment were provided by the German LDBV Agency (Agency for Digitisation, High-Speed Internet and Surveying, 2018). The data include 42 selected typical buildings with different geometric characteristics in Munich, Germany, with the layout of the buildings shown in Fig. 5. $F$ is used as the edge separation threshold, which can be obtained as (16)
$$F = D (S_t - S_s), \tag{13}$$
where $S_s$ is the denominator of the data collection scale, $S_t$ is the denominator of the target scale, and $D$ is the minimum visible distance. In this dataset, the data collection scale is about 1:1000, the target scale is 1:10000, and $D = 0.4$ mm. From Eq. (13), the threshold $F = 3.6$ m was obtained. The building boundaries were segmented by the method in Ref. 17.
In the experiment, the proposed method was used to simplify the building boundaries. The simplification results were analyzed using indicators including the geometric displacements, the interior angle changes, and the area changes of the buildings before and after the simplification.
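The threshold value quoted above can be checked with a few lines of Python. The sketch assumes that the threshold has the form F = D (S_t - S_s), which reproduces the stated 3.6 m for the stated scales; the function name and the unit handling are illustrative.

```python
def edge_separation_threshold(s_s, s_t, d_mm):
    """Edge separation threshold F in metres.

    s_s: denominator of the data collection scale
    s_t: denominator of the target scale
    d_mm: minimum visible distance on the map, in millimetres
    """
    return d_mm * (s_t - s_s) / 1000.0  # convert millimetres to metres


# Values from the experiment: 1:1000 source, 1:10000 target, D = 0.4 mm.
f_threshold = edge_separation_threshold(1000, 10000, 0.4)  # about 3.6 m
```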

Geometric displacements
The point displacement was used to evaluate and analyze the geometric displacements of the simplified buildings, which is formulated as
$$\Delta L = \sqrt{(x - x_0)^2 + (y - y_0)^2}, \tag{14}$$
where $(x_0, y_0)$ are the original coordinates of the building point and $(x, y)$ are the point coordinates after the simplification. Furthermore, the average point displacement of the buildings can be obtained as
$$\overline{\Delta L} = \frac{1}{t} \sum_{i=1}^{t} \left( \frac{1}{k_i} \sum_{j=1}^{k_i} \Delta L_{ij} \right), \tag{15}$$
where $\overline{\Delta L}$ is the average point displacement of all the buildings, $\Delta L_{ij}$ is the point displacement of the $j$th point of the $i$th building, $t$ is the total number of buildings, and $k_i$ is the total number of points on the boundary of the $i$th building. Equation (15) is derived by first calculating the average displacement of each building, and then calculating the average displacement of all buildings.
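The two-level averaging described above (first per building, then over all buildings) can be sketched as follows. The data layout, with each building given as a pair of matched original and simplified point lists, is an assumption made for the example, as are the function names.

```python
import math


def point_displacement(p0, p):
    """Euclidean distance between an original point p0 and its simplified position p."""
    return math.hypot(p[0] - p0[0], p[1] - p0[1])


def average_displacement(buildings):
    """Mean over buildings of the mean point displacement within each building.

    buildings: list of (original_points, simplified_points) pairs, where the two
    lists have equal length and matching order.
    """
    per_building = []
    for original, simplified in buildings:
        d = [point_displacement(p0, p) for p0, p in zip(original, simplified)]
        per_building.append(sum(d) / len(d))
    return sum(per_building) / len(per_building)


# Two hypothetical buildings with 2 and 1 points respectively.
buildings = [
    ([(0.0, 0.0), (0.0, 0.0)], [(3.0, 4.0), (0.0, 0.0)]),  # displacements 5 and 0
    ([(1.0, 1.0)], [(1.0, 2.0)]),                          # displacement 1
]
mean_dl = average_displacement(buildings)
```

Averaging per building first gives every building the same weight regardless of how many boundary points it has.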
First, the 42 buildings in the dataset were simplified using the classic LS method with end-point constraints and the PTLSC with end-point constraints. After the simplification, the maximum, minimum, and average displacements of the buildings were calculated. The distribution ranges of the point displacements were also evaluated and analyzed. Table 1 presents the maximum, minimum, and average point displacements of the building points, and Table 2 shows the point displacements classified into different ranges.
From the results presented in Tables 1 and 2, the following can be seen: (1) The maximum, minimum, and average point displacements of the buildings obtained by the PTLSC are all smaller than those of the LS method. This is because the PTLSC takes into account the errors in both the coefficient matrix and the observation vector. However, the LS method only regards the observation vector as containing errors. (2) For the PTLSC, 81% of the geometric displacements of the buildings are between 0 and 0.15 m, and the geometric displacements of all the points are less than 0.3 m. However, the geometric displacements obtained by the LS method are distributed in all three ranges in Table 2. Only 38% of the points have small geometric displacements in the range of 0-0.15 m and 10% of the points have larger displacements in the range of 0.3-0.45 m.

Interior angle changes
The average difference between the interior angles of the simplified building and a right angle can be calculated as
$$\overline{\Delta\alpha} = \frac{1}{k} \sum_{i=1}^{k} \Delta\alpha_i, \tag{16}$$
where $\Delta\alpha_i$ is the angle difference between the $i$th interior angle of the simplified building and a right angle, $\overline{\Delta\alpha}$ is the average value of $\Delta\alpha$, and $k$ is the number of interior angles of the building that should be right angles.
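A minimal sketch of this indicator is given below. It assumes that the angle difference is taken as an absolute deviation from 90 degrees and that each corner is supplied as a triple of consecutive end points; the function names are hypothetical. The helper measures the angle between the two edges at a vertex (0 to 180 degrees), so convex and reflex right-angled corners both give a deviation of zero.

```python
import math


def edge_angle_deg(ps, pt, pu):
    """Angle in degrees (0 to 180) at vertex pt between edges pt->ps and pt->pu."""
    v1x, v1y = ps[0] - pt[0], ps[1] - pt[1]
    v2x, v2y = pu[0] - pt[0], pu[1] - pt[1]
    dot = v1x * v2x + v1y * v2y
    cross = v1x * v2y - v1y * v2x
    return math.degrees(math.atan2(abs(cross), dot))


def mean_right_angle_deviation(corners):
    """Average absolute deviation from 90 degrees over (ps, pt, pu) corner triples."""
    deviations = [abs(edge_angle_deg(*c) - 90.0) for c in corners]
    return sum(deviations) / len(deviations)


# One exact right angle and one 45-degree corner (hypothetical coordinates).
corners = [
    ((0.0, 1.0), (0.0, 0.0), (1.0, 0.0)),
    ((1.0, 1.0), (0.0, 0.0), (1.0, 0.0)),
]
mean_dev = mean_right_angle_deviation(corners)
```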
We used three different PTLSCs with different constraints to simplify the buildings in the dataset: (1) the PTLSC with only end-point constraints (Case 1), (2) the PTLSC with two constraints, namely, end-point and right-angle constraints (Case 2), and (3) the PTLSC with three constraints, namely, end-point, right-angle, and area constraints (Case 3). The differences between the interior angles of the simplified building and a right angle were calculated. The differences between the interior angles and a right angle after the simplification of the buildings using the PTLSC in Case 1 are shown in Table 3.
From the results presented in Table 3, it can be seen that there are large differences between the interior angles and a right angle for the PTLSC in Case 1, which contains only end-point constraints. Therefore, it is necessary to add right-angle constraints in the simplification process.
In Cases 2 and 3 with right-angle constraints, the differences between the simplified interior angles and a right angle are all 0. Therefore, the PTLSC with right-angle constraints can guarantee the rectangularity of the buildings.

Area changes
The area change of a building can be obtained as a percentage using
$$\Delta S = \frac{|S - S_0|}{S_0} \times 100\%, \tag{17}$$
where $S_0$ is the area of the original building and $S$ is the area of the simplified building. The area changes of the buildings after the simplifications using the PTLSC with the three different sets of constraints in Sect. 3.2 were calculated. Figure 6 shows the area change percentages of the buildings simplified by the PTLSC in the three cases. Figure 7 shows a comparison of the building boundaries before and after the simplification by the PTLSC with the end-point, right-angle, and area constraints.
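For completeness, the percentage area change can be computed as in the short sketch below, which assumes the relative absolute difference between the two areas; the function name and sample areas are hypothetical.

```python
def area_change_percent(s0, s):
    """Absolute area change of a building relative to its original area, in percent.

    s0: area before the simplification
    s: area after the simplification
    """
    return abs(s - s0) / s0 * 100.0


# A hypothetical 200 m^2 building that loses 2 m^2 changes by about 1 %.
change = area_change_percent(200.0, 198.0)
```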
From the results presented in Figs. 6 and 7, it can be seen that the simplification of building boundaries by the PTLSC with additional end-point, right-angle, and area constraints can maintain the basic distribution patterns and geometric characteristics of the buildings. Moreover, compared with the PTLSC without additional area constraints (Cases 1 and 2), the PTLSC with additional area constraints (Case 3) can maintain the areas of all the buildings after the simplification. After adding the area constraints, the percentage area changes of the buildings are about 0% ($<10^{-6}$) and the area changes of all the buildings are less than $10^{-4}$ m$^2$.

Conclusions
Buildings are important components of different kinds of city-area maps obtained from various remote sensing images or surveying instruments. The simplification of building boundaries is one critical operation in deriving small-scale maps from large-scale maps. A building polygon simplification method with data quality constraints based on the PTLSC is proposed in this paper. The simplification of the building boundaries may cause some data problems after the simplification, such as changes in the rectangularity and areas, and large positional differences. Therefore, the aims of the proposed method are to reduce the total geometric displacements and to maintain the rectangularity and areas of buildings after the simplification. The proposed approach is a supplement to the existing building simplification methods. In the proposed method, the PTLS fitting model for each sub-polyline of the building boundary is first constructed. The sub-polyline of the buildings can be derived using existing building simplification methods, such as the method presented in Ref. 17. Secondly, end-point, right-angle, and area constraints are introduced to maintain the rectangularity and areas of the buildings after the simplification. Finally, the PTLSC is used to derive the optimal solution of the constrained building simplification problem.
The proposed building simplification method was applied in a simplification experiment on an actual building dataset. The experimental results showed that the proposed method based on the PTLSC fitting model can obtain smaller geometric displacements of the buildings than the classic least-squares-based fitting method. In addition, the proposed method with end-point, right-angle, and area constraints can effectively maintain the rectangularity and areas of buildings after the simplification. However, for some buildings that have curved facades, a curved-line-fitting-based model may need to be developed, and other quality constraints for complex buildings also need to be further researched.