Location Optimization of Bicycle-sharing Stations Using Multiple-criteria Decision Making

A bicycle-sharing system not only allows citizens to freely use bicycles installed in specific locations but is also a supplement to public transportation. In this study, we aim to improve the accessibility to public bicycles by finding the optimal locations of bicycle-sharing stations based on historical and spatial data on public bicycle operations. IoT sensors record the bicycles’ movement information. Multiple linear regression analysis is used to select the most important criteria for a bicycle-sharing system. The selected criteria are then applied to multiple-criteria decision making (MCDM) to rank the potential locations of bicycle-sharing stations. The top-ranking locations are finally determined through an optimization stage.


Introduction
There is growing global interest in protecting the environment, and bicycle-sharing systems have been launched as an alternative to motorized transportation. A bicycle-sharing system is not only eco-friendly but also a complementary means of transportation. In particular, a bicycle-sharing system compensates for the shortcomings of existing public transportation systems by increasing the mobility of users. Demand-driven transportation systems, such as bicycle-sharing systems, contribute to meeting the demand for highly convenient transportation. (1) A successful bicycle-sharing system should be easily accessible to citizens, so the optimal locations of bicycle-sharing stations are crucial. (2) In this study, the optimal locations of bicycle-sharing stations were determined. The experimental data were the records of the bicycle-sharing system called Fifteen provided by Goyang City, Gyeonggi Province, Republic of Korea. The regional characteristics of the stations were statistically identified through multiple linear regression analysis to find the optimal locations. The statistically significant factors identified were applied as weights to multiple-criteria decision making (MCDM) to rank the potential station locations. We provide alternative locations for current stations that consider changes in urbanized areas such as new housing developments, population movement between regions, and public transport infrastructure changes.
In Sect. 2, the features of our approach are reviewed on the basis of previous related studies. Section 3 presents the experimental data and the proposed method. The results are discussed in Sect. 4. Finally, Sect. 5 provides some conclusions along with a summary and suggestions for future research.

Background
The first bicycle-sharing system was the Vélib' scheme in Paris in 2007, and bicycle-sharing systems have since been established worldwide. (3) Previous studies have treated location optimization through various methods to increase the accessibility and use of systems. Researchers have applied mathematical models using a geographic information system (GIS) and MCDM to select station locations. Experts directly determined the weights of several factors and applied them to the analytic hierarchy process (AHP), which is one of the MCDM methods used to determine suitable locations. (4) In addition, research has focused on location allocation using a GIS rather than ranking within the selected areas. (5,6) However, previous studies have estimated preferential locations by applying the same weights to the criteria. Bhuyan et al. grouped users by age and analyzed each group's demand and the station density to provide experts with geographic information for a bicycle-sharing system. (5) Loidl et al. performed a kernel density estimation (KDE) analysis to extract information on the potential of a bicycle-sharing system by forming and overlaying the layers of each criterion. (6) Furthermore, researchers have applied the AHP to a GIS platform. Kabak et al. quantified the criteria influencing bicycle use based on Boolean relationships. (7) The authors and stakeholders set the importance of the criteria and applied them to the AHP. They created a suitability map based on the values obtained from the AHP. The suitable area and the existing stations were ranked through multi-objective optimization by ratio analysis (MOORA), and a comparative analysis was also conducted. Hoang et al. applied criteria calculated using a GIS to the AHP to create a set of weights and used principal component analysis (PCA) to find the correlations among the criteria to eliminate potential sites with less significant criteria. (8)
Studies focusing on origin-destination (OD) analysis have also been conducted for transportation planning. OD trajectory analysis can help to not only identify urban flow patterns but also provide a wealth of information on urban flow and transportation demand. (9,10) In this study, we utilize multiple linear regression analysis to determine the critical criteria, in contrast to previous research in which the criteria were selected manually. We also use network analysis, such as that of the centrality, to consider the connectivity of bicycle-sharing stations. The selected criteria are then subjected to MOORA to determine the optimal locations of the stations.

Area of study and data
The bicycle-sharing system data used were from the bicycle-sharing system called Fifteen provided by Goyang City, Gyeonggi Province, Republic of Korea. The data comprised departure stations, arrival stations, and times. Figure 1 presents temporal signatures for the data. Commuting on weekdays is widely seen as the key reason for the use of bicycles in the scheme.
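As a minimal sketch of how such trip records can be summarized, the following Python snippet counts trips per directed OD pair and rentals per departure station. The record layout and the sample trips are hypothetical; the actual Fifteen data schema is not described here.

```python
from collections import Counter

# Hypothetical trip records: (departure_station, arrival_station, start_time).
trips = [
    ("A", "B", "2021-05-03T08:10"),
    ("A", "B", "2021-05-03T08:40"),
    ("B", "A", "2021-05-03T18:05"),
    ("A", "C", "2021-05-04T09:00"),
    ("C", "A", "2021-05-08T14:30"),
]

def od_counts(records):
    """Count trips per directed (origin, destination) pair."""
    return Counter((origin, dest) for origin, dest, _ in records)

def rentals_per_station(records):
    """Count rentals (departures) per station."""
    return Counter(origin for origin, _, _ in records)

# The most frequent OD pairs and the per-station rental totals.
top_pairs = od_counts(trips).most_common(2)
rentals = rentals_per_station(trips)
```

The per-station rental totals play the role of the dependent variable in a regression, while the OD counts can serve as edge weights in a station network.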
Additional data (e.g., the number of people using buses or the subway, or the number of residents) were combined with the bicycle-sharing data. Various buffer sizes around the bicycle-sharing stations were tested for different data combinations. Finally, we set up a buffer of 400 m from each station as the spatial range for combining the data because the variables of the built environment are the most important in the model. The buffer size was based on a previous study. (11) Figure 2 shows the 155 bicycle-sharing stations currently installed in Goyang City, and Table 1 presents the experimental data used in this study.
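The buffer-based data combination can be illustrated with a short sketch. It assumes that stations and surrounding features (e.g., bus stops with passenger counts) are given as latitude/longitude points and uses the haversine great-circle distance; the function names and coordinates are hypothetical, and a GIS buffer on projected coordinates would serve the same purpose.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def sum_within_buffer(station, features, radius_m=400.0):
    """Sum the values of all features lying within radius_m of the station.

    station  : (lat, lon)
    features : iterable of (lat, lon, value)
    """
    lat, lon = station
    return sum(
        value
        for flat, flon, value in features
        if haversine_m(lat, lon, flat, flon) <= radius_m
    )

# Hypothetical station and bus stops (lat, lon, daily passengers).
station = (37.6584, 126.8320)
bus_stops = [
    (37.6604, 126.8320, 120),  # roughly 220 m north: inside the buffer
    (37.6684, 126.8320, 500),  # roughly 1.1 km north: outside the buffer
    (37.6584, 126.8320, 30),   # at the station itself
]
passengers_400m = sum_within_buffer(station, bus_stops)
```

Repeating this aggregation for each variable and each station yields one row of explanatory variables per station.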

Multiple linear regression
Multiple linear regression analysis involves two or more independent variables and one dependent variable. The influence of each independent variable is assessed by comparing the absolute values of the estimated regression coefficients. A positive coefficient indicates a positive relationship with the dependent variable, whereas a negative coefficient indicates a negative relationship. (12) Using this multiple linear regression model, the effect of the environmental characteristics around the stations on the use of public bicycles was analyzed in this study. The environmental variables of the bicycle-sharing stations were treated as the independent variables, and the number of bicycle rentals at each station was treated as the dependent variable.
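A minimal sketch of fitting such a model by ordinary least squares is given below, assuming NumPy is available. The data are synthetic and noise-free, chosen only so that the recovered coefficients are easy to check; they are not the station data used in this study.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by least squares.

    Returns the coefficient vector [b0, b1, ..., bk].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic example: y = 10 + 2*x1 - 3*x2 exactly.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_demo = [6, 11, 4, 9, 5]
beta_demo = fit_ols(X_demo, y_demo)
```

The sign of each fitted coefficient gives the direction of the relationship, and comparing magnitudes is meaningful only when the variables are on comparable scales.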

Centrality
In network analysis, the centrality can be used to judge the importance of a node (station). The purpose of using the centrality is to express the connectivity between stations and find the ones with the most influence. The weighted PageRank centrality measures the importance of each node while weighting every connection by its strength, rather than treating all connections between nodes equally. (13) A high connection strength between two stations means that many trips are made between them. Thus, we used the centrality as one of the criteria for determining optimal locations.
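The idea of weighted PageRank can be sketched with a simple power iteration in which each station passes its score to its destinations in proportion to the trip counts. This is an illustrative implementation under stated assumptions (a damping factor of 0.85, uniform redistribution from stations without outgoing trips, hypothetical flows), not necessarily the exact variant used in the study.

```python
def weighted_pagerank(flows, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank by power iteration.

    flows : dict mapping (origin, destination) -> non-negative trip count
    Returns a dict mapping station -> score; the scores sum to 1.
    """
    nodes = sorted({n for pair in flows for n in pair})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (origin, _), w in flows.items():
        out_weight[origin] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Score held by stations with no outgoing trips is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for (origin, dest), w in flows.items():
            if w > 0:
                new[dest] += damping * rank[origin] * w / out_weight[origin]
        delta = sum(abs(new[node] - rank[node]) for node in nodes)
        rank = new
        if delta < tol:
            break
    return rank

# Hypothetical OD flows between three stations.
demo_flows = {("A", "B"): 10, ("B", "A"): 10, ("C", "A"): 5, ("A", "C"): 1}
demo_rank = weighted_pagerank(demo_flows)
```

In this toy network, station A receives all trips leaving B and C, so it obtains the highest score.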

MCDM
We applied MOORA to rank the potential sites for a station. The characteristics of the surrounding environment derived from multiple linear regression analysis and the centrality derived from network analysis were applied to the MOORA parameters. This method ranks the alternatives by the weighted sum of their normalized criterion values, in which benefit criteria enter with a positive sign and cost criteria enter with a negative sign. (14) Figure 3 presents the top 20 OD trajectories in terms of the frequency of use obtained from the Fifteen data records. When the total usage of each station was calculated, it was confirmed that the top 35 of the 155 stations accounted for most of the total usage. We determined the environmental characteristics of well-used stations and used them as location selection criteria.
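The MOORA ratio system can be sketched as follows. Each criterion column is divided by the square root of its sum of squares, and the weighted normalized values are added for benefit criteria and subtracted for cost criteria. The decision matrix, weights, and benefit/cost flags below are hypothetical.

```python
import math

def moora_scores(matrix, weights, is_benefit):
    """MOORA ratio-system score for each alternative (row of matrix).

    matrix     : list of rows, one value per criterion
    weights    : criterion weights (expected to sum to 1)
    is_benefit : True for benefit criteria, False for cost criteria
    """
    n_crit = len(weights)
    # Vector normalization: divide by the root of the column sum of squares.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    scores = []
    for row in matrix:
        total = 0.0
        for j in range(n_crit):
            if norms[j] == 0:
                continue  # a criterion that is zero everywhere carries no information
            term = weights[j] * row[j] / norms[j]
            total += term if is_benefit[j] else -term
        scores.append(total)
    return scores

def rank_alternatives(scores):
    """Indices of the alternatives ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Two hypothetical sites, one benefit criterion and one cost criterion.
demo_scores = moora_scores([[3, 4], [4, 3]], weights=[0.5, 0.5], is_benefit=[True, False])
demo_order = rank_alternatives(demo_scores)
```

In this toy case, the second site scores higher because it has more of the benefit criterion and less of the cost criterion.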

Results
An experiment was conducted for the 35 bicycle-sharing stations with the highest frequency of use to determine the conditions for successful locations of bicycle-sharing stations. The initial criteria were the 10 independent variables listed in Table 1. The experiment utilized backward elimination based on Akaike's information criterion (AIC) in multiple linear regression analysis to remove unnecessary parameters. Table 2 shows the five criteria remaining after backward elimination, which are statistically significant variables.
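Backward elimination based on the AIC can be sketched as follows, assuming NumPy is available. Starting from the full model, the variable whose removal lowers the AIC the most is dropped, and the procedure stops when no removal lowers it further. The AIC is computed here as n ln(RSS/n) + 2k up to an additive constant, which is one common convention for least-squares models; the data are synthetic.

```python
import numpy as np

def ols_aic(X, y):
    """AIC (up to a constant) of an OLS fit with intercept: n*ln(RSS/n) + 2k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]

def backward_eliminate(X, y, names):
    """Drop one variable at a time while doing so lowers the AIC."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = list(range(X.shape[1]))
    best = ols_aic(X[:, keep], y)
    while len(keep) > 1:
        # AIC of every model obtained by removing exactly one remaining variable.
        trials = [(ols_aic(X[:, [k for k in keep if k != j]], y), j) for j in keep]
        aic, drop = min(trials)
        if aic >= best:
            break  # no single removal improves the model
        best = aic
        keep.remove(drop)
    return [names[k] for k in keep], best

# Synthetic data: y depends on x0 and x1; x2 is unrelated noise.
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(60, 3))
y_syn = 5 + 3 * X_syn[:, 0] - 2 * X_syn[:, 1] + rng.normal(scale=0.5, size=60)
kept_names, final_aic = backward_eliminate(X_syn, y_syn, ["x0", "x1", "x2"])
```

Variables with a strong effect are always retained, whereas an unrelated variable is usually, but not always, removed.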
Equation (1) describes the multiple linear regression model. The adjusted R-squared value is 0.697, which indicates a relatively strong relationship. The p-value of the model is less than 0.05, which indicates that the model is statistically significant.

$$y = 3662 - 0.87x_1 + 1.02x_2 + 0.44x_3 - 0.47x_4 + 1.34x_5 \tag{1}$$

Here, $x_1$ is the number of passengers at bus stops, $x_2$ is the number of passengers at subway stops, $x_3$ is the number of residences, $x_4$ is the total floor area of workplaces, and $x_5$ is the total floor area of buildings.
Equation (1) indicates that every 1% increase in the number of passengers at bus stops is associated with a 0.87% decrease in the number of bicycle rentals per day, and every 1% increase in the number of residences is associated with a 0.44% increase in the number of bicycle rentals per day.
Furthermore, we verified the assumptions of the multiple linear regression model. The p-value of the non-constant variance test (i.e., the test for homogeneity of variance) was 0.23, which is greater than 0.05, so no statistically significant heteroscedasticity was found. The Durbin-Watson statistic (i.e., independence of observations) was 1.67, which is close to 2, and the variance inflation factor values were well below 10. The p-value of the Shapiro-Wilk test for normality was 0.42, which is greater than 0.05. These tests indicate that the current multiple linear regression model satisfies its assumptions.
Figure 4 shows the centrality of the bicycle-sharing stations and the connectivity among the stations. The size of each point is proportional to the value of centrality, and the thickness of each connecting line between points is proportional to the number of movements among bicycle-sharing stations. Only stations with strong connectivity are presented in Fig. 4.
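Two of the regression diagnostics mentioned above, the Durbin-Watson statistic and the variance inflation factor (VIF), can be computed directly from their definitions. The sketch below assumes NumPy is available and uses toy inputs rather than the residuals and variables of the fitted model.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the residual sum of squares (about 2 when there is
    no first-order autocorrelation)."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def vif(X):
    """Variance inflation factor of each column of X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on
    the remaining columns plus an intercept; equivalently TSS_j / RSS_j.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        rss = np.sum((X[:, j] - others @ beta) ** 2)
        tss = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        factors.append(float(tss / rss))
    return factors

# Toy inputs: alternating residuals and two uncorrelated predictors.
dw_demo = durbin_watson([1, -1, 1, -1])
vif_demo = vif([[1, 1], [1, -1], [-1, 1], [-1, -1]])
```

Uncorrelated predictors give a VIF of 1, and values approaching 10 are commonly read as a sign of problematic multicollinearity.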
Nodes with significant bicycle use and strong connectivity, as shown in Fig. 4, will require additional bicycle stations when the stations are relocated in the future. In this study, we added the centrality to the MOORA criteria. Because the centrality is considered to be an important factor in location selection, the weight of the centrality criterion was set to 30% of the total weight. The total of the weights in MOORA must be 1. Thus, we defined the criteria weights of variables $x_1$ to $x_5$ as shown in Eq. (2), where $x_1$ is the number of passengers at bus stops, $x_2$ is the number of passengers at subway stops, $x_3$ is the number of residences, $x_4$ is the total floor area of workplaces, $x_5$ is the total floor area of buildings, and $x_6$ is the PageRank centrality.
The bicycle-sharing stations in Goyang City satisfy the minimum area covered by a bicycle-sharing system stipulated in the guidelines developed by the Institute for Transportation and Development Policy (ITDP). (15) According to the guidelines, an appropriate distance [...] (2) to determine the 3038 rankings. The top 155 stations were selected for comparison with the existing stations. In addition, a bicycle station should be located on the sidewalk for better accessibility, and it is more valuable if it is located near a public transport station. We therefore applied optimization steps accordingly. Figure 5(a) shows a comparison between the current and alternative station locations for the same number of stations (i.e., 155), and Fig. 5(b) presents the alternative station locations when the number of stations is increased to 300.
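One possible way to turn a ranking into a final set of stations is a greedy selection that takes candidates in descending order of score and skips any candidate too close to an already selected site. The sketch below is purely illustrative: the minimum spacing, the planar coordinates in metres, and the candidate scores are hypothetical, and the optimization steps actually applied in this study may differ.

```python
import math

def select_sites(candidates, k, min_spacing_m):
    """Greedily pick up to k sites by descending score with a spacing constraint.

    candidates    : list of (site_id, x_m, y_m, score) in planar metres
    k             : number of sites to select
    min_spacing_m : hypothetical minimum distance between selected sites
    """
    chosen = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        if len(chosen) == k:
            break
        _, x, y, _ = cand
        # Accept the candidate only if it is far enough from every chosen site.
        if all(math.hypot(x - cx, y - cy) >= min_spacing_m for _, cx, cy, _ in chosen):
            chosen.append(cand)
    return [site_id for site_id, _, _, _ in chosen]

# Hypothetical ranked candidates along a street.
demo_candidates = [
    ("s1", 0, 0, 0.9),
    ("s2", 100, 0, 0.8),   # too close to s1 for a 300 m spacing
    ("s3", 500, 0, 0.7),
    ("s4", 1000, 0, 0.6),
]
demo_selected = select_sites(demo_candidates, k=2, min_spacing_m=300)
```

Additional constraints, such as requiring a sidewalk location or proximity to a public transport station, could be applied as filters on the candidate list before the selection.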

Conclusions
The optimal utility of a bicycle-sharing system is closely related to its stations having good access for users. The environment surrounding bicycle-sharing stations was analyzed to determine the characteristics required for selecting the optimal locations. Multiple linear regression analysis was conducted to obtain the weights of the derived vital criteria, and significant correlations were confirmed. In addition, the centrality values of the network were calculated regarding the connectivity between stations. Finally, MOORA was performed to generate potential locations of bicycle-sharing stations.
This study provided alternative locations for existing stations to respond to changes in urbanized areas such as new housing developments, population movement between regions, and public transport infrastructure changes. We attempted to make a more actively used bicycle system by increasing citizens' accessibility and found some differences between the optimal locations of bicycle-sharing stations and their current locations. Future studies should focus on the appropriate number of bicycle stations and their allocation.