Application of Real-time Sensor Data to Evaluation of Performance of Ad Hoc Distributed Traffic Simulation

Recent advancements in sensor and wireless communication technologies are creating new opportunities to effectively exploit real-time traffic data. Onboard sensors on vehicles collect real-time traffic data and simulate traffic states in a distributed fashion. A local transportation management center coordinates the overall simulation with an optimistic execution technique. In this paper, we present a study of the application of real-time sensor data to the evaluation of the performance of ad hoc distributed traffic simulation. In this study, the real-time field data would be replaced with the streaming sensor data in a field implementation. Two scenarios were investigated to evaluate the performance. The first scenario examined how adequately the system captures changes in traffic conditions when the sensor reports a sudden increase and subsequent decrease in traffic volume under uncongested traffic conditions. The second scenario investigated how well the ad hoc distributed traffic simulation operates when a traffic incident occurs. The proposed ad hoc distributed traffic simulation with real-time sensor data was found to be capable of capturing dynamically changing traffic conditions in both the peak traffic and incident scenarios. In both scenarios, the prediction accuracy drops when the traffic state changes. However, the ad hoc approach appears generally capable of capturing dynamically changing traffic conditions when the real-time field sensor data are available.


Introduction
Recent advancements in sensor and wireless communication technologies are creating new opportunities to effectively exploit real-time traffic data. Onboard sensors on vehicles collect real-time traffic data and simulate traffic states in a distributed fashion. Such a distributed approach can provide more up-to-date and robust estimates with real-time sensor data. (1)(2)(3) As roadside and in-vehicle sensors are deployed under connected vehicle and autonomous vehicle environments, an increasing variety of traffic data is becoming available in real time. (4,5) This real-time sensor data can be shared through wireless communication and utilized for other purposes, creating an opportunity for mobile computing and online traffic simulations. (6)(7)(8) However, traffic simulations with real-time data must run faster than real time at a high simulation resolution, since the purpose of the simulations is to provide immediate forecasts of future traffic based on real-time sensor data. Simulating at a high resolution is often too computationally intensive to process a large-scale network on a single processor in real time. To mitigate this limitation, ad hoc distributed simulation with optimistic execution has been proposed as one of the promising solutions for large network simulations. (9,10) In this paper, we present a study of the application of real-time sensor data to the evaluation of the performance of ad hoc distributed traffic simulation. In this study, the real-time field data would be replaced with the streaming sensor data in a field implementation. It is envisioned that, in the proposed distributed traffic simulation framework, each in-vehicle simulation models a small portion of the overall network and provides detailed traffic state information. Traffic simulation and data processing are performed in a distributed fashion using multiple vehicles.
Each in-vehicle simulation is designed to run in real time and update its estimates when necessary. This integration manages the distributed network to synchronize the predictions among logical processes (LPs).

Background
Parallel and distributed simulation with real-time sensor data has been considered one of the promising solutions to large network simulations. (11)(12)(13)(14)(15)(16)(17) Because a single traffic simulation cannot run fast enough when processing a large volume of real-time sensor data, the simulation program is partitioned across multiple processors, and communication middleware is used to coordinate between multiple single-processor machines. Some researchers have proposed models in which a large network is divided into a set of subnetworks, each of which is assigned to a different processor so that a large network simulation runs at a much higher speed.
Although parallel and distributed simulation with real-time sensor data increases the simulation speed in a large network simulation model, it requires simulation time management processes to synchronize all LPs. (18)(19)(20) This synchronization process often significantly reduces efficiency. Since neither the speed of each LP nor the computational load for each LP is the same, the speed of the entire simulation depends on the slowest LP. For example, faster LPs always have to wait for the slowest LP, and all LPs must be synchronized with respect to simulation time. This synchronization overhead can consume substantial simulation resources and degrade the overall simulation performance.
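The dependence on the slowest LP can be illustrated with a minimal sketch. It assumes a simple barrier-style synchronization in which every LP advances one step and then waits for the others; the function names and the example speeds are hypothetical and are not taken from the study.

```python
def synchronized_rate(lp_rates):
    """Simulated seconds advanced per wall-clock second when all LPs
    must stay in step: the slowest LP sets the pace (assumed barrier model)."""
    if not lp_rates:
        raise ValueError("need at least one LP")
    return min(lp_rates)


def idle_fraction(lp_rates):
    """Fraction of each LP's capacity spent waiting for the slowest LP."""
    pace = synchronized_rate(lp_rates)
    return [1.0 - pace / rate for rate in lp_rates]


# Hypothetical LP speeds, in simulated seconds per wall-clock second.
rates = [4.0, 2.0, 1.0]
print(synchronized_rate(rates))  # pace of the whole simulation
print(idle_fraction(rates))      # share of time each LP spends waiting
```

Under this simplified model, the fastest LP in the example spends most of its time idle, which is the inefficiency that motivates the optimistic, ad hoc execution discussed in this paper.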
Despite these issues, it is believed that the lack of detailed knowledge of the real-time traffic information can be addressed by distributed simulations, which provide current traffic data with increased computing capacity and lower communication bandwidth requirements. A distributed approach allows the system to operate in close proximity to a real-time sensor, offering the potential to use more accurate data with a shorter response time than conventional centralized simulations running on a single large processing machine.

Experimental Design
To evaluate the performance of the ad hoc distributed traffic simulation model, the utilized traffic simulation model is required to have the following capabilities: the abilities to (1) modify simulation objects at runtime, (2) generate interim simulation data at runtime, (3) produce runtime simulation states, and (4) recall the simulation states. VISSIM®, a widely used off-the-shelf traffic simulation program, is a commercial simulation package meeting all the requirements mentioned above. VISSIM® is a discrete, stochastic, time-step-based microscopic simulation model. (21) This behavior-based multipurpose traffic simulation program has been developed to model a wide range of traffic conditions including freeway, arterial, and public transit operations. (22) In this model, all vehicles are modeled individually on the basis of a psychophysical driver behavior model developed by Wiedemann. (21) The basic assumption of this model is that a driver can be in one of four driving modes: free driving, approaching, following, or braking. Figure 1 illustrates the VISSIM® network utilized for the experiments in this study. This Manhattan-style 3-by-6 grid network consists of a two-way, 8-lane road (Fifth Street) with all other roads being 4-lane, two-way facilities. Each of the eighteen signalized intersections operates using a pretimed, 120 s four-phase cycle (10 s protected-only leading lefts and a 50 s through/right movement in all approaches) and a 0 s offset. For this network, each roadway link is 400 m in length with a 180 m single-lane left-turn bay, the vehicle fleet is assumed to be 100% autos, and the desired speed is 48 km/h. At each intersection approach, 95% of vehicles are assumed to pass straight through, 3% turn right, and 2% turn left. Each LP models a 3-by-3 grid network, centered at the LP location.
In the experiments, it is assumed that the LPs are preconfigured to model the designated scenario area at the start of a run. Each LP sends the estimated flow rate, speed, travel time, and queue length data on all simulated links to the server every 60 s of the simulated time. The 60 s predictions are the aggregate flows over the previous 240 s. At the initialization of each LP, a 240 s fill period is completed before rollbacks are allowed. LPs do not send updates to the server during the fill period. The duration of each experiment is 90 simulated min, including a 30 min warm-up to allow the system to reach a steady state. The results presented in this paper do not include the warm-up period. In addition to LP simulations, one large network simulation provides a real-time state estimate of the roadway network. To fully investigate the ability of the ad hoc system to utilize real-time sensor data, no LPs are initialized under accurate demand conditions. Initial rollbacks are expected to be instanced on the basis of the real-time field sensor data. The real-time field sensor data is shared and propagated between LPs through the ad hoc algorithms.
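The reporting rule described above (an update every 60 s that aggregates the previous 240 s, with no updates during the 240 s fill period) can be sketched as a rolling window. This is an illustration only: the class name, the use of per-interval vehicle counts, the conversion to veh/h, and the choice to emit the first report as soon as the window is full are assumptions, not details given in the study.

```python
from collections import deque

REPORT_INTERVAL_S = 60  # reporting interval stated in the text
WINDOW_S = 240          # aggregation window and fill period stated in the text


class LinkFlowReporter:
    """Rolling 240 s flow estimate for one link, updated every 60 s."""

    def __init__(self):
        # Keeps only the most recent 240 s worth of 60 s counts.
        self._counts = deque(maxlen=WINDOW_S // REPORT_INTERVAL_S)

    def add_interval(self, vehicle_count):
        """Record the vehicles counted in one 60 s interval.

        Returns the flow rate in veh/h over the last 240 s, or None while
        the window is still filling (no update is sent to the server).
        """
        self._counts.append(vehicle_count)
        if len(self._counts) < self._counts.maxlen:
            return None
        return sum(self._counts) * 3600 / WINDOW_S


reporter = LinkFlowReporter()
for count in [2, 2, 2, 2, 10]:  # hypothetical 60 s vehicle counts
    print(reporter.add_interval(count))
```

Because each report averages four intervals, a sudden change in a single 60 s count is only partly reflected in the next update, which is one reason predictions lag immediately after a traffic state change.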
For these experiments, LPs are uniformly distributed over the network. The locations of the eight LPs used in these experiments are shown in Fig. 1. Each LP models a 3-by-3 grid network, centered at the vehicle location. For example, LP 8 models a network covering Fourth Street, Fifth Street, and Sixth Street with First Avenue, Second Avenue, and Third Avenue. The real-time field sensor data covers the entire 3-by-6 grid network representing real-time traffic data. In a field implementation, this would be replaced with the streaming detector data.
Two different traffic conditions are examined: a peak traffic scenario under uncongested traffic conditions and an incident scenario. The first scenario assumes that a sudden increase in eastbound traffic on Second Avenue is detected at point A. This scenario explores how a traffic flow change is transferred to downstream LPs. In the second scenario, a traffic incident is assumed to occur eastbound on Second Avenue at point B, reducing the average speed of vehicles from 48 to 1 km/h for 900 s. This reduces the roadway capacity below the demand, resulting in significant upstream queueing. This scenario models congested conditions and examines the responsiveness of the system to a downstream bottleneck. The average speed and flow rate are measured every minute for each link. Details about the scenarios are presented in Table 1. Each scenario, with one real-time field sensor data source and eight LPs, is replicated 10 times with different VISSIM® random seed numbers.

Results and Discussion
The objective of the experiments in this paper is to investigate the performance of the proposed ad hoc traffic simulation when real-time field sensor data is available. Two scenarios are designed. The first scenario examines how adequately the system captures changes in traffic conditions when the traffic volume experiences short-duration peaking in uncongested traffic conditions. The second scenario explores how well the ad hoc distributed traffic simulation operates under incident conditions. As real-time field data represents real-time sensor data, the system's performance can be measured on the basis of the accuracy of the predictions made at the current wall-clock time for future wall-clock times. Predictions for future traffic states at future wall-clock times can be found from the global instance. For each prediction, the mean absolute percentage error (MAPE) for the T + i min prediction calculated at wall-clock time T min is computed on the basis of the T + i min simulation time predicted at wall-clock time T min and the real-time sensor data at the T + i min wall-clock time. The MAPEs of 1-5, 6-10, and 11-15 min future predictions can be computed to examine the system's prediction performance with various near-term horizon lengths.
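One way to write the MAPE described above is given below. The notation is illustrative and is not the authors' own; in particular, the set S over which the percentage errors are averaged (for example links or replications) is an assumption, since the text does not state it.

```latex
% \hat{x}_s(T+i \mid T): value predicted at wall-clock time T for time T+i, sample s
% x_s(T+i):              real-time sensor value observed at wall-clock time T+i, sample s
% S:                     set of samples being averaged (assumed; not specified in the text)
\[
\mathrm{MAPE}_{T,\,T+i}
  = \frac{100\%}{|S|} \sum_{s \in S}
    \frac{\left| \hat{x}_s(T+i \mid T) - x_s(T+i) \right|}{x_s(T+i)}
\]
```

As a worked example with a single sample, a travel time predicted at 385 s against an observed 350 s gives |385 - 350| / 350 = 10%.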

Peak traffic scenario
The first scenario examines how adequately the system captures changes in traffic conditions when the traffic volume is suddenly increased or decreased under uncongested traffic conditions. This is achieved by modeling an under-capacity traffic demand of 100 veh/h/ln for 20 min (after initialization), followed by a sudden flow increase to 500 veh/h/ln for 20 min on Second Avenue (point A), with traffic then returning to the original 100 veh/h/ln rate.
To model the traffic volume changes over the network, new traffic volume information should be transferred from upstream LPs to downstream LPs. In this scenario, the real-time sensor data is expected to reflect the increased traffic volume at 20 min wall-clock time, under the assumption that the increased volume has been detected by field detectors. This traffic increase triggers the server to send a rollback message to the upstream LPs (LP 1 and LP 2). Their updated predictions are stored in the space-time memory, and rollbacks are triggered on the downstream LPs when necessary. This process is continued, allowing the downstream LPs to receive predictive data regarding the flow increase prior to the increase reaching the LPs' simulation area.
The system's performance will be measured using the following two attributes: (1) the length of the prediction horizon, that is, how far in advance the system provides predictions at a specific wall-clock time, and (2) how accurate the predictions are at a specific wall-clock time. By focusing on these two attributes, a comprehensive quantitative comparison is conducted to explore the quality of the available predictions of the ad hoc distributed simulation approach. The accuracy of the available predictions is calculated for various near-term horizon lengths (1-5, 6-10, and 11-15 min future predictions).
Mean absolute error (MAE) of the flow rate and MAPE of the travel time are calculated for each prediction horizon. Figures 2 and 3 show that the ad hoc distributed simulations present a high degree of agreement with the field sensor data for the immediate near-term future (1-5 min future predictions). As expected, the accuracy of predicting the flow rate and travel time decreases with a change in traffic state at 20 min (when the increase begins) and 40 min (when the decrease begins). However, the ad hoc distributed simulations rapidly adapt to the new traffic state and the overall accuracy of the ad hoc approach improves. In the replicated trials, the increased arrival demand on average reaches point B approximately 8 min after the initial increase at point A. Since the upstream LPs (LP 1 and LP 2) reflect the new traffic state immediately after the increase, they are expected to demonstrate results similar to the field sensor data. However, the new traffic information is not available to the downstream LPs (LP 7 and LP 8, containing point B) from the field sensors until approximately 8 min after the traffic crosses point A. Nevertheless, the downstream LPs in the ad hoc distributed simulations, coupled with the upstream LPs, rolled back and predicted the increased/decreased traffic flow before the new traffic reached the field detectors, because exchanging predicted flow rate information between LPs allows the downstream LPs to reflect the oncoming traffic changes. Table 2 and Figs. 2 and 3 also demonstrate that the agreement between the ad hoc distributed simulations and the real-time field sensor data is significantly reduced as the prediction horizon increases to 6-10 and 11-15 min. No updated prediction is possible until the event occurs in an area modeled by at least one LP (point A in this scenario) and is reflected in that LP.
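Grouping the errors into the 1-5, 6-10, and 11-15 min horizon bands can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (dictionaries keyed by minutes), the function names, and the example values are hypothetical, and the averaging within each band is an assumed reading of how the band-level values are obtained.

```python
HORIZON_BANDS = [(1, 5), (6, 10), (11, 15)]  # minutes ahead, as in the text


def mae(predicted, observed):
    """Mean absolute error, in the units of the data (e.g. veh/h)."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def mape(predicted, observed):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(p - o) / o for p, o in zip(predicted, observed)) / len(observed)


def errors_by_band(predictions, observations, wall_clock_min, metric):
    """Score the predictions made at one wall-clock time, per horizon band.

    predictions:  {minutes_ahead: predicted value} made at wall_clock_min
    observations: {wall_clock_minute: sensor value}
    Bands with no available prediction are omitted from the result.
    """
    result = {}
    for lo, hi in HORIZON_BANDS:
        pairs = [
            (predictions[i], observations[wall_clock_min + i])
            for i in range(lo, hi + 1)
            if i in predictions and (wall_clock_min + i) in observations
        ]
        if pairs:
            predicted, observed = zip(*pairs)
            result[(lo, hi)] = metric(predicted, observed)
    return result


# Hypothetical travel times (s): accurate for 1-5 min, 10% high for 6-10 min,
# and no prediction yet available for 11-15 min.
observations = {t: 350.0 for t in range(0, 40)}
predictions = {i: 350.0 for i in range(1, 6)}
predictions.update({i: 385.0 for i in range(6, 11)})
print(errors_by_band(predictions, observations, 10, mape))
```

Omitting bands with no prediction keeps the accuracy measure separate from the first attribute, the length of the available prediction horizon.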
Additionally, the propagation time between Second Street and Fifth Street is approximately 5 min. Therefore, the ad hoc distributed simulations cannot make accurate predictions beyond approximately 6 min, and any prediction beyond that horizon can be erroneous. The larger horizons will thus have more errors, and the length of the prediction horizon is believed to correlate with the propagation time, which is a function of the network size, vehicle propagation speed, and LP simulation speed. For example, the ad hoc distributed simulations could make accurate 30 min future predictions if the traffic propagation time were 30 min or more. This will be revisited later in this paper.

Incident scenario
This scenario is intended to investigate the responsiveness of ad hoc distributed simulations under traffic incident conditions. Traffic information transfer in this scenario is not as straightforward as that under uncongested traffic conditions. Whereas upstream traffic information (flow rates) propagates from upstream LPs to downstream LPs in the volume increase scenario, here downstream traffic information (speed reduction) is transmitted from downstream LPs to upstream LPs, as congestion builds from downstream to upstream. This scenario is constructed to investigate how the ad hoc distributed traffic simulations perform in the before-incident, during-incident, and after-incident periods. A traffic incident is set to create congested conditions by reducing the vehicle speed from 48 to 1 km/h at point B for 15 min. The incident starts 10 min after the 20 min warm-up period. After an additional 10 min, the vehicle queue extends to Second Street. The queue does not begin to clear from this link until after the incident is removed from point B. This experiment allows for an investigation of how the system represents both the congestion and the periods before, during, and after the congestion. As in the first scenario, ten replicated runs with one real-time sensor data source and eight LPs are conducted. After the runs, a comprehensive quantitative comparison is performed to examine how accurate the predictions are at specific prediction horizons.
First, the progress of the incident traffic conditions in the real-time sensor data is described; how the ad hoc distributed simulations model the incident is explained afterward. Owing to the incident at point B, the capacity on Second Avenue is reduced significantly, reducing the average speed of vehicles from 48 to 1 km/h for 900 s and resulting in significant upstream queueing toward point A. The average speed drops as the impact of the incident reaches the upstream links. It requires approximately 15 min for the impact to reach Second Street. At the same time, only limited traffic flow (approximately 50-100 veh/h/ln, far less than the 500 veh/h/ln input flow rate) can be served. After the incident is cleared at 25 min, vehicles can pass point B at a free flow speed. However, more than 15 min is required for all the unserved vehicles to pass the queue and for the traffic to return to the preincident state. The flow rate and travel time of the real-time field sensor data are depicted in Fig. 4. Also, a three-dimensional plot of the travel time of the real-time field data is presented in Fig. 5 at 5 min intervals. It is shown that the segment 1 travel time reaches approximately 700 s during the congestion, while the uncongested travel time is approximately 350 s. This traffic condition is reproduced in the ad hoc distributed simulations as follows. Each LP is running its simulation based on the initial flow rate. No LP has information about the incident until the server receives the low speed and low flow rate from the real-time field sensor data and sends rollback messages to the corresponding LPs. The incident starts at point B, reducing the average speed of vehicles from 48 to 1 km/h for 900 s. It results in significant upstream queueing on Second Avenue such that the measured traffic flow and speed of the real-time sensor data are significantly reduced.
Right after the incident starts at 10 min wall-clock time, the real-time field sensor data starts to show a lower speed and a lower flow rate on the link at point B. The server receives the low speed and low flow rate from the real-time sensor data, and finds a rollback threshold violation between the data from the real-time sensor data and the already received estimates from LP 7 and LP 8. The server then issues a rollback to LP 7 and LP 8. They begin to update their future traffic predictions assuming that newly received traffic conditions continue. Reproducing congested conditions on LP 7 and LP 8 is accomplished by controlling the outflow rate by altering the 'desired speed' of each vehicle on the link. When LP 7 and LP 8 update their predictions and the difference between their predictions and the flow rates already predicted by LP 5 and LP 6 (which did not have the incident information) violates the rollback threshold, and the average speed predicted by LP 7 and LP 8 is below the speed threshold, rollbacks are triggered on the upstream LPs (LP 5 and LP 6). In a similar fashion, LP 3 and LP 4 (and LP 1 and LP 2 later) make a rollback as the queueing continues to build up toward point A. This allows congested traffic information to be passed to the upstream LPs, even before the impact of the incident actually reaches the area modeled by the upstream LPs. Once there is another threshold violation (i.e., incident is removed), updated information is again transmitted from the real-time field data to LP 7 and LP 8 and from LP 7 and LP 8 to other LPs in the same manner.
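The rollback decisions described above can be sketched as two checks. This is an illustration under stated assumptions: the text does not give the form or value of either threshold, so the relative-difference test, the constants, and all function names here are hypothetical.

```python
ROLLBACK_THRESHOLD = 0.20   # hypothetical: 20% relative flow difference
SPEED_THRESHOLD_KMH = 20.0  # hypothetical congestion speed threshold


def flow_violation(reference_flow, estimated_flow, threshold=ROLLBACK_THRESHOLD):
    """True if the estimate deviates from the reference by more than the
    threshold (relative difference; an assumed form of the test)."""
    denominator = max(abs(reference_flow), 1e-9)  # guard against zero flow
    return abs(reference_flow - estimated_flow) / denominator > threshold


def lps_to_roll_back(sensor_flow, lp_estimates):
    """Server-side check: LPs whose stored estimates disagree with the
    real-time sensor data for the same link and time."""
    return sorted(
        lp for lp, estimate in lp_estimates.items()
        if flow_violation(sensor_flow, estimate)
    )


def upstream_rollback_needed(downstream_flow, downstream_speed_kmh, upstream_flow):
    """Congestion propagates upstream only if both conditions hold: the flow
    difference violates the threshold and the predicted speed is low."""
    return (
        flow_violation(downstream_flow, upstream_flow)
        and downstream_speed_kmh < SPEED_THRESHOLD_KMH
    )


# Hypothetical values: the sensor reports 75 veh/h/ln at point B.
print(lps_to_roll_back(75.0, {"LP 7": 500.0, "LP 8": 480.0, "LP 5": 80.0}))
print(upstream_rollback_needed(75.0, 1.0, 500.0))
```

Requiring the low-speed condition in addition to the flow difference distinguishes a downstream bottleneck from an ordinary drop in demand, which would propagate downstream rather than upstream.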
Using the same method as in the first scenario, MAE and MAPE are calculated (Table 3). Table 3 shows that the MAEs/MAPEs are considerably higher than those of the first scenario. This implies that the ability of the ad hoc distributed simulations to reflect congested traffic conditions due to incidents is reduced. This is an expected outcome. The simulation performance worsens in the incident scenario as the outflow constraint by speed does not provide highly accurate flow control. In addition, more randomness is involved in modeling congested networks. However, the ad hoc distributed simulations offer reasonable replicates of the real-time sensor data for immediate future travel time predictions (1-5 min) and are capable of providing reasonable predictions for longer horizons, although delay exists in updating predictions.
Figure 6 illustrates a three-dimensional plot of travel time predictions and the real-time sensor data with wall-clock time on the y-axis. Initial (at 0, 5, and 10 min wall-clock time) predictions are available until 80 min of simulation time (area A in Fig. 6). These predictions were made during the 20 min warm-up time period. The travel time is predicted to be approximately 350 s, as these predictions are constructed without knowledge of the incident (as the incident has not yet occurred). Once a rollback is triggered by the incident, existing predictions on the rolled back clients are removed from the space-time memory and updated with new predictions based on updated rollback information. Until the ad hoc simulation receives new traffic information, the travel time is predicted to continue to increase (area B) since the current traffic condition is assumed to continue. Therefore, it is anticipated that the ad hoc simulations would make predictions with higher accuracy if the estimated incident clear-up time information were provided.
Empty cells in area C show that the predictions beyond the 50 min simulation time are not available at 25, 30, and 35 min wall-clock times, as the earlier predictions have been removed and sufficient computational time has not yet passed to allow updated predictions at this point in the time horizon. Finally, it is seen that, when the impact from the incident disappears at approximately 40 min wall-clock time, the ad hoc simulation can adjust predictions to reflect this new data (area D).
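The empty cells in area C follow from how predictions are removed on rollback and refilled only as the LP recomputes them. A minimal sketch of such a store is given below; the class, its methods, and the keying by LP, link, and simulation minute are assumptions for illustration and do not describe the actual space-time memory implementation.

```python
class SpaceTimeMemory:
    """Toy store of predictions keyed by LP, link, and simulation time."""

    def __init__(self):
        self._store = {}  # (lp_id, link_id) -> {sim_time_min: value}

    def put(self, lp_id, link_id, sim_time_min, value):
        """Store or overwrite one prediction."""
        self._store.setdefault((lp_id, link_id), {})[sim_time_min] = value

    def get(self, lp_id, link_id, sim_time_min):
        """Return the stored prediction, or None if the cell is empty."""
        return self._store.get((lp_id, link_id), {}).get(sim_time_min)

    def rollback(self, lp_id, from_sim_time_min):
        """Remove every prediction of lp_id at or after the rollback time."""
        for (owner, _link), series in self._store.items():
            if owner == lp_id:
                for t in [t for t in series if t >= from_sim_time_min]:
                    del series[t]


memory = SpaceTimeMemory()
for t in range(10, 81, 5):                   # initial predictions out to 80 min
    memory.put("LP 7", "segment 1", t, 350.0)

memory.rollback("LP 7", 25)                  # rollback invalidates 25 min onward
memory.put("LP 7", "segment 1", 25, 700.0)   # recomputed so far: 25 min only

print(memory.get("LP 7", "segment 1", 20))   # kept from before the rollback
print(memory.get("LP 7", "segment 1", 25))   # updated prediction
print(memory.get("LP 7", "segment 1", 50))   # empty until recomputed
```

In this sketch the 50 min cell stays empty until the LP has had enough computational time to simulate that far again, mirroring the gap seen at the 25, 30, and 35 min wall-clock times.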

Conclusions and Limitations
As more sensor data becomes available, it opens new opportunities for various transportation analysis techniques including traffic data acquisition using image recognition technology, (23)(24)(25) real-time travel time data collection, (26)(27)(28)(29)(30) and monitoring infrastructure or workers in construction areas. (31,32) In this paper, we present the application of real-time sensor data to the evaluation of the performance of ad hoc distributed traffic simulation. The real-time field data would be replaced with the streaming sensor data in a field implementation. The first scenario examined how adequately the system captures changes in traffic conditions when the sensor reports a sudden increase and subsequent decrease in traffic volume under uncongested traffic conditions. The second scenario investigated how well the ad hoc distributed traffic simulation operates when a traffic incident occurs.
It was found that the proposed ad hoc distributed traffic simulation with real-time sensor data is capable of capturing dynamically changing traffic conditions in both the peak traffic and incident scenarios. In both scenarios, the prediction accuracy drops when the traffic state changes. Additional performance degradation is seen in the incident scenario, since the predictions are produced on the basis of the assumption that current traffic conditions continue, i.e., potential incident clearing is not assumed. However, for immediate future predictions, the proposed simulation presents a relatively good prediction capability.
All steps described above are expected to have a positive contribution to the robustness of the model. However, the output of the model should be validated with the field data. From the validation results, the model can be calibrated more to increase the performance. Various statistical tests should be employed to compare the field data and the predictions from the model.
In the algorithmic approach, predictions are made on the basis of the assumption that current traffic conditions will continue. For example, in the incident scenario, when the congestion from the incident builds up, the predicted delay will continue to increase over the entire prediction horizon, regardless of the potential future clearing of the incident. Incorporating outside source information, such as the expected incident clear-up time and planned event information, may improve the prediction accuracy.
Additionally, the proposed model is developed on the basis of the assumption of a perfect communication environment between sensors and the data management center. Communication errors, including message loss from a sensor, messages arriving in reverse order, and messages exceeding a buffer limit, should be examined for a successful field implementation of the model.