Comparing Measured Driver Behavior Distributions to Results from Car-Following Models using SUMO and Real-World Vehicle Trajectories from Radar

Abstract: In this study, the physical principles governing car-following (CF) behavior and their impact on traffic flow at signalized intersections are investigated. High temporal-resolution radar data is used to provide valuable insights into actual CF behavior, including acceleration, deceleration, and time headway distribution. Demand-calibrated SUMO simulations are run using empirical CF parameter distributions, and three CF models are evaluated: IDM, EIDM, and Krauss. By emulating radar data in SUMO and processing simulated vehicle traces, discrepancies between empirical and simulated parameter distributions are identified. Further analysis includes comparisons with default SUMO CF model parameters. The findings reveal that measured accelerations differ from CF model parameter accelerations and that using the empirical value (µ = 0.89 m/s²) leads to unrealistic simulations that fail volume-based calibration. Default parameters for all three models reasonably approximate the mean and median of measured parameters, but fail to capture the true distribution shape, partly due to the homogeneity of the fleet when using default parameters. The results show that it is more effective to simulate with the default parameters provided by SUMO than to use measurements of real-world distributions without additional calibration. Future work will investigate closing the loop between the measured real-world and SUMO distributions using traditional calibration tactics, as well as assess the impact of calibrated vs. default CF parameters on simulation outputs like fuel consumption.


Introduction
The modeling of car-following behavior is a crucial component of traffic micro-simulation, as it captures the longitudinal interactions between drivers and their preceding vehicles. Car-following (CF) models have been studied extensively by researchers since the 1950s, with early models proposed by Reuschel [1] and Pipes [2]. Over the years, the CF model space has evolved into several subcategories, including mathematical or engineering models, data-driven models, and hybrid models [3]. While the latter two modeling techniques have gained significant research attention, the use of mathematical models such as the intelligent driver model (IDM) [4] is still prevalent in traffic micro-simulation software such as SUMO [5]. Traffic micro-simulation has many use cases, but regardless, the reliability of the insights gained from simulation depends entirely on the model's ability to represent the underlying real-world traffic network [6].
To gain confidence in model outputs, a process of calibration and validation is required [6], [7], where each of traffic micro-simulation's sub-models, including the CF model, is optimized. There are several types of calibration referenced in literature, but they can generally be summarized in three categories: capacity calibration, route/demand calibration, and individual trajectory-based calibration [6], [8]. The first two calibration categories typically rely on aggregate measures, such as loop detector counts, travel times, and saturation flow rates. Although widely used, these measures may not always accurately reflect real-world vehicle behavior [8], [9]. Moreover, studies have shown that different CF model behaviors can lead to similar travel times, saturation flow, delays, and queues [10]. This lack of uniqueness in the CF model parameters that best fit the observed traffic measures has been previously studied [11] and can be problematic in estimating the emission and energy consumption of a traffic network using micro-simulation. This is because factors such as acceleration, deceleration, jerk, and speed are highly influential in vehicle-level emissions and fuel consumption, which are difficult to estimate accurately without considering individual vehicle trajectories [10], [12].
Fuel consumption estimation is given as an example to emphasize the importance of the CF model parameters themselves. They are a central part of traffic simulation, yet it is difficult to find consensus on the correct parameter settings and/or range of settings. The default values in SUMO can differ substantially from calibration literature (e.g., the SUMO default acceleration is 2.6 m/s², whereas commonly cited calibration literature ranges from 1.00 to 1.58 m/s² [13]). Further, the vast majority of calibration literature relies on the NGSIM dataset, which was collected in 2005 [14], even though it is known that CF parameters evolve over time or change depending on geography and locality [15].
The solution would seem to require that practitioners calibrate their own CF models; however, the complexity of performing trajectory-based calibration in practice should not be understated. It is not as simple as measuring accelerations or velocity in the field, as many CF model parameters cannot be derived from macroscopic measurement or do not have physical equivalents [16]. Instead, the practitioner must collect high-resolution trajectory data, clean and process the data, extract leader-follower pairs, and then use computationally expensive optimization methods, such as genetic algorithms, to find the correct CF parameter settings. Recent studies have shown that state-of-the-art CF model calibration still results in significant error [17], and according to Ossen and Hoogendoorn "calibration based on real trajectory data turns out to be far from trivial" [18]. On the topic of fuel consumption, sensitivity analyses show that CF model parameters are important in individual trajectory estimation and subsequent fuel and emissions estimation, but their importance diminishes as more vehicles are considered in aggregate, and average parameters may be sufficient [8].
The review of literature leads practitioners to the following conundrum: it is understood that CF model parameters are important tuning knobs for calibrated outputs, but without high-resolution and complete trajectory information for the network, what should the parameter settings be? This study aims to provide additional context to this problem using data from a real-world network outfitted with radars that record vehicle positions and speeds; such sensors (along with cameras) are becoming more popular with the rise of intelligent traffic systems (ITS). While literature has shown that these partial trajectories may be insufficient for traditional CF model calibration [17], it is still possible to extract distributions of observed parameters such as vehicle acceleration, headway, and free-flow speed. The efficacy of using these measured distributions as the CF model parameters is explored, in addition to the ability of the SUMO CF models with their default parameters to recreate the measured distributions.

Car Following Models
Intelligent Driver Model:
The IDM CF model was first proposed by Treiber, Hennecke and Helbing in 2000 [4]. The IDM determines the acceleration of the follower vehicle, $\dot{v}_f$, as a function of current velocity, $v_f$, the distance to the leading vehicle, $s$, and the difference in velocity between the leader and follower, $\Delta v$ (with $\Delta v = v_f - v_l$ in the standard IDM convention). The acceleration, $\dot{v}_f$, at any time $t$ is written as

$$\dot{v}_f = a\left[1 - \left(\frac{v_f}{v_0}\right)^{\beta} - \left(\frac{s^*(v_f, \Delta v)}{s}\right)^{2}\right] \tag{1}$$

where $s^*$ is the desired minimum following gap, $v_0$ is the free-flow speed on the road, $\beta$ is a tuning acceleration exponent, and $a$ is the maximum follower acceleration. The target simulation network described in Section 3.2 has varied speed limits, thus a static $v_0$ is not applicable. SUMO instead models the desired velocity as a speed factor, $SF_v$, which is a multiplier on the speed limit ($v_0 = SF_v \cdot \text{speed limit}_f$), making the equation

$$\dot{v}_f = a\left[1 - \left(\frac{v_f}{SF_v \cdot \text{speed limit}_f}\right)^{\beta} - \left(\frac{s^*(v_f, \Delta v)}{s}\right)^{2}\right] \tag{2}$$

where $\text{speed limit}_f$ is the follower vehicle's applicable speed limit. The following gap is formulated as a function of current velocity and the difference between the leader's and follower's velocities, and is given as

$$s^*(v_f, \Delta v) = s_0 + v_f \tau + \frac{v_f \, \Delta v}{2\sqrt{a b}} \tag{3}$$

where $a$ is again the follower's maximum acceleration, $b$ is the follower's maximum deceleration, $\tau$ is the minimum time headway, and $s_0$ is the minimum space between the follower and lead vehicle. In total, the IDM has 6 tuning parameters $\{a, b, \tau, s_0, SF_v, \beta\}$, with [13] fixing $\beta = 4$, reducing the dimensionality to 5.
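As a minimal sketch of the IDM as described above, the following Python functions evaluate the desired gap and the resulting follower acceleration in the standard textbook form. The function names are hypothetical, the approach rate is assumed to be follower speed minus leader speed, and the default arguments are illustrative only (the acceleration of 2.6 m/s² is the SUMO default quoted in the introduction). SUMO's internal implementation may differ in detail.

```python
import math


def idm_desired_gap(v_f, dv, a, b, tau, s0):
    """Desired minimum following gap s*; dv = v_f - v_l is the approach rate (assumed sign)."""
    return s0 + v_f * tau + v_f * dv / (2.0 * math.sqrt(a * b))


def idm_acceleration(v_f, v_l, s, speed_limit,
                     a=2.6, b=4.5, tau=1.0, s0=2.5, sf_v=1.0, beta=4.0):
    """Follower acceleration in m/s^2 (illustrative default parameter values).

    v_f, v_l    : follower and leader speeds in m/s
    s           : current gap to the leader in m (must be > 0)
    speed_limit : follower's applicable speed limit in m/s
    sf_v        : speed factor, so the desired speed is sf_v * speed_limit
    """
    v0 = sf_v * speed_limit
    s_star = idm_desired_gap(v_f, v_f - v_l, a, b, tau, s0)
    return a * (1.0 - (v_f / v0) ** beta - (s_star / s) ** 2)
```

With a very large gap the interaction term vanishes, so a stopped vehicle accelerates at roughly `a` and a vehicle already at its desired speed has roughly zero acceleration.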

Krauss Model:
The default CF model in SUMO is Krauss's [19], [20]. It is a collision-free model, with each follower vehicle having a safe following speed, $v_{safe}$, computed at every simulation step from the velocity, $v_f$, using the following equation:

$$v_{safe}(t) = v_l(t) + \frac{g(t) - v_l(t)\,\tau}{\dfrac{v_l(t) + v_f(t)}{2b} + \tau} \tag{4}$$

where $t$ is the simulation time, $v_l(t)$ is the velocity of the lead vehicle at time $t$, $g(t)$ is the gap between the vehicles, $\tau$ is the reaction time of the driver, and $b$ is the deceleration function [21]. Because the acceleration should be bounded by the physical limitations of the vehicle, the actual desired speed, $v_{des}(t)$, is calculated as:

$$v_{des}(t) = \min\left[v_0,\; v_f(t) + a\,\Delta t,\; v_{safe}(t)\right] \tag{5}$$

where $a$ is the maximum acceleration capability of the vehicle, $\Delta t$ is the simulation step length, and $v_0$ is the maximum speed that the driver would drive on the road according to Equation 2. The maximum acceleration parameter does not represent the physical limit of the vehicle, but rather the maximum acceleration that a driver would choose [21]. As in the IDM, $a$, $b$, and $\tau$ are available as tuning parameters.
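The safe-speed logic described above can be sketched as follows. This is a simplified illustration in the commonly published form of the Krauss model, not SUMO's actual implementation: the function names are hypothetical, the deceleration is treated as a constant, the default values are illustrative, the driver imperfection (random dawdling) term is omitted, and the clamp to non-negative speed is an added assumption.

```python
def krauss_safe_speed(v_f, v_l, gap, tau=1.0, b=4.5):
    """Safe following speed in m/s for a follower at v_f behind a leader at v_l.

    gap is the current distance to the leader in m, tau the driver reaction
    time in s, and b a constant deceleration in m/s^2 (assumed constant here).
    """
    return v_l + (gap - v_l * tau) / ((v_l + v_f) / (2.0 * b) + tau)


def krauss_desired_speed(v_f, v_l, gap, v0, a=2.6, b=4.5, tau=1.0, dt=0.1):
    """Speed for the next step: bounded by desired speed, acceleration, and safety."""
    v_des = min(v0, v_f + a * dt, krauss_safe_speed(v_f, v_l, gap, tau, b))
    return max(0.0, v_des)  # assumption: speeds are clamped to be non-negative
```

When the follower travels at the leader's speed with a gap equal to `v_l * tau`, the safe speed equals the current speed, which is the steady-state following condition of the model.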

Extended Intelligent Driver Model:
The extended intelligent driver model (EIDM) was proposed by Salles, Kaufmann, and Reuss in 2020 to more accurately model human driving behavior, especially drive-off trajectories [22]. The model is based on the IDM but includes many improvements from both Treiber ([23], [24]) and the authors themselves. For brevity, the equations for the EIDM are not listed in this work; rather, the reader is referred to the referenced work. Like the IDM and the Krauss model, it contains tunable acceleration, deceleration, and headway parameters.

Radar Processing
The trajectory data was collected using radars from InnoSenT GmbH. The radar used for trajectory processing is located on the north side of the west-most traffic signal (TL1) in Figure 1 and captures the west-bound approach, as well as the east-bound departure. The radars can report vehicle position, velocity, and vehicle length every 50 ms; however, due to internet network speed restrictions, data for individual trajectories is actually recorded with a period of 100-200 ms. Vehicle positions are reported in the radar's coordinate system and must be transformed to match the positions to the underlying road network. Vehicles that enter the radar's field of view are assigned a unique identifier, making it easy to extract the complete trajectory of a vehicle as it passes through the radar's field of view. However, filtering the trajectories is necessary, as trajectories can have discontinuities, the radar can identify non-vehicle objects as vehicles (data points in the grass in Figure 1), and the object ID for a vehicle can switch while the vehicle is still in the sensing range.
The radars are used for two tasks in this work: extraction of CF behavior from the velocity profiles and volume calculation. To deal with radar data issues in the context of velocity profile processing, a polygon is placed around the west-bound approach to TL1 in Figure 1. This region has high data integrity and low probability of interference. The trajectory data is filtered so that only vehicles which pass completely through the box are considered. Their trajectories are truncated to only include the intra-box data and then processed according to the methodology in Section 3.3. The other task is volume calculation, which uses the radars' positional information to count vehicles that cross the stop bar at all of TL1's approaches. Filtering the radar data with the aforementioned strategy could introduce bias into the dataset, as it only keeps vehicles for which the radar is able to maintain a steady track; however, lost tracks typically result when vehicles become obstructed from the radar.
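The box filtering described above can be illustrated with a small sketch. The code below uses a standard ray-casting point-in-polygon test and keeps only the intra-box samples of a trajectory. The function names and the tuple layout `(t, x, y, v)` are hypothetical, and "passes completely through the box" is interpreted here, as an assumption, to mean that the track starts and ends outside the polygon with at least one sample inside.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_to_box(trajectory, polygon):
    """Return the intra-box samples of a trajectory of (t, x, y, v) tuples.

    An empty list is returned when the vehicle did not pass completely
    through the box (assumed: first and last samples lie outside it).
    """
    flags = [point_in_polygon(x, y, polygon) for _, x, y, _ in trajectory]
    if not any(flags) or flags[0] or flags[-1]:
        return []
    return [sample for sample, keep in zip(trajectory, flags) if keep]
```

The same routine could be applied unchanged to simulated floating car data, provided the positions are expressed in the same coordinate system as the polygon.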

SUMO Simulation
The simulated SUMO network represents a two-intersection corridor in Tuscaloosa, Alabama, primarily consisting of US-82 (McFarland Blvd) between Airport and Harper Roads. Figure 1 shows the SUMO model of the target network overlaid on geo-located satellite images. The callout displays the location of the radar, as well as a sample of its geo-located data and the box used to filter trajectories. Both intersections in the network are signalized (marked as TL1 and TL2 in Fig. 1), and the field controllers were emulated using NEMA controllers in SUMO with matching configurations [25]. Both operated in coordinated mode during the simulated period of time, with actuation on the non-coordinated phases.
The polygon used to filter the real-world data was also integrated into the geo-located SUMO simulation. By combining the floating car data output of SUMO with polygon-based filtering, a data set was obtained that closely resembles the data described in Section 3.1. The floating car output was processed using the same methodology that was applied to the radar data. Simulation demand is generated using both the radar displayed in Figure 1 and a radar on the other side of the intersection that captures the east-bound approaches. Data was collected from the radars on January 13th, 2023, and the simulation time spans from 5:00 AM to 11:00 PM on the target day, covering periods of low volume as well as the morning rush around 8 AM, which is apparent in Figure 2. The box used for radar filtering corresponds to the west-bound straight volume.
To turn the radar data into traffic counts, the number of tracked objects that cross the intersection stop bar for each approach is aggregated into five-minute intervals. Additionally, turn count restrictions are put in place at TL2 so that the majority of traffic traverses the entire network on US-82 (the major east/west road). The volumes and turn counts are passed as an input to routeSampler [26] with the Poisson flow option, which is chosen to increase the randomness of the simulation. The calibrated simulation is evaluated against the GEH metric [27], which compares simulated volumes to observed volumes using the following equation:

$$GEH_i^t = \sqrt{\frac{2\left(M_i^t - C_i^t\right)^2}{M_i^t + C_i^t}} \tag{6}$$

where $M_i^t$ is the simulated hourly volume at location $i$ and aggregation period $t$, and $C_i^t$ is the corresponding measured hourly volume. Applying Equation 6 to the simulation output results in a 10-minute GEH < 5 at 94% of time window-location pairs when simulating with SUMO default CF model parameters, which passes the common GEH target of < 5 at 85% of locations [6] and is visually represented in Figure 2.
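A short sketch of the GEH check is given below, using the standard GEH definition. The function names are hypothetical, and the inputs are assumed to be volumes already expressed as hourly rates, as the metric requires.

```python
import math


def geh(simulated, measured):
    """GEH statistic for one location and time window (hourly volumes)."""
    if simulated + measured == 0:
        return 0.0  # no traffic in either source: treat as a perfect match
    return math.sqrt(2.0 * (simulated - measured) ** 2 / (simulated + measured))


def geh_pass_rate(pairs, threshold=5.0):
    """Fraction of (simulated, measured) pairs with GEH below the threshold."""
    scores = [geh(m, c) for m, c in pairs]
    return sum(score < threshold for score in scores) / len(scores)
```

A calibration would pass the common target quoted above when `geh_pass_rate` returns at least 0.85 over all time window-location pairs.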

Acceleration & Deceleration
A sample of the raw and processed radar trajectories is shown in Figure 3. The left figure is a time-space diagram of west-bound vehicles as they traverse the network. The right plot shows a piece-wise linear fit overlaid on the raw velocity data of the red vehicle in the time-space diagram. From each segment a fit quality, R², slope, and duration are obtained for further processing. The vehicle trajectory shown in Figure 3 captures a sample of an accelerating and decelerating vehicle.
A graphical representation of the impact of minimum time and fit quality thresholds on the distribution of measured parameters is depicted in Figure 4. The top row shows contours of median acceleration and the number of unique vehicles as a function of minimum time and R² threshold. The bottom row shows the measured distributions for acceleration and deceleration, with the label and color corresponding to the annotated positions on the top row. Based on the figure, location 5 was chosen as the final aggregation setting, with a minimum time of 1 second and R² > 0.95, due to the high confidence in the linear fit of the trajectory while still containing enough samples to be representative of the population. The resulting distributions are presented quantitatively later in Table 1. The distributions of acceleration and deceleration determined in Figure 4 are right-skewed due to the low mean acceleration values and because each parameter has a lower bound of 0. Consequently, median acceleration was considered the primary metric for evaluating threshold selection, as it best represents the central tendency of the acceleration and deceleration values retained under each R² threshold.
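To make the segment-level quantities concrete, the sketch below fits a straight line to one velocity segment by ordinary least squares and applies the minimum-time and R² thresholds. It assumes the piece-wise segmentation (the choice of breakpoints) has already been done by a separate step that is not shown, and the function names are hypothetical.

```python
def fit_segment(times, speeds):
    """Least-squares line through (time, speed) samples of one segment.

    Returns (slope, r_squared, duration); slope is the acceleration in
    m/s^2 when times are in s and speeds in m/s. Needs >= 2 distinct times.
    """
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(speeds) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, speeds))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, speeds))
    ss_tot = sum((v - v_mean) ** 2 for v in speeds)
    # A perfectly flat segment has no variance to explain; report R^2 = 0.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, r_squared, times[-1] - times[0]


def keep_segment(r_squared, duration, min_time=1.0, min_r2=0.95):
    """Apply the thresholds chosen in the text (1 s minimum, R^2 > 0.95)."""
    return duration >= min_time and r_squared > min_r2
```

Segments with a positive slope would then contribute to the acceleration distribution and segments with a negative slope to the deceleration distribution.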

Headway & Free-Flow Speed
To determine the headway of vehicles in the radar data, a mapping process was performed to assign each vehicle to its respective lane. This was achieved by dividing the box illustrated in Figure 1 into two separate segments, thereby enabling lane identification. The arrival time of each vehicle at distances of 100 m, 60 m, and 40 m from the stop bar was then calculated using linear interpolation. Vehicles that switched lanes during the data collection period were filtered out. The remaining vehicles were sorted by time and leader-follower pairs were identified for each lane. The headway was calculated as the average time difference between the leader and follower at each of the three aforementioned distances. Headway times of longer than 5 seconds were deemed outside of the CF regime, in line with prior literature [10], [28], [29].
The derived acceleration and headway data was utilized to determine the free-flow speed of the network. However, it was noted that the measured range was signalized and a simple average would be significantly impacted by stopped vehicles. To address this issue, filtering was employed in a manner similar to prior literature [10]. The method considered only vehicles with a time headway greater than 5 seconds and excluded vehicles with acceleration or deceleration greater than 1 m/s². As can be inferred from the sample of raw data in Fig. 3, in just a few minutes of data there are large numbers of free-flowing vehicles and leader/follower pairs from which to extract the desired behavioral distributions.
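The headway computation can be sketched as follows for a single leader-follower pair in one lane. The function names and the `(time, distance_to_stop_bar)` sample layout are assumptions made for illustration; the distance is assumed to decrease as the vehicle approaches the stop bar.

```python
def arrival_time(trajectory, distance):
    """Linearly interpolated time at which a vehicle is `distance` m from the stop bar.

    trajectory is a list of (t, d) samples with d decreasing over time.
    Returns None if the trajectory never crosses the requested distance.
    """
    for (t0, d0), (t1, d1) in zip(trajectory, trajectory[1:]):
        if d0 >= distance >= d1 and d0 != d1:
            return t0 + (d0 - distance) * (t1 - t0) / (d0 - d1)
    return None


def mean_headway(leader, follower, distances=(100.0, 60.0, 40.0)):
    """Average time gap between leader and follower over the reference distances."""
    gaps = []
    for d in distances:
        t_leader = arrival_time(leader, d)
        t_follower = arrival_time(follower, d)
        if t_leader is None or t_follower is None:
            return None
        gaps.append(t_follower - t_leader)
    return sum(gaps) / len(gaps)
```

Pairs whose `mean_headway` exceeds 5 seconds would be treated as outside the CF regime, following the threshold stated above.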

Vehicle Distribution Creation
In the context of SUMO simulation, vehicle CF attributes are defined by the vehicle type attribute, which can be assigned to each individual vehicle entering the network, or specified in a vehicle type distribution file from which SUMO samples when generating routes. The presence of individual vehicle data in the radar representation provides the opportunity to construct a heterogeneous vehicle distribution through sampling, either correlated or uncorrelated.
The correlation of parameters in the simulation of traffic flow has garnered attention in the literature, as it has been shown to impact the realism of simulation results [30], [31]. To create a correlated distribution from the radar data, only vehicles with a recorded acceleration, a recorded deceleration, and a headway of less than 5 seconds are considered. These vehicles are used to create synthetic vehicles by combining the acceleration and deceleration events, resulting in a set of acceleration, deceleration, and headway values that all derive from a single vehicle, in addition to the corresponding vehicle length, which is measured by the radar. However, since the vehicle is in the CF regime, its matching speed factor is not available. In this scenario, the speed factor is sampled from the overall distribution. There are 1135 vehicles in the radar dataset that make a total of 1290 accelerations (refer to Section 3.3.1 for further details). Similarly, 682 vehicles decelerate, resulting in a total of 1193 recorded decelerations. Additionally, there were 1831 vehicles with headways less than 5 seconds, out of which 48 vehicles were present in both the acceleration and deceleration datasets. Because of this difference, due largely to the dramatically reduced sample size, the correlated model parameters were not used further in this work. Future work will expand the raw data recording period and perhaps result in a large enough sample to apply correlated sampling for comparison. In this study, however, only the uncorrelated case was considered in Section 4. In the uncorrelated case, the acceleration, tau, deceleration, speed factor, and length columns are independently sampled from the distributions of measured parameters.
The study also investigated the use of acceleration and deceleration measures that occur outside the CF regime. This is because both the Krauss and IDM models' representation of CF acceleration is intended to reflect the maximum follower acceleration, which could emerge in the absence of a leader vehicle. However, the measured distributions of parameters outside of the CF regime were found to not significantly differ from those within the CF regime according to the Mann-Whitney test (U = 4.21e5, p = 0.43), and thus were not included in the evaluation.
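The uncorrelated case can be sketched as below: each parameter column is drawn independently from its own list of measured values, and the draws are written out as a SUMO `vTypeDistribution` element. The helper names, the distribution id, and the sample values are hypothetical; the element and attribute names (`vType`, `accel`, `decel`, `tau`, `speedFactor`, `length`, `carFollowModel`, `probability`) are standard SUMO vehicle type attributes. This is an illustration of the idea rather than the tooling used in the study.

```python
import random
import xml.etree.ElementTree as ET

PARAMS = ("accel", "decel", "tau", "speedFactor", "length")


def sample_uncorrelated(measured, n, seed=0):
    """Draw n synthetic vehicles, sampling every parameter independently.

    measured maps each name in PARAMS to a list of observed values.
    """
    rng = random.Random(seed)
    return [{p: rng.choice(measured[p]) for p in PARAMS} for _ in range(n)]


def build_vtype_distribution(samples, dist_id="radar_fleet", cf_model="IDM"):
    """Build a <vTypeDistribution> element with one equally likely vType per sample."""
    root = ET.Element("vTypeDistribution", {"id": dist_id})
    for i, sample in enumerate(samples):
        attrib = {"id": f"{dist_id}_{i}", "carFollowModel": cf_model, "probability": "1"}
        attrib.update({p: f"{sample[p]:.2f}" for p in PARAMS})
        ET.SubElement(root, "vType", attrib)
    return root
```

Calling `ET.tostring(build_vtype_distribution(samples), encoding="unicode")` yields XML text that could be placed inside an additional file. A correlated variant would instead draw whole rows (one measured vehicle at a time) rather than independent columns.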
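For readers unfamiliar with the test, a minimal pure-Python sketch of the Mann-Whitney U statistic is shown below. It uses midranks for ties and a two-sided p-value from the normal approximation without tie or continuity correction, so it is an approximation suited to large samples; the function name is hypothetical and this is not necessarily the routine used in the study. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for sample x, approximate two-sided p-value)."""
    combined = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of tied values and give them the average rank.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p
```

A large p-value, as reported above, indicates no evidence that accelerations measured inside and outside the CF regime come from different distributions.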

Results
The constructed distributions of CF parameters were simulated in SUMO Version 1.16.0 with a simulation step size of 0.1 s. Two experiments were conducted, one with the default CF model parameters and one with parameters sampled from the measured distributions. In each experiment, the models had an actionStepLength of 0.2 s. All vehicles were simulated as passenger cars with the emissions and fuel consumption model PHEMlight/PC_G_EU4 [32].
To ensure the robustness of the findings, each CF model discussed in Section 2 (IDM, EIDM, and Krauss) was simulated 30 times in each of the two experiments, only varying the random seed. Subsequently, the output of all 180 simulations (3 models × 2 parameter sets × 30 seeds) was processed according to the methods described in Section 3.3.

Simulation Comparison
In Figure 5, the empirical cumulative distribution functions (eCDFs) obtained by simulating with both the default parameters of a CF model and the sampled parameters are presented. The colors in the figure are paired such that the darker shade represents the sampled parameters and the lighter shade represents the default parameters. It is immediately apparent that neither the sampled nor the default parameters accurately reproduce the empirical distributions of acceleration and deceleration. Nonetheless, the default models do perform reasonably well at modeling headway and free-flow speed, although they still under-predict the speed and fail to capture lower-headway vehicles.
The results are summarized in Table 1, with the mean (µ) and 50th percentiles (P50%). The standard deviations of the distributions are not presented, as the majority cannot be approximated by a Gaussian distribution. Also included in Table 1 is the resulting fuel consumption per vehicle. In each row the bolded numbers represent the closest to the real world. The default parameters perform best for all mean and median cases, with the mean acceleration of Krauss being 0.05 m/s² away from the measured acceleration mean. Krauss also approximates the free-flow speed the best. For deceleration, the default IDM has the closest mean and median (0.14 m/s² and 0.03 m/s² away, respectively). The default IDM also performs the best at headway estimation. However, the default parameters fail to capture the tails of the distributions well, which is understandable given that the default parameters produce a homogeneous fleet: every car that enters the network has the same set of parameters in the default case.
The explanation for the poor performance of the sampled parameters is twofold. First, the fact that CF model parameters are not physical or cannot simply be found via macroscopic measurements has been discussed in literature [8], [11], [16]. Second, and perhaps more importantly in the context of SUMO, the low acceleration measured in the real-world data (P50% = 0.77 m/s²) does not result in realistic traffic flows when applied to simulation. Long queues develop in turn lanes, gridlock ensues, and GEH calibration fails in the west-bound straight approach. This disruption of regular traffic flow causes elevated fuel consumption per vehicle in the sampled simulations. Due to the complexity of isolating the specific contributions of the parameters from the fuel consumption resulting from congestion, the findings of the fuel consumption analysis are not presented in this study. Future work must address this issue, as the use of micro-simulation tools such as SUMO to study traffic control optimization for energy reductions will rely on accurate predictions of driver behavior and driving trajectories beyond typical measures of vehicles per hour or average speed.
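The summary statistics used in this comparison are straightforward to compute. The sketch below builds an eCDF, reports the mean and median, and picks the simulated result closest to a measured reference. The function names and the example values in the usage note are hypothetical and are not taken from Table 1.

```python
import statistics


def ecdf(values):
    """Return sorted values and their cumulative probabilities (i / n)."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def summarize(values):
    """Mean and median (50th percentile) of a sample."""
    return statistics.mean(values), statistics.median(values)


def closest_to_measured(measured_value, simulated):
    """Name of the simulated configuration whose value is nearest the measurement.

    simulated maps a label such as 'IDM default' to a summary value.
    """
    return min(simulated, key=lambda name: abs(simulated[name] - measured_value))
```

For a right-skewed sample such as `[0.5, 0.6, 0.7, 3.0]`, `summarize` returns a mean (about 1.2) well above the median (0.65), which is why both statistics are reported rather than a mean and standard deviation.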

Summary, Conclusions & Future Work
This work presented the efficacy of using measured distributions of acceleration, deceleration, time headway and free-flow speed as their corresponding parameters in SUMO CF models. The distributions were acquired via processing of data from signal-pole-mounted radar units, which return both the position and velocity of individual vehicles as they approach a signalized intersection. The acceleration, deceleration and free-flow speed were found using a piece-wise linear fit, and headway via linear interpolation. After aggregating the data into corresponding distributions, they were sampled and simulated in SUMO via vehicle distribution files. The SUMO demand was calibrated with the radar data, ensuring that the simulation matched the volume of the measured day. Once simulated, the SUMO floating car data output was processed in an identical manner to the radar data, and the resulting SUMO distributions were compared to the empirical distributions. In addition to the empirical distributions, the CF models in SUMO were also simulated using their default parameters to assess how well the defaults re-created the real distributions.
Based on the study's results, it can be concluded that using only measured accelerations and decelerations, independent of a leader-follower relationship, as a basis for CF model parameters is insufficient. While these measurements can replicate the heterogeneity of traffic, conducting SUMO simulations with a range of low accelerations (P50% = 0.77 m/s²) leads to congested simulations that do not meet basic calibration standards. Due to the ensuing congestion and calibration failure, it becomes infeasible to compare the fuel consumption of simulations with sampled parameters to those with default settings. Consequently, initiating simulations with the default parameters provided by SUMO may prove more advantageous than using the presented measurements without further calibration of the car-following model.
However, there is still a significant gap between the simulated and measured distributions, primarily in acceleration behavior, which should be investigated. Moving forward, the next step is to close the loop between simulation distributions and their real-world counterparts by calibrating the CF model. This can be done through either fitting the simulation distributions to the measured data or analyzing the trajectories themselves. Both methods should be examined and evaluated for their impact, particularly on fuel consumption. Additionally, it may be beneficial to assess how parameters change based on the time of day, volume of traffic, and location in the network, as well as to compare the influence of correlated versus uncorrelated CF model parameters on simulation outputs.

Table 1. Summarizes the eCDFs presented in Figure 5. The simulated value nearest the measured is presented in bold. The columns "Samp." and "Def." correspond to Sampled and Default parameters.