The Effects of Route Randomization on Urban Emissions

Routing algorithms typically suggest the fastest path, or a slight variation of it, to reach a user's desired destination. Although this suggestion is undoubtedly advantageous for the individual user, from a collective point of view the aggregation of all suggested paths may increase the overall impact on the city (e.g., in terms of emissions). In this study, we use SUMO to simulate the effects of incorporating randomness into routing algorithms on emissions, their distribution, and travel time in the urban area of Milan (Italy). Our results reveal that, given the common practice of routing towards the fastest path, a certain level of randomness in routes reduces emissions and travel time. In other words, the stronger the random component in the routes, the more pronounced the benefits, up to a certain threshold. Our research provides insight into the potential advantages of considering collective outcomes in routing decisions and highlights the need to explore further the relationship between route randomization and sustainability in urban transportation.


Introduction
Vehicular mobility plays a pivotal role in global greenhouse gas emissions and in determining the sustainability of the urban environment [1]. Emissions of CO2 from road vehicles were 1.57 billion metric tons in 2012, accounting for 28% of US fossil fuel CO2 emissions [2], causing climate change, heat islands [3], and health-related risks [4]. Traffic congestion, a significant source of CO2 emissions in urban environments [5], may arise from (unintended) miscoordination among drivers, which may be exacerbated nowadays by the massive use of GPS navigation systems. Typically delivered as phone apps, these systems suggest the fastest path to reach a user's desired destination. Although this suggestion is undoubtedly advantageous for the user, especially when exploring an unfamiliar city, the aggregation of all suggested paths may increase the urban impact (e.g., in terms of emissions). Indeed, a recent work shows that the higher the fraction of vehicles following these apps' suggestions, the higher the urban emissions [6]. Several alternative routing algorithms have been introduced, which typically slightly randomize the fastest path to increase route diversity [7]-[11]. However, it remains to be determined to what extent route diversification can help reduce emissions and traffic congestion in urban environments.
This paper provides a method to assess the impact of route randomization on the urban environment using the mobility simulator SUMO and its routing tool duarouter. We investigate the impact of randomized routes on CO2 emissions and travel time in Milan, Italy. By changing the fraction of randomized vehicles for different degrees of randomization, we examine how the distribution of emissions across the roads and the vehicles' travel time change. We find that an optimal randomization degree exists, leading to a 15% reduction in CO2 emissions and an 18% reduction in travel time compared to the baseline case in which there is no path randomization. In particular, the entropy of the CO2 distribution increases with the degree of randomization, meaning that emissions are spread more evenly over the road network. Our study provides valuable insights into the potential benefits of incorporating randomness into route recommendations, as it may increase sustainability in transportation networks. We provide the code and the link to the data to reproduce our study at https://bit.ly/route_randomization_sumo.

Related Work
Computing the shortest (or fastest) path between two given locations in a road network is a widely studied problem in mobility research [12]. The fastest path is the one that minimizes the travel time to reach a desired destination. Although this suggestion is undoubtedly advantageous for the individual user, from a collective point of view the aggregation of all suggested paths may increase the overall impact (e.g., CO2 emissions) [6].
Different works have focused on alternative routing [7], typically formalized as the k-shortest path problem [13], [14], which aims to find the k > 0 shortest paths between an origin and a destination in a network. Cheng et al. [8] show that, in most practical cases, path diversification is crucial when solving the k-shortest path problem, since the generated paths overlap by 99% in terms of road edges. Suurballe [15] proposes another method to generate k-shortest disjoint paths, in which the routes differ considerably from the optimal path, but the travel time and path length also increase considerably. Between the k-shortest paths and the k-shortest disjoint paths lie several approaches that offer a good trade-off between the two.
Liu et al. [9] propose the k-Shortest Paths with Diversity (kSPD) problem, defined as top-k shortest paths that are the most dissimilar with each other and minimize the paths' total length. Given the kSPD problem, Chondrogiannis et al. [10] propose an implementation and a study of the k-Shortest Paths with Limited Overlap (kSPLO), seeking to recommend k-alternative paths that are as short as possible and sufficiently dissimilar based on a similarity threshold defined by the user.
Chondrogiannis et al. in [11] formalize the Dissimilar Paths with Minimum Collective Length (kDPML) problem based on the definition proposed by Liu et al. in [9]. Given two locations on a road network, they compute a set of k paths containing sufficiently dissimilar routes and the lowest collective path length among all sets of k sufficiently different paths.
Cheng et al. [8] generate alternative routes by considering the road network as a weighted graph and distorting the edge weights. They iteratively compute the optimal path, applying a penalty on each edge of the optimal path found in the previous iteration.
Another technique used to generate alternative routes is the plateau method [16]: it builds two shortest-path trees, one from the source and one from the target, and then joins the two trees to obtain the branches in common. These common branches are termed plateaus. Top-k plateaus are selected based on their lengths, and each plateau is used to generate an alternative path by appending the shortest paths from the source to the first edge of the plateau and from the last edge of the plateau to the target.
All existing works validate and evaluate the goodness of their proposals from a theoretical and algorithmic point of view. None of them investigates the impact of route diversification on urban welfare, for example, traffic congestion and pollution. In this paper, we fill this gap and assess the collective impact of alternative routing as randomization of the fastest path on emissions using the mobility simulator SUMO.

Simulation Framework
Our simulation framework is based on SUMO (Simulation of Urban MObility), an agent-based tool that allows for intermodal traffic simulation, including road vehicles, public transport, and pedestrians [17]. SUMO models each vehicle's physics and dynamics, supporting various route choice methods and routing strategies [18].
SUMO requires two elements to simulate traffic: a road network and a traffic demand. The road network describes the virtual road infrastructure where the simulated vehicles move during the simulation. It is a directed graph G = (V, E) in which V represents intersections and E represents roads. The traffic demand describes the vehicles' movement on the road network. A vehicle path may be either a trip or a route. The origin edge, the destination edge, and the departure time define a trip. A route also contains all edges the vehicle passes through.
We control our SUMO simulations through TraCI (Traffic Control Interface) [19], which we use from Python to retrieve the values of simulated objects that are useful for analyzing the simulation, such as a vehicle's trajectory, its speed and acceleration, total CO2 emissions, and fuel consumption.

Mobility Demand
The mobility demand $D = \{T_1, \dots, T_N\}$ is a collection of N trips (one per vehicle) within a city. A single trip $T_v = (o, d)$ is defined by its origin location o and destination location d. To compute D, we first divide the area of interest into a grid of square tiles of a given side. Then, we use real mobility data to compute the flows between the tiles, obtaining an origin-destination matrix M where an element $m_{o,d} \in M$ describes the number of vehicle trips that start in tile o and end in tile d. Finally, we iterate N times the following procedure: we choose a vehicle v's trip $T_v = (e_o, e_d)$ by selecting at random a matrix element $m_{o,d} \in M$ with probability $p_{o,d} \propto m_{o,d}$ and then selecting uniformly at random two edges $e_o, e_d \in E$ within tiles o and d, respectively.
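The sampling procedure can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `sample_trips` and the dictionary layouts for the origin-destination matrix and for the edges contained in each tile are assumptions made for the example.

```python
import random


def sample_trips(od_matrix, edges_by_tile, n_trips, seed=0):
    """Draw n_trips (origin_edge, destination_edge) pairs.

    od_matrix:      dict mapping (origin_tile, destination_tile) -> flow m_{o,d}
    edges_by_tile:  dict mapping tile -> list of edge ids inside that tile
    (both layouts are hypothetical, chosen for this sketch)
    """
    rng = random.Random(seed)
    pairs = list(od_matrix)
    flows = [od_matrix[pair] for pair in pairs]
    trips = []
    for _ in range(n_trips):
        # Pick a tile pair with probability proportional to its flow.
        o_tile, d_tile = rng.choices(pairs, weights=flows, k=1)[0]
        # Pick one edge uniformly at random inside each of the two tiles.
        e_o = rng.choice(edges_by_tile[o_tile])
        e_d = rng.choice(edges_by_tile[d_tile])
        trips.append((e_o, e_d))
    return trips


if __name__ == "__main__":
    od = {("A", "B"): 3, ("B", "A"): 1}
    edges = {"A": ["a1", "a2"], "B": ["b1"]}
    print(sample_trips(od, edges, n_trips=5))
```

Tile pairs with zero flow are never selected, so the sampled demand reproduces the relative flows of M on average.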

Randomized Fastest Path
In graph theory, the shortest path between two nodes is the path that minimizes the sum of the weights of the path's edges. The fastest path is the shortest path considering travel time as the edge cost. We define a randomized fastest path as a nondeterministic distortion of the fastest path. The resulting path should not deviate considerably from the optimal path in terms of length and duration.
We compute the randomized fastest paths using the SUMO tool duarouter, which allows us to compute vehicle routes using different algorithms (e.g., CHWrapper, A*, and Dijkstra) and to specify the degree of path randomization w ∈ [1, +∞).
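As an illustration, a duarouter invocation with a given degree of randomization can be assembled as follows. This is a sketch under assumptions: the file names are placeholders and the helper `duarouter_command` is hypothetical; the options used (`--net-file`, `--route-files`, `--output-file`, `--seed`, `--weights.random-factor`) are standard duarouter options, but the exact command line used in the study is not reported here.

```python
def duarouter_command(net_file, trip_file, output_file, w, seed=42):
    """Build the argument list for a duarouter call (hypothetical helper)."""
    cmd = [
        "duarouter",
        "--net-file", net_file,        # road network
        "--route-files", trip_file,    # trips to be routed
        "--output-file", output_file,  # resulting routes
        "--seed", str(seed),           # makes the randomization reproducible
    ]
    if w > 1:
        # Edge weights are multiplied by a random factor drawn in [1, w).
        cmd += ["--weights.random-factor", str(w)]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(" ".join(duarouter_command("milan.net.xml", "trips.xml", "routes_w10.rou.xml", 10)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where SUMO is installed.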
If w > 1, duarouter uses an edge weight randomization method [8] to dynamically distort edge weights (i.e., travel times) by a random factor drawn uniformly in [1, w). The edge cost distortion is performed every time duarouter computes the fastest path for a vehicle; hence, two vehicles with the same origin and destination may be assigned to two different randomized fastest paths (see Figure 1). The edge weight considered by duarouter is the expected travel time, estimated for each edge as its length divided by the maximum speed allowed on that edge. Duarouter randomizes the edge weight f(e) of an edge e using a function $f_{dua}(e)$ defined as:
$$f_{dua}(e) = f(e) \cdot U(1, w)$$
where w is the degree of randomization and U(1, w) is a random variable drawn uniformly in [1, w). Note that for w = 1 there is no randomization. Furthermore, the higher w, the more randomness is introduced into the computation of the fastest path, and the more (on average) the path deviates from the fastest path (see Figure 2). Given an origin location o, a destination location d, and a random weight factor w, we define the sequence of SUMO edges computed with duarouter as DR((o, d), w).
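To make the idea concrete, the following sketch applies the same kind of weight distortion inside a plain Dijkstra search on a toy graph. It is not duarouter's implementation: the adjacency-list format and the function name `randomized_fastest_path` are assumptions made for the example.

```python
import heapq
import random


def randomized_fastest_path(graph, origin, destination, w=1.0, rng=None):
    """Dijkstra on travel times distorted by a factor drawn uniformly from [1, w].

    graph: dict mapping node -> list of (neighbor, travel_time) pairs
    (hypothetical toy format). Returns the node sequence, or None if unreachable.
    """
    rng = rng or random.Random()
    dist = {origin: 0.0}
    prev = {}
    settled = set()
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        if u == destination:
            break
        for v, travel_time in graph.get(u, []):
            # Each edge is relaxed once per query, so its weight is distorted once.
            factor = rng.uniform(1.0, w) if w > 1 else 1.0
            candidate = d + travel_time * factor
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    if destination not in dist:
        return None
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    toy = {"A": [("B", 1.0), ("C", 1.5)], "B": [("D", 1.0)], "C": [("D", 1.0)], "D": []}
    print(randomized_fastest_path(toy, "A", "D", w=1))  # undistorted fastest path
    print(randomized_fastest_path(toy, "A", "D", w=5, rng=random.Random(1)))
```

With w = 1 the search always returns the undistorted fastest path; with w > 1 repeated queries for the same origin and destination can return different routes, which is the behavior described above.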

Non-randomized and Randomized Traffic Demands
We derive two types of traffic demands based on a given mobility demand D: the non-randomized traffic demand and the randomized traffic demand.
The non-randomized traffic demand, NR, is a collection of N routes that link the origin to the destination of each trip in D using the fastest path (i.e., using duarouter with w = 1):
$$NR_D = \{DR(T_v, 1) \mid T_v \in D\}$$
The randomized demand, R, is the collection of N routes connecting the origin to the destination of each trip in D using randomized fastest paths (i.e., using duarouter with w > 1):
$$R_{D,w} = \{DR(T_v, w) \mid T_v \in D\}$$
Given a mobility demand D, we compute $NR_D$ and $R_{D,w}$ for each w ∈ W, where W is the set of randomization factors to study.

Traffic Simulation
We use TraCI to collect edge- and vehicle-related measures such as total travel time, emissions (CO2, PM, and NOx), and fuel consumption. We use the HBEFA3/PC_G_EU4 emission model [20], which estimates the vehicle's instantaneous emissions at a trajectory point j as [21]:
$$E(j) = c_0 + c_1 s a + c_2 s a^2 + c_3 s + c_4 s^2 + c_5 s^3$$
where s and a are the vehicle's speed and acceleration at point j, respectively, and $c_0, \dots, c_5$ are parameters that depend on the emission type and the vehicle and are taken from the HBEFA database.
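As a worked illustration, the instantaneous emission can be evaluated as a polynomial in speed and acceleration. The sketch assumes the polynomial form commonly given for SUMO's HBEFA-based model, E = c0 + c1·s·a + c2·s·a² + c3·s + c4·s² + c5·s³; the coefficients in the usage example are made up for the arithmetic and are not HBEFA values.

```python
def instantaneous_emission(s, a, c):
    """Evaluate the emission polynomial at speed s and acceleration a.

    c is a sequence (c0, ..., c5) of model coefficients. The polynomial form
    is an assumption of this sketch, stated in the accompanying text.
    """
    c0, c1, c2, c3, c4, c5 = c
    return c0 + c1 * s * a + c2 * s * a**2 + c3 * s + c4 * s**2 + c5 * s**3


if __name__ == "__main__":
    # Made-up coefficients, only to show the arithmetic:
    # 1 + 2*2*1 + 3*2*1 + 4*2 + 5*4 + 6*8 = 87
    print(instantaneous_emission(s=2.0, a=1.0, c=(1, 2, 3, 4, 5, 6)))
```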
We compute the total quantity for each pollutant on each edge e ∈ E by summing all the emissions corresponding to any vehicle v's trajectory point that fall on e. Finally, we construct a weighted road network G = (V, E) where each edge e ∈ E is associated with the amount of emissions on it.
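The per-edge aggregation amounts to a group-by-and-sum over trajectory points. A minimal sketch, assuming each trajectory point has already been reduced to a hypothetical `(vehicle_id, edge_id, emission)` tuple:

```python
from collections import defaultdict


def emissions_per_edge(trajectory_points):
    """Sum emissions over all trajectory points that fall on each edge.

    trajectory_points: iterable of (vehicle_id, edge_id, emission) tuples
    (a hypothetical record layout chosen for this sketch).
    """
    totals = defaultdict(float)
    for _vehicle_id, edge_id, emission in trajectory_points:
        totals[edge_id] += emission
    return dict(totals)


if __name__ == "__main__":
    points = [("v1", "e1", 1.0), ("v1", "e2", 2.0), ("v2", "e1", 0.5)]
    print(emissions_per_edge(points))
```

The resulting dictionary plays the role of the edge weights in the weighted road network.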

Experimental Settings
We simulate the effect of route randomization in a 45 km² area in the city center of Milan, Italy, for which we have GPS data describing 17,000 private vehicles traveling between April 2nd and 8th, 2007 (114k GPS points). Previous works demonstrate that the portion of vehicles in the dataset is representative of the real fleet of vehicles [22]. We discretize the urban area of Milan by splitting it into a grid of square tiles (side of 1 km), and we detect the origin and destination tile of each vehicle's trip to compute the origin-destination matrix M of vehicles' flows [23], [24].
We obtain the road network G = (V, E) of Milan using OSMWebWizard, included in the SUMO suite. Before conducting the simulations, we perform a preprocessing step on the road network to correct inaccuracies that may negatively affect the simulations. This preprocessing phase includes correcting lane number inaccuracies, addressing road continuity disruptions, and modifying turns to align with real-world conditions. Since the pre-computed traffic light programs often differ from those in reality, we set the traffic light programs to actuated, as suggested in the SUMO documentation. The preprocessing steps are based on the methodology outlined in [18]. We use the following netconvert options (recommended in the netconvert documentation):
--no-turnarounds true --geometry.remove --roundabouts.guess --ramps.guess --junctions.join --tls.guess-signals --tls.discard-simple --tls.join --output.original-names --junctions.corner-detail 5 --output.street-names
After the preprocessing, the road network includes 5,551 intersections (nodes) and 36,945 road segments (edges).
Given the preprocessed road network G and the OD matrix M, we compute the mobility demand D with N = 15,000 trips. This value of N minimizes the difference between the average travel time of actual trajectories and simulated ones, a standard way to obtain a realistic estimate of the number of vehicles to simulate [6], [18]. We assign to each trip in D a departure time drawn uniformly at random between 0 and 3600 seconds.
First, we compute the non-randomized traffic demand $NR_D$: we connect each vehicle's origin and destination through the fastest path (w = 1). Second, we build the randomized traffic demands $R_{D,w}$ for several randomization values w ∈ {2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20}. Third, we create a mixed demand $MP_{p,w}$ by specifying the fraction p of randomized fastest paths. In each $MP_{p,w}$, a fraction p of the N vehicles, chosen uniformly at random, are assigned to their randomized paths computed with the randomization value w, while the remaining fraction (1 − p) of the vehicles are assigned to their non-randomized paths. We consider p ∈ [0, 1] in steps of 0.1. The mixed demand allows us to study the impact of the percentage of vehicles that follow a randomized fastest path on the urban environment.
To make the simulations more robust, for each value of p and w, we generate $MP_{p,w}$ ten times, each with a different set of randomized vehicles chosen uniformly at random. We then simulate each $MP_{p,w}$ in SUMO and, through TraCI, collect the emissions on each edge and the vehicles' total travel time.
Finally, to confirm that the randomization of a path from an origin to a destination grows with w, we take 15,000 paths for each value of w ∈ W (generated starting from the trips in D). We measure the randomization of a path as the normalized Jaccard coefficient between the edges of the randomized path (w > 1) and the edges of the fastest path (w = 1), computed for the same origin and destination (Figure 3a). The Jaccard coefficient between two sets A and B is defined as:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
We also measure, for different values of w ∈ W, the average path length (Figure 3b) and the average expected travel time (Figure 3c). Figure 3 shows how increasing the value of w results in randomized paths with a lower average Jaccard coefficient, higher length, and higher expected travel time than the fastest path. Therefore, path randomization grows with increasing w.
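The construction of a mixed demand can be sketched as follows. The function name `mixed_demand` and the dictionary representation of the two demands (vehicle id mapped to its route) are assumptions made for the example, not the study's code.

```python
import random


def mixed_demand(non_randomized, randomized, p, seed=0):
    """Assign the randomized route to a fraction p of the vehicles.

    non_randomized, randomized: dicts mapping vehicle id -> route, with the
    same keys (hypothetical layout). Returns a dict vehicle id -> route.
    """
    rng = random.Random(seed)
    vehicles = sorted(non_randomized)
    n_randomized = round(p * len(vehicles))
    # Vehicles that follow their randomized path, chosen uniformly at random.
    chosen = set(rng.sample(vehicles, n_randomized))
    return {
        v: randomized[v] if v in chosen else non_randomized[v]
        for v in vehicles
    }


if __name__ == "__main__":
    nr = {i: ("fastest", i) for i in range(10)}
    r = {i: ("randomized", i) for i in range(10)}
    print(mixed_demand(nr, r, p=0.3))
```

Calling the function with different seeds for the same p and w yields different sets of randomized vehicles, which is what repeated runs of the same configuration rely on.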
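For reference, the Jaccard coefficient between the edge sets of two paths, i.e., the size of their intersection divided by the size of their union, can be computed as in the sketch below. The edge identifiers are made up, and returning 1.0 for two empty paths is a convention chosen for this example.

```python
def jaccard(path_a, path_b):
    """Jaccard coefficient between the edge sets of two paths."""
    a, b = set(path_a), set(path_b)
    if not a and not b:
        return 1.0  # convention for two empty paths (assumption of this sketch)
    return len(a.intersection(b)) / len(a.union(b))


if __name__ == "__main__":
    fastest = ["e1", "e2", "e3"]
    randomized = ["e1", "e4", "e3"]
    # Two shared edges out of four distinct edges overall.
    print(jaccard(fastest, randomized))  # 0.5
```

A value of 1 means the randomized path coincides with the fastest path; values closer to 0 indicate stronger deviation.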

Results
We study how the distribution of CO2 emissions and travel time across Milan's roads changes with the fraction p of randomized vehicles for different values of w ∈ W, where W = {2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20}.
We find that the fraction p of randomized paths does impact the total CO2: introducing randomization in the fastest path reduces the total CO2 emissions. When p > 0.1, the emissions are lower than in the baseline case (w = 1) and decrease with p, reaching their minimum value for p = 1 (Figure 4a). This relationship is consistent across different values of w. As Figure 4a shows, the configuration p = 1 and w = 10 corresponds to the minimum value of total CO2 emissions, with a total emissions saving of 15.61% with respect to the baseline (Figure 6a). We also compute the Shannon entropy of the total CO2 distribution to capture the inequality of the distribution over the road network's edges, defined as:
$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$
where X is a random variable with probability mass function p. We find that the higher w, the more evenly the emissions are distributed on the road network (Figure 4b). In particular, for every w ∈ W, the distribution is the most equal when all the vehicles follow a randomized fastest path.
The results for the travel time are in agreement with those for CO2 emissions: the higher p, the lower the vehicles' travel times (Figure 5a). Travel times are minimized when p = 1 and w = 10, with an improvement of 18.74% with respect to the baseline scenario (Figure 6b). The entropy associated with the total travel time is more variable than the entropy of the total CO2; this may arise from the stochastic nature of each simulation.
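The entropy of a per-edge emission distribution can be computed as in the sketch below. It assumes that the probabilities are obtained by normalizing the per-edge totals so that they sum to one, and it uses base-2 logarithms; both choices are assumptions of this example rather than details stated in the text.

```python
import math


def shannon_entropy(edge_emissions):
    """Shannon entropy (in bits) of a distribution of emissions over edges.

    edge_emissions: iterable of non-negative per-edge totals, at least one
    of which is positive. Edges with zero emissions contribute nothing.
    """
    values = list(edge_emissions)
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


if __name__ == "__main__":
    print(shannon_entropy([1, 1, 1, 1]))  # evenly spread over 4 edges: 2.0 bits
    print(shannon_entropy([4, 0, 0, 0]))  # concentrated on one edge: zero entropy
```

Higher entropy corresponds to emissions that are spread more evenly over the edges, which is the sense in which the measure is used above.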

Discussion and Future Works
Our study investigates the effects of route randomization on CO2 emissions and travel time in an urban environment. We find that the injection of randomness into the fastest paths, which can be interpreted as an increasing "diversity" of paths on the road network, is beneficial for reducing CO2 emissions and travel time. Presumably, path randomization helps distribute the traffic more evenly among the different (non-fastest) routes, preventing the emergence of detrimental and counterintuitive effects such as the Braess paradox [25], [26].
Path randomization with w = 10 leads to the best improvement over the baseline, with savings in CO2 emissions and travel time of 15.61% and 18.74%, respectively. However, increasing the random component in the randomization of the fastest path is beneficial only up to a certain threshold: when w > 10, CO2 emissions increase again. This result suggests that more effort should be devoted to understanding how the optimal degree of randomization depends on the road network structure and the number of circulating vehicles. In future works, we plan to explore the potential for scaling these results to other cities and integrating them into real-world transportation planning and management.
Our findings have practical implications for real-world transportation systems. Implementing our approach could significantly reduce traffic congestion and pollution, thus improving the overall efficiency of the transportation network. The approach is also easy to implement and can be integrated into existing navigation systems without significant modifications. Further research could investigate the impact of several diversification methods and other transportation efficiency measures, such as fuel consumption.
A further improvement would be to consider a stable user equilibrium (UE) [27] as a baseline scenario instead of assigning the fastest path for each trip in the travel demand. User equilibrium (UE) describes the condition in which each driver chooses their route based on their individual preferences, resulting in a network-wide equilibrium where no individual driver can reduce their travel cost (e.g., travel time) by unilaterally using a different route. In other words, UE represents a state of traffic flow where all drivers have chosen the shortest or fastest paths, given the prevailing traffic conditions and their preferences or constraints.
In the meantime, our work is a first step towards designing next-generation routing algorithms that, as our results suggest, should consider some degree of path randomization to increase urban well-being while still satisfying individual needs.

Underlying and related material
The code and the link to the dataset needed to fully reproduce the analysis presented in this work are available in a GitHub repository at https://bit.ly/route_randomization_sumo.