Estimation of Greenhouse Gas and Contaminant Emissions from Traffic by microsimulation and refined Origin-Destination matrices: a methodological approach

The high levels of air contamination and presence of different pollutants are a large problem in most of the cities in which road transport is the primary source of emissions. The governments of more than 100 countries are adopting different policies and strategies to help reduce and mitigate their global emissions. In terms of road transport, reductions in emissions could be achieved by replacing conventional vehicle technologies or by changing the travel patterns of individuals using a private vehicle as their primary means of transportation. However, accurately quantifying the emissions related to urban traffic from multiple possible scenarios is a very complicated task, even when appropriate tools made for this purpose are available. Here we apply a scientifically rigorous protocol to accurately estimate greenhouse and other polluting gases. We describe the methodological steps we followed to analyse the vast quantities of data available from different heterogeneous sources. This data can aid decision-makers in planning better strategies for urban transportation. We used the origin-destination matrices already available for Valencia city (Spain), as well as historical information from its street induction loops and the phases and times of its traffic light system, as input data for the traffic model. Rather than a brute-force algorithm, we used a fast-convergence Lagrangian algorithm model which deals with these vast quantities of information. Based on the elements mentioned above, together with statistics about the types of vehicles in the city, the city's traffic was reconstructed at different times through urban mobility simulations in order to quantify the emissions produced with a high spatial and temporal resolution.


Introduction
Interactions between the demand and supply of transport determine traffic flows on road networks. Vehicles consume fuel and in turn, the emissions produced by these mobile sources determine the concentration of pollutants in the air. Air and noise pollution have been identified as among the most critical environmental problems present in urban areas. The importance of environmental issues in the quality of life of the population lies in health problems related to pollution. These include physiological and psychological disorders and their severity depends on the levels and extent of exposure.
As measures to mitigate the different environmental problems caused by urban traffic, various strategies have been implemented (usually by city councils), starting by defining the origins of the discomfort the population experiences in relation to these problems. These include modifying vehicle manufacturing and circulation regulations, placing barriers around highways and urban centres, careful new road developments, replacement of the types of vehicles used in fleets, and the creation of new massive parking spaces. Nevertheless, these measures are usually based on decisions not supported by scientific data or technical criteria and so, often end up increasing traffic and causing mobility problems.
In this sense, our work highlights the essential role of multivariate statistics and mathematical models in the analysis of information obtained from large amounts of data. These analyses allow us to define the significance of the observed findings and can highlight elements that may not be self-evident. The knowledge achieved through this type of analysis provides public decision-makers with a robust methodology for quantifying emissions/pollution which can help them to choose appropriate urban mobility plan strategies to mitigate the impact of transport on the environmental quality of cities.
Nonetheless, the available data only indicates the number of detections, i.e., it tells us that several vehicles have passed through a section. The vehicle speeds, characteristics, and trajectories remain unknown. Thus, simulations which use extrapolation-like methods are required to leverage this data, build a complex picture of the situation and obtain useful information from it. This is not a simple extrapolation in the form of a graphical curve; it is a much more sophisticated reconstruction based on a mathematical model.
As a use case for this methodology, we examined the CO2 emissions produced in the city of Valencia (Spain). Even though data related to urban traffic in this city is available, it cannot be extrapolated with sufficient accuracy to make inferences and estimations with any scientific rigour. Our approach creates a traffic model that uses as input data (1) the origin-destination (OD) matrices already available for the city, (2) historical data from the induction loop detectors in the city's streets, and (3) the regulation phases and times of the city's traffic light system. Based on these elements, a mathematical model based on a Lagrangian algorithm deals with that vast quantity of information and converges quickly. We then use statistics about the types of vehicles in the city and, through urban mobility simulations, reconstruct the city's traffic flow at different times to quantify the emissions produced.
The rest of the paper is organised as follows: a review of related works is presented in section 2. The scenario components and the steps we followed are presented in the Material and Methods sections (sections 3 and 4, respectively). The results are presented in section 5. The paper ends with the "Further research" and "Conclusions" sections.

Related Works
Nowadays, due to climate change and the challenges facing the environment, countries need to find solutions that reduce global warming pollution, reduce the use of fossil fuels and focus on clean energy sources. These solutions could affect mobility in different ways, and planners should know what those effects are. Below we present a few works from the related literature. The quantification of emissions produced by the transport sector, as specified in the Intergovernmental Panel on Climate Change (IPCC) reports, could be improved by developing sectoral emission models of atmospheric pollutants [13].
The authors of [2] compare emissions estimated through simulations against the emissions actually produced. Their study is based on information broadcast by vehicles equipped with a special communication device when they pass over an induction loop, in a small scenario composed of just one intersection.
In [14] the authors analyse the behaviour of all the streets in the city of Valencia to determine, based on the vehicle load on each street, the expected journey time and occupancy. The mathematical model characterises all the main roads and city streets. The model is used by a centralised server that receives the requests arriving from each car in order to balance traffic management and predict future occupancy. In this way, the street/road occupancy is already known, and in high-occupancy situations all new requests are sent to other areas to avoid traffic congestion.
In [4] the total CO2 emissions generated in the metropolitan area of a city in Taiwan are analysed. The authors made a statistical analysis of the traffic volume over an entire year according to peak traffic hours and the development of several buildings. They obtain a prediction formula used to forecast the development scale of various buildings, and the information on the road system and traffic volume is presented as heat maps.
In [9] a method is proposed to obtain an OD matrix that matches the existing traffic in real time. The resulting matrix is configured using a Lagrangian optimization algorithm with constraints on the components of the initial matrix, using vehicular flow data at specific points in the road network.
However, the emission quantifications that come from the IPCC reports are merely a table extrapolation, which may not be the most appropriate approach for particular areas where a certain level of accuracy is needed. Regarding the other works, [14] and [9] do not consider emissions. In [4], emissions are obtained through a formula or model, but there is no simulation with which to cross-check the results. Finally, the V2I communication assumed in [2] does not match reality, at least in the city under study. Thus, none of these works approaches the problem in the way presented in the methodology of this document.

Simulator
For the microsimulation we chose Simulation of Urban MObility (SUMO) [12] because it is an open system that can simulate the dynamics and interactions of almost all the elements that make up mobility in a city. In addition, it has a large set of tools for creating simulation scenarios and a large development community behind it [11]. In this work we used Eclipse SUMO version 1.5.0.

Traffic Network
The city of Valencia, with more than 800,000 inhabitants, currently has a vehicle fleet of approximately 498,000 vehicles and a road network 300 km long. To optimize the general traffic conditions for all the agents involved in the urban traffic of the city (pedestrians, private vehicles, collective transport, police, media, etc.), the Centre of City Traffic Management of Valencia carries out comprehensive traffic management.
The traffic status of the city road network is known in real time through two information sources: (a) the detectors installed in the traffic lanes and (b) Closed Circuit Television (CCTV) images. These devices communicate with the Centre of City Traffic Management through a TCP/IP fibre optic communications network with redundant gigabit Ethernet rings [7].

Traffic Demand
Traffic detectors installed throughout the city provide vehicle intensity information that is recorded over a data integration period (currently ten minutes). A set of detectors contributes to a measurement point, each with a given coefficient. A measurement point is associated with an urban road segment. The intensity at a measurement point is expressed in vehicles per hour and is obtained as a linear combination of the intensities of the detectors that compose it. Urban road segments that have a measurement point are considered monitored road segments. Thus, a monitored road segment contains the traffic information associated with a specific point in the urban traffic network. In addition to the traffic detectors, some video detectors are capable of classifying the type of vehicle travelling on the roads (information not considered in this study).
Valencia has about 1318 monitored road segments [10] with intensity information recorded every ten minutes. Due to the huge amount of data, for this study we considered hourly data for the year 2017 only.
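As a minimal sketch of how a measurement point could be derived from its detectors, the following Python functions combine detector intensities linearly and average ten-minute readings into an hourly value. The function names, the coefficient values and the assumption that each ten-minute reading is already expressed in vehicles per hour are ours, for illustration only.

```python
def measurement_point_intensity(detector_intensities, coefficients):
    """Linear combination of detector intensities (vehicles/hour).

    Both arguments are sequences of equal length; the coefficients are
    the (hypothetical) weights with which each detector contributes.
    """
    if len(detector_intensities) != len(coefficients):
        raise ValueError("one coefficient is required per detector")
    return sum(c * q for c, q in zip(coefficients, detector_intensities))


def hourly_intensity(ten_minute_readings):
    """Average the ten-minute readings of one hour (each in vehicles/hour)."""
    if not ten_minute_readings:
        raise ValueError("at least one reading is required")
    return sum(ten_minute_readings) / len(ten_minute_readings)


# Two detectors feeding one measurement point with illustrative weights.
point = measurement_point_intensity([600.0, 300.0], [1.0, 0.5])
hour = hourly_intensity([600.0, 660.0, 540.0, 600.0, 630.0, 570.0])
```

Averaging is appropriate here only because the readings are assumed to be rates; if the raw data were vehicle counts per interval, they would have to be summed instead.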

Origin-destination matrices
In general, countries of the European Union carry out surveys, with varying periodicity, to collect information about the population's exposure to urban traffic. This exposure measure is frequently used to identify the demand for transport and mobility in their cities [3].
In the Valencian Community (Spain), a survey was carried out in 2016 [8] in which the inhabitants were asked to complete "travel diaries" describing their routes and trips on the day and the weekend before the survey. One disadvantage is that interviewees tend not to register very short journeys, either because they do not consider them trips or simply because they forget them. Nevertheless, the origin-destination matrices that characterize the usual mobility patterns of the population were configured from the collected information [6], on the assumption that most people have relatively constant mobility patterns in their daily lives, which are consequently easy to remember. The matrices provide high temporal resolution (daily) information about two dimensions: the intensity of exposure and the means of transport used for each trip.

Traffic light system regulation
In Valencia there are more than 1,000 intersections regulated by traffic lights; the approximate location of a large part of them can be seen in Figure Fig:trafficLight. The city has a Centralized Traffic Control system that allows traffic lights to be regulated in real time to adapt them to different traffic conditions. Through this system, it is possible to modify the green time of each access, the traffic light cycle and the synchronization between different crossings to avoid generating queues in the streets, thereby reducing delays and increasing the circulation speed. In addition, Valencia has a fixed-time emergency system that would work automatically in case of a failure of the centralized system [7].
However, reproducing what is actually happening in the city is difficult, because several factors should be considered as system inputs for the simulations and are normally complex to quantify: (a) the intervention of traffic police officers in access streets near educational or sports centres; (b) traffic operator decisions based on CCTV observations; (c) traffic light activations based on the automatic detection of vehicles on the streets or (d) on pedestrian push-to-walk buttons; and (e) the automatic self-regulation of the traffic light control system.

Lagrangian algorithm
A Lagrangian is a scalar function from which the temporal evolution of a dynamic system can be obtained [1]. The algorithm used here has a multi-target approach and takes as input the initial OD matrix and the traffic counts. It incorporates updating processes that control the deviations of the adjusted travel matrix from the previous one. After an assignment stage and several repetitions of the loop, the matrices reach convergence. The algorithm reproduces exactly the observed traffic flows, and thus quickly finds a solution to the problem without requiring excessive computational resources [9].
The initial matrix is the OD matrix obtained from the telephone surveys of households in the Valencian Community, and the vector of variables (Lagrange multipliers) corresponds to the traffic constraints at a specific time of day, based on the traffic counters. Thanks to this algorithm, errors in the estimation of OD matrices and in the reconstruction of observed flows are minimised.
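To illustrate the general idea of adjusting a prior OD matrix so that it reproduces traffic counts, the sketch below uses a generic multiplicative balancing scheme in which each count acts as a constraint with its own scaling factor (playing the role of a multiplier). This is not the algorithm of [9]; the function `adjust_od`, its arguments and the toy data are assumptions made for illustration.

```python
def adjust_od(prior, usage, counts, max_sweeps=100, tolerance=1e-6):
    """Scale OD entries until modelled flows match the observed counts.

    prior  : {(origin, destination): trips}        initial OD matrix
    usage  : {section: {(origin, destination): p}} share of each OD pair
             that passes through the counted section
    counts : {section: observed flow}
    """
    od = dict(prior)
    for _ in range(max_sweeps):
        worst = 0.0
        for section, observed in counts.items():
            modelled = sum(p * od[pair] for pair, p in usage[section].items())
            if modelled <= 0:
                continue  # nothing to scale on this section
            factor = observed / modelled
            worst = max(worst, abs(factor - 1.0))
            for pair, p in usage[section].items():
                od[pair] *= factor ** p
        if worst < tolerance:  # every constraint already satisfied
            break
    return od


# Toy example: two OD pairs, two counted sections.
prior = {("A", "B"): 100.0, ("A", "C"): 50.0}
usage = {
    "s1": {("A", "B"): 1.0, ("A", "C"): 1.0},
    "s2": {("A", "B"): 1.0},
}
counts = {"s1": 180.0, "s2": 120.0}
adjusted = adjust_od(prior, usage, counts)
```

Because the entries are only rescaled, zero cells of the prior matrix stay at zero and the adjusted matrix keeps the structure of the survey-based one, which is the usual motivation for this family of methods.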

Methodology
Urban inventories carried out at any given time allow city decision makers and planners to quantify the magnitude of total emissions between different means of transportation. The inventory is also the starting point for the development of a mitigation strategy. Below, the steps we followed to define our methodology are described.

Definition of the vehicle fleet and its characteristics
A customized report is configured from the database consulted in [5] with all these attributes: type of vehicle, municipality of residence, brand, fuel, power, displacement, load, seats, and age/technology. By filtering and grouping the available attributes of the studied vehicle fleet, we configured a database composed of just these attributes: vehicle typology, fuel, technological regulations, and the relative weight of each category. The corresponding emission factors are then applied to this database, through the methodology used, to obtain the total emissions of each of the pollutants considered and the CO2 equivalent.
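The way relative weights and emission factors combine can be sketched as a weighted average. The categories, shares and factors below are placeholder values chosen only to make the arithmetic visible; they are not taken from [5] or from any emission inventory.

```python
def fleet_emission_factor(shares, factors):
    """Fleet-average emission factor (g/km), weighted by category share.

    shares  : {category: relative weight in the fleet}
    factors : {category: emission factor in g/km}
    """
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("fleet shares must sum to a positive value")
    weighted = sum(share * factors[cat] for cat, share in shares.items())
    return weighted / total_share


# Hypothetical categories: (vehicle typology, fuel, technology standard).
shares = {
    ("passenger car", "petrol", "standard X"): 0.75,
    ("passenger car", "diesel", "standard Y"): 0.25,
}
factors = {
    ("passenger car", "petrol", "standard X"): 100.0,  # placeholder g/km
    ("passenger car", "diesel", "standard Y"): 200.0,  # placeholder g/km
}
average = fleet_emission_factor(shares, factors)
```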

Design of the road network model
The network model is built mainly from the open data offered by OpenStreetMap (OSM). The OSM data set downloaded for the whole metropolitan area of Valencia forms the backbone of the model. Because the city under consideration is the one where we live, whenever we noticed an inconsistency in the data set we corrected it manually using NETEDIT.

Zoning (districts/neighbourhoods) within an aggregate system
The partition of the study area into a set of pieces (geographical discretization), known as zones, is necessary to characterize the mobility currently taking place, quantify it and make forecasts about its possible evolution. We took advantage of as much pre-existing information as possible; thus, the fundamental administrative division was used as the criterion for establishing transport zoning in Valencia. The whole zoning process in SUMO was carried out using NETEDIT with traffic assignment zones (TAZ). Based on the division by neighbourhoods and districts, plus the attached zones that simplify the metropolitan trips of the population, sixty-four polygons were created as traffic assignment zones.
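Once zones are defined as polygons, any coordinate (for example, a surveyed trip origin) can be assigned to a zone with a point-in-polygon test. The sketch below uses the standard ray-casting test on planar coordinates; it is independent of NETEDIT, and the zone names and coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; points exactly on an
    edge are not handled specially.
    """
    inside = False
    n = len(polygon)
    for idx in range(n):
        x1, y1 = polygon[idx]
        x2, y2 = polygon[(idx + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_zone(x, y, zones):
    """Return the id of the first zone containing the point, or None."""
    for zone_id, polygon in zones.items():
        if point_in_polygon(x, y, polygon):
            return zone_id
    return None


# Two hypothetical square zones side by side.
zones = {
    "zone_1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "zone_2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
```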

Estimation of the target source matrix
Once the types of transport had been filtered and the zoning defined, a 64x64 table was obtained. The table columns represent the origins "O" (where people live), and the rows represent the destinations "D" of the trips. The table represents the trips that people make daily.
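A table of this kind can be built by counting trips per (destination, origin) pair. The following sketch follows the orientation described above (origins in columns, destinations in rows); the function name and the three-zone example are assumptions used for illustration, and the same code would produce a 64x64 table for 64 zones.

```python
def build_od_table(trips, zone_ids):
    """Count trips into a square table: rows = destinations, columns = origins.

    trips    : iterable of (origin_zone, destination_zone) pairs
    zone_ids : ordered list of zone identifiers
    """
    index = {zone: position for position, zone in enumerate(zone_ids)}
    size = len(zone_ids)
    table = [[0] * size for _ in range(size)]
    for origin, destination in trips:
        table[index[destination]][index[origin]] += 1
    return table


zone_ids = ["Z1", "Z2", "Z3"]
trips = [("Z1", "Z2"), ("Z1", "Z2"), ("Z3", "Z1")]
table = build_od_table(trips, zone_ids)
```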

Improvement of the algorithm and the OD matrix
The proposed algorithm performs a set of operations on each element of the matrix. This approach poses no problem unless samples with a large number of zones, districts or divisions are treated. A large number of zones increases the dimensionality of the matrices to the point where sparse matrices may appear (matrices in which a large part of the elements are null or close to zero). An improvement has therefore been implemented in the algorithm to increase its efficiency: after an analysis of the components of the initial matrix, it acts only on the necessary elements.
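One simple way to act only on the necessary elements is to keep the non-negligible cells in a dictionary and iterate over those alone. This is a generic sparse representation, offered as a sketch; the text does not specify how the improvement was actually implemented, and the threshold parameter is our assumption.

```python
def to_sparse(matrix, threshold=0.0):
    """Keep only the cells whose absolute value exceeds the threshold."""
    return {
        (row, col): value
        for row, values in enumerate(matrix)
        for col, value in enumerate(values)
        if abs(value) > threshold
    }


def density(matrix, threshold=0.0):
    """Fraction of cells that would be kept by to_sparse."""
    cells = sum(len(values) for values in matrix)
    return len(to_sparse(matrix, threshold)) / cells if cells else 0.0


matrix = [
    [0, 5, 0],
    [0, 0, 0],
    [2, 0, 0],
]
sparse = to_sparse(matrix)
# Later updates can loop over sparse.items() instead of all nine cells.
```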
One of the elements needed by the OD-matrix refinement program is a three-dimensional probability matrix "p^k_ij". This matrix gives the probability that a vehicle with a defined origin "i" and destination "j" will circulate through each of the possible sections "k" of the road network. To obtain the final improved matrix, various OD matrices are used as iterations, which allow us to carry out statistical and probability studies.
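A probability of this kind can be estimated empirically as the fraction of simulated routes between an origin and a destination that use a given section. The sketch below assumes the routes are already available as lists of section identifiers; the function name and the data layout are hypothetical and not taken from the refinement program.

```python
from collections import defaultdict


def section_use_probabilities(routes):
    """Estimate p[(i, j, k)]: share of i->j routes that pass through k.

    routes : iterable of (origin, destination, [section ids]) tuples
    """
    routes_per_pair = defaultdict(int)
    uses = defaultdict(int)
    for origin, destination, sections in routes:
        routes_per_pair[(origin, destination)] += 1
        for section in set(sections):  # count each section once per route
            uses[(origin, destination, section)] += 1
    return {
        key: count / routes_per_pair[key[:2]]
        for key, count in uses.items()
    }


routes = [
    ("A", "B", ["s1", "s2"]),
    ("A", "B", ["s1", "s3"]),
]
p = section_use_probabilities(routes)
```

Only the combinations that actually occur are stored, so sections never used by an OD pair are implicitly zero.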

Mathematical model to refined Origin-Destination matrices
The model applied is based on an initial OD matrix and measurements of traffic flow at specific points of the road network, as in [9]. The desired matrix must generate flows that converge towards a scenario close to the real situation. The convergence process is achieved by employing mathematical models based on multivariate statistical methods and controlled dispersion rates that characterise the variability of the data. In this way, we can appreciate the tendency of the obtained values to converge towards the real values quantified in situ. These measures also help us to determine whether our data are far from the expected value, and they provide information as to whether the central value ("centroid") is adequate to represent the study population. This is useful for comparing distributions and understanding risks in decision making.
The OD matrix is therefore modified based on the difference between the simulated and the measured flows. If these changes meet certain pre-established convergence conditions, we consider that the matrix reproduces the current traffic within an error that we accept; as long as the convergence requirements are not met, the iteration process continues. Once they are met, the obtained OD matrix is accepted, and the traffic load is simulated again. The result of the simulation is a contamination map that will be used for future analysis. A graphical representation of this step can be seen in Figure 4. The emission results for the different types of pollutants are shown with (a) the maximum possible granularity and (b) the highest periodicity, to show the potential of the developed models with the available data. The geolocation of these emission levels by type of pollutant is a graphic representation that allows numerical values to be visualised and analysed in a spatial context. For a better analysis of the results, the geolocated quantification can be studied in specific sectors and sections, disaggregating the different pollutants according to vehicle and fuel typology.
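The iterate-until-convergence logic can be sketched as a loop that simulates, measures the error against the observed flows and adjusts the matrix until a tolerance is met. The relative RMSE criterion, the 5% tolerance and the callables `simulate` and `adjust` are assumptions made for illustration; the actual convergence conditions are not specified here.

```python
import math


def relative_rmse(simulated, observed):
    """Root mean square error divided by the mean observed flow."""
    n = len(observed)
    mse = sum((simulated[k] - observed[k]) ** 2 for k in observed) / n
    return math.sqrt(mse) / (sum(observed.values()) / n)


def refine(od, observed, simulate, adjust, tolerance=0.05, max_iterations=20):
    """Repeat simulate -> compare -> adjust until the error is acceptable.

    Returns (od_matrix, error, converged).
    """
    for _ in range(max_iterations):
        simulated = simulate(od)
        error = relative_rmse(simulated, observed)
        if error <= tolerance:
            return od, error, True
        od = adjust(od, simulated, observed)
    final_error = relative_rmse(simulate(od), observed)
    return od, final_error, final_error <= tolerance


# Toy stand-ins: one OD pair observed by one detector.
def toy_simulate(od):
    return {"d1": od["AB"]}


def toy_adjust(od, simulated, observed):
    return {"AB": od["AB"] * observed["d1"] / simulated["d1"]}


result, error, converged = refine(
    {"AB": 100.0}, {"d1": 120.0}, toy_simulate, toy_adjust
)
```

In a real setting `simulate` would run the microsimulation and read the detector outputs, which is the expensive step; the iteration cap keeps the loop bounded if the tolerance cannot be reached.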

Synthesis and information preparation
The results obtained with the developed tool are offered in two forms: on the one hand, as a conveniently structured compilation of emission information; on the other hand, as graphical representations that can be displayed in dashboards, always with the aim of providing a support tool for decision making.

Definition of new scenarios
With this tool, a very large number of different scenarios can be considered in order to carry out simulations and observe their effects on emissions. We mainly propose simulations with most of the traffic light control programs or patterns that the city has (approximately 27 patterns). Each of these programs is primarily adapted to a particular set of conditions: weather (sun, wind, rain, cold, heat), time of day (day, night) and calendar (weekdays, holidays), as well as the possible combinations among them.

Results
One of the essential requirements for analysing how a city's transport networks function is to obtain the initial origin-destination matrices. These can be used to plan improvements and simulate the mobility dynamics of the city's citizens. In this work, several simulations have been carried out in which the number of vehicles inserted corresponds to the number of routes and journeys made by the vehicles represented in the initial origin-destination matrix. These values do not represent the number of physical vehicles in the city, nor the vehicle fleet registered in the city, nor the sum of the detections at a given instant. When a vehicle arrives at its destination, it is removed from the simulation. Then, the log records for each of the vehicles inserted in the scenarios proposed in the simulation are analysed. The simulations were carried out considering the average daily/hourly data for the whole of 2017.
In each execution, the matrices are redefined by adjusting the traffic volumes that pass through the detectors. This data serves as input for a new iterative simulation. The process is repeated for every hour of the day, with up to ten repetitions of each hour, changing the random seed each time. As the objective of this article is to show the effectiveness of the methodology, we show one of the graphs that will be part of a dashboard for the users who make the decisions. In Figure 5 we can see the neighbourhoods or districts of the city with the average daily CO2 emissions obtained after the simulation, where the dark colours correspond to the lowest emissions throughout the year and the light yellow colours to the highest. With this graph, we confirm an expected result: the neighbourhoods/districts next to the large motorways entering and leaving the city are the most affected by CO2 emissions.
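Averaging over the repetitions run with different seeds (SUMO exposes the random seed through its `--seed` option) can be sketched as follows. The sketch assumes that each run has already been reduced to a dictionary of CO2 totals per district; the function name, the district names and the values are illustrative only.

```python
from statistics import mean


def average_emissions(runs):
    """Average per-district emissions over several seeded repetitions.

    runs : list of {district: emitted CO2} dictionaries, one per seed;
           a district missing from a run is counted as zero for that run.
    """
    districts = set().union(*runs)
    return {
        district: mean(run.get(district, 0.0) for run in runs)
        for district in districts
    }


# Two hypothetical repetitions of the same hour with different seeds.
runs = [
    {"district_A": 10.0, "district_B": 20.0},
    {"district_A": 30.0, "district_B": 40.0},
]
averages = average_emissions(runs)
```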
In the same way that CO2 emissions have been treated, we are able to present the contaminant emissions, among which we can highlight nitrogen dioxide (NO2) and atmospheric particulate matter with a diameter of less than 2.5 micrometres (PM2.5); both are available at the end of a SUMO simulation.

Further research
As further research we intend:
(a) At a strategic level, to contribute to the improvement of scientific and technological knowledge in order to meet the objectives of a low carbon economy, as set by the European Commission for the medium and long term horizon (2030 and 2050 respectively).
(b) At the technological level, to further study and characterise the interrelationship between transport and mobility and the emissions of Greenhouse Gases (GHG) and other pollutants in the urban environment. This will allow us to improve the precision and granularity of the spatial and temporal representation of the emissions.
(c) At the cooperative level, to integrate and combine efforts in research, development and innovation in the fight against and adaptation to climate change, in a continuous multisectoral advance towards energy transition.

Conclusions
Thanks to the high degree of data digitalisation in Valencia, it is possible to collect data related to traffic in this city in real time. By inputting this data into different mathematical models, information about the environmental impact and influence of traffic on emissions can be obtained. SUMO microsimulation makes it possible to consider different simulation scenarios to analyse how emissions may be altered through time by different traffic mitigation approaches. This type of modelling also allows us to examine the interrelation between factors, variables, sectors, and emissions, and how varying these parameters might change pollutant gas and CO2 emission outcomes.
Here we created a mathematical model that uses the large amount of information available to improve the accuracy of emissions predictions. We then compared the results obtained from our models to those from microsimulation with SUMO in a process of convergence and validation. Thus, the accuracy and scientific rigour of this process were greater than those of a simple extrapolation of values from Intergovernmental Panel on Climate Change (IPCC) reports.
Moreover, as a temporary reference product of our methodological proposal, georeferenced emissions allow us to better understand the character and distribution of the carbon footprint and other pollutants in Valencia. In turn, this will generate new challenges and increase opportunities for the Valencian local government to manage these problems. With this information, both the public administration and planners can measure the impact of their mitigation plans and adapt their approaches to climate change in order to reduce the carbon footprint of the transport sector.