Calibration of a Microscopic Traffic Simulation in an Urban Scenario Using Loop Detector Data: A Case Study within the Digital Twin Munich

Travel demand is an essential input for the creation of traffic models. However, estimating travel demand to accurately represent traffic behaviour usually requires the collection of extensive sets of data on traffic behaviour. Traffic counts are a comparably cost-effective and reproducible source of information on travel demand. The utilisation of traffic counts to estimate demand is commonly found in the literature as the static and dynamic O-D estimation problem. A variety of approaches have been developed over recent decades to tackle this problem. Usually, initial estimates of the O-D matrix are calibrated by utilising traffic counts and considering different assignment models. Other approaches for the estimation of travel demand solely based on traffic measurements can be found in the simulation software SUMO. The present work demonstrates the systematic development of a network model in SUMO for the inner city of Munich. In a sample network, the estimation of travel demand through the tools flowrouter and routeSampler is tested by utilising flow measurements from induction loop detectors. The tests delivered unsatisfactory results, as shown by observations of traffic flows in the resulting simulations as well as comparisons to historic traffic counts. The lack of sufficient detector data and the complexity of the sample network are discussed as the main reasons for the results. It is concluded that the applied tools should be tested in future studies with a more extensive dataset to perform a more comprehensive review of both tools. To this end, we deliver specific requirements based on the network example of Munich.


Introduction
The basis for the model was an automatically generated network of Munich's inner city which was previously developed by the Chair of Traffic Engineering and Control at TUM. The network is based on data from OSM and was transformed into a SUMO network [1,2]. This network was then manually edited and refined in this project work. The improvements were based on official site plans of the intersections, which were provided by the City of Munich. These site plans also feature information on the location of loop detectors. A supplementary CSV file containing the IDs and names of the detectors was also provided. In addition to the site plans, aerial images were used and site visits were made to edit and verify the model. Aside from the data used for generating the network geometry, two sources of information on traffic signal programs were used to generate the control infrastructure in the area. On the one hand, original signal plans for a selection of signalised intersections were provided by the City of Munich. Some of these signal programs are outdated due to redesigns of the junctions over recent years, though in some cases they were the only source of information available. On the other hand, the City of Munich provided signal data for all signalised intersections in the city between 11:30 am and 11:00 pm on 25 July 2022. The dataset reports the status of the signal heads at all junctions over the course of this period. From this data, signal plans were reverse engineered. However, some traffic lights did not report data, which may be related to faulty communication from the traffic signals or to ongoing construction works during which the regular signal heads are deactivated.
Moreover, the City of Munich provided data from induction loop detectors throughout the city between 11:30 am and 11:00 pm on 25 July 2022. The dataset reports the traffic volumes, speeds and occupancy measured at every detector in the city. However, speeds are measured at only a few detectors. Aside from the data from induction loop detectors, traffic counts at different intersections within the study area were made available. These historic counts date from 2010 to 2018. The traffic counts were not directly used for modelling travel demand but were consulted to determine the analysis period as well as to perform plausibility checks of the resulting simulation.

Network Development
In the following, we describe the development of the network model. First, it is described in detail how the network geometry was edited, and examples of the process are given. Secondly, the implementation of the control infrastructure is discussed. This includes a description of the structure of the signal dataset which was provided by the City of Munich as well as the analysis of the data. Additionally, information on the implementation of signal programs in the model is given.

Network Geometry
The first step in refining the initial raw network was to remove unneeded network elements from the model. These were namely all cycling paths and pedestrian paths which were remnants of the network conversion process. Furthermore, the network was manually cropped at the river Isar meaning that all edges on the eastern side of the river were deleted.
Following that, all intersections were checked and adjusted according to the information contained in the available site plans. These plans were always compared to other available information gained from aerial images and site visits to check whether the site plans show the most recent state of the junction. This was particularly necessary when the site plans indicated ongoing construction works since in some cases these construction works are already completed. When no site plans were available at all, the road geometry was edited solely based on the secondary sources, i.e. aerial images and site visits.
Once the intersection geometry was satisfactory, the induction loop detectors could be placed in accordance with the site plans and the supplementary CSV-file. It must be stated that the detector placement was done manually which means that their location in the model might not match the exact location in the real world. Since the detectors were only used to link the counting data to the edges, this simplification is valid. However, should the detectors be used in the future for other applications such as the actuation of traffic lights, the placement of the detectors should be reviewed. Furthermore, only those detectors that are relevant for road vehicles were placed.
Lastly, some additional refinements had to be made which included removing implausible turnarounds. These may lead to implausible routes and errors in the simulation. On the one hand, these turnarounds existed at the network boundaries which can lead to vehicles not being able to exit the network or driving into dead ends and turning around to continue their route which is implausible. On the other hand, especially at junctions along residential streets, turnarounds were possible which is not plausible for example due to space constraints in the real world. Aside from removing these turnarounds, node clusters were joined to remove unnecessary network elements.
The process of modelling the network geometry is illustrated using two intersections as examples. The first example, shown in Figure 1, depicts the situation at Sendlinger-Tor-Platz. The graphic on the left shows the state of the junction in the initial network and the image on the right shows the final intersection layout. Firstly, all remaining foot and cycling paths were removed (1). The next step was the adjustment of the edge geometry (2). For example, in the initial network the right-turning lane at the western leg of the central junction branches off from the left-turning and through lane. Due to ongoing construction works in the area, this lane currently runs in parallel to the other two lanes. Lastly, the connections were corrected (3). In the initial network, the eastern leg of the intersection featured a mixed lane for right-turning and through traffic and an additional lane dedicated to through traffic. In reality, there is one lane for through traffic and one for right-turning traffic. It should be mentioned that the network geometry is not an exact depiction of the real world, though it is a close approximation and represents the main characteristics of the junction.
A more complex example can be found at the intersection Lenbachplatz/Elisenstraße. Figure 2 depicts the initial network on the left and the edited junction on the right. Initially, the junction was heavily simplified and additional signal heads in the centre of the junction were missing. The intersection was thus edited as follows. The process started again by removing all residual foot and cycling paths (1). Then the junction was split into two signalised intersections and the junction shape was manually adjusted so that it accurately represents the shape of its real-world counterpart (2). This also included the correction of all possible turning manoeuvres between all incoming and outgoing edges as well as an adjustment of the tram tracks. Lastly, all induction loop detectors were placed in accordance with the site plans and the supplementary CSV file (3).
The same procedure was followed at all other junctions across the network though the focus was set on those junctions for which site plans were available. These were mostly the more complex major intersections of the road network. Junctions of minor importance in the secondary road network usually only required minor corrections, e.g. checking for the possible turning manoeuvres.

Traffic Signal Programs
Many traffic lights, especially those located on main roads within the model's extent, are traffic-responsive or feature programs with public transport prioritisation or actuation. However, since public transport was not modelled and the control algorithms for actuated traffic signals were not known, it was decided to simplify the traffic signal programs and to design them as programs with a fixed cycle.
First and foremost, the signal programs were created by utilising a dataset containing the sequence of signal states of all signal heads at every intersection in Munich. From this data, simplified signal programs with a fixed cycle were reverse engineered. An example of such a procedure is pictured in Figure 3 for LSA 103. It is based on averaging the available values. A limitation of the dataset is that it only reports the status of active signal heads as locked ("g") or free ("f"). This means that amber times of the signal heads are unknown. Because of that, the signal plans implemented in the model also only feature green and red phases, which can lead to emergency braking by the simulated vehicles when a signal head switches from green to red while a vehicle is close to the stop line.
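The averaging step can be illustrated with a minimal sketch. It is not the procedure used in this work, whose details are only shown in Figure 3. It assumes a hypothetical input format in which each signal head is given as a chronological list of (time in seconds, status) pairs, with the status codes "f" and "g" taken from the dataset description above. The function name and the decision to discard the first and last run (which may be truncated by the observation window) are assumptions made for illustration.

```python
from statistics import mean


def average_fixed_cycle(samples):
    """Estimate mean free and locked durations for one signal head.

    samples: chronological list of (time_in_seconds, status) pairs,
    where status is "f" (free) or "g" (locked). Hypothetical format.
    Returns (mean_free, mean_locked) rounded to whole seconds.
    Raises statistics.StatisticsError if a status has no complete run.
    """
    durations = {"f": [], "g": []}
    run_start, run_status = samples[0]
    first_run = True
    for time, status in samples[1:]:
        if status != run_status:
            # The first run may have started before the recording began,
            # so its length is unknown and it is skipped.
            if not first_run:
                durations[run_status].append(time - run_start)
            first_run = False
            run_start, run_status = time, status
    # The last run is still open when the data ends and is dropped as well.
    return round(mean(durations["f"])), round(mean(durations["g"]))


# Invented example data for a single signal head.
example = [(0, "g"), (10, "f"), (40, "g"), (100, "f"), (132, "g"), (190, "f")]
free_time, locked_time = average_fixed_cycle(example)
cycle_time = free_time + locked_time
```

In this invented example the complete free runs last 30 s and 32 s and the complete locked runs last 60 s and 58 s, so the sketch yields a fixed cycle with 31 s of green and 59 s of red.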
In total, 109 traffic lights in the network feature signal programs which were reverse engineered from the signal dataset, 13 traffic lights were supplied with signal programs from original signal plans and 20 junctions use actuated signal programs that were automatically generated by SUMO.

Application of flowrouter and routeSampler
Subsequently, the two existing tools of the SUMO package for count-based travel demand generation are tested. On the one hand, demand was generated by using the tool flowrouter. As described previously the tool generates routes and flows, i.e. numbers of vehicles per route, based on detector data. On the other hand, the tool routeSampler was used. This tool requires a set of initial routes as well as counting data as input and then selects and multiplies routes from the initial set in such a way that the detector data is matched.
As pointed out previously, a small network was extracted from the larger one for the test of both tools. Figure 4 shows the large network on the left and the cropped network on the right. The cropped network consists mainly of Sonnenstraße and the important incoming and outgoing streets. Streets of minor significance in terms of their traffic volumes were excluded from the network in order to simplify the model. In total, the network model features eight signalised intersections. The evening peak hour was selected as the test period for the scenario. According to the historic traffic counts, the evening peak hour usually occurs between 4:00 pm and 6:45 pm. It was decided to use the period between 4:00 pm and 5:00 pm as the simulation period. In addition, 15 minutes were added before and after this period so that all vehicles detected within the analysis period can enter and exit the network.
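The resulting simulation window can be written out as a short calculation. The sketch below assumes that simulation time is expressed in seconds since midnight, which is an assumption about the scenario setup and not stated in the text; all names are illustrative.

```python
def to_seconds(clock_time):
    """Convert a 24-hour "HH:MM" string to seconds since midnight."""
    hours, minutes = map(int, clock_time.split(":"))
    return hours * 3600 + minutes * 60


ANALYSIS_BEGIN = to_seconds("16:00")  # 4:00 pm
ANALYSIS_END = to_seconds("17:00")    # 5:00 pm
BUFFER = 15 * 60                      # 15 minutes before and after

sim_begin = ANALYSIS_BEGIN - BUFFER   # 3:45 pm
sim_end = ANALYSIS_END + BUFFER       # 5:15 pm

# Number of 15-minute count intervals covered by the simulation window.
n_intervals = (sim_end - sim_begin) // (15 * 60)
```

Under this assumption the simulation runs from second 56700 to second 62100 and spans six 15-minute intervals, four of which belong to the analysis period.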
Aside from LSA 43 at Sendlinger-Tor-Platz, all intersections in the network are equipped with induction loop detectors. These detectors first needed to be filtered. Two conditions had to be fulfilled by the detectors so that their data could be used for flowrouter and routeSampler. First, the application of flowrouter was tested. The tool requires as input a network file, a file that specifies the location and ID of the detectors and a file containing the flow measurements from the detectors. The flows file can be provided either as a CSV file or as a TXT file with ";" as the separator between columns. The file contains a column "Detector", specifying the detector ID, a column "Time" containing the time interval in minutes and lastly a column "qPKW" which describes the vehicles counted at the respective detector during an interval. Additionally, vehicles may be classified into passenger cars and heavy goods vehicles, and measured speeds can be added. However, since the detectors in this case only reported vehicle counts without further distinction between vehicle types, all counting data was inserted into the "qPKW" column.
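The flows file layout described above can be sketched as follows. The column names and the ";" separator are taken from the text; the detector IDs and counts are invented, and the function name is hypothetical. The "Time" value is assumed here to be the start minute of each interval.

```python
import csv
import io


def write_flows(rows, handle):
    """Write (detector_id, minute, count) rows in the flows file layout.

    All counts go into the "qPKW" column, as in the text, because the
    detectors do not distinguish between vehicle types.
    """
    writer = csv.writer(handle, delimiter=";", lineterminator="\n")
    writer.writerow(["Detector", "Time", "qPKW"])
    for detector_id, minute, count in rows:
        writer.writerow([detector_id, minute, count])


# Invented example: one hypothetical detector, two 15-minute intervals.
buffer = io.StringIO()
write_flows([("det_29_1", 0, 12), ("det_29_1", 15, 9)], buffer)
flows_text = buffer.getvalue()
```

In practice the handle would be a file opened for writing instead of an in-memory buffer.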
Following the specification of the flows file, flowrouter was run in the command prompt. The option --respect-zero was set so that detectors which counted no vehicles across one or multiple time intervals were also considered. In addition, the options --lane-based and --interval 15 were set. The first means that the values of all detectors across one edge are not aggregated but the counts from each individual lane are used. The latter indicates the aggregation interval of the traffic counts, which is 15 minutes. The output from flowrouter comes in the form of a route file and a flows file. The route file consists of all individual possible routes and the flows file describes the number of vehicles for each route.
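A call of this kind could be assembled as in the sketch below. The options --respect-zero, --lane-based and --interval are those named in the text. The short options for the network, detector, flow and output files (-n, -d, -f, -o, -e), the script path and all file names are assumptions that should be checked against the help output of the installed SUMO version.

```python
import shlex


def flowrouter_command(net, detectors, flows, routes_out, flows_out):
    """Build an argument list for a flowrouter run (not executed here)."""
    return [
        "python", "flowrouter.py",   # script location is an assumption
        "-n", net,                   # network file (assumed short option)
        "-d", detectors,             # detector definitions (assumed)
        "-f", flows,                 # detector flow measurements (assumed)
        "-o", routes_out,            # route output (assumed)
        "-e", flows_out,             # flow output (assumed)
        "--respect-zero",            # keep detectors with zero counts
        "--lane-based",              # do not aggregate lanes per edge
        "--interval", "15",          # aggregation interval in minutes
    ]


# Hypothetical file names, for illustration only.
cmd = flowrouter_command(
    "sample.net.xml", "detectors.xml", "flows.csv",
    "routes.rou.xml", "flows.out.xml",
)
command_line = shlex.join(cmd)
```

The resulting list can be passed to subprocess.run, or the joined string can be pasted into a terminal.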
In theory, the user can specify flow restrictions as input for flowrouter so that certain implausible routes are not considered by the tool. This was attempted by utilising the script implausibleRoutes.py, which allows blacklisting of certain routes according to a specified heuristic. However, all attempts to manipulate the output from flowrouter by using this tool failed. Thus, the developers of SUMO were contacted via the official forum, and according to them it is currently not possible to combine flow restrictions with the option --lane-based in flowrouter. Tests without the option --lane-based did not produce satisfactory results.
The routeSampler tool needs an initial set of routes stored in a route file and an edgedata file which contains the flow measurements for all edges. In contrast to flowrouter, it is not possible to utilise lane-based counting data of each individual detector. Through the aggregation of measurements across all lanes, certain information from the detectors is lost. This includes, for example, the number of vehicles on dedicated turning lanes. It is also possible to define turn counts as input for routeSampler. This information could not be obtained from the induction loop detectors since the detectors are only placed at the inflows of the intersections and no downstream measurements were available. The following paragraph details the specification of the route file and the edgedata file.
First, all possible O-D pairs were written manually into a trip file. It was assumed that there are no trips that start and end in the same direction. Additionally, it was checked whether a trip is possible or not due to turn restrictions at intersections. This resulted in a total of 174 O-D pairs in the network. Then, the route definitions were obtained by using duarouter. Since there are no parallel streets in the network which allow alternative routes, the tool returned one route definition for each O-D pair. The edgedata file was then obtained by transforming the flows file used for flowrouter into the format required by routeSampler. This can be done with the tool edgeDataFromFlow, which sums the data from all detectors across an edge and assigns the resulting value to the edge. Then routeSampler was run with the route file and the edgedata file. The additional command-line option --edgedata-attribute qPKW had to be set so that routeSampler was able to read the counting data from the edgedata file. routeSampler returns a route file which contains information on the selected routes from the initial set and the start time of every individual generated vehicle.
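The aggregation from lane-based detector counts to edge counts can be sketched as follows. This is an illustrative re-implementation of the idea, not the code of the SUMO tool. The XML layout (a data root with interval and edge elements) and the attribute name qPKW follow the description above; the detector-to-edge mapping, all IDs and the interval naming are invented for the example.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def aggregate_to_edges(flow_rows, detector_edge, interval_min=15):
    """Sum detector counts per edge and interval into an edgedata tree.

    flow_rows: iterable of (detector_id, start_minute, count).
    detector_edge: dict mapping detector_id to edge_id (hypothetical).
    """
    totals = defaultdict(int)
    for detector_id, minute, count in flow_rows:
        totals[(minute, detector_edge[detector_id])] += count

    root = ET.Element("data")
    intervals = {}
    for (minute, edge_id), count in sorted(totals.items()):
        if minute not in intervals:
            # Interval bounds are written in seconds.
            intervals[minute] = ET.SubElement(
                root, "interval",
                id=f"t{minute}",
                begin=str(minute * 60),
                end=str((minute + interval_min) * 60),
            )
        ET.SubElement(intervals[minute], "edge", id=edge_id, qPKW=str(count))
    return root


# Invented example: two lane detectors on edge e1, one on edge e2.
rows = [("d1", 0, 5), ("d2", 0, 7), ("d3", 0, 3)]
mapping = {"d1": "e1", "d2": "e1", "d3": "e2"}
edgedata = aggregate_to_edges(rows, mapping)
edge_counts = {e.get("id"): e.get("qPKW") for e in edgedata.iter("edge")}
```

The example makes the information loss visible: after aggregation only the total of 12 vehicles on edge e1 remains, and the split between its two lanes can no longer be recovered.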

Results and Discussion
The observation of the resulting simulations revealed that neither flowrouter nor routeSampler was able to produce plausible estimates of travel demand in the network. First and foremost, this is related to the lack of available counting data. As discussed in the previous section, only a few detectors reported values on the day the data was taken from. Detector measurements on Sonnenstraße itself were only available at the incoming edges of LSA 29. Other measurements were available at LSA 45, LSA 46 and LSA 484. However, of these intersections only LSA 484 is equipped with detectors on all incoming edges. Neither tool was able to deliver plausible estimates with this limited dataset. In addition to the lack of data, the characteristics of the study area are another reason for the unsatisfactory outcome. At the intersections LSA 29, LSA 480 and LSA 484, turnarounds are possible on certain edges. The overestimation of these turnarounds led to congestion in both simulations. Congestion also occurred with both tools on minor streets, where the overestimated traffic flows exceeded the capacity of the respective traffic signals.
The following paragraphs give examples of the shortcomings of both simulations by qualitatively describing a selection of errors. In the case of flowrouter, the turning ratios at LSA 29 and LSA 480 are compared between the simulation and historic counts. For this purpose, test detectors were implemented in SUMO to measure the respective traffic flows. The comparison is done to substantiate the simulation's shortcomings. The assumption for this plausibility check is that turning ratios have not changed significantly at the selected intersections between the day of the traffic count and the day the detector data was taken from. Since this superficial qualitative analysis already indicates that the output from both tools is implausible, a more in-depth quantitative analysis does not seem sensible.
A general finding from observing the output of flowrouter is the overestimation of traffic flows which originate and end in the same direction. This leads to unusually high shares of turnarounds at different junctions. For example, many vehicles which start their trip in the northeast of the network drive to LSA 29, where they turn around and return to the northeast. Similar observations can be made at LSA 480, where vehicles coming from the south turn around and return in the same direction. Additionally, congestion effects occur at different intersections. One example can be found at the western leg of LSA 29. Here, the high amount of right-turning traffic cannot be handled by the corresponding traffic light. The following paragraphs illustrate these observations with some examples. These are merely a selection of observations from the simulation.
Figure 6 depicts the situation at LSA 29 and the described congestion on the western leg of the intersection. Additionally, the unusually high share of vehicles that turn around can be seen on the northern leg of the intersection. Table 1 features the comparison of turning ratios between a historic traffic count and the simulation for the northern and western leg of the intersection. The historic count was made in the year 2014 by the City of Munich and contains information on daily traffic and both peak hours. For this comparison, the turning ratios from the evening peak hour are used. The table summarises the traffic flows to the northern and eastern leg of the junction as it was not possible to measure them individually in SUMO. However, observations of the simulation show that most of those vehicles drive in the northern direction. All in all, the comparison confirms the discussed findings from observing the simulation. The share of vehicles which turn around at the northern leg of the intersection was overestimated. Additionally, it can be observed that no vehicles turn right. The turning ratios at Schwanthalerstraße were also not reproduced. While the counts indicate that 60 % of vehicles turn left into Sonnenstraße or go straight into Josephspitalstraße, only 40 % of vehicles do so in the simulation.
The comparison of turning ratios at LSA 480 is depicted in Table 2 for all legs of the intersection. The historic count was made in the year 2018 by the City of Munich and only contains information on daily traffic. Key findings of the comparison are that no vehicles turn right from Sophienstraße into Elisenstraße and no vehicles turn right from Elisenstraße into Sonnenstraße. The share of flows between the eastern leg and the western leg of the intersection is also comparatively low. In the traffic counts, these flows account for 45 % of all incoming traffic flows from the east, while in the simulation only 21 % of vehicles go straight into Elisenstraße. Interestingly, the share of vehicles coming from the south that turn around or turn left is lower than in the traffic counts. However, most of these vehicles turn around, while in the counts turnarounds only make up around 1 % of the inflow from Sonnenstraße. All in all, this qualitative description of the shortcomings of the simulation based on the output from flowrouter shows that the tool was not able to produce plausible traffic flows from the limited amount of available detector data.
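The comparison of turning ratios amounts to normalising the movement counts of one approach and taking the difference between simulation and count. The sketch below illustrates this with invented vehicle numbers that were chosen only so that the shares match the 60 % and 40 % reported for Schwanthalerstraße; the actual counts are given in Table 1 and are not reproduced here. The function names and movement labels are hypothetical.

```python
def turning_shares(counts):
    """Convert movement counts of one approach into shares of its inflow."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("approach has no counted vehicles")
    return {movement: count / total for movement, count in counts.items()}


def share_differences(observed, simulated):
    """Simulated minus observed share per movement.

    Movements missing from the simulation are treated as a share of zero.
    """
    observed_shares = turning_shares(observed)
    simulated_shares = turning_shares(simulated)
    return {
        movement: simulated_shares.get(movement, 0.0) - share
        for movement, share in observed_shares.items()
    }


# Invented counts reproducing only the shares stated in the text.
historic = {"left_or_through": 300, "other": 200}    # 60 % / 40 %
simulation = {"left_or_through": 160, "other": 240}  # 40 % / 60 %
differences = share_differences(historic, simulation)
```

A difference of about 20 percentage points on a single movement, as in this example, is large enough to be visible in the simulation without any further statistical analysis.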
The application of routeSampler also did not produce plausible results. Running the simulation with the set of sampled routes from the tool led to severe congestion and a breakdown of the simulation after around 15 minutes of simulation time. Congestion can be found at several intersections and results from the overestimation of routes which include turnarounds. Additionally, traffic on minor streets such as Sophienstraße at LSA 480 was severely overestimated, which led to congestion in the simulation. Due to the breakdown of traffic in the simulation, the comparison of turning ratios between historic counts and simulation is omitted. The reason for this is that representative vehicle counts could not be performed in the simulation since the congestion prevented vehicles from passing the test detectors.
In conclusion, it can be said that given the limited amount of data it was not possible to create a plausible simulation with flowrouter or routeSampler. However, the results of this work should not be seen as a definitive evaluation of the capabilities of the two tools since the unsatisfactory results are mainly related to the poor data basis. In principle, the study area would be a suitable test network to evaluate the tools since most intersections are equipped with induction loop detectors. Because of that it is recommended to monitor the data platform of the city of Munich to check when the currently erroneous detectors report counting data again. Then a more in-depth analysis of flowrouter and routeSampler could be performed in a future study with a more extensive data basis. Alternatively, both tools could be tested in a different study area with more available counting data. A main research question for this study would be how both tools deal with the fact that measurements are mostly only available at the inflows of intersections. This could show whether it is possible to create plausible traffic flows at intersections as this is a main shortcoming of the results from this simulation given the limited amount of counting data.

Data availability statement
The underlying SUMO networks originate from freely accessible and usable OpenStreetMap data extracts. The induction loop record extracts and signal plans used in this study come from the City of Munich and are currently (April 2023) not freely accessible. Historical induction loop records (together with their exact locations) are available as part of the UTD19 data set (https://www.research-collection.ethz.ch/handle/20.500.11850/437802) for Munich and other European cities [3].