Stream Processing Tools for Analyzing Objects in Motion Sending High-Volume Location Data
Keywords: stream processing, location data, transport, mobility, AIS
Recent years have seen a significant increase in the amount of easily accessible data on transport and mobility. These data mostly arrive as massive streams of high velocity, magnitude, and heterogeneity, representing flows of goods and shipments and the movements of fleets. Handling such streams requires a scalable framework and tools capable of processing them. In this paper we propose an approach for selecting stream processing software for use in the transportation domain. We provide an overview of candidate stream processing technologies, followed by a method for choosing software for real-time analysis of data streams coming from objects in motion. We selected two solutions, Apache Spark Streaming and Apache Flink, and benchmarked them on a real-world task. We also identify the caveats and challenges of implementing such a solution in practice.
Copyright (c) 2021 Krzysztof Wecel, Marcin Szmydt, Milena Stróżyna
This work is licensed under a Creative Commons Attribution 4.0 International License.