Enterprise-Wide Metadata Management
An Industry Case on the Current State and Challenges
Keywords: Metadata Management, Data Sharing, Data Transparency, Data Catalog, Data Marketplace
Metadata management is a crucial success factor for companies today because, for example, it enables them to fully exploit the value of their data and to comply with legal requirements. With the emergence of new concepts, such as the data lake, and new objectives, such as the enterprise-wide sharing of data, metadata management has evolved and now poses a renewed challenge for companies. In this context, we interviewed a globally active manufacturer to reveal how metadata management is implemented in practice today, what challenges companies face, and whether these challenges constitute research gaps. As an outcome, we present the company’s metadata management goals and the corresponding solution approaches and challenges. An evaluation of the challenges through a literature and tool review yields three research gaps, which concern (1) metadata management for data lakes, (2) categorizations and compositions of metadata management tools for comprehensive metadata management, and (3) the use of data marketplaces as metadata-driven exchange platforms within an enterprise. The gaps lay the groundwork for further research activities in the field of metadata management, and the industry case represents a starting point for research to realign with real-world industry needs.
Copyright (c) 2021 Rebecca Eichler, Corinna Giebler, Christoph Gröger, Eva Hoos, Holger Schwarz, Bernhard Mitschang
This work is licensed under a Creative Commons Attribution 4.0 International License.