Open Science and Language Data: Expectations vs. Reality

The Role of Research Data Infrastructures




Research infrastructures, Text data, Open Science


Language data are essential for any scientific endeavor. However, unlike numerical data, language data are often protected by copyright, as they easily meet the threshold of originality. The role of research infrastructures (such as CLARIN, DARIAH, and Text+) is to bridge the gap between uses allowed by statutory exceptions and the requirements of Open Science. This is achieved on the one hand by sharing language data produced by research organisations with the widest possible circle of persons, and on the other by mutualizing efforts towards copyright clearance and appropriate licensing of datasets.




Court of Justice of the European Union, Case C-5/08, Infopaq International A/S v. Danske Dagblades Forening, 16 July 2009.

Gesetz zur Angleichung des Urheberrechts an die aktuellen Erfordernisse der Wissensgesellschaft vom 1. September 2017, Bundesgesetzblatt Jahrgang 2017 Teil I Nr. 61, pp. 3346-3351.

Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC, PE/51/2019/REV/1, OJ L 130, 17.5.2019, pp. 92–125.

UNESCO Recommendation on Open Science, UNESCO 2021.




How to Cite

Kamocki, P., Hinrichs, E., Leinen, P., Springer, S., Witt, A., & Zechmann, D. (2023). Open Science and Language Data: Expectations vs. Reality: The Role of Research Data Infrastructures. Proceedings of the Conference on Research Data Infrastructure, 1.

Conference Proceedings Volume


Humanities & Social Sciences
Received 2023-04-26
Accepted 2023-06-29
Published 2023-09-07