IRIS at LLMs4OL 2025 Tasks B, C and D: Enhancing Ontology Learning Through Data Enrichment and Type Filtering
DOI:
https://doi.org/10.52825/ocp.v6i.2895

Keywords:
Ontology Learning, Large Language Model, Data Augmentation, Definition Mining, Similarity Filtering

Abstract
Ontology Learning (OL) automates the extraction of structured knowledge from unstructured data. We study how model-agnostic data manipulations can boost the performance of Large Language Models (LLMs) on three OL tasks from the LLMs4OL 2025 Challenge: term typing, taxonomy discovery, and non-taxonomic relation extraction. We investigate two input-enrichment techniques that expand the information supplied to an LLM: (i) data augmentation and (ii) the addition of term and type definitions. Complementing these, we also study a pruning technique: similarity-based candidate filtering, which narrows the candidate space in taxonomy discovery and non-taxonomic relation extraction to the most semantically relevant types. Applied individually, each technique improves precision and recall over the vanilla setting, in which an LLM is trained on the original data. Applied together, the techniques yield the best scores in five of the seven ontology-task combinations, showing synergistic benefits. Our findings demonstrate that careful curation of inputs can by itself yield substantial performance improvements. The codebase and all training artifacts are available in our GitHub repository.
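To make the filtering step concrete, the sketch below prunes a type inventory by embedding similarity before the surviving candidates are handed to an LLM. This is a minimal illustration, not the paper's implementation: the encoder (all-MiniLM-L6-v2), the top-k cutoff, and the filter_candidates helper are all assumptions made for this example.

    # Minimal sketch of similarity-based candidate filtering.
    # Assumptions: sentence-transformers is installed; the encoder,
    # top_k value, and helper name are illustrative, not the paper's.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

    def filter_candidates(term, candidate_types, top_k=10):
        # Embed the query term and every candidate type.
        term_emb = model.encode(term, convert_to_tensor=True)
        cand_embs = model.encode(candidate_types, convert_to_tensor=True)
        # Cosine similarity between the term and each candidate.
        scores = util.cos_sim(term_emb, cand_embs)[0]
        # Keep only the top_k most semantically relevant types.
        top = scores.argsort(descending=True)[:top_k].tolist()
        return [candidate_types[i] for i in top]

    # Example: shrink the candidate space before prompting the LLM.
    print(filter_candidates(
        "granite",
        ["igneous rock", "sedimentary rock",
         "weather phenomenon", "chemical element"],
        top_k=2))

Replacing an exhaustive candidate list with a short, semantically filtered one both reduces prompt length and removes distractor types, which is the mechanism the abstract credits for the gains in taxonomy discovery and non-taxonomic relation extraction.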
License
Copyright (c) 2025 Insan-Aleksandr Latipov, Mike Holenderski, Nirvana Meratnia

This work is licensed under a Creative Commons Attribution 4.0 International License.