IRIS at LLMs4OL 2025 Tasks B, C and D: Enhancing Ontology Learning Through Data Enrichment and Type Filtering

Authors

I.-A. Latipov, M. Holenderski, N. Meratnia

DOI:

https://doi.org/10.52825/ocp.v6i.2895

Keywords:

Ontology Learning, Large Language Model, Data Augmentation, Definition Mining, Similarity Filtering

Abstract

Ontology Learning (OL) automates the extraction of structured knowledge from unstructured data. We study how model-agnostic data manipulations can boost the performance of Large Language Models (LLMs) on three OL tasks from the LLMs4OL 2025 Challenge: term typing, taxonomy discovery, and non-taxonomic relation extraction. We investigate two input-enrichment techniques that expand the information supplied to an LLM: (i) data augmentation and (ii) the addition of term and type definitions. Complementing these, we also study a pruning technique: similarity-based candidate filtering, which narrows the candidate space in taxonomy discovery and non-taxonomic relation extraction to the most semantically relevant types. Applied individually, each technique improves precision and recall over the vanilla setting in which an LLM is trained on the original data. Applied together, they yield the best scores in five of the seven ontology–task combinations, demonstrating synergistic benefits. Our findings show that careful curation of inputs can by itself yield substantial performance improvements. The codebase and all training artifacts are available in our GitHub repository.
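To make the pruning step concrete, the following is a minimal sketch of similarity-based candidate filtering, assuming an off-the-shelf sentence-transformers embedding model; the function name, model choice, and top-k cutoff are illustrative assumptions, not the authors' implementation.

# Minimal sketch of similarity-based candidate filtering (illustrative;
# the embedding model and function names are assumptions, not the authors' code).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def filter_candidate_types(term, candidate_types, top_k=10):
    """Keep only the top_k candidate types most semantically similar to the term."""
    term_emb = model.encode(term, convert_to_tensor=True)
    type_embs = model.encode(candidate_types, convert_to_tensor=True)
    scores = util.cos_sim(term_emb, type_embs)[0]      # cosine similarity per candidate
    top_idx = scores.argsort(descending=True)[:top_k]  # indices of the best matches
    return [candidate_types[i] for i in top_idx]

# Example: shrink the type space before prompting the LLM
candidates = ["chemical entity", "anatomical structure", "measurement unit", "organism"]
print(filter_candidate_types("sodium chloride", candidates, top_k=2))

The filtered shortlist can then be placed in the LLM prompt in place of the full type inventory, reducing both context length and the number of distractor candidates.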



Published

2025-10-01

How to Cite

Latipov, I.-A., Holenderski, M., & Meratnia, N. (2025). IRIS at LLMs4OL 2025 Tasks B, C and D: Enhancing Ontology Learning Through Data Enrichment and Type Filtering. Open Conference Proceedings, 6. https://doi.org/10.52825/ocp.v6i.2895

Conference Proceedings Volume

Open Conference Proceedings, Vol. 6 (2025)

Section

LLMs4OL 2025 Task Participant Long Papers