silp_nlp at LLMs4OL 2025 Tasks A, B, C, and D: Clustering-Based Ontology Learning Using LLMs

Pankaj Kumar Goyal; Sumit Singh; Uma Shanker Tiwary

doi:10.52825/ocp.v6i.2900

Authors

Pankaj Kumar Goyal Indian Institute of Information Technology Allahabad https://orcid.org/0009-0007-5501-9111
Sumit Singh Indian Institute of Information Technology Allahabad https://orcid.org/0000-0002-3292-4131
Uma Shanker Tiwary Indian Institute of Information Technology Allahabad https://orcid.org/0000-0001-7206-9013

DOI:

https://doi.org/10.52825/ocp.v6i.2900

Keywords:

Ontology Learning, Large Language Models, Prompt Engineering, Clustering, Knowledge Representation

Abstract

This paper presents the participation of the silp_nlp team in the LLMs4OL 2025 Challenge, where we addressed four core tasks in ontology learning: Text2Onto (Task A), Term Typing (Task B), Taxonomy Discovery (Task C), and Non-Taxonomic Relation Extraction (Task D). Building on our experience from the first edition, we proposed a clustering-enhanced methodology grounded in large language models (LLMs), integrating domain-adapted transformer models such as pranav-s/MaterialsBERT and dmis-lab/biobert-v1.1, together with proprietary LLMs from Grok. Our framework combined lexical and semantic clustering with adaptive prompting to tackle entity and type extraction, semantic classification, hierarchical structure discovery, and complex relation modeling. Experimental results across 18 subtasks highlight the strength of our approach, particularly in blind and zero-shot scenarios. Notably, our model achieved multiple first-rank scores in taxonomy discovery and non-taxonomic relation extraction subtasks, validating the efficacy of clustering when coupled with semantically specialized LLMs. This work demonstrates that clustering-driven, LLM-based approaches can advance robust and scalable ontology learning across diverse domains.
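To make the idea of pairing clustering with prompting more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a purely lexical similarity (Jaccard overlap of character trigrams) and a greedy single-pass grouping, then builds one term-typing prompt per cluster. The function names, the 0.3 threshold, the prompt wording, and the candidate type labels are all hypothetical. A semantic variant would presumably replace the trigram similarity with a similarity over embeddings from a domain model such as pranav-s/MaterialsBERT.

```python
from typing import List, Set


def char_ngrams(term: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lowercased, space-padded term."""
    padded = f" {term.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lexical_clusters(terms: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy single-pass clustering (an assumption, not the paper's algorithm).

    Each term joins the first cluster whose seed (its first member) is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for term in terms:
        grams = char_ngrams(term)
        for cluster in clusters:
            if jaccard(grams, char_ngrams(cluster[0])) >= threshold:
                cluster.append(term)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([term])
    return clusters


def build_prompt(cluster: List[str], candidate_types: List[str]) -> str:
    """Build one hypothetical term-typing prompt covering a whole cluster."""
    term_lines = "\n".join(f"- {t}" for t in cluster)
    type_list = ", ".join(candidate_types)
    return (
        "Assign each term below exactly one type from this list: "
        f"{type_list}.\n"
        "The terms are lexically related, so consider them together.\n"
        f"{term_lines}\n"
        "Answer as 'term: type', one per line."
    )


if __name__ == "__main__":
    terms = ["polyethylene", "polyethylene glycol", "aspirin"]
    for group in lexical_clusters(terms):
        # Type labels here are placeholders, not taken from the challenge data.
        print(build_prompt(group, ["Polymer", "Drug"]))
        print()
```

Grouping related terms before prompting lets a single LLM call see near-duplicates and variants side by side, which is one plausible reason a clustering step could help in blind and zero-shot settings.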


