SBU-NLP at LLMs4OL 2025 Tasks A, B, and C: Stage-Wise Ontology Construction Through LLMs Without any Training Procedure
DOI: https://doi.org/10.52825/ocp.v6i.2887

Keywords: Automated Ontology Construction, Large Language Models, Prompt Engineering

Abstract
Automated ontology construction is a challenging task that traditionally requires extensive domain expertise, data preprocessing, and resource-intensive model training. While learning-based methods with fine-tuning are common, they often suffer from high computational costs and limited generalizability across domains. This paper explores a fully automated approach that leverages powerful large language models (LLMs) through prompt engineering, eliminating the need for training or fine-tuning. We participated in the LLMs4OL 2025 shared task, which includes four subtasks: extracting ontological terms and types (Text2Onto), assigning generalized types to terms (Term Typing), discovering taxonomic relations (Taxonomy Discovery), and extracting non-taxonomic semantic relations (Non-Taxonomic Relation Extraction). Our team focused on the first three tasks, using stratified random sampling, simple random sampling, and chunking-based strategies to incorporate training examples into prompts despite context-window size limits. This simple yet general approach proved effective across these tasks, enabling high-quality ontology construction without additional annotations or training. Additionally, we show that pretrained sentence embedding models ranging from 0.1B to 1.5B parameters perform comparably to a simple F1 token-overlap baseline in taxonomy discovery, suggesting that embedding-based methods may not always offer significant advantages. Our findings highlight that prompt-based strategies with modern LLMs enable efficient, scalable, and domain-independent ontology construction, providing a promising alternative to traditional, resource-heavy methods.
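The F1 token-overlap baseline mentioned above can be sketched as follows. This is a minimal illustration of the general technique, not the paper's exact implementation; the function names and the parent-ranking usage are hypothetical.

```python
def token_f1(a: str, b: str) -> float:
    """F1 score over the overlap of the two terms' token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    overlap = len(ta & tb)
    if overlap == 0:
        return 0.0
    precision = overlap / len(ta)  # shared tokens relative to term a
    recall = overlap / len(tb)     # shared tokens relative to term b
    return 2 * precision * recall / (precision + recall)

def rank_parents(term: str, candidates: list[str]) -> list[str]:
    """Rank candidate parent types for a term by token overlap (hypothetical usage)."""
    return sorted(candidates, key=lambda c: token_f1(term, c), reverse=True)
```

For example, `token_f1("heart disease", "disease")` yields 2/3, so "disease" would outrank an unrelated candidate such as "organ" as a taxonomic parent; the paper's finding is that sentence-embedding similarity does not clearly beat this trivial lexical signal.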
License
Copyright (c) 2025 Rashin Rahnamoun, Mehrnoush Shamsfard

This work is licensed under a Creative Commons Attribution 4.0 International License.