Alexbek at LLMs4OL 2025 Tasks A, B, and C: Heterogeneous LLM Methods for Ontology Learning (Few-Shot Prompting, Ensemble Typing, and Attention-Based Taxonomies)
DOI: https://doi.org/10.52825/ocp.v6i.2899

Keywords: Large Language Models, Ontology Engineering (OE), Ontology Learning, Domain-Specific Knowledge, Retrieval-Augmented Generation, Term Typing, Taxonomy Discovery

Abstract
We present a comprehensive system for addressing Tasks A, B, and C of the LLMs4OL 2025 challenge, which together span the full ontology construction pipeline: term extraction, typing, and taxonomy discovery. Our approach combines retrieval-augmented prompting, zero-shot classification, and attention-based graph modeling, each tailored to the demands of the respective task.
For Task A, we jointly extract domain-specific terms and their ontological types using a retrieval-augmented generation (RAG) pipeline.
The training data is reformulated into document-to-(terms, types) correspondences, and test-time inference augments the prompt with semantically similar training examples. This single-pass method requires no model fine-tuning and improves overall performance through lexical augmentation.
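To make the retrieval step concrete, the following is a minimal sketch of how such a RAG prompt can be assembled. The encoder checkpoint, the toy data, and the prompt template are our illustrative assumptions, not the exact implementation from the repository.

```python
# Minimal sketch of the Task A retrieval-augmented prompting step.
# Encoder choice, toy data, and prompt wording are assumptions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

# Training data reformulated as document -> (term, type) records.
train_docs = [
    "Aspirin is indicated for mild pain and fever.",
    "The hippocampus supports episodic memory formation.",
]
train_labels = [
    [("aspirin", "drug"), ("fever", "symptom")],
    [("hippocampus", "brain region")],
]
train_emb = encoder.encode(train_docs, convert_to_tensor=True)

def build_prompt(test_doc: str, k: int = 2) -> str:
    """Retrieve the k most similar training documents and format them
    as in-context examples for single-pass term + type extraction."""
    query = encoder.encode(test_doc, convert_to_tensor=True)
    hits = util.semantic_search(query, train_emb, top_k=k)[0]
    shots = "\n\n".join(
        f"Document: {train_docs[h['corpus_id']]}\n"
        f"Terms and types: {train_labels[h['corpus_id']]}"
        for h in hits
    )
    return (
        "Extract domain-specific terms and their ontological types.\n\n"
        f"{shots}\n\nDocument: {test_doc}\nTerms and types:"
    )

print(build_prompt("Ibuprofen reduces inflammation."))
```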
Task B, which involves assigning types to given terms, is handled via a dual strategy. In the few-shot setting (for domains with labeled training data), we reuse the RAG scheme with few-shot prompting. In the zero-shot setting (for previously unseen domains), we use a zero-shot classifier that combines cosine similarity scores from multiple embedding models using confidence-based weighting.
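A minimal sketch of such a confidence-weighted zero-shot ensemble is shown below. The choice of ensemble members and the margin-based confidence heuristic are assumptions for illustration, not the paper's exact weighting scheme.

```python
# Sketch of zero-shot typing: each embedding model scores term-type
# pairs by cosine similarity, and per-model scores are combined with
# confidence-based weights. The margin heuristic is an assumption.
import numpy as np
from sentence_transformers import SentenceTransformer

models = [
    SentenceTransformer("all-MiniLM-L6-v2"),   # assumed ensemble members
    SentenceTransformer("all-mpnet-base-v2"),
]

def zero_shot_type(term: str, candidate_types: list[str]) -> str:
    combined = np.zeros(len(candidate_types))
    for model in models:
        emb = model.encode([term] + candidate_types,
                           normalize_embeddings=True)
        sims = emb[1:] @ emb[0]          # cosine similarity to each type
        top2 = np.sort(sims)[-2:]
        confidence = top2[1] - top2[0]   # best-vs-runner-up margin (assumed)
        combined += confidence * sims
    return candidate_types[int(np.argmax(combined))]

print(zero_shot_type("aspirin", ["drug", "disease", "gene"]))
```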
In Task C, we model taxonomy discovery as graph inference. Using embeddings of type labels, we train a lightweight cross-attention layer to predict is-a relations by approximating a soft adjacency matrix.
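The following PyTorch sketch shows one way a lightweight cross-attention head can approximate a soft adjacency matrix over type-label embeddings. The projection dimensions, sigmoid scoring, and BCE training loop are assumptions, not the authors' exact configuration.

```python
# Sketch of the Task C head: attention scores between projected type
# embeddings serve as a soft adjacency matrix for is-a prediction.
import torch
import torch.nn as nn

class SoftAdjacencyHead(nn.Module):
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.q = nn.Linear(dim, hidden)  # projects candidate children
        self.k = nn.Linear(dim, hidden)  # projects candidate parents
        self.scale = hidden ** -0.5

    def forward(self, type_emb: torch.Tensor) -> torch.Tensor:
        # type_emb: (n_types, dim) label embeddings
        scores = self.q(type_emb) @ self.k(type_emb).T * self.scale
        return torch.sigmoid(scores)     # (n_types, n_types) soft adjacency

# Training sketch: BCE against the 0/1 is-a adjacency from training data.
n_types, dim = 10, 384
emb = torch.randn(n_types, dim)          # stand-in label embeddings
target = torch.zeros(n_types, n_types)
target[1, 0] = 1.0                       # e.g. type 1 is-a type 0
head = SoftAdjacencyHead(dim)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy(head(emb), target)
    loss.backward()
    opt.step()
```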
These modular, task-specific solutions enabled us to achieve top-ranking results on the official leaderboard across all three tasks. Taken together, these strategies showcase the scalability, adaptability, and robustness of LLM-based architectures for ontology learning across heterogeneous domains.
Code is available at: https://github.com/BelyaevaAlex/LLMs4OL-Challenge-Alexbek
License
Copyright (c) 2025 Aleksandra Beliaeva, Temurbek Rahmatullaev

This work is licensed under a Creative Commons Attribution 4.0 International License.
Funding data
Russian Science Foundation
Grant number: 25-71-30008