Phoenixes at LLMs4OL 2024 Tasks A, B, and C: Retrieval Augmented Generation for Ontology Learning
DOI: https://doi.org/10.52825/ocp.v4i.2482

Keywords: Large Language Models, Ontology Learning, Retrieval Augmented Generation, Term Typing, Taxonomy Discovery, Non-Taxonomic Relationship Extraction

Abstract
Large language models (LLMs) have shown strong capabilities in ontology learning (OL), where they automatically extract knowledge from text. In this paper, we propose a Retrieval Augmented Generation (RAG) formulation for the three ontology learning tasks defined in the LLMs4OL Challenge at ISWC 2024. For Task A, term typing, we treat each term as a query, encode it with a query encoder, and search a knowledge base of type embeddings produced by a context encoder; a zero-shot prompt template then asks the LLM which of the retrieved types are appropriate for the given term. For Task B, taxonomy discovery, we compute a similarity matrix with an encoder-based transformer model and apply a similarity threshold so that only sufficiently similar type pairs are passed to the LLM, which decides whether a pair holds an "is-a" relation and, if so, which type is the parent and which is the child. Finally, for Task C, non-taxonomic relationship extraction, we combine the two approaches: the Task B formulation first identifies child-parent pairs, and the Task A formulation then assigns each pair an appropriate relationship. For the LLMs4OL challenge, we evaluated the proposed framework on five subtasks of Task A, all subtasks of Task B, and one subtask of Task C using the Mistral-7B LLM.
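The retrieval steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy vectors stand in for the outputs of DPR-style query/context encoders, and all function names, embeddings, and prompt wording are assumptions for demonstration.

```python
import numpy as np

def retrieve_types(query_vec, type_vecs, type_names, top_k=3):
    """Task A sketch: return the top_k types by cosine similarity
    between the query embedding and the type-embedding knowledge base."""
    q = query_vec / np.linalg.norm(query_vec)
    t = type_vecs / np.linalg.norm(type_vecs, axis=1, keepdims=True)
    sims = t @ q
    order = np.argsort(-sims)[:top_k]
    return [(type_names[i], float(sims[i])) for i in order]

def zero_shot_prompt(term, candidates):
    """Build a zero-shot prompt asking the LLM to pick appropriate types
    from the retrieved candidates (illustrative wording)."""
    types = ", ".join(name for name, _ in candidates)
    return (f"Given the term '{term}', which of the following candidate "
            f"types apply to it? Candidates: {types}. "
            f"Answer with the matching type names only.")

def candidate_pairs(type_vecs, type_names, threshold=0.5):
    """Task B sketch: keep only type pairs above a similarity threshold;
    each surviving pair would then be sent to the LLM to test for an
    'is-a' relation and its parent/child direction."""
    t = type_vecs / np.linalg.norm(type_vecs, axis=1, keepdims=True)
    sim = t @ t.T
    return [(type_names[i], type_names[j])
            for i in range(len(type_names))
            for j in range(i + 1, len(type_names))
            if sim[i, j] >= threshold]

# Toy knowledge base of three types (hypothetical embeddings).
type_names = ["disease", "chemical", "gene"]
type_vecs = np.array([[1.0, 0.1, 0.0],
                      [0.1, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
query_vec = np.array([0.9, 0.3, 0.1])  # stand-in embedding for a term

candidates = retrieve_types(query_vec, type_vecs, type_names, top_k=2)
print(candidates[0][0])                       # most similar type
print(zero_shot_prompt("asthma", candidates))
```

In the full pipeline, the prompt built here would be sent to Mistral-7B, and Task C would chain the two functions: threshold-filtered pairs from `candidate_pairs` are first confirmed as child-parent pairs, then typed with a retrieval-plus-prompt step like `retrieve_types`.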
License
Copyright (c) 2024 Mahsa Sanaei, Fatemeh Azizi, Hamed Babaei Giglou
This work is licensed under a Creative Commons Attribution 4.0 International License.