Phoenixes at LLMs4OL 2024 Tasks A, B, and C: Retrieval Augmented Generation for Ontology Learning

Authors

M. Sanaei, F. Azizi, and H. Babaei Giglou

DOI:

https://doi.org/10.52825/ocp.v4i.2482

Keywords:

Large Language Models, Ontology Learning, Retrieval Augmented Generation, Term Typing, Taxonomy Discovery, Non-Taxonomic Relationship Extraction

Abstract

Large language models (LLMs) have shown great capabilities in ontology learning (OL), where they automatically extract knowledge from text. In this paper, we propose a Retrieval Augmented Generation (RAG) formulation for the three ontology learning tasks defined in the LLMs4OL Challenge at ISWC 2024. For Task A, term typing, we treat each term as a query, encode it with a query encoder model, and search a knowledge base of type embeddings produced by a context encoder model. Using a zero-shot prompt template, we then ask the LLM which of the retrieved types are appropriate for the given term. Similarly, for Task B, taxonomy discovery, we compute a similarity matrix over types with an encoder-based transformer model, apply a similarity threshold to retain only sufficiently similar pairs, and query the LLM to decide whether each pair stands in an "is-a" relation and, if so, which type is the parent and which is the child. Finally, for Task C, non-taxonomic relationship extraction, we combine the two approaches: the Task B formulation first identifies child-parent pairs, and the Task A formulation then assigns each pair an appropriate relationship. For the LLMs4OL challenge, we evaluated the proposed framework on five subtasks of Task A, all subtasks of Task B, and one subtask of Task C using the Mistral-7B LLM.
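
To make the Task A formulation concrete, here is a minimal Python sketch that pairs a DPR-style query encoder and context encoder with a zero-shot prompt. The Hugging Face checkpoint names, the toy type inventory, the top-k value, and the prompt wording are illustrative assumptions, not the exact configuration used in the paper.

import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

# Query and context encoders in the style of dense passage retrieval;
# the checkpoints below are stand-ins, not the paper's exact models.
q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
c_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
c_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

types = ["chemical compound", "disease", "anatomical structure"]  # toy type inventory

# Encode the type inventory once with the context encoder (the "knowledge base").
with torch.no_grad():
    type_emb = c_enc(**c_tok(types, padding=True, return_tensors="pt")).pooler_output

def top_k_types(term, k=2):
    """Encode the term as a query and retrieve the k most similar types."""
    with torch.no_grad():
        q = q_enc(**q_tok(term, return_tensors="pt")).pooler_output
    scores = q @ type_emb.T  # dot-product similarity, as in DPR
    return [types[i] for i in scores.topk(k).indices[0]]

def typing_prompt(term):
    """Zero-shot prompt asking the LLM to pick from the retrieved candidate types."""
    candidates = ", ".join(top_k_types(term))
    return (f"Which of the following types are appropriate for the term '{term}'? "
            f"Candidates: {candidates}. Answer with the matching types only.")

print(typing_prompt("aspirin"))  # the returned string would be sent to Mistral-7B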
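
The Task B filtering step can be sketched in the same spirit: an encoder-based transformer produces a pairwise similarity matrix, a threshold keeps candidate pairs, and each surviving pair becomes a zero-shot "is-a" prompt. The sentence-transformers model name and the 0.5 cutoff are illustrative assumptions.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in encoder model
types = ["vehicle", "car", "protein", "enzyme"]  # toy type set

emb = model.encode(types, convert_to_tensor=True)
sim = util.cos_sim(emb, emb)  # pairwise cosine similarity matrix

THRESHOLD = 0.5  # assumed cutoff; only pairs above it are sent to the LLM
pairs = [(types[i], types[j])
         for i in range(len(types)) for j in range(i + 1, len(types))
         if sim[i, j] >= THRESHOLD]

for a, b in pairs:
    # Zero-shot prompt asking the LLM for the taxonomic direction.
    prompt = (f"Is there an 'is-a' relation between '{a}' and '{b}'? "
              f"If so, which one is the parent and which one is the child?")
    print(prompt)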
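
For Task C, the two pieces compose: pairs proposed by the Task B step are labeled with a relation via a Task A-style prompt over a candidate relation inventory. The relation labels and prompt wording below are illustrative assumptions, not the paper's actual inventory.

relations = ["part of", "located in", "derives from"]  # toy relation inventory

def relation_prompt(child, parent):
    """Ask the LLM to assign a non-taxonomic relation to a child-parent pair."""
    candidates = ", ".join(relations)
    return (f"Which relation best links '{child}' to '{parent}'? "
            f"Candidates: {candidates}. Answer with exactly one label.")

print(relation_prompt("engine", "car"))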

Published

2024-10-02

How to Cite

Sanaei, M., Azizi, F., & Babaei Giglou, H. (2024). Phoenixes at LLMs4OL 2024 Tasks A, B, and C: Retrieval Augmented Generation for Ontology Learning. Open Conference Proceedings, 4, 39–47. https://doi.org/10.52825/ocp.v4i.2482

Conference Proceedings Volume

Open Conference Proceedings, Vol. 4 (2024)

Section

LLMs4OL 2024 Task Participant Papers