Knowledge Organization Systems (KOS), such as ontologies, taxonomies, and thesauri, play a crucial role in organising scientific knowledge. They help scientists navigate the vast landscape of research literature and are essential for building intelligent systems such as smart search engines, recommendation systems, conversational agents, and advanced analytics tools. However, the manual creation of these KOSs is costly, time-consuming, and often leads to outdated and overly broad representations. As a result, researchers have been exploring automated or semi-automated methods for generating ontologies of research topics. This paper analyses the use of large language models (LLMs) to identify semantic relationships between research topics. We specifically focus on six open and lightweight LLMs (up to 10.7 billion parameters) and use two zero-shot reasoning strategies to identify four types of relationships: broader, narrower, same-as, and other. Our preliminary analysis indicates that Dolphin2.1-OpenOrca-7B performs strongly in this task, achieving a 0.853 F1-score against a gold standard of 1,000 relationships derived from the IEEE Thesaurus. These promising results bring us one step closer to the next generation of tools for automatically curating KOSs, ultimately making the scientific literature easier to explore.

Aggarwal, T., Salatino, A., Osborne, F., Motta, E. (2024). Identifying Semantic Relationships Between Research Topics Using Large Language Models in a Zero-Shot Learning Setting. In Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment co-located with 23rd International Semantic Web Conference (ISWC 2024). CEUR-WS.

Identifying Semantic Relationships Between Research Topics Using Large Language Models in a Zero-Shot Learning Setting

Osborne F.;
2024

Type: paper
Keywords: Large Language Models; Ontology Generation; Research Topics; Scholarly Knowledge; Scientific Knowledge Graphs; Zero-Shot Learning
Language: English
Conference: 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment, Sci-K 2024 - November 12, 2024
Proceedings: Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment co-located with 23rd International Semantic Web Conference (ISWC 2024)
Volume: 3780
URL: https://ceur-ws.org/Vol-3780/
Access: open
Files in this item:
File: Aggarwal-2024-ISWC-VoR.pdf
Access: open access
Description: This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 503.5 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/525457
Citations
  • Scopus 1