Pisu, A., Pompianu, L., Salatino, A., Osborne, F., Riboni, D., Motta, E., et al. (2024). Leveraging Language Models for Generating Ontologies of Research Topics. In Proceedings of the 3rd International Workshop on Knowledge Graph Generation from Text (TEXT2KG) and Data Quality meets Machine Learning and Knowledge Graphs (DQMLKG), co-located with the Extended Semantic Web Conference (ESWC 2024) (pp. 1-11). CEUR-WS.
Leveraging Language Models for Generating Ontologies of Research Topics
Osborne F.;
2024
Abstract
The current generation of artificial intelligence technologies, such as smart search engines, recommendation systems, tools for systematic reviews, and question-answering applications, plays a crucial role in helping researchers manage and interpret scientific literature. Taxonomies and ontologies of research topics are a fundamental part of this environment as they allow intelligent systems and scientists to navigate the ever-growing number of research papers. However, creating these classifications manually is an expensive and time-consuming process, often resulting in outdated and coarse-grained representations. Consequently, researchers have been focusing on developing automated or semi-automated methods to create taxonomies of research topics. This paper studies the application of transformer-based language models for generating research topic ontologies. Specifically, we have developed a model leveraging SciBERT to identify four semantic relationships between research topics (supertopic, subtopic, same-as, and other) and conducted a comparative analysis against alternative solutions. The preliminary findings indicate that the transformer-based model significantly surpasses the performance of models reliant on traditional features.
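To make the task concrete, the sketch below shows how a SciBERT encoder can be set up as a pairwise classifier over the four relations named in the abstract. This is a minimal illustration, not the authors' implementation: the use of the public `allenai/scibert_scivocab_uncased` checkpoint, the sentence-pair input encoding, and the `classify_topic_pair` helper are assumptions, and the model would need to be fine-tuned on labelled topic pairs before its predictions are meaningful.

```python
# Hypothetical sketch of pairwise topic-relation classification with SciBERT.
# The label set (supertopic, subtopic, same-as, other) comes from the abstract;
# the input formatting and training setup are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["supertopic", "subtopic", "same-as", "other"]

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/scibert_scivocab_uncased", num_labels=len(LABELS)
)  # the classification head is randomly initialised; fine-tune before use

def classify_topic_pair(topic_a: str, topic_b: str) -> str:
    """Predict the semantic relation holding from topic_a to topic_b."""
    # Encode the two topic labels as a sentence pair:
    # [CLS] topic_a [SEP] topic_b [SEP]
    inputs = tokenizer(topic_a, topic_b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# Example call (output is only meaningful after fine-tuning):
print(classify_topic_pair("machine learning", "deep learning"))
```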
File: Pisu-2024-TEXT2KG-VoR.pdf (open access)
Description: This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 259.14 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.