Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop

Osborne F.;
2024

Abstract

Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated knowledge graph (KG) generation methodologies, using both human-in-the-loop (HiL) and large language model (LLM)-in-the-loop validation. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12 percentage points (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4-percentage-point increase in the F1 score (from 77% to 81%).
Type: paper
Keywords: Hybrid Human-AI Workflows; Knowledge Graph Evaluation; Large Language Models; Scientific Knowledge Graph
Language: English
Conference: 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment, Sci-K 2024 - November 12, 2024
Proceedings: Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment co-located with 23rd International Semantic Web Conference (ISWC 2024)
Year: 2024
Volume: 3780
URL: https://ceur-ws.org/Vol-3780/
Access: open
Tsaneva, S., Dessi, D., Osborne, F., Sabou, M. (2024). Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop. In Proceedings of the 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment co-located with 23rd International Semantic Web Conference (ISWC 2024). CEUR-WS.
Files in this item:
Tsaneva-2024-ISWC-VoR.pdf

Open access

Description: This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Size: 1.83 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/525455