Pozzi, R., Barbera, V., Alva Principe, R., Giardini, D., Rubini, R., Palmonari, M. (2025). Combining Knowledge Graphs and NLP to Analyze Instant Messaging Data in Criminal Investigations. In Web Information Systems Engineering – WISE 2024 25th International Conference, Doha, Qatar, December 2–5, 2024, Proceedings, Part II (pp. 427-442). Springer Science and Business Media Deutschland GmbH [10.1007/978-981-96-0567-5_30].

Combining Knowledge Graphs and NLP to Analyze Instant Messaging Data in Criminal Investigations

Pozzi R.; Barbera V.; Alva Principe R.; Rubini R.; Palmonari M.
2025

Abstract

Criminal investigations often involve the analysis of messages exchanged through instant messaging apps such as WhatsApp, which can be an extremely labor-intensive task. Our approach integrates knowledge graphs and NLP models to support this analysis by semantically enriching data collected from suspects’ mobile phones, helping prosecutors and investigators search the data and gain valuable insights. Our semantic enrichment process involves extracting message data and modeling it as a knowledge graph, generating transcriptions of voice messages, and annotating the data using an end-to-end entity extraction approach. We adopt two different solutions to help users gain insights into the data: one based on querying and visualizing the graph, and one based on semantic search. The proposed approach ensures that users can verify the information by accessing the original data. While we report on early results and prototypes developed in the context of an ongoing project, our proposal has been applied in practice to real investigation data. As a consequence, we had the chance to interact closely with prosecutors, collecting positive feedback and also identifying interesting opportunities and promising research directions to share with the research community.
paper
Criminal Investigations; Entity Extraction; Instant Messaging Data; Knowledge Graph; Semantic Enrichment
English
Web Information Systems Engineering – WISE 2024 25th International Conference - December 2–5, 2024
2024
Barhamgi, M; Wang, H; Wang, X
Web Information Systems Engineering – WISE 2024 25th International Conference, Doha, Qatar, December 2–5, 2024, Proceedings, Part II
9789819605668
3-Dec-2024
2025
15437 LNCS
427
442
reserved
Files in this record:
File: Pozzi-2024-WISE-VoR.pdf
Access: archive managers only
Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 901.28 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/536102