Kusa, W., Peikos, G., Staudinger, M., Lipani, A., & Hanbury, A. (2024). Normalised Precision at Fixed Recall for Evaluating TAR. In ICTIR 2024: Proceedings of the 2024 ACM SIGIR International Conference on the Theory of Information Retrieval (pp. 43-49). Association for Computing Machinery. https://doi.org/10.1145/3664190.3672532
Normalised Precision at Fixed Recall for Evaluating TAR
Kusa W.; Peikos G.; Staudinger M.; Lipani A.; Hanbury A.
2024
Abstract
A popular approach to High-Recall Information Retrieval (HRIR) is Technology-Assisted Review (TAR), which uses information retrieval and machine learning techniques to aid the review of large document collections. TAR systems are commonly used in legal eDiscovery and medical systematic literature reviews. Successful TAR systems find the majority of relevant documents with the fewest manual assessments. Previous work typically evaluated TAR models retrospectively, assuming that the system first achieves a specific, fixed Recall level and then measuring model quality (for instance, work saved at r% Recall). This paper presents an analysis of one such measure: Precision at r% Recall (P@r%). We show that the minimum P@r% score depends on the dataset, and therefore this measure should not be used for evaluation across topics or datasets. We propose its min-max normalised version (nP@r%) and show that it is equal to the product of TNR and Precision scores. Our analysis shows that nP@r% is the measure least correlated with the percentage of relevant documents in the dataset and can be used to focus on additional aspects of TAR tasks that are not captured by current measures. Finally, we introduce a variation of nP@r% that is the geometric mean of TNR and Precision, preserving the properties of nP@r% while having a lower coefficient of variation.
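The abstract states that nP@r% equals the product of the TNR and Precision scores at the rank where r% Recall is reached, and that the proposed variant is their geometric mean. The following sketch illustrates those two identities on a ranked list of binary relevance labels; the function name `metrics_at_recall` and its exact tie-breaking (stopping at the first rank reaching the Recall target) are illustrative assumptions, not the paper's reference implementation.

```python
import math

def metrics_at_recall(relevance, r=0.95):
    """Illustrative sketch: Precision, TNR, nP@r% (= TNR * Precision), and the
    geometric-mean variant, evaluated at the first rank reaching r% Recall.

    `relevance` is the ranked list of 0/1 relevance labels produced by a TAR
    system. Assumes at least one relevant and one non-relevant document.
    """
    total_rel = sum(relevance)
    target = math.ceil(r * total_rel)        # relevant docs needed for r% Recall
    found = 0
    for k, rel in enumerate(relevance, start=1):
        found += rel
        if found >= target:
            break                            # k docs reviewed to reach r% Recall
    tp, fp = found, k - found
    fn = total_rel - found
    tn = len(relevance) - k - fn             # non-relevant docs left unreviewed
    precision = tp / k
    tnr = tn / (tn + fp)                     # True Negative Rate
    n_precision = tnr * precision            # nP@r%, per the abstract's identity
    gm_variant = math.sqrt(tnr * precision)  # geometric-mean variation
    return precision, tnr, n_precision, gm_variant
```

For example, for the ranking `[1, 0, 1, 1, 0, 0, 1, 0, 0, 0]` with `r=0.75`, the Recall target is met after 4 reviewed documents, giving Precision 0.75, TNR 5/6, and nP@r% = 0.625.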
| File | Size | Format |
|---|---|---|
| Kusa-2024-ICTIR-VoR.pdf (open access; Publisher's Version, VoR; Creative Commons licence) | 1.33 MB | Adobe PDF |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.