Normalised Precision at Fixed Recall for Evaluating TAR

Kusa W.; Peikos G.; Staudinger M.; Lipani A.; Hanbury A.
2024

Abstract

A popular approach to High-Recall Information Retrieval (HRIR) is Technology-Assisted Review (TAR), which uses information retrieval and machine learning techniques to aid the review of large document collections. TAR systems are commonly used in legal eDiscovery and medical systematic literature reviews. Successful TAR systems find the majority of relevant documents with the fewest manual assessments. Previous work typically evaluated TAR models retrospectively, assuming that the system first achieves a specific, fixed Recall level and then measuring model quality (for instance, work saved at r% Recall). This paper presents an analysis of one such measure: Precision at r% Recall (P@r%). We show that the minimum Precision at r% score depends on the dataset, and therefore this measure should not be used for evaluation across topics or datasets. We propose its min-max normalised version (nP@r%) and show that it is equal to the product of the TNR and Precision scores. Our analysis shows that nP@r% is the measure least correlated with the percentage of relevant documents in the dataset and can be used to focus on additional aspects of the TAR task that are not captured by current measures. Finally, we introduce a variation of nP@r%, the geometric mean of TNR and Precision, which preserves the properties of nP@r% while having a lower coefficient of variation.
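The abstract states that nP@r% equals the product of the TNR and Precision scores, and that the proposed variant is their geometric mean. The sketch below illustrates, under assumptions, how these quantities could be computed from a ranked list of relevance labels at the rank where r% Recall is first reached; the cutoff convention and the function name `tar_scores_at_recall` are illustrative assumptions, not the paper's exact definitions.

```python
import math

def tar_scores_at_recall(relevance, r=0.95):
    """Minimal sketch: Precision, TNR, and their product / geometric mean
    at the first rank where r% Recall is reached.

    `relevance` is a ranked list of 0/1 labels (1 = relevant), ordered by
    the TAR system's review priority. Assumes at least one relevant and
    one non-relevant document in the collection.
    """
    total_relevant = sum(relevance)
    total_irrelevant = len(relevance) - total_relevant
    target = r * total_relevant  # relevant documents needed for r% Recall

    found = 0
    for rank, rel in enumerate(relevance, start=1):
        found += rel
        if found >= target:
            tp = found                  # relevant documents screened so far
            fp = rank - found           # non-relevant documents screened so far
            precision = tp / rank                              # P@r%
            tnr = (total_irrelevant - fp) / total_irrelevant   # true negative rate
            return {
                "P@r%": precision,
                "TNR": tnr,
                "TNR x Precision": tnr * precision,            # per the abstract, equals nP@r%
                "geometric mean": math.sqrt(tnr * precision),  # the proposed variant
            }
    return None  # r% Recall never reached

# Toy usage: 3 of 4 relevant documents are found within the first 4 of 10 documents.
print(tar_scores_at_recall([1, 1, 0, 1, 0, 0, 0, 1, 0, 0], r=0.75))
```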
Type: paper
Keywords: citation screening; evaluation; high-recall retrieval; normalised precision; precision at recall; systematic reviews; TAR
Language: English
Conference: The 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 13 July 2024
Year: 2024
Published in: ICTIR 2024 - Proceedings of the 2024 ACM SIGIR International Conference on the Theory of Information Retrieval
ISBN: 9798400706813
Pages: 43-49
Access: open
Citation: Kusa, W., Peikos, G., Staudinger, M., Lipani, A., Hanbury, A. (2024). Normalised Precision at Fixed Recall for Evaluating TAR. In ICTIR 2024 - Proceedings of the 2024 ACM SIGIR International Conference on the Theory of Information Retrieval (pp. 43-49). Association for Computing Machinery, Inc [10.1145/3664190.3672532].
Files in this record:
File: Kusa-2024-ICTIR-VoR.pdf
Access: open access
Attachment type: Publisher's Version (Version of Record, VoR)
Licence: Creative Commons
Size: 1.33 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/539761
Citations
  • Scopus: 2
  • Web of Science (ISI): 0