Injecting the BM25 Score as Text Improves BERT-Based Re-rankers

Askari, A; Abolghasemi, A; Pasi, G; Kraaij, W; Verberne, S

doi:10.1007/978-3-031-28244-7_5

In this paper we propose a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers: we inject the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. It was shown in prior work that interpolation between the relevance score of lexical and BERT-based re-rankers may not consistently result in higher effectiveness. Our idea is motivated by the finding that BERT models can capture numeric information. We compare several representations of the BM25 score and inject them as text in the input of four different cross-encoders. We additionally analyze the effect for different query types, and investigate the effectiveness of our method for capturing exact matching relevance. Evaluation on the MSMARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods. We show that the improvement is consistent for all query types. We also find an improvement in exact matching capabilities over both BM25 and the cross-encoders. Our findings indicate that cross-encoder re-rankers can efficiently be improved without additional computational burden and extra steps in the pipeline by explicitly adding the output of the first-stage ranker to the model input, and this effect is robust for different models and query types.
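The injection step described in the abstract can be illustrated with a minimal sketch. The abstract only states that the BM25 score is placed as text in the middle of the cross-encoder input and that several score representations were compared. Everything else here is an assumption made for illustration and is not taken from this record: the min-max normalization over one query's candidate list, the scaling to an integer string, the literal `[SEP]` separator, and the helper names `min_max_normalize`, `score_to_text`, and `build_input`.

```python
from typing import List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale one query's BM25 scores to [0, 1] (assumed representation)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates have the same score; avoid division by zero.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def score_to_text(norm_score: float, scale: int = 100) -> str:
    """Turn a normalized score into a short integer string (assumed choice)."""
    return str(int(round(norm_score * scale)))


def build_input(query: str, passage: str, score_text: str, sep: str = "[SEP]") -> str:
    """Place the score text between query and passage, i.e. in the middle of the input."""
    return f"{query} {sep} {score_text} {sep} {passage}"


def build_inputs(query: str, passages: List[str], bm25_scores: List[float]) -> List[str]:
    """Build one re-ranker input string per candidate passage."""
    normalized = min_max_normalize(bm25_scores)
    return [
        build_input(query, passage, score_to_text(score))
        for passage, score in zip(passages, normalized)
    ]


if __name__ == "__main__":
    # Hypothetical query, passages, and scores, for illustration only.
    inputs = build_inputs(
        "what is bm25",
        ["BM25 is a ranking function.", "Cats are mammals."],
        [12.4, 3.1],
    )
    for text in inputs:
        print(text)
```

In practice the resulting strings would be tokenized and scored by a cross-encoder. A real tokenizer would typically insert its own special tokens rather than receive them inside the text, so the literal separator above is only a stand-in.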

Askari, A., Abolghasemi, A., Pasi, G., Kraaij, W., Verberne, S. (2023). Injecting the BM25 Score as Text Improves BERT-Based Re-rankers. In Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part I (pp. 66-83). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-031-28244-7_5].

Contribution type: paper
Keywords: BM25; Combining lexical and neural rankers; Injecting BM25; Transformer-based rankers; Two-stage retrieval
Content language: English
Conference name: 45th European Conference on Information Retrieval, ECIR 2023, 2 April 2023 through 6 April 2023
Conference year: 2023
Volume editors: Kamps, J; Goeuriot, L; Crestani, F; Maistro, M; Joho, H; Davis, B; Gurrin, C; Kruschwitz, U; Caputo, A
Proceedings title: Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part I
Proceedings volume ISBN: 9783031282430
Series: LECTURE NOTES IN COMPUTER SCIENCE
Publication date: 2023
Volume number: 13980
First page: 66
Last page: 83
Contribution DOI: https://dx.doi.org/10.1007/978-3-031-28244-7_5
Full text: none
Appears in types: 02 - Conference contribution

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/454532
