Semantic Query Labeling is the task of locating the constituent parts of a query and assigning a domain-specific semantic label to each of them. It makes it possible to uncover the relations between the query terms and the documents’ structure while leaving the keyword-based query formulation unaltered. In this paper, we investigate the pre-training of a semantic query tagger with synthetic data generated by leveraging the documents’ structure. By simulating a dynamic environment, we also evaluate the consistency of the performance improvements brought by pre-training as real-world training data becomes available. The results of our experiments suggest both the utility of pre-training with synthetic data and the consistency of its improvements over time.
Bassani, E., Pasi, G. (2022). Evaluating the Use of Synthetic Queries for Pre-training a Semantic Query Tagger. In Advances in Information Retrieval. ECIR 2022 (pp. 39-46). Cham, Switzerland: Springer Science and Business Media Deutschland GmbH. doi:10.1007/978-3-030-99739-7_5.
Evaluating the Use of Synthetic Queries for Pre-training a Semantic Query Tagger
Bassani E.; Pasi G.
2022
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.