Bicocca Open Archive

Lexical taxonomies are widely used to support information retrieval and exchange across many domains and applications. When multiple taxonomies coexist, the heterogeneity among them severely hinders efficient collaboration. In this paper, we propose WETA, a domain-independent, knowledge-poor method for automatic taxonomy alignment via word embeddings. WETA associates every leaf term of the origin taxonomy with one or more concepts in the destination taxonomy, using a scoring function that merges the score of a hierarchical method based on cosine similarity with the score of a classification task. WETA was developed in the context of an EU grant that aims to bridge the national taxonomies of EU countries to the European Skills, Competences, Qualifications and Occupations taxonomy (ESCO) using AI algorithms. The results, validated within the EU project activities for bridging the Italian occupation taxonomy CP and ESCO, confirm the usefulness of WETA in supporting the automatic alignment of national labor taxonomies. WETA reaches an accuracy of 0.8 when recommending the top-5 occupations and a wMRR of 0.72. WETA reduces the human effort needed to build a mapping from scratch: it would allow domain experts to concentrate on the validation task and would reduce the inconsistency that arises from multiple judgments. It would also make the approach reproducible and transparent to policymakers.
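To make the scoring idea in the abstract concrete, the following is a minimal sketch, not the scoring function defined in the paper. It assumes the two scores are merged by a convex combination with a hypothetical weight `alpha`, uses toy three-dimensional vectors in place of real word embeddings, and takes the classifier probabilities as given. The evaluation helpers compute top-k accuracy and a plain (unweighted) mean reciprocal rank, which is related to but not the same as the wMRR reported above. All function names and data are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_concepts(term_vec, concept_vecs, clf_probs, alpha=0.5):
    """Rank destination concepts for one origin leaf term.

    alpha is a hypothetical mixing weight: the convex combination is an
    assumption made for illustration, not the paper's formula.
    """
    scores = {
        name: alpha * cosine(term_vec, vec)
        + (1 - alpha) * clf_probs.get(name, 0.0)
        for name, vec in concept_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def top_k_accuracy(rankings, gold, k=5):
    """Share of origin terms whose gold concept is among the top k."""
    hits = sum(1 for term, ranked in rankings.items() if gold[term] in ranked[:k])
    return hits / len(rankings)


def mean_reciprocal_rank(rankings, gold):
    """Plain MRR; a gold concept missing from the ranking contributes 0."""
    total = 0.0
    for term, ranked in rankings.items():
        if gold[term] in ranked:
            total += 1.0 / (ranked.index(gold[term]) + 1)
    return total / len(rankings)


# Toy data (invented for illustration only).
concepts = {
    "software developer": [0.9, 0.1, 0.0],
    "data analyst": [0.6, 0.4, 0.1],
    "baker": [0.0, 0.2, 0.9],
}
term = [0.8, 0.2, 0.0]  # embedding of one origin leaf term
probs = {"software developer": 0.7, "data analyst": 0.2, "baker": 0.1}

ranking = rank_concepts(term, concepts, probs)
rankings = {"origin term": ranking}
gold = {"origin term": "data analyst"}
print(ranking)
print(top_k_accuracy(rankings, gold, k=2), mean_reciprocal_rank(rankings, gold))
```

In a setting like the one described, a domain expert would review the top-ranked suggestions for each leaf term rather than build the mapping from scratch.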

Giabelli, A., Malandri, L., Mercorio, F., Mezzanzanica, M. (2022). WETA: Automatic taxonomy alignment via word embeddings. COMPUTERS IN INDUSTRY, 138(June 2022) [10.1016/j.compind.2022.103626].

WETA: Automatic taxonomy alignment via word embeddings

Giabelli, Anna; Malandri, Lorenzo; Mercorio, Fabio; Mezzanzanica, Mario

2022



Subtype: Journal article - Scientific article
Keywords: Human-AI; Real-life Application; Taxonomy Alignment; Word Embedding
Content language: English
Ahead-of-print or first online publication date: 9 Feb 2022
Publication date: 2022
Journal: COMPUTERS IN INDUSTRY
Volume number: 138
Issue: June 2022
Article number: 103626
Article DOI: https://dx.doi.org/10.1016/j.compind.2022.103626
Full text: none
Appears in types: 01 - Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/10281/352567
