Cremaschi, M., De Paoli, F., Rula, A., Spahiu, B. (2020). A fully automated approach to a complete Semantic Table Interpretation. Future Generation Computer Systems, 112, 478-500 [10.1016/j.future.2020.05.019].
A fully automated approach to a complete Semantic Table Interpretation
Cremaschi M.; De Paoli F.; Rula A.; Spahiu B.
2020
Abstract
Interest in extracting and annotating tables on the Web has grown in recent years. This activity transforms text data into machine-readable formats, enabling various artificial intelligence tasks such as semantic search and dataset extension. Semantic Table Interpretation is the process of annotating the elements of a table. Current approaches are mainly based on lexical matching algorithms that rely on metadata associated with tables or on custom Knowledge Graphs. Their main limitations are the lack of metadata, the limited use of contextual semantics, and the incompleteness of the proposed methods, which do not cover all the necessary steps. In this paper, we propose a comprehensive approach and a tool that provide an unsupervised method to annotate independent tables, possibly without a header row or other external information. The approach builds a context from the elements within the table in order to discriminate among matching entities found in shared Knowledge Graphs and to create high-quality annotations. The approach achieved excellent results in an international challenge, which supports its effectiveness.

| File | Description | Attachment type | License | Size | Format | Access |
|---|---|---|---|---|---|---|
| Cremaschi-2020-Future Generat Comput Systems-VoR.pdf | Research Article | Publisher’s Version (Version of Record, VoR) | All rights reserved | 2.36 MB | Adobe PDF | Archive managers only |
| Cremaschi-2020-Future Generat Comput Systems-AAM.pdf | Research Article | Author’s Accepted Manuscript, AAM (Post-print) | Creative Commons | 918.9 kB | Adobe PDF | Open access |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.