The datasets that are part of the Linking Open Data cloud diagram (LOD cloud) are classified into the following topical categories: media, government, publications, life sciences, geographic, social networking, user-generated content, and cross-domain. The topical categories were manually assigned to the datasets. In this paper, we investigate to what extent the topical classification of new LOD datasets can be automated using machine learning techniques and the existing annotations as supervision. We conducted experiments with different classification techniques and different feature sets. The best combination of classification technique and feature set reaches an accuracy of 81.62% on the task of assigning one of the eight classes to a given LOD dataset. A deeper inspection of the classification errors reveals problems with the manual classification of datasets in the current LOD cloud.

Meusel, R., Spahiu, B., Bizer, C., Paulheim, H. (2015). Towards automatic topical classification of LOD datasets. In Conference Proceeding Workshop on Linked Data on the Web, LDOW 2015. CEUR-WS.

Towards automatic topical classification of LOD datasets

SPAHIU, BLERINA
Second author
2015

slide + paper
Linked Open Data, Topic Detection, Data Space Profiling
English
Workshop on Linked Data on the Web (LDOW), co-located with the 24th International World Wide Web Conference (WWW), 19 May 2015
Conference Proceeding Workshop on Linked Data on the Web, LDOW 2015
2015
1409
open
Files in this item:
paper-03.pdf (open access), 264.75 kB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/101434
Citations
  • Scopus 14