Mousselly-Sergieh, H., Egyed-Zsigmond, E., Gianini, G., Döller, M., Pinon, J., Kosch, H. (2014). Tag relatedness in image Folksonomies. DOCUMENT NUMÉRIQUE, 17(2), 33-54 [10.3166/dn.17.2.33-54].
Tag relatedness in image Folksonomies
Gianini, G;
2014
Abstract
Folksonomies, networks of users, resources, and tags, allow users to easily retrieve, organize, and browse web content. However, their advantages are still limited, mainly due to the noisiness of user-provided tags. To overcome this issue, we propose an approach for characterizing related tags in folksonomies: we use tag co-occurrence statistics and Laplacian-score-based feature selection to create an empirical co-occurrence probability distribution for each tag; we then identify related tags on the basis of the dissimilarity between their distributions. For this purpose, we introduce a variant of the Jensen-Shannon divergence that is more robust to statistical noise. We experimentally evaluate our approach using WordNet and compare it to a common tag-relatedness approach based on cosine similarity. The results show the effectiveness of our approach and its advantage over the competing method.

File | Access | Description | Attachment type | License | Size | Format
---|---|---|---|---|---|---
Mousselly-Sergieh-2014-Document Num-preprint.pdf | Open access | Article | Submitted Version (Pre-print) | Other | 837.63 kB | Adobe PDF
Mousselly-Sergieh-2014-Document Num-VoR.pdf | Open access | Article | Publisher’s Version (Version of Record, VoR) | Other | 687.63 kB | Adobe PDF
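
As a rough illustration of the comparison described in the abstract, the sketch below builds an empirical co-occurrence distribution for each tag and scores tag pairs with both a distribution dissimilarity and cosine similarity. It rests on several assumptions that are not taken from the article: it uses the standard Jensen-Shannon divergence rather than the authors' noise-robust variant, it skips the Laplacian-score feature selection and uses every tag as a feature, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_counts(tagged_items):
    """For each tag, count how often every other tag appears on the same item."""
    counts = defaultdict(Counter)
    for tags in tagged_items:
        for a, b in permutations(set(tags), 2):
            counts[a][b] += 1
    return counts


def distribution(counter, features):
    """Normalize co-occurrence counts over a fixed feature list into probabilities."""
    total = sum(counter[f] for f in features)
    if total == 0:
        return [0.0] * len(features)
    return [counter[f] / total for f in features]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Standard Jensen-Shannon divergence (NOT the variant proposed in the article)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def cosine(u, v):
    """Cosine similarity between two raw co-occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy folksonomy: each inner list holds the tags of one image.
items = [
    ["beach", "sea", "sand"],
    ["beach", "sea", "sun"],
    ["sea", "sand", "sun"],
    ["city", "street", "car"],
    ["city", "car", "night"],
]

counts = cooccurrence_counts(items)
# Assumption: every tag is a feature (the article selects features by Laplacian score).
features = sorted({tag for tags in items for tag in tags})

for other in ("sea", "city"):
    p = distribution(counts["beach"], features)
    q = distribution(counts[other], features)
    u = [counts["beach"][f] for f in features]
    v = [counts[other][f] for f in features]
    print("beach vs", other, "JSD:", jensen_shannon(p, q), "cosine:", cosine(u, v))
```

A lower divergence means the two tags co-occur with similar tags and are therefore candidates for being related, whereas cosine similarity works the other way round: higher values indicate closer tags.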
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.