Mousselly-Sergieh, H., Döller, M., Egyed-Zsigmond, E., Gianini, G., Kosch, H., Pinon, J. (2014). Tag relatedness using Laplacian score feature selection and adapted Jensen-Shannon divergence. In MultiMedia Modeling 20th Anniversary International Conference, MMM 2014, Dublin, Ireland, January 6-10, 2014, Proceedings, Part I (pp. 159-171). Springer Verlag [10.1007/978-3-319-04114-8_14].
Tag relatedness using Laplacian score feature selection and adapted Jensen-Shannon divergence
Gianini, G.
2014
Abstract
Folksonomies, networks of users, resources, and tags, allow users to easily retrieve, organize, and browse web content. However, their advantages are still limited by the noisiness of user-provided tags. To overcome this problem, we propose an approach for identifying related tags in folksonomies. The approach uses tag co-occurrence statistics and Laplacian score feature selection to create a probability distribution for each tag. Related tags are then determined according to the distance between their distributions. To this end, we propose a distance metric based on the Jensen-Shannon divergence. The new metric, named AJSD, accounts for the noise in the measurements caused by statistical fluctuations in tag co-occurrences. We experimentally evaluated our approach using WordNet and compared it to a common tag relatedness approach based on cosine similarity. The results show the effectiveness of our approach and its advantage over the competing method.
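The general idea of comparing tags through their co-occurrence distributions can be illustrated with a minimal sketch. It is not the method of the paper: it uses the standard Jensen-Shannon divergence rather than the adapted AJSD metric, whose definition is not given in the abstract, and it skips the Laplacian score feature selection step by assuming a hand-picked list of feature tags. The function names, the toy posts, and the feature list are all hypothetical.

```python
import math
from collections import Counter


def cooccurrence_distribution(tag, posts, features):
    """Normalized co-occurrence counts of `tag` with each feature tag.

    `posts` is a list of tag sets; `features` is a hand-picked list here
    (assumption: the paper selects features via Laplacian score instead).
    """
    counts = Counter()
    for post in posts:
        if tag in post:
            for other in post:
                if other != tag and other in features:
                    counts[other] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"tag {tag!r} never co-occurs with a feature tag")
    return [counts[f] / total for f in features]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Standard JSD (not AJSD): mean KL divergence to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy folksonomy: each post is the set of tags assigned to one resource.
posts = [
    {"paris", "eiffel", "france"},
    {"paris", "louvre", "france"},
    {"eiffel", "tower", "france"},
    {"berlin", "germany", "wall"},
    {"berlin", "germany", "gate"},
]
features = ["france", "germany", "tower", "louvre", "wall", "gate"]

paris = cooccurrence_distribution("paris", posts, features)
eiffel = cooccurrence_distribution("eiffel", posts, features)
berlin = cooccurrence_distribution("berlin", posts, features)

# A smaller divergence means the two tags appear in more similar contexts.
print("paris vs eiffel:", jensen_shannon_divergence(paris, eiffel))
print("paris vs berlin:", jensen_shannon_divergence(paris, berlin))
```

With base-2 logarithms the divergence lies between 0 and 1, so it can be read directly as a distance: identical co-occurrence profiles give 0, and profiles with no shared feature tags give 1.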