Carmagnola, F., Osborne, F., Torre, I. (2014). User data discovery and aggregation: The CS-UDD algorithm. INFORMATION SCIENCES, 270, 41-72 [10.1016/j.ins.2014.02.111].

User data discovery and aggregation: The CS-UDD algorithm

Carmagnola, F.; Osborne, F.; Torre, I.
2014

Abstract

In the social web, people use social systems for sharing content and opinions, for communicating with friends, for tagging, etc. They usually have different accounts and different profiles on each of these systems. Several tools for user data aggregation and people search have been developed, and protocols and standards for data portability have been defined. This paper presents an approach and an algorithm, named Cross-System User Data Discovery (CS-UDD), to retrieve and aggregate user data distributed across social websites. It is designed to crawl websites, retrieve profiles that may belong to the searched user, correlate them, aggregate the discovered data, and return them to the searcher, which may, for example, be an adaptive system. The retrieved user attributes, expressed as attribute-value pairs, are each associated with a certainty factor that expresses the confidence that they are true for the searched user. To test the algorithm, we ran it on two popular social networks, MySpace and Flickr. The evaluation demonstrated the ability of the CS-UDD algorithm to discover unknown user attributes and revealed high precision for the discovered attributes. © 2014 Elsevier Inc. All rights reserved.
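
The aggregation step described in the abstract can be illustrated with a small sketch. The abstract does not give the rules CS-UDD uses to correlate profiles or to compute certainty factors, so the code below is only an illustration under stated assumptions: each candidate profile is assumed to arrive with a match confidence in [0, 1], values are compared after simple lowercasing, and evidence from several profiles is merged with the classic MYCIN-style rule for positive certainty factors (a swapped-in technique, not necessarily the one used in the paper). The names `combine` and `aggregate` are hypothetical.

```python
from collections import defaultdict


def combine(cf_a: float, cf_b: float) -> float:
    """Merge two positive certainty factors in [0, 1] (MYCIN-style rule).

    Assumption: this rule stands in for whatever CS-UDD actually uses.
    """
    return cf_a + cf_b * (1.0 - cf_a)


def aggregate(profiles):
    """Aggregate attribute-value pairs from candidate profiles.

    `profiles` is a list of (match_cf, attributes) tuples, where `match_cf`
    is the assumed confidence that the profile belongs to the searched user
    and `attributes` maps attribute names to string values.
    Returns a dict mapping (attribute, normalized_value) to a certainty factor.
    """
    merged = defaultdict(float)
    for match_cf, attributes in profiles:
        for attribute, value in attributes.items():
            # Naive normalization so "Turin" and "turin" count as the same value.
            key = (attribute, value.strip().lower())
            merged[key] = combine(merged[key], match_cf)
    return dict(merged)


if __name__ == "__main__":
    # Hypothetical candidate profiles from two different sites.
    candidates = [
        (0.8, {"city": "Turin", "gender": "female"}),
        (0.5, {"city": "turin"}),
    ]
    for (attribute, value), cf in sorted(aggregate(candidates).items()):
        print(f"{attribute}={value}: certainty {cf:.2f}")
```

In this sketch, a value that appears in several candidate profiles ends up with a higher certainty factor than one seen in a single profile, which is one simple way to model the idea that agreeing sources increase confidence in an attribute.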
Type: Journal article - Scientific article
Keywords: Entity linkage; Entity matching; Information retrieval; Social web; User data discovery; User profiling
Language: English
Year: 2014
Volume: 270
First page: 41
Last page: 72
none
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/381196
Citations
  • Scopus 6
  • ISI 5