Bicocca Open Archive

Carmagnola, F., Osborne, F., Torre, I. (2014). User data discovery and aggregation: The CS-UDD algorithm. INFORMATION SCIENCES, 270, 41-72 [10.1016/j.ins.2014.02.111].

User data discovery and aggregation: The CS-UDD algorithm

Carmagnola F; Osborne F; Torre I

2014

Abstract

In the social web, people use social systems for sharing content and opinions, communicating with friends, tagging, and so on. People usually have different accounts and different profiles on each of these systems. Several tools for user data aggregation and people search have been developed, and protocols and standards for data portability have been defined. This paper presents an approach and an algorithm, named Cross-System User Data Discovery (CS-UDD), to retrieve and aggregate user data distributed across social websites. It is designed to crawl websites, retrieve profiles that may belong to the searched user, correlate them, aggregate the discovered data, and return them to the searcher, which may, for example, be an adaptive system. The retrieved user attributes, expressed as attribute-value pairs, are each associated with a certainty factor that expresses the confidence that they are true for the searched user. To test the algorithm, we ran it on two popular social networks, MySpace and Flickr. The evaluation demonstrated the ability of the CS-UDD algorithm to discover unknown user attributes and revealed high precision of the discovered attributes. © 2014 Elsevier Inc. All rights reserved.
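The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the CS-UDD algorithm itself: the abstract does not say how profiles are matched or how certainty factors are computed, so everything here is an assumption made for illustration. The names `Candidate`, `match_score`, and `aggregate` are hypothetical, each candidate profile is assumed to arrive with a score in [0, 1] estimating how likely it belongs to the searched user, and agreeing evidence is combined with a noisy-OR rule chosen only because it is simple.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """A profile that may belong to the searched user (hypothetical structure)."""
    source: str                 # site the profile was retrieved from
    match_score: float          # assumed likelihood in [0, 1] that it is the same user
    attributes: Dict[str, str]  # attribute-value pairs found on the profile


def aggregate(candidates: List[Candidate],
              threshold: float = 0.4) -> Dict[Tuple[str, str], float]:
    """Merge attribute-value pairs and attach a certainty factor to each.

    Assumption: independent sources that agree on a value are combined
    with noisy-OR, cf = 1 - prod(1 - score). The paper may use a
    different formula.
    """
    evidence: Dict[Tuple[str, str], List[float]] = {}
    for cand in candidates:
        if cand.match_score < threshold:
            continue  # discard profiles unlikely to belong to the user
        for name, value in cand.attributes.items():
            key = (name, value.strip().lower())  # naive value normalization
            evidence.setdefault(key, []).append(cand.match_score)

    certainty: Dict[Tuple[str, str], float] = {}
    for key, scores in evidence.items():
        disbelief = 1.0
        for score in scores:
            disbelief *= 1.0 - score
        certainty[key] = 1.0 - disbelief
    return certainty


if __name__ == "__main__":
    found = [
        Candidate("myspace", 0.8, {"city": "Turin", "gender": "female"}),
        Candidate("flickr", 0.5, {"city": "turin"}),
        Candidate("flickr", 0.2, {"city": "Rome"}),
    ]
    for (name, value), cf in sorted(aggregate(found).items()):
        print(f"{name}={value} cf={cf:.2f}")
```

In this sketch a value confirmed by two plausible profiles receives a higher certainty factor than a value seen on only one, and low-scoring profiles contribute nothing. A real system would also need the crawling and profile-correlation stages, which are omitted here.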

Subtype: Journal article - Scientific article
Keywords: Entity linkage; Entity matching; Information retrieval; Social web; User data discovery; User profiling
Content language: English
Publication date: 2014
Journal: INFORMATION SCIENCES
Volume: 270
First page: 41
Last page: 72
Article DOI: https://dx.doi.org/10.1016/j.ins.2014.02.111
Full text: none
Appears in types: 01 - Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/381196
