Twitter data offers an unprecedented opportunity to study demographic differences in public opinion across a virtually unlimited range of subjects. Whilst demographic attributes are often implied within user data, they are not always easily identified using computational methods. In this paper, we present a semi-automatic solution that combines automatic classification methods with a user interface designed to enable rapid resolution of ambiguous cases. TweetClass employs a two-step, interactive process to support the determination of gender and age attributes. At each step, the user is presented with feedback on the confidence levels of the automated analysis and can choose to refine ambiguous cases by examining key profile and content data. We describe how a user-centered design approach was used to optimise the interface and present the results of an evaluation which suggests that TweetClass can be used to rapidly boost demographic sample sizes in situations where high accuracy is required.
Beretta, V., Maccagnola, D., Cribbin, T., Messina, V. (2015). An interactive method for inferring demographic attributes in Twitter. In Proceedings of the 26th ACM Conference on Hypertext & Social Media (pp.113-122). Association for Computing Machinery, Inc [10.1145/2700171.2791031].
An interactive method for inferring demographic attributes in Twitter
MACCAGNOLA, DANIELE
;MESSINA, VINCENZINAUltimo
2015
Abstract
Twitter data offers an unprecedented opportunity to study demographic differences in public opinion across a virtually unlimited range of subjects. Whilst demographic attributes are often implied within user data, they are not always easily identified using computational methods. In this paper, we present a semi-automatic solution that combines automatic classification methods with a user interface designed to enable rapid resolution of ambiguous cases. TweetClass employs a two-step, interactive process to support the determination of gender and age attributes. At each step, the user is presented with feedback on the confidence levels of the automated analysis and can choose to refine ambiguous cases by examining key profile and content data. We describe how a user-centered design approach was used to optimise the interface and present the results of an evaluation which suggests that TweetClass can be used to rapidly boost demographic sample sizes in situations where high accuracy is required.File | Dimensione | Formato | |
---|---|---|---|
cribbin.pdf
accesso aperto
Tipologia di allegato:
Publisher’s Version (Version of Record, VoR)
Dimensione
542.22 kB
Formato
Adobe PDF
|
542.22 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.