Twitter data offers an unprecedented opportunity to study demographic differences in public opinion across a virtually unlimited range of subjects. Whilst demographic attributes are often implied within user data, they are not always easily identified using computational methods. In this paper, we present a semi-automatic solution that combines automatic classification methods with a user interface designed to enable rapid resolution of ambiguous cases. TweetClass employs a two-step, interactive process to support the determination of gender and age attributes. At each step, the user is presented with feedback on the confidence levels of the automated analysis and can choose to refine ambiguous cases by examining key profile and content data. We describe how a user-centered design approach was used to optimise the interface and present the results of an evaluation which suggests that TweetClass can be used to rapidly boost demographic sample sizes in situations where high accuracy is required.

Beretta, V., Maccagnola, D., Cribbin, T., Messina, V. (2015). An interactive method for inferring demographic attributes in Twitter. In Proceedings of the 26th ACM Conference on Hypertext & Social Media (pp.113-122). Association for Computing Machinery, Inc [10.1145/2700171.2791031].

An interactive method for inferring demographic attributes in Twitter

MACCAGNOLA, DANIELE; MESSINA, VINCENZINA (last author)
2015

Type: slide + paper
Keywords: demographic attributes, sampling, semi-automatic classification, social research, sociology, text analysis, twitter, user interface
Language: English
Conference: ACM Conference on Hypertext & Social Media, 1-4 September 2015
Proceedings: Proceedings of the 26th ACM Conference on Hypertext & Social Media
ISBN: 9781450333955
Year: 2015
Pages: 113-122
Access: open
Files in this item:
File: cribbin.pdf
Access: open access
Attachment type: Publisher’s Version (Version of Record, VoR)
Size: 542.22 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/89694
Citations
  • Scopus 15