Bicocca Open Archive

Binge Drinking (BD) is a common risky behaviour that people hardly report to healthcare professionals, although it is not uncommon to find, instead, personal communications related to alcohol-related behaviors on social media. By following a data-driven approach focusing on User-Generated Content, we aimed to detect potential binge drinkers through the investigation of their language and shared topics. First, we gathered Twitter threads quoting BD and alcohol-related behaviours, by considering unequivocal keywords, identified by experts, from previous evidence on BD. Subsequently, a random sample of the gathered tweets was manually labelled, and two supervised learning classifiers were trained on both linguistic and metadata features, to classify tweets of genuine unique users with respect to media, bot, and commercial accounts. Based on this classification, we observed that approximately 55% of the 1 million alcohol-related collected tweets was automatically identified as belonging to non-genuine users. A third classifier was then trained on a subset of manually labelled tweets among those previously identified as belonging to genuine accounts, to automatically identify potential binge drinkers based only on linguistic features. On average, users classified as binge drinkers were quite similar to the standard genuine Twitter users in our sample. Nonetheless, the analysis of social media contents of genuine users reporting risky behaviours remains a promising source for informed preventive programs.

Crocamo, C., Viviani, M., Bartoli, F., Carrà, G., Pasi, G. (2020). Detecting binge drinking and alcohol-related risky behaviours from twitter’s users: An exploratory content-and topology-based analysis. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 17(5) [10.3390/ijerph17051510].

Detecting binge drinking and alcohol-related risky behaviours from twitter’s users: An exploratory content-and topology-based analysis

Crocamo, Cristina^Primo;Viviani, Marco;Bartoli, Francesco;Carrà, Giuseppe;Pasi, Gabriella

2020

Abstract

Binge Drinking (BD) is a common risky behaviour that people hardly report to healthcare professionals, although it is not uncommon to find, instead, personal communications related to alcohol-related behaviors on social media. By following a data-driven approach focusing on User-Generated Content, we aimed to detect potential binge drinkers through the investigation of their language and shared topics. First, we gathered Twitter threads quoting BD and alcohol-related behaviours, by considering unequivocal keywords, identified by experts, from previous evidence on BD. Subsequently, a random sample of the gathered tweets was manually labelled, and two supervised learning classifiers were trained on both linguistic and metadata features, to classify tweets of genuine unique users with respect to media, bot, and commercial accounts. Based on this classification, we observed that approximately 55% of the 1 million alcohol-related collected tweets was automatically identified as belonging to non-genuine users. A third classifier was then trained on a subset of manually labelled tweets among those previously identified as belonging to genuine accounts, to automatically identify potential binge drinkers based only on linguistic features. On average, users classified as binge drinkers were quite similar to the standard genuine Twitter users in our sample. Nonetheless, the analysis of social media contents of genuine users reporting risky behaviours remains a promising source for informed preventive programs.

Scheda breve

Scheda completa

Scheda completa (DC)

	Sottotipologia
	
				Articolo in rivista - Articolo scientifico
			
	Parole chiave
	
				binge drinking; data science; risky health behaviour; social media analytics; supervised machine learning; user-generated content; vulnerability
			
	Lingua del contenuto
	
				English
			
	Data di pubblicazione
	
				2020
			
	Rivista
	
				INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH
			
	Numero del volume
	
				17
			
	Fascicolo
	
				5
			
	Article number
	
				1510
			
	DOI dell'articolo
	
				https://dx.doi.org/10.3390/ijerph17051510
			
	Fulltext
	
				open
			
	Citazione
	
				Crocamo, C., Viviani, M., Bartoli, F., Carrà, G., Pasi, G. (2020). Detecting binge drinking and alcohol-related risky behaviours from twitter’s users: An exploratory content-and topology-based analysis. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 17(5) [10.3390/ijerph17051510].
			
	Appare nelle tipologie:
	
				01 - Articolo su rivista

File in questo prodotto:

File	Dimensione	Formato
10281-263237_VoR.pdf accesso aperto Tipologia di allegato: Publisher’s Version (Version of Record, VoR) Licenza: Creative Commons Dimensione 1.89 MB Formato Adobe PDF Visualizza/Apri	1.89 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/263237

Citazioni

12

9

Social impact