Nozza, D., Fersini, E., Messina, V. (2016). Unsupervised Irony Detection: A Probabilistic Model with Word Embeddings. In KDIR 2016 - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, part of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2016) (pp. 68-76). Setúbal, Portugal: SciTePress. DOI: 10.5220/0006052000680076.
Unsupervised Irony Detection: A Probabilistic Model with Word Embeddings
Nozza, Debora; Fersini, Elisabetta; Messina, Vincenzina
2016
Abstract
The automatic detection of figurative language, such as irony and sarcasm, is one of the most challenging tasks in Natural Language Processing (NLP). This is because machine learning methods are easily misled by words that carry a strong polarity but are used ironically, so that the opposite polarity is intended. In this paper, we propose an unsupervised framework for domain-independent irony detection. In particular, to derive an unsupervised Topic-Irony Model (TIM), we build upon an existing probabilistic topic model originally introduced for sentiment analysis. Moreover, in order to improve its generalization abilities, we take advantage of Word Embeddings to obtain a domain-aware ironic orientation of words. This is the first work that addresses this task in an unsupervised setting and the first study of the topic-irony distribution. Experimental results show that TIM is comparable to, and in some cases outperforms, supervised state-of-the-art approaches for irony detection. Moreover, integrating the probabilistic model with word embeddings (TIM+WE) yields promising results in a more complex, real-world scenario.
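To make the word-embedding component more concrete, the sketch below illustrates one plausible way a domain-aware "ironic orientation" could be assigned to a word: as its mean cosine similarity to a small set of ironic seed terms in an embedding space. This is a minimal illustration, not the paper's actual method; the toy embedding vectors, the seed words, and the ironic_orientation function are hypothetical placeholders chosen only for the example.

```python
# Illustrative sketch (NOT the paper's implementation): score a word's
# "ironic orientation" as its mean cosine similarity to ironic seed terms.
# Embeddings and seed words below are toy placeholders; in practice the
# vectors would come from embeddings trained on the target domain.
import numpy as np

embeddings = {
    "great":     np.array([0.9, 0.1, 0.2]),
    "yeah":      np.array([0.2, 0.8, 0.7]),
    "right":     np.array([0.3, 0.7, 0.8]),
    "totally":   np.array([0.2, 0.9, 0.6]),
    "delicious": np.array([0.8, 0.2, 0.1]),
}

# Hypothetical ironic seed words for the target domain.
ironic_seeds = ["yeah", "right", "totally"]

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def ironic_orientation(word):
    # Mean cosine similarity of `word` to the ironic seed terms.
    if word not in embeddings:
        return 0.0
    sims = [cosine(embeddings[word], embeddings[s])
            for s in ironic_seeds if s in embeddings]
    return float(np.mean(sims)) if sims else 0.0

for w in embeddings:
    print(f"{w:>10}: {ironic_orientation(w):.3f}")
```

In such a scheme, words closer to the ironic seeds in the domain's embedding space receive higher scores, which could then be fed into a topic-level probabilistic model as prior information about each word's ironic orientation.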