Fersini, E., Pozzi, F., Messina, V. (2015). Detecting irony and sarcasm in microblogs: The role of expressive signals and ensemble classifiers. In Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015 (pp.981-988). Institute of Electrical and Electronics Engineers Inc. [10.1109/DSAA.2015.7344888].
Detecting irony and sarcasm in microblogs: The role of expressive signals and ensemble classifiers
FERSINI, ELISABETTA (first author); POZZI, FEDERICO ALBERTO (second author); MESSINA, VINCENZINA (last author)
2015
Abstract
The automatic detection of sarcasm and irony in user-generated content is one of the most challenging tasks in Natural Language Processing. In this paper we address this problem by introducing Bayesian Model Averaging (BMA), an ensemble approach that combines several classifiers according to their reliabilities and their marginal probability predictions. The impact of the most used expressive signals (pragmatic particles and POS tags) has been evaluated in baseline models (traditional classifiers and majority voting) as well as in the proposed BMA approach. Experimental results highlight two main findings: (1) not all features are equally able to characterize sarcasm and irony, and (2) BMA not only outperforms traditional state-of-the-art models, but also shows notable generalization capabilities on both ironic and sarcastic text.
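The abstract describes the ensemble only in general terms, so the following is a minimal sketch of the weighting idea rather than the authors' implementation. It assumes each classifier returns a probability per label and that a reliability score per classifier is already available (for example from validation data); the function name `bma_predict`, the labels, and all numbers are hypothetical.

```python
def bma_predict(model_probs, reliabilities):
    """Average per-classifier label probabilities, weighted by reliability.

    model_probs: list of dicts mapping label -> probability, one per classifier.
    reliabilities: list of non-negative scores, one per classifier (assumed given).
    """
    total = sum(reliabilities)
    # Normalize reliabilities so the weights sum to 1.
    weights = [r / total for r in reliabilities]
    labels = model_probs[0].keys()
    combined = {
        label: sum(w * probs[label] for w, probs in zip(weights, model_probs))
        for label in labels
    }
    # Predict the label with the highest weighted-average probability.
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical marginal predictions from three classifiers for one message.
model_probs = [
    {"ironic": 0.9, "not_ironic": 0.1},
    {"ironic": 0.4, "not_ironic": 0.6},
    {"ironic": 0.3, "not_ironic": 0.7},
]
reliabilities = [0.9, 0.5, 0.4]  # made-up reliability scores

label, combined = bma_predict(model_probs, reliabilities)
print(label, combined)
```

With these made-up inputs, a plain majority vote would pick "not_ironic" (two classifiers out of three), while the reliability-weighted average picks "ironic" because the most reliable classifier is also the most confident one. This illustrates why weighting by reliability and marginal probabilities can behave differently from the majority-voting baseline.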