Fersini, E., Messina, V., Pozzi, F. (2014). Sentiment analysis: Bayesian Ensemble Learning. DECISION SUPPORT SYSTEMS, 68, 26-38 [10.1016/j.dss.2014.10.004].
Sentiment analysis: Bayesian Ensemble Learning
FERSINI, ELISABETTA; MESSINA, VINCENZINA; POZZI, FEDERICO ALBERTO
2014
Abstract
Textual data on the Web has grown rapidly in the last few years, producing unique content of massive dimension. In a decision-making context, one of the most relevant tasks is polarity classification of a text source, which is usually performed through supervised learning methods. Most existing approaches select the single best classification model, which leads to over-confident decisions that do not take into account the inherent uncertainty of natural language. In this paper, we pursue the paradigm of ensemble learning to reduce the noise sensitivity related to language ambiguity and therefore to provide a more accurate prediction of polarity. The proposed ensemble method is based on Bayesian Model Averaging, where both the uncertainty and the reliability of each single model are taken into account. We address the classifier selection problem by proposing a greedy approach that evaluates the contribution of each model with respect to the ensemble. Experimental results on gold standard datasets show that the proposed approach outperforms both traditional classification and ensemble methods.
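The two ideas named in the abstract, reliability-weighted model averaging and greedy classifier selection, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each classifier emits a probability per polarity label, that a model's reliability is a single non-negative validation score (for example accuracy or F1), and that the greedy search runs as backward elimination; the abstract does not specify any of these details. All function and variable names are hypothetical.

```python
from typing import Dict, List, Sequence

Probs = Dict[str, float]  # label -> probability emitted by one classifier


def bma_predict(model_probs: Sequence[Probs], weights: Sequence[float]) -> str:
    """Return the label with the largest reliability-weighted average probability."""
    total = sum(weights)
    scores: Dict[str, float] = {}
    for probs, w in zip(model_probs, weights):
        for label, p in probs.items():
            # Each model's marginal prediction is scaled by its normalized reliability.
            scores[label] = scores.get(label, 0.0) + (w / total) * p
    return max(scores, key=scores.get)


def ensemble_accuracy(
    outputs: Sequence[Sequence[Probs]],  # outputs[model][example]
    weights: Sequence[float],
    gold: Sequence[str],
    members: Sequence[int],
) -> float:
    """Accuracy of the weighted ensemble restricted to the given model indices."""
    correct = 0
    for i, label in enumerate(gold):
        pred = bma_predict(
            [outputs[m][i] for m in members], [weights[m] for m in members]
        )
        correct += pred == label
    return correct / len(gold)


def greedy_select(
    outputs: Sequence[Sequence[Probs]],
    weights: Sequence[float],
    gold: Sequence[str],
) -> List[int]:
    """Backward elimination (an assumption): repeatedly drop the model whose
    removal improves validation accuracy the most, and stop when no removal helps."""
    members = list(range(len(outputs)))
    best = ensemble_accuracy(outputs, weights, gold, members)
    while len(members) > 1:
        candidates = [
            (ensemble_accuracy(outputs, weights, gold,
                               [m for m in members if m != d]), d)
            for d in members
        ]
        score, drop = max(candidates)
        if score <= best:
            break
        best = score
        members.remove(drop)
    return members
```

In this sketch a model's contribution is judged by how the ensemble's accuracy changes when the model is removed, not by the model's standalone score, which is one way to read "the contribution of each model with respect to the ensemble".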