Haardt, V., Malandri, L., Mercorio, F., Porcelli, L. (2025). SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning. In Machine Learning and Principles and Practice of Knowledge Discovery in Databases International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part III (pp. 390-402). Springer Science and Business Media Deutschland GmbH [doi:10.1007/978-3-031-74633-8_28].

SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning

Haardt V.; Malandri L.; Mercorio F.; Porcelli L.
2025

Abstract

In recent years, sentiment analysis has advanced markedly, particularly through the development of learning-based tools and knowledge-based (also called lexicon-based) tools. Unlike machine learning methods, knowledge-based methods do not require labelled data to train a classifier and are less resource-intensive. However, their pre-established rules may be too rigid to adapt to different domains, or too broad to capture subtle variations in sentiment within a specific domain. In addition, because these lexicons are built manually, their coverage often remains limited. This study introduces SEEDOT, a novel methodology for enhancing the performance of specialised lexicon-based tools. SEEDOT starts from a general lexicon and a domain-specific corpus, and uses machine learning to extend the existing lexicon with domain-specific terms, improving both the specificity and the coverage of the general lexicon. SEEDOT is compared with a state-of-the-art lexicon-based tool and outperforms it in all four domains considered.
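The idea of starting from a general lexicon and an unlabelled domain corpus can be illustrated with a small sketch. The abstract does not specify SEEDOT's learning step, so the code below is not the authors' method: it swaps in a simple, hypothetical weak-labelling heuristic, in which each document is scored with the general lexicon and an out-of-lexicon word inherits the mean score of the documents that contain it. The function names (`score`, `expand_lexicon`), the parameters (`min_count`, `threshold`), and the toy lexicon and corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def score(text, lexicon):
    """Lexicon-based polarity: mean score of the tokens found in the lexicon."""
    hits = [lexicon[t] for t in text.lower().split() if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def expand_lexicon(general_lexicon, corpus, min_count=2, threshold=0.5):
    """Return a copy of the lexicon extended with domain-specific terms.

    Hypothetical stand-in for the learning step (an assumption, not SEEDOT):
    each unlabelled document is weakly labelled with the general lexicon, and
    an unseen word inherits the mean weak label of the documents containing it.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for doc in corpus:
        weak_label = score(doc, general_lexicon)
        if weak_label == 0.0:
            continue  # the general lexicon gives no evidence for this document
        for token in set(doc.lower().split()):
            if token not in general_lexicon:
                totals[token] += weak_label
                counts[token] += 1

    expanded = dict(general_lexicon)
    for token, n in counts.items():
        mean = totals[token] / n
        # Keep only words seen often enough and with a clear polarity.
        if n >= min_count and abs(mean) >= threshold:
            expanded[token] = mean
    return expanded


# Toy data, invented for this example.
general = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}
corpus = [
    "great phone with crisp display",
    "good camera and crisp photos",
    "bad battery and laggy menus",
    "terrible laggy software",
]
domain = expand_lexicon(general, corpus)
```

In this toy run the domain words "crisp" and "laggy" are added with positive and negative polarity respectively, while the neutral word "and" is filtered out because its mean weak label is zero. A phrase such as "laggy interface", which the general lexicon cannot score, then receives a negative score from the extended lexicon, which is the kind of coverage and specificity gain the abstract describes.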
Type: paper
Keywords: Learning symbolic knowledge; NLP; Sentiment analysis
Language: English
Conference: International Workshops of ECML PKDD 2023, September 18–22, 2023
Conference year: 2023
Book title: Machine Learning and Principles and Practice of Knowledge Discovery in Databases International Workshops of ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Revised Selected Papers, Part III
ISBN: 9783031746321
Date: 2 January 2025
Year: 2025
Volume: 2135 CCIS
Pages: 390-402
none

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/539823