Effrosynidis, D., Peikos, G., Symeonidis, S., Arampatzis, A. (2018). DUTH at SemEval-2018 Task 2: Emoji Prediction in Tweets. In NAACL HLT 2018 - International Workshop on Semantic Evaluation, SemEval 2018 - Proceedings of the 12th Workshop (pp. 466-469). Association for Computational Linguistics (ACL) [10.18653/v1/S18-1074].
DUTH at SemEval-2018 Task 2: Emoji Prediction in Tweets
Peikos G.;
2018
Abstract
This paper describes the approach developed by the DUTH team for SemEval 2018 Task 2 (Multilingual Emoji Prediction). First, we applied a combination of preprocessing techniques to reduce noise in the tweets and to produce a number of features. Next, we built several n-grams to represent combinations of words and emojis. Finally, we trained our system with a tuned LinearSVC classifier. Our approach ranked 18th among 48 teams on the leaderboard.

File | Size | Format |
---|---|---|
Effrosynidis-2018-SemEval-VoR.pdf (open access; attachment type: Publisher's Version (Version of Record, VoR); license: Creative Commons) | 129.75 kB | Adobe PDF |
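
The first two stages named in the abstract, noise-reducing preprocessing followed by n-gram construction, can be illustrated with a minimal sketch. The specific steps here (replacing URLs and user mentions with placeholders, lowercasing, whitespace tokenization, unigrams plus bigrams) are assumptions chosen for illustration; the abstract does not say which preprocessing techniques or n-gram ranges the authors used. The helper names `preprocess` and `word_ngrams` are hypothetical.

```python
import re
from collections import Counter

# Assumed noise patterns; the paper's actual preprocessing may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def preprocess(tweet):
    """Replace URLs and user mentions with placeholders, lowercase, and split on whitespace."""
    text = URL_RE.sub("<url>", tweet)
    text = MENTION_RE.sub("<user>", text)
    return text.lower().split()


def word_ngrams(tokens, n_min=1, n_max=2):
    """Count every word n-gram with n_min <= n <= n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical example tweet, not taken from the task data.
tokens = preprocess("Love this @friend https://t.co/abc Love this")
features = word_ngrams(tokens)
```

Counts of this kind could then be turned into a feature matrix and passed to a linear SVM such as the LinearSVC classifier the abstract mentions; that step is left out here to keep the sketch dependency-free.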