Multi-class Twitter sentiment classification with emojis

Mengdi Li, Eugene Ch’ng, Alain Yee Loong Chong, Simon See

Research output: Journal Publication, Article, peer-review

40 Citations (Scopus)

Abstract

Purpose: Various Twitter Sentiment Analysis (TSA) techniques have been developed recently, but little attention has been paid to emojis, a distinctive feature of microblogging, and few studies have addressed multi-class sentiment analysis of tweets. Given the popularity of emojis on Twitter, this paper investigates the feasibility of an emoji training heuristic for multi-class sentiment classification of tweets. Tweets from the “2016 Orlando nightclub shooting” were used as the source of study. The study also aims to demonstrate how mapping can contribute to interpreting sentiments.

Design/methodology/approach: The authors present a methodological framework to collect, pre-process, analyse and map public Twitter postings related to the shooting. They designed and implemented an emoji training heuristic that automatically prepares the training data set, a capability needed in Big Data research. They improved upon the previous framework by advancing the pre-processing techniques, enhancing feature engineering and optimising the classification models. The sentiment model was constructed with a logistic regression classifier and selected features. Finally, the authors show how to visualise citizen sentiments on maps dynamically using Mapbox.

Findings: The sentiment model, constructed with training sets automatically annotated using the emoji approach and with selected features, performs well in classifying tweets into five sentiment classes, with a macro-averaged F-measure of 0.635, a macro-averaged accuracy of 0.689 and an MAEM of 0.530. Compared with the experimental results reported in related works, these results are satisfactory, indicating that the model is effective and that the proposed emoji training heuristic is useful and feasible in multi-class TSA. The maps the authors created provide a much easier-to-understand visual representation of the data and make it more efficient to monitor citizen sentiments and their distributions.

Originality/value: This work appears to be the first to conduct multi-class sentiment classification on Twitter with automatic annotation of training sets using emojis. Because little attention has been paid to applying TSA to monitor the public’s attitudes towards terror attacks and a country’s gun policies, the authors consider this a pioneering work. The authors have also introduced a new data set of 2016 Orlando Shooting tweets, which will be made available for other researchers to mine the public’s political opinions about gun policies.
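
As an illustration only, the emoji training heuristic and the macro-averaged metrics named in the abstract might be sketched as follows. The emoji-to-class mapping, the five-point scale from -2 to 2, and all function names are assumptions made for this sketch and are not taken from the paper. The sketch also assumes that MAEM denotes the macro-averaged mean absolute error, i.e. the absolute error averaged within each true class and then across classes.

```python
from collections import defaultdict

# Hypothetical emoji-to-class mapping on a five-point scale (-2 .. 2).
# These assignments are illustrative assumptions, not the paper's lexicon.
EMOJI_CLASS = {
    "😡": -2,
    "😢": -1,
    "😐": 0,
    "🙂": 1,
    "😍": 2,
}


def label_by_emoji(tweet):
    """Return (text_without_emojis, label), or None if the tweet is unusable.

    A tweet is kept only when all of its known emojis agree on one class.
    The emojis are stripped so a classifier cannot simply memorise them.
    """
    labels = {EMOJI_CLASS[ch] for ch in tweet if ch in EMOJI_CLASS}
    if len(labels) != 1:
        return None  # no known emoji, or conflicting emojis
    text = "".join(ch for ch in tweet if ch not in EMOJI_CLASS).strip()
    return text, labels.pop()


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_mae(y_true, y_pred):
    """Mean absolute error computed per true class, then averaged over classes."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return sum(sum(e) / len(e) for e in errors.values()) / len(errors)
```

Under these assumptions, `label_by_emoji` would produce the automatically annotated training pairs, a classifier such as logistic regression would be trained on the stripped text, and `macro_f1` and `macro_mae` would score its predictions so that rare sentiment classes count as much as frequent ones.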

Original language: English
Pages (from-to): 1804-1820
Number of pages: 17
Journal: Industrial Management and Data Systems
Volume: 118
Issue number: 9
DOIs
Publication status: Published - 28 Sept 2018

Keywords

  • Emojis
  • Sentiment analysis
  • Twitter

ASJC Scopus subject areas

  • Management Information Systems
  • Industrial relations
  • Computer Science Applications
  • Strategy and Management
  • Industrial and Manufacturing Engineering
