Multi-class Twitter sentiment classification with emojis

Mengdi Li, Eugene Ch’ng, Alain Yee Loong Chong, Simon See

    Research output: Journal Publication, Article, peer-review

    42 Citations (Scopus)

    Abstract

    Purpose: Various Twitter Sentiment Analysis (TSA) techniques have been developed recently, but little attention has been paid to emojis, a distinctive feature of microblogging, and few works have addressed multi-class sentiment analysis of tweets. This paper considers the popularity of emojis on Twitter and investigates the feasibility of an emoji training heuristic for multi-class sentiment classification of tweets. Tweets from the “2016 Orlando nightclub shooting” were used as the source of study. The study also aims to demonstrate how mapping can contribute to interpreting sentiments.

    Design/methodology/approach: The authors present a methodological framework to collect, pre-process, analyse and map public Twitter postings related to the shooting. They designed and implemented an emoji training heuristic that automatically prepares the training data set, a capability needed in Big Data research. They improved upon the previous framework by advancing the pre-processing techniques, enhancing feature engineering and optimising the classification models. The sentiment model was constructed with a logistic regression classifier and selected features. Finally, the authors show how to visualise citizen sentiments on maps dynamically using Mapbox.

    Findings: The sentiment model, constructed from training sets annotated automatically using the emoji approach and from selected features, performs well in classifying tweets into five sentiment classes, with a macro-averaged F-measure of 0.635, a macro-averaged accuracy of 0.689 and an MAEM of 0.530. Compared with experimental results in related works, these results are satisfactory, indicating that the model is effective and that the proposed emoji training heuristic is useful and feasible in multi-class TSA. The maps the authors created provide a much easier-to-understand visual representation of the data and make it more efficient to monitor citizen sentiments and distributions.

    Originality/value: This work appears to be the first to conduct multi-class sentiment classification on Twitter with automatic annotation of training sets using emojis. Because little attention has been paid to applying TSA to monitor the public’s attitudes towards terror attacks and a country’s gun policies, the authors consider this a pioneering work. The authors have also introduced a new data set of 2016 Orlando shooting tweets, which will be made available for other researchers to mine the public’s political opinions about gun policies.
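
    The emoji training heuristic and the macro-averaged metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The emoji-to-class mapping, the five-point scale from -2 to 2, and all function names are hypothetical; the abstract does not list the emojis used or expand the acronym MAEM, which is assumed here to mean macro-averaged mean absolute error. The logistic regression classifier itself is left out so that the example needs only the standard library.

```python
from collections import defaultdict

# Hypothetical emoji-to-class mapping on a five-point ordinal scale
# (-2 = very negative ... 2 = very positive). Illustrative only: the
# abstract does not say which emojis the authors used.
# Each emoji here is a single Unicode code point, so iterating over
# characters is enough to detect it.
EMOJI_CLASS = {"😡": -2, "😢": -1, "😐": 0, "🙂": 1, "😍": 2}


def emoji_label(text):
    """Return the class signalled by the emojis in text, or None when the
    text has no mapped emoji or has emojis from conflicting classes."""
    classes = {EMOJI_CLASS[ch] for ch in text if ch in EMOJI_CLASS}
    return classes.pop() if len(classes) == 1 else None


def strip_emojis(text):
    """Remove the label-bearing emojis so a model cannot simply memorise them."""
    return "".join(ch for ch in text if ch not in EMOJI_CLASS).strip()


def build_training_set(tweets):
    """Automatically annotate tweets: keep only unambiguous ones."""
    pairs = []
    for tweet in tweets:
        label = emoji_label(tweet)
        if label is not None:
            pairs.append((strip_emojis(tweet), label))
    return pairs


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


def macro_mae(y_true, y_pred):
    """Mean absolute error computed per true class, then averaged over classes
    (assumed reading of MAEM; lower is better)."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return sum(sum(v) / len(v) for v in errors.values()) / len(errors)


if __name__ == "__main__":
    sample = ["So sad 😢", "Love this 😍", "Mixed 😍😡", "no emoji"]
    print(build_training_set(sample))
    print(macro_f1([-1, -1, 2, 2], [-1, 2, 2, 2]))
    print(macro_mae([-1, -1, 2, 2], [-1, 2, 2, 2]))
```

    In this sketch, tweets with no mapped emoji or with emojis from conflicting classes are discarded rather than guessed at, which trades training-set size for label quality. Macro-averaging gives each sentiment class equal weight regardless of how many tweets fall into it.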

    Original language: English
    Pages (from-to): 1804-1820
    Number of pages: 17
    Journal: Industrial Management and Data Systems
    Volume: 118
    Issue number: 9
    Publication status: Published - 28 Sept 2018

    Keywords

    • Emojis
    • Sentiment analysis
    • Twitter

    ASJC Scopus subject areas

    • Management Information Systems
    • Industrial relations
    • Computer Science Applications
    • Strategy and Management
    • Industrial and Manufacturing Engineering
