TY - JOUR
T1 - An intelligent early warning system of analyzing Twitter data using machine learning on COVID-19 surveillance in the US
AU - Zhang, Yiming
AU - Chen, Ke
AU - Weng, Ying
AU - Chen, Zhuo
AU - Zhang, Juntao
AU - Hubbard, Richard
N1 - Funding Information:
This research is supported by the NCHI Grant NCHI-I01200300015.
Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2022/7/15
Y1 - 2022/7/15
N2 - The World Health Organization (WHO) declared the spread of coronavirus disease 2019 (COVID-19) a pandemic on 11 March 2020. Traditional infectious disease surveillance failed to alert public health authorities in time for them to intervene and to mitigate and control COVID-19 before it became a pandemic. Compared with traditional public health surveillance, harnessing the rich data from social media, including Twitter, is considered a useful tool that can overcome the limitations of the traditional surveillance system. This paper proposes an intelligent COVID-19 early warning system using Twitter data with novel machine learning methods. We fine-tune BERT, a pre-trained natural language processing (NLP) model, as a Twitter classification method. Moreover, we implement a Twitter-based linear regression model to forecast COVID-19 and detect early signs of the COVID-19 outbreak. Furthermore, we develop an expert system, an early warning web application, based on the proposed methods. The experimental results suggest that it is feasible to use Twitter data to provide COVID-19 surveillance and prediction in the US to support health departments’ decision-making.
KW - BERT
KW - COVID-19 surveillance
KW - Early warning system
KW - Epidemic intelligence
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=85126866944&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2022.116882
DO - 10.1016/j.eswa.2022.116882
M3 - Article
AN - SCOPUS:85126866944
SN - 0957-4174
VL - 198
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 116882
ER -