TY - GEN
T1 - Optimization and Improvement of Fake News Detection using Voting Technique for Societal Benefit
AU - Chinta, Sribala Vidyadhari
AU - Fernandes, Karen
AU - Cheng, Ningxi
AU - Fernandez, Jordan
AU - Yazdani, Shamim
AU - Yin, Zhipeng
AU - Wang, Zichong
AU - Wang, Xuyu
AU - Xu, Weifeng
AU - Liu, Jun
AU - Yew, Chong Siang
AU - Jiang, Puqing
AU - Zhang, Wenbin
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The surge in false information and the spread of fake news on the Internet have become increasingly difficult for fact-checkers to keep up with. Consequently, the exponential growth of fake news poses a serious threat, as it has been extensively exploited to manipulate public opinion and undermine trust in reliable sources. Machine learning classifiers have been employed in previous studies to address this issue. Existing work in text classification often overlooks the incorporation of contextual information, a gap that our proposed methodology seeks to fill. Our approach distinguishes itself by employing sophisticated text preprocessing techniques to capture subtle linguistic features, thereby enhancing the overall understanding of the text. Furthermore, we draw inspiration from ensemble machine learning strategies to bolster our methodology. We adopt a voting system, wherein the class most frequently predicted by five distinct classifiers is chosen. This ensemble method helps us address the inherent limitations of individual classifiers and improve the robustness of our results. In this study, we present a comparative analysis of five individual classifiers (Logistic Regression, Decision Trees, Naive Bayes, eXtreme Gradient Boosting, and Stochastic Gradient Descent) along with their combination using our ensemble voting technique. We conduct experiments on three real-world datasets of varying sizes and contexts for evaluation. Our findings reveal the increased performance of voting techniques in distinguishing between real and fake news, providing valuable insights into their efficacy in diverse contexts when compared to individual classifiers.
KW - Fake News
KW - Linguistic Approach
KW - Network Approach
KW - Term-Frequency-Inverse Document Frequency
KW - Voting Technique
UR - https://www.scopus.com/pages/publications/85186143852
U2 - 10.1109/ICDMW60847.2023.00199
DO - 10.1109/ICDMW60847.2023.00199
M3 - Conference contribution
AN - SCOPUS:85186143852
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 1565
EP - 1574
BT - Proceedings - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
A2 - Wang, Jihe
A2 - He, Yi
A2 - Dinh, Thang N.
A2 - Grant, Christan
A2 - Qiu, Meikang
A2 - Pedrycz, Witold
PB - IEEE Computer Society
T2 - 23rd IEEE International Conference on Data Mining Workshops, ICDMW 2023
Y2 - 1 December 2023 through 4 December 2023
ER -