A hybrid medical text classification framework: Integrating attentive rule construction and neural network

Xiang Li; Menglin Cui; Jingpeng Li; Ruibin Bai; Zheng Lu; Uwe Aickelin

doi:10.1016/j.neucom.2021.02.069

School of Computer Science

Research output: Journal Publication › Article › peer-review

52 Citations (Scopus)

68 Downloads (Pure)

Abstract

The main objective of this work is to improve the quality and transparency of medical text classification solutions. Conventional text classification methods provide users with only a restricted, frequency-based mechanism for selecting features. In this paper, a three-stage hybrid method combining a gated attention-based bi-directional Long Short-Term Memory (ABLSTM) network and a regular expression based classifier is proposed for medical text classification tasks. The bi-directional Long Short-Term Memory (LSTM) architecture with an attention layer allows the network to weigh words according to their perceived importance and focus on crucial parts of a sentence. Feature words (or keywords) extracted by the ABLSTM model are used to guide the regular expression rule construction. The proposed approach combines the interpretability of rule-based algorithms with the computational power of deep learning approaches for a production-ready scenario. Experimental results on real-world medical online query data validate the superiority of our system in selecting domain-specific and topic-related features: the proposed approach achieves an accuracy of 0.89 and an F₁-score of 0.92. Furthermore, our experiments illustrate the versatility of regular expressions as a user-level tool for focusing on desired patterns and providing interpretable solutions that humans can modify.
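The idea of letting attention weights guide regular expression rule construction can be illustrated with a small Python sketch. This is not the authors' implementation: the attention weights below are hand-written placeholders rather than ABLSTM outputs, the labels and example queries are invented, and the helper names (top_keywords, build_rules, classify) are hypothetical. It only shows one plausible way that high-weight words could be turned into readable, editable rules.

```python
import re
from collections import defaultdict


def top_keywords(tokens, weights, k=2):
    """Return the k tokens with the highest weight (ties keep input order)."""
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return [tok for tok, _ in ranked[:k]]


def build_rules(examples, k=2):
    """Collect top-k keywords per label and compile one alternation regex per label."""
    keywords = defaultdict(set)
    for tokens, weights, label in examples:
        keywords[label].update(top_keywords(tokens, weights, k))
    return {
        label: re.compile(
            r"\b(?:" + "|".join(sorted(re.escape(w) for w in words)) + r")\b",
            re.IGNORECASE,
        )
        for label, words in keywords.items()
    }


def classify(text, rules, fallback="unknown"):
    """Return the label whose rule matches most often, or fallback if none match."""
    if not rules:
        return fallback
    counts = {label: len(rx.findall(text)) for label, rx in rules.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else fallback


# Assumed toy data: (tokens, per-token attention weights, label).
# The weights are illustrative placeholders, not outputs of a trained model.
examples = [
    (["my", "skin", "has", "a", "red", "rash"],
     [0.05, 0.30, 0.05, 0.05, 0.15, 0.40], "dermatology"),
    (["persistent", "cough", "and", "chest", "pain"],
     [0.10, 0.40, 0.05, 0.30, 0.15], "respiratory"),
]

rules = build_rules(examples)
for label, rx in rules.items():
    print(label, rx.pattern)

print(classify("I have had a rash on my arm", rules))  # dermatology
print(classify("What is a good diet?", rules))          # unknown
```

Because each rule is an ordinary regular expression, a domain expert could inspect the printed patterns and add or remove alternatives by hand, which is the kind of human modification the abstract refers to.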

Original language	English
Pages (from-to)	345-355
Number of pages	11
Journal	Neurocomputing
Volume	443
DOIs	https://doi.org/10.1016/j.neucom.2021.02.069
Publication status	Published - 5 Jul 2021

Keywords

Attention mechanism
Deep learning
Hybrid system
Text classification

ASJC Scopus subject areas

Computer Science Applications
Cognitive Neuroscience
Artificial Intelligence

Access to Document

10.1016/j.neucom.2021.02.069

Accepted author manuscript, 929 KB, Licence: CC BY

Cite this

@article{ea2769854a354ed08e491350c77fef4c,

title = "A hybrid medical text classification framework: Integrating attentive rule construction and neural network",

abstract = "The main objective of this work is to improve the quality and transparency of the medical text classification solutions. Conventional text classification methods provide users with only a restricted mechanism (based on frequency) for selecting features. In this paper, a three-stage hybrid method combining the gated attention-based bi-directional Long Short-Term Memory (ABLSTM) and the regular expression based classifier is proposed for medical text classification tasks. The bi-directional Long Short-Term Memory (LSTM) architecture with an attention layer allows the network to weigh words according to their perceived importance and focus on crucial parts of a sentence. Feature words (or keywords) extracted by ABLSTM model are utilized to guide the regular expression rule construction. Our proposed approach leverages the advantages of both the interpretability of rule-based algorithms and the computational power of deep learning approaches for a production-ready scenario. Experimental results on real-world medical online query data clearly validate the superiority of our system in selecting domain-specific and topic-related features. Results show that the proposed approach achieves an accuracy of 0.89 and an F1-score of 0.92 respectively. Furthermore, our experimentation also illustrates the versatility of regular expressions as a user-level tool for focusing on desired patterns and providing interpretable solutions for human modification.",

keywords = "Attention mechanism, Deep learning, Hybrid system, Text classification",

author = "Xiang Li and Menglin Cui and Jingpeng Li and Ruibin Bai and Zheng Lu and Uwe Aickelin",

note = "Publisher Copyright: {\textcopyright} 2021 Elsevier B.V.",

year = "2021",

month = jul,

day = "5",

doi = "10.1016/j.neucom.2021.02.069",

language = "English",

volume = "443",

pages = "345--355",

journal = "Neurocomputing",

issn = "0925-2312",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - A hybrid medical text classification framework

T2 - Integrating attentive rule construction and neural network

AU - Li, Xiang

AU - Cui, Menglin

AU - Li, Jingpeng

AU - Bai, Ruibin

AU - Lu, Zheng

AU - Aickelin, Uwe

PY - 2021/7/5

Y1 - 2021/7/5

AB - The main objective of this work is to improve the quality and transparency of the medical text classification solutions. Conventional text classification methods provide users with only a restricted mechanism (based on frequency) for selecting features. In this paper, a three-stage hybrid method combining the gated attention-based bi-directional Long Short-Term Memory (ABLSTM) and the regular expression based classifier is proposed for medical text classification tasks. The bi-directional Long Short-Term Memory (LSTM) architecture with an attention layer allows the network to weigh words according to their perceived importance and focus on crucial parts of a sentence. Feature words (or keywords) extracted by ABLSTM model are utilized to guide the regular expression rule construction. Our proposed approach leverages the advantages of both the interpretability of rule-based algorithms and the computational power of deep learning approaches for a production-ready scenario. Experimental results on real-world medical online query data clearly validate the superiority of our system in selecting domain-specific and topic-related features. Results show that the proposed approach achieves an accuracy of 0.89 and an F1-score of 0.92 respectively. Furthermore, our experimentation also illustrates the versatility of regular expressions as a user-level tool for focusing on desired patterns and providing interpretable solutions for human modification.

KW - Attention mechanism

KW - Deep learning

KW - Hybrid system

KW - Text classification

UR - http://www.scopus.com/inward/record.url?scp=85103638662&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2021.02.069

DO - 10.1016/j.neucom.2021.02.069

M3 - Article

AN - SCOPUS:85103638662

SN - 0925-2312

VL - 443

SP - 345

EP - 355

JO - Neurocomputing

JF - Neurocomputing

ER -
