Abstract
The main objective of this work is to improve the quality and transparency of medical text classification solutions. Conventional text classification methods give users only a restricted, frequency-based mechanism for selecting features. In this paper, a three-stage hybrid method combining a gated attention-based bi-directional Long Short-Term Memory (ABLSTM) network and a regular-expression-based classifier is proposed for medical text classification tasks. The bi-directional Long Short-Term Memory (LSTM) architecture with an attention layer allows the network to weigh words according to their perceived importance and to focus on the crucial parts of a sentence. Feature words (or keywords) extracted by the ABLSTM model are used to guide the construction of regular expression rules. The proposed approach combines the interpretability of rule-based algorithms with the computational power of deep learning approaches for a production-ready scenario. Experimental results on real-world medical online query data validate the superiority of our system in selecting domain-specific and topic-related features. The proposed approach achieves an accuracy of 0.89 and an F1-score of 0.92. Furthermore, our experiments illustrate the versatility of regular expressions as a user-level tool for focusing on desired patterns and for providing interpretable solutions that humans can modify.
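The keyword-to-rule step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how attention weights are aggregated or how rules are assembled, so the summing of weights per token, the top-k cutoff, the helper names `top_keywords` and `build_rule`, and the sample weights are all assumptions made for illustration. The attention weights are hard-coded here in place of the output of a trained ABLSTM model.

```python
import re
from collections import defaultdict


def top_keywords(samples, k=3):
    """Rank tokens of one class by their summed attention weight.

    `samples` is a list of (tokens, attention_weights) pairs. Summing the
    weights per token is an assumed aggregation, not taken from the paper.
    """
    totals = defaultdict(float)
    for tokens, weights in samples:
        for token, weight in zip(tokens, weights):
            totals[token.lower()] += weight
    # Highest total weight first; ties are broken alphabetically.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [token for token, _ in ranked[:k]]


def build_rule(keywords):
    """Compile a case-insensitive rule matching any keyword as a whole word."""
    alternation = "|".join(re.escape(keyword) for keyword in keywords)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


# Hypothetical attention output for two queries of a single class.
samples = [
    (["my", "skin", "has", "a", "red", "rash"],
     [0.05, 0.30, 0.05, 0.05, 0.15, 0.40]),
    (["itchy", "rash", "on", "my", "arm"],
     [0.30, 0.45, 0.05, 0.05, 0.15]),
]

keywords = top_keywords(samples, k=3)
rule = build_rule(keywords)

print(keywords)
print(bool(rule.search("I have a Rash on my leg")))
```

Because the resulting rule is an ordinary regular expression, a domain expert could inspect it and add or remove alternatives by hand, which is the kind of human modification the abstract refers to.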
Original language | English |
---|---|
Pages (from-to) | 345-355 |
Number of pages | 11 |
Journal | Neurocomputing |
Volume | 443 |
Publication status | Published - 5 Jul 2021 |
Keywords
- Attention mechanism
- Deep learning
- Hybrid system
- Text classification
ASJC Scopus subject areas
- Computer Science Applications
- Cognitive Neuroscience
- Artificial Intelligence