Question Answering with Language Models

Jia Yin Wong, Chin Poo Lee, Kian Ming Lim, Jit Yan Lim, Jashila Nair Mogan

Research output: Journal Publication › Conference article › peer-review

Abstract

As the number of information sources, such as research papers and blogs, grows from thousands to millions, extracting meaningful insights from them has become a challenge. This study develops a Question Answering (QA) system using the Haystack framework that retrieves relevant documents and extracts answers from lengthy texts. The implemented QA system consists of four main components: an indexing pipeline, a document store, a searching pipeline, and an evaluation module. The study also investigates how different model architectures affect retrieval and answer-extraction performance. Two retrievers, BM25 Retriever and Dense Passage Retriever, and five readers, BERT, RoBERTa, ALBERT, MiniLM, and ELECTRA, are examined. The models are tested and evaluated on the SQuAD datasets. The combination of BM25 Retriever and RoBERTa Reader achieved the best performance, with an F1-score of 0.9301 and an Exact Match score of 0.8956 on the SQuAD2.0 dataset.
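The abstract reports reader quality as F1-score and Exact Match. The record does not describe how the study's evaluation module computes them, so the following is only a minimal sketch of the conventional SQuAD-style definitions, assuming a single gold answer per question. The function names `normalize`, `exact_match`, and `f1_score` are illustrative and are not taken from the paper or from Haystack.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between a predicted and a gold answer."""
    pred, gold = normalize(prediction), normalize(truth)
    if not pred or not gold:
        # Assumption: an empty string stands for "no answer" (as in SQuAD2.0),
        # so the score is 1.0 only when both sides are empty.
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

Dataset-level scores such as those quoted above are typically the mean of these per-question values. Evaluations that use several gold answers per question usually take the maximum over them, which this sketch omits.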

Original language: English
Pages (from-to): 18-23
Number of pages: 6
Journal: Proceedings of the IEEE Conference on Systems, Process and Control, ICSPC
Issue number: 2024
Publication status: Published - 2024
Event: 12th IEEE Conference on Systems, Process and Control, ICSPC 2024 - Malacca, Malaysia
Duration: 7 Dec 2024 → …

Keywords

  • ALBERT
  • BERT
  • BM25 retriever
  • dense passage retriever
  • ELECTRA
  • MiniLM
  • natural language processing
  • question answering
  • RoBERTa
  • transformer

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
  • Modelling and Simulation
  • Education
