Incremental maintenance of biological databases using association rule mining

Kai Tak Lam, Judice L.Y. Koh, Bharadwaj Veeravalli, Vladimir Brusic

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Biological research frequently requires specialist databases to support in-depth analysis of specific subjects. With the rapid growth of biological sequences in public domain data sources, it is difficult to keep these databases current with the sources. Simple queries formulated to retrieve relevant sequences typically return a large number of false matches and thus demand manual filtering. In this paper, we propose a novel methodology that can support automatic incremental updating of specialist databases. Complex queries for incremental updating of relevant sequences are learned using Association Rule Mining (ARM), resulting in a significant reduction in false positive matches. This is the first time ARM has been used to formulate descriptive queries for the incremental maintenance of specialised biological databases. We have implemented and tested our methodology on two real-world databases. Our experiments show that the methodology achieves an F-score of up to 80% in detecting new sequences for these two databases.
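The general idea of learning a conjunctive query from labelled records with association rules, and scoring it with an F-score, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how records are represented or which ARM algorithm is used. The sketch assumes each record is a set of annotation terms with a relevance label, uses brute-force itemset enumeration rather than Apriori, and all names, thresholds, and toy records (`mine_rules`, `matches`, `f_score`, `min_support`, `min_confidence`) are hypothetical.

```python
from itertools import combinations


def mine_rules(records, min_support=0.5, min_confidence=0.8, max_len=2):
    """Find term sets that predict relevance (assumed representation).

    records: list of (set_of_terms, is_relevant) pairs.
    Returns a list of (antecedent, support, confidence) for rules of the
    form "antecedent -> relevant". Brute-force enumeration, for illustration.
    """
    n = len(records)
    terms = sorted({t for items, _ in records for t in items})
    rules = []
    for size in range(1, max_len + 1):
        for combo in combinations(terms, size):
            antecedent = frozenset(combo)
            # Relevance labels of all records containing every term in the set.
            covered = [rel for items, rel in records if antecedent <= items]
            hits = sum(covered)
            support = hits / n  # fraction of all records that match AND are relevant
            if not covered or support < min_support:
                continue
            confidence = hits / len(covered)  # fraction of matches that are relevant
            if confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules


def matches(rules, items):
    """A record is retrieved if it satisfies any learned rule's antecedent."""
    return any(antecedent <= items for antecedent, _, _ in rules)


def f_score(predicted, actual):
    """Harmonic mean of precision and recall over boolean predictions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training records (invented for illustration only).
records = [
    ({"toxin", "scorpion", "venom"}, True),
    ({"toxin", "scorpion"}, True),
    ({"toxin", "bacteria"}, False),
    ({"kinase", "scorpion"}, False),
]
rules = mine_rules(records)
```

On this toy data, neither "toxin" nor "scorpion" alone is confident enough to become a rule, but their conjunction is. That mirrors the abstract's point that a learned complex query returns fewer false matches than a simple one.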

Original language: English
Title of host publication: Pattern Recognition in Bioinformatics - International Workshop, PRIB 2006, Proceedings
Publisher: Springer Verlag
Pages: 140-150
Number of pages: 11
ISBN (Print): 3540374469, 9783540374466
Publication status: Published - 2006
Externally published: Yes
Event: International Workshop on Pattern Recognition in Bioinformatics, PRIB 2006 - Hong Kong, China
Duration: 20 Aug 2006 → 20 Aug 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4146 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Workshop on Pattern Recognition in Bioinformatics, PRIB 2006
Country/Territory: China
City: Hong Kong
Period: 20/08/06 → 20/08/06

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
