Data classification using the Dempster-Shafer method

Qi Chen; Amanda Whitbrook; Uwe Aickelin; Chris Roadknight

doi:10.1080/0952813X.2014.886301

Research output: Journal Publication › Article › peer-review

44 Citations (Scopus)

Abstract

In this paper, the Dempster-Shafer (D-S) method is used as the theoretical basis for creating data classification systems. Testing is carried out using three popular multiple attribute benchmark data-sets that have two, three and four classes. In each case, a subset of the available data is used for training to establish thresholds, limits or likelihoods of class membership for each attribute, and hence create mass functions that establish probability of class membership for each attribute of the test data. Classification of each data item is achieved by combination of these probabilities via Dempster's rule of combination. Results for the first two data-sets show extremely high classification accuracy that is competitive with other popular methods. The third data-set is non-numerical and difficult to classify, but good results can be achieved provided the system and mass functions are designed carefully and the right attributes are chosen for combination. In all cases, the D-S method provides comparable performance to other more popular algorithms, but the overhead of generating accurate mass functions increases the complexity with the addition of new attributes. Overall, the results suggest that the D-S approach provides a suitable framework for the design of classification systems and that automating the mass function design and calculation would increase the viability of the algorithm for complex classification problems.
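The combination step described in the abstract can be sketched as follows. This is a minimal illustration of Dempster's rule of combination, not the authors' implementation: the function names `combine` and `classify`, the dictionary-of-frozensets representation, and the decision rule (pick the single class with the highest combined mass) are all assumptions made for the example, and the mass values in the usage note are invented.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of class labels
    (a focal element) to its mass; the masses are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Product mass that would fall on the empty set (conflict).
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the mass functions cannot be combined")
    # Renormalise so that the remaining masses sum to 1.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def classify(mass_functions):
    """Combine one mass function per attribute and pick a class.

    Assumed decision rule: choose the singleton focal element with the
    highest combined mass. Raises ValueError if no singleton has mass.
    """
    combined = mass_functions[0]
    for m in mass_functions[1:]:
        combined = combine(combined, m)
    singletons = {f: v for f, v in combined.items() if len(f) == 1}
    if not singletons:
        raise ValueError("no singleton class received any mass")
    best = max(singletons, key=singletons.get)
    return next(iter(best)), combined
```

For instance, with two hypothetical attributes and classes "A" and "B", calling `classify` on `{frozenset({"A"}): 0.6, frozenset({"B"}): 0.1, frozenset({"A", "B"}): 0.3}` and `{frozenset({"A"}): 0.5, frozenset({"B"}): 0.3, frozenset({"A", "B"}): 0.2}` selects class "A". Mass placed on the full set `{"A", "B"}` represents uncertainty that an attribute cannot resolve.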

Original language	English
Pages (from-to)	493-517
Number of pages	25
Journal	Journal of Experimental and Theoretical Artificial Intelligence
Volume	26
Issue number	4
DOIs	https://doi.org/10.1080/0952813X.2014.886301
Publication status	Published - 2 Oct 2014
Externally published	Yes

Keywords

Dempster's rule of combination
Dempster-Shafer theory
data classification

ASJC Scopus subject areas

Software
Theoretical Computer Science
Artificial Intelligence

Cite this

@article{9fa581219db442d19eca6c0cfb17332c,

title = "Data classification using the Dempster-Shafer method",

abstract = "In this paper, the Dempster-Shafer (D-S) method is used as the theoretical basis for creating data classification systems. Testing is carried out using three popular multiple attribute benchmark data-sets that have two, three and four classes. In each case, a subset of the available data is used for training to establish thresholds, limits or likelihoods of class membership for each attribute, and hence create mass functions that establish probability of class membership for each attribute of the test data. Classification of each data item is achieved by combination of these probabilities via Dempster's rule of combination. Results for the first two data-sets show extremely high classification accuracy that is competitive with other popular methods. The third data-set is non-numerical and difficult to classify, but good results can be achieved provided the system and mass functions are designed carefully and the right attributes are chosen for combination. In all cases, the D-S method provides comparable performance to other more popular algorithms, but the overhead of generating accurate mass functions increases the complexity with the addition of new attributes. Overall, the results suggest that the D-S approach provides a suitable framework for the design of classification systems and that automating the mass function design and calculation would increase the viability of the algorithm for complex classification problems.",

keywords = "Dempster's rule of combination, Dempster-Shafer theory, data classification",

author = "Qi Chen and Amanda Whitbrook and Uwe Aickelin and Chris Roadknight",

note = "Publisher Copyright: {\textcopyright} 2014 Taylor \& Francis.",

year = "2014",

month = oct,

day = "2",

doi = "10.1080/0952813X.2014.886301",

language = "English",

volume = "26",

pages = "493--517",

journal = "Journal of Experimental and Theoretical Artificial Intelligence",

issn = "0952-813X",

publisher = "Taylor and Francis Ltd.",

number = "4",

}

TY - JOUR

T1 - Data classification using the Dempster-Shafer method

AU - Chen, Qi

AU - Whitbrook, Amanda

AU - Aickelin, Uwe

AU - Roadknight, Chris

PY - 2014/10/2

Y1 - 2014/10/2

N2 - In this paper, the Dempster-Shafer (D-S) method is used as the theoretical basis for creating data classification systems. Testing is carried out using three popular multiple attribute benchmark data-sets that have two, three and four classes. In each case, a subset of the available data is used for training to establish thresholds, limits or likelihoods of class membership for each attribute, and hence create mass functions that establish probability of class membership for each attribute of the test data. Classification of each data item is achieved by combination of these probabilities via Dempster's rule of combination. Results for the first two data-sets show extremely high classification accuracy that is competitive with other popular methods. The third data-set is non-numerical and difficult to classify, but good results can be achieved provided the system and mass functions are designed carefully and the right attributes are chosen for combination. In all cases, the D-S method provides comparable performance to other more popular algorithms, but the overhead of generating accurate mass functions increases the complexity with the addition of new attributes. Overall, the results suggest that the D-S approach provides a suitable framework for the design of classification systems and that automating the mass function design and calculation would increase the viability of the algorithm for complex classification problems.

AB - In this paper, the Dempster-Shafer (D-S) method is used as the theoretical basis for creating data classification systems. Testing is carried out using three popular multiple attribute benchmark data-sets that have two, three and four classes. In each case, a subset of the available data is used for training to establish thresholds, limits or likelihoods of class membership for each attribute, and hence create mass functions that establish probability of class membership for each attribute of the test data. Classification of each data item is achieved by combination of these probabilities via Dempster's rule of combination. Results for the first two data-sets show extremely high classification accuracy that is competitive with other popular methods. The third data-set is non-numerical and difficult to classify, but good results can be achieved provided the system and mass functions are designed carefully and the right attributes are chosen for combination. In all cases, the D-S method provides comparable performance to other more popular algorithms, but the overhead of generating accurate mass functions increases the complexity with the addition of new attributes. Overall, the results suggest that the D-S approach provides a suitable framework for the design of classification systems and that automating the mass function design and calculation would increase the viability of the algorithm for complex classification problems.

KW - Dempster's rule of combination

KW - Dempster-Shafer theory

KW - data classification

UR - http://www.scopus.com/inward/record.url?scp=84909943781&partnerID=8YFLogxK

U2 - 10.1080/0952813X.2014.886301

DO - 10.1080/0952813X.2014.886301

M3 - Article

AN - SCOPUS:84909943781

SN - 0952-813X

VL - 26

SP - 493

EP - 517

JO - Journal of Experimental and Theoretical Artificial Intelligence

JF - Journal of Experimental and Theoretical Artificial Intelligence

IS - 4

ER -
