An unsupervised self-organizing learning with support vector ranking for imbalanced datasets

Yok Yen Nguwi; Siu Yeung Cho

doi:10.1016/j.eswa.2010.05.054

Research output: Journal Publication › Article › peer-review

27 Citations (Scopus)

Abstract

The aim of a computational learning algorithm is to establish a foundation that works for any type of data. However, the majority of classifiers are designed around balanced datasets. This paper discusses the issues related to the imbalanced data distribution problem and the common strategies for dealing with imbalanced datasets. We propose a model capable of handling imbalanced datasets on which other typical classifiers fail. The model adopts a derivation of support vector machines to select variables so that the problem of imbalanced data distribution can be relaxed. We then use an Emergent Self-Organizing Map (ESOM) to cluster the ranked features so as to provide clusters for unsupervised classification. This work proceeds by examining the efficiency of the model in evaluating imbalanced datasets. Experimental results show that the criterion based on the weight vector derivative achieves good results and performs consistently well over imbalanced datasets. In general, our approach outperforms other classification methods that are unable to handle the imbalanced data distribution in the testing datasets.
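
As a rough illustration of the variable-selection stage only, the sketch below trains a linear SVM by batch subgradient descent on the regularised hinge loss and ranks features by squared weight. The abstract does not spell out the paper's weight-vector-derivative criterion, so the squared-weight score, the function names `train_linear_svm` and `rank_features`, the hyperparameters, and the toy imbalanced data are all assumptions made for illustration. The ESOM clustering stage is not shown.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 / -1. Hyperparameters are illustrative guesses,
    not values taken from the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(w):
    """Order feature indices by squared weight, largest first.

    Assumed stand-in for the paper's ranking criterion.
    """
    return sorted(range(len(w)), key=lambda j: w[j] ** 2, reverse=True)


# Toy imbalanced data (hypothetical): 2 minority (+1) vs 8 majority (-1) samples.
# Feature 0 separates the classes, feature 1 is label-independent noise,
# feature 2 is a weaker copy of feature 0.
minority = [[1.0, s, 0.2] for s in (0.5, -0.5)]
majority = [[-1.0, s, -0.2] for s in (0.5, -0.5)] * 4
X = minority + majority
y = [1] * len(minority) + [-1] * len(majority)

w, b = train_linear_svm(X, y)
ranking = rank_features(w)
print("weights:", w)
print("feature ranking (most to least relevant):", ranking)
```

On this toy set the noise feature ends up ranked last because its contributions cancel within each class; real data would need feature scaling before weight magnitudes are comparable.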

Original language	English
Pages (from-to)	8303-8312
Number of pages	10
Journal	Expert Systems with Applications
Volume	37
Issue number	12
DOIs	https://doi.org/10.1016/j.eswa.2010.05.054
Publication status	Published - Dec 2010
Externally published	Yes

Keywords

Emergent Self-Organizing Map
Imbalanced datasets
Support vector ranking

ASJC Scopus subject areas

General Engineering
Computer Science Applications
Artificial Intelligence

Access to Document

10.1016/j.eswa.2010.05.054

Cite this

@article{109f26a8f3eb4e939db7e85bac51a897,
  title = "An unsupervised self-organizing learning with support vector ranking for imbalanced datasets",
  abstract = "The aim of a computational learning algorithm is to establish a foundation that works for any type of data. However, the majority of classifiers are designed around balanced datasets. This paper discusses the issues related to the imbalanced data distribution problem and the common strategies for dealing with imbalanced datasets. We propose a model capable of handling imbalanced datasets on which other typical classifiers fail. The model adopts a derivation of support vector machines to select variables so that the problem of imbalanced data distribution can be relaxed. We then use an Emergent Self-Organizing Map (ESOM) to cluster the ranked features so as to provide clusters for unsupervised classification. This work proceeds by examining the efficiency of the model in evaluating imbalanced datasets. Experimental results show that the criterion based on the weight vector derivative achieves good results and performs consistently well over imbalanced datasets. In general, our approach outperforms other classification methods that are unable to handle the imbalanced data distribution in the testing datasets.",
  keywords = "Emergent Self-Organizing Map, Imbalanced datasets, Support vector ranking",
  author = "Nguwi, {Yok Yen} and Cho, {Siu Yeung}",
  year = "2010",
  month = dec,
  doi = "10.1016/j.eswa.2010.05.054",
  language = "English",
  volume = "37",
  pages = "8303--8312",
  journal = "Expert Systems with Applications",
  issn = "0957-4174",
  publisher = "Elsevier Ltd.",
  number = "12",
}

TY  - JOUR
T1  - An unsupervised self-organizing learning with support vector ranking for imbalanced datasets
AU  - Nguwi, Yok Yen
AU  - Cho, Siu Yeung
PY  - 2010/12
Y1  - 2010/12
AB  - The aim of a computational learning algorithm is to establish a foundation that works for any type of data. However, the majority of classifiers are designed around balanced datasets. This paper discusses the issues related to the imbalanced data distribution problem and the common strategies for dealing with imbalanced datasets. We propose a model capable of handling imbalanced datasets on which other typical classifiers fail. The model adopts a derivation of support vector machines to select variables so that the problem of imbalanced data distribution can be relaxed. We then use an Emergent Self-Organizing Map (ESOM) to cluster the ranked features so as to provide clusters for unsupervised classification. This work proceeds by examining the efficiency of the model in evaluating imbalanced datasets. Experimental results show that the criterion based on the weight vector derivative achieves good results and performs consistently well over imbalanced datasets. In general, our approach outperforms other classification methods that are unable to handle the imbalanced data distribution in the testing datasets.
KW  - Emergent Self-Organizing Map
KW  - Imbalanced datasets
KW  - Support vector ranking
UR  - http://www.scopus.com/inward/record.url?scp=77957857196&partnerID=8YFLogxK
U2  - 10.1016/j.eswa.2010.05.054
DO  - 10.1016/j.eswa.2010.05.054
M3  - Article
AN  - SCOPUS:77957857196
SN  - 0957-4174
VL  - 37
SP  - 8303
EP  - 8312
JO  - Expert Systems with Applications
JF  - Expert Systems with Applications
IS  - 12
ER  - 
