Improving clustering performance by incorporating uncertainty

Maha Bakoben, Anthony Bellotti, Niall Adams

Research output: Journal Publication, Article, peer-review

6 Citations (Scopus)

Abstract

In more challenging settings, the input to a clustering problem is not the raw data objects themselves but parametric statistical summaries of them. For example, time series of different lengths may be clustered on the basis of parameters estimated from autoregressive models. Such summary procedures usually also provide uncertainty estimates for the parameters, and ignoring this uncertainty affects the recovery of the true clusters. This paper is concerned with incorporating this source of uncertainty into the clustering procedure. A new dissimilarity measure is developed, based on the geometric overlap of the confidence ellipsoids implied by the uncertainty estimates. In extensive simulation studies and on a synthetic time series benchmark dataset, the new measure is shown to yield improved performance over standard approaches.
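The general idea in the abstract can be illustrated with a minimal sketch. It is not the measure developed in the paper, whose exact definition is not given here. Instead it assumes a least-squares AR(2) fit, an approximate 95% confidence ellipsoid for the two coefficients (chi-square radius 5.991, valid for two dimensions only), and a Jaccard-style overlap dissimilarity estimated by Monte Carlo sampling. The function names `simulate_ar`, `fit_ar`, `in_ellipsoid` and `overlap_dissimilarity` are hypothetical and chosen purely for illustration.

```python
import numpy as np


def simulate_ar(coefs, n, rng, burn_in=100):
    """Hypothetical helper: simulate an AR(p) series with unit-variance noise."""
    p = len(coefs)
    x = np.zeros(n + burn_in)
    noise = rng.standard_normal(n + burn_in)
    for t in range(p, n + burn_in):
        # x[t - p:t][::-1] is [x[t-1], ..., x[t-p]]
        x[t] = np.dot(coefs, x[t - p:t][::-1]) + noise[t]
    return x[burn_in:]


def fit_ar(series, p=2):
    """Least-squares AR(p) fit: coefficient estimates and their covariance."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    # Column k holds lag k+1, so each row is [x[t-1], ..., x[t-p]].
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    y = x[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - p)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, cov


def in_ellipsoid(points, center, cov, c):
    """True where the squared Mahalanobis distance to the center is <= c."""
    d = points - center
    m = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    return m <= c


def overlap_dissimilarity(b1, S1, b2, S2, c=5.991, n_samples=20000, seed=0):
    """Assumed stand-in measure: 1 - vol(intersection) / vol(union).

    Volumes are estimated by uniform sampling in a box that contains both
    ellipsoids. c=5.991 is the 95% chi-square quantile for 2 dimensions.
    """
    b1, b2 = np.asarray(b1, dtype=float), np.asarray(b2, dtype=float)
    rng = np.random.default_rng(seed)
    r1 = np.sqrt(c * np.diag(S1))  # half-widths of ellipsoid 1 along each axis
    r2 = np.sqrt(c * np.diag(S2))
    lo = np.minimum(b1 - r1, b2 - r2)
    hi = np.maximum(b1 + r1, b2 + r2)
    pts = rng.uniform(lo, hi, size=(n_samples, len(b1)))
    a = in_ellipsoid(pts, b1, S1, c)
    b = in_ellipsoid(pts, b2, S2, c)
    union = np.count_nonzero(a | b)
    inter = np.count_nonzero(a & b)
    return 1.0 - inter / union if union else 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Series of different lengths; the first two share the same AR(2) process.
    series = [
        simulate_ar([0.6, -0.3], 300, rng),
        simulate_ar([0.6, -0.3], 800, rng),
        simulate_ar([-0.5, 0.2], 500, rng),
    ]
    fits = [fit_ar(s, p=2) for s in series]
    n = len(fits)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = overlap_dissimilarity(*fits[i], *fits[j])
    print(np.round(D, 3))
```

Under these assumptions, the resulting matrix `D` could be passed to any clustering method that accepts precomputed dissimilarities, such as hierarchical clustering. A shorter series yields a larger ellipsoid, so its greater uncertainty directly influences how close it appears to other series.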

Original language: English
Pages (from-to): 28-34
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 77
Publication status: Published - 1 Jul 2016
Externally published: Yes

Keywords

  • Clustering with uncertainty
  • Confidence ellipsoids
  • Ellipsoid dissimilarity measures
  • Time series clustering

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
