Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization

Yasunori Endo, Yasushi Hasegawa, Yukihiro Hamasuna, Yuchi Kanzawa

Research output: Contribution to journal (Article)

14 Citations (Scopus)

Abstract

Clustering, defined as an unsupervised data-analysis classification that transforms real-space information into data in pattern space and analyzes it, may require that data be represented by a set rather than by points because of data uncertainty, e.g., measurement error margins, data regarded as one point, or missing values. These data uncertainties have been represented as interval ranges, for which many clustering algorithms have been constructed, but the lack of guidelines for selecting among the available distances in individual cases has made selection difficult and raised the need for ways to calculate dissimilarity between uncertain data without introducing a nearest-neighbor or other distance. The tolerance concept we propose represents uncertain data as a point with a tolerance vector, not as an interval. While this is convenient for handling uncertain data, tolerance-vector constraints make mathematical development difficult. We attempt to remove the tolerance-vector constraints by using quadratic penalty-vector regularization, in which the penalty vector is similar to the tolerance vector. We also propose clustering algorithms for uncertain data that treat clustering as an optimization problem and obtain an optimal solution so that uncertainty is handled appropriately.
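
The abstract does not state the objective function, so the following is only a minimal sketch of how a quadratic penalty-vector term might be attached to fuzzy c-means. It assumes an objective of the form J = sum_k sum_i u_ki^m ||x_k + d_k - v_i||^2 + sum_k w_k ||d_k||^2, where d_k is a per-point penalty vector and w_k a per-point weight, and it alternates the closed-form updates that follow from setting the partial derivatives of that assumed J to zero. The function name `fcm_penalty`, the parameters `weights` and `init_centers`, and the update order are illustrative assumptions, not the authors' published algorithm.

```python
def fcm_penalty(points, weights, init_centers, m=2.0, iters=50):
    """Alternating optimization for an ASSUMED objective (not from the paper):
        J = sum_k sum_i u[k][i]**m * ||x_k + d_k - v_i||^2
            + sum_k weights[k] * ||d_k||^2
    Returns (memberships u, centers v, penalty vectors d)."""
    n, dim, c = len(points), len(points[0]), len(init_centers)
    centers = [list(v) for v in init_centers]
    offsets = [[0.0] * dim for _ in range(n)]  # penalty vectors d_k
    u = [[1.0 / c] * c for _ in range(n)]

    for _ in range(iters):
        # Membership update: standard fuzzy c-means rule on shifted points.
        for k in range(n):
            y = [points[k][j] + offsets[k][j] for j in range(dim)]
            d2 = [max(sum((y[j] - centers[i][j]) ** 2 for j in range(dim)), 1e-12)
                  for i in range(c)]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u[k] = [val / total for val in inv]

        # Center update: membership-weighted mean of the shifted points.
        for i in range(c):
            wts = [u[k][i] ** m for k in range(n)]
            total = sum(wts)
            centers[i] = [
                sum(wts[k] * (points[k][j] + offsets[k][j]) for k in range(n)) / total
                for j in range(dim)
            ]

        # Penalty-vector update: stationary point of the assumed J in d_k.
        # No constraint is imposed on d_k; the weight w_k limits its size.
        for k in range(n):
            wts = [u[k][i] ** m for i in range(c)]
            denom = sum(wts) + weights[k]
            offsets[k] = [
                -sum(wts[i] * (points[k][j] - centers[i][j]) for i in range(c)) / denom
                for j in range(dim)
            ]

    return u, centers, offsets
```

Under this assumed formulation, a large `weights[k]` keeps the penalty vector of point k near zero, so the point behaves like ordinary certain data, while a small weight lets the point move further toward the cluster centers. Calling `fcm_penalty(points, [1.0] * len(points), init_centers)` with one initial center per cluster returns the memberships, centers, and penalty vectors.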

Original language: English
Pages (from-to): 76-82
Number of pages: 7
Journal: Journal of Advanced Computational Intelligence and Intelligent Informatics
Volume: 15
Issue number: 1
Publication status: Published - Jan 2011

Fingerprint

Clustering algorithms
Measurement errors
Uncertainty

Keywords

  • Clustering
  • Fuzzy c-means
  • Optimization
  • Penalty vector
  • Uncertain data

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction

Cite this

Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization. / Endo, Yasunori; Hasegawa, Yasushi; Hamasuna, Yukihiro; Kanzawa, Yuchi.

In: Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 15, No. 1, 01.2011, p. 76-82.

@article{ea9700e0683a4290b23491ef751e756e,
title = "Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization",
abstract = "Clustering - defined as an unsupervised data-analysis classification transforming real-space information into data in pattern space and analyzing it - may require that data be represented by a set, rather than points, due to data uncertainty, e.g., measurement error margin, data regarded as one point, or missing values. These data uncertainties have been represented as interval ranges for which many clustering algorithms are constructed, but the lack of guidelines in selecting available distances in individual cases has made selection difficult and raised the need for ways to calculate dissimilarity between uncertain data without introducing a nearest-neighbor or other distance. The tolerance concept we propose represents uncertain data as a point with a tolerance vector, not as an interval. While this is convenient for handling uncertain data, tolerance-vector constraints make mathematical development difficult. We attempt to remove the tolerance-vector constraints using quadratic penalty-vector regularization similar to the tolerance vector. We also propose clustering algorithms for uncertain data considering optimization and obtaining an optimal solution to handle uncertainty appropriately.",
keywords = "Clustering, Fuzzy c-means, Optimization, Penalty vector, Uncertain data",
author = "Yasunori Endo and Yasushi Hasegawa and Yukihiro Hamasuna and Yuchi Kanzawa",
year = "2011",
month = "1",
language = "English",
volume = "15",
pages = "76--82",
journal = "Journal of Advanced Computational Intelligence and Intelligent Informatics",
issn = "1343-0130",
publisher = "Fuji Technology Press",
number = "1",

}

TY - JOUR

T1 - Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization

AU - Endo, Yasunori

AU - Hasegawa, Yasushi

AU - Hamasuna, Yukihiro

AU - Kanzawa, Yuchi

PY - 2011/1

Y1 - 2011/1

N2 - Clustering - defined as an unsupervised data-analysis classification transforming real-space information into data in pattern space and analyzing it - may require that data be represented by a set, rather than points, due to data uncertainty, e.g., measurement error margin, data regarded as one point, or missing values. These data uncertainties have been represented as interval ranges for which many clustering algorithms are constructed, but the lack of guidelines in selecting available distances in individual cases has made selection difficult and raised the need for ways to calculate dissimilarity between uncertain data without introducing a nearest-neighbor or other distance. The tolerance concept we propose represents uncertain data as a point with a tolerance vector, not as an interval. While this is convenient for handling uncertain data, tolerance-vector constraints make mathematical development difficult. We attempt to remove the tolerance-vector constraints using quadratic penalty-vector regularization similar to the tolerance vector. We also propose clustering algorithms for uncertain data considering optimization and obtaining an optimal solution to handle uncertainty appropriately.

KW - Clustering

KW - Fuzzy c-means

KW - Optimization

KW - Penalty vector

KW - Uncertain data

UR - http://www.scopus.com/inward/record.url?scp=78751630281&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78751630281&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:78751630281

VL - 15

SP - 76

EP - 82

JO - Journal of Advanced Computational Intelligence and Intelligent Informatics

JF - Journal of Advanced Computational Intelligence and Intelligent Informatics

SN - 1343-0130

IS - 1

ER -