On Fuzzy c-Means clustering for uncertain data using quadratic regularization of penalty vectors

Yasunori Endo, Yukihiro Hamasuna, Yuchi Kanzawa, Sadaaki Miyamoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, data from many natural and social phenomena have been accumulated into huge databases across the worldwide network of computers. Advanced data analysis techniques that exploit today's computing power to extract valuable knowledge from such data are therefore required. Clustering is one of the unsupervised classification techniques of data analysis, and hard and fuzzy c-means clustering are the most typical clustering techniques. In clustering, information in a real space is transformed into data in a pattern space and then analyzed. However, the data should often be represented not by a point but by a set because of uncertainty in the data, e.g., measurement error margins, data that cannot be regarded as a single point, and missing values. These uncertainties have been represented as interval ranges, and many clustering algorithms for such interval data have been constructed. However, no guideline for selecting a suitable distance in each case has been given, so this selection problem is difficult. Therefore, methods that calculate the dissimilarity between such uncertain data without introducing a particular distance, e.g., the nearest-neighbor distance, have been strongly desired. From this viewpoint, we have proposed the concept of tolerance, which represents an uncertain datum not as an interval but as a point with a tolerance vector. In this paper, we try to remove the constraint on tolerance vectors by using quadratic regularization of a penalty vector, which is similar to a tolerance vector, and we propose new clustering algorithms for uncertain data by considering the optimization problems and deriving their optimal solutions, in order to handle such uncertainty more appropriately.
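
The abstract does not state the objective function, so the following is only a minimal sketch under an assumed formulation: each datum x_k receives an unconstrained penalty vector d_k, and the objective is J = sum_k sum_i u_ki^m ||x_k + d_k - v_i||^2 + sum_k w_k ||d_k||^2. The function name `fcm_penalty`, the regularization weights `w`, and the fuzzifier `m` are hypothetical names chosen for illustration; the update rules come from setting the partial derivatives of this assumed J to zero and are not taken from the paper.

```python
import random

def fcm_penalty(X, c, w, m=2.0, iters=50, seed=0):
    """Alternating optimisation of the ASSUMED objective
    J = sum_k sum_i u[k][i]**m * ||x_k + d_k - v_i||^2 + sum_k w[k] * ||d_k||^2.
    X: list of n points (lists of p floats), c: number of clusters,
    w: list of n positive regularization weights (hypothetical parameter)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    V = [list(x) for x in rng.sample(X, c)]   # initial centres: c distinct data points
    D = [[0.0] * p for _ in range(n)]         # penalty vectors start at zero
    U = [[1.0 / c] * c for _ in range(n)]     # uniform initial memberships
    for _ in range(iters):
        # Membership update: standard fuzzy c-means rule on the shifted points x_k + d_k.
        for k in range(n):
            y = [X[k][j] + D[k][j] for j in range(p)]
            d2 = [max(sum((y[j] - V[i][j]) ** 2 for j in range(p)), 1e-12)
                  for i in range(c)]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            s = sum(inv)
            U[k] = [t / s for t in inv]
        # Centre update: weighted mean of the shifted points.
        for i in range(c):
            wsum = sum(U[k][i] ** m for k in range(n))
            V[i] = [sum(U[k][i] ** m * (X[k][j] + D[k][j]) for k in range(n)) / wsum
                    for j in range(p)]
        # Penalty update: from dJ/dd_k = 0,
        # d_k = -sum_i u^m (x_k - v_i) / (sum_i u^m + w_k).
        for k in range(n):
            um = [U[k][i] ** m for i in range(c)]
            denom = sum(um) + w[k]
            D[k] = [-sum(um[i] * (X[k][j] - V[i][j]) for i in range(c)) / denom
                    for j in range(p)]
    return U, V, D

if __name__ == "__main__":
    X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
    U, V, D = fcm_penalty(X, c=2, w=[1.0] * len(X))
    print("centres:", V)
    print("penalty vectors:", D)
```

In this sketch a large weight w_k drives d_k toward zero, so the procedure approaches ordinary fuzzy c-means, while a small w_k lets the datum move more freely toward the cluster centres. No hard bound on the length of d_k appears anywhere, which is one way to read "removing the constraint for tolerance vectors".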

Original language: English
Title of host publication: 2009 IEEE International Conference on Granular Computing, GRC 2009
Pages: 148-153
Number of pages: 6
DOI: https://doi.org/10.1109/GRC.2009.5255142
Publication status: Published - 2009
Event: 2009 IEEE International Conference on Granular Computing, GRC 2009 - Nanchang
Duration: 2009 Aug 17 to 2009 Aug 19

Fingerprint

Clustering algorithms
Measurement errors
Uncertainty

Keywords

  • Fuzzy c-means clustering
  • Penalty vector
  • Quadratic regularization
  • Tolerance
  • Uncertain data

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software

Cite this

Endo, Y., Hamasuna, Y., Kanzawa, Y., & Miyamoto, S. (2009). On Fuzzy c-Means clustering for uncertain data using quadratic regularization of penalty vectors. In 2009 IEEE International Conference on Granular Computing, GRC 2009 (pp. 148-153). [5255142] https://doi.org/10.1109/GRC.2009.5255142

@inproceedings{30a209dbe2634a229fa82cd0abe86e8c,
title = "On Fuzzy c-Means clustering for uncertain data using quadratic regularization of penalty vectors",
keywords = "Fuzzy c-means clustering, Penalty vector, Quadratic regularization, Tolerance, Uncertain data",
author = "Yasunori Endo and Yukihiro Hamasuna and Yuchi Kanzawa and Sadaaki Miyamoto",
year = "2009",
doi = "10.1109/GRC.2009.5255142",
language = "English",
isbn = "9781424448319",
pages = "148--153",
booktitle = "2009 IEEE International Conference on Granular Computing, GRC 2009",

}

TY - GEN

T1 - On Fuzzy c-Means clustering for uncertain data using quadratic regularization of penalty vectors

AU - Endo, Yasunori

AU - Hamasuna, Yukihiro

AU - Kanzawa, Yuchi

AU - Miyamoto, Sadaaki

PY - 2009

Y1 - 2009

KW - Fuzzy c-means clustering

KW - Penalty vector

KW - Quadratic regularization

KW - Tolerance

KW - Uncertain data

UR - http://www.scopus.com/inward/record.url?scp=70449956056&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449956056&partnerID=8YFLogxK

U2 - 10.1109/GRC.2009.5255142

DO - 10.1109/GRC.2009.5255142

M3 - Conference contribution

SN - 9781424448319

SP - 148

EP - 153

BT - 2009 IEEE International Conference on Granular Computing, GRC 2009

ER -