Accuracy of clustering prediction of PAM and K-modes algorithms

Marc Gregory Dixon, Stanimir Genov, Vasil Hnatyshin, Umashanger Thayasivam

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Grouping (or clustering) data points with similar characteristics is important when working with the kinds of data that appear frequently in everyday life. Data scientists cluster numerical data based on the notion of distance, usually computed using the Euclidean measure. However, many datasets consist of categorical values, which require alternative methods for grouping the data. For this reason, clustering of categorical data employs methods that rely on similarity between the values rather than distance. This work studies the ability of different clustering algorithms and several definitions of similarity to organize categorical data into groups.
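As an illustration of similarity-based grouping of categorical records, the sketch below uses simple matching dissimilarity (the number of attributes on which two records differ). The abstract does not say which similarity definitions the paper evaluates, so this measure, the function names, and the toy records are all assumptions chosen for illustration. The two helpers show the kind of cluster centre each algorithm family works with: an attribute-wise mode (K-modes style) and a medoid, which is an actual record from the cluster (PAM style).

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Attribute-wise most frequent value: a K-modes style centre."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def cluster_medoid(records):
    """Record with the smallest total dissimilarity to all records: a PAM style centre."""
    return min(
        records,
        key=lambda r: sum(matching_dissimilarity(r, other) for other in records),
    )


# Hypothetical toy cluster of categorical records (colour, size, shape).
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
    ("red", "small", "square"),
]

mode = cluster_mode(records)
medoid = cluster_medoid(records)
```

On this toy cluster the mode and the medoid coincide, but in general the mode need not be a record that occurs in the data, whereas a medoid always is.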

Original language: English (US)
Title of host publication: Advances in Information and Communication Networks - Proceedings of the 2018 Future of Information and Communication Conference FICC, Vol. 1
Editors: Rahul Bhatia, Kohei Arai, Supriya Kapoor
Publisher: Springer Verlag
Pages: 330-345
Number of pages: 16
ISBN (Print): 9783030034016
DOI: https://doi.org/10.1007/978-3-030-03402-3_22
State: Published - Jan 1 2019
Event: Future of Information and Communication Conference, FICC 2018 - Singapore, Singapore
Duration: Apr 5 2018 to Apr 6 2018

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 886

Other

Other: Future of Information and Communication Conference, FICC 2018
Country: Singapore
City: Singapore
Period: 4/5/18 to 4/6/18

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science (all)

Cite this

Dixon, M. G., Genov, S., Hnatyshin, V., & Thayasivam, U. (2019). Accuracy of clustering prediction of PAM and K-modes algorithms. In R. Bhatia, K. Arai, & S. Kapoor (Eds.), Advances in Information and Communication Networks - Proceedings of the 2018 Future of Information and Communication Conference FICC, Vol. 1 (pp. 330-345). (Advances in Intelligent Systems and Computing; Vol. 886). Springer Verlag. https://doi.org/10.1007/978-3-030-03402-3_22