Classification of Cancer Types by Cluster Analysis Methods
Aynur İncekırık,
Öznur İşçi Güneri and
Burcu Durmuş
Alphanumeric Journal, 2021, vol. 9, issue 1, 125-142
Abstract:
Cluster analysis is a family of methods that classify multivariate observations using similarity or dissimilarity measures between them. The resulting clusters should be homogeneous internally and heterogeneous with respect to one another. This study clusters cancer types in datasets built from age-group characteristics, separately by gender. Cluster analysis was applied to four datasets constructed from Australian Institute of Health and Welfare records, registered between 1982 and 2016, covering 57 cancer types in men and women by age group, and the results were evaluated and interpreted. Cophenetic correlation coefficients were used to select the clustering method, and 26 cluster validity indices were used to determine the number of clusters. The distribution of cancer types across gender-specific age groups was observed in the four datasets, which were created with the three age-group characteristics that best separated the cancer groups, and the clustering tendencies of cancers in those age groups were investigated. All analyses were carried out in R 3.5.1. Results are reported for the k-means method and for the average linkage method, which was judged the most successful method because of its high cophenetic correlation coefficient. The cluster validity indices indicated three clusters. The results show that breast cancer in women and prostate cancer in men are the most common cancer types in the age group of 40 and above, and that each of these cancers forms a cluster on its own. The 0-14 age group characteristic, by contrast, fails to separate the clusters.
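To illustrate how a cophenetic correlation coefficient can rank linkage methods, the following is a minimal sketch in Python rather than the authors' R code. The data points are hypothetical (three made-up features per observation, standing in for three age-group characteristics), and the function name `cophenetic_correlation` is an assumption introduced here. It runs a naive agglomerative clustering and correlates the original pairwise distances with the merge heights at which each pair first joins the same cluster.

```python
from itertools import combinations
from math import dist, sqrt
from statistics import mean


def cophenetic_correlation(points, agg):
    """Naive agglomerative clustering; `agg` turns the list of cross-cluster
    distances into a linkage distance (min = single, max = complete,
    mean = average). Returns the Pearson correlation between the original
    and the cophenetic distances."""
    n = len(points)
    pairs = list(combinations(range(n), 2))
    d = {p: dist(points[p[0]], points[p[1]]) for p in pairs}
    clusters = {i: [i] for i in range(n)}
    coph = {}

    def gap(a, b):
        # Linkage distance between clusters a and b under `agg`.
        return agg([d[min(i, j), max(i, j)]
                    for i in clusters[a] for j in clusters[b]])

    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(sorted(clusters), 2), key=lambda ab: gap(*ab))
        height = gap(a, b)
        # Every pair split across a and b first meets at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[min(i, j), max(i, j)] = height
        clusters[a] += clusters.pop(b)

    x = [d[p] for p in pairs]
    y = [coph[p] for p in pairs]
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical observations: NOT data from the study.
points = [
    (1.0, 1.0, 1.0), (1.2, 0.9, 1.1), (0.9, 1.1, 1.0),
    (5.0, 5.0, 5.0), (5.1, 4.9, 5.2), (4.8, 5.2, 5.0),
    (10.0, 1.0, 9.0), (10.2, 1.1, 9.1),
]

linkages = {"single": min, "complete": max, "average": mean}
scores = {name: cophenetic_correlation(points, agg)
          for name, agg in linkages.items()}

for name, score in scores.items():
    print(f"{name:>8}: {score:.3f}")
```

A coefficient closer to 1 means the dendrogram distorts the original distances less, which is the sense in which one linkage method can be preferred over another. The quadratic search used here is only suitable for small inputs; library implementations are far more efficient.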
Keywords: Cancer Types; Cluster Analysis; Cluster Validity Index; Cophenetic Correlation Coefficient; K-means
JEL-codes: C01
Date: 2021
Downloads: (external link)
https://www.alphanumericjournal.com/media/Issue/vo ... analysis-methods.pdf (application/pdf)
https://alphanumericjournal.com/article/classifica ... ter-analysis-methods (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:anm:alpnmr:v:9:y:2021:i:1:p:125-142
DOI: 10.17093/alphanumeric.949958