Clustering

Last revised by Andrew Murphy on 24 Dec 2019

Citation:

Moore C, Murphy A, Weerakkody Y, Clustering. Reference article, Radiopaedia.org (Accessed on 07 May 2024) https://doi.org/10.53347/rID-70687

Permalink:

https://radiopaedia.org/articles/70687

rID:

70687

Article created:

31 Aug 2019, Candace Makeda Moore

Disclosures:

At the time the article was created Candace Makeda Moore had no recorded disclosures.

Disclosures:

At the time the article was last revised Andrew Murphy had no recorded disclosures.

Revisions:

3 times, by 3 contributors

Sections:

Artificial Intelligence

Tags:

machine learning

Clustering, also known as cluster analysis, is a machine learning technique designed to group similar data points together. Since the data points do not have to be labeled, clustering is an example of unsupervised learning. Clustering in machine learning should not be confused with the detection of disease clusters in epidemiology.

Many algorithms have been developed to achieve clustering, and the effectiveness of each depends largely on the size of the dataset and the distribution of the data points. The algorithm most commonly taught in machine learning courses is K-means, which partitions a dataset into K clusters, where K is chosen in advance. An example of a more advanced algorithm is Density-Based Spatial Clustering of Applications with Noise (DBSCAN), which is more effective for data distributed in a non-Gaussian manner, because it groups points by local density and can therefore find clusters of arbitrary shape.
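As a rough illustration of how K-means groups points, the sketch below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. It is a minimal sketch, not a reference implementation: the function name `kmeans`, the toy `points`, and the choice to seed the centroids with the first K points are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=100):
    """Minimal K-means sketch: alternate assignment and centroid update."""
    # Assumption: the first k points seed the centroids (simple and deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.8, 1.1), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point; the grouping emerges from distances alone. In practice the result depends on the initial centroids, so library implementations usually repeat the procedure from several random starts.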

In radiology (as well as pathology), clustering groups data, which may correspond to sets of images, reports or patients, according to similarities in their attributes or features, without being told in advance which labels to group by. Clustering therefore has the potential to reveal similarities in data that humans have overlooked.

In practice, clustering has proven useful in segmentation algorithms for radiology, which are used to identify different tissue types and/or to differentiate pathological from normal tissue. Clustering algorithms are also being researched in other areas, such as natural language processing of reports ¹.

Some of the more commonly used clustering algorithms in radiology, which have been in use for decades for the task of segmentation, include fuzzy C-means clustering and K-means clustering ²,³.
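To show how fuzzy C-means differs from K-means in a segmentation setting, the sketch below clusters a handful of pixel intensities. Instead of a single hard label, each value receives a degree of membership in every cluster, which can suit voxels that contain a mix of tissues. The function name `fuzzy_c_means`, the intensity values, the initial centers, and the fuzzifier `m=2.0` are assumptions chosen for illustration, not values taken from the cited studies.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50, eps=1e-9):
    """Fuzzy C-means sketch on 1-D values; returns (centers, memberships)."""
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: nearer centers receive larger weights,
        # and the memberships of each value sum to 1.
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center update: mean of the values weighted by membership ** m.
        new_centers = []
        for i in range(len(centers)):
            weights = [u[i] ** m for u in memberships]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < 1e-6:
            break  # centers have stopped moving
    return centers, memberships

# Hypothetical pixel intensities: dark tissue, bright tissue, one in-between value.
intensities = [10.0, 12.0, 11.0, 50.0, 90.0, 92.0, 91.0]
centers, memberships = fuzzy_c_means(intensities, centers=[0.0, 100.0])

for value, u in zip(intensities, memberships):
    print(value, [round(w, 3) for w in u])
```

The clearly dark and clearly bright intensities end up with memberships close to 1 in one cluster, while the in-between value is shared between the two. A hard segmentation can still be obtained by assigning each pixel to the cluster in which it has the highest membership.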