What Does K-Means Clustering Mean?
K-means clustering is a simple unsupervised learning
algorithm that is used to solve clustering problems. It follows a simple
procedure of classifying a given data set into a number of clusters, defined by
the letter “k,” which is fixed beforehand. The clusters are then positioned as
points and all observations or data points are associated with the nearest
cluster, computed, adjusted and then the process starts over using the new
adjustments until a desired result is reached.
K-means clustering has uses in search engines, market segmentation, statistics and even astronomy.
Techopedia Explains K-Means Clustering
K-means clustering is a method used for clustering analysis, especially in data mining and statistics. It aims to partition a set of observations into a number of clusters (k), resulting in the partitioning of the data into Voronoi cells. It can be considered a method of finding out which group a certain object really belongs to.
It is used mainly in statistics and can be applied to almost any branch of study. For example, in marketing, it can be used to group different demographics of people into simple groups that make it easier for marketers to target. Astronomers use it to sift through huge amounts of astronomical data; since they cannot analyze each object one by one, they need a way to statistically find points of interest for observation and investigation.
The algorithm:
- K points are placed into the object data space representing the initial group of centroids.
- Each object or data point is assigned into the closest k.
- After all objects are assigned, the positions of the k centroids are recalculated.
- Steps 2 and 3 are repeated until the positions of the centroids no longer move.