Post by Sandro Saitta
Hello,
I want to try k-means with the Chebyshev distance. I know it is
possible to use Euclidean and city block distances but what about
Chebyshev? Is there a way to use this distance with the standard
k-means MATLAB function?

K-means (the algorithm) requires two things:
1) a distance measure, and
2) a way to compute the centroid of a cluster so as to minimize the sum of
point-to-centroid distances.
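
To make those two requirements concrete, here is a minimal sketch of the algorithm with both pieces passed in as functions. It is written in Python rather than MATLAB, it is not the KMEANS implementation, and all of the names (kmeans, sq_euclidean, mean_centroid) are made up for illustration. The starting centers are supplied by the caller so the result is deterministic.

```python
def sq_euclidean(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_centroid(points):
    """Arithmetic mean of a non-empty list of points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, init_centers, distance, centroid, iters=100):
    """Alternate between assigning points and recomputing centers.

    `distance` is requirement 1, `centroid` is requirement 2. The centroid
    function should minimize the summed distance for the chosen measure;
    otherwise the iteration is not guaranteed to decrease the objective.
    """
    centers = list(init_centers)
    k = len(centers)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each center from its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = centroid(members)
    return labels, centers


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(pts, [(0, 0), (10, 10)], sq_euclidean, mean_centroid)
print(labels, centers)
```

Swapping in a different distance function is trivial here. Finding a matching centroid function is the hard part.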
There are lots of distance measures, but few have a simple computation for the
centroid. For example, the default distance used by KMEANS (the function) is
most definitely NOT "Euclidean", but rather "squared Euclidean". With squared
Euclidean distance, the centroid is simply the arithmetic mean. With plain
Euclidean distance, it is a difficult computation requiring an iterative
solution (the calculation has a hyphenated name attached to it that escapes me).
That's why there is no Chebyshev distance in KMEANS: it has no comparably simple
centroid calculation.
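
To illustrate the difference between the two centroid calculations, here is a small Python sketch. The point minimizing the sum of plain Euclidean distances is usually called the geometric median (the minimization is known as the Fermat-Weber problem), and a common iterative scheme for it is Weiszfeld's algorithm. The function names, the tolerance, and the early exit on hitting a data point are assumptions made for this example, not anything taken from KMEANS.

```python
import math


def mean(points):
    """Arithmetic mean: minimizes the sum of SQUARED Euclidean distances."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def geometric_median(points, iters=200, tol=1e-9):
    """Weiszfeld iteration: approximates the minimizer of the sum of
    (unsquared) Euclidean distances by repeated reweighted averaging."""
    y = mean(points)  # starting guess
    for _ in range(iters):
        num = [0.0] * len(y)
        den = 0.0
        for p in points:
            d = math.dist(p, y)
            if d < 1e-12:
                # Simplification for this sketch: stop if the estimate lands
                # on a data point, where the weight 1/d is undefined.
                return y
            w = 1.0 / d
            den += w
            for i, a in enumerate(p):
                num[i] += w * a
        new_y = tuple(v / den for v in num)
        if math.dist(new_y, y) < tol:
            return new_y
        y = new_y
    return y


def total_distance(points, c):
    """Sum of plain Euclidean distances from every point to c."""
    return sum(math.dist(p, c) for p in points)


pts = [(0, 0), (1, 0), (0, 1), (10, 10)]
m = mean(pts)
gm = geometric_median(pts)
print(m, total_distance(pts, m))
print(gm, total_distance(pts, gm))
```

For these four points the mean is (2.75, 2.75), pulled toward the outlier, while the iteration settles near (0.5, 0.5), where the diagonals of the quadrilateral cross, with a smaller total distance.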
That being said, there is a "coordinate-free" version of K-means (the algorithm)
that does not require a centroid calculation. However, it is not the standard
thing that most people want, and requires a distance matrix rather than raw
data, limiting its use for large datasets (which is what most people use K-means
for). Is that of interest to you?
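
One way to work purely from a distance matrix is to restrict each cluster center to be one of the data points (a medoid). This may or may not be the exact variant meant above; treat the following Python sketch as one possible reading, with hypothetical function names. It uses Chebyshev distance only to build the matrix, and the clustering loop itself never looks at the coordinates.

```python
def chebyshev(p, q):
    """Chebyshev (maximum coordinate difference) distance."""
    return max(abs(a - b) for a, b in zip(p, q))


def distance_matrix(points, distance):
    """Full n-by-n matrix of pairwise distances. Memory grows as n squared."""
    return [[distance(p, q) for q in points] for p in points]


def kmedoids(D, init_medoids, iters=100):
    """Cluster using only the distance matrix D.

    init_medoids is a list of starting point indices. Each center is always
    one of the data points, so no centroid formula is needed.
    """
    n = len(D)
    medoids = list(init_medoids)
    k = len(medoids)
    labels = None
    for _ in range(iters):
        # Assignment step: nearest medoid according to D.
        new_labels = [
            min(range(k), key=lambda j: D[i][medoids[j]]) for i in range(n)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: the new medoid is the member with the smallest
        # summed distance to the other members of its cluster.
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if members:
                medoids[j] = min(
                    members, key=lambda c: sum(D[c][i] for i in members)
                )
    return labels, medoids


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
D = distance_matrix(pts, chebyshev)
labels, medoids = kmedoids(D, [0, 3])
print(labels, medoids)
```

The n-by-n matrix is the price: it is what limits this approach on large datasets.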
It's certainly also possible to modify KMEANS to use Chebyshev distance and some
unsuitable centroid calculation, but you're on your own there.
- Peter Perkins
The MathWorks, Inc.