I spent some time today trying to figure out which clustering packages are already available for Python. It seems that all the bleeding-edge clustering methods like CHAMELEON, BIRCH, CURE and CLARANS still lack an available implementation. Actually, these methods are missing much more than a Python port: they are described in the literature but not available to the general public. I wonder if this is just another copyright issue …
However, I found a nice package for C that is called, guess what? Cluster. This package implements k-means, k-medians, k-medoids, hierarchical clustering (treecluster) and Self-Organising Maps. Not much, but better than nothing. A Python module is also available: Pycluster.
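To make the simplest of those methods concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is only an illustration and is not the Cluster or Pycluster API: the function name `kmeans`, its parameters and the toy data are my own assumptions, and a real library will be faster and handle far more edge cases.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Group points (a list of equal-length tuples of floats) into k clusters.

    Hypothetical helper for illustration only; not part of Pycluster.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave the centroid of an empty cluster where it is
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Toy data: two well-separated blobs.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The first three points should end up with one label and the last three with the other; which blob gets label 0 depends on the random start. k-medians and k-medoids follow the same assign/update loop, with the mean replaced by the median or by the most central member of each cluster.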
A more complete environment seems to be gCLUTO (CLUstering TOolkit). However, it doesn't seem to be easily callable from Python. I have to look into it further.
Tags: algorithmic information theory, clustering, spatial clustering, statistics