- Fit a probability density function to the data. I would use a mixture of Gaussians (MoG), fitted with Expectation-Maximisation (EM) and initialised from a K-means solution. K-means by itself can sometimes be sufficient, without the EM step. The number of clusters has to be chosen beforehand with a model order selection method.
- Then, each point can be scored with p(x) using the model, i.e. evaluate the likelihood (the probability density) of the point under the fitted model.
- Find the local maxima of p(x) to locate the cluster centroids.
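
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are mine, not part of the recipe: the data is one-dimensional and synthetic, BIC stands in for the model order selection step, and every function name (`kmeans_1d`, `fit_mog`, `density`, `bic`) is made up for the example. It uses plain Python rather than Matlab so that it runs without a toolbox.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def kmeans_1d(data, k, iters=20):
    """Plain K-means, used here only to initialise the EM means."""
    srt = sorted(data)
    # Assumption: start the centres at evenly spaced quantiles (deterministic).
    centres = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: (x - centres[j]) ** 2)
            groups[nearest].append(x)
        # Keep the old centre if a group ends up empty.
        centres = [sum(g) / len(g) if g else centres[j] for j, g in enumerate(groups)]
    return centres


def fit_mog(data, k, iters=100, var_floor=1e-6):
    """Fit a k-component 1-D mixture of Gaussians with EM."""
    n = len(data)
    means = kmeans_1d(data, k)
    grand_mean = sum(data) / n
    total_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(total_var, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances


def density(x, model):
    """p(x) under the fitted mixture."""
    weights, means, variances = model
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def bic(data, model):
    """Bayesian information criterion (lower is better)."""
    k = len(model[0])
    log_lik = sum(math.log(max(density(x, model), 1e-300)) for x in data)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return n_params * math.log(len(data)) - 2.0 * log_lik


# Synthetic data: two clusters, purely for illustration.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(8.0, 1.0) for _ in range(200)]

# Model order selection: fit several candidate orders, keep the lowest BIC.
models = {k: fit_mog(data, k) for k in (1, 2, 3)}
best_k = min(models, key=lambda k: bic(data, models[k]))
best = models[best_k]

# Score every point with p(x); the component means act as centroid estimates.
scores = [density(x, best) for x in data]
centroids = sorted(best[1])
peak = max(data, key=lambda x: density(x, best))  # densest observed point
```

For real, multi-dimensional data a library implementation with full covariance matrices would be the more sensible choice; the sketch is only meant to show how the K-means initialisation, the EM fit, the order selection and the p(x) scoring fit together.
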
This can be coded very quickly in a tool like Matlab using a machine learning toolbox. MoG, EM learning and K-means clustering are discussed widely on the web and in standard texts. My favourite text is "Pattern Classification" by Duda, Hart and Stork.