An analysis of k-means shows that the traditional algorithm has several shortcomings: it requires the user to specify the number of clusters k in advance, it is sensitive to the initial cluster centers and to noise and isolated data, it is only suited to globular clusters, and it is easily trapped in a local optimum. The improved algorithm uses the potential of the data to find the center data and eliminate the noise data. It decomposes large or extended clusters into several small clusters, then merges adjacent small c...
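
To make the idea of a data "potential" concrete, the sketch below shows one way such a quantity could be computed and used. It is an illustration under stated assumptions, not the algorithm described here: the abstract does not define the potential, so the code assumes the Gaussian form used in subtractive clustering, and the function names `data_potential` and `centers_and_noise`, the `radius` parameter, and the `noise_ratio` threshold are all hypothetical.

```python
import numpy as np


def data_potential(X, radius=1.0):
    """Return an assumed Gaussian potential for every point in X.

    Points in dense regions receive a large potential, while isolated
    points stay close to 1 (their own contribution). The Gaussian form
    is an assumption borrowed from subtractive clustering.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting: shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    alpha = 4.0 / radius ** 2  # neighbourhood width (assumed)
    return np.exp(-alpha * sq_dist).sum(axis=1)


def centers_and_noise(X, radius=1.0, noise_ratio=0.5):
    """Pick one candidate center and flag low-potential points as noise.

    The highest-potential point is treated as a candidate center; points
    whose potential is below noise_ratio * max potential are flagged as
    noise. Both rules are illustrative choices, not taken from the source.
    """
    potential = data_potential(X, radius)
    center_idx = int(np.argmax(potential))
    noise_mask = potential < noise_ratio * potential.max()
    return center_idx, noise_mask


if __name__ == "__main__":
    # Two tight groups plus one isolated point far from both.
    X = np.array([
        [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
        [20.0, 20.0],
    ])
    center_idx, noise_mask = centers_and_noise(X, radius=1.0, noise_ratio=0.5)
    print("candidate center:", X[center_idx])
    print("noise points:", X[noise_mask])
```

In this toy setup the isolated point has the lowest potential and is the only one flagged as noise, while the candidate center falls inside one of the dense groups. The pairwise distance matrix makes the sketch quadratic in the number of points, which is acceptable for illustration but would need a neighbor search structure for large data sets.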