Implement the **K-Means Clustering** algorithm from scratch.
Algorithm:
1. Randomly initialize k centroids from the data points
2. **Assign:** Each point to the nearest centroid (using Euclidean distance)
3. **Update:** Recompute each centroid as the mean of its assigned points
4. Repeat steps 2-3 until centroids stop changing or max_iterations reached
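The steps above can be sketched as follows. This is a minimal reference implementation assuming NumPy; the seeded initialization and the keep-in-place handling of empty clusters are choices of this sketch, not requirements stated in the problem:

```python
import numpy as np

class KMeans:
    """Minimal K-Means following the assign/update loop above."""

    def __init__(self, n_clusters=3, max_iterations=100, random_state=0):
        self.n_clusters = n_clusters
        self.max_iterations = max_iterations
        self.random_state = random_state
        self.centroids = None

    def fit(self, X):
        rng = np.random.default_rng(self.random_state)
        # Step 1: pick k distinct data points as starting centroids
        idx = rng.choice(len(X), size=self.n_clusters, replace=False)
        self.centroids = X[idx].astype(float)
        for _ in range(self.max_iterations):
            labels = self.predict(X)
            # Step 3: move each centroid to the mean of its assigned points;
            # a centroid with no assigned points stays where it is
            new_centroids = np.array([
                X[labels == c].mean(axis=0) if np.any(labels == c)
                else self.centroids[c]
                for c in range(self.n_clusters)
            ])
            # Step 4: stop once the centroids no longer move
            if np.allclose(new_centroids, self.centroids):
                break
            self.centroids = new_centroids
        return self

    def predict(self, X):
        # Step 2: Euclidean distance from every point to every centroid,
        # giving an (n_samples, n_clusters) matrix; take the nearest
        dists = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        return np.argmin(dists, axis=1)
```

Broadcasting the `(n, 1, f)` points against the `(1, k, f)` centroids avoids an explicit double loop over points and clusters.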
Example:

```python
km = KMeans(n_clusters=3)
km.fit(X)
labels = km.predict(X)   # Cluster assignments for each point
print(km.centroids)      # Final centroid positions
```
**Explanation:** K-Means partitions n data points into k clusters by minimizing the within-cluster sum of squared distances. Both the assign step and the update step can only decrease this objective or leave it unchanged, so the algorithm is guaranteed to converge, though only to a local minimum.
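The objective being minimized can be computed directly. A small helper, assuming NumPy arrays `X`, integer `labels`, and a `(k, n_features)` `centroids` array (the function name `inertia` is this sketch's choice, borrowed from common usage, not part of the required API):

```python
import numpy as np

def inertia(X, labels, centroids):
    """Within-cluster sum of squared distances (the K-Means objective)."""
    # centroids[labels] expands to each point's assigned centroid
    return float(np.sum((X - centroids[labels]) ** 2))
```

Evaluating this before and after each assign/update step is a handy way to check an implementation: the value should never increase between iterations.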
Constraints:
Test Cases
Test Case 1
Input: 3 well-separated clusters, k=3
Expected: 3 distinct labels assigned

Test Case 2
Input: centroids shape after fit = (k, n_features)
Expected: True

Test Case 3
Input: predict returns array of length n_samples
Expected: True

+ 2 hidden test cases