Click4Ai

84.

Medium

Implement the **Silhouette Score** for evaluating clustering quality.

For each point i:

a(i) = average distance to all other points in the SAME cluster

b(i) = smallest average distance from i to the points of any single OTHER cluster (that is, the nearest neighboring cluster)

s(i) = (b(i) - a(i)) / max(a(i), b(i))

The overall silhouette score is the mean of s(i) across all points.
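
To make the three definitions concrete, here is a minimal sketch that computes a(i), b(i), and s(i) for a single point. The helper name `silhouette_of_point` and the 1-D sample data are hypothetical, chosen only for illustration, and the sketch assumes the point has at least one cluster mate and that a second cluster exists.

```python
import math


def silhouette_of_point(X, labels, i):
    """Return (a, b, s) for point i.

    Assumes point i shares its cluster with at least one other point
    and that at least one other cluster exists.
    """
    # Group the distances from point i to every other point by cluster label.
    by_cluster = {}
    for j, (point, label) in enumerate(zip(X, labels)):
        if j != i:
            by_cluster.setdefault(label, []).append(math.dist(X[i], point))

    # a(i): mean distance to the other members of i's own cluster.
    own = by_cluster.pop(labels[i])
    a = sum(own) / len(own)

    # b(i): smallest mean distance to the members of any other cluster.
    b = min(sum(d) / len(d) for d in by_cluster.values())

    s = (b - a) / max(a, b)
    return a, b, s


# Hypothetical 1-D data: two points near 0 and two points near 4.
X = [[0.0], [1.0], [4.0], [5.0]]
labels = [0, 0, 1, 1]
a, b, s = silhouette_of_point(X, labels, 0)
print(a, b, round(s, 4))  # a = 1.0, b = 4.5, s = 3.5 / 4.5 ≈ 0.7778
```

For point 0, the only cluster mate is at distance 1, and the other cluster's points are at distances 4 and 5, so a = 1, b = 4.5, and s = 3.5 / 4.5.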

**Range:** -1 to +1

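  • **+1:** Points are much closer to their own cluster than to the nearest other cluster
  • **0:** Points lie on or near the boundary between two clusters
  • **-1:** Points are closer to another cluster than to their own, so they are likely assigned to the wrong cluster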

**Example:**

    X = [[1,2], [1.5,1.8], [5,8], [8,8], [1,0.6], [9,11]]

    labels = [0, 0, 1, 1, 0, 1]

    silhouette_score(X, labels) → ~0.66

**Constraints:**

  • Use Euclidean distance
  • Handle clusters with only one point (silhouette = 0 for that point)
  • Return the average silhouette score across all points

**Test Cases**

    Test Case 1
    Input: X=[[1,2],[1.5,1.8],[5,8],[8,8],[1,0.6],[9,11]], labels=[0,0,1,1,0,1]
    Expected: ~0.66
    Test Case 2
    Input: Perfect clustering (well-separated groups)
    Expected: close to 1.0
    Test Case 3
    Input: Random labels on clustered data
    Expected: close to 0 or negative
    + 2 hidden test cases
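
One possible way to satisfy the constraints is sketched below in plain Python. It is not a reference solution. Two behaviors are assumptions that the problem statement does not specify: raising `ValueError` when fewer than two clusters are present, and scoring a point as 0 when a(i) and b(i) are both 0. The two-square dataset used for Test Cases 2 and 3 is hypothetical, and the alternating labels are a deterministic stand-in for "random labels".

```python
import math
from collections import defaultdict


def silhouette_score(X, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    # Map each cluster label to the indices of its members.
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    if len(members) < 2:
        # Assumption: b(i) is undefined with fewer than two clusters, so raise.
        raise ValueError("silhouette score needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = members[label]
        if len(own) == 1:
            # Constraint: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue

        # a(i): mean distance to the other members of the same cluster.
        a = sum(math.dist(X[i], X[j]) for j in own if j != i) / (len(own) - 1)

        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(X[i], X[j]) for j in idxs) / len(idxs)
            for other, idxs in members.items()
            if other != label
        )

        denom = max(a, b)
        # Assumption: if a(i) and b(i) are both 0 (coincident points), score 0.
        scores.append((b - a) / denom if denom > 0 else 0.0)

    return sum(scores) / len(scores)


# Test Case 1 input.
X = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]]
labels = [0, 0, 1, 1, 0, 1]
print(round(silhouette_score(X, labels), 3))  # about 0.748

# Hypothetical data for Test Cases 2 and 3: two unit squares far apart.
square = [[0, 0], [0, 1], [1, 0], [1, 1]]
pts = square + [[x + 10, y + 10] for x, y in square]
print(silhouette_score(pts, [0] * 4 + [1] * 4))  # labels match the groups: above 0.9
print(silhouette_score(pts, [0, 1] * 4))         # labels ignore the groups: negative
```

Applying the definitions exactly as written, this sketch gives about 0.75 on the Test Case 1 input, somewhat higher than the quoted ~0.66. The grader's tolerance is not stated, so that gap is worth checking before relying on either figure.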