
Problem 88 (Hard)

Implement a **Gaussian Mixture Model (GMM)** using the Expectation-Maximization (EM) algorithm.

Algorithm:

1. **Initialize:** Random means, identity covariances, equal weights

2. **E-step:** Compute responsibility of each component for each point:

r_ik = (w_k * N(x_i | mu_k, Sigma_k)) / sum_j(w_j * N(x_i | mu_j, Sigma_j))

3. **M-step:** Update parameters using responsibilities:

- mu_k = sum_i(r_ik * x_i) / sum_i(r_ik)

- Sigma_k = sum_i(r_ik * (x_i - mu_k)(x_i - mu_k)^T) / sum_i(r_ik)

- w_k = mean_i(r_ik) = (1/N) * sum_i(r_ik)

4. **Repeat:** Alternate E- and M-steps until convergence, e.g. until the log-likelihood improvement drops below a small tolerance (see the sketch below)
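
A minimal NumPy sketch of the four steps above, for reference. The `GMM` class name and the `fit`/`predict` methods come from the example below; `predict_proba`, the log-space E-step with a log-sum-exp normalization, and the `max_iter`/`tol`/`seed` defaults are illustrative assumptions, not requirements of the problem.

```python
import numpy as np


class GMM:
    def __init__(self, n_components, max_iter=100, tol=1e-6, seed=0):
        self.n_components = n_components
        self.max_iter = max_iter   # illustrative default, not from the problem
        self.tol = tol             # convergence threshold on log-likelihood gain
        self.rng = np.random.default_rng(seed)

    def _log_gaussian(self, X, mu, Sigma):
        # Log multivariate Gaussian PDF; 1e-6 * I is added to the covariance
        # for numerical stability, as the constraints require.
        d = X.shape[1]
        Sigma = Sigma + 1e-6 * np.eye(d)
        diff = X - mu
        _, logdet = np.linalg.slogdet(Sigma)
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

    def _log_joint(self, X):
        # log(w_k * N(x_i | mu_k, Sigma_k)) for every point/component pair.
        return np.stack([np.log(self.w[j]) +
                         self._log_gaussian(X, self.mu[j], self.Sigma[j])
                         for j in range(self.n_components)], axis=1)

    def fit(self, X):
        n, d = X.shape
        k = self.n_components
        # Step 1 -- initialize: random means (sampled data points),
        # identity covariances, equal weights.
        self.mu = X[self.rng.choice(n, size=k, replace=False)]
        self.Sigma = np.stack([np.eye(d)] * k)
        self.w = np.full(k, 1.0 / k)

        prev_ll = -np.inf
        for _ in range(self.max_iter):
            # Step 2 -- E-step: responsibilities r[i, k], normalized in
            # log space with the log-sum-exp trick to avoid underflow.
            log_p = self._log_joint(X)
            m = log_p.max(axis=1, keepdims=True)
            log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
            r = np.exp(log_p - log_norm)

            # Step 3 -- M-step: re-estimate mu_k, Sigma_k, w_k.
            nk = r.sum(axis=0)                 # effective count per component
            self.mu = (r.T @ X) / nk[:, None]
            for j in range(k):
                diff = X - self.mu[j]
                self.Sigma[j] = (r[:, j, None] * diff).T @ diff / nk[j]
            self.w = nk / n                    # w_k = mean_i(r_ik)

            # Step 4 -- repeat until the log-likelihood stops improving.
            ll = log_norm.sum()
            if ll - prev_ll < self.tol:
                break
            prev_ll = ll
        return self

    def predict_proba(self, X):
        # Soft assignments: normalized responsibilities for each point.
        log_p = self._log_joint(X)
        p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        return p / p.sum(axis=1, keepdims=True)

    def predict(self, X):
        # Hard assignments: argmax of the responsibilities (per constraints).
        return self.predict_proba(X).argmax(axis=1)
```

Working in log space keeps the E-step stable when a point is far from every component; the same `_log_joint` helper then backs both `predict_proba` and `predict`.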

Example:

```python
gmm = GMM(n_components=3)
gmm.fit(X)
labels = gmm.predict(X)  # Soft clustering → hard assignments
```

**Explanation:** Unlike K-Means (hard assignment), GMM models data as a mixture of Gaussians and provides soft probabilistic assignments. Each point can partially belong to multiple clusters.
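
To make the soft-versus-hard distinction concrete, the sketch above exposes both views (`predict_proba` is an assumed helper name, borrowed from scikit-learn's convention, not part of the problem statement):

```python
r = gmm.predict_proba(X)   # shape (n_samples, n_components); each row sums to 1
labels = r.argmax(axis=1)  # collapsing responsibilities yields the hard labels
```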

Constraints:

  • Use the multivariate Gaussian PDF (written out below for reference)
  • Add 1e-6 * I to each covariance for numerical stability
  • predict() returns hard assignments (argmax of responsibilities)
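
The density in question is the standard multivariate normal,

N(x | mu, Sigma) = exp(-0.5 * (x - mu)^T * Sigma^{-1} * (x - mu)) / sqrt((2 * pi)^d * det(Sigma)),

where d is the dimensionality of x. The 1e-6 * I term keeps Sigma invertible (and det(Sigma) positive) even when a component collapses onto a handful of points.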
Test Cases:

    Test Case 1
    Input: 2 Gaussian blobs, n_components=2
    Expected: Two clusters identified

    Test Case 2
    Check: weights sum to 1.0 after fit
    Expected: True

    Test Case 3
    Check: predict returns an array of length n_samples
    Expected: True

    + 2 hidden test cases
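
A quick self-check against the visible cases could look like the sketch below. The blob data is generated directly with NumPy as a stand-in for Test Case 1's input; note that an unlucky initialization could, in principle, still merge two well-separated blobs.

```python
import numpy as np

# Two well-separated Gaussian blobs, approximating Test Case 1's input.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-5.0, 1.0, size=(100, 2)),
               rng.normal(+5.0, 1.0, size=(100, 2))])

gmm = GMM(n_components=2).fit(X)
labels = gmm.predict(X)

assert len(np.unique(labels)) == 2    # Test Case 1: two clusters identified
assert np.isclose(gmm.w.sum(), 1.0)   # Test Case 2: weights sum to 1.0
assert labels.shape == (X.shape[0],)  # Test Case 3: one label per sample
```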