# Click4Ai Problem 92 (Hard)

Implement **Linear Discriminant Analysis (LDA)** for supervised dimensionality reduction.

Algorithm:

1. Compute the **within-class scatter matrix** S_W:

S_W = sum_over_classes( sum_over_samples_in_class( (x - mean_c)(x - mean_c)^T ) )

2. Compute the **between-class scatter matrix** S_B:

S_B = sum_over_classes( n_c * (mean_c - mean_overall)(mean_c - mean_overall)^T )

3. Compute eigenvalues/eigenvectors of S_W^(-1) @ S_B

4. Select top n_components eigenvectors (sorted by eigenvalue magnitude)

5. Project data onto these discriminant directions
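The steps above can be sketched in NumPy. The class and attribute names (`LDA`, `linear_discriminants`, `fit`, `transform`) follow the usage example and test cases in this problem; everything else is a sketch, not a reference solution:

```python
import numpy as np

class LDA:
    """Linear Discriminant Analysis for supervised dimensionality reduction."""

    def __init__(self, n_components):
        self.n_components = n_components
        self.linear_discriminants = None  # shape (n_components, n_features) after fit

    def fit(self, X, y):
        n_features = X.shape[1]
        classes = np.unique(y)
        mean_overall = X.mean(axis=0)

        S_W = np.zeros((n_features, n_features))  # within-class scatter
        S_B = np.zeros((n_features, n_features))  # between-class scatter
        for c in classes:
            X_c = X[y == c]
            mean_c = X_c.mean(axis=0)
            # Step 1: accumulate within-class scatter
            S_W += (X_c - mean_c).T @ (X_c - mean_c)
            # Step 2: accumulate between-class scatter, weighted by class size
            diff = (mean_c - mean_overall).reshape(-1, 1)
            S_B += X_c.shape[0] * (diff @ diff.T)

        # Regularize S_W so the inverse exists (see constraints below)
        S_W += 1e-6 * np.eye(n_features)

        # Step 3: eigendecomposition of S_W^(-1) @ S_B
        eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)

        # Step 4: keep the top n_components eigenvectors by eigenvalue magnitude,
        # taking real parts since eig on a non-symmetric matrix may return complex values
        order = np.argsort(np.abs(eigvals))[::-1]
        self.linear_discriminants = np.real(eigvecs[:, order[: self.n_components]]).T
        return self

    def transform(self, X):
        # Step 5: project data onto the discriminant directions
        return X @ self.linear_discriminants.T
```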

Example:

```python
lda = LDA(n_components=2)
lda.fit(X_train, y_train)
X_projected = lda.transform(X_test)  # reduced dimensions
```

**Explanation:** Unlike PCA (unsupervised), LDA uses class labels to find directions that maximize class separation while minimizing within-class variance. Maximum useful components = min(n_features, n_classes - 1).
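The min(n_features, n_classes - 1) bound comes from S_B: it is a sum of n_classes rank-one terms whose size-weighted deviations from the overall mean sum to zero, so rank(S_B) <= n_classes - 1 and only that many eigenvalues can be nonzero. A quick check on synthetic data (shapes chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features = 3, 5
X = rng.normal(size=(60, n_features))
y = np.repeat(np.arange(n_classes), 20)

mean_overall = X.mean(axis=0)
S_B = np.zeros((n_features, n_features))
for c in range(n_classes):
    X_c = X[y == c]
    d = (X_c.mean(axis=0) - mean_overall).reshape(-1, 1)
    S_B += X_c.shape[0] * (d @ d.T)

# Rank of S_B never exceeds n_classes - 1 = 2, despite n_features = 5
print(np.linalg.matrix_rank(S_B))  # at most 2
```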

Constraints:

  • S_W must be invertible (add a small regularization term, 1e-6 * I, if needed)
  • Keep only the real parts of the eigenvalues/eigenvectors (the decomposition of a non-symmetric matrix may return complex values)

Test Cases:

    Test Case 1
    Input: 3-class data, n_components=2
    Expected: output shape (n_samples, 2)

    Test Case 2
    Input: 2-class data, n_components=1
    Expected: output shape (n_samples, 1)

    Test Case 3
    Input: linear_discriminants shape after fit
    Expected: (n_components, n_features)

    + 2 hidden test cases
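On the invertibility constraint: S_W becomes singular whenever the classes contribute fewer linearly independent deviation vectors than there are features (e.g. very small classes), which is why the 1e-6 * I ridge term is suggested. A minimal illustration with made-up numbers:

```python
import numpy as np

# One class with 2 samples in 4 dimensions: its scatter has rank at most 1,
# so S_W built from it alone is singular and cannot be inverted directly.
n_features = 4
X_c = np.array([[1.0, 2.0, 3.0, 4.0],
                [2.0, 3.0, 4.0, 5.0]])
mean_c = X_c.mean(axis=0)
S_W = (X_c - mean_c).T @ (X_c - mean_c)

print(np.linalg.matrix_rank(S_W))       # 1: singular
S_W_reg = S_W + 1e-6 * np.eye(n_features)
print(np.linalg.matrix_rank(S_W_reg))   # 4: invertible after regularization
```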