
78. Medium

Implement a GaussianNaiveBayes classifier that uses Bayes' theorem with the assumption that features follow a Gaussian (normal) distribution within each class.

Algorithm:

1. **fit(X, y):** For each class, compute the mean and variance of each feature, and the prior probability P(class).

2. **predict(X):** For each sample, compute P(class | features) using Bayes' theorem. The class with the highest posterior probability wins.

**Gaussian PDF:** P(x | class) = (1 / sqrt(2 * pi * var)) * exp(-(x - mean)^2 / (2 * var))
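Taking the log of the PDF above turns the product over features into a sum, which is what the log-probability constraint below calls for. A minimal sketch for a single scalar feature (`log_gaussian_pdf` is an illustrative helper name, not part of the required API):

```python
import math

def log_gaussian_pdf(x, mean, var):
    """Log of (1 / sqrt(2*pi*var)) * exp(-(x - mean)^2 / (2*var))."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

# At the mean of a standard normal, the log-density is -0.5 * log(2*pi):
log_gaussian_pdf(0.0, 0.0, 1.0)  # ≈ -0.9189
```

Summing this quantity across features (rather than multiplying raw densities) keeps values in a numerically safe range even for many features.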

Example:

gnb = GaussianNaiveBayes()

gnb.fit(X_train, y_train)

predictions = gnb.predict(X_test)

**Explanation:** For each test sample, compute the log-likelihood of each class by summing log-Gaussian-PDF across all features, add the log-prior, and pick the class with the highest total.
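The fit/predict procedure described above can be sketched as follows. This is a minimal NumPy implementation, not a reference solution: the attribute names (`classes_`, `means_`, `vars_`, `priors_`) and the small variance epsilon are illustrative choices beyond what the problem statement specifies.

```python
import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        n_classes, n_features = len(self.classes_), X.shape[1]
        self.means_ = np.zeros((n_classes, n_features))
        self.vars_ = np.zeros((n_classes, n_features))
        self.priors_ = np.zeros(n_classes)
        for i, c in enumerate(self.classes_):
            Xc = X[y == c]
            self.means_[i] = Xc.mean(axis=0)
            # Epsilon guards against zero variance on constant features (an assumption).
            self.vars_[i] = Xc.var(axis=0) + 1e-9
            self.priors_[i] = len(Xc) / len(X)  # prior P(class)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # log P(x | class): sum of log-Gaussian-PDFs over features,
        # broadcast to shape (n_samples, n_classes).
        log_likelihood = -0.5 * np.sum(
            np.log(2 * np.pi * self.vars_[None, :, :])
            + (X[:, None, :] - self.means_[None, :, :]) ** 2 / self.vars_[None, :, :],
            axis=2,
        )
        # Add log-prior, then pick the class with the highest posterior.
        log_posterior = log_likelihood + np.log(self.priors_)
        return self.classes_[np.argmax(log_posterior, axis=1)]

# Usage on a tiny well-separated 2-class dataset:
X_train = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]]
y_train = [0, 0, 1, 1]
gnb = GaussianNaiveBayes().fit(X_train, y_train)
predictions = gnb.predict(X_train)
```

Working entirely in log space means the per-feature terms are added rather than multiplied, so the posterior never underflows even with many features.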

Constraints:

  • Assume features are continuous and conditionally independent given the class
  • Use log probabilities to avoid numerical underflow
  • Handle at least 2 classes
Test Cases:

    Test Case 1
    Input: Fit on simple 2-class data with 2 features
    Expected: Model stores correct number of classes, means, and variances
    Test Case 2
    Input: Predict on training data (well-separated clusters)
    Expected: High accuracy on training set
    Test Case 3
    Input: Check priors sum to 1.0 after fit
    Expected: 1.0
    + 2 hidden test cases