Click4Ai

Problem 77 · Hard

Implement a RandomForest classifier from scratch using bootstrap aggregation (bagging) of decision trees with majority voting.

Algorithm:

1. For each tree, create a bootstrap sample (random sample with replacement) of the training data

2. Train a decision tree on each bootstrap sample (optionally using a random subset of features)

3. For prediction, each tree votes and the majority class wins
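The two mechanics above — bootstrap sampling and majority voting — can be sketched independently of the tree itself. A minimal illustration using only the standard library (`bootstrap_sample` and `majority_vote` are illustrative helper names, not part of the required API):

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng=random):
    """Draw len(X) indices with replacement and return the resampled data."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]

def majority_vote(votes):
    """Return the most common label among the trees' predictions."""
    return Counter(votes).most_common(1)[0][0]

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
Xb, yb = bootstrap_sample(X, y)
assert len(Xb) == len(X)           # bootstrap sample keeps the original size
print(majority_vote([0, 1, 1]))    # -> 1
```

Because sampling is with replacement, a bootstrap sample typically contains duplicates and omits roughly a third of the original rows, which is what decorrelates the trees.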

Example:

```python
rf = RandomForest(n_trees=5, max_depth=5)
rf.fit(X_train, y_train)              # Train 5 trees on bootstrap samples
predictions = rf.predict(X_test)      # Majority vote across all trees
```

**Explanation:** Random Forest reduces overfitting by averaging many decorrelated trees. Each tree sees a different bootstrap sample, and the ensemble prediction is more robust than any single tree.

Constraints:

  • Use bootstrap sampling (sampling with replacement, same size as original dataset)
  • Implement majority voting for predictions
  • Support max_depth and min_samples_split parameters for individual trees
Test Cases:

    Test Case 1
    Input: Fit on XOR-like data [[0,0],[0,1],[1,0],[1,1]], labels [0,1,1,0]
    Expected: Model trains without error

    Test Case 2
    Input: Predict on training data after fit
    Expected: Returns array of integer labels

    Test Case 3
    Input: n_trees=5, check len(rf.trees) after fit
    Expected: 5

    + 2 hidden test cases
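One way the pieces could fit together, as a hedged reference sketch rather than the canonical solution: a Gini-split tree honoring `max_depth` and `min_samples_split`, bagging over bootstrap samples, and majority voting. The class and method names follow the example above; internal helpers such as `_grow` and `predict_one` are illustrative choices.

```python
import random
from collections import Counter

class DecisionTree:
    def __init__(self, max_depth=5, min_samples_split=2):
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        self.tree = None  # either a leaf label or (feature, threshold, left, right)

    def _gini(self, y):
        n = len(y)
        return 1.0 - sum((c / n) ** 2 for c in Counter(y).values())

    def _grow(self, X, y, depth):
        # Stop on purity, depth limit, or too few samples; emit a leaf label.
        if (len(set(y)) == 1 or depth >= self.max_depth
                or len(y) < self.min_samples_split):
            return Counter(y).most_common(1)[0][0]
        parent, best, best_gain = self._gini(y), None, 0.0
        for f in range(len(X[0])):
            for t in set(row[f] for row in X):
                left = [i for i, row in enumerate(X) if row[f] <= t]
                right = [i for i, row in enumerate(X) if row[f] > t]
                if not left or not right:
                    continue
                gain = parent - (len(left) * self._gini([y[i] for i in left])
                                 + len(right) * self._gini([y[i] for i in right])) / len(y)
                if gain > best_gain:
                    best_gain, best = gain, (f, t, left, right)
        if best is None:  # no split improves Gini impurity
            return Counter(y).most_common(1)[0][0]
        f, t, left, right = best
        return (f, t,
                self._grow([X[i] for i in left], [y[i] for i in left], depth + 1),
                self._grow([X[i] for i in right], [y[i] for i in right], depth + 1))

    def fit(self, X, y):
        self.tree = self._grow(X, y, depth=0)

    def predict_one(self, x):
        node = self.tree
        while isinstance(node, tuple):  # descend until a leaf label is reached
            f, t, left, right = node
            node = left if x[f] <= t else right
        return node

class RandomForest:
    def __init__(self, n_trees=5, max_depth=5, min_samples_split=2, seed=0):
        self.n_trees = n_trees
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        self.rng = random.Random(seed)
        self.trees = []

    def fit(self, X, y):
        self.trees = []
        n = len(X)
        for _ in range(self.n_trees):
            # Bootstrap sample: n draws with replacement, same size as the data.
            idx = [self.rng.randrange(n) for _ in range(n)]
            tree = DecisionTree(self.max_depth, self.min_samples_split)
            tree.fit([X[i] for i in idx], [y[i] for i in idx])
            self.trees.append(tree)

    def predict(self, X):
        # Each tree votes per sample; the most common label wins.
        return [Counter(t.predict_one(x) for t in self.trees).most_common(1)[0][0]
                for x in X]

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
rf = RandomForest(n_trees=5, max_depth=5)
rf.fit(X, y)
print(len(rf.trees))   # -> 5
print(rf.predict(X))   # integer labels drawn from the training classes
```

Note the optional feature subsampling from step 2 of the algorithm is omitted here for brevity; it would go inside `_grow`, restricting the `for f in range(...)` loop to a random subset of feature indices.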