Cross-Validation (K-Fold)
Implement k-fold cross-validation to evaluate model performance. K-fold cross-validation is a robust model evaluation technique that splits the dataset into k equally sized folds, then iteratively uses each fold as a validation set while training on the remaining k-1 folds. This provides a more reliable estimate of model performance than a single train-test split.
The k-fold cross-validation algorithm, in pseudocode:

fold_size = len(data) // k
for i in range(k):
    val_data = data[i * fold_size : (i + 1) * fold_size]
    train_data = data excluding val_data
    model.train(train_data)
    scores[i] = model.evaluate(val_data)
final_score = mean(scores)
Your function cross_validation(X, y, k) should split the data into k folds and, for each fold, compute the accuracy of a dummy model (one that always predicts 0) on that fold's validation set. Return a list containing the accuracy score for each fold.
Example:
Input: X = [[1,2],[3,4],[5,6],[7,8]], y = [0, 1, 0, 1], k = 2
Fold 1: val = X[0:2], y_val = [0, 1] -> accuracy of predicting 0 = 0.5
Fold 2: val = X[2:4], y_val = [0, 1] -> accuracy of predicting 0 = 0.5
Output: [0.5, 0.5]
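A minimal pure-Python sketch of cross_validation that reproduces this example (the function name and signature come from the problem statement; since the dummy model ignores the features, only the labels are actually inspected):

```python
def cross_validation(X, y, k):
    """k-fold CV for a dummy model that always predicts 0."""
    fold_size = len(X) // k
    scores = []
    for i in range(k):
        # Contiguous slice of labels for the i-th validation fold.
        y_val = y[i * fold_size : (i + 1) * fold_size]
        # Accuracy of always predicting 0 = fraction of zeros in y_val.
        scores.append(sum(1 for label in y_val if label == 0) / len(y_val))
    return scores

print(cross_validation([[1, 2], [3, 4], [5, 6], [7, 8]], [0, 1, 0, 1], 2))
# [0.5, 0.5]
```

A real model would also be fit on the remaining k-1 folds before each evaluation; the dummy model needs no training, so that step is omitted here.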
Cross-validation reduces the variance of the performance estimate by averaging over multiple train-test splits. It is especially useful when the dataset is small, as it ensures every data point is used for both training and validation. Common choices for k are 5 and 10.
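The claim that every data point is used for validation exactly once can be checked directly. A small sketch (fold_indices is a hypothetical helper, not part of the problem statement):

```python
def fold_indices(n, k):
    # Boundaries of the k contiguous validation folds used above.
    # Caveat: when k does not divide n, the last n % k samples are
    # never placed in any validation fold by this slicing scheme.
    fold_size = n // k
    return [list(range(i * fold_size, (i + 1) * fold_size)) for i in range(k)]

folds = fold_indices(8, 4)
val_indices = sorted(i for fold in folds for i in fold)
print(val_indices == list(range(8)))  # True: each sample validated once
```

When len(data) is divisible by k, the folds partition the dataset exactly; otherwise the trailing remainder is silently dropped, which is one reason library implementations distribute the remainder across folds instead.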
Constraints:
Test Cases
Input: [[[1,2],[3,4]],[[5,6],[7,8]]] -> Output: [0.5, 0.5]
Input: [[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]] -> Output: [0.5, 0.5]