Implement a **Multi-class Logistic Regression** classifier from scratch using the softmax function and gradient descent.
**Softmax Function:**
softmax(z_i) = exp(z_i) / sum_j exp(z_j)
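A minimal NumPy sketch of the formula above (subtracting each row's max before exponentiating is a standard numerical-stability trick, not part of the formula itself):

```python
import numpy as np

def softmax(z):
    # Shift each row by its max before exponentiating to avoid overflow;
    # the shift cancels in the ratio, so the result is unchanged.
    exp_z = np.exp(z - np.max(z, axis=1, keepdims=True))
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)
```

Each output row is a valid probability distribution: non-negative entries that sum to 1.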
**One-Hot Encoding:** Convert integer labels to binary vectors (e.g., label 2 with 3 classes -> [0, 0, 1]).
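One way to sketch this encoding with NumPy integer indexing (the function name `one_hot` is an assumption):

```python
import numpy as np

def one_hot(y, n_classes):
    # Row i is all zeros except a 1 in column y[i].
    y = np.asarray(y)
    encoded = np.zeros((y.size, n_classes))
    encoded[np.arange(y.size), y] = 1
    return encoded
```

For instance, `one_hot([2], 3)` gives `[[0., 0., 1.]]`, matching the example above.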
**Gradient Descent Update:**
dW = (1/n) * X^T . (probabilities - y_one_hot)
db = (1/n) * sum over samples of (probabilities - y_one_hot)
W = W - lr * dW
b = b - lr * db
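The update equations above translate to NumPy almost line for line. A sketch of a single step, assuming `probabilities` comes from the softmax and `y_one_hot` from the encoding described earlier (the function name `update_step` is an assumption):

```python
import numpy as np

def update_step(W, b, X, probabilities, y_one_hot, lr):
    # One gradient-descent update for the weights and biases.
    n = X.shape[0]
    dW = X.T @ (probabilities - y_one_hot) / n        # shape (n_features, n_classes)
    db = (probabilities - y_one_hot).sum(axis=0) / n  # shape (n_classes,)
    return W - lr * dW, b - lr * db
```

Note that when the predicted probabilities match the one-hot targets exactly, both gradients vanish and the parameters stop moving.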
**Example:**

```python
X = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y = [0, 0, 1, 1, 2, 2]
model = MultiClassLogisticRegression(learning_rate=0.1, n_iterations=1000)
model.fit(X, y)
predictions = model.predict(X)
# predictions should approximate [0, 0, 1, 1, 2, 2]
```
**Explanation:** The softmax function converts raw scores (logits) into probability distributions across multiple classes. Training uses cross-entropy loss with gradient descent to update weights. Prediction selects the class with the highest probability.
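Putting the pieces together, one possible implementation matching the interface used in the example (initializing the weights to zeros and inferring the class count as `max(y) + 1` are assumptions the statement leaves open):

```python
import numpy as np

class MultiClassLogisticRegression:
    def __init__(self, learning_rate=0.1, n_iterations=1000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations

    def _softmax(self, z):
        # Row-wise softmax with the max-shift stability trick.
        exp_z = np.exp(z - z.max(axis=1, keepdims=True))
        return exp_z / exp_z.sum(axis=1, keepdims=True)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        n, n_features = X.shape
        n_classes = y.max() + 1          # assumes labels 0 .. n_classes-1
        # One-hot encode the integer labels.
        y_one_hot = np.zeros((n, n_classes))
        y_one_hot[np.arange(n), y] = 1
        self.W = np.zeros((n_features, n_classes))
        self.b = np.zeros(n_classes)
        for _ in range(self.n_iterations):
            probabilities = self._softmax(X @ self.W + self.b)
            dW = X.T @ (probabilities - y_one_hot) / n
            db = (probabilities - y_one_hot).sum(axis=0) / n
            self.W -= self.learning_rate * dW
            self.b -= self.learning_rate * db
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        probabilities = self._softmax(X @ self.W + self.b)
        # Pick the most probable class per row.
        return np.argmax(probabilities, axis=1)
```

Running the example from the statement (`fit` on the six points, then `predict`) should approximately recover `[0, 0, 1, 1, 2, 2]`.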