Implement **Elastic Net Regression**, which combines both L1 (Lasso) and L2 (Ridge) regularization using gradient descent.
Loss Function:
Loss = (1/n) * sum((y_pred - y)^2) + alpha * (l1_ratio * sum(|w|) + (1 - l1_ratio) * sum(w^2))
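The loss above can be sketched directly in NumPy. This is an illustrative helper, not part of the required class; the function name `elastic_net_loss` is an assumption:

```python
import numpy as np

# Hypothetical helper computing the Elastic Net loss from the formula above.
def elastic_net_loss(X, y, w, b, alpha, l1_ratio):
    y_pred = X @ w + b                      # linear model predictions
    mse = np.mean((y_pred - y) ** 2)        # (1/n) * sum((y_pred - y)^2)
    penalty = alpha * (l1_ratio * np.sum(np.abs(w))          # L1 term
                       + (1 - l1_ratio) * np.sum(w ** 2))    # L2 term
    return mse + penalty
```

With `l1_ratio=1.0` the penalty reduces to pure Lasso, and with `l1_ratio=0.0` to pure Ridge, matching the test cases below.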
Gradient Descent Update Rules:
dw = (2/n) * (X^T @ (y_pred - y)) + alpha * (l1_ratio * sign(w) + 2 * (1 - l1_ratio) * w)
db = (2/n) * sum(y_pred - y)
w = w - learning_rate * dw
b = b - learning_rate * db
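One way the update rules above might be implemented as a class (the constructor and `fit` signatures follow the example in this problem; the `predict` method is an assumption based on the test cases):

```python
import numpy as np

# Minimal sketch, assuming weights and bias start at zero.
class ElasticNet:
    def __init__(self, alpha=0.1, l1_ratio=0.5, learning_rate=0.01):
        self.alpha = alpha
        self.l1_ratio = l1_ratio
        self.learning_rate = learning_rate
        self.w = None
        self.b = 0.0

    def fit(self, X, y, n_iterations=1000):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n, d = X.shape
        self.w = np.zeros(d)
        self.b = 0.0
        for _ in range(n_iterations):
            error = X @ self.w + self.b - y  # y_pred - y
            # Gradient combines the MSE term with the L1 (sign) and
            # L2 (2w) penalty gradients, as in the update rules above.
            dw = (2 / n) * (X.T @ error) + self.alpha * (
                self.l1_ratio * np.sign(self.w)
                + 2 * (1 - self.l1_ratio) * self.w)
            db = (2 / n) * np.sum(error)
            self.w -= self.learning_rate * dw
            self.b -= self.learning_rate * db
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.w + self.b
```

Note that `np.sign(0) == 0`, so the L1 term contributes nothing on the first iteration when the weights are still zero; a full subgradient treatment would also allow any value in [-1, 1] there.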
Parameters:
Your class should have:
- `alpha`: overall regularization strength
- `l1_ratio`: mix between the L1 and L2 penalties (1.0 = pure Lasso, 0.0 = pure Ridge)
- `learning_rate`: gradient descent step size
- a `fit(X, y, n_iterations)` method that trains the weights and bias, and a `predict(X)` method that returns predictions
Example:
model = ElasticNet(alpha=0.1, l1_ratio=0.5, learning_rate=0.01)
model.fit(X, y, n_iterations=1000)
# Combines benefits of both L1 sparsity and L2 weight shrinkage
**Explanation:** Elastic Net combines the strengths of both penalties: L1's ability to drive some weights exactly to zero (feature selection) and L2's stability when features are correlated. When l1_ratio=0.5, the two penalties receive equal weight.
Constraints:
Test Cases
| Input | Expected Behavior |
| --- | --- |
| X=[[1,2],[2,3],[3,4],[4,5]], y=[5,8,11,14], alpha=0.1, l1_ratio=0.5 | predictions close to true values; weights show both sparsity and shrinkage |
| X=[[1],[2],[3],[4],[5]], y=[2,4,6,8,10], alpha=0.1, l1_ratio=0.0 | behaves like Ridge regression (pure L2) |
| X=[[1],[2],[3],[4],[5]], y=[2,4,6,8,10], alpha=0.1, l1_ratio=1.0 | behaves like Lasso regression (pure L1) |