
105. Adam Optimizer

Difficulty: Medium

Implement the Adam (Adaptive Moment Estimation) Optimizer for updating neural network weights. Adam combines the benefits of two other optimizers: AdaGrad (which adapts the learning rate per parameter) and RMSProp (which uses a moving average of squared gradients). It is one of the most widely used optimizers in deep learning.

The Adam update rules are:

# First moment estimate (momentum):

m = beta1 * m + (1 - beta1) * gradient

# Second moment estimate (velocity):

v = beta2 * v + (1 - beta2) * gradient^2

# Bias-corrected estimates:

m_hat = m / (1 - beta1^t)

v_hat = v / (1 - beta2^t)

# Weight update:

weights = weights - learning_rate * m_hat / (sqrt(v_hat) + epsilon)

Your function adam_optimizer(weights, gradients, learning_rate, beta1, beta2, epsilon) should initialize m and v to zeros, compute one step of Adam, and return the updated weights. Because m and v start at zero and only a single step is taken, use t = 1 in the bias-correction terms.
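One possible implementation following the rules above (a sketch, not the only valid solution; the default argument values are taken from the constraints below):

```python
import numpy as np

def adam_optimizer(weights, gradients, learning_rate=0.01,
                   beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Perform one Adam step with m and v initialized to zeros."""
    weights = np.asarray(weights, dtype=float)
    gradients = np.asarray(gradients, dtype=float)

    m = np.zeros_like(weights)
    v = np.zeros_like(weights)
    t = 1  # a single step from a zero state

    # First and second moment estimates
    m = beta1 * m + (1 - beta1) * gradients
    v = beta2 * v + (1 - beta2) * gradients**2

    # Bias-corrected estimates
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)

    # Weight update
    return weights - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
```

Note that on the first step the bias correction makes m_hat equal to the raw gradient and sqrt(v_hat) equal to its absolute value, so each weight moves by almost exactly learning_rate in the direction opposite the gradient's sign.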

Example:

Input: weights = [[1, 2], [3, 4]], gradients = [[0.1, 0.2], [0.3, 0.4]]

learning_rate = 0.01, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8

m = 0.9 * 0 + 0.1 * gradients

v = 0.999 * 0 + 0.001 * gradients^2

Output: updated weights after applying bias correction and update rule
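The example can be stepped through numerically like this (a sketch using NumPy; variable names are illustrative):

```python
import numpy as np

weights = np.array([[1., 2.], [3., 4.]])
gradients = np.array([[0.1, 0.2], [0.3, 0.4]])
lr, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8

# First step from a zero state, so t = 1
m = (1 - beta1) * gradients        # 0.1 * gradients
v = (1 - beta2) * gradients**2     # 0.001 * gradients^2

m_hat = m / (1 - beta1**1)         # recovers gradients exactly
v_hat = v / (1 - beta2**1)         # recovers gradients^2 exactly

update = lr * m_hat / (np.sqrt(v_hat) + eps)
print(weights - update)            # each entry decreases by ~lr = 0.01
```

Because m_hat / (sqrt(v_hat) + eps) is very close to 1 for every entry here, the first step shrinks each weight by roughly the learning rate, regardless of the gradient's magnitude.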

Adam maintains per-parameter learning rates that are adapted based on the first moment (mean) and second moment (uncentered variance) of the gradients. The bias correction terms compensate for the initialization of m and v at zero, which biases them toward zero during the initial time steps.
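To see the bias concretely: with a constant gradient g, the uncorrected first moment after t steps is m_t = g * (1 - beta1^t), so dividing by (1 - beta1^t) recovers g exactly. A small illustrative check (not part of the required function):

```python
beta1 = 0.9
g = 0.5          # constant gradient for illustration
m = 0.0
for t in range(1, 4):
    m = beta1 * m + (1 - beta1) * g
    m_hat = m / (1 - beta1**t)
    print(t, m, m_hat)   # m stays biased toward 0; m_hat equals g
```

The same argument applies to v with beta2; since beta2 is close to 1, the second moment's bias toward zero persists longer, which is why the correction matters most in early training.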

Constraints:

  • The learning rate should be between 0 and 1
  • beta1 is typically 0.9, beta2 is typically 0.999
  • epsilon is a small constant (e.g., 1e-8) to prevent division by zero
  • Initialize m and v to zeros with the same shape as weights
  • weights and gradients should be 2D NumPy arrays of the same shape
Test Cases:

    Test Case 1
    Input: [[1, 2], [3, 4]]
    Expected: [[0.99, 1.99], [2.99, 3.99]]
    Test Case 2
    Input: [[5, 6], [7, 8]]
    Expected: [[4.99, 5.99], [6.99, 7.99]]
    + 3 hidden test cases