
148.

Hard

Quantization Aware Training

Implement Quantization Aware Training (QAT), a technique that simulates the effects of quantization during the training process. Fake quantization operations (rounding weights and activations to their nearest quantization levels) are inserted into the forward pass, so the model learns to be robust to the precision loss incurred when deploying to low-bit hardware (e.g., INT8).

Fake Quantization:

quantized_value = round(original_value)

QAT Loss:

L = MSE(round(weights), weights) + MSE(round(activations), activations)

= mean((round(w) - w)^2) + mean((round(a) - a)^2)

Where:

weights = model weight values (float)

activations = model activation values (float)

round() = rounding to nearest integer (simulating quantization)
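The loss above can be sketched directly in NumPy; the function name `qat_loss` is illustrative, not prescribed by the problem:

```python
import numpy as np

def qat_loss(weights, activations):
    # MSE between each tensor and its fake-quantized (rounded) version
    weight_loss = np.mean((np.round(weights) - weights) ** 2)
    act_loss = np.mean((np.round(activations) - activations) ** 2)
    # QAT loss is the sum of the two MSE terms
    return weight_loss + act_loss
```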

Example:

Input: weights = [1.1, 1.2], activations = [1.3, 1.4]

round(weights) = [1.0, 1.0]

round(activations) = [1.0, 1.0]

weight_loss = mean([(1.0-1.1)^2, (1.0-1.2)^2]) = mean([0.01, 0.04]) = 0.025

act_loss = mean([(1.0-1.3)^2, (1.0-1.4)^2]) = mean([0.09, 0.16]) = 0.125

Total loss = 0.025 + 0.125 = 0.15

Output: 0.15

The QAT loss quantifies the distortion introduced by rounding. During training, the model adjusts its weights to minimize this distortion, resulting in weights that are naturally closer to quantization-friendly values. This leads to minimal accuracy loss when the model is actually quantized for deployment.
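The arithmetic in the worked example can be checked step by step with NumPy:

```python
import numpy as np

weights = np.array([1.1, 1.2])
activations = np.array([1.3, 1.4])

# round(weights) = [1.0, 1.0], round(activations) = [1.0, 1.0]
weight_loss = np.mean((np.round(weights) - weights) ** 2)        # ~0.025
act_loss = np.mean((np.round(activations) - activations) ** 2)   # ~0.125

total = weight_loss + act_loss
print(round(total, 6))  # 0.15, matching the worked example
```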

Constraints:

  • `weights` and `activations` are 1D NumPy arrays of floats.
  • Use `np.round` for the fake quantization step.
  • The loss is the sum of MSE for weights and MSE for activations.
  • Use NumPy for all operations.
Test Cases:

    Test Case 1
    Input: weights = [1.1, 1.2], activations = [1.3, 1.4]
    Expected: 0.15
    Test Case 2
    Input: weights = [1.5, 1.6], activations = [1.7, 1.8]
    Expected: 0.27
    + 3 hidden test cases
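One detail worth knowing for inputs that land exactly on .5: `np.round` implements round-half-to-even (banker's rounding), so 1.5 rounds to 2.0, not 1.0. A quick check:

```python
import numpy as np

# Halves round to the nearest even integer, not always up
print(np.round(np.array([0.5, 1.5, 2.5])))  # [0. 2. 2.]
```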