
116. Global Average Pooling

Difficulty: Easy

Global Average Pooling (GAP) is a pooling operation that reduces each feature map (channel) to a single scalar value by computing the **mean of all spatial positions**. It replaces fully connected layers at the end of CNN architectures, reducing the total number of parameters and helping prevent overfitting.

Formula:

For a feature map of shape (channels, height, width):

output[c] = (1 / (H * W)) * sum(input[c, i, j]) for all i, j

Output shape: (batch_size, channels) -- one value per channel

For a single-channel input of shape (1, H, W):

output = mean of all H * W values
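The formula above can be sketched directly in pure Python. The function name `global_avg_pool` and the nested-list representation of the tensor are illustrative assumptions, not part of the problem statement:

```python
def global_avg_pool(x):
    """Reduce each channel of a (C, H, W) nested-list tensor to its spatial mean.

    For each channel, sums all H * W entries and divides by H * W,
    matching output[c] = (1 / (H * W)) * sum(input[c, i, j]).
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in x
    ]

# One value per channel: a (2, 2, 2) input yields a length-2 list.
print(global_avg_pool([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]))  # [2.5, 6.5]
```

Note the output has one entry per channel, which is why GAP can replace a fully connected layer: with C feature maps and C classes, the pooled vector feeds straight into a softmax.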

Example:

Input tensor (shape 1x2x2):

[[[1, 2],
  [3, 4]]]

GAP computation:

output = (1 + 2 + 3 + 4) / 4 = 10 / 4 = 2.5

Output: 2.5

Global Average Pooling was popularized by the Network in Network (NIN) paper and is widely used in modern architectures like GoogLeNet and ResNet. It acts as a structural regularizer that enforces correspondences between feature maps and categories, making the network more interpretable.

Constraints:

  • Input tensor has shape (1, height, width) representing a single-channel 3D tensor
  • Height and width can be any positive integers
  • Output is a single floating-point scalar value
Test Cases:

  Test Case 1
  Input: [[[1, 2], [3, 4]]]
  Expected: 2.5

  Test Case 2
  Input: [[[5, 6], [7, 8]]]
  Expected: 6.5

  + 3 hidden test cases
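A minimal solution for the single-channel (1, H, W) case described in the constraints might look like the following; the function name `solve` is an assumption, and the stdlib `statistics.mean` / `itertools.chain` combination is just one convenient way to flatten and average the channel:

```python
from itertools import chain
from statistics import mean

def solve(tensor):
    # tensor[0] is the single (H, W) channel; flatten its rows and
    # average all H * W values into one scalar.
    return mean(chain.from_iterable(tensor[0]))

print(solve([[[1, 2], [3, 4]]]))  # 2.5
print(solve([[[5, 6], [7, 8]]]))  # 6.5
```

Both visible test cases pass: (1 + 2 + 3 + 4) / 4 = 2.5 and (5 + 6 + 7 + 8) / 4 = 6.5.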