VGG Block
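A VGG block is the fundamental building unit of the VGG network architecture (Simonyan & Zisserman, 2014). It follows a simple pattern: **multiple 3x3 convolution + ReLU layers** followed by a **2x2 max pooling layer**. This design showed that a deep stack of small filters can outperform a shallower stack of large ones: two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution (three cover 7x7), while using fewer weights and applying more non-linearities.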
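The trade-off between stacked 3x3 filters and one large filter can be checked with a little arithmetic. The sketch below is illustrative only: the helper names and the channel count of 64 are assumptions, biases are ignored, and every layer is assumed to have the same number of input and output channels.

```python
def stacked_3x3_receptive_field(num_layers):
    # Each stride-1 3x3 layer widens the receptive field by 2 pixels.
    return 1 + 2 * num_layers


def conv_weights(kernel_size, channels, num_layers=1):
    # Weight count (biases ignored), assuming `channels` inputs and outputs per layer.
    return num_layers * kernel_size * kernel_size * channels * channels


channels = 64  # illustrative value, not taken from the text above

print(stacked_3x3_receptive_field(2))           # 5
print(stacked_3x3_receptive_field(3))           # 7
print(conv_weights(3, channels, num_layers=3))  # 110592
print(conv_weights(7, channels))                # 200704
```

Under these assumptions, three 3x3 layers see the same 7x7 window as one 7x7 layer with roughly 55% of the weights.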
VGG Block Pattern:
x = input
For each conv layer (repeated num_conv_layers times):
    x = ReLU(Conv3x3(x))
output = MaxPool2x2(x)
Where:
Conv3x3: Convolution with 3x3 kernel, stride 1, same padding (spatial size is preserved)
ReLU: max(0, x) activation
MaxPool2x2: 2x2 max pooling with stride 2
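The pattern above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function names, the (batch, height, width, channels) input layout, the (3, 3, C_in, C_out) weight layout, and the per-layer bias are all assumptions. As in most deep learning frameworks, the "convolution" is computed as a cross-correlation (the kernel is not flipped).

```python
import numpy as np


def conv3x3_same(x, w, b):
    """3x3 convolution, stride 1, zero 'same' padding.

    x: (N, H, W, C_in), w: (3, 3, C_in, C_out), b: (C_out,)
    """
    x = np.asarray(x, dtype=float)
    n, h, wd, _ = x.shape
    # Pad one pixel on each spatial side so the output keeps H and W.
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1), (0, 0)))
    out = np.zeros((n, h, wd, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            # Shifted input window times kernel tap (i, j), summed over C_in.
            out += xp[:, i:i + h, j:j + wd, :] @ w[i, j]
    return out + b


def max_pool2x2(x):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    n, h, w, c = x.shape
    h2, w2 = h // 2, w // 2
    x = x[:, :h2 * 2, :w2 * 2, :]
    return x.reshape(n, h2, 2, w2, 2, c).max(axis=(2, 4))


def vgg_block(x, weights, biases):
    """Apply (Conv3x3 + ReLU) once per weight tensor, then MaxPool2x2."""
    for w, b in zip(weights, biases):
        x = np.maximum(conv3x3_same(x, w, b), 0.0)
    return max_pool2x2(x)


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4, 4, 3))
weights = [rng.standard_normal((3, 3, 3, 64)) * 0.1,
           rng.standard_normal((3, 3, 64, 64)) * 0.1]
biases = [np.zeros(64), np.zeros(64)]
print(vgg_block(x, weights, biases).shape)  # (1, 2, 2, 64)
```

The explicit loop over the nine kernel taps keeps the sketch short and readable; production code would normally rely on a framework's optimized convolution instead.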
Spatial dimensions after block:
output_height = floor(input_height / 2)
output_width = floor(input_width / 2)
Example:
Input shape: (1, 4, 4, 3), i.e. (batch, height, width, channels)
num_conv_layers = 2, num_filters = 64
Step 1: Conv3x3 + ReLU -> (1, 4, 4, 64)
Step 2: Conv3x3 + ReLU -> (1, 4, 4, 64)
Step 3: MaxPool2x2 -> (1, 2, 2, 64)
Output shape: (1, 2, 2, 64)
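The shape bookkeeping in this example can be reproduced without running any convolution. The helper below is a hypothetical sketch that assumes the (batch, height, width, channels) layout used in the example, size-preserving convolutions, and pooling that rounds odd sizes down.

```python
def vgg_block_shapes(input_shape, num_conv_layers, num_filters):
    """Return the tensor shape after each step of one VGG block."""
    n, h, w, _ = input_shape
    shapes = []
    for _ in range(num_conv_layers):
        # Same-padded 3x3 conv keeps H and W; channels become num_filters.
        shapes.append((n, h, w, num_filters))
    # 2x2 max pooling with stride 2 halves H and W (rounding down).
    shapes.append((n, h // 2, w // 2, num_filters))
    return shapes


for step, shape in enumerate(vgg_block_shapes((1, 4, 4, 3), 2, 64), start=1):
    print(f"Step {step}: {shape}")
# Step 1: (1, 4, 4, 64)
# Step 2: (1, 4, 4, 64)
# Step 3: (1, 2, 2, 64)
```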
VGG demonstrated that network depth is critical for good performance. The VGG-16 and VGG-19 variants each stack five VGG blocks whose filter counts grow from 64 to 512 (64, 128, 256, 512, 512), followed by fully connected layers, achieving strong results on ImageNet classification.
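To make the stacking concrete, the sketch below traces feature-map sizes and convolution parameter counts through the five blocks of VGG-16. The per-block layer counts (2, 2, 3, 3, 3) and the 224x224 RGB input follow the published VGG-16 configuration; the function name and the returned tuple layout are assumptions made for illustration.

```python
# (num_conv_layers, num_filters) for each VGG-16 block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg_blocks(blocks, height=224, width=224, in_channels=3):
    """Return (height, width, channels, conv_params) after each block."""
    rows = []
    for num_conv_layers, num_filters in blocks:
        params = 0
        for _ in range(num_conv_layers):
            # 3x3 kernel weights plus one bias per output filter.
            params += 3 * 3 * in_channels * num_filters + num_filters
            in_channels = num_filters
        # The pooling layer at the end of the block halves H and W.
        height, width = height // 2, width // 2
        rows.append((height, width, num_filters, params))
    return rows


rows = trace_vgg_blocks(VGG16_BLOCKS)
for index, (h, w, c, params) in enumerate(rows, start=1):
    print(f"Block {index}: {h}x{w}x{c}, conv params = {params}")
print("Total conv params:", sum(r[3] for r in rows))  # 14714688
```

Each block halves the spatial resolution, so the five blocks reduce a 224x224 input to 7x7 before the fully connected layers.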
Constraints:
Test Cases
[[[[1,2],[3,4]],[[5,6],[7,8]]]]
[[[[7],[8]],[[15],[16]]]]
[[[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]]]]
[[[[27],[36]],[[54],[72]]]]