Residual Block
A residual block is the core building unit of ResNet (He et al., 2015). It introduces a **shortcut (skip) connection** that adds the block's input directly to the output of its stacked convolutional layers. Because gradients can flow through the shortcut unimpeded, much deeper architectures (100+ layers) become trainable.
Formula:
output = F(x) + x
Where:
- x = input to the residual block
- F(x) = transformation through the block: F(x) = ReLU(Conv3x3(ReLU(Conv3x3(x))))
- \+ = element-wise addition (the residual connection)
(Note: the original ResNet applies the second ReLU *after* the addition, i.e. output = ReLU(F(x) + x); this problem places it inside F instead.)
The block learns the residual mapping F(x) = desired_output - x,
which is easier to optimize than learning the full mapping directly.
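The forward pass described by the formula can be sketched in NumPy. This is a minimal illustration, not the reference implementation: the single 3x3 kernel shared across channels and the parameter names `k1`/`k2` are simplifying assumptions (a real ResNet uses learned multi-channel filters plus batch normalization), and these weights are not the ones behind the test cases below.

```python
import numpy as np

def conv3x3(x, kernel):
    """'Same' 3x3 convolution with zero padding, applied per channel.
    x: (C, H, W) array; kernel: (3, 3), shared across channels for simplicity."""
    C, H, W = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x, dtype=float)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i+3, j:j+3] * kernel)
    return out

def residual_block(x, k1, k2):
    """output = F(x) + x, with F(x) = ReLU(Conv3x3(ReLU(Conv3x3(x))))."""
    f = np.maximum(conv3x3(x, k1), 0)   # Conv3x3 + ReLU (layer 1)
    f = np.maximum(conv3x3(f, k2), 0)   # Conv3x3 + ReLU (layer 2)
    return f + x                        # shortcut: element-wise addition
```

With an identity kernel (center weight 1, all else 0) and nonnegative input, each Conv3x3 + ReLU leaves x unchanged, so F(x) = x and the block returns 2 * x; that makes the shortcut easy to verify by hand.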
Example:
Input x: [[[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]]]
After Conv3x3 + ReLU (layer 1): intermediate activation
After Conv3x3 + ReLU (layer 2): residual F(x)
Add residual: output = F(x) + x
Output: [[[[3, 4],
[7, 8]],
[[11, 12],
[15, 16]]]]
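Since output = F(x) + x, the residual the block must have produced in this example can be recovered by subtraction. A quick NumPy check, with the values copied from the arrays above:

```python
import numpy as np

x = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])
out = np.array([[[3, 4], [7, 8]],
                [[11, 12], [15, 16]]])

f = out - x   # the residual F(x) the block produced
# f = [[[2, 2], [4, 4]], [[6, 6], [8, 8]]]

# the shortcut addition recovers the output exactly:
assert np.array_equal(f + x, out)
```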
Residual blocks address the degradation problem, where adding more layers to a plain deep network increases training error. By learning residual functions instead of unreferenced mappings, the network can fall back to an identity mapping when extra depth is not useful, so added layers should not hurt accuracy: a deeper network can at worst match its shallower counterpart.
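The identity-mapping argument is easy to see in code: if both convolutions learn all-zero weights, F(x) collapses to zero and the shortcut passes the input through unchanged. A minimal sketch (the `x * 0.0` stands in for a convolution with zero weights):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

x = np.array([[[1., 2.], [3., 4.]]])  # any input
f = relu(relu(x * 0.0))               # zero weights => F(x) = 0
output = f + x                        # shortcut carries x through untouched
assert np.array_equal(output, x)      # the block is an identity mapping
```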
Constraints:
Test Cases
Input: [[[[1,2],[3,4]],[[5,6],[7,8]]]]
Output: [[[[3,4],[7,8]],[[11,12],[15,16]]]]

Input: [[[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]]]]
Output: [[[[5,6,7],[10,11,12],[15,16,17]],[[22,23,24],[29,30,31],[36,37,38]]]]