Skip Connections
Skip connections (also called shortcut connections or identity mappings) are a technique that adds the **input of a layer directly to its output**, bypassing one or more intermediate transformations. This helps combat the vanishing gradient problem in deep networks by providing a direct path for gradient flow during backpropagation.
Formula:
output = activation(conv(x) + x)
Simplified (without activation):
output = F(x) + x
Where:
x = original input
F(x) = output of the transformation (e.g., convolution)
+ = element-wise addition
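The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full layer: the transformation F is a hypothetical doubling function standing in for a convolution, and the function name `skip_connection` is chosen for this example.

```python
import numpy as np

def skip_connection(x, F):
    """Apply a transformation F and add the original input back (identity shortcut)."""
    return F(x) + x  # element-wise addition requires matching shapes

# Hypothetical transformation: scale by 2 (stands in for conv(x))
x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
out = skip_connection(x, lambda t: 2 * t)
# out = F(x) + x = 2*x + x = 3*x
```

Note that the element-wise addition only works when F preserves the input shape; real residual blocks use a projection (e.g. a 1x1 convolution) on the shortcut when shapes differ.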
Example:
Input x (here the transformation F is the identity, so F(x) = x and the output is x + x):
[[[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]]]
Skip connection: output = x + x (input added to itself)
Output:
[[[[2, 4],
[6, 8]],
[[10, 12],
[14, 16]]]]
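The example above can be reproduced directly, assuming NumPy arrays for the 4-D input (batch, channels, height, width):

```python
import numpy as np

# The example input: shape (1, 2, 2, 2)
x = np.array([[[[1, 2], [3, 4]],
               [[5, 6], [7, 8]]]])

# Identity transformation F(x) = x, so the skip connection doubles every element
output = x + x
print(output.tolist())
# [[[[2, 4], [6, 8]], [[10, 12], [14, 16]]]]
```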
Skip connections were popularized by ResNet and are now used across many architectures including DenseNet (concatenation-based), U-Net (for segmentation), and Transformers. They enable training of very deep networks by ensuring that information and gradients can flow without degradation across many layers.
Test Cases
Input: [[[[1,2],[3,4]],[[5,6],[7,8]]]]
Output: [[[[2,4],[6,8]],[[10,12],[14,16]]]]
Input: [[[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]]]]
Output: [[[[2,4,6],[8,10,12],[14,16,18]],[[20,22,24],[26,28,30],[32,34,36]]]]
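A solution that passes the test cases above can be written as follows. The function name and signature (a nested list in, a nested list out) are assumptions, since the problem statement does not fix them:

```python
import numpy as np

def skip_connection(x):
    """Identity skip connection: F(x) = x, so output = x + x (element-wise)."""
    arr = np.asarray(x)
    return (arr + arr).tolist()

result = skip_connection([[[[1, 2], [3, 4]],
                           [[5, 6], [7, 8]]]])
print(result)
# [[[[2, 4], [6, 8]], [[10, 12], [14, 16]]]]
```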