Dilated Convolution (Atrous Convolution)
Dilated (atrous) convolution expands the receptive field of a convolutional kernel by inserting **gaps (holes) between kernel elements**. This allows the network to capture larger spatial context without increasing the number of parameters or reducing spatial resolution.
Formula:
Effective kernel size = kernel_size + (kernel_size - 1) * (dilation_rate - 1)
For a kernel with dilation rate d (stride 1, no padding):
output[i, j] = sum(input[i + k*d, j + l*d] * kernel[k, l])
for k in range(kernel_height), l in range(kernel_width)
Output dimensions:
effective_K = kernel_size + (kernel_size - 1) * (dilation - 1)
output_size = input_size - effective_K + 1
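The two size formulas above can be checked with a small helper (a sketch; the function names are mine, and "valid" convolution with stride 1 and no padding is assumed, matching the formulas):

```python
def effective_kernel_size(kernel_size, dilation):
    # (dilation - 1) zeros are inserted between adjacent kernel elements,
    # so each of the (kernel_size - 1) gaps widens by (dilation - 1).
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def output_size(input_size, kernel_size, dilation):
    # Valid convolution, stride 1: slide the effective kernel over the input.
    return input_size - effective_kernel_size(kernel_size, dilation) + 1

print(effective_kernel_size(2, 2))  # → 3
print(output_size(4, 2, 2))         # → 2
```

These values match the 4x4 example below: a 2x2 kernel at dilation 2 behaves like a 3x3 kernel and produces a 2x2 output.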
Example:
Input (4x4):
[[ 1,  2,  3,  4],
 [ 5,  6,  7,  8],
 [ 9, 10, 11, 12],
 [13, 14, 15, 16]]

Kernel (2x2):
[[0, 1],
 [1, 0]]

Dilation = 2
Effective kernel size = 2 + (2-1)*(2-1) = 3
Output size = 4 - 3 + 1 = 2x2
Position (0,0): input[0,0]*0 + input[0,2]*1 + input[2,0]*1 + input[2,2]*0
              = 0 + 3 + 9 + 0 = 12
Position (0,1): input[0,1]*0 + input[0,3]*1 + input[2,1]*1 + input[2,3]*0
              = 0 + 4 + 10 + 0 = 14
Position (1,0): input[1,0]*0 + input[1,2]*1 + input[3,0]*1 + input[3,2]*0
              = 0 + 7 + 13 + 0 = 20
Position (1,1): input[1,1]*0 + input[1,3]*1 + input[3,1]*1 + input[3,3]*0
              = 0 + 8 + 14 + 0 = 22
Output: [[12, 14],
         [20, 22]]
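The worked example can be reproduced with a direct pure-Python implementation of the summation formula (a minimal sketch assuming stride 1 and no padding; the function name is illustrative):

```python
def dilated_conv2d(inp, kernel, dilation):
    # Dilated (atrous) 2D convolution: kernel taps are spaced `dilation`
    # apart in the input, per the formula
    # output[i, j] = sum_{k, l} input[i + k*d, j + l*d] * kernel[k, l].
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(inp), len(inp[0])
    eff_h = kh + (kh - 1) * (dilation - 1)  # effective kernel height
    eff_w = kw + (kw - 1) * (dilation - 1)  # effective kernel width
    out_h, out_w = h - eff_h + 1, w - eff_w + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            s = 0
            for k in range(kh):
                for l in range(kw):
                    s += inp[i + k * dilation][j + l * dilation] * kernel[k][l]
            out[i][j] = s
    return out

inp = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
kernel = [[0, 1],
          [1, 0]]
print(dilated_conv2d(inp, kernel, 2))
```

With dilation = 1 the same function reduces to an ordinary valid convolution.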
Dilated convolutions are widely used in semantic segmentation (DeepLab), audio generation (WaveNet), and any task requiring large receptive fields. Stacking layers with increasing dilation rates (1, 2, 4, 8...) exponentially grows the receptive field while maintaining the same number of parameters per layer.
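The receptive-field growth from stacking can be sketched numerically (assuming stride-1 layers, where each layer adds (kernel_size - 1) * dilation to the receptive field of one output unit; the function name is mine):

```python
def receptive_field(kernel_size, dilations):
    # Receptive field of a single output unit after a stack of
    # stride-1 convolutions with the given dilation rates.
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Stacked 3x3 layers with dilations 1, 2, 4, 8:
print(receptive_field(3, [1, 2, 4, 8]))  # → 31
```

Doubling the dilation each layer roughly doubles the receptive field per layer added, while every layer keeps the same 3x3 = 9 weights.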
Test Cases
[[1,2],[3,4]]
[[3]]
[[1,2,3],[4,5,6],[7,8,9]]
[[12,21],[27,36]]