Click4Ai

128. Dilated Convolution (Atrous Convolution)

Difficulty: Medium

Dilated (atrous) convolution expands the receptive field of a convolutional kernel by inserting **gaps (holes) between kernel elements**. This allows the network to capture larger spatial context without increasing the number of parameters or reducing spatial resolution.

Formula:

Effective kernel size = kernel_size + (kernel_size - 1) * (dilation_rate - 1)

For a kernel with dilation rate d:

output[i, j] = sum(input[i + k*d, j + l*d] * kernel[k, l])

for k in range(kernel_height), for l in range(kernel_width)

Output dimensions:

effective_K = kernel_size + (kernel_size - 1) * (dilation - 1)

output_size = input_size - effective_K + 1
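The two formulas above translate directly into a short NumPy sketch (valid padding, stride 1; the function name `dilated_conv2d` is my own, not given by the problem):

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """2D dilated (atrous) convolution: valid padding, stride 1."""
    kh, kw = kernel.shape
    # Effective kernel size after inserting (dilation - 1) gaps between taps.
    eff_h = kh + (kh - 1) * (dilation - 1)
    eff_w = kw + (kw - 1) * (dilation - 1)
    out_h = x.shape[0] - eff_h + 1
    out_w = x.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Sample the input at dilated offsets, then multiply by the kernel.
            patch = x[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out
```

With dilation=1 the strided slice degenerates to an ordinary contiguous patch, so the same function performs a standard convolution.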

Example:

Input (4x4):                 Kernel (2x2), Dilation = 2:

[[ 1,  2,  3,  4],           [[0, 1],
 [ 5,  6,  7,  8],            [1, 0]]
 [ 9, 10, 11, 12],
 [13, 14, 15, 16]]

Effective kernel size = 2 + (2-1)*(2-1) = 3

Output size = 4 - 3 + 1 = 2x2

Position (0,0): input[0,0]*0 + input[0,2]*1 + input[2,0]*1 + input[2,2]*0
                = 0 + 3 + 9 + 0 = 12
Position (0,1): input[0,1]*0 + input[0,3]*1 + input[2,1]*1 + input[2,3]*0
                = 0 + 4 + 10 + 0 = 14
Position (1,0): 0 + 7 + 13 + 0 = 20
Position (1,1): 0 + 8 + 14 + 0 = 22

Output: [[12, 14],
         [20, 22]]

Dilated convolutions are widely used in semantic segmentation (DeepLab), audio generation (WaveNet), and any task requiring large receptive fields. Stacking layers with increasing dilation rates (1, 2, 4, 8...) exponentially grows the receptive field while maintaining the same number of parameters per layer.
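The exponential growth claim can be checked with a short sketch (the helper `receptive_field` is my own; it assumes stride-1 layers, where each layer extends the field by (kernel_size - 1) * dilation):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 dilated conv layers."""
    rf = 1
    for d in dilations:
        # Each layer adds (kernel_size - 1) * dilation to the field.
        rf += (kernel_size - 1) * d
    return rf

# Doubling dilation rates (1, 2, 4, 8) with 3x3 kernels:
for depth in range(1, 5):
    dilations = [2 ** i for i in range(depth)]
    print(depth, receptive_field(3, dilations))  # prints 3, 7, 15, 31
```

Four layers of 3x3 kernels already see a 31x31 region, versus 9x9 for the same stack without dilation.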

Constraints:

  • Input and kernel are 2D numpy arrays
  • Dilation rate is a positive integer (1 = standard convolution)
  • The input must be large enough to accommodate the effective kernel size
  • Output values are computed by summing element-wise products at dilated positions
Test Cases:

    Test Case 1
    Input: [[1,2],[3,4]]
    Expected: [[3]]

    Test Case 2
    Input: [[1,2,3],[4,5,6],[7,8,9]]
    Expected: [[12,21],[27,36]]

    + 3 hidden test cases