Click4Ai

127.

Medium

Transposed Convolution

Transposed convolution (also called fractionally-strided convolution and, somewhat inaccurately, deconvolution) is an **upsampling operation** that increases the spatial dimensions of the input. Each input value scales a copy of the kernel, the scaled copy is added into the output at an offset determined by the stride, and overlapping contributions accumulate.

Formula (no padding):

output_size = (input_size - 1) * stride + kernel_size

For stride = 2, input (N x N), kernel (K x K):

output_size = (N - 1) * 2 + K
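
The size formula can be checked with a small helper. This is a minimal sketch that assumes no padding and no dilation; the function name `transposed_conv_output_size` is illustrative and not part of the problem statement.

```python
def transposed_conv_output_size(input_size: int, kernel_size: int, stride: int) -> int:
    """Output length along one axis, assuming no padding and no dilation."""
    return (input_size - 1) * stride + kernel_size


# A 2x2 input with a 2x2 kernel at stride 2 gives a 4x4 output.
print(transposed_conv_output_size(2, 2, 2))  # 4
# The same input at stride 1 gives a 3x3 output, so kernel placements overlap.
print(transposed_conv_output_size(2, 2, 1))  # 3
```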

Algorithm:

For each input position (i, j):

output[i*stride : i*stride+K, j*stride : j*stride+K] += input[i,j] * kernel

Overlapping regions are summed together.
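
The scatter-and-add loop above can be sketched in NumPy as follows. This is one possible implementation under stated assumptions (2D arrays, no padding); the function name `transposed_conv2d` and the sample values are illustrative.

```python
import numpy as np


def transposed_conv2d(x: np.ndarray, kernel: np.ndarray, stride: int = 2) -> np.ndarray:
    """Scatter-add transposed convolution of a 2D input (no padding)."""
    n_rows, n_cols = x.shape
    k_rows, k_cols = kernel.shape
    out_rows = (n_rows - 1) * stride + k_rows
    out_cols = (n_cols - 1) * stride + k_cols
    out = np.zeros((out_rows, out_cols), dtype=np.result_type(x, kernel))

    for i in range(n_rows):
        for j in range(n_cols):
            r, c = i * stride, j * stride
            # Add the kernel, scaled by this input value, into its output window.
            out[r:r + k_rows, c:c + k_cols] += x[i, j] * kernel
    return out


x = np.array([[1, 2], [3, 4]])
kernel = np.array([[5, 6], [7, 8]])

# Stride 1 makes neighboring windows overlap, so the shared cells are summed.
print(transposed_conv2d(x, kernel, stride=1))
# [[ 5 16 12]
#  [22 60 40]
#  [21 52 32]]
```

The center cell, 60, receives a contribution from all four placements: 1*8 + 2*7 + 3*6 + 4*5.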

Example:

Input (2x2):

[[1, 2],
 [3, 4]]

Kernel (2x2):

[[5, 6],
 [7, 8]]

Stride = 2

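output_size = (2-1)*2 + 2 = 4 -> 4x4 output. Because the stride equals the kernel size, the four placements tile the output without overlapping.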

Place input[0,0]=1 * kernel at (0,0): [[5,6,0,0],[7,8,0,0],[0,0,0,0],[0,0,0,0]]

Place input[0,1]=2 * kernel at (0,2): add [[0,0,10,12],[0,0,14,16],[0,0,0,0],[0,0,0,0]]

Place input[1,0]=3 * kernel at (2,0): add [[0,0,0,0],[0,0,0,0],[15,18,0,0],[21,24,0,0]]

Place input[1,1]=4 * kernel at (2,2): add [[0,0,0,0],[0,0,0,0],[0,0,20,24],[0,0,28,32]]

Output:

[[ 5,  6, 10, 12],
 [ 7,  8, 14, 16],
 [15, 18, 20, 24],
 [21, 24, 28, 32]]
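
As a quick cross-check of this example: when the stride equals the kernel size, each output block is just one input value times the kernel, which matches the Kronecker product. The snippet below uses `np.kron` for verification only; it is a shortcut for this non-overlapping case, not a general transposed convolution.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
kernel = np.array([[5, 6], [7, 8]])

# With stride == kernel size, block (i, j) of the output is x[i, j] * kernel,
# which is the definition of the Kronecker product.
output = np.kron(x, kernel)
print(output)
# [[ 5  6 10 12]
#  [ 7  8 14 16]
#  [15 18 20 24]
#  [21 24 28 32]]
```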

Transposed convolutions are widely used in architectures that require upsampling, such as autoencoders, GAN generators, and semantic segmentation networks (e.g., the U-Net decoder). Their kernel weights are learned during training, whereas interpolation schemes such as nearest-neighbor or bilinear upsampling use fixed weights.
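
To make the contrast with fixed interpolation concrete, the sketch below shows that nearest-neighbor 2x upsampling gives the same result as a stride-2 transposed convolution whose 2x2 kernel is fixed to all ones. The variable names are illustrative, and `np.kron` stands in for the transposed convolution, which is valid here only because the stride equals the kernel size.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Fixed scheme: nearest-neighbor upsampling repeats each value in a 2x2 block.
nearest = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

# The same result from a stride-2 transposed convolution with an all-ones kernel
# (written via np.kron, valid here because stride equals kernel size).
ones_kernel = np.ones((2, 2), dtype=x.dtype)
as_transposed_conv = np.kron(x, ones_kernel)

print(np.array_equal(nearest, as_transposed_conv))  # True
```

A learned transposed convolution replaces the all-ones kernel with trainable weights, so the network can fit an upsampling pattern to the data.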

Constraints:

  • Input is a 2x2 numpy array
  • Kernel is a 2x2 numpy array
  • Stride is 2 (each input element maps to a 2x2 region in the output)
  • Output is a 4x4 numpy array
  • Overlapping kernel placements are summed

Test Cases:

    Test Case 1
    Input: [[1,2],[3,4]]
    Expected: [[12,21,12,21],[21,30,21,30],[12,21,12,21],[21,30,21,30]]
    Test Case 2
    Input: [[5,6],[7,8]]
    Expected: [[60,81,60,81],[81,108,81,108],[60,81,60,81],[81,108,81,108]]
    + 3 hidden test cases