
How To Calculate Dimensions Of First Linear Layer Of A CNN

Currently, I am working with a CNN that has a fully connected layer attached to it, and I am working with a 3-channel image of size 32x32. I am wondering if there is a cons

Solution 1:

Given an input spatial dimension w, a 2D convolution layer with kernel size k, stride s, padding p, and dilation d will output a tensor with the following size on that dimension:

int((w + 2*p - d*(k - 1) - 1)/s + 1)

The exact same formula applies to nn.MaxPool2d. For reference, you can look it up in the PyTorch documentation for nn.Conv2d and nn.MaxPool2d.
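As a quick sanity check (a minimal sketch; the layer parameters simply mirror the first block of your model), you can compare the formula against the shapes PyTorch actually produces:

>>> import torch, torch.nn as nn
>>> x = torch.empty(1, 3, 32, 32)
>>> nn.Conv2d(3, 16, kernel_size=3, padding=1)(x).shape
torch.Size([1, 16, 32, 32])
>>> int((32 + 2*1 - 1*(3 - 1) - 1)/1 + 1)
32
>>> nn.MaxPool2d(2, 2)(x).shape
torch.Size([1, 3, 16, 16])
>>> int((32 + 2*0 - 1*(2 - 1) - 1)/2 + 1)
16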

The convolution part of your model is made up of three (Conv2d + MaxPool2d) blocks. You can easily infer the spatial dimension size of the output with this helper function:

def conv_shape(x, k=1, p=0, s=1, d=1):
    return int((x + 2*p - d*(k - 1) - 1)/s + 1)

Nesting the calls, you get the resulting spatial dimension:

>>> w = conv_shape(conv_shape(32, k=3, p=1), k=2, s=2)
>>> w = conv_shape(conv_shape(w, k=3), k=2, s=2)
>>> w = conv_shape(conv_shape(w, k=3), k=2, s=2)

>>> w
2

Since your convolutions and poolings use square kernels, with identical horizontal and vertical strides and paddings, the above calculation holds for both the width and the height of the tensor. Lastly, looking at the last convolution layer, conv3, which has 64 filters, the resulting number of elements per batch element before your fully connected layer is w*w*64, i.e. 2*2*64 = 256.
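Continuing the same interpreter session, you can confirm that figure directly:

>>> 64 * w * w
256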


However, nothing stops you from calling your layers to find out the output shape!

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Flatten())

        # Infer the flattened feature size by passing a dummy input
        # through the feature extractor (batch size 1 is arbitrary).
        n_channels = self.feature_extractor(torch.empty(1, 3, 32, 32)).size(-1)

        self.classifier = nn.Sequential(
            nn.Linear(n_channels, 200),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(200, 100),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(100, 10))

    def forward(self, x):
        features = self.feature_extractor(x)
        out = self.classifier(features)
        return out

model = Net()
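
As a final sanity check (a minimal sketch; the batch size of 8 is arbitrary), a dummy batch confirms the classifier receives the right number of features and produces one logit per class:

>>> model(torch.randn(8, 3, 32, 32)).shape
torch.Size([8, 10])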
