PyTorch source code walkthrough: torchvision.models

The PyTorch ecosystem includes a very important and easy-to-use package: torchvision. It is mainly composed of 3 sub-packages: torchvision.datasets, torchvision.models, and torchvision.transforms.

For an introduction to these 3 sub-packages, see the official documentation:

/docs/master/torchvision/

For the source code, see GitHub:

/pytorch/vision/tree/master/torchvision

This post is about torchvision.models. The package contains common network structures such as alexnet, densenet, inception, resnet, squeezenet, and vgg, and it provides pre-trained models, so both the network structure and the pre-trained weights can be loaded with a simple call.

Example usage:

import torchvision
model = torchvision.models.resnet50(pretrained=True)

This imports resnet50 together with its pre-trained weights. If you only need the network structure and do not want it initialized with the pre-trained parameters, use:

model = torchvision.models.resnet50(pretrained=False)

The same applies to the densenet models. For example, to import densenet169 without pre-trained weights:

model = torchvision.models.densenet169(pretrained=False)

Since pretrained is False by default, this is equivalent to:

model = torchvision.models.densenet169()

However, for the sake of clear code, it is best to pass the parameter explicitly.

Next, we take the import of resnet50 as an example and walk through the source code that runs when a model is imported. A call to model = torchvision.models.resnet50(pretrained=True) is handled by the resnet.py script in the models package, whose source code is as follows:

The script starts by importing the necessary libraries, where model_zoo is the package responsible for importing pretrained models. The __all__ variable lists the function and class names that the module exports to the outside, which is why the function could be called as torchvision.models.resnet50() in the previous section. The model_urls dictionary holds the download addresses of the pretrained models.

import torch.nn as nn
import math
import torch.utils.model_zoo as model_zoo

__all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101',
           'resnet152']

# Download addresses of the pretrained weights, keyed by model name
model_urls = {
    'resnet18': '/models/',
    'resnet34': '/models/',
    'resnet50': '/models/',
    'resnet101': '/models/',
    'resnet152': '/models/',
}

Next is the function resnet50, whose pretrained parameter defaults to False. First, model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs) builds the network structure. Bottleneck is a separate class that builds the bottleneck residual unit; the ResNet structure contains many repeated substructures, and they are all built through this class, which is introduced later. Then, if pretrained is True, the load_url function in model_zoo.py downloads or loads the corresponding pretrained model from the address stored in the model_urls dictionary. Finally, the model's load_state_dict method initializes the network you just built with the pretrained parameters, that is, it copies the parameters of one model into the layers of another. load_state_dict has another important parameter, strict. It defaults to True, which means the layers of the pretrained model must correspond exactly to the layers of your network structure (for example, in layer names and dimensions).

def resnet50(pretrained=False, **kwargs):
    """Constructs a ResNet-50 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet50']))
    return model
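To make the meaning of strict more concrete, here is a pure-Python stand-in for the key comparison. It is a simplified model under stated assumptions, not PyTorch's implementation: check_keys and the layer names are hypothetical, real state dicts map names to tensors, and the exception type PyTorch raises may differ.

```python
def check_keys(model_keys, checkpoint_keys, strict=True):
    """Hypothetical helper: compare the parameter names of a model and a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Names the model expects but the checkpoint does not provide
    missing = sorted(model_keys - checkpoint_keys)
    # Names the checkpoint provides but the model does not have
    unexpected = sorted(checkpoint_keys - model_keys)
    if strict and (missing or unexpected):
        raise KeyError("missing={}, unexpected={}".format(missing, unexpected))
    return missing, unexpected


# Hypothetical parameter names, for illustration only
model_keys = ["conv1.weight", "bn1.weight", "fc.weight", "fc.bias"]
checkpoint_keys = ["conv1.weight", "bn1.weight", "classifier.weight"]

# With strict=False the mismatch is reported instead of raising
missing, unexpected = check_keys(model_keys, checkpoint_keys, strict=False)
print("missing:", missing)
print("unexpected:", unexpected)
```

With strict=True the same call raises, which mirrors why a pretrained checkpoint only loads cleanly into a network whose layer names match it.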

The other functions, such as resnet18 and resnet101, are basically the same as resnet50. The main differences are:

1. The layers list passed when building the network structure is different. For example, it is [2, 2, 2, 2] in resnet18 and [3, 4, 23, 3] in resnet101.

2. The block class is different. resnet50, resnet101, and resnet152 use the Bottleneck class, while resnet18 and resnet34 use the BasicBlock class. The two classes differ mainly in the number of convolutional layers in the residual structure. This is related to the network structure and is introduced in detail later.

3. When a pretrained model is downloaded, the key used to look up the model_urls dictionary is different, so a different pretrained model is fetched.

The resnet18 and resnet101 functions are listed here for comparison; after them, we look at how the network structure is built and how the pretrained model is imported.

def resnet18(pretrained=False, **kwargs):
    """Constructs a ResNet-18 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
    return model

def resnet101(pretrained=False, **kwargs):
    """Constructs a ResNet-101 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet101']))
    return model
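The relationship between the layers list and the number in each model name can be checked with a little arithmetic. The sketch below is not from torchvision: depth and CONFIGS are hypothetical names, and the resnet34 and resnet152 lists come from my recollection of torchvision's resnet.py rather than from the code shown in this post.

```python
# name -> (convolutions per residual block, residual blocks in each of the 4 stages)
CONFIGS = {
    "resnet18": (2, [2, 2, 2, 2]),    # BasicBlock: two 3x3 convolutions
    "resnet34": (2, [3, 4, 6, 3]),
    "resnet50": (3, [3, 4, 6, 3]),    # Bottleneck: 1x1, 3x3, 1x1 convolutions
    "resnet101": (3, [3, 4, 23, 3]),
    "resnet152": (3, [3, 8, 36, 3]),
}


def depth(convs_per_block, layers):
    """Count weighted layers: block convolutions plus the stem conv and the final fc."""
    return convs_per_block * sum(layers) + 2


for name, (convs, layers) in CONFIGS.items():
    print(name, depth(convs, layers))
```

The count ignores the 1x1 convolutions on the downsample path, which is the usual convention behind the model names.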

The ResNet network is built by the ResNet class. First, it inherits from torch.nn.Module, the base class for networks in PyTorch. Second, it overrides the __init__ and forward methods. __init__ mainly defines the layers and their parameters, while forward defines the order in which data flows through the layers, that is, how the layers are connected. Other private methods can also be defined in the class to modularize some operations; here, _make_layer is used to build the 4 blocks of the ResNet network (layer1 to layer4). The first argument of _make_layer, block, is the Bottleneck or BasicBlock class; the second, planes, is the base channel count of the block (its output has planes * block.expansion channels); and the third, blocks, is the number of residual substructures the block contains. The layers list passed to the constructor is therefore the [3, 4, 6, 3] seen earlier for resnet50.

The two most important pieces of code in _make_layer are: 1. layers.append(block(self.inplanes, planes, stride, downsample)), which stores the first residual structure of each block in the layers list. 2. for i in range(1, blocks): layers.append(block(self.inplanes, planes)), which stores the remaining residual structures of the block in the layers list, completing the construction of one block. In both cases, each residual structure is constructed through the block class, which is Bottleneck for resnet50. The Bottleneck class is introduced after the ResNet code.
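The channel bookkeeping that _make_layer performs can be traced without PyTorch. The following is a minimal sketch under stated assumptions: trace_make_layer is a hypothetical helper that records the arguments each residual structure would receive instead of building layers, and expansion=4 corresponds to Bottleneck.

```python
def trace_make_layer(inplanes, planes, blocks, expansion, stride=1):
    """Return (new_inplanes, [(in_channels, out_channels, stride, has_downsample), ...])."""
    out_channels = planes * expansion
    # Same condition as in _make_layer: the shortcut needs a projection
    # when the spatial size or the channel count changes.
    needs_downsample = stride != 1 or inplanes != out_channels
    trace = [(inplanes, out_channels, stride, needs_downsample)]
    inplanes = out_channels
    # The remaining residual structures keep the shape, so no downsample is needed
    for _ in range(1, blocks):
        trace.append((inplanes, out_channels, 1, False))
    return inplanes, trace


inplanes = 64  # channels after the initial 7x7 convolution
stages = [(64, 3, 1), (128, 4, 2), (256, 6, 2), (512, 3, 2)]  # resnet50
for planes, blocks, stride in stages:
    inplanes, trace = trace_make_layer(inplanes, planes, blocks, expansion=4, stride=stride)
    print("planes={}: first={}, count={}".format(planes, trace[0], len(trace)))
```

Only the first residual structure of each block receives a stride and a downsample branch; this is why the original code handles it separately from the loop.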

class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000):
        self.inplanes = 64
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1)
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        # Weight initialization for the convolution and BN layers
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        # Project the shortcut when the spatial size or channel count changes
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features)
        x = self.fc(x)

        return x
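The order of layers in forward also fixes how the spatial size of the feature map shrinks. As a rough check, the sketch below applies the standard output-size formula for convolution and pooling layers to a 224x224 input, which is an assumption on my part about the intended input size; conv_out is a hypothetical helper, not a PyTorch function.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


sizes = []
size = 224                                               # assumed input height and width
size = conv_out(size, 7, 2, 3); sizes.append(size)       # conv1
size = conv_out(size, 3, 2, 1); sizes.append(size)       # maxpool
size = conv_out(size, 3, 1, 1); sizes.append(size)       # layer1 (stride 1)
for _ in range(3):                                       # layer2, layer3, layer4 (stride 2)
    size = conv_out(size, 3, 2, 1); sizes.append(size)
size = conv_out(size, 7, 1, 0); sizes.append(size)       # avgpool with kernel 7

print(sizes)
```

Under this assumption the feature map is 7x7 when it reaches nn.AvgPool2d(7, stride=1), so it is reduced to 1x1 and the flattened vector has exactly 512 * block.expansion features, which is what the fc layer expects.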

From the ResNet class above, we can see that the most important piece in constructing the network is the Bottleneck class: ResNet is composed of residual structures, and Bottleneck builds them. Like ResNet, Bottleneck inherits from torch.nn.Module and overrides the __init__ and forward methods. The forward method shows that a bottleneck consists of the familiar convolutional layers, BN layers, and activation layers, with three convolutions in total (1x1, 3x3, and 1x1). The final out += residual is an element-wise add.

class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        # Match the shortcut to the output shape when needed
        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out

The BasicBlock class is similar to the Bottleneck class. It is mainly used to build the ResNet18 and ResNet34 networks, whose residual structure contains only two convolutional layers and has no bottleneck. Therefore, the convolution layers in this class use kernel_size=3, as shown in the conv3x3 function.

def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out
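One way to see what the bottleneck buys is to count convolution weights for the two designs when both map 256 channels to 256 channels. This is a back-of-the-envelope sketch, not something from the torchvision source: the helper names are hypothetical, and BN parameters and the downsample branch are ignored.

```python
def conv_weights(in_channels, out_channels, kernel):
    """Weights of a bias-free 2D convolution with a square kernel."""
    return in_channels * out_channels * kernel * kernel


def bottleneck_weights(inplanes, planes, expansion=4):
    # 1x1 reduce, 3x3 at the reduced width, 1x1 expand
    return (conv_weights(inplanes, planes, 1)
            + conv_weights(planes, planes, 3)
            + conv_weights(planes, planes * expansion, 1))


def basic_block_weights(inplanes, planes):
    # two 3x3 convolutions at full width
    return conv_weights(inplanes, planes, 3) + conv_weights(planes, planes, 3)


print(bottleneck_weights(256, 64))    # 256 -> 64 -> 64 -> 256 channels
print(basic_block_weights(256, 256))  # 256 -> 256 -> 256 channels
```

The expensive 3x3 convolution runs on only 64 channels inside the bottleneck, so the block needs far fewer weights than two full-width 3x3 convolutions would.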

Having covered how the network is built, the next step is how the pre-trained model is obtained. The relevant code, mentioned above, is if pretrained: model.load_state_dict(model_zoo.load_url(model_urls['resnet50'])). It imports the pretrained model that corresponds to the entry in the model_urls dictionary through the load_url function in model_zoo.py. The GitHub address of the model_zoo.py script:

/pytorch/pytorch/blob/master/torch/utils/model_zoo.py

The source code of the load_url function is as follows.

First, model_dir is the directory where the downloaded model is stored. If it is not specified, the model is saved under ~/.torch/models (unless the TORCH_HOME or TORCH_MODEL_ZOO environment variable points elsewhere), so it is best to specify it. cached_file is the model directory joined with the model file name. The following if not os.path.exists(cached_file) statement checks whether the model to be downloaded already exists in that directory. If it does, torch.load() is called directly to load it. If it does not, it is first downloaded from the Internet through _download_url_to_file(url, cached_file, hash_prefix, progress=progress), which is not discussed in detail here. The key point is that the model is always loaded through the torch.load() interface, whether it was just downloaded or was already available locally.

def load_url(url, model_dir=None, map_location=None, progress=True):
    r"""Loads the Torch serialized object at the given URL.

    If the object is already present in `model_dir`, it's deserialized and
    returned. The filename part of the URL should follow the naming convention
    ``filename-<sha256>.ext`` where ``<sha256>`` is the first eight or more
    digits of the SHA256 hash of the contents of the file. The hash is used to
    ensure unique names and to verify the contents of the file.

    The default value of `model_dir` is ``$TORCH_HOME/models`` where
    ``$TORCH_HOME`` defaults to ``~/.torch``. The default directory can be
    overriden with the ``$TORCH_MODEL_ZOO`` environment variable.

    Args:
        url (string): URL of the object to download
        model_dir (string, optional): directory in which to save the object
        map_location (optional): a function or a dict specifying how to remap storage locations (see torch.load)
        progress (bool, optional): whether or not to display a progress bar to stderr

    Example:
        >>> state_dict = torch.utils.model_zoo.load_url('/pytorch/models/')

    """
    if model_dir is None:
        torch_home = os.path.expanduser(os.getenv('TORCH_HOME', '~/.torch'))
        model_dir = os.getenv('TORCH_MODEL_ZOO', os.path.join(torch_home, 'models'))
    if not os.path.exists(model_dir):
        os.makedirs(model_dir)
    parts = urlparse(url)
    filename = os.path.basename(parts.path)
    cached_file = os.path.join(model_dir, filename)
    # Download only if the file is not already cached
    if not os.path.exists(cached_file):
        sys.stderr.write('Downloading: "{}" to {}\n'.format(url, cached_file))
        hash_prefix = HASH_REGEX.search(filename).group(1)
        _download_url_to_file(url, cached_file, hash_prefix, progress=progress)
    return torch.load(cached_file, map_location=map_location)
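The path handling in load_url can be tried on its own with the standard library. The sketch below makes several assumptions: cache_info is a hypothetical helper, the URL is a made-up example that follows the filename-<sha256>.ext convention, and the regular expression is my guess at what HASH_REGEX looks like rather than a copy of the one in model_zoo.py. Nothing is downloaded.

```python
import os
import re
from urllib.parse import urlparse

# Assumed pattern: the hex digits between the last "-" and the extension
HASH_REGEX = re.compile(r'-([a-f0-9]*)\.')


def cache_info(url, model_dir):
    """Derive the file name, cache path, and hash prefix the way load_url does."""
    parts = urlparse(url)
    filename = os.path.basename(parts.path)
    cached_file = os.path.join(model_dir, filename)
    hash_prefix = HASH_REGEX.search(filename).group(1)
    return filename, cached_file, hash_prefix


# Hypothetical URL, for illustration only
url = "https://example.com/models/mynet-0123abcd.pth"
filename, cached_file, hash_prefix = cache_info(url, os.path.join("cache", "models"))
print(filename)
print(cached_file)
print(hash_prefix)
```

Because the cache path depends only on the file name in the URL, a second call with the same URL and model_dir finds the file on disk and skips the download.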

That is all for this walkthrough of the PyTorch source code. I hope it serves as a useful reference.