Updated on 2024-10-30

Building your own ResNet18 network and loading torchvision's pre-trained weights

The network you build yourself must have exactly the same structure, dimensions and variable names as the network whose weights ship with torchvision (the .pth file); otherwise the weight file cannot be loaded.

In that case, you can compare the two state dictionaries and copy the weights over one by one; see also:

Solution for pytorch loading a pre-trained model that doesn't match its own model

import torch
import torchvision
import cv2 as cv
from utils import letter_box  # letterbox pad/resize helper; the module name was lost in the original
from model import ResNet18   # the custom ResNet18 definition; the module name was lost in the original

model1 = ResNet18(1)
model2 = torchvision.models.resnet18(progress=False)
fc = model2.fc
model2.fc = torch.nn.Linear(512, 1)
# print(model)
model_dict1 = model1.state_dict()
model_dict2 = torch.load('')  # path to the saved weight file (elided in the original)
model_list1 = list(model_dict1.keys())
model_list2 = list(model_dict2.keys())
len1 = len(model_list1)
len2 = len(model_list2)
minlen = min(len1, len2)
for n in range(minlen):
    if model_dict1[model_list1[n]].shape != model_dict2[model_list2[n]].shape:
        continue
    model_dict1[model_list1[n]] = model_dict2[model_list2[n]]
model1.load_state_dict(model_dict1)
missing, unexpected = model2.load_state_dict(model_dict2)
image = cv.imread('')  # path to the test image (elided in the original)
image = letter_box(image, 224)
image = image[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
print('Network loading complete.')
model1.eval()
model2.eval()
with torch.no_grad():
    image = torch.tensor(image / 256, dtype=torch.float32).unsqueeze(0)
    predict1 = model1(image)
    predict2 = model2(image)
print('finished')
# torch.save(model.state_dict(), '')

The above is the full procedure; at the end you can test whether the output of the original model equals the output of the custom model loaded with the same weights.
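For instance, a minimal numerical check (reusing predict1 and predict2 from the script above):

print(torch.allclose(predict1, predict2, atol=1e-6))  # True if the two models agree
print((predict1 - predict2).abs().max())              # largest element-wise difference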

Supplementary: building a ResNet classification network with PyTorch and training it via transfer learning

With stride=1 and padding=1, a 3x3 convolution does not change the height or width of the feature matrix: the output size is (W - 3 + 2*1)/1 + 1 = W.
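You can verify this with a quick sketch (the channel counts and input size here are illustrative):

import torch
import torch.nn as nn

conv = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False)
x = torch.randn(1, 64, 56, 56)
print(conv(x).shape)  # torch.Size([1, 64, 56, 56]): height and width unchanged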

When using a BN layer, the bias of the preceding convolution is set to False (the output of the BN layer is the same with or without it), and the BN layer is placed between the conv and relu layers, as in the sketch below.
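A minimal sketch of this conv -> BN -> ReLU ordering (the channel counts are illustrative):

import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False),  # bias is redundant before BN
    nn.BatchNorm2d(64),      # BN sits between the convolution and the activation
    nn.ReLU(inplace=True))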

A review of the BN layer:

A Batch Norm layer normalizes the activations of each layer and then applies a linear transformation to improve the data distribution; the linear transformation is learnable.

Batch Norm advantages:

- Mitigates overfitting.
- Improves gradient propagation (weights are neither too high nor too low).
- Allows higher learning rates, which can speed up training.
- Reduces the strong dependence on weight initialization and keeps the data in the non-saturated region of the activation function, which mitigates vanishing gradients to a certain extent.
- Acts as a form of regularization, reducing the need for dropout to some extent.

Batch Norm layer placement: there is no consensus on whether it should go before or after the activation layer (e.g. ReLU).

BN in combination with Dropout: the introduction of Batch Norm has reduced the use of dropout, but Batch Norm cannot replace dropout completely; keeping a small dropout rate, such as 0.2, may be more effective, as in the sketch below.
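For example (an illustrative classifier head, not part of the original code):

import torch.nn as nn

head = nn.Sequential(
    nn.Linear(512, 256),
    nn.BatchNorm1d(256),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.2),   # keep a small dropout rate alongside BN
    nn.Linear(256, 5))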

Why normalize first and then restore something close to the original with the learnable γ, β transformation? Isn't that redundant?

Under certain conditions the transformation can correct the distribution of the original data (the variance and mean become new values determined by γ, β); when the original distribution is already good enough, it acts as an identity mapping and does not change the distribution. Without BN, the variance and mean depend on the parameters of the preceding layers through complex nonlinear couplings. After normalization, the new output γH′ + β is determined only by γ and β, independent of the preceding network's parameters, so the new parameters are easy to learn by gradient descent and can learn a better distribution.
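The per-channel computation can be sketched by hand (a rough illustration of the training-time forward pass of BatchNorm2d; running statistics and momentum are omitted):

import torch

def bn_forward(x, gamma, beta, eps=1e-5):
    # normalize each channel over the batch and spatial dimensions
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # learnable linear transform: gamma and beta decide the new distribution
    return gamma * x_hat + beta

x = torch.randn(8, 4, 16, 16)
gamma = torch.ones(1, 4, 1, 1)   # identity mapping at initialization
beta = torch.zeros(1, 4, 1, 1)
y = bn_forward(x, gamma, beta)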

Transfer learning: importing and downloading the weights:

import torchvision.models.resnet  # ctrl + left-click to jump to the source, which lists the weight download URLs
net = resnet34()  # don't set the fully-connected output size yet: load the pretrained parameters first, then replace the fully-connected layer
# Official method for loading pre-trained models
model_weight_path = "./"  # path to the downloaded weights (filename elided in the original)
missing_keys, unexpected_keys = net.load_state_dict(torch.load(model_weight_path), strict=False)  # load model weights
inchannel = net.fc.in_features
net.fc = nn.Linear(inchannel, 5)  # redefine the fully-connected layer
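An alternative, consistent with the dictionary comparison at the top of this article (a sketch, assuming the same resnet34 and weight file): build the 5-class network directly and copy over only the weights whose shapes match, so the new fully-connected layer keeps its random initialization:

pre_dict = torch.load(model_weight_path)
net5 = resnet34(num_classes=5)
own_dict = net5.state_dict()
filtered = {k: v for k, v in pre_dict.items() if k in own_dict and v.shape == own_dict[k].shape}
own_dict.update(filtered)
net5.load_state_dict(own_dict)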

Full Code:

MODEL section:

import torch.nn as nn
import torch

class BasicBlock(nn.Module):  # residual structure for the 18- and 34-layer networks (covers both the solid-line and dashed-line residual structures)
    expansion = 1  # whether the convolutional layers on the main branch use the same number of kernels: 1 means they are the same; 4 means the third layer uses four times as many as the first two
    def __init__(self, in_channel, out_channel, stride=1, downsample=None):  # downsample selects the dashed-line residual structure
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.downsample = downsample
    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)  # output of the shortcut branch
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out += identity
        out = self.relu(out)
        return out  # final output of the residual structure

class Bottleneck(nn.Module):  # residual structure for the 50-, 101- and 152-layer networks
    expansion = 4  # the third layer uses four times as many convolution kernels as the first and second layers
    def __init__(self, in_channel, out_channel, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=1, stride=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.conv3 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel * self.expansion,
                               kernel_size=1, stride=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channel * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv3(out)
        out = self.bn3(out)
        out += identity
        out = self.relu(out)
        return out

class ResNet(nn.Module):  # the framework of the whole network
    # block: which residual module to use (BasicBlock or Bottleneck); blocks_num: list with the number of residual structures per stage
    def __init__(self, block, blocks_num, num_classes=1000, include_top=True):
        super(ResNet, self).__init__()
        self.include_top = include_top
        self.in_channel = 64  # depth of the feature matrix after the first pooling layer
        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,
                               padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, blocks_num[0])
        self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)
        if self.include_top:
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output size = (1, 1)
            self.fc = nn.Linear(512 * block.expansion, num_classes)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    def _make_layer(self, block, channel, block_num, stride=1):  # channel: number of kernels in the first convolutional layer of the residual structure
        downsample = None
        if stride != 1 or self.in_channel != channel * block.expansion:  # for the 18- and 34-layer networks, layer1 skips this if statement
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion))
        layers = []
        layers.append(block(self.in_channel, channel, downsample=downsample, stride=stride))
        self.in_channel = channel * block.expansion
        for _ in range(1, block_num):
            layers.append(block(self.in_channel, channel))
        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        if self.include_top:  # default is True
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.fc(x)
        return x

def resnet34(num_classes=1000, include_top=True):
    return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)

def resnet101(num_classes=1000, include_top=True):
    return ResNet(Bottleneck, [3, 4, 23, 3], num_classes=num_classes, include_top=include_top)
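A quick sanity check of these factory functions (a minimal sketch):

import torch

net = resnet34(num_classes=5)
x = torch.randn(1, 3, 224, 224)  # one RGB image at the standard input size
print(net(x).shape)              # torch.Size([1, 5])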

Training section:

import torch
import torch.nn as nn
from torchvision import transforms, datasets
import json
import matplotlib.pyplot as plt
import os
import torch.optim as optim
from model import resnet34, resnet101
import torchvision.models.resnet  # ctrl + left-click to jump to the source, which lists the weight download URLs
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
data_transform = {
    "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor(),
                                 transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),  # stay consistent with the official preprocessing
    "val": transforms.Compose([transforms.Resize(256),
                               transforms.CenterCrop(224),
                               transforms.ToTensor(),
                               transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}

data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
image_path = data_root + "/data_set/flower_data/"  # flower data set path
train_dataset = datasets.ImageFolder(root=image_path + "train",
                                     transform=data_transform["train"])
train_num = len(train_dataset)
# {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
flower_list = train_dataset.class_to_idx
cla_dict = dict((val, key) for key, val in flower_list.items())
# write dict into json file
json_str = json.dumps(cla_dict, indent=4)
with open('class_indices.json', 'w') as json_file:
    json_file.write(json_str)
batch_size = 16
train_loader = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=batch_size, shuffle=True,
                                           num_workers=0)
validate_dataset = datasets.ImageFolder(root=image_path + "val",
                                        transform=data_transform["val"])
val_num = len(validate_dataset)
validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                              batch_size=batch_size, shuffle=False,
                                              num_workers=0)
net = resnet34()  # don't set the fully-connected output size yet: load the pretrained parameters first, then replace the fully-connected layer
# Official method for loading pre-trained models
model_weight_path = "./"  # path to the downloaded weights (filename elided in the original)
missing_keys, unexpected_keys = net.load_state_dict(torch.load(model_weight_path), strict=False)  # load model weights
inchannel = net.fc.in_features
net.fc = nn.Linear(inchannel, 5)  # redefine the fully-connected layer
net.to(device)
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.0001)
best_acc = 0.0
save_path = './'  # where to save the best weights (filename elided in the original)
for epoch in range(3):
    # train
    net.train()  # switch the BN layers to training mode
    running_loss = 0.0
    for step, data in enumerate(train_loader, start=0):
        images, labels = data
        optimizer.zero_grad()
        logits = net(images.to(device))
        loss = loss_function(logits, labels.to(device))
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        # print train process
        rate = (step+1)/len(train_loader)
        a = "*" * int(rate * 50)
        b = "." * int((1 - rate) * 50)
        print("\rtrain loss: {:^3.0f}%[{}->{}]{:.4f}".format(int(rate*100), a, b, loss), end="")
    print()
    # validate
    net.eval()  # switch the BN layers to evaluation mode
    acc = 0.0  # accumulate accurate number / epoch
    with torch.no_grad():
        for val_data in validate_loader:
            val_images, val_labels = val_data
            outputs = net(val_images.to(device))  # eval model only have last output layer
            # loss = loss_function(outputs, test_labels)
            predict_y = torch.max(outputs, dim=1)[1]
            acc += (predict_y == val_labels.to(device)).sum().item()
        val_accurate = acc / val_num
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)
        print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f' %
              (epoch + 1, running_loss / step, val_accurate))
print('Finished Training')

Prediction section:

import torch
from model import resnet34
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
import json
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
data_transform = transforms.Compose(
    [transforms.Resize(256),
     transforms.CenterCrop(224),
     transforms.ToTensor(),
     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])  # use the same normalization as during training
# load image
img = Image.open("../")  # path to the test image (filename elided in the original)
plt.imshow(img)
# [N, C, H, W]
img = data_transform(img)
# expand batch dimension
img = torch.unsqueeze(img, dim=0)
# read class_indict
try:
    json_file = open('./class_indices.json', 'r')
    class_indict = json.load(json_file)
except Exception as e:
    print(e)
    exit(-1)
# create model
model = resnet34(num_classes=5)
# load model weights
model_weight_path = "./"
model.load_state_dict(torch.load(model_weight_path, map_location=device))  # load the parameters of the trained model
model.eval()  # use eval() mode
with torch.no_grad():  # do not track the gradients
    # predict class
    output = torch.squeeze(model(img))  # squeeze the batch dimension
    predict = torch.softmax(output, dim=0)  # get the probability distribution via softmax
    predict_cla = torch.argmax(predict).numpy()  # index of the maximum value
print(class_indict[str(predict_cla)], predict[predict_cla].numpy())  # print the class and its probability
plt.show()

The above is my personal experience. I hope it can serve as a reference for you, and I hope you will continue to support this site.