PyTorch is one of the two hottest deep learning frameworks out there, the other being TensorFlow. I had only ever used the CPU version, but a few months ago I bought a laptop with a 3070Ti (yes, a 30-series card right as the 40-series came out, which is a little painful to admit), and I also have an M1 Macbook Pro, which now supports PyTorch GPU acceleration as well. So I figured I would set up PyTorch on both machines and dip my toes into deep learning.
Apple silicon
First up is the M1 chip, which is particularly easy. Start by installing Miniforge, which ships with the mamba package manager, uses the conda-forge channel by default, and has a native arm64 build.
```bash
# Download
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
# Installation
bash Miniforge3-MacOSX-arm64.sh
```
Then we create an environment with mamba. Since we want the development version of PyTorch, the channel is pytorch-nightly:
```bash
mamba create -n pytorch \
    jupyterlab jupyterhub pytorch torchvision torchaudio -c pytorch-nightly
```
Finally, activate the environment with conda activate pytorch and test whether the GPU is correctly recognized:
```python
import torch

torch.has_mps  # True

# Configure the device
device = torch.device("mps")
```
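Recent builds also expose a more explicit pair of checks under torch.backends.mps. Here is a minimal sketch (standard upstream API, with a CPU fallback added by me for older builds):

```python
import torch

# Prefer MPS only when this wheel was built with MPS support and the running
# machine actually exposes it; otherwise fall back to the CPU.
if torch.backends.mps.is_built() and torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(device)  # expected: mps on an Apple silicon machine with a supported build
```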
Reference: https://developer.apple.com/metal/pytorch/
Windows NVIDIA
First, you need to make sure that your computer has an NVIDIA graphics card installed, as well as having the appropriate CUDA drivers.
Graphics card architecture requirements for CUDA: https://docs.nvidia.com/deeplearning/cudnn/support-matrix/
Most newer computers already come with a CUDA driver. You can check which version is installed by opening the NVIDIA Control Panel, going to System Information, and looking under the Components tab; mine, for example, is 11.7.89.
You can also check the driver version from the command line with nvidia-smi.
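For a scriptable check, here is a hedged sketch using nvidia-smi's query flags (it assumes the driver is installed and nvidia-smi is on the PATH; the printed output is illustrative):

```python
import subprocess

# Ask the driver directly for its version and the GPU name instead of
# parsing the full nvidia-smi table.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print(out.stdout.strip())  # e.g. "516.94, NVIDIA GeForce RTX 3070 Ti Laptop GPU"
```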
Next, let's install PyTorch. Again, conda is the recommended method; we start by downloading Miniconda from the Tsinghua mirror.
Select the Windows installation package
Once installed, we can open a command line via the Anaconda Prompt and follow the recommendations on the PyTorch website. There is one difference, though: to avoid environment conflicts, it is better to create a separate environment, so the command looks like this:
```bash
conda create -n pytorch pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
```
Then run conda activate pytorch to start the environment, and test it in a Python session:
```python
import torch

torch.has_cuda  # True
```
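If that check passes, a few more standard torch calls (a quick verification sketch, nothing specific to this setup) can confirm which device PyTorch actually sees:

```python
import torch

# Assumes an NVIDIA GPU and driver are present.
print(torch.cuda.is_available())            # True if the driver and a supported GPU are visible
print(torch.cuda.device_count())            # number of visible GPUs
print(torch.cuda.get_device_name(0))        # e.g. the 3070Ti on this laptop
print(torch.cuda.get_device_capability(0))  # compute capability, e.g. (8, 6) for Ampere cards
```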
A few common questions (at least the ones I thought of while writing the article):
Q: What is the difference between installing with conda and pip?
A: conda is the official recommended installation method for PyTorch, because conda also installs the CUDA libraries and related toolsets that PyTorch needs to run. The flip side is that a conda installation takes up a bit more space.
Q: Is it possible to install the latest pytorch on very old hardware?
A: I think this is similar to installing a game: you can install it, but if your hardware doesn't meet the minimum requirements, it won't run properly.
Q: Do I have to install the CUDA driver and the CUDA toolkit on my computer?
A: I'm actually not quite sure how to answer this myself, but here is my understanding so far. If you use conda, it handles some of these dependencies for you; if you use pip, you need to set things up yourself. The CUDA driver is a must: it is what connects the GPU hardware to the operating system, and without it the OS cannot see the CUDA cores at all, which is effectively the same as not having an NVIDIA card installed. The CUDA toolkit, on the other hand, is a collection of development tools for programming the CUDA cores, and installing the toolkit also installs a driver alongside it. Unless you want to do low-level development, or you need to compile PyTorch from source, you don't need to install the CUDA toolkit yourself.
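To make the driver/toolkit distinction concrete, here is a small sketch (standard torch attributes, nothing specific to my setup): the CUDA runtime and cuDNN build that conda bundled with PyTorch are visible from torch itself, while the driver version belongs to the operating system (the nvidia-smi check shown earlier).

```python
import torch

# These report the CUDA runtime / cuDNN that were compiled into this PyTorch
# build, not the driver installed on the machine.
print(torch.version.cuda)              # e.g. '11.7'
print(torch.backends.cudnn.version())  # e.g. 8500 (cuDNN build number)

# If the OS-level driver were missing, this would be False even though the
# bundled runtime above still prints a version.
print(torch.cuda.is_available())
```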
Q: What if the CUDA driver version on my computer is older? For example, what happens if my CUDA driver version is 11.7, but I install PyTorch built against cuda=11.8, or some other version?
A: When we install PyTorch with cuda=11.7, we are essentially installing a build of PyTorch that was compiled against CUDA Toolkit 11.7. As long as the gap between the CUDA versions is not particularly large, and the newer version is not a breaking upgrade, it should in theory still work.
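As a quick, hedged smoke test for the mismatch scenario: move a tiny tensor onto the GPU and run one operation. If the driver is too old for the runtime PyTorch was built against, the failure usually surfaces here as a RuntimeError rather than at import time.

```python
import torch

# Minimal smoke test -- illustrative only, not a guarantee of full compatibility.
try:
    x = torch.ones(1, device="cuda") * 2  # forces a kernel launch on the GPU
    print("CUDA kernel ran fine:", x.item())
except RuntimeError as err:
    print("Driver/runtime mismatch or no usable GPU:", err)
```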
Handwriting Data Performance Test
The following handwriting recognition (MNIST) code was given to me by GPT-3.5 to test the speed of the different platforms.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Convert data formats and load data
transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.MNIST(root='./data', train=False,
                                     download=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64,
                                         shuffle=False, num_workers=2)

# Define the network model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

# The code here is rather arbitrary -- just pick whichever platform you want to run on
# CPU
device = torch.device("cpu")
# CUDA
# device = torch.device("cuda:0")
# MPS
# device = torch.device("mps")
net.to(device)

# Define loss functions and optimizers
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the network
import time
start_time = time.time()  # Record start time

for epoch in range(10):  # Conduct 10 iterations of training
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

end_time = time.time()  # Record end time
training_time = end_time - start_time  # Calculate training time
print('Training took %.2f seconds.' % training_time)
print('Finished Training')

# Test the network
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
```
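Rather than editing the device line by hand on each machine, a small helper (not part of the original script, just a sketch) can pick whichever accelerator exists:

```python
import torch

# Convenience sketch: choose CUDA, then MPS, then CPU, so the same script runs
# unchanged on the Windows laptop, the M1 Macbook, and a plain CPU box.
def pick_device() -> torch.device:
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
print(device)
```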
The final results were as follows.
Windows platform
- 3070Ti (GPU): Training took 45.02 seconds.
- i9-12900H (CPU): Training took 75.65 seconds.
Mac platform
- M1 Max (GPU, MPS): Training took 50.79 seconds.
- M1 Max (CPU): Training took 109.61 seconds.
Overall, the GPU acceleration is obvious on both the Mac and the Windows machine. Comparing the GPU results, the M1 Max chip is roughly 10% slower than the 3070Ti.
However, these tests all use a small dataset for now; I'll try larger datasets once I've been learning for a while.
That wraps up this article on configuring a PyTorch GPU environment. For more on PyTorch configuration, please search my previous posts or browse the related articles, and I hope you'll keep supporting me!