Here is an experiment to explore the way PyTorch allocates video memory (GPU memory).
Video Memory to Main Memory
I used VSCode's Jupyter notebooks for the experiment, first importing only PyTorch with the following code:
import torch
Open Task Manager to check the main memory and video memory. The situation is as follows:
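The readings below come from Task Manager. As a rough in-notebook alternative (a sketch, assuming the psutil package is installed; it is not part of the original setup), a helper like the following prints the process's main-memory usage together with PyTorch's own GPU counters:
import torch
import os
import psutil

def show_mem(tag=''):
    # Resident main memory of this Python process, in GB
    rss = psutil.Process(os.getpid()).memory_info().rss / 2**30
    # GPU memory occupied by live tensors vs. memory held by PyTorch's caching allocator
    allocated = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f'{tag} main={rss:.2f}GB allocated={allocated:.2f}GB reserved={reserved:.2f}GB')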
Create a 1GB tensor in video memory and assign it to a. The code is as follows:
a = torch.zeros([256,1024,1024], device='cuda')  # 256*1024*1024 float32 values = 1GB
Check the main memory and video memory:
You can see that both the main memory and the video memory have grown. The video memory grows by more than 1GB; the extra space holds the context and configuration data that PyTorch needs to run on the GPU, which we ignore here.
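PyTorch's own counter reports only the tensor data, so the gap between it and the Task Manager figure is that runtime overhead. A quick check run right after creating a (a sketch, not part of the original article):
import torch
# Counts only memory occupied by tensors: reads exactly 1.00 GB here,
# while Task Manager / nvidia-smi also include the runtime overhead.
print(torch.cuda.memory_allocated() / 2**30, 'GB')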
Again, create a 1GB tensor in video memory and assign it to b. The code is as follows:
b = torch.zeros([256,1024,1024], device='cuda')
Check the main memory and video memory:
This time the main memory does not change and the video memory grows by another 1GB, which makes sense. Then we move b to main memory with the following code:
b = b.to('cpu')
Check the main memory and video memory:
We find that while the main memory grows by 1GB, the video memory shrinks by only about 0.1GB, as if the tensor had merely been copied from video memory to main memory. In fact, PyTorch does copy the tensor to main memory, but it also records that the tensor's old location in video memory is now free. We then execute the following code to create another 1GB tensor and assign it to c:
c = torch.zeros([256,1024,1024], device='cuda')
Check the main memory and video memory:
We find that the video memory grows by only about 0.1GB, which shows that PyTorch did record that the moved tensor's space in video memory was free; it just did not release that space right away, choosing instead to overwrite it the next time a new variable is created. Next, we repeat the above line of code:
c = torch.zeros([256,1024,1024], device='cuda')
The main memory and video memory are now as follows:
Obviously we have overwritten the tensor c, yet the video memory usage has grown. Why? When PyTorch executes this line, it first finds a usable region of video memory, creates the new 1GB tensor there, and only then assigns it to c. Because the original c still occupies its 1GB while the new tensor is being created, PyTorch has to claim another 1GB for the new tensor. After the assignment, the region held by the original c becomes free, but, as mentioned above, PyTorch does not release it immediately; it waits for the next overwrite, so the video memory usage does not go back down.
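One hedged way to check this temporary doubling is PyTorch's peak-memory counter; the snippet below is a sketch of such a check, not output from the original run:
import torch

torch.cuda.reset_peak_memory_stats()
c = torch.zeros([256,1024,1024], device='cuda')  # first 1GB tensor bound to c
c = torch.zeros([256,1024,1024], device='cuda')  # rebuilt while the old c is still alive
# The peak is about 1GB higher than the final usage: both tensors coexisted briefly.
print('peak :', torch.cuda.max_memory_allocated() / 2**30, 'GB')
print('final:', torch.cuda.memory_allocated() / 2**30, 'GB')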
We can verify the above conjecture by creating another 1GB tensor d with the following code:
d = torch.zeros([256,1024,1024], device='cuda')
The main memory and video memory are now as follows:
The video memory usage does not change, because PyTorch creates the new tensor in the region vacated by c in the previous step and then assigns it to d. Likewise, deleting a variable does not immediately free its video memory:
del d
The main memory and video memory situation:
The video memory does not change; the freed region again waits for the next overwrite.
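The difference between "freed for reuse" and "returned to the GPU driver" can be seen by comparing PyTorch's two counters; the following is a sketch along those lines, not part of the original notebook:
import torch

d = torch.zeros([256,1024,1024], device='cuda')
print('allocated:', torch.cuda.memory_allocated() / 2**30, 'reserved:', torch.cuda.memory_reserved() / 2**30)
del d
# allocated drops by 1GB, but reserved (and the Task Manager figure) stays the same:
# the caching allocator keeps the block for later tensors instead of returning it.
print('allocated:', torch.cuda.memory_allocated() / 2**30, 'reserved:', torch.cuda.memory_reserved() / 2**30)
# torch.cuda.empty_cache() would hand the cached (unused) blocks back to the driver.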
Main Memory to Video Memory
Continuing the experiment above, we create a 1GB tensor directly in main memory and assign it to e. The code is as follows:
e = torch.zeros([256,1024,1024], device='cpu')
The main memory and video memory are now as follows:
As expected, the main memory grows by 1GB. Then move e to video memory with the following code:
e = e.to('cuda')
The main memory and video memory are now as follows:
The main memory shrinks by 1GB, while the video memory does not change, because the region left behind by the deleted tensor d has not yet been overwritten and is reused for e. This also means that main memory, unlike video memory, is released immediately.
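Continuing from the state above (where the block freed by del d is still cached), the reuse can also be read from the counters; this is a sketch, not output from the original run:
import torch

e = torch.zeros([256,1024,1024], device='cpu')
reserved_before = torch.cuda.memory_reserved()
e = e.to('cuda')
# allocated grows by 1GB, but reserved does not: the cached block freed by `del d`
# is reused for e instead of new memory being requested from the driver.
print('allocated:', torch.cuda.memory_allocated() / 2**30, 'GB')
print('reserved unchanged:', torch.cuda.memory_reserved() == reserved_before)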
Summary
Through the above experiments we learn that PyTorch does not immediately free the video memory of dead variables; it reuses that space later by overwriting it. In addition, if you want to rebuild a large tensor that already lives in video memory, it is better to move it to main memory first, or simply delete it before creating the new value; otherwise the operation briefly needs twice the memory, with the risk of running out of video memory.
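As a hedged illustration of that recommendation (the tensor name t is just an example), the sketch below contrasts the two orders of operations:
import torch

t = torch.zeros([256,1024,1024], device='cuda')

# Risky: the new tensor is built while the old one still occupies its 1GB,
# so peak video-memory usage briefly reaches about 2GB.
t = torch.zeros([256,1024,1024], device='cuda')

# Safer: free (or move away) the old tensor first, then allocate the new one.
del t                      # or: t = t.to('cpu')
t = torch.zeros([256,1024,1024], device='cuda')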
The experimental code is summarized below:
#%%
import torch
#%%
a = torch.zeros([256,1024,1024], device='cuda')
#%%
b = torch.zeros([256,1024,1024], device='cuda')
#%%
b = b.to('cpu')
#%%
c = torch.zeros([256,1024,1024], device='cuda')
#%%
c = torch.zeros([256,1024,1024], device='cuda')
#%%
d = torch.zeros([256,1024,1024], device='cuda')
#%%
del d
#%%
e = torch.zeros([256,1024,1024], device='cpu')
#%%
e = e.to('cuda')
This concludes the exploration of how PyTorch dynamically allocates video memory.