Understanding Docker's caching mechanisms in one article

Docker's caching mechanism

Image Layer Cache

Docker images are built in layers. Instructions that change the filesystem (RUN, COPY, ADD) each produce a new image layer, while the other instructions only add metadata to the image.
Docker caches the result of each build step. When building an image, if it detects that a step has already been built on top of the same parent layer, it reuses the cached layer instead of rebuilding it.

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nginx
COPY app/ /var/www/html/
CMD ["nginx", "-g", "daemon off;"]

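In the Dockerfile above, the layers produced by the two RUN instructions are cached after the first build. When the image is built again, Docker reuses them without re-executing the commands as long as the instructions and the layers before them are unchanged, which greatly speeds up the build. For COPY, Docker also compares the checksums of the copied files, so a change under app/ invalidates that layer and every layer after it. Note that a cached RUN apt-get update can leave a stale package index, which is why the two commands are often combined into a single RUN instruction.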
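The reuse rule can be illustrated with a small model. This is a simplified sketch, not Docker's actual implementation: it assumes a cache key is the hash of the parent key plus the instruction text, with a digest of the copied files mixed in for COPY and ADD. The names layer_keys and cache_hits and the digest strings are made up for illustration.

```python
import hashlib


def layer_keys(instructions, file_digests):
    """Return one cache key per instruction.

    Simplified model: a key is the hash of the parent key plus the
    instruction text; COPY/ADD also mix in a digest of the copied files.
    """
    keys = []
    parent = ""
    for ins in instructions:
        material = parent + "\n" + ins
        if ins.split()[0] in ("COPY", "ADD"):
            # File contents take part in the key, so edits invalidate it.
            material += "\n" + file_digests.get(ins, "")
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def cache_hits(old_keys, new_keys):
    """A layer is reused only if its key matches the previous build's key."""
    return [old == new for old, new in zip(old_keys, new_keys)]


DOCKERFILE = [
    "FROM ubuntu:18.04",
    "RUN apt-get update",
    "RUN apt-get install -y nginx",
    "COPY app/ /var/www/html/",
    'CMD ["nginx", "-g", "daemon off;"]',
]

# Two builds that differ only in the contents of app/ (hypothetical digests).
first = layer_keys(DOCKERFILE, {"COPY app/ /var/www/html/": "digest-v1"})
second = layer_keys(DOCKERFILE, {"COPY app/ /var/www/html/": "digest-v2"})
hits = cache_hits(first, second)

for ins, hit in zip(DOCKERFILE, hits):
    print("CACHED " if hit else "REBUILT", ins)
```

In this model the FROM and both RUN steps are reported as cached, while COPY and everything after it are rebuilt, because each key depends on the key of the layer beneath it.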

Container Layer Cache

Each container has its own writable container layer, which sits on top of the read-only image layers and stores the data the container writes at runtime.
This layer persists for the lifetime of the container. When a stopped container is started again, Docker reuses the existing container layer instead of creating a new one; the layer is only discarded when the container is removed.

# Create and start the container
docker run -d --name my-app my-app:v1

# Stop and restart the container
docker stop my-app
docker start my-app

In the example above, stopping and restarting the container reuses the existing container layer, so data written before the stop is still there and Docker avoids the overhead of creating a new container.

Build Cache

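Docker also lets a build reuse cache from outside the local build history, which can further speed up image builds.
The build cache stores the intermediate layers produced by previous builds. If the next build runs the same instructions on the same parent layers, those cached layers are reused.
The --cache-from parameter names an image whose layers may be used as an additional cache source, which is useful on a fresh machine such as a CI runner. With BuildKit, that image generally needs to have been built with inline cache metadata (for example --build-arg BUILDKIT_INLINE_CACHE=1).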

# Use my-app:v1 as a cache source when building my-app:v2
docker build --cache-from my-app:v1 -t my-app:v2 .

In the example above, --cache-from my-app:v1 tells Docker that it may reuse the layers of my-app:v1 wherever the build steps match, which can further speed up the build.
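A CI script might assemble the command from the example above as follows. This is a minimal sketch: build_command is a hypothetical helper, not part of Docker, and it only constructs the argument list without running it. The BUILDKIT_INLINE_CACHE build argument is included on the assumption that BuildKit is the builder and that the resulting image will later serve as a cache source itself.

```python
import shlex


def build_command(tag, cache_from=None, inline_cache=False, context="."):
    """Assemble a `docker build` argument list (hypothetical helper)."""
    cmd = ["docker", "build"]
    if cache_from:
        # Allow layers of an existing image to be used as a cache source.
        cmd += ["--cache-from", cache_from]
    if inline_cache:
        # Assumption: BuildKit is in use; this embeds cache metadata in the
        # new image so it can later be passed to --cache-from.
        cmd += ["--build-arg", "BUILDKIT_INLINE_CACHE=1"]
    cmd += ["-t", tag, context]
    return cmd


command = build_command("my-app:v2", cache_from="my-app:v1")
print(shlex.join(command))
```

Returning a list instead of a single string keeps the arguments safe to hand to subprocess.run without shell quoting issues.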

Registry Cache

A Docker registry can also act as a cache: configured as a pull-through cache (registry mirror), it stores the image layers it has fetched from the upstream registry and serves them directly on later pulls, which improves pull speed.
This is very useful for image distribution in distributed environments and relieves pressure on the upstream registry server.

To sum up, Docker's caching covers the image layers, the container layer, the build process, and the registry. Making full use of these caches can greatly improve build and deployment speed.

Several ways to write a Dockerfile

Instruction order optimization

The order of instructions in a Dockerfile matters because Docker builds the image layers in that order, and a cache miss at one instruction forces every later instruction to be rebuilt.
Stable, rarely changing instructions should therefore come first and frequently changing instructions last, so that as many layers as possible come from the cache.

For example:

# Bad
RUN apt-get update
COPY . /app
RUN pip install -r 

# Better
FROM python:3.9
WORKDIR /app
COPY  .
RUN pip install -r 
COPY . .

In the second example, we first copy only the dependency list and install the packages. This part changes rarely, so it makes good use of the cache. The final COPY . . brings in the source code, which changes most often; placing it last means a source change only rebuilds that one layer.
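The effect of the two orderings can be compared with a rough model. It assumes the simplified rule that once one instruction misses the cache, every later instruction is rebuilt too. The function rebuilt is hypothetical, and `<deps-file>` is a placeholder standing in for the dependency file.

```python
def rebuilt(instructions, changed):
    """Return the instructions that must be re-executed.

    Simplified rule: the first invalidated instruction and everything
    after it are rebuilt; everything before it comes from the cache.
    """
    for index, ins in enumerate(instructions):
        if ins in changed:
            return instructions[index:]
    return []


# `<deps-file>` is a placeholder, not a real file name.
bad = [
    "RUN apt-get update",
    "COPY . /app",
    "RUN pip install -r <deps-file>",
]
better = [
    "FROM python:3.9",
    "WORKDIR /app",
    "COPY <deps-file> .",
    "RUN pip install -r <deps-file>",
    "COPY . .",
]

# Only the application source changed, so only the instruction that
# copies the whole build context is invalidated.
bad_rebuilt = rebuilt(bad, {"COPY . /app"})
better_rebuilt = rebuilt(better, {"COPY . ."})

print("bad:   ", bad_rebuilt)
print("better:", better_rebuilt)
```

With the first ordering the dependency installation reruns on every source change. With the second, only the last COPY is rebuilt.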

Layered builds

Take advantage of Docker's layered structure by splitting the Dockerfile into multiple stages, each of which builds one part of the image.
This lets the stable base layers come from the cache, while a change only requires rebuilding the layers it affects.

For example:

# Base layer
FROM ubuntu:18.04 AS base-image
RUN apt-get update && apt-get install -y nodejs npm

# Application layer
FROM base-image
COPY . /app
WORKDIR /app
RUN npm install
CMD ["node", ""]

In this example, the Dockerfile is divided into two stages: the base layer and the application layer. The first stage is named base-image so that the second FROM can build on it. The base layer installs the runtime environment; it is relatively stable and makes good use of the cache. The application layer copies the source code and installs the dependencies, which change more often.

Multi-stage builds

Docker 17.05 introduced multi-stage builds, which can further reduce image size and streamline the build process.
By using multiple FROM instructions in one Dockerfile, the build is split into independent stages, and a later stage can copy selected files from an earlier one.

For example:

# Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 yields a statically linked binary that runs on Alpine (musl libc)
RUN CGO_ENABLED=0 go build -o myapp

# Runtime stage
FROM alpine:latest
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]

In this example, we first compile the Go code in the builder stage, and then copy only the compiled binary into the final image in the runtime stage. This greatly reduces the size of the final image while still making good use of the cache.

About the builder stage in Docker

In a multi-stage build, the source code is copied into the builder stage, but only the compiled binary ends up in the runtime image. This is because the COPY --from=builder instruction copies only the files it names from the specified stage.

Let's explain in more detail:

In the builder stage, we copy the source code into the working directory and run the compile command go build. At this point the stage contains both the source code and the compiled binary.
In the runtime stage, we use the COPY --from=builder instruction to copy the binary at /app/myapp from the builder stage into the current image.
The benefits of doing this are:
1. Only the binary that is actually needed is copied into the runtime image, greatly reducing the image size.
2. The source code and build tooling of the builder stage are not carried into the final image, which avoids leaking unnecessary information.

In a multi-stage build, the builder stage's source code is kept in the local build cache, but it is not included in the final image. A few points explain this:

Docker cache

Docker uses its cache when building images to speed up subsequent builds.
This means that the intermediate layers generated by the builder stage are retained in the local Docker environment so that later builds can reuse them.

Final image

With the COPY --from=builder instruction, only the binary produced by the builder stage is copied into the final runtime image.
Source code and other files generated during compilation are not included in the final image.
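The selective copy can be pictured with a toy model in which each stage's filesystem is a dictionary. This is only an illustration under that assumption: copy_from is a hypothetical function, and main.go is a made-up source file name.

```python
def copy_from(source_stage, src, dst, target_stage):
    """Model COPY --from: copy exactly one path between stage filesystems."""
    target_stage[dst] = source_stage[src]


# Builder stage after `COPY . .` and `go build -o myapp`.
# main.go is a hypothetical source file used for illustration.
builder = {"/app/main.go": "source code", "/app/myapp": "compiled binary"}

# Runtime stage: files from the alpine base image are omitted for brevity.
runtime = {}
copy_from(builder, "/app/myapp", "/myapp", runtime)

print(sorted(runtime))  # ['/myapp']
```

The builder dictionary still holds the source file, which mirrors how the builder layers stay in the local cache, while the runtime dictionary contains only the binary.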

Separation of build and deployment

An important purpose of multi-stage builds is to separate the build environment from the deployment environment.
The source code and build tools stay in the builder stage and are not carried over into the final deployment image.
