00 Requirements
You need to configure a new server that you can only reach with ssh boss@172.16.1.100. The goal is to create your own Docker container under the /data1 disk and then use the server by connecting to that container over ssh.
(Both boss and 172.16.1.100 are fictitious. Replace them with the account you can use and the address of the server you want to configure.)
System: Ubuntu 20.04, with an NVIDIA graphics card.
01 Install docker
(Docker was already installed on the server I was configuring, so I did not perform this step myself. The following instructions were generated by an LLM.)
# First, make sure there are no older versions of Docker on the system
sudo apt-get remove docker docker-engine containerd runc
# Then, update the package list and install the necessary packages
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
# Add Docker's official GPG key
curl -fsSL /linux/ubuntu/gpg | sudo apt-key add -
# Set up the stable Docker repository
sudo add-apt-repository "deb [arch=amd64] /linux/ubuntu $(lsb_release -cs) stable"
# Update the package list to include packages from the Docker repository
sudo apt-get update
# Install Docker CE and the Docker CLI
sudo apt-get install docker-ce docker-ce-cli
# Check the installed Docker version
docker --version
# Verify that Docker is installed successfully. This command downloads and runs a test image.
sudo docker run hello-world
# Finally, configure Docker to start on boot
sudo systemctl enable docker
To run Docker commands without sudo, add the current user to the docker group:
sudo usermod -aG docker $USER
Log out and back in, or restart the system, for the group change to take effect.
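To confirm that the group change has reached your current login session, a small check along these lines may help. This is only a sketch: `session_has_group` is a helper name I made up, not a Docker command, and it relies on `id -nG` printing the groups of the current process.

```shell
#!/usr/bin/env bash
# session_has_group GROUP: succeed if the current login session belongs to GROUP.
# (session_has_group is a hypothetical helper name, not part of Docker.)
session_has_group() {
    # id -nG without a user name lists the groups of the current process
    id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

if session_has_group docker; then
    echo "docker group is active: docker should work without sudo"
else
    echo "docker group is not active yet: log out and back in, then retry"
fi
```

If the second message still appears after logging in again, check that the usermod command above was run for the right user.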
02 Preparation
Create a new directory to hold the Docker files and change its ownership (<user_name> is my name; when running the commands, replace it with the name you want your container to have):
sudo mkdir /data1/<user_name>
sudo chown boss /data1/<user_name>/ -R
sudo chgrp boss /data1/<user_name>/ -R
mkdir /data1/<user_name>/docker
mkdir /data1/<user_name>/project
Configure authorized_keys for ssh:
cd /data1/<user_name>/docker/
vim authorized_keys
# Paste in the contents of id_rsa.pub from the .ssh directory of your user on the local computer
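A pasted key that got wrapped or truncated in the editor is an easy mistake to make here. The sketch below checks that every key line in the file looks like a public key. `looks_like_pubkeys` is a hypothetical helper, and the pattern is a simplification that only covers the common RSA, Ed25519 and ECDSA key types written without any leading options.

```shell
#!/usr/bin/env bash
# looks_like_pubkeys FILE: succeed if FILE is non-empty and every line that is
# not blank or a comment has the form "<key-type> <base64-data> [comment]".
# (Hypothetical helper; the pattern is a simplification, see the text above.)
looks_like_pubkeys() {
    [ -s "$1" ] || return 1
    # First grep drops blank and comment lines; second grep looks for any
    # remaining line that does NOT match the expected shape.
    ! grep -Ev '^[[:space:]]*(#|$)' "$1" |
        grep -Evq '^(ssh-(rsa|ed25519)|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/=]+( .*)?$'
}

# Typical use after saving the file in the current directory:
#   looks_like_pubkeys authorized_keys && echo "looks fine" || echo "check the file"
```

Each key must stay on a single line, so a failure here usually means the editor inserted a line break in the middle of the key.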
03 Configure the Dockerfile and Docker Compose
Create a new Dockerfile:
cd /data1/<user_name>/docker/
vim Dockerfile
The specific content of Dockerfile:
# Check which images are available with docker images
FROM nvidia/cuda:11.6.0-devel-ubuntu20.04

# Set the time zone
ENV TZ=Asia/Shanghai
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

# Install basic software
RUN apt-get update && \
    apt-get install -y \
    openssh-server \
    python3 \
    python3-pip \
    vim \
    git \
    wget \
    curl \
    unzip \
    sudo \
    net-tools \
    iputils-ping \
    build-essential \
    cmake \
    htop \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install other software
RUN apt-get update && \
    apt-get install -y \
    tmux \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Create a user (keep the same UID as on the host to avoid permission issues)
RUN useradd -m -u 1001 -s /bin/bash <user_name>

# sudo without password
RUN echo "<user_name> ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

USER <user_name>
WORKDIR /home/<user_name>

# Create the .ssh directory and set permissions
RUN mkdir -p /home/<user_name>/.ssh && \
    chown -R <user_name>:<user_name> /home/<user_name>/.ssh && \
    chmod 700 /home/<user_name>/.ssh

# Install Conda
RUN wget /miniconda/Miniconda3-latest-Linux-x86_64.sh -O && \
    bash -b -p /home/<user_name>/miniconda && \
    rm
RUN /home/<user_name>/miniconda/bin/conda init bash

CMD ["/bin/bash"]
Before writing the Docker Compose file, confirm which port is available:
sudo netstat -tuln
# Pick a port that is not listed, for example 8012
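Scanning the netstat listing by eye is error-prone on a busy server. As a rough sketch, the check can be scripted. `port_listed` is a name of my own choosing, and it assumes the usual `netstat -tuln` layout in which the local address is the fourth column.

```shell
#!/usr/bin/env bash
# port_listed PORT: read `netstat -tuln` output on stdin and succeed if PORT
# appears as a local port. (Hypothetical helper; assumes the local address is
# the fourth whitespace-separated column, as in net-tools netstat.)
port_listed() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Typical use on the server:
#   sudo netstat -tuln | port_listed 8012 && echo "8012 is taken" || echo "8012 looks free"
```

The trailing `$` in the pattern keeps a request for port 22 from also matching 8022 or 2222.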
Then, create a new Docker Compose file:
cd /data1/<user_name>/docker/
vim
Specific content:
version: '3.8'
services:
  <user_name>:
    container_name: <user_name>  # Set the container name
    build: .  # Build the image using the Dockerfile in the current directory
    image: <user_name>  # Image name
    restart: unless-stopped
    runtime: nvidia  # Enable GPU support
    ports:
      - "8012:22"  # Select an unoccupied port (please confirm that 8012 is available)
    volumes:
      - /data1/<user_name>/project:/home/<user_name>/project  # Mount the project directory
      - /data1/<user_name>/docker/authorized_keys:/home/<user_name>/.ssh/authorized_keys  # SSH
    environment:
      - NVIDIA_DRIVER_CAPABILITIES=all
    command: /bin/bash -c "sudo service ssh start && sleep infinity"
A Docker Compose file compatible with older versions of Docker (I am not familiar with the older versions; this one was written by someone more experienced):
services:
  container_name: <user_name>  # Set the container name
  build: .  # Build the image using the Dockerfile in the current directory
  restart: unless-stopped
  ports:
    - "8012:22"  # Select an unoccupied port (please confirm that 8012 is available)
  volumes:
    - /data1/<user_name>/project:/home/<user_name>/project  # Mount the project directory
    - /data1/<user_name>/docker/authorized_keys:/home/<user_name>/.ssh/authorized_keys  # SSH
  environment:
    - NVIDIA_DRIVER_CAPABILITIES=all
  command: /bin/bash -c "sudo service ssh start && sleep infinity"
04 Start docker
Then, start docker:
cd /data1/<user_name>/docker/
docker compose build  # Build the image from the Dockerfile
docker compose up -d  # Start the container

# Older Docker versions
docker-compose build  # Build the image from the Dockerfile
docker-compose up -d  # Start the container

# Enter the container and take a look
docker exec -it <user_name> bash
# Run ls and you will see two directories: miniconda and project.
# Any files that need to be mapped to disk and must not be lost should be placed in project.

# View directory permissions
ls -al
# If you find permission problems, exit the container and change the ownership of the directory on the host
sudo chown boss /data1/<user_name>/ -R
sudo chgrp boss /data1/<user_name>/ -R

# If the Dockerfile turns out to be wrong, or you want to add something, build and start again
docker compose build  # Rebuild the image
docker compose up -d  # The output will show "Recreating <user_name>"

# To change the ownership of ~/.ssh inside the container, enter the container first
docker exec -it <user_name> bash
sudo chown <user_name> ~/.ssh -R
sudo chgrp <user_name> ~/.ssh -R

# Stop and start the container temporarily
docker compose stop
docker compose start

# Stop and remove the container
docker compose down
05 Test whether you can connect to the container over ssh (you may need to debug together with step 04)
# Connect from your local computer
ssh -p 8012 <user_name>@172.16.1.100
If the ssh connection fails (for example, it asks for a password instead of using your key), the cause is most likely the ownership or permissions of .ssh or authorized_keys, inside or outside the container. Outside the container the files must belong to boss, and inside the container they must belong to <user_name>.
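To see the owner and mode of both paths at a glance, something like the following sketch can be run inside the container or on the host. `check_ssh_perms` is a hypothetical helper, it uses GNU `stat -c` as found on Ubuntu, and the rule it applies (correct owner, not writable by group or others) is my assumption about what sshd expects under its default strict checking.

```shell
#!/usr/bin/env bash
# check_ssh_perms DIR USER: print owner and mode of DIR and DIR/authorized_keys.
# Succeeds only if both belong to USER and neither is writable by group/others.
# (Hypothetical helper; the rule is an assumption, see the text above.)
check_ssh_perms() {
    local dir="$1" user="$2" path owner mode status=0
    for path in "$dir" "$dir/authorized_keys"; do
        owner=$(stat -c '%U' "$path") || return 2
        mode=$(stat -c '%a' "$path") || return 2
        echo "$path owner=$owner mode=$mode"
        [ "$owner" = "$user" ] || status=1
        # Octal 022 covers the group-write and other-write bits
        [ $(( 0$mode & 022 )) -eq 0 ] || status=1
    done
    return "$status"
}

# Inside the container:  check_ssh_perms ~/.ssh "$(id -un)"
```

On the host, the same function can be pointed at the directory that holds authorized_keys, with boss as the expected owner.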
If the following appears while connecting:
ECDSA host key for [172.16.1.100]:8012 has changed and you have requested strict checking.
Host key verification failed.
you need to delete the entry for 172.16.1.100 from known_hosts. The error message itself shows the command to run.
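The command printed in the error message is normally an `ssh-keygen -R` call. A minimal wrapper is sketched below. `forget_host` is a name I made up, and the bracketed `[host]:port` form is how entries for non-default ports are written, matching the error text above.

```shell
#!/usr/bin/env bash
# forget_host HOST PORT [KNOWN_HOSTS]: remove the stored host key for
# [HOST]:PORT from a known_hosts file (default: ~/.ssh/known_hosts).
# (Hypothetical helper wrapping ssh-keygen -R.)
forget_host() {
    local host="$1" port="$2" file="${3:-$HOME/.ssh/known_hosts}"
    ssh-keygen -R "[$host]:$port" -f "$file"
}

# On your local computer:
#   forget_host 172.16.1.100 8012
```

This situation is expected after the container is rebuilt, because the new container generates fresh host keys. The next ssh connection will ask you to accept the new key.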