Symptoms
During day-to-day server operation and maintenance, we often run into this situation: disk usage on a server reaches 100% and the business running on it starts to misbehave.
Locating the problem
1. Log in to the server and check disk usage with df -Hl
[root@k8s-master1 ~]# df -Hl
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 8.4G     0  8.4G   0% /dev
tmpfs                    8.5G     0  8.5G   0% /dev/shm
tmpfs                    8.5G  774M  7.7G  10% /run
tmpfs                    8.5G     0  8.5G   0% /sys/fs/cgroup
/dev/mapper/centos-root  136G   68G   68G  51% /
/dev/sda1                1.1G  238M  827M  23% /boot
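Note that -H reports sizes in powers of 1000 (use -h for powers of 1024) and -l restricts the output to local filesystems. If a particular directory is already suspected, df can also be pointed at a path to see which filesystem it lives on; the /var path below is only an example:

df -h /var       # show only the filesystem that /var is mounted on
df -h /var/log   # same idea, one level deeper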
2. Find the directories or files that take up the most space
The brute-force method: in the root directory, use du -hs * to list how much space each top-level directory occupies.
[root@k8s-master1 /]# du -hs *
0       bin
194M    boot
1012K   core.580
0       dev
37M     etc
21M     home
7.7G    kuboard-data
0       lib
0       lib64
0       media
0       mnt
135M    opt
du: cannot access 'proc/21212/task/21212/fd/3': No such file or directory
du: cannot access 'proc/21212/task/21212/fdinfo/3': No such file or directory
du: cannot access 'proc/21212/fd/3': No such file or directory
du: cannot access 'proc/21212/fdinfo/3': No such file or directory
0       proc
1.6G    root
738M    run
0       sbin
0       src
0       srv
0       sys
6.2M    tmp
3.2G    usr
51G     var
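The "cannot access" messages are harmless: they come from short-lived entries under /proc disappearing while du is scanning them. If desired, they can be hidden by redirecting stderr, for example:

du -hs /* 2>/dev/null   # same listing, without the /proc noise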
A more efficient method is to limit the search depth with du's -d (or --max-depth) option:
du -h -d 2 | grep '[GT]' | sort -nr
du -h --max-depth=2 | grep '[GT]' | sort -nr    # finds directories whose size is measured in G or T and sorts them
[root@k8s-master1 /]# du -h -d 1 /
194M    /boot
0       /dev
du: cannot access '/proc/24731/task/24731/fd/3': No such file or directory
du: cannot access '/proc/24731/task/24731/fdinfo/3': No such file or directory
du: cannot access '/proc/24731/fd/4': No such file or directory
du: cannot access '/proc/24731/fdinfo/4': No such file or directory
0       /proc
738M    /run
0       /sys
37M     /etc
1.6G    /root
51G     /var
6.2M    /tmp
3.2G    /usr
21M     /home
0       /media
0       /mnt
135M    /opt
0       /srv
0       /src
7.7G    /kuboard-data
64G     /
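Once the biggest top-level directory is known (here /var), the same command can be repeated one level down to keep narrowing the search. A minimal sketch, assuming GNU coreutils (sort -h sorts human-readable sizes such as 51G correctly, so the grep filter becomes optional); the /var/log path is just an example:

du -h --max-depth=1 /var 2>/dev/null | sort -hr | head -n 10       # largest subdirectories of /var first
du -h --max-depth=1 /var/log 2>/dev/null | sort -hr | head -n 10   # drill down one more level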
The most efficient method: searching with find is faster than walking the whole tree with du.
find / -type f -size +1G -exec du -h {} \;   # find files larger than 1G
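A variation of the same idea that sorts the matches so the biggest files come first; the 2>/dev/null and the +1G threshold are illustrative choices:

find / -type f -size +1G -exec du -h {} + 2>/dev/null | sort -hr | head -n 20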
A follow-up problem
After deleting some backup files and logs, the reported free space was still insufficient: the space occupied by the deleted logs had not been released.
Why the disk space was not freed:
On Linux and Unix systems, deleting a file with rm or a file manager only unlinks it from the filesystem's directory structure. If the file is still open (a process is using it), that process can continue to read and write it, and the disk space it occupies is not reclaimed until the last open file descriptor is closed. In this case the file I had deleted was nginx's access log, which nginx still had open at the time of deletion.
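A minimal demonstration of this behaviour, assuming /tmp lives on the root filesystem; the file name and the sleep are only stand-ins for a real log and the process holding it open:

dd if=/dev/zero of=/tmp/bigfile bs=1M count=1024   # create a 1 GiB file
sleep 300 < /tmp/bigfile &                         # a background process keeps a descriptor open
rm /tmp/bigfile                                    # unlink the file from the directory tree
df -h /                                            # the used space has not gone down
kill %1                                            # stop the process holding the descriptor
df -h /                                            # now the space is released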
How to deal with it
[root@local ~]# lsof | grep deleted
nginx     4399    root   38w   REG  253,0  19304448  10835682  /var/nginx/logs/ (deleted)
nginx     4399    root   39w   REG  253,0   3502080  10835684  /var/nginx/logs/ (deleted)
nginx     4401  nobody   38w   REG  253,0  19304448  10835682  /var/nginx/logs/ (deleted)
nginx     4401  nobody   39w   REG  253,0   3502080  10835684  /var/nginx/logs/ (deleted)
The output shows that the deleted log files under /var/nginx/logs/ are still held open by the nginx processes, so their space has not been released.
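As an aside, lsof can list only the unlinked-but-still-open files directly, which avoids grepping the full output: +L1 selects open files whose link count is below 1.

lsof +L1                 # all deleted-but-still-open files
lsof +L1 | grep nginx    # restrict to the nginx processes above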
So how can the space actually be released?
- Method 1: kill the corresponding process, or stop the application that is using the file, so that the operating system can reclaim the disk space.
- Method 2: in the future, when cleaning up a large log file that is still being read and written, truncate it in place (for example with echo "" > file) rather than deleting it. This does not affect the running service, keeps the file size under control, and frees the disk space immediately; see the sketch after this list.
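A minimal sketch of the in-place truncation approach; the access.log file name is hypothetical, since the real names are not visible in the lsof output above:

: > /var/nginx/logs/access.log              # truncate to zero bytes; nginx keeps its descriptor
cat /dev/null > /var/nginx/logs/access.log  # equivalent
truncate -s 0 /var/nginx/logs/access.log    # equivalent, using coreutils truncate

Because the inode stays the same, the process writing to the log does not need to be restarted.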
Summary
The above is based on my personal experience; I hope it gives you a useful reference.