SoFunction
Updated on 2025-03-02

Guide to Analyzing and Cleaning Up Disk Space Usage on Linux Servers (Solution)

To keep the test environment servers our team is responsible for from filling their disks and triggering frequent alarms during major holidays, we require disk usage to be checked before each major holiday. If usage is found to be too high, someone must step in and clean up.

1. Inspection requirements

Check the usage of each partition. If any of the following conditions is met, someone must review the partition and decide how to handle it:

(1) Disk usage > 90%

(2) Disk usage > 80% and remaining space < 30G

(3) Disk usage > 70% and remaining space < 50G
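The three conditions can be sketched as a single predicate. This is a minimal illustration, not part of the original script; the function name needs_attention is hypothetical, and it assumes the usage rate and the remaining space (in G) are already available as integers.

```shell
#!/bin/bash
# needs_attention USE_RATE AVAIL_G   (hypothetical helper name)
# Returns 0 (true) when a partition matches any of the three conditions.
# USE_RATE is an integer percentage, AVAIL_G the free space in whole G.
needs_attention() {
    local use_rate=$1 avail=$2
    (( use_rate > 90 )) ||
    (( use_rate > 80 && avail < 30 )) ||
    (( use_rate > 70 && avail < 50 ))
}

# Example: 76% used with 47G free matches condition (3).
if needs_attention 76 47; then
    echo "needs attention"
fi
```

Keeping the rule in one function makes it easy to adjust the thresholds later without touching the parsing code.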

2. Solution

Use a shell script to collect disk usage and evaluate it against the conditions above. If any partition is abnormal, the script prints a warning.

The script code is as follows:

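#!/bin/bash
# Report partitions whose usage exceeds the inspection thresholds.
# Pass "detail" as the first argument to print the raw df output instead.
RED='\033[0;31m'   # ANSI escape sequence: red text
NC='\033[0m'       # ANSI escape sequence: reset color
if [[ $1 == "detail" ]]
then
    df -BG
else
    IS_NORMAL=1
    # -r keeps backslashes in the input literal.
    while read -r line
    do
        # Skip the df header line.
        if [[ ${line} == Filesystem* ]]; then
            continue
        fi
        filesystem=$(echo "${line}" | awk '{print $1}')
        use_rate=$(echo "${line}" | awk '{print $5}' | sed 's/%//g')
        avail_space=$(echo "${line}" | awk '{print $4}' | sed 's/G//g')
        mounted_on=$(echo "${line}" | awk '{print $6}')
        # Conditions from section 1: >90%, or >80% with <30G free, or >70% with <50G free.
        if [[ ${use_rate} -gt 90 ]] || [[ ${use_rate} -gt 80 && ${avail_space} -lt 30 ]] || [[ ${use_rate} -gt 70 && ${avail_space} -lt 50 ]]; then
            echo -e "${RED}WARN: Filesystem ${filesystem} mounted on ${mounted_on} has problem: use rate is ${use_rate}%, available space is ${avail_space}G.${NC}"
            IS_NORMAL=0
        fi
    # Process substitution keeps the loop in the current shell, so IS_NORMAL survives it.
    done < <(df -BG)
    if [[ ${IS_NORMAL} -eq 1 ]]; then
        echo "INFO: Disk space usage is normal."
    fi
fi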

Key code description:

df -BG: df reports the disk space usage of file systems. -B is short for --block-size=SIZE, so -BG prints every size in units of 1G (1024^3 bytes), rounded up to a whole number. This keeps the Available column in a single unit, so the script can compare it as an integer.

Recommended save path for the script: /data/sh/general/disk_usage_check.sh

Commands to create the script file, make it executable, and open it for editing: mkdir -p /data/sh/general/;touch /data/sh/general/disk_usage_check.sh;chmod +x /data/sh/general/disk_usage_check.sh;vim /data/sh/general/disk_usage_check.sh
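As an alternative to calling awk and sed once per field, one line of df -BG output can be split with the shell's own read builtin. The sketch below is only an illustration; the function name parse_df_line is hypothetical, and it assumes the file system name itself contains no spaces.

```shell
#!/bin/bash
# parse_df_line LINE   (hypothetical helper name)
# Splits one data line of `df -BG` output and prints "use_rate avail_g mount".
parse_df_line() {
    local filesystem blocks used avail use_rate mounted_on
    # read splits on whitespace; the last variable receives the rest of the
    # line, so a mount point containing spaces stays intact.
    read -r filesystem blocks used avail use_rate mounted_on <<< "$1"
    # Strip the trailing "%" and "G" with parameter expansion instead of sed.
    echo "${use_rate%\%} ${avail%G} ${mounted_on}"
}

parse_df_line "/dev/vdb 197G 142G 47G 76% /data"   # prints: 76 47 /data
```

This avoids spawning several processes per line, which matters little for a handful of partitions but keeps the parsing in one place.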

3. How to use the script

1) Check whether any partition exceeds the inspection thresholds

Run the script: /data/sh/general/disk_usage_check.sh

2) If necessary, view the detailed usage of each partition

Run the script: /data/sh/general/disk_usage_check.sh detail

Example results. There are two cases:

(1) The check passes

[root@localhost ~]# /data/sh/general/disk_usage_check.sh 
INFO: Disk space usage is normal.

(2) The check finds a problem, which requires someone to review and handle it.

[root@novalocal general]# /data/sh/general/disk_usage_check.sh 
WARN: Filesystem /dev/vdb mounted on /data has problem: use rate is 76%, available space is 47G.
[root@novalocal general]# /data/sh/general/disk_usage_check.sh detail
Filesystem              1G-blocks  Used Available Use% Mounted on
/dev/mapper/centos-root       49G   12G       38G  23% /
devtmpfs                       8G    0G        8G   0% /dev
tmpfs                          8G    1G        8G   1% /dev/shm
tmpfs                          8G    1G        7G  11% /run
tmpfs                          8G    0G        8G   0% /sys/fs/cgroup
/dev/vdb                     197G  142G       47G  76% /data
/dev/vda1                      1G    1G        1G  20% /boot
tmpfs                          2G    0G        2G   0% /run/user/0

See the next section for how to handle it.

4. Locating what occupies the space, and solutions

1. View the size of each file or folder in the current directory, sorted in descending order

[root@f2 data]# du -sh * | sort -hr
27G tomcat
5.1G did-generator
4.1G register
2.5G turbine-web
1.4G rbmq-productor
1.1G consul
600M backup
544M test-backup
527M deploy

Command analysis:

du: short for "disk usage". This command estimates the space that a file or directory occupies on disk.
-s: tells du to display only the total size of each argument, without listing every subdirectory or file.
-h: makes du display sizes in an easy-to-read format (for example, automatically choosing K, M, or G units). The same flag on sort makes it compare these human-readable sizes correctly.
-r: makes sort output the results in descending order (the default is ascending).
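The same idea can be wrapped in a small helper that also covers hidden entries, which the * glob skips. This is a sketch under assumptions: the function name top_usage is hypothetical, and it relies on GNU du (--max-depth) and GNU sort (-h).

```shell
#!/bin/bash
# top_usage DIR [N]   (hypothetical helper name)
# Lists the N largest entries directly under DIR (default 10), largest first.
# The first line is the total for DIR itself.
top_usage() {
    local dir=${1:-.} n=${2:-10}
    # -x stays on one file system; -a includes files as well as directories;
    # --max-depth=1 limits the report to DIR and its direct children.
    du -xah --max-depth=1 -- "$dir" 2>/dev/null | sort -hr | head -n "$n"
}

# Example: top_usage /data 10
```

Running it repeatedly on the largest child directory is a simple way to walk down to the files that actually occupy the space.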

2. Why does deleting a file that is still in use not free up space?

In Linux, when you delete a file, if the file is still used by a process (that is, an open file descriptor points to the file), the disk space of the file is not immediately released. This is because in Linux, file deletion actually deletes the association between the file name and inode, rather than the inode itself. Only when all file descriptors associated with the inode are closed will the inode be deleted and the corresponding disk space will be released.

If you have deleted a file that is still in use by a process, restart that process (or, as a last resort, the system) so that all of its file descriptors are closed and the disk space is freed.

You can use the lsof command to find such files.

(1) List files that have been deleted but not yet released: lsof | grep '(deleted)'

(2) List the deleted but unreleased files that take up the most space: lsof | grep '(deleted)' | sort -n -r -k 7,7 | head -n 10. Option analysis:

-n: sort by numeric value. By default, sort compares lines as strings; -n makes it compare them as numbers.
-r: sort in reverse order. By default, sort outputs in ascending order; -r makes it output in descending order.
-k 7,7: specify the sort field. By default, sort uses the entire line as the key; -k restricts the key to the given fields. Here, -k 7,7 means that only field 7 (the file size column in this lsof output) is used as the key.
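The behavior can be reproduced safely with a temporary file. The following is a minimal sketch that assumes a Linux system with /proc mounted; the function name show_deleted_open_file and the choice of descriptor 9 are arbitrary.

```shell
#!/bin/bash
# show_deleted_open_file   (hypothetical helper name)
# Demonstrates that a deleted file stays alive while a descriptor is open.
show_deleted_open_file() {
    local tmp
    tmp=$(mktemp) || return 1
    exec 9>"$tmp"             # open descriptor 9 on the file
    rm -f "$tmp"              # remove the name; the inode is still referenced
    # Child processes inherit descriptor 9, so readlink can inspect it
    # through its own /proc/self/fd entry. Linux marks it "(deleted)".
    readlink /proc/self/fd/9
    exec 9>&-                 # closing the last descriptor releases the space
}

show_deleted_open_file
```

The printed path ends in "(deleted)", which is the same marker that the lsof | grep '(deleted)' command looks for.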

5. Problems encountered

1. When the loop is fed through a pipeline, the variable value cannot be updated.

  IS_NORMAL=1
  df -BG | while read line
  do
    IS_NORMAL=0  
  done
  echo ${IS_NORMAL} 

In the code above, the value of IS_NORMAL printed at the end is always 1; the assignment to 0 inside the loop has no effect. Reason:

In a bash script, the commands joined by the pipe character | are executed in subshells by default. In this example, the while read line loop on the right side of the pipe runs in a subshell. Therefore, the modification of IS_NORMAL inside the loop happens in the subshell and does not affect the variable in the main shell.

To solve this problem, use process substitution instead, so that the while read line loop runs in the main shell. The modified code is as follows:

  IS_NORMAL=1
  while read line
  do
    IS_NORMAL=0  
  done < <(df -BG) 
  echo ${IS_NORMAL} 
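The difference can be checked without df by counting a fixed three-line input in both ways. This is an illustrative sketch with hypothetical function names; it assumes bash's default settings (the lastpipe option is not enabled).

```shell
#!/bin/bash
# Count input lines in two ways to show where the loop body runs.
count_with_pipe() {
    local count=0
    printf 'a\nb\nc\n' | while read -r _; do
        count=$((count + 1))      # runs in a subshell; the change is lost
    done
    echo "$count"
}

count_with_process_substitution() {
    local count=0
    while read -r _; do
        count=$((count + 1))      # runs in the current shell
    done < <(printf 'a\nb\nc\n')
    echo "$count"
}

count_with_pipe                    # prints 0
count_with_process_substitution    # prints 3
```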

6. Supplementary notes

1. The difference between du and df

du and df are both Linux commands for checking disk space usage, but they differ in how they are used and in the information they display.

(1) The du command: du is short for "disk usage". Its main function is to estimate the amount of space that files or directories occupy on disk. It scans the directory recursively and calculates the size of each subdirectory.

   Example: du -sh /home

This command displays the total size of the /home directory. The -s option displays only the total. The -h option displays sizes in an easy-to-read format (e.g. K, M, G).

(2) The df command: df is short for "disk free". Its main function is to display file system usage. It reports the disk space of all mounted file systems, including total space, used space, remaining space, and usage percentage.

   Example: df -h

This command displays the disk space usage of all mounted file systems in an easy-to-read format. The -h option displays sizes in an easy-to-read format (e.g. K, M, G).

Overall, the main difference is that du reports the size of a file or directory, while df reports the usage of each mounted file system.
