PostgreSQL's Vacuum needs to be executed regularly for the following reasons.
Free up and use the disk space occupied by updating or deleting updates.
Update statistics for PostgreSQL query plan.
Avoid resetting of transaction IDs causing very old data loss.
The execution of VACUUM's standard SQL article and other actual operations on the database can be processed in parallel.
Commands such as SELECT, INSERT, UPDATE, DELETE can continue to be executed as usual.
However, when VACUUM is being processed, commands such as ALTER TABLE ADD COLUMN cannot redefine the table.
Also, since there are a large number of I/O operations when executing VACUUM, other operations may have relatively low performance, such as the query response is very slow. To have less performance impact, it can be adjusted by parameters.
autovacuum (boolean ):
Whether the database server is set to automatic vacuum. The default is automatic vacuum. But if you want the vacuum to function properly, track_counts must be enabled. The track_counts parameter is set in the configuration file or through commands.
Even if the settings are not automatic vacuum. The system will automatically start if it is found that it is necessary to prevent the reset of the transaction ID.
log_autovacuum_min_duration (integer ):
Set the execution time of the vacuum that has been executed for more than how long it will output the log. Time unit milliseconds.
If this parameter is set to 0, all vacuum-related logs will be output.
If this parameter is set to -1, this is also the default setting. The output of the log is invalid, that is, all the vacuum-related logs are not output.
This parameter setting can be modified by the configuration file or through commands.
autovacuum_max_workers (integer ):
Sets the maximum number of vacuum processes that can be executed simultaneously. The default is 3.
This parameter setting can be modified by the configuration file or through commands.
autovacuum_naptime (integer ):
Sets the minimum delay for database execution vacuum. The time unit is a minute. The default is 1 minute. This parameter setting can be modified by the configuration file or through commands.
autovacuum_vacuum_threshold (integer ):
Set the update of VACUUM in any table to delete the minimum number of tuple. The default is 50.
This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
autovacuum_analyze_threshold (integer ):
Set the update of ANALYZE in any table and delete the minimum number of tuple. The default is 50.
This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
autovacuum_vacuum_scale_factor (floating point ):
When determining whether to trigger VACUUM, set the table capacity breakdown added to autovacuum_vacuum_threshold. The default is 0.2 (i.e. 20%).
This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
autovacuum_analyze_scale_factor (floating point ):
When ANALYZE is triggered, set the table capacity breakdown added to autovacuum_vacuum_threshold. The default is 0.1 (i.e. 10%).
This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
autovacuum_freeze_max_age (integer ):
To prevent the transaction ID from resetting, before VACUUM forced the operation, set the maximum value of the pg_class .relfrozenxid field of the table. The default is 200 million.
This parameter can only be set at startup.
autovacuum_vacuum_cost_delay (integer ):
Set cost delay in automatic vacuum operation. The default is 20 milliseconds. If the value of vacuum_cost_delay is set to -1, use the value of vacuum_cost_delay.
This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
autovacuum_vacuum_cost_limit (integer ):
Sets the maximum limit value of cost in automatic vacuum operation. The default is -1, and the value of vacuum_cost_limit is used at this time. This parameter setting can be modified by the configuration file or through commands. This setting can cover each table through the pg_autovacuum item.
Supplement: PostgreSQL Database Management: Regular Vacuum
Why is regular vacuum in PostgreSQL database management work?
The reasons are the following 3 points:
Free and reuse the disk space occupied by the updated/deleted rows.
Update statistics used in PostgreSQL query plan.
Prevents the loss of very old data due to resetting of transaction IDs.
The first reason is
The insertion, update and deletion of PostgreSQL data are not really placed in the database space. If space is released from time to time, due to too much data, the query speed will drop greatly.
The second reason is
When PostgreSQL is doing query processing, in order to improve query speed, it will determine the execution plan based on statistical data. If it is not updated in time, the query effect may not be as expected.
The third reason is
Each transaction in PostgreSQL will generate a transaction ID, but this number has an upper limit. When the transaction ID reaches the maximum value, it will start looping from the minimum value again. In this way, if the previous data is not released in time, the original old data will be lost due to the loss of the transaction ID.
Having said that, vacuum operations can be manually and automatically. If there are special database maintenance personnel, they can be carried out in time. However, in order to save maintenance costs, many systems need to rely on automatic vacuum.
Although regular vacuum is a weakness of PostgreSQL, after version 8.3, you can just hand over this task to automatic vacuum.
To make the automatic vacuum effective, the track_counts parameter must be set to true. For specific settings, please refer to the official documentation.
It is better to write a shell to automatically execute the vacuum regularly.
Execute the vacuumdb command in the shell. Example: ./vacuumdb -d databaseName --analyze
The above is personal experience. I hope you can give you a reference and I hope you can support me more. If there are any mistakes or no complete considerations, I would like to give you advice.