Preface
This article introduces the conditions and recovery process required by PostgreSQL based on point-in-time-recover, which is an operation instruction.
1. WAL log
WAL (Write Ahead Log) log priority writing mechanism is often used to maintain transaction persistence and consistency of relational databases. For example, Redo Log in MySQL is also a WAL log. However, if you want to implement PITR in MySQL, you need to use Binlog logs. I understand that MySQL can also use Redo for replication, or use Redo for PITR. However, MySQL is a multi-engine architecture, not only transaction engines, but other engines do not support Redo, so Binlog is used for PITR and master-slave replication.
In PostgreSQL, there is only a transaction engine, so WAL logs can be applied to both transaction systems, master-slave synchronization and PITR.
1.1 WAL configuration
If you want to implement PITR is equivalent to a full backup + incremental backup, then WAL is used for incremental backup. At this time, WAL archives need to be enabled:
wal_level = replica # minimal, replica, or logical archive_mode = on # enables archiving; off, on, or always # (change requires restart) # The goal here is to archive the directory, which needs to be created in advancearchive_command = 'cp %p /data/pgsql12/archive/%f && echo %f >> /data/pgsql12/archive/' # command to use to archive a logfile segment
After the configuration is completed, the database needs to be restarted:
pg_ctl -D /data/pgsql12/data/ -l /data/pgsql12/logs/ restart
2. pg_basebackup
pg_basebackup is a physical hot standby tool that comes with PostgreSQL. However, it only supports backup of the entire instance, and cannot perform single library and single table backup. It can be used for backup library construction:
pg_basebackup [OPTION]...
parameter | meaning |
---|---|
-D/–pg-data | Backup file directory, which means writing the backup file to which directory |
-F/–format | The default is p, p and t can be selected |
-r/–max-rate | Maximum rate limit for data transmission |
-R/–write-recovery-conf | Output configuration information for replication |
-X, --wal-method | Specify the way to copy wal logs, including none, fetch, and stream. It is recommended to use stream to avoid receiving wal information. The source log is overwritten. |
-z, --gzip | Whether to compress, use with -F t |
-Z, --compress=0-9 | Compression level, the larger the number, the greater the compression rate, the more CPU resources are consumed |
-c, --checkpoint | Set checkpoint mode: fast, spread |
-C, --create-slot | Create a copy slot |
-S, --slot=SLOTNAME | Specify the copy slot name |
-l, --label=LABEL | Specify a backup identifier to facilitate subsequent maintenance by operation and maintenance personnel |
-n, --no-clean | do not clean up after errors |
-N, --no-sync | do not wait for changes to be written safely to disk |
-P, --progress | Print backup progress information |
-v, --verbose | Output detailed information |
2.1 Common Commands
Common commands for basic backup and recovery:
pg_basebackup -D Backup file directory -v -P -Upostgres -h 127.0.0.1 -p5432 -R
3. Full recovery experiment
3.1 Make a full preparation
pg_basebackup -D /data/pgsql12/backup -v -P -Upostgres -h 127.0.0.1 -p5432 -R
pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 2/6D000028 on timeline 2 pg_basebackup: starting background WAL receiver pg_basebackup: created temporary replication slot "pg_basebackup_74591" 2455778/2455778 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 2/6D000138 pg_basebackup: waiting for background process to finish streaming ... pg_basebackup: syncing data to disk ... pg_basebackup: base backup completed
3.2 Write incremental data
postgres=# update pgbench_accounts set abalance = 100 where bid = 41; UPDATE 100000
postgres=# select pg_current_wal_lsn(); pg_current_wal_lsn -------------------- 2/70D63FD0 postgres=# select now(); now ------------------------------- 2023-08-08 13:51:34.475891+08
3.3 Simulate backup recovery
The worst thing happened at this time. You need to restore the data. You need to close the database first:
pg_ctl -D /data/pgsql12/data/ stop waiting for server to shut down.... done server stopped
Rename the data directory and backup the basics
# Rename the backup directorymv data data_bak # Move backup files to data directorymv backup data # Modify the data directory to 700 otherwise it cannot be started# DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750). chmod 700 /data/pgsql12/data/
Modify the file under the data directory:
restore_command = 'cp /data/pgsql12/archive/%f %p > /data/pgsql12/archive/ 2>&1 ' # Recover via LSNrecovery_target_lsn='2/70D63FD0' # Recover by time point 2 Select 1recovery_target_time = '2023-08-08 13:51:34' # After reaching the recovery target, the data action promote means that you can directly accept connections and tests. It is recommended to use pause (default value) to confirm that the data is recovered and delete the file manually.recovery_target_action = 'promote'
After configuring these, restart the database:
pg_ctl -D /data/pgsql12/data start -l /data/pgsql12/logs/
Verify that the incremental data just now has been restored:
postgres=# select count(*) from pgbench_accounts where bid = 41 and abalance = 100; count -------- 100000
Summarize
To implement PITR, PostgreSQL requires setting the wal_level configuration parameter to replica or higher, and a reasonable configuration of archiving mechanism and archiving cleaning mechanism. This article only introduces the recovery process. How to configure a production environment backup recovery, we will introduce it in the next article.
This is the end of this article about PotgreSQL based on point-in-time recovery. For more related PotgreSQL point-in-time recovery content, please search for my previous articles or continue browsing the related articles below. I hope everyone will support me in the future!