PostgreSQL streaming replication with HA master-standby switching
Environment description and host planning
Operating System | Host Name | IP Address | Role | Port
---|---|---|---|---
CentOS 7 | master | 10.0.0.11 | PG-Master | 54321
CentOS 7 | slave | 10.0.0.12 | PG-Slave | 54321
CentOS 7 | pool | 10.0.0.13 | pgpool | 54321
Basic environment configuration (all host operations)
Configure HOSTS
echo -e "10.0.0.11 master\n10.0.0.12 slave\n10.0.0.13 pool" >> /etc/hosts # Execute once
Configure time synchronization (skip if already configured)
yum install -y ntpdate && ntpdate <ntp_server>   # <ntp_server>: substitute your NTP server; the original address was lost
echo -e "# sync time\n5 * * * * /usr/sbin/ntpdate <ntp_server> > /dev/null 2>&1" >> /var/spool/cron/root   # Write the cron job; execute once
Create postgres user
useradd postgres && echo "your_password" | passwd --stdin postgres
Configure passwordless SSH login
su - postgres
ssh-keygen -t rsa -f /home/postgres/.ssh/id_rsa -P ""
cd ~/.ssh/
ssh-copy-id postgres@master                  # Execute on all three hosts
scp authorized_keys postgres@slave:~/.ssh    # Execute only on the master host
scp authorized_keys postgres@pool:~/.ssh     # Execute only on the master host
Install Postgresql database (PG9.6)
yum install -y https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
yum install -y postgresql96-server postgresql96-contrib postgresql96 postgresql96-libs
Create a unified directory structure
mkdir /data1/pg_{data,bin,logs} -p
chown -R postgres:postgres /data1/
Modify system variables
vi /etc/profile
# Add the following content:
export PGHOME=/usr/pgsql-9.6/
export PGDATA=/data1/pg_data
export PGPORT=54321
export PATH=$PATH:$PGHOME/bin
# Make it effective:
source /etc/profile
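Before continuing, a quick sanity check can confirm the variables resolve as expected (a minimal sketch; it simply mirrors the exports above):

```shell
# Mirror the exports from /etc/profile and confirm they resolve
export PGHOME=/usr/pgsql-9.6/
export PGDATA=/data1/pg_data
export PGPORT=54321
export PATH=$PATH:$PGHOME/bin
echo "PGHOME=$PGHOME PGDATA=$PGDATA PGPORT=$PGPORT"
```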
PostgreSQL stream replication structure (master and slave host operations)
Master Host Operation
Initialize the system
/usr/pgsql-9.6/bin/postgresql96-setup initdb
vi /usr/lib/systemd/system/postgresql-9.6.service
Modify postgresql-9.6.service; the content is as follows:
# Include the default config:
.include /usr/lib/systemd/system/postgresql-9.6.service
[Service]
Environment=PGDATA=/data1/pg_data
Restart PG service
systemctl daemon-reload
su - postgres -c '/usr/pgsql-9.6/bin/initdb -D /data1/pg_data'
systemctl restart postgresql-9.6
systemctl enable postgresql-9.6
Modify the system configuration (the following is the postgres user operation)
cp /data1/pg_data/pg_hba.conf{,.bak}
cat > /data1/pg_data/pg_hba.conf <<EOF
local   all         all                         trust
host    all         all         10.0.0.11/32    trust
host    all         all         10.0.0.12/32    trust
host    all         all         0.0.0.0/0       md5
host    all         all         ::1/128         trust
host    replication stream_replication 0.0.0.0/0 md5
EOF
# The last entry is for the streaming replication user
Configuration for a host with 64 GB of memory
cp /data1/pg_data/postgresql.conf{,.bak}
cat > /data1/pg_data/postgresql.conf <<EOF
listen_addresses = '*'
port = 54321
max_connections = 256
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 64MB
maintenance_work_mem = 2GB
min_wal_size = 2GB
max_wal_size = 4GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
wal_level = hot_standby
wal_log_hints = on
max_wal_senders = 1
hot_standby = on
logging_collector = on
log_directory = 'pg_log'
EOF
# Remember to restart after the change: pg_ctl restart
Configuration for a host with 128 GB of memory
listen_addresses = '*'
port = 54321
max_connections = 256
shared_buffers = 32GB
effective_cache_size = 96GB
work_mem = 128MB
maintenance_work_mem = 2GB
min_wal_size = 2GB
max_wal_size = 4GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
wal_level = hot_standby
wal_log_hints = on
max_wal_senders = 1
hot_standby = on
logging_collector = on
log_directory = 'pg_log'
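Both profiles follow the common rule of thumb of roughly 25% of RAM for shared_buffers and 75% for effective_cache_size (an observation about the numbers above, not a statement from the original article). A quick sketch of the arithmetic:

```shell
# ~25% of RAM for shared_buffers, ~75% for effective_cache_size
for ram_gb in 64 128; do
  echo "${ram_gb}GB host: shared_buffers=$((ram_gb / 4))GB effective_cache_size=$((ram_gb * 3 / 4))GB"
done
```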
Create the streaming replication user (stream_replication) and the PGPool health-check user (srcheck) on the primary
CREATE USER stream_replication replication LOGIN CONNECTION LIMIT 5 ENCRYPTED PASSWORD 'your_password';
CREATE USER srcheck replication LOGIN CONNECTION LIMIT 5 ENCRYPTED PASSWORD 'your_password';
Make sure the primary's pg_hba.conf contains the streaming replication entry (already included in the cat > pg_hba.conf block above):
host replication stream_replication 0.0.0.0/0 md5
Slave Host Operation
Initialize the system
/usr/pgsql-9.6/bin/postgresql96-setup initdb
vi /usr/lib/systemd/system/postgresql-9.6.service
Modify postgresql-9.6.service; the content is as follows:
# Include the default config:
.include /usr/lib/systemd/system/postgresql-9.6.service
[Service]
Environment=PGDATA=/data1/pg_data
Restart PG service
systemctl daemon-reload
Copy the basic backup to the backup server
rm -rf /data1/pg_data   # Safe only if there is no important data here; the directory will be rebuilt from the primary
su - postgres -c 'pg_basebackup -D $PGDATA --format=p -h master -p 54321 -U stream_replication -W'
Modify the standby configuration
cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
vi $PGDATA/recovery.conf
Add the following content:
standby_mode = 'on'
primary_conninfo = 'host=master port=54321 user=stream_replication password=your_password'
restore_command = ''
recovery_target_timeline = 'latest'
# Restart the PG service
systemctl restart postgresql-9.6
systemctl enable postgresql-9.6
Verification
Master node execution
create table test (id int4, create_time timestamp(0) without time zone);
insert into test values (1, now());
select * from test;
Standby node execution
select * from test;
Other queries
Connect to the test database and run the command below: it returns f on the primary and t on the standby.
select pg_is_in_recovery();
Run the following command to view the transaction snapshot on both the primary and the standby; each additional write on the primary advances the value by 1.
select txid_current_snapshot();
Run the following command to view the main and backup synchronization status.
select * from pg_stat_replication;
The state field shows the synchronization status: startup (connecting), catchup (catching up), or streaming (streaming). The sync_state field shows the mode: async (asynchronous), sync (synchronous), or potential (currently asynchronous, but able to be promoted to synchronous).
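A small hypothetical helper that interprets such a state/sync_state pair; the sample value is hard-coded here for illustration, whereas in practice it would come from something like `psql -At -c "select state, sync_state from pg_stat_replication"`:

```shell
# Interpret the state/sync_state pair from pg_stat_replication (sample hard-coded)
sample="streaming|async"
state=${sample%%|*}
sync_state=${sample##*|}
case $state in
  startup)   echo "standby is connecting" ;;
  catchup)   echo "standby is catching up" ;;
  streaming) echo "standby is streaming ($sync_state mode)" ;;   # → standby is streaming (async mode)
esac
```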
Main and backup switch
Suppose the primary crashes: how does the standby switch from read-only to read-write? Set hot_standby to off on the standby, delete recovery.conf, and restart the instance; it can then serve read-write traffic.
PGPool2 (pool host operation)
Install PGPool2
yum install -y http://www.pgpool.net/yum/rpms/3.6/redhat/rhel-7-x86_64/pgpool-II-release-3.6-1.noarch.rpm
yum -y install pgpool-II-pg96 pgpool-II-pg96-debuginfo pgpool-II-pg96-devel pgpool-II-pg96-extensions
systemctl enable pgpool   # Enable automatic startup
Add Pgpool-II to run the user
useradd postgres   # Skip if the user already exists
chown -R postgres:postgres /etc/pgpool-II
chown -R postgres:postgres /var/run/pgpool/
Configure pool_hba.conf
cp /etc/pgpool-II/pool_hba.conf{,.bak}
vi /etc/pgpool-II/pool_hba.conf
Add content
host all all 0.0.0.0/0 md5
Configure pool_passwd
Execute after logging in:
postgres=# select rolname, rolpassword from pg_authid;
      rolname       |             rolpassword
--------------------+-------------------------------------
 pg_signal_backend  |
 srcheck            | md5662c10f61b27a9ab38ce69157186b25f
 postgres           | md5d3612d57ee8d4c147cf27b11e3a0974d
 stream_replication | md59279ef6b904bc483e4f85e6d44cfc0ed
(4 rows)
vi /etc/pgpool-II/pool_passwd
Add entries from the SQL results, one per line in the form $rolname:$rolpassword. For example:
srcheck:md5662c10f61b27a9ab38ce69157186b25f
or:
pg_md5 -u postgres your_password
vi /etc/pgpool-II/pcp.conf   # Add a line: postgres:<output of the previous command>
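For reference, each md5 value above is "md5" followed by the MD5 of the password concatenated with the username (this is PostgreSQL's stored md5-auth format). A local sketch of how such a pool_passwd entry is derived, using the article's placeholder credentials:

```shell
# Derive a pool_passwd-style entry locally: stored value = "md5" + md5(password || username).
# srcheck / your_password are the placeholder credentials used throughout this article.
user=srcheck
password=your_password
hash=$(printf '%s%s' "$password" "$user" | md5sum | awk '{print $1}')
echo "$user:md5$hash"
```

This is the same format `pg_md5 -u srcheck your_password` would produce.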
Configure pgpool.conf
cp /etc/pgpool-II/pgpool.conf{,.bak}
vi /etc/pgpool-II/pgpool.conf
The content is as follows:
# CONNECTIONS
listen_addresses = '*'
port = 54321
socket_dir = '/var/run/pgpool'
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/var/run/pgpool'
# - Backend Connection Settings -
backend_hostname0 = 'master'
backend_port0 = 54321
backend_weight0 = 1
backend_data_directory0 = '/data1/pg_data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'slave'
backend_port1 = 54321
backend_weight1 = 1
backend_data_directory1 = '/data1/pg_data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
# - Authentication -
enable_pool_hba = on
pool_passwd = 'pool_passwd'
# FILE LOCATIONS
pid_file_name = '/var/run/pgpool/pgpool.pid'
logdir = '/data1/pg_logs'
replication_mode = off
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 5
sr_check_user = 'srcheck'
sr_check_password = '123456'
sr_check_database = 'postgres'
# HEALTH CHECK
health_check_period = 10
health_check_timeout = 20
health_check_user = 'srcheck'
health_check_password = '123456'
health_check_database = 'postgres'
# FAILOVER AND FAILBACK
failover_command = '/data1/pg_bin/failover_stream.sh %H'
failover_stream.sh script
vim /data1/pg_bin/failover_stream.sh
chmod 777 /data1/pg_bin/failover_stream.sh
chmod u+s /sbin/ifconfig
chmod u+s /usr/sbin
pgpool -n -d -D > /data1/pg_logs/pgpool.log 2>&1 &   # Start pgpool
pgpool -m fast stop                                   # Stop pgpool
failover_stream.sh content:
#!/bin/sh
# Failover command for streaming replication.
# Arguments: $1: new master hostname.
new_master=$1
trigger_command="$PGHOME/bin/pg_ctl promote -D $PGDATA"
# Promote the standby database.
/usr/bin/ssh -T $new_master $trigger_command
exit 0
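To see what the failover script does without touching a live cluster, here is a dry-run sketch in which `ssh` is stubbed out by a local shell function (the stub and the hard-coded values are purely illustrative, not part of the article's setup):

```shell
# Dry-run of the failover command construction; `ssh` is stubbed locally so
# nothing is executed remotely.
PGHOME=/usr/pgsql-9.6
PGDATA=/data1/pg_data
new_master=slave
trigger_command="$PGHOME/bin/pg_ctl promote -D $PGDATA"
ssh() { host=$1; shift; echo "would run on $host: $*"; }   # illustrative stub
ssh "$new_master" $trigger_command
# → would run on slave: /usr/pgsql-9.6/bin/pg_ctl promote -D /data1/pg_data
```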
Login settings
Once pgpool is running (pgpool -n -d -D > /data1/pg_logs/pgpool.log 2>&1 &), you can view the cluster status:
[postgres@pool pgpool-II]$ psql -p 54321 -h 10.0.0.13 -U srcheck -d postgres
postgres=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
       0 | master   | 54321 | up     | 0.500000  | primary | 0          | false             | 0
       1 | slave    | 54321 | up     | 0.500000  | standby | 0          | true              | 0
(2 rows)
If a node is missing from the cluster status, register the master and slave nodes from the pool host:
[postgres@pool ~]$ pcp_attach_node -d -U postgres -h pool -p 9898 -n 0
[postgres@pool ~]$ pcp_attach_node -d -U postgres -h pool -p 9898 -n 1
# For option details, consult the pcp_attach_node command
HA switch
Simulate master host downtime
Master side:
[postgres@master ~]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
Current cluster status
[postgres@pool ~]$ psql -p 54321 -h 10.0.0.13 -U srcheck -d postgres
psql (9.6.1)
Type "help" for help.
postgres=# show pool_nodes;
 node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------
       0 | master   | 5432 | down   | 0.500000  | standby | 0          | false             | 0
       1 | slave    | 5432 | up     | 0.500000  | primary | 0          | true              | 0
(2 rows)
Note that the master is now reported in the standby role with status down.
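A small hypothetical helper (the sample output is hard-coded for illustration; in practice it would come from `show pool_nodes`) that counts how many backends are down:

```shell
# Count "down" backends in `show pool_nodes` output; the status is the 4th
# pipe-separated column. Sample output is hard-coded for illustration.
pool_nodes=' 0 | master | 5432 | down | 0.500000 | standby
 1 | slave  | 5432 | up   | 0.500000 | primary'
down_count=$(printf '%s\n' "$pool_nodes" | awk -F'|' '{gsub(/ /,"",$4); if ($4 == "down") n++} END {print n+0}')
echo "down nodes: $down_count"   # → down nodes: 1
```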
Modify the master and start
After the master host goes down, the slave's PG instance becomes the primary. To bring the old master back, simply configure it as a standby of slave:
[postgres@master ~]$ vim $PGDATA/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=slave port=54321 user=stream_replication password=your_password'
restore_command = ''
recovery_target_timeline = 'latest'
Synchronize the timeline
# If a timeline conflict is reported, stop the PG service first and then synchronize the
# timeline; if there is no conflict, simply check the status.
[postgres@master ~]$ pg_rewind --target-pgdata=/data1/pg_data --source-server='host=slave port=54321 user=postgres dbname=postgres'
servers diverged at WAL position 0/5000098 on timeline 1
rewinding from last common checkpoint at 0/5000028 on timeline 1
Done!
# Restart the database
[postgres@master ~]$ pg_ctl start
Check the current status again
[postgres@pool ~]$ psql -p 54321 -h 10.0.0.13 -U srcheck -d postgres
postgres=# show pool_nodes;
 node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------
       0 | master   | 5432 | down   | 0.500000  | standby | 0          | false             | 0
       1 | slave    | 5432 | up     | 0.500000  | primary | 0          | true              | 0
(2 rows)
# Although the master has been restarted, it is still shown as down: it must be re-attached
# to pgpool manually. The master's node_id is 0, hence -n 0.
[postgres@pool ~]$ pcp_attach_node -d -U postgres -h pool -p 9898 -n 0
# When prompted, enter the pcp management password
# Check the status again
[postgres@pool ~]$ psql -p 54321 -h 10.0.0.13 -U srcheck -d postgres
postgres=# show pool_nodes;
 node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------
       0 | master   | 5432 | up     | 0.500000  | standby | 0          | false             | 0
       1 | slave    | 5432 | up     | 0.500000  | primary | 0          | true              | 0
(2 rows)
Now both nodes are up.
Master-slave two node pgpool health check script (pgpool_check.sh)
Note: this script assumes pgpool is installed only on the master and slave hosts. Once a pgpool process is running on the master host, execute sh pgpool_check.sh & on the slave host.
#!/bin/bash
# Check the master host's pgpool process
while true
do
    pgcount=$(nmap 10.0.0.11 | egrep '9898|9999' | wc -l)
    if [ $pgcount -eq 2 ]; then
        echo 'Master host pgpool is GOOD!!!' > /dev/null 2>&1
    else
        echo -e "Master host pgpool is \033[31m BAD!!! \033[0m"
        echo -e "Master host pgpool is \033[31m BAD!!! \033[0m"
        echo -e "Master host pgpool is \033[31m BAD!!! \033[0m"
        echo -e "SYSTEM WILL DO THE SHELL : \033[34m su - postgres -c 'pgpool -n -d -D > /data1/pg_logs/pgpool.log 2>&1 &' \033[0m"
        su - postgres -c "pgpool -n -d -D > /data1/pg_logs/pgpool.log 2>&1 &"
        pgport=$(netstat -lntup | egrep '9898|9999' | wc -l)
        [ $pgport -gt 0 ] && echo -e "Slave host pgpool is \033[32m RUNNING!!! \033[0m"
        exit 0
    fi
done
The above is based on my personal experience; I hope it serves as a useful reference. If there are any mistakes or points I have not fully considered, corrections and advice are welcome.