Use Bucardo5 to implement the primary database replication of PostgreSQL

The next generation of asynchronous multiple primary database replication systems, Bucardo 5, has been released. This version removes the restrictions on the two database sources in the old version, allowing for more source databases (i.e. the primary database) and more target databases (i.e. the backup database). Bucardo can also be copied to other types of target databases, including MySQL, MariaDB, Oracle, SQLite, MongoDB, and Redis. Bucardo has been completely rewritten, and this version is more powerful and more efficient than the previous version of Bucardo 4. You can visit the Bucardo wiki to find the latest versionBucardo。

This article quickly introduces Bucardo. Future blog posts will introduce the power of Bucardo, and now we will introduce how to simply implement multi-master database replication.

For the convenience of demonstration, I usedAmazon Web Services(AWS) provides a fast-created, free-to-use server, that is, a basic server running Amazon Linux. If you follow the prompts, it will create a server instance for free and simply for you. Once the instance is created successfully, we can use the ec2-user account to log in to the server through the SSH protocol, and then we can start installing PostgreSQL and Bucardo.

# Always a good idea:
$ sudo yum update
# This also installs other postgresql packages:
$ sudo yum install postgresql-plperl
# Create a new Postgres cluster:
$ initdb btest

At this time, we still can't expect you PostgreSQL, because this release version of the socket communication directory uses /var/run/postgresql and /tmp. After adjusting the permissions of the first directory, we can start PostgreSQL and create the first test database:

$ sudo chmod 777 /var/run/postgresql
$ pg_ctl -D btest -l logfile start
$ createdb shake1

Next we can copy the database! To get sample data, I used the open source Shakespeare project. It has a small, easy to load, easy to use, simple database schema.This small project on githubThis includes a ready-made PostgreSQL database schema, and now we can load it into the new database:

$ sudo yum install git
$ git clone -q /catherinedevlin/
$ psql shake1 -q -f opensourceshakespeare/
# You can safely ignore the 'role does not exist' errors

We intend to create copies of this database that can be treated as other data sources. In other words, these servers have the same data and can be written. Implementing these is very simple:

$ createdb shake2 -T shake1
$ createdb shake3 -T shake1

Bucardo needs to install some dependency packages. If the operating system you installed is different, then the dependency packages you may want to install are different: Here are the dependency packages that Amazon Linux needs to install when I wrote this article. (If you are lucky, your release package may already include Bucardo. In this case, the following execution steps are not required. You just need to run "yum install bucard" - but make sure you are using version 5 or better! (View it through yum info bucardo))

$ sudo yum install perl-ExtUtils-MakeMaker perl-DBD-Pg \
> perl-Encode-Locale perl-Sys-Syslog perl-boolean \
> perl-Time-HiRes perl-Test-Simple perl-Pod-Parser
$ sudo yum install cpan
$ echo y | cpan DBIx::Safe

The yum repository of this system does not contain the Perl module DBIx::Safe, so we need to passCPANLet's install this module. Once all the above dependencies are successfully installed, we are ready to install Bucardo. We will obtain the official compression package, verify, decompress, and then install:

$ wget -nv /
$ wget -nv /
$ gpg -q --keyserver  --recv-key 14964AC8
$ gpg --verify 
$ tar xfz  $ ln -s Bucardo-5.0.0 bucardo
$ cd bucardo
$ perl 
$ make
$ sudo make install

We make some minor adjustments to the bucardorc file (the file that sets some global information). Then run "bucardo install", this command will create the bucardo main database, which contains the information required by the Bucardo service process:

$ mkdir pid
$ echo -e "piddir=pid\nlogdest=." > .bucardorc
$ bucardo install --batch --quiet
Creating superuser 'bucardo'

Now that Bucardo is installed, I'm ready to copy it next. At this point, we have three databases that can be copied from each other. Below we can only use two commands to copy three databases to each other:

 bucardo add dbs s1,s2,s3 dbname=shake1,shake2,shake3
Added databases "s1","s2","s3"
$ bucardo add sync bard dbs=s1:source,s2:source,s3:source tables=all
Added sync "bard"
Created a new relgroup named "bard"
Created a new dbgroup named "bard"
 Added table ""
 Added table ""
 Added table "public.character_work"
 Added table ""
 Added table ""
 Added table ""

In the first command, we tell Bucardo how to connect to three databases. We tell Bucardo the name of the database, and then Bucardo treats these three databases as (s1, s2, s3). You can also specify ports and hosts, but in this example, the default port is 5432 and there is no need for a host (using the Unix Socket communication mechanism).

The second command creates a named replication system with a sync name bard. Bucardo needs to know where to copy and how to copy, so we tell it to use three databases s1, s2 and s3. Each database can be used as a source database, so we added such information to them. Finally we need to know what to copy. In this example, what we need to copy is all tables (or more precisely, all databases with primary keys or unique indexes). Note: Bucardo always puts the database and tables in the named group - in this example we just hardcoded it to 10, however usually this value is the length of the table view controller array. Now in the example, all this is done automatically, and both dbgroup and relgroup are named after sync.

We verify whether the replication is running, that is, check whether the update row is copied to all databases contained in sync:

$ bucardo start
$ psql shake1 -c \
> "update character set speechcount=123 where charname='Hamlet'"
UPDATE 1
$ for i in {1,2,3}; do psql shake$i -tc "select \
> current_database(), speechcount from character \
> where charname='Hamlet'"; done | grep s
 shake1    |   123
 shake2    |   123
 shake3    |   123

We can also check Bucardo's log file "" to see if there is a copy operation:

$ tail -2 
(25181) KID (bard) Delta count for ."character": 1
(25181) KID (bard) Totals: deletes=2 inserts=2 conflicts=0

There are two deletes and two insert commands above, because updating a line means that the delete is first run on the other two databases, and then the insert (technically adopted COPY). Next, let's see how Bucardo handles conflicts. We will update the same line on all servers, which will create a conflict:

$ for i in {1,2,3}; do psql shake$i -tc \
> "update character set speechcount=$i$i$i \
> where charname='Hamlet'"; done
UPDATE 1
UPDATE 1
UPDATE 1

Checking the log shows that there is indeed a conflict, and it also resolves the conflict well. The default conflict resolution indicates that the last updated database is the winner, and now all three databases have the same rows as the last updated database.

$ tail 
(25181) KID (bard) Delta count for ."character": 1
(25181) KID (bard) Delta count for ."character": 1
(25181) KID (bard) Delta count for ."character": 1
(25181) KID (bard) Conflicts for public."character": 1
(25181) KID (bard) Conflicts have been resolved
(25181) KID (bard) Totals: deletes=2 inserts=2 conflicts=1
 
$ for i in {1,2,3}; do psql shake$i -tc \
> "select current_database(), speechcount \
> from character where charname='Hamlet'"; done | grep s
 shake1    |   333
 shake2    |   333
 shake3    |   333

When we developed this example, Bucardo sometimes ran very fast, so there was no conflict. That is, because the update is executed sequentially. So before the next update, there is a time window that allows Bucardo to complete the copying of the update. In addition, the "pause sync" function is also very convenient. As long as you need to temporarily stop running sync, run the following command:

$ bucardo pause bard
Syncs paused: bard
$ psql shake1 -c "update character set speechcount=1234 where charname='Hamlet'"
UPDATE 1
$ psql shake2 -c "update character set speechcount=4321 where charname='Hamlet'"
UPDATE 1
$ bucardo resume bard
Syncs resumed: bard
 
$ tail 
(27344) KID (bard) Delta count for ."character": 1
(27344) KID (bard) Delta count for ."character": 1
(27344) KID (bard) Conflicts for public."character": 1
(27344) KID (bard) Conflicts have been resolved
(27344) KID (bard) Totals: deletes=2 inserts=2 conflicts=1

Bucardo 5 has a lot more features than we demonstrate here. In future blog posts, we will include other functions that it can accomplish, from copying to non-PostgreSQL systems such as Oracle, Mysql, or MongoDB to using custom conflict solutions. And when copying, it performs conversion on running data. If you have any questions, please state it in the comments below, or write a text message to Bucardo mailing list bucardo-general@.

Over the years, this major version would not have been released without many people contributing code, raising vulnerabilities, testing Bucardo, and asking (or answering!). CheckChangesFiles, you can see a list of some contributors. Thank you all, especially Jon Jensen, who started this project a long time ago.