PostgreSQL replication and Bucardo |

PostgreSQL streaming replication in PostgreSQL 9.0 and later is excellent if it is suitable, and it is relatively easy to set up, with several walkthroughs on the Internet. The main limitation of the streaming replication is that all the servers which are replicating have to be pretty much identical. They have to be:

Same major version (eg all 9.0, all 9.1 etc). You cannot use streaming replication to move the data from one version to a later version
Same ‘bitness’. They either all have to be 32 bit, or all 64 bit versions of PostgreSQL
Same OS. You cannot replicate from Windows to Linux, or vice versa, or even from one Linux distro to another.

Also, the entire database cluster has to be replicated. You cannot just replicate some databases or tables and not others. For many situations – eg disaster recovery – these limitations are not a big problem, so the streaming replication fits the bill. However, we recently had a situation where we wanted to move some databases from v9.1 on Windows to v9.2 on Ubuntu. The ‘normal’ way of performing this migration would be to shut down the application so it did no more writes to the database, then do a pg_dump from the old server and pg_restore on the new server. This will work between versions and platforms, so would work fine. The major problem is that this process was going to take between 12 and 36 hours for each database we had to move. While this wasn’t going to be impossible, it was certainly going to be inconvenient. On the PostgreSQL discussion sites, the recommended solution to this problem is to use a third party replication tool, such as Slony, Bucardo or Londiste. Brief investigation suggested Slony was a bit complicated, Londiste may not work because it didn’t seem to be supported on Windows, but Bucardo might possibly work.

Bucardo

The reason Bucardo might work is that, while Bucardo is only supported on Linux, the Bucardo software doesn’t need to run on the same PC as either of the PostgreSQL servers involved in the replication. So, theoretically, you could run Bucardo on a Linux PC, and have it managing the replication between two Windows PostgreSQL servers. In our particular case, we could run Bucardo on the target PostgreSQL server (running Ubuntu) and let it talk remotely to the source Windows PostgreSQL server. The only requirements for the source & target PostgreSQL servers are that they are PostgreSQL 8.1 or higher and have the Pl/pgsql language installed. In our case this wasn’t a problem. Unfortunately the Bucardo documentation is a bit lacking, so, it took several false starts where we couldn’t get the hang of where various things were meant to go. However, we seem to be there now.

Control tool issues

The most painful thing we found out was that the control tool (Bucardo_ctl) is a bit lacking in functionality. For instance, while you can use ‘bucardo_ctl add all tables‘ to add all the tables from the database into the replication, there doesn’t seem to be a way to remove all the tables. This led to several times where we had to drop the ‘bucardo’ database from the management PC and start from scratch. The bucardo_ctl tool primarily seems to be designed to work with databases where you will just want to replicate a few tables. We wanted to replicate all the tables – which is getting on for 100 tables, so entering each table by name was not practical. I believe Bucardo v5 will support wildcards in the command line parameters, which should improve things immensely.

Primary Keys

The other thing is that for replication to work well, the tables involved must have a ‘primary key’ defined. In our VPOP3 mail server software (which is what we were wanting to move), several of the tables did not have a primary key. These often had unique indexes which could have been used as primary keys, or were small tables which didn’t really need the overhead of an index (eg just a few rows). Bucardo does have a delta sync method for when you have primary keys, and a full table copy method for when you do not have primary keys. However, the full table copy method is obviously inefficient. Also, Bucardo won’t automatically pick which method to use. So, what this meant was that we tried to set up the sync, and found it failed because we were missing a primary key. Then, because we couldn’t just remove all the tables from Bucardo, we had to drop the entire database and start again from scratch after adding the primary key definition. Several times. In the end we found this blog which gives a method for finding tables without a primary key. The SQL presented there is slightly flawed because it also returns views without a primary key, which are irrelevant. The corrected SQL is:

SELECT table_catalog, table_schema, table_name FROM information_schema.tables WHERE 
table_type <> 'VIEW' AND (table_catalog, table_schema, table_name) NOT IN 
(SELECT table_catalog, table_schema, table_name FROM information_schema.table_constraints 
WHERE constraint_type = 'PRIMARY KEY') AND table_schema NOT IN 
('information_schema', 'pg_catalog');

We then used this to go through all the tables and create primary keys where they were missing.

The Bucardo Setup

Then, we installed Bucardo on the Ubuntu server using sudo apt-get install bucardo. We needed to alter the pg_hba.conf file on the Ubuntu server to allow the bucardo user access, by adding the line: host all bucardo 127.0.0.1/32 trust (make sure you add this line before any other lines referring to 127.0.0.1/32 We had to create a super user called Bucardo on the source Windows PostgreSQL server

CREATE USER bucardo SUPERUSER;

and add a trust line to the pg_hba.conf file

host all bucardo 192.168.1.0/24 trust

Then we copied the schema from the Windows server to the Ubuntu server by creating the appropriate user(s) and database and then running this command on the Ubuntu server itself

pg_dump -U postgres --schema-only -h 192.168.1.100 vpop3 |psql vpop3

Now, we added both servers to Bucardo

bucardo_ctl add db vpop3 name=vpop3here

bucardo_ctl add db vpop3 name=vpop3there host=192.168.1.100

Next, we added the tables and sequences from the source to a Bucardo ‘herd’ called ‘myherd’

bucardo_ctl add all tables db=vpop3there --herd=myherd

bucardo_ctl add all sequences db=vpop3there --herd=myherd

Create a sync job

bucardo_ctl add sync mysync type=pushdelta source=myherd targetdb=vpop3here

Now, we can tell Bucardo to do a full copy of the source to the target to populate the target db

bucardo_ctl update sync mysync onetimecopy=2

Start the Bucardo sync job

sudo bucardo_ctl start

and wait…

You can tell when the sync is up to date by periodically running

bucardo_ctl status

or monitoring the contents of tables on the target server

If/when you want to stop using the old database server, then you stop the application, then run

bucardo_ctl kick mysync 0

This tells Bucardo to start the sync process running, and not to return until it is up to date

Then, you can reconfigure the application to use the new database server, and disable/remove the old database.

===Update===

While we found the above works with small-ish databases (up to 10GB or so), we had problems with large databases. What would happen is that the ‘onetimecopy’ process would start, run for a bit, then stop for no obvious reason. Bucardo would then restart it. This would happen many times. The disk would eventually fill up due to lots of dead tuples, but the data wouldn’t come over.

An alternative way which works is to set up the sync as above, but change the sync creation command to

bucardo_ctl add sync mysync type=pushdelta source=myherd targetdb=vpop3here status=inactive

This starts the sync process running on the source, but the changes don’t get migrated to the target – yet.

Now, do a backup / restore using pg_dump & pg_restore. But note that you don’t need to stop the application while doing this. Because the backup is done AFTER starting the sync process, then any changes which the application makes after the backup starts will be logged by Bucardo. It doesn’t really matter how long is between starting the sync, and running the backup; Bucardo can handle changes which are made to the database before its starting ‘image’ of the database.

Once you have finished the restore, then run

bucardo_ctl update sync mysync status=active

to start the sync process running

news and info