Delving Into the Performance Features of Logical Replication in PostgreSQL 16
Explore the performance features associated with logical replication and the steps for observing these features in real time.
In my last article, I examined replication methods in PostgreSQL and provided an overview of the main characteristics of logical replication in PostgreSQL 16. In this article, I explore the performance features associated with logical replication, show how to observe them in real time, and share the outcomes of performance benchmarking.
I want to spotlight two specific features of PostgreSQL 16: parallel apply and binary copy. Parallel apply uses parallel background workers at the subscriber node to apply changes from large in-progress transactions as they are streamed from the publisher; the number of workers engaged for a subscription is controlled by the max_parallel_apply_workers_per_subscription parameter. Binary copy, the second performance feature, allows logical replication to perform the initial data copy in binary format. This significantly improves performance, particularly when copying tables containing binary columns.
What Is Parallel Apply?
Parallel apply is a performance enhancement that yields considerable benefits when replicating large in-progress transactions. The publisher streams the changes to the subscriber node, where parallel background workers apply them as they arrive. The number of parallel workers assigned to apply the changes at the subscriber node is set with the max_parallel_apply_workers_per_subscription configuration parameter.
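As a quick sketch of how this is switched on for an existing setup (the subscription name sub here is just a placeholder; the feature itself is configured per subscription, while the worker limit is a server setting):

```sql
-- Enable parallel apply for an existing subscription (PostgreSQL 16+).
ALTER SUBSCRIPTION sub SET (streaming = parallel);

-- Set the per-subscription parallel apply worker limit; this parameter
-- can only be set in postgresql.conf (or via ALTER SYSTEM) and takes
-- effect on reload. Workers come out of the
-- max_logical_replication_workers pool.
ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 4;
SELECT pg_reload_conf();
```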
To illustrate this feature, an example is provided below, along with sample performance figures obtained from a test run on AWS instances located in different regions.
In this particular example, the publisher was operating on AWS us-east-1, and the subscriber node was functional on AWS us-west-2.
To configure the publisher node, connect to the node and:
Create a fresh PostgreSQL cluster with initdb and set the following configuration parameters, specifying values that work well with your server specification:
shared_buffers = 2GB
wal_level = logical
client_min_messages = warning
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off
logical_decoding_work_mem = 64MB
Create a table for publication; we've used the following command:
CREATE TABLE large_test (id int primary key, num1 bigint, num2 double precision, num3 double precision );
Create a publication FOR ALL TABLES; you can optionally create a publication for just the large_test table created in the previous step:
CREATE PUBLICATION pub FOR ALL TABLES;
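As a quick sanity check on the publisher, you can confirm which tables the publication covers (pub is the name created above):

```sql
-- List the tables included in the publication
SELECT * FROM pg_publication_tables WHERE pubname = 'pub';
```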
To configure the subscriber node, connect to the node and:
Create a fresh cluster with initdb and set the following configuration parameters according to your server specification:
shared_buffers = 2GB
wal_level = logical
client_min_messages = warning
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off
logical_decoding_work_mem = 64MB
max_parallel_apply_workers_per_subscription = 4
For our test server, I set max_parallel_apply_workers_per_subscription to 4 to spawn four parallel workers for applying changes at the subscriber node.
Create a table for publication to receive the replication stream from the publisher:
CREATE TABLE large_test (id int primary key, num1 bigint, num2 double precision, num3 double precision );
Create a subscription with connection properties to the publisher:
CREATE SUBSCRIPTION sub CONNECTION 'host=ec2-44-209-69-249.compute-1.amazonaws.com user=admin password=password dbname=demo' PUBLICATION pub WITH (copy_data = off, streaming=parallel, synchronous_commit=remote_apply);
Please note that we are setting the copy_data parameter to off for the purposes of this test so we stream the table changes instead of performing the initial data copy. We are also setting streaming to parallel; this enables the parallel apply feature and applies the changes at the subscriber node with the specified number of workers.
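Once the subscription is active, the apply workers can be observed from the subscriber. In PostgreSQL 16, the pg_stat_subscription view reports a leader_pid that is non-NULL for parallel apply workers; this query is purely observational for the setup above:

```sql
-- Parallel apply workers show a non-NULL leader_pid (PostgreSQL 16)
SELECT pid, leader_pid, received_lsn, last_msg_receipt_time
FROM pg_stat_subscription
WHERE subname = 'sub';
```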
To set up our test scenario, we connect to the publisher node and:
Set synchronous_standby_names to the name of the subscription. This is not required to make use of the parallel apply feature; it was done only for this test. Setting the parameter ensures that the backend waits until the changes are applied at the subscriber node, so we can measure the timing:
cat << EOF >> /opt/pgedge/data/pg16/postgresql.conf
synchronous_standby_names = 'sub'
EOF
Restart the PostgreSQL server.
Use psql to run the following commands, which start and time a large transaction on the publisher node:
\timing
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO large_test (id, num1, num2, num3)
SELECT i, round(random()*10), random(), random()*142
FROM generate_series(1, 5000000) s(i);
Now, let’s look at the results:
- With streaming set to parallel, it took 58887.540 ms (00:58.888) to complete the transaction and apply the changes at the subscriber node.
- With streaming set to off, it took 106909.268 ms (01:46.909) to complete the transaction and apply the changes at the subscriber node.
Note that for this large in-progress transaction, parallel apply reduced the elapsed time by roughly 45%, making it about 1.8x faster.
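For reference, the relative gain follows directly from the two reported timings (in milliseconds):

```shell
# Compare the two runs: streaming = off vs. streaming = parallel (ms)
awk 'BEGIN {
  off = 106909.268; par = 58887.540
  printf "%.1f%% less time, %.2fx faster\n", (off - par) / off * 100, off / par
}'
# prints: 44.9% less time, 1.82x faster
```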
Binary Copy
Binary copy is another performance enhancement feature of logical replication in PostgreSQL 16. It allows the initial data copy of table content to be performed in binary format. While streaming data changes in binary format was introduced in earlier versions, performing the initial table copy in binary mode was not supported until PostgreSQL 16.
To demonstrate the performance improvement this feature brings, I carried out a test using a pair of AWS instances. The following example shows how to enable the feature and provides performance metrics comparing the initial data load in binary format with the non-binary format.
Publisher
To set up our binary copy test scenario, connect to the publisher node and:
Set the following configuration parameters to maximize your system performance:
shared_buffers = 2GB
wal_level = logical
client_min_messages = warning
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off
logical_decoding_work_mem = 64MB
Create a table that includes bytea columns:
CREATE TABLE large_test (id int primary key, num1 bytea, num2 bytea, num3 double precision );
Create a publication, specifying the FOR ALL TABLES clause:
CREATE PUBLICATION pub FOR ALL TABLES;
Add records to the table:
\timing
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO large_test (id, num1, num2, num3)
SELECT i, md5(round(random()*10)::text)::bytea, md5(random()::text)::bytea, random()*142
FROM generate_series(1, 50000000) s(i);
Check the table size after the initial data load:
SELECT pg_size_pretty(pg_relation_size('large_test'));

 pg_size_pretty
----------------
 5208 MB
(1 row)
Subscriber
Connect to the subscriber node and:
Set the following configuration parameters appropriately for your system:
shared_buffers = 2GB
wal_level = logical
client_min_messages = warning
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off
logical_decoding_work_mem = 64MB
Create a table with the same bytea columns:
CREATE TABLE large_test (id int primary key, num1 bytea, num2 bytea, num3 double precision );
Create the subscription; set the binary parameter to true and the copy_data parameter to on for the initial data transfer:
CREATE SUBSCRIPTION sub CONNECTION 'host=ec2-44-209-69-249.compute-1.amazonaws.com user=admin password=password dbname=demo' PUBLICATION pub WITH (copy_data=on, binary=true);
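While the initial copy runs, the per-table synchronization state can be watched from the subscriber; srsubstate moves through i (initialize), d (data copy), f (finished copy), s (synchronized), and r (ready), which is the state the procedure below waits for:

```sql
-- Per-table synchronization state on the subscriber
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel;
```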
Create the following procedure to time the initial data copy from publisher to subscriber:
CREATE OR REPLACE PROCEDURE wait_for_rep()
LANGUAGE plpgsql AS $$
BEGIN
  WHILE (SELECT count(*) != 0 FROM pg_subscription_rel WHERE srsubstate <> 'r')
  LOOP
    COMMIT;
  END LOOP;
END;
$$;
Call the procedure to time the transfer:
\timing
call wait_for_rep();
Here are the results:
- Without binary load (binary set to false), it took 383884.913 ms (06:23.885) to complete the transaction and apply the changes at the subscriber node.
- With binary load (binary set to true), it took 267149.655 ms (04:27.150) to complete the transaction and apply the changes at the subscriber node.
This is roughly a 30% reduction in elapsed time (about 1.4x faster) when performing the initial table copy in binary format.
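As with the parallel apply test, the gain can be derived from the two reported timings (in milliseconds):

```shell
# Compare the two initial loads: binary = false vs. binary = true (ms)
awk 'BEGIN {
  nonbin = 383884.913; bin = 267149.655
  printf "%.1f%% less time, %.2fx faster\n", (nonbin - bin) / nonbin * 100, nonbin / bin
}'
# prints: 30.4% less time, 1.44x faster
```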
The adoption of distributed PostgreSQL databases is growing rapidly, and replication is a fundamental component of any distributed architecture. With each major release, the replication features in PostgreSQL continue to mature and grow richer in functionality. The foundation for logical replication was laid before PostgreSQL 10, but it was PostgreSQL 10 that delivered logical replication in a usable form. Since then, replication support has grown enormously, with each major release introducing significant features. In my next article, I will explore the remaining logical replication features in PostgreSQL 16.
Published at DZone with permission of Ahsan Hadi. See the original article here.