Plotting the Evolution of Logical Replication in PostgreSQL 16
PostgreSQL 16 has been released to include features for making the logical replication feature more robust and performant.
Join the DZone community and get the full member experience.
Join For FreePostgreSQL, a robust and open-source relational database system, is renowned for its native replication mechanisms, logical replication, and physical replication (also known as streaming replication). PostgreSQL 16 includes numerous improvements in logical replication designed to enhance performance, bolster data consistency, and advance compatibility.
In the context of PostgreSQL, logical replication employs a publisher/subscriber model to mirror changes between PostgreSQL servers. The primary node, where the database resides, is the publisher, while the stand-by node, which receives transaction copies, is the subscriber. Changes in the database are replicated from the publisher node to one or more subscriber node(s) identified by the subscription.
Logical replication functions by taking a snapshot of the published database data and copying it to the subscriber. When the subscription commences, changes on the publisher go to the subscriber in real-time. Using a transactional model, changes are applied to the subscriber in the same order they occur on the publisher to ensure transactional consistency.
Conversely, physical or streaming replication transmits data from the primary node to the standby node in Write-Ahead Log (WAL) files. Streaming replication can be synchronous or asynchronous, with the former offering better data protection at the expense of more resources, while the latter is more cost-effective.
Parallel Application of Changes
PostgreSQL 16 introduces several enhancements to logical replication to boost performance, data consistency, and compatibility. In the current scenario, changes for large, ongoing transactions are transferred from the publisher to the subscriber in multiple streams, divided into chunks based on the logical_decoding_work_mem parameter value. PostgreSQL 16 enhances this by applying changes to the subscriber node in parallel using numerous background workers, improving performance by 30-40%. You can review the benchmarking as part of the patch history.
Binary Format for Initial Data Transfer
With PostgreSQL 16, when you create a subscription, you can use binary format for the initial data transfer. Prior versions performed the initial sync in text format, with the option to change to binary only after starting logical replication. The new functionality allows the initial sync to be done in the format intended for replication, potentially reducing the initial sync time.
It is possible to significantly enhance performance by specifying the index to be used on the subscriber when applying UPDATES and DELETES. This prevents time-consuming full table scans and efficiently identifies the correct row to modify on the subscriber.
Logical Decoding on Stand-By
Previously, the only support for logical decoding was for the primary node. PostgreSQL 16 introduces minimal logical decoding on the stand-by node, allowing the creation of a logical replication slot on a stand-by node, subscription to a stand-by node, and logical decoding on the stand-by node. This functionality introduces the pg_log_standby_snapshot()
function, which quickly takes a snapshot of a running transaction and writes it into WAL files without needing a checkpoint.
Origin Filtering (Bi-Directional Replication — It Is Just the Beginning)
PostgreSQL 16 also introduces another feature that can be considered a good start for bi-directional replication support in PG. The origin filtering feature allows the user to create a subscriber with ORIGIN set to None. Previously, the only value accepted for the ORIGIN parameter was ANY. With ANY for Origin, it means that the subscription will send changes regardless of its origin — this leads to infinite loops or errors in case the table has a unique key constraint.
PG-16 Introduces Origin = None, which enables subscribers to request publishers for changes that don’t have an origin. This makes bi-directional replication possible between two servers,
Both servers will be publishers and subscribers to each other.
Looking Ahead
These improvements to PostgreSQL 16's logical replication system make it more efficient and robust, setting the stage for more advanced features in future releases. PostgreSQL's dedication to continuous enhancement is why it remains one of the most reliable and renowned open-source relational database systems on the market.
Opinions expressed by DZone contributors are their own.
Comments