When Postgres Blocks: 7 Tips for Dealing With Locks
These are some of the important dos and don'ts that we've seen as helpful when working with users to migrate from their single node Postgres database to Citus.
Join the DZone community and get the full member experience.
Join For FreeRecently, I wrote about locking behavior in Postgres, which commands block each other, and how you can diagnose blocked commands. Of course, after the diagnosis, you may also want a cure. With Postgres, it is possible to shoot yourself in the foot, but Postgres also offers you a way to stay on target. These are some of the important dos and don'ts that we've seen as helpful when working with users to migrate from their single node Postgres database to Citus or when building new real-time analytics apps on Citus.
1. Never Add a Column With a Default Value
A golden rule of PostgreSQL is: when you add a column to a table in production, never specify a default.
Adding a column takes a very aggressive lock on the table, which blocks read and write. If you add a column with a default, PostgreSQL will rewrite the whole table to fill in the default for every row, which can take hours on large tables. In the meantime, all queries will block, so your database will be unavailable.
Don't do this:
-- reads and writes block until it is fully rewritten (hours?)
ALTER TABLE items ADD COLUMN last_update timestamptz DEFAULT now();
Do this instead:
-- select, update, insert, and delete block until the catalog is update (milliseconds)
ALTER TABLE items ADD COLUMN last_update timestamptz;
-- select and insert go through, some updates and deletes block while the table is rewritten
UPDATE items SET last_update = now();
Or better yet, avoid blocking updates and delete for a long time by updating in small batches, i.e.:
do {
numRowsUpdated = executeUpdate(
"UPDATE items SET last_update = ? " +
"WHERE ctid IN (SELECT ctid FROM items WHERE last_update IS NULL LIMIT 5000)",
now);
} while (numRowsUpdate > 0);
This way, you can add and populate a new column with minimal interruption to your users.
2. Beware of Lock Queues; Use Lock Timeouts
Every lock in PostgreSQL has a queue. If a transaction B tries to acquire a lock that is already held by transaction A with a conflicting lock level, then transaction B will wait in the lock queue. Now, something interesting happens: if another transaction C comes in, then it will not only have to check for conflict with A, but also with transaction B, and any other transaction in the lock queue.
This means that even if your DDL command can run very quickly, it might be in a queue for a long time waiting for queries to finish, and queries that start after it will be blocked behind it.
When you can have long-running SELECT
queries on a table, don't do this:
ALTER TABLE items ADD COLUMN last_update timestamptz;
Instead, do this:
SET lock_timeout TO '2s'
ALTER TABLE items ADD COLUMN last_update timestamptz;
By setting lock_timeout
, the DDL command will fail if it ends up waiting for a lock, thus blocking queries for more than two seconds. The downside is that your ALTER TABLE
might not succeed, but you can try again later. You may want to query pg_stat_activity
to see if there are long-running queries before starting the DDL command.
3. Create Indexes Concurrently
Another golden rule of PostgreSQL is: always create your indexes concurrently.
Creating an index on a large dataset can take hours or even days, and the regular CREATE INDEX
command blocks all writes for the duration of the command. While it doesn't block SELECT
s, this is still pretty bad and there's a better way: CREATE INDEX CONCURRENTLY
.
Don't do this:
-- blocks all writes
CREATE INDEX items_value_idx ON items USING GIN (value jsonb_path_ops);
Instead, do this:
-- only blocks other DDL
CREATE INDEX CONCURRENTLY items_value_idx ON items USING GIN (value jsonb_path_ops);
Creating an index concurrently does have a downside. If something goes wrong it does not roll back and leaves an unfinished ("invalid") index behind. If that happens, don't worry, simply run DROP INDEX CONCURRENTLY items_value_idx
and try to create it again.
4. Take Aggressive Locks as Late as Possible
When you need to run a command that acquires aggressive locks on a table, try to do it as late in the transaction as possible to allow queries to continue for as long as possible.
For example, if you want to completely replace the contents of a table. Don't do this:
BEGIN;
-- reads and writes blocked from here:
TRUNCATE items;
-- long-running operation:
\COPY items FROM 'newdata.csv' WITH CSV
COMMIT;
Instead, load the data into a new table and then replace the old table:
BEGIN;
CREATE TABLE items_new (LIKE items INCLUDING ALL);
-- long-running operation:
\COPY items_new FROM 'newdata.csv' WITH CSV
-- reads and writes blocked from here:
DROP TABLE items;
ALTER TABLE items_new RENAME TO items;
COMMIT;
There is one problem: we didn't block writes from the start and the old items
table might have changed by the time we drop it. To prevent that, we can explicitly take a lock the table that blocks writes, but not reads:
BEGIN;
LOCK items IN EXCLUSIVE MODE;
...
Sometimes it's better to take locking into your own hands.
5. Adding a Primary Key With Minimal Locking
It's often a good idea to add a primary key to your tables — for example, when you want to use logical replication or migrate your database using Citus Warp.
Postgres makes it very easy to create a primary key using ALTER TABLE
, but while the index for the primary key is being built, which can take a long time if the table is large, all queries will be blocked.
ALTER TABLE items ADD PRIMARY KEY (id); -- blocks queries for a long time
Fortunately, you can first do all the heavy lifting using CREATE UNIQUE INDEX CONCURRENTLY
, and then use the unique index as a primary key, which is a fast operation.
CREATE UNIQUE INDEX CONCURRENTLY items_pk ON items (id); -- takes a long time, but doesn’t block queries
ALTER TABLE items ADD CONSTRAINT items_pk PRIMARY KEY USING INDEX items_pk; -- blocks queries, but only very briefly
By breaking down primary key creation into two steps, it has almost no impact on the user.
6. Never VACUUM FULL
The Postgres user experience can be a little surprising sometimes. While VACUUM FULL
sounds like something you want to do clear the dust of your database, a more appropriate command would have been:
PLEASE FREEZE MY DATABASE FOR HOURS;
VACUUM FULL
rewrites the entire table to disk, which can take hours or days, and blocks all queries while doing it. While there some valid use cases for VACUUM FULL
, such as a table that used to be big, but is now small and still takes up a lot of space, it is probably not your use case.
While you should aim to tune your auto-vacuum settings and use indexes to make your queries fast, you may occasionally want to run VACUUM
, but NOTVACUUM FULL
.
7. Avoid Deadlocks by Ordering Commands
If you've been using PostgreSQL for a while, chances are you've seen errors like:
RROR: deadlock detected
DETAIL: Process 13661 waits for ShareLock on transaction 45942; blocked by process 13483.
Process 13483 waits for ShareLock on transaction 45937; blocked by process 13661.
This happens when concurrent transactions take the same locks in a different order. For example, one transaction issues the following commands.
BEGIN;
UPDATE items SET counter = counter + 1 WHERE key = 'hello'; -- grabs lock on hello
UPDATE items SET counter = counter + 1 WHERE key = 'world'; -- blocks waiting for world
END;
Simultaneously, another transaction might be issuing the same commands, but in a different order.
BEGIN
UPDATE items SET counter = counter + 1 WHERE key = 'world'; -- grabs lock on world
UPDATE items SET counter = counter + 1 WHERE key = 'hello'; -- blocks waiting for hello
END;
If these transaction blocks run simultaneously, chances are that they get stuck waiting for each other and would never finish. Postgres will recognize this situation after a second or so and will cancel one of the transactions to let the other one finish. When this happens, you should take a look at your application to see if you can make your transactions always follow the same order. If both transactions first modify hello
, then world
, then the first transaction will block the second one on the hello
lock before it can grab any other locks.
Published at DZone with permission of Marco Slot, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments