FLUSH and LOCK Handling in Percona XtraDB Cluster
Let's run through your options for FLUSH and LOCK handling when using Percona XtraDB Cluster, as well as tips for each of your choices.
In this blog post, we'll look at how Percona XtraDB Cluster (PXC) handles FLUSH and LOCK statements.
Introduction
Percona XtraDB Cluster is a multi-master solution that allows parallel execution of transactions on multiple nodes at the same point in time. Given these semantics, it is important to understand how Percona XtraDB Cluster handles FLUSH and LOCK statements, which operate at the node level.
The sections below list the different flavors of these statements and their PXC semantics.
FLUSH TABLE WITH READ LOCK
- FTWRL is normally used for backup purposes.
- Execution of this command establishes a global level read lock.
- This read lock is non-preemptable by the applier threads running in the background.
- Executing it on PXC moves the node to the DESYNC state (so it stops emitting flow-control) and also pauses the node:
2018-03-08T05:09:54.293991Z 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 1777)
2018-03-08T05:09:58.040809Z 5 [Note] WSREP: Provider paused at c7daf065-2285-11e8-a848-af3e3329ab8f:2002 (2047)
2018-03-08T05:14:20.508317Z 5 [Note] WSREP: resuming provider at 2047
2018-03-08T05:14:20.508350Z 5 [Note] WSREP: Provider resumed.
2018-03-08T05:14:20.508887Z 0 [Note] WSREP: Member 1.0 (n2) resyncs itself to group
2018-03-08T05:14:20.508900Z 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 29145)
2018-03-08T05:15:16.932759Z 0 [Note] WSREP: Member 1.0 (n2) synced with group.
2018-03-08T05:15:16.932782Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 29145)
2018-03-08T05:15:16.988029Z 2 [Note] WSREP: Synchronized with group, ready for connections
2018-03-08T05:15:16.988054Z 2 [Note] WSREP: Setting wsrep_ready to true
- Other nodes of the cluster continue to process the workload.
- The DESYNCed and paused node continues to receive replication traffic. Though it doesn't process the write-sets, they are appended to the Galera cache (gcache) for later processing.
- Fallback: when FTWRL is released (through UNLOCK TABLES), and if the workload is active on the other nodes of the cluster, the node that executed FTWRL may start emitting flow-control while it works through the backlog; a minimal sketch of the cycle follows this list.
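Here is a minimal sketch of the FTWRL cycle on the node chosen for backup; the backup step is a placeholder, and wsrep_local_state_comment is the status variable that reflects the DESYNC/pause state described above.
FLUSH TABLES WITH READ LOCK;                   -- node shifts to DONOR/DESYNCED and the provider pauses

-- From another session on the same node, confirm the state (expect Donor/Desynced):
SHOW STATUS LIKE 'wsrep_local_state_comment';

-- ... take the backup here (placeholder step) ...

UNLOCK TABLES;                                 -- node resyncs; it may emit flow-control while it catches up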
FLUSH TABLE <tablename> (WITH READ LOCK|FOR EXPORT)
- This takes a read lock on the specified table only. The command is not replicated, so pxc_strict_mode = ENFORCING blocks its execution (a minimal sketch of the flow follows at the end of this section).
- This read lock is non-preemptable by the applier threads running in the background.
- Execution of this command will cause the node to pause.
- If the node executing the flush command is the same node processing the workload, the node pauses immediately.
- If the node executing the flush command is different from the node processing the workload, write-sets queue up in its incoming queue and flow-control causes the pause.
- The end result is that the cluster stalls in both cases:
2018-03-07T06:40:00.143783Z 5 [Note] WSREP: Provider paused at 40de14ba-21be-11e8-8e3d-0ee226700bda:147682 (149032)
2018-03-07T06:40:00.144347Z 5 [Note] InnoDB: Sync to disk of `test`.`t` started.
2018-03-07T06:40:00.144365Z 5 [Note] InnoDB: Stopping purge
2018-03-07T06:40:00.144468Z 5 [Note] InnoDB: Writing table metadata to './test/t.cfg'
2018-03-07T06:40:00.144537Z 5 [Note] InnoDB: Table `test`.`t` flushed to disk
2018-03-07T06:40:01.855847Z 5 [Note] InnoDB: Deleting the meta-data file './test/t.cfg'
2018-03-07T06:40:01.855874Z 5 [Note] InnoDB: Resuming purge
2018-03-07T06:40:01.855955Z 5 [Note] WSREP: resuming provider at 149032
2018-03-07T06:40:01.855970Z 5 [Note] WSREP: Provider resumed.
- Once the lock is released (through UNLOCK TABLES), node resumes apply of write-sets.
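Below is a minimal sketch of the FOR EXPORT flow, assuming a hypothetical table test.t and that pxc_strict_mode is relaxed for the exercise (under ENFORCING the statement is rejected):
SET GLOBAL pxc_strict_mode = 'PERMISSIVE';   -- assumption: relaxed only for this exercise, revert afterwards

USE test;
FLUSH TABLES t FOR EXPORT;                   -- writes ./test/t.cfg and pauses the provider on this node
-- ... copy t.ibd and t.cfg out of the datadir here ...
UNLOCK TABLES;                               -- deletes t.cfg and resumes the provider

SET GLOBAL pxc_strict_mode = 'ENFORCING';    -- restore the original setting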
LOCK TABLE <tablename> READ/WRITE
- The LOCK TABLE command locks the specified table in the specified mode (READ or WRITE).
- Again, the lock established by this command is non-preemptable.
- The lock is taken at the node level (the command is not replicated), so pxc_strict_mode = ENFORCING blocks this command.
- There is no state change in PXC on the execution of this command.
- If the lock is taken on a table that is not touched by the active workload, the workload can continue to progress. If the lock is taken on a table that is part of the workload, the transactions touching it will wait for the lock to be released, which in turn can bring the complete workload to a halt (a minimal sketch follows).
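A minimal sketch, again assuming a hypothetical table t and a relaxed pxc_strict_mode:
SET GLOBAL pxc_strict_mode = 'PERMISSIVE';   -- LOCK TABLES is rejected under ENFORCING

LOCK TABLES t READ;                          -- local, non-preemptable read lock; not replicated to other nodes
-- ... read-only work against t; workload writes touching t queue up behind the lock ...
UNLOCK TABLES;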
GET_LOCK
- GET_LOCK() acquires a named (user-level) lock and follows the same semantics as LOCK TABLE in PXC. (The base MySQL semantics are slightly different; see the MySQL documentation.) A minimal sketch follows.
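In the sketch below, 'my_app_lock' and the 10-second timeout are arbitrary illustration values:
SELECT GET_LOCK('my_app_lock', 10);      -- returns 1 if acquired within 10 seconds, 0 on timeout
-- ... critical section; the lock exists only on this node and is not replicated ...
SELECT RELEASE_LOCK('my_app_lock');      -- returns 1 if this session held and released the lock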
LOCK TABLES FOR BACKUP
- As the name suggests, this lock is specifically meant for backups: it blocks non-transactional changes (such as updates to non-transactional engines like MyISAM, and DDL changes).
- PXC doesn't add any special semantics for this command; a minimal sketch follows.
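A minimal sketch of a backup-lock cycle (the backup step itself is a placeholder):
LOCK TABLES FOR BACKUP;    -- blocks DDL and writes to non-transactional tables; InnoDB DML continues
-- ... copy non-transactional files / run the backup tool here ...
UNLOCK TABLES;             -- releases the backup lock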
LOCK BINLOG FOR BACKUP
- This statement blocks writes to the binlog. PXC always generates a binlog (persisting it to disk is controlled by the log-bin setting); if you disable log-bin, PXC enables emulation-based binlogging.
- This effectively means the command can cause the cluster to stall (see the sketch below).
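A minimal sketch; because PXC always generates a binlog, expect the whole cluster to stall for the duration of the lock:
LOCK BINLOG FOR BACKUP;    -- blocks binlog writes on this node; in PXC this can stall the cluster
-- ... record the binlog position / finish the backup here ...
UNLOCK BINLOG;             -- releases the binlog lock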
Tracking Active Lock/Flush
- If you have executed a flush or lock command and want to confirm it, you can do so using the Com_% status counters. These counters are connection-specific, so run the checks from the same client connection. Also, note that they are aggregate counters and only ever increase; a before-and-after comparison example follows the output below.
mysql> show status like 'Com%lock%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Com_lock_tables            | 2     |
| Com_lock_tables_for_backup | 1     |
| Com_lock_binlog_for_backup | 1     |
| Com_unlock_binlog          | 1     |
| Com_unlock_tables          | 5     |
+----------------------------+-------+
5 rows in set (0.01 sec)
mysql> show status like '%flush%';
+--------------------------------------+---------+
| Variable_name                        | Value   |
+--------------------------------------+---------+
| Com_flush                            | 4       |
| Flush_commands                       | 3       |
* Flush_commands is a global counter. Check the MySQL documentation for more details.
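Because the counters only ever increase, a simple way to attribute them to a specific piece of work is to snapshot them from the same connection before and after it and compare; the FTWRL below is just an illustrative command:
SHOW SESSION STATUS LIKE 'Com%lock%';    -- snapshot before
FLUSH TABLES WITH READ LOCK;             -- illustrative command under test
UNLOCK TABLES;
SHOW SESSION STATUS LIKE 'Com%lock%';    -- snapshot after; Com_unlock_tables is now one higher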
Conclusion
By now, it should be clear that you need to be a bit more careful when executing these node-local lock commands: understand their semantics and their effect on the rest of the cluster. Executed carefully, they can still serve your purpose.