Correcting My Misconceptions With REGIONAL BY ROW Tables
This tutorial will dive deeper into nuances with CockroachDB REGIONAL BY ROW tables. I learned the hard way about the limitations and to pay close attention to the documentation on the topic.
Join the DZone community and get the full member experience.
Join For FreePrevious Articles on Multi-Region
CockroachDB Multi-Region Abstractions for MongoDB Developers With FerretDB
Motivation
I was working with a prospect on a use case where my knowledge of CockroachDB multi-region abstractions was put to the test. After trying and failing to gain low latency write behavior from a geographically distributed table, I reached out to the engineering team who corrected my understanding. This is my way of documenting my understanding and committing it to memory.
High-Level Steps
- Provision a multi-region CockroachDB cluster
- Demonstrate the problem
- Demonstrate the solution
- Workaround for non-UUID primary keys
Step-by-Step Instructions
Provision a Multi-Region CockroachDB Cluster
You will need access to a multi-region CockroachDB cluster. The easiest and most cost-effective way is signing up for CockroachDB Serverless. That's what I am using in my demonstration. My serverless cluster spans three regions in GCP.
We can connect to the regional endpoints using the Connect
modal. The default connection string given will be localized as CockroachDB Serverless automatically detects the end user locality. To access the cluster from the remote regions, you must append the regions to the connection string. In essence, the host artem-serverless-mr-26.h4f.cockroachlabs.cloud
becomes artem-serverless-mr-26.h4f.gcp-us-east1.cockroachlabs.cloud
, artem-serverless-mr-26.h4f.gcp-us-west2.cockroachlabs.cloud
, and artem-serverless-mr-26.h4f.gcp-europe-west1.cockroachlabs.cloud
.
We can verify the gateway region we're connecting from by querying the helper function select gateway_region();
from each of the connections.
gateway_region ------------------ gcp-us-east1 gateway_region ------------------ gcp-us-west2 gateway_region -------------------- gcp-europe-west1
Demonstrate the Problem
Considering an inventory system moving to CockroachDB and leveraging multi-region abstractions, typically the challenge is getting the primary key correct to optimize for writes. The following examples should demonstrate where the challenges are and how to work around them.
To get started, we're going to create a database called demo and enable multi-region abstractions on it.
CREATE DATABASE demo; ALTER DATABASE demo SET PRIMARY REGION "gcp-us-east1"; ALTER DATABASE demo ADD region "gcp-us-west2"; ALTER DATABASE demo ADD region "gcp-europe-west1"; SET override_multi_region_zone_config = true;
Feel free to review the docs for more information but in summary, we're enrolling our database into multi-region abstractions and setting gcp-us-east1
as the primary region as it's the closest region to my location.
Then I'm passing a property to override certain zone configurations. This step is optional unless you are coming back to make further changes after.
Finally, we want to pin the leaseholders to the US region so that schema changes in multi-region context complete faster. You can learn more here.
ALTER DATABASE system CONFIGURE ZONE USING constraints = '{"+region=gcp-us-east1": 1}', lease_preferences = '[[+region=gcp-us-east1]]';
Let's create a table to demonstrate the problem.
CREATE TABLE demo_int_pk (productID PRIMARY KEY, count) AS SELECT unordered_unique_rowid() AS productID, generate_series(1, 10) AS count;
Make the table REGIONAL BY ROW
.
ALTER TABLE demo_int_pk SET LOCALITY REGIONAL BY ROW;
The primary key on the table is an integer type. This problem is not specific to integers. You will have similar behavior with other types too, except for UUID — but I'm jumping ahead a bit.
Let's look at the schema of the table.
CREATE TABLE public.demo_int_pk ( productid INT8 NOT NULL, count INT8 NULL, crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region, CONSTRAINT demo_int_pk_pkey PRIMARY KEY (productid ASC) ) LOCALITY REGIONAL BY ROW
We can confirm the PK is indeed an integer. We can also see a new column called crdb_region
which was added when we enabled REGIONAL BY ROW
.
Let's look at the data.
SELECT crdb_region, * FROM demo_int_pk;
crdb_region | productid | count ---------------+---------------------+-------- gcp-us-east1 | 1021351265574322179 | 5 gcp-us-east1 | 2174272770181169155 | 9 gcp-us-east1 | 2750733522484592643 | 1 gcp-us-east1 | 3327194274788016131 | 7 gcp-us-east1 | 3903655027091439619 | 3 gcp-us-east1 | 5633037284001710083 | 6 gcp-us-east1 | 6209498036305133571 | 2 gcp-us-east1 | 6785958788608557059 | 10 gcp-us-east1 | 7938880293215404035 | 8 gcp-us-east1 | 8515341045518827523 | 4
Notice the crdb_region
column reports the US East region which is the region where I executed the insert statement and the rows are homed in. Let's attempt to do an update on the inventory count for one of the records.
WITH update_demo_int_pk AS ( UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-east1 | gcp-us-east1 | 100 Time: 40ms total (execution 11ms / network 29ms)
It takes 11ms to update the count using the U.S. East connection. Notice I am using the gateway_region()
function, which will confirm I am using the US East connection. Let's now update the record from another region. I am going to use gcp-us-west2
region.
WITH update_demo_int_pk AS ( UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-west2 | gcp-us-east1 | 100 Time: 1.280s total (execution 1.204s / network 0.076s)
The query took 1.2 seconds to execute. This is not surprising considering the record is physically stored in the US East region. Let's look at the plan of the query:
distribution: local vectorized: true • root │ ├── • render │ │ │ └── • scan buffer │ estimated row count: 1 │ label: buffer 2 (update_demo_int_pk) │ └── • subquery │ id: @S1 │ original sql: UPDATE demo_int_pk SET count = 100 WHERE productid = 2750733522484592643 RETURNING crdb_region, count │ exec mode: all rows │ └── • buffer │ label: buffer 2 (update_demo_int_pk) │ └── • update │ estimated row count: 1 │ table: demo_int_pk │ set: count │ └── • render │ └── • union all │ estimated row count: 1 │ limit: 1 │ ├── • scan │ estimated row count: 1 (10% of the table; stats collected 19 seconds ago; using stats forecast for 12 minutes in the future) │ table: demo_int_pk@demo_int_pk_pkey │ spans: [/'gcp-us-west2'/2750733522484592643 - /'gcp-us-west2'/2750733522484592643] │ └── • scan estimated row count: 1 (10% of the table; stats collected 19 seconds ago; using stats forecast for 12 minutes in the future) table: demo_int_pk@demo_int_pk_pkey spans: [/'gcp-europe-west1'/2750733522484592643 - /'gcp-europe-west1'/2750733522484592643] [/'gcp-us-east1'/2750733522484592643 - /'gcp-us-east1'/2750733522484592643]
If you read the bottom part, we union all the results of the scans in the gcp-us-west2
and gcp-europe-west1
regions to validate the record is unique across all regions.
Let's add a record in the US West region. We're expecting the write to be quick.
INSERT INTO demo_int_pk (productID, count) VALUES (unordered_unique_rowid(), 1) RETURNING gateway_region(), crdb_region, productID, count;
gateway_region | crdb_region | productid | count -----------------+--------------+---------------------+-------- gcp-us-west2 | gcp-us-west2 | 5746135747274211350 | 1 Time: 296ms total (execution 217ms / network 78ms)
It takes 217ms to insert, even though we are writing from the US West region.
The story is the same if we were to insert from the EU:
gateway_region | crdb_region | productid | count -------------------+------------------+---------------------+-------- gcp-europe-west1 | gcp-europe-west1 | 1222035725603831819 | 1 Time: 338ms total (execution 250ms / network 88ms)
Updating a record from the outside region will also be costly:
WITH update_demo_int_pk AS ( UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-west2 | gcp-us-east1 | 100 Time: 354ms total (execution 277ms / network 76ms)
Just to explain the statement — I updated the record physically stored in the US East (crdb_region
) going through the US West endpoint (gateway_region
).
We now have very fast writes from the local region but very slow writes from everywhere else.
Let's now focus on how we can make the scenario better. The hard lesson for me was to realize that the docs call out an optimization that I had failed to see originally.
Note: When using
DEFAULT gen_random_uuid()
on columns inREGIONAL BY ROW
tables, uniqueness checks on those columns are disabled by default for performance purposes. CockroachDB assumes uniqueness based on the way this column generates UUIDs. To enable this check, you can modify thesql.optimizer.uniqueness_checks_for_gen_random_uuid.enabled
cluster setting. Note that while there is virtually no chance of a collision occurring when enabling this setting, it is not truly zero.
With that, let's create a new table with PRIMARY KEY
being UUID.
Demonstrate the Solution
CREATE TABLE demo_uuid_pk (productID PRIMARY KEY, count) AS SELECT gen_random_uuid() AS productID, generate_series(1, 10) AS count;
ALTER TABLE demo_uuid_pk SET LOCALITY REGIONAL BY ROW;
The create statement is below:
CREATE TABLE public.demo_uuid_pk ( productid UUID NOT NULL, count INT8 NULL, crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region, CONSTRAINT demo_uuid_pk_pkey PRIMARY KEY (productid ASC) ) LOCALITY REGIONAL BY ROW
The data looks like so:
crdb_region | productid | count ---------------+--------------------------------------+-------- gcp-us-east1 | 229f4258-cac2-4b06-815f-5dd8be8483c7 | 9 gcp-us-east1 | 525845a9-30ab-42dd-95bb-fb1d72fc7137 | 8 gcp-us-east1 | 58b174b5-c2f6-433e-a373-74879d81ff92 | 5 gcp-us-east1 | 5b10da77-e750-431d-a7ea-f0b5f0d61c77 | 3 gcp-us-east1 | 930e64db-fbae-4743-9419-c274853b4cb3 | 2 gcp-us-east1 | 994f1714-5d0e-4b9a-8cbb-ab0c94efe226 | 4 gcp-us-east1 | a5b70013-b8cb-4ba1-86f5-1f892285cf99 | 7 gcp-us-east1 | d3778386-153b-4431-ab8c-7d9ee9215bec | 10 gcp-us-east1 | de6b91f8-f5a8-4575-ae3d-fe1b516ae906 | 6 gcp-us-east1 | f5fa80ea-4fbf-4234-86b7-5592f1bfdda2 | 1
Let's update the inventory from the US East region.
WITH update_demo_uuid_pk AS ( UPDATE demo_uuid_pk SET count = 10 WHERE productID = '229f4258-cac2-4b06-815f-5dd8be8483c7' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-east1 | gcp-us-east1 | 10 Time: 42ms total (execution 12ms / network 30ms)
This is similar to our original results. Let's add records in the US West region and update the inventory.
INSERT INTO demo_uuid_pk (productID, count) SELECT gen_random_uuid(), generate_series(1, 10) RETURNING gateway_region(), crdb_region, productID, count;
gateway_region | crdb_region | productid | count -----------------+--------------+--------------------------------------+-------- gcp-us-west2 | gcp-us-west2 | cfb86888-e9b5-4328-9c60-0c102035b5e6 | 1 gcp-us-west2 | gcp-us-west2 | b243affe-47e9-47b2-a7ba-cbc966d42797 | 2 gcp-us-west2 | gcp-us-west2 | 82dc438b-d819-4a52-85b8-5ac4ecf4c725 | 3 gcp-us-west2 | gcp-us-west2 | 4a7bda92-b37a-461a-8f76-ea33dba899dc | 4 gcp-us-west2 | gcp-us-west2 | 55954dba-6892-4c8a-9f84-2a70b7b6f204 | 5 gcp-us-west2 | gcp-us-west2 | 95c7dbf6-093a-4afb-af71-196833ab469c | 6 gcp-us-west2 | gcp-us-west2 | 01689a64-9ae5-408a-b150-590f51efde31 | 7 gcp-us-west2 | gcp-us-west2 | 7387bac4-e5c6-43b3-9bf6-149774de7ba5 | 8 gcp-us-west2 | gcp-us-west2 | 6f0f8abe-6735-41d1-a549-755daddfa14a | 9 gcp-us-west2 | gcp-us-west2 | 3d8c9ad6-d8ac-4551-b25a-067b8efd2db4 | 10 Time: 143ms total (execution 66ms / network 77ms)
Notice the execution is 66ms, which is 5x better than 250ms in the earlier attempts with integers.
Similarly, updates are just as fast.
WITH update_demo_uuid_pk AS ( UPDATE demo_uuid_pk SET count = 10 WHERE productID = 'e19aa60b-d449-478d-8867-7ecabe6eb361' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-west2 | gcp-us-west2 | 10 Time: 86ms total (execution 10ms / network 77ms)
Notice the execution is now 10ms, which is significantly faster.
For posterity, here are results for inserting and updating in the EU:
gateway_region | crdb_region | productid | count -------------------+------------------+--------------------------------------+-------- gcp-europe-west1 | gcp-europe-west1 | f27db778-20ba-4458-b75f-0f3a6d46c6ab | 1 gcp-europe-west1 | gcp-europe-west1 | fbff7233-468f-49d1-a7a2-719f851fed54 | 2 gcp-europe-west1 | gcp-europe-west1 | 01709624-1cf6-4f72-8e8c-6188db8e517a | 3 gcp-europe-west1 | gcp-europe-west1 | 21c4999a-d4e4-47c4-a034-ad005dd9e9b2 | 4 gcp-europe-west1 | gcp-europe-west1 | 64ea0759-129d-4174-88cb-020ef7e947a8 | 5 gcp-europe-west1 | gcp-europe-west1 | 946daa03-3932-4694-a7bc-a87b0fa5d242 | 6 gcp-europe-west1 | gcp-europe-west1 | c1cbe899-d981-4223-993c-0d5987b9320b | 7 gcp-europe-west1 | gcp-europe-west1 | 5dcca454-2b66-4dd1-aee2-d540eac6b710 | 8 gcp-europe-west1 | gcp-europe-west1 | 1a6209c4-d804-432e-a239-60916c2f889b | 9 gcp-europe-west1 | gcp-europe-west1 | 2df436d4-0f3f-40c3-8303-bec776e591c7 | 10 Time: 115ms total (execution 27ms / network 88ms)
WITH update_demo_uuid_pk AS ( UPDATE demo_uuid_pk SET count = 10 WHERE productID = 'f27db778-20ba-4458-b75f-0f3a6d46c6ab' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
gateway_region | crdb_region | count -------------------+------------------+-------- gcp-europe-west1 | gcp-europe-west1 | 10 Time: 100ms total (execution 11ms / network 88ms)
The reason this is so fast is because we're using UUID type for our primary key. The chance of collision for UUID PK across all regions is minimal, therefore we skip uniqueness checks in the remote regions.
Workaround for Non-UUID Primary Keys
You may be asking, what if we would like to maintain the old type for primary keys? Well, in that case, we do have one more way to maintain backwards compatibility to your legacy primary keys and still leverage great performance. We can take advantage of composite primary keys where one column is going to be your legacy non-UUID primary key and the other being a locality column like crdb_region
.
CREATE TABLE demo_composite_pk (productID PRIMARY KEY, count) AS SELECT unordered_unique_rowid() AS productID, generate_series(1, 10) AS count;
Now we're going to add a new column called region
, which you've seen before — but this time, we are going to add it as a default expression. The reason being, it will force to evaluate gateway_region()
which is the function we determine the locality of the write statement, at the insert time. The additional benefit to this approach is that region column in PK will avoid the uniqueness checks globally.
ALTER TABLE demo_composite_pk ADD COLUMN region crdb_internal_region NOT NULL DEFAULT gateway_region()::crdb_internal_region;
CockroachDB is capable of changing primary keys online. Here is a highly simplified example on a small table, and in many cases, this is immensely useful! Changing PK in production is not a frequent occurrence, but it's good to know there are products that can do that.
ALTER TABLE demo_composite_pk ALTER PRIMARY KEY USING COLUMNS (region, productID);
Finally, let's set the table to REGIONAL BY ROW
.
ALTER TABLE demo_composite_pk SET LOCALITY REGIONAL BY ROW;
The final schema is below:
CREATE TABLE public.demo_composite_pk ( productid INT8 NOT NULL, count INT8 NULL, region demo.public.crdb_internal_region NOT NULL DEFAULT gateway_region()::demo.public.crdb_internal_region, crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region, CONSTRAINT demo_composite_pk_pkey PRIMARY KEY (region ASC, productid ASC), UNIQUE INDEX demo_composite_pk_productid_key (productid ASC) ) LOCALITY REGIONAL BY ROW
Notice there's an additional unique index on the productid. It is not just the remnants of the legacy PK (CockroachDB promotes a former PK to Unique Index when PK changes to a new one). Additionally, it serves another purpose to maintain uniqueness for productids.
Let's test this from the US East:
INSERT INTO demo_composite_pk (productID, count) SELECT unordered_unique_rowid(), generate_series(1, 10) RETURNING gateway_region(), crdb_region, productID, count;
gateway_region | crdb_region | productid | count -----------------+--------------+---------------------+-------- gcp-us-east1 | gcp-us-east1 | 5199323410302500866 | 1 gcp-us-east1 | gcp-us-east1 | 8658087924123041794 | 2 gcp-us-east1 | gcp-us-east1 | 299407015723401218 | 3 gcp-us-east1 | gcp-us-east1 | 4911093034150789122 | 4 gcp-us-east1 | gcp-us-east1 | 2605250024937095170 | 5 gcp-us-east1 | gcp-us-east1 | 7216936043364483074 | 6 gcp-us-east1 | gcp-us-east1 | 1452328520330248194 | 7 gcp-us-east1 | gcp-us-east1 | 6064014538757636098 | 8 gcp-us-east1 | gcp-us-east1 | 3758171529543942146 | 9 gcp-us-east1 | gcp-us-east1 | 8369857547971330050 | 10 Time: 352ms total (execution 329ms / network 23ms)
Let's update a row homed in the US East region:
WITH demo_composite_pk AS ( UPDATE demo_composite_pk SET count = 10 WHERE productID = '5199323410302500866' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-east1 | gcp-us-east1 | 10 Time: 35ms total (execution 12ms / network 22ms)
Alright, we're not seeing anything we haven't seen before. Let's insert a row in the US West region and then update it.
gateway_region | crdb_region | productid | count -----------------+--------------+---------------------+-------- gcp-us-west2 | gcp-us-west2 | 695146676766375958 | 1 gcp-us-west2 | gcp-us-west2 | 6459754199800610838 | 2 gcp-us-west2 | gcp-us-west2 | 4153911190586916886 | 3 gcp-us-west2 | gcp-us-west2 | 8765597209014304790 | 4 gcp-us-west2 | gcp-us-west2 | 406916300614664214 | 5 gcp-us-west2 | gcp-us-west2 | 5018602319042052118 | 6 gcp-us-west2 | gcp-us-west2 | 2712759309828358166 | 7 gcp-us-west2 | gcp-us-west2 | 7324445328255746070 | 8 gcp-us-west2 | gcp-us-west2 | 1559837805221511190 | 9 gcp-us-west2 | gcp-us-west2 | 6171523823648899094 | 10 Time: 986ms total (execution 724ms / network 262ms)
WITH demo_composite_pk AS ( UPDATE demo_composite_pk SET count = 10 WHERE productID = '6459754199800610838' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
gateway_region | crdb_region | count -----------------+--------------+-------- gcp-us-west2 | gcp-us-west2 | 10 Time: 85ms total (execution 10ms / network 75ms)
This is good. We now have similar update performance from the US East and the West regions. What about EU?
gateway_region | crdb_region | productid | count -------------------+------------------+---------------------+-------- gcp-europe-west1 | gcp-europe-west1 | 3574678019714056204 | 1 gcp-europe-west1 | gcp-europe-west1 | 2998217267410632716 | 2 gcp-europe-west1 | gcp-europe-west1 | 7609903285838020620 | 3 gcp-europe-west1 | gcp-europe-west1 | 1845295762803785740 | 4 gcp-europe-west1 | gcp-europe-west1 | 6456981781231173644 | 5 gcp-europe-west1 | gcp-europe-west1 | 4151138772017479692 | 6 gcp-europe-west1 | gcp-europe-west1 | 8762824790444867596 | 7 gcp-europe-west1 | gcp-europe-west1 | 5015829900472614924 | 8 gcp-europe-west1 | gcp-europe-west1 | 2709986891258920972 | 9 gcp-europe-west1 | gcp-europe-west1 | 7321672909686308876 | 10 Time: 1.046s total (execution 0.885s / network 0.162s)
WITH demo_composite_pk AS ( UPDATE demo_composite_pk SET count = 10 WHERE productID = '3574678019714056204' RETURNING crdb_region, count ) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
gateway_region | crdb_region | count -------------------+------------------+-------- gcp-europe-west1 | gcp-europe-west1 | 10 Time: 129ms total (execution 14ms / network 116ms)
This is great. We're able to maintain backward compatibility to legacy keys and take advantage of local writes.
It's worth calling out that having a partition column as part of PK is not absolutely necessary. As long as the field you're partitioning by region has a unique constraint, you should be good to go. To emphasize this point, let's do the following. Let's drop the unique constraint on the productid field.
DROP INDEX demo_composite_pk_productid_key;
Execute an update on the table from any region.
gateway_region | crdb_region | count -------------------+------------------+-------- gcp-europe-west1 | gcp-europe-west1 | 10 Time: 411ms total (execution 157ms / network 254ms)
Latency goes up immediately. We can no longer maintain uniqueness across regions and must visit each region.
distribution: local vectorized: true • root │ ├── • render │ │ │ └── • scan buffer │ estimated row count: 0 │ label: buffer 2 (demo_composite_pk) │ └── • subquery │ id: @S1 │ original sql: UPDATE demo_composite_pk SET count = 10 WHERE productid = '3574678019714056204' RETURNING crdb_region, count │ exec mode: all rows │ └── • buffer │ label: buffer 2 (demo_composite_pk) │ └── • update │ estimated row count: 0 │ table: demo_composite_pk │ set: count │ └── • render │ └── • scan estimated row count: 0 (0.26% of the table; stats collected 16 minutes ago; using stats forecast for 5 hours in the future) table: demo_composite_pk@demo_composite_pk_pkey spans: [/'gcp-europe-west1'/'gcp-europe-west1'/3574678019714056204 - /'gcp-europe-west1'/'gcp-europe-west1'/3574678019714056204] [/'gcp-europe-west1'/'gcp-us-east1'/3574678019714056204 - /'gcp-europe-west1'/'gcp-us-east1'/3574678019714056204] [/'gcp-europe-west1'/'gcp-us-west2'/3574678019714056204 - /'gcp-europe-west1'/'gcp-us-west2'/3574678019714056204] [/'gcp-us-east1'/'gcp-europe-west1'/3574678019714056204 - /'gcp-us-east1'/'gcp-europe-west1'/3574678019714056204] … (5 more) locking strength: for update
Let's add it back using the example in the link above.
ALTER TABLE demo_composite_pk ADD CONSTRAINT productid_unique UNIQUE (productid);
Let's take a look at the partitions for productid across every region. We are going to filter the output of the following command as it is verbose.
SHOW PARTITIONS FROM TABLE demo_composite_pk;
Changing it to:
SELECT table_name, partition_name, column_names, index_name FROM [SHOW PARTITIONS FROM TABLE demo_composite_pk];
table_name | partition_name | column_names | index_name --------------------+------------------+--------------+------------------------------------------- demo_composite_pk | gcp-europe-west1 | crdb_region | demo_composite_pk@demo_composite_pk_pkey demo_composite_pk | gcp-us-east1 | crdb_region | demo_composite_pk@demo_composite_pk_pkey demo_composite_pk | gcp-us-west2 | crdb_region | demo_composite_pk@demo_composite_pk_pkey demo_composite_pk | gcp-europe-west1 | crdb_region | demo_composite_pk@productid_unique demo_composite_pk | gcp-us-east1 | crdb_region | demo_composite_pk@productid_unique demo_composite_pk | gcp-us-west2 | crdb_region | demo_composite_pk@productid_unique
Conclusion
In summary, if your application has a choice to migrate and pay the one-time cost to capture all of the benefits of what a multi-region database can offer, then opt for UUID PK. If your app is not that flexible, make sure you add a deterministic partition column and add a unique index on it. In my experiments, I still see a better overall performance with UUID, but it's the next best alternative.
Published at DZone with permission of Artem Ervits. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments