Exploring CockroachDB With Flyway Schema Migration Tool
Get familiar with Flyway and some of the new capabilities in CockroachDB leveraging schema migrations using a docker-compose environment.
Join the DZone community and get the full member experience.
Join For FreeToday, I am going to quickly introduce you to Flyway and some of the new capabilities in CockroachDB leveraging schema migrations. This is by no means a deep-dive on Flyway, for that, I highly recommend you get familiar with Flyway's documentation. With that, let's dive in.
I will continue to use a docker-compose environment for the following tutorial as it fits nicely with the iterative model of development and deployment with schema migration tools. We will need a recent CockroachDB image. My current folder tree looks like so:
crdb-flyway └── docker-compose.yml 0 directories, 1 file
My docker-compose file looks like so:
version: '3.9' services: crdb: image: cockroachdb/cockroach:v21.2.4 container_name: crdb ports: - "26257:26257" - "8080:8080" command: start-single-node --insecure volumes: - ${PWD}/cockroach-data/crdb:/cockroach-data:rw networks: default: external: name: roachnet
Flyway ships as a docker container but since it's only used to execute a migration, it is not necessary to keep the flyway service active. We are going to run a Flyway container on demand. That said, we do need a network created so that the running CockroachDB node and a Flyway instance can communicate.
The first thing we need to do after creating a compose file is docker network create roachnet
. You can see we reference that network in the compose file after the service definition. We can check the network is available with docker network ls
➜ crdb-flyway docker network create roachnet a9456bd109efef37c582b36719b73c48e43a9d2d90ae83faa3b0098e7e1b8bdc ➜ crdb-flyway docker network ls NETWORK ID NAME DRIVER SCOPE 5bab0ec87901 bridge bridge local 0113aeaf0346 host host local 5208e02099d3 none null local a9456bd109ef roachnet bridge local
Start the compose with docker-compose up -d
and let Cockroach run in the background.
At this point our project tree should have a compose file and a cockroach-data directory.
. ├── cockroach-data │ └── crdb └── docker-compose.yml 2 directories, 1 file
Now, we can confirm Cockroach is accessible with Flyway, let's pull the latest release of Flyway image compatible with Cockroach, at the time of writing, it is 6.4.4.
➜ crdb-flyway docker pull flyway/flyway:6.4.4 6.4.4: Pulling from flyway/flyway Digest: sha256:22d97ceb0c47182c04e5f45be6dcc29d3f5bb5d7c2218fa236670f793693f501 Status: Image is up to date for flyway/flyway:6.4.4 docker.io/flyway/flyway:6.4.4
At this point, because we have a network defined, we can run the following command to make sure Flyway can access our instance of Cockroach.
docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql flyway/flyway:6.4.4 -url=jdbc:postgresql://crdb:26257/defaultdb -user=root -password="" -connectRetries=3 info
We are running Flyway with info
predicate to report on what's the current status of migration. We run a container, we point a volume called $PWD/flyway/sql:/flyway/sql
where all our SQL migration files will reside, the instance of CRDB is accessible with the url jdbc://postgres://crdb:26257/defaultdb
, where crdb
is the name of the container with user root
, this is an insecure instance so we're passing an empty string for password and an optional -connectRetries
flag to retry connection 3 times. The output of the command is below:
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Schema version: << Empty Schema >> +----------+---------+-------------+------+--------------+-------+ | Category | Version | Description | Type | Installed On | State | +----------+---------+-------------+------+--------------+-------+ | No migrations found | +----------+---------+-------------+------+--------------+-------+
With this command, we added a new directory called flyway in our project directory.
. ├── cockroach-data │ └── crdb ├── docker-compose.yml └── flyway └── sql 4 directories, 1 file
This is quite a lengthy command, let's simplify this a bit with a flyway conf file.
mkdir -p flyway/conf touch flyway/conf/flyway.conf
This is a configuration file where we can pass some of the arguments we explicitly passed in the info
command above.
flyway.url=jdbc:postgresql://crdb:26257/defaultdb flyway.user=root flyway.password="" flyway.connectRetries=3
Our docker command changes to docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql -v $PWD/flyway/conf:/flyway/conf flyway/flyway:6.4.4 info
Refer to the Flyway documentation for other configuration parameters.
We now have the following project tree.
. ├── cockroach-data │ └── crdb ├── docker-compose.yml └── flyway ├── conf │ └── flyway.conf └── sql
With all of this out of the way, let's start exploring Flyway and Cockroach schema capabilities.
Let's write a DDL statement and save it as a .sql file in flyway/sql
directory.
We are going to create a table with a PK on an integer column. Later on, we're going to demonstrate online Primary Key change capabilities that shipped with 20.1.
1. Create Table With INT PK
CREATE TABLE fruits ( id INT NOT NULL PRIMARY KEY DEFAULT unique_rowid(), name STRING, color STRING ); SHOW CREATE TABLE fruits;
Save this as flyway/sql/V1__Create_table.sql
. ├── cockroach-data │ └── crdb ├── docker-compose.yml └── flyway ├── conf │ └── flyway.conf └── sql └── V1__Create_table.sql
At this point we're ready to run a migration, so let's run the info
command again:
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Schema version: << Empty Schema >> +-----------+---------+--------------+------+--------------+---------+ | Category | Version | Description | Type | Installed On | State | +-----------+---------+--------------+------+--------------+---------+ | Versioned | 1 | Create table | SQL | | Pending | +-----------+---------+--------------+------+--------------+---------+
The migration is in a pending state; let's migrate!
docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql -v $PWD/flyway/conf:/flyway/conf flyway/flyway:6.4.4 migrate
Our command is changed with a single argument, migrate
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 1 migration (execution time 00:00.023s) Creating Schema History table "defaultdb"."flyway_schema_history" ... Current version of schema "defaultdb": << Empty Schema >> Migrating schema "defaultdb" to version 1 - Create table [non-transactional] +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( id INT8 NOT NULL DEFAULT unique_rowid(), name STRING NULL, color STRING NULL, CONSTRAINT "primary" PRIMARY KEY (id ASC), FAMILY "primary" (id, name, color) ) | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.367s)
We are going to load data into our newly-created table. Because we are using unique_rowid() function, we have no visibility into what ID is being generated at runtime, RETURNING
clause does exactly that, it prints to the stdout the IDs generated with this insert.
2. Load Data With RETURNING Keyword to Display the Output of unique_rowid()
INSERT INTO fruits (name, color) VALUES ('apple', 'red'),('orange', 'orange'),('plum', 'purple') RETURNING id;
Save the file as flyway/sql/V2__Load_data.sql
. V#__
is a standard Flyway naming convention, refer to their docs for the explanation. Run the migration again.
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 2 migrations (execution time 00:00.037s) Current version of schema "defaultdb": 1 Migrating schema "defaultdb" to version 2 - Load data [non-transactional] +--------------------+ | id | +--------------------+ | 566986134960537601 | | 566986134960603137 | | 566986134960635905 | +--------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.177s)
We can check the info
again:
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Schema version: 2 +-----------+---------+--------------+------+---------------------+---------+ | Category | Version | Description | Type | Installed On | State | +-----------+---------+--------------+------+---------------------+---------+ | Versioned | 1 | Create table | SQL | 2020-06-25 15:55:40 | Success | | Versioned | 2 | Load data | SQL | 2020-06-25 16:00:36 | Success | +-----------+---------+--------------+------+---------------------+---------+
You can start seeing the benefits of schema migration tools: versioning, audit trail, SDLC, etc.
At this point, I'd like to demonstrate some of the typical development scenarios an engineer may go through to develop a product. Let's say we changed our mind on the primary key and need to change that to better fit our the use case and access patterns. We're going to deprecate our id
column and nominate name
and color
as our new PK. Because the fields name
and color
may contain NULL
, let's change the table definition for these columns to prevent it.
3. Change Name and Color to NOT NULL
ALTER TABLE fruits ALTER COLUMN name SET NOT NULL; ALTER TABLE fruits ALTER COLUMN color SET NOT NULL; SHOW CREATE TABLE fruits;
Once you save the SQL statement above and prefix the file with V3__
, run the migrate command.
Successfully validated 3 migrations (execution time 00:00.032s) Current version of schema "defaultdb": 2 Migrating schema "defaultdb" to version 3 - Change color and name to not null [non-transactional] +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( id INT8 NOT NULL DEFAULT unique_rowid(), name STRING NOT NULL, color STRING NOT NULL, CONSTRAINT "primary" PRIMARY KEY (id ASC), FAMILY "primary" (id, name, color) ) | +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.708s)
Now comes the big part, changing the PK to name
+ color
.
4. Alter Table to Change PK to Name, Color
ALTER TABLE fruits ALTER PRIMARY KEY USING COLUMNS (name, color); SHOW CREATE TABLE fruits;
Again, save the DDL as a file in flyway/sql/V4__<desired_name>.sql
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 4 migrations (execution time 00:00.038s) Current version of schema "defaultdb": 3 Migrating schema "defaultdb" to version 4 - Change pk to name and color [non-transactional] DB: primary key changes are finalized asynchronously; further schema changes on this table may be restricted until the job completes +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( id INT8 NOT NULL DEFAULT unique_rowid(), name STRING NOT NULL, color STRING NOT NULL, CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC), UNIQUE INDEX fruits_id_key (id ASC), FAMILY "primary" (id, name, color) ) | +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.560s)
Notice the PK changed, we can now safely drop the id
column as it is no longer of use for us.
5. Drop Unique Index Constraint and id Column
SHOW INDEX FROM fruits; DROP INDEX fruits_id_key CASCADE; ALTER TABLE fruits DROP COLUMN id CASCADE; SHOW CREATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 5 migrations (execution time 00:00.038s) Current version of schema "defaultdb": 4 Migrating schema "defaultdb" to version 5 - Drop unique constraint and id column [non-transactional] +------------+---------------+------------+--------------+-------------+-----------+---------+----------+ | table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit | +------------+---------------+------------+--------------+-------------+-----------+---------+----------+ | fruits | primary | f | 1 | name | ASC | f | f | | fruits | primary | f | 2 | color | ASC | f | f | | fruits | fruits_id_key | f | 1 | id | ASC | f | f | | fruits | fruits_id_key | f | 2 | name | ASC | f | t | | fruits | fruits_id_key | f | 3 | color | ASC | f | t | +------------+---------------+------------+--------------+-------------+-----------+---------+----------+ DB: the data for dropped indexes is reclaimed asynchronously +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( name STRING NOT NULL, color STRING NOT NULL, CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC), FAMILY "primary" (name, color) ) | +------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.801s)
6. Load More Data
INSERT INTO fruits (name, color) VALUES ('avocado', 'green'),('peach', 'yellow'); SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 6 migrations (execution time 00:00.042s) Current version of schema "defaultdb": 5 Migrating schema "defaultdb" to version 6 - Load data [non-transactional] +---------+--------+ | name | color | +---------+--------+ | apple | red | | avocado | green | | orange | orange | | peach | yellow | | plum | purple | +---------+--------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.161s)
Let's make it a bit interesting by adding a UNIQUE
constraint on the name
field. It means we can have duplicate color
as long as name
of the fruit remains unique.
7. Add Unique Constraint to Name in desc Order
CREATE UNIQUE INDEX ON fruits (name DESC); SHOW CONSTRAINTS FROM fruits; SHOW CREATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 7 migrations (execution time 00:00.044s) Current version of schema "defaultdb": 6 Migrating schema "defaultdb" to version 7 - Add unique constraint on name column [non-transactional] +------------+-----------------+-----------------+-----------------------------------+-----------+ | table_name | constraint_name | constraint_type | details | validated | +------------+-----------------+-----------------+-----------------------------------+-----------+ | fruits | fruits_name_key | UNIQUE | UNIQUE (name DESC) | t | | fruits | primary | PRIMARY KEY | PRIMARY KEY (name ASC, color ASC) | t | +------------+-----------------+-----------------+-----------------------------------+-----------+ +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( name STRING NOT NULL, color STRING NOT NULL, CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC), UNIQUE INDEX fruits_name_key (name DESC), FAMILY "primary" (name, color) ) | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.556s)
8. Load More Data
Notice colors yellow
and green
have been used before (a unique constraint or name does not affect color).
INSERT INTO fruits (name, color) VALUES ('clementine', 'yellow'), ('pear', 'green'); SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 8 migrations (execution time 00:00.045s) Current version of schema "defaultdb": 7 Migrating schema "defaultdb" to version 8 - Load data respecting unique [non-transactional] +------------+--------+ | name | color | +------------+--------+ | apple | red | | avocado | green | | clementine | yellow | | orange | orange | | peach | yellow | | pear | green | | plum | purple | +------------+--------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.151s)
Let me demonstrate what would happen if we tried to insert another duplicate fruit:
root@:26257/defaultdb> insert into fruits (name, color) values ('plum', 'red'); ERROR: duplicate key value (name)=('plum') violates unique constraint "fruits_name_key" SQLSTATE: 23505 root@:26257/defaultdb>
9. Truncate Table
TRUNCATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 9 migrations (execution time 00:00.051s) Current version of schema "defaultdb": 8 Migrating schema "defaultdb" to version 9 - Truncate table [non-transactional] Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.227s)
We now want to introduce a new globaly unique id
field with uuid data type. We're going to add a new column called revised_id
, make a DEFAULT
behavior on it to generate a new key each time a record is inserted, change PK to that new field, and finally drop the index on the name and color fields.
10. Alter Table to Change PK to UUID and Drop Unique Index for Old PK (Name, Color)
ALTER TABLE fruits ADD COLUMN revised_id UUID NOT NULL DEFAULT gen_random_uuid(); ALTER TABLE fruits ALTER PRIMARY KEY USING COLUMNS (revised_id); SHOW CONSTRAINTS FROM fruits; DROP INDEX fruits_name_color_key CASCADE; SHOW CONSTRAINTS FROM fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 10 migrations (execution time 00:00.052s) Current version of schema "defaultdb": 9 Migrating schema "defaultdb" to version 10 - Alter table change pk [non-transactional] DB: primary key changes are finalized asynchronously; further schema changes on this table may be restricted until the job completes +------------+-----------------------+-----------------+------------------------------+-----------+ | table_name | constraint_name | constraint_type | details | validated | +------------+-----------------------+-----------------+------------------------------+-----------+ | fruits | fruits_name_color_key | UNIQUE | UNIQUE (name ASC, color ASC) | t | | fruits | fruits_name_key | UNIQUE | UNIQUE (name DESC) | t | | fruits | primary | PRIMARY KEY | PRIMARY KEY (revised_id ASC) | t | +------------+-----------------------+-----------------+------------------------------+-----------+ DB: the data for dropped indexes is reclaimed asynchronously +------------+-----------------+-----------------+------------------------------+-----------+ | table_name | constraint_name | constraint_type | details | validated | +------------+-----------------+-----------------+------------------------------+-----------+ | fruits | fruits_name_key | UNIQUE | UNIQUE (name DESC) | t | | fruits | primary | PRIMARY KEY | PRIMARY KEY (revised_id ASC) | t | +------------+-----------------+-----------------+------------------------------+-----------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.893s)
So now, any new insert will have a new key with datatype uuid
.
11. Load Data Respecting UUID
INSERT INTO fruits (name, color) VALUES ('clementine', 'yellow'), ('pear', 'green'), ('avocado', 'green'),('peach', 'yellow'), ('apple', 'red'),('orange', 'orange'),('plum', 'purple'); SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 11 migrations (execution time 00:00.049s) Current version of schema "defaultdb": 10 Migrating schema "defaultdb" to version 11 - Reload data [non-transactional] +------------+--------+--------------------------------------+ | name | color | revised_id | +------------+--------+--------------------------------------+ | plum | purple | 0f9eb934-f29d-4aee-a3b7-4f3431421c99 | | clementine | yellow | 1f38f571-027d-450c-8c65-a0675f25b963 | | apple | red | 4d47207f-d186-43c9-9321-0cb400c432b0 | | pear | green | 776093f9-9b36-4ca3-a5fa-6e78483baf69 | | orange | orange | d49b6792-eaf8-47ad-8a5b-94451471c1c9 | | avocado | green | da259e58-30c8-4ee6-b01b-e2c46d999478 | | peach | yellow | e0fe4ba5-119f-474e-8007-0928c8b715e7 | +------------+--------+--------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.143s)
I'm not entirely pleased with the name revised_id
and want to rename it to primary_id
.
12. Demonstrate Column Rename
ALTER TABLE fruits RENAME COLUMN revised_id TO primary_id; SHOW CREATE TABLE fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 12 migrations (execution time 00:00.054s) Current version of schema "defaultdb": 11 Migrating schema "defaultdb" to version 12 - Rename column [non-transactional] +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( name STRING NOT NULL, color STRING NOT NULL, primary_id UUID NOT NULL DEFAULT gen_random_uuid(), CONSTRAINT "primary" PRIMARY KEY (primary_id ASC), UNIQUE INDEX fruits_name_key (name DESC), FAMILY "primary" (name, color, primary_id) ) | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.409s)
CockroachDB supports Column Families for heterogeneous access patterns within a row. A similar concept exists in run-of-the-mill NoSQL databases. We're going to add a new column of type JSONB
, a common pattern is to separate a binary
or json
payload separate from regular data as one or the other may not be regularly accessed. Filtering by column family can make the performance of each query dramatically faster.
13. Add a New Column Into a New CF With JSON
ALTER TABLE fruits ADD COLUMN payload JSONB NULL CREATE IF NOT EXISTS FAMILY secondary;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 13 migrations (execution time 00:00.057s) Current version of schema "defaultdb": 12 Migrating schema "defaultdb" to version 13 - Add json column new cf [non-transactional] +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( name STRING NOT NULL, color STRING NOT NULL, primary_id UUID NOT NULL DEFAULT gen_random_uuid(), payload JSONB NULL, CONSTRAINT "primary" PRIMARY KEY (primary_id ASC), UNIQUE INDEX fruits_name_key (name DESC), FAMILY "primary" (name, color, primary_id), FAMILY secondary (payload) ) | +------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.547s)
Let's add rows with JSON embedded in them.
14. Load JSON Data
INSERT INTO fruits (name, color, payload) VALUES ('apricot', 'yellow', '{"name":"apricot", "color":"yellow"}'), ('mango', 'orange', '{"name":"mango", "color":"orange"}'), ('snake fruit', 'brown', '{"name":"snake fruit", "color":"brown"}'), ('mangostin', 'red', '{"name":"mangostin", "color":"red"}'), ('jackfruit', 'yellow', '{"name":"jackfruit", "color":"yellow"}'), ('durian', 'yellow', '{"name":"durian", "color":"yellow"}'); SELECT * FROM fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 14 migrations (execution time 00:00.060s) Current version of schema "defaultdb": 13 Migrating schema "defaultdb" to version 14 - Load json [non-transactional] +-------------+--------+--------------------------------------+-------------------------------------------+ | name | color | primary_id | payload | +-------------+--------+--------------------------------------+-------------------------------------------+ | plum | purple | 0f9eb934-f29d-4aee-a3b7-4f3431421c99 | | | clementine | yellow | 1f38f571-027d-450c-8c65-a0675f25b963 | | | apple | red | 4d47207f-d186-43c9-9321-0cb400c432b0 | | | jackfruit | yellow | 51289d06-3b4b-47cf-a85d-4dc8739aad9f | {"color": "yellow", "name": "jackfruit"} | | mangostin | red | 53b661e0-3406-4e7f-a9bd-9c227ad5ab1b | {"color": "red", "name": "mangostin"} | | mango | orange | 54260768-a7c8-49c7-8181-9810c9368f11 | {"color": "orange", "name": "mango"} | | pear | green | 776093f9-9b36-4ca3-a5fa-6e78483baf69 | | | durian | yellow | 7ebf21fc-9c20-4332-970e-19d299bd6ff2 | {"color": "yellow", "name": "durian"} | | snake fruit | brown | 8f68436b-2b34-4fa2-9247-b2b63d2a6eb1 | {"color": "brown", "name": "snake fruit"} | | apricot | yellow | be4cc470-64e0-410c-b186-5186a0a17818 | {"color": "yellow", "name": "apricot"} | | orange | orange | d49b6792-eaf8-47ad-8a5b-94451471c1c9 | | | avocado | green | da259e58-30c8-4ee6-b01b-e2c46d999478 | | | peach | yellow | e0fe4ba5-119f-474e-8007-0928c8b715e7 | | +-------------+--------+--------------------------------------+-------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.168s)
Now, if you choose to query the table the way it is, performance may be noticeably slower.
root@:26257/defaultdb> SELECT * FROM fruits WHERE payload @> '{"color": "yellow"}}'; name | color | primary_id | payload ------------+--------+--------------------------------------+------------------------------------------- jackfruit | yellow | 30b823f8-eaa3-4426-8198-5f55286eb07c | {"color": "yellow", "name": "jackfruit"} durian | yellow | 5aba39fa-58cb-4ecc-a066-b45a7f57f561 | {"color": "yellow", "name": "durian"} apricot | yellow | c2598274-d826-44e8-b536-c4a2513eb33b | {"color": "yellow", "name": "apricot"} (3 rows) Time: 2.0718ms
Adding what's called an inverted index on the JSONB
field, will improve the performance of your JSON queries.
15. Add Inverted Index
CREATE INVERTED INDEX ON fruits(payload) SHOW CREATE TABLE fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 15 migrations (execution time 00:00.072s) Current version of schema "defaultdb": 14 Migrating schema "defaultdb" to version 15 - Create inverted index [non-transactional] +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | table_name | create_statement | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | fruits | CREATE TABLE fruits ( name STRING NOT NULL, color STRING NOT NULL, primary_id UUID NOT NULL DEFAULT gen_random_uuid(), payload JSONB NULL, CONSTRAINT "primary" PRIMARY KEY (primary_id ASC), UNIQUE INDEX fruits_name_key (name DESC), INVERTED INDEX fruits_payload_idx (payload), FAMILY "primary" (name, color, primary_id), FAMILY secondary (payload) ) | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.540s)
16. Query the Table Using Index
SELECT * FROM fruits WHERE payload @> '{"color": "yellow"}}';
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Successfully validated 16 migrations (execution time 00:00.064s) Current version of schema "defaultdb": 15 Migrating schema "defaultdb" to version 16 - Query json with inverted index [non-transactional] +-----------+--------+--------------------------------------+------------------------------------------+ | name | color | primary_id | payload | +-----------+--------+--------------------------------------+------------------------------------------+ | jackfruit | yellow | 30b823f8-eaa3-4426-8198-5f55286eb07c | {"color": "yellow", "name": "jackfruit"} | | durian | yellow | 5aba39fa-58cb-4ecc-a066-b45a7f57f561 | {"color": "yellow", "name": "durian"} | | apricot | yellow | c2598274-d826-44e8-b536-c4a2513eb33b | {"color": "yellow", "name": "apricot"} | +-----------+--------+--------------------------------------+------------------------------------------+ Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.153s)
Notice the execution time, 00:00.153s
vs. 2.0718ms
At this point, the tree of the project looks like so:
├── cockroach-data │ └── crdb ├── docker-compose.yml └── flyway ├── conf │ └── flyway.conf └── sql ├── V10__Alter_table_change_pk.sql ├── V11__Reload_data.sql ├── V12__Rename_column.sql ├── V13__Add_json_column_new_cf.sql ├── V14__Load_json.sql ├── V15__Create_inverted_index.sql ├── V16__Query_json_with_inverted_index.sql ├── V1__Create_table.sql ├── V2__Load_data.sql ├── V3__Change_color_and_name_to_not_null.sql ├── V4__Change_pk_to_name_and_color.sql ├── V5__Drop_unique_constraint_and_id_column.sql ├── V6__Load_data.sql ├── V7__Add_unique_constraint_on_name_column.sql ├── V8__Load_data_respecting_unique.sql └── V9__Truncate_table.sql 5 directories, 18 files
And running info
on Flyway will result in:
Flyway Community Edition 6.4.4 by Redgate Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5) Schema version: 16 +-----------+---------+--------------------------------------+------+---------------------+---------+ | Category | Version | Description | Type | Installed On | State | +-----------+---------+--------------------------------------+------+---------------------+---------+ | Versioned | 1 | Create table | SQL | 2020-06-29 20:03:09 | Success | | Versioned | 2 | Load data | SQL | 2020-06-29 20:03:09 | Success | | Versioned | 3 | Change color and name to not null | SQL | 2020-06-29 20:03:10 | Success | | Versioned | 4 | Change pk to name and color | SQL | 2020-06-29 20:03:11 | Success | | Versioned | 5 | Drop unique constraint and id column | SQL | 2020-06-29 20:03:11 | Success | | Versioned | 6 | Load data | SQL | 2020-06-29 20:03:11 | Success | | Versioned | 7 | Add unique constraint on name column | SQL | 2020-06-29 20:03:12 | Success | | Versioned | 8 | Load data respecting unique | SQL | 2020-06-29 20:03:12 | Success | | Versioned | 9 | Truncate table | SQL | 2020-06-29 20:03:12 | Success | | Versioned | 10 | Alter table change pk | SQL | 2020-06-29 20:03:13 | Success | | Versioned | 11 | Reload data | SQL | 2020-06-29 20:03:13 | Success | | Versioned | 12 | Rename column | SQL | 2020-06-29 20:03:14 | Success | | Versioned | 13 | Add json column new cf | SQL | 2020-06-29 20:03:14 | Success | | Versioned | 14 | Load json | SQL | 2020-06-29 20:03:14 | Success | | Versioned | 15 | Create inverted index | SQL | 2020-06-29 20:05:54 | Success | | Versioned | 16 | Query json with inverted index | SQL | 2020-06-29 20:08:49 | Success | +-----------+---------+--------------------------------------+------+---------------------+---------+
Published at DZone with permission of Artem Ervits. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments