Mixing SQL and NoSQL With MariaDB and MongoDB
This article explores the compatibility of MariaDB and MongoDB for combining SQL and NoSQL databases.
Join the DZone community and get the full member experience.
Join For FreeLet's say you have an application developed in Node.js (or any other platform). This application connects to a MongoDB database (NoSQL) to store ratings on books (number of stars given and a comment). Let's also say that you have another application developed in Java (or Python, C#, TypeScript… anything). This application connects to a MariaDB database (SQL, relational) to manage a catalog of books (title, year of publishing, number of pages).
You are asked to create a report that shows the title and rating information for each book. Notice that the MongoDB database doesn't contain the title of the books, and the relational database doesn't contain the ratings. We need to mix data created by a NoSQL application with data created by a SQL application.
A common approach to this is to query both databases independently (using different data sources) and process the data to match by, for example, ISBN (the id of a book) and put the combined information in a new object. This needs to be done in a programming language like Java, TypeScript, C#, Python, or any other imperative programming language that is able to connect to both databases.
A polyglot application
This approach works. However, joining data is a job for a database. They are built for this kind of data operation. Also, with this approach, the SQL application is no longer an SQL-only application; it becomes a database polyglot, and this increases complexity, making it harder to maintain.
With a database proxy like MaxScale, you can join this data at the database level using the best language for data — SQL. Your SQL application doesn't need to become a polyglot.
Although this requires an additional element in the infrastructure, you also gain all the functionality that a database proxy has to offer. Things such as automatic failover, transparent data masking, topology isolation, caches, security filters, and more.
MaxScale is a powerful, intelligent database proxy that understands both SQL and NoSQL. It also understands Kafka (for CDC or data ingestion), but that's a topic for another occasion. In short, with MaxScale, you can connect your NoSQL application to a fully ACID-compliant relational database and store the data right there next to tables that other SQL applications use.
MaxScale allows an SQL application to consume NoSQL data.
Let's try this last approach in a quick and easy-to-follow experiment with MaxScale. You will need the following installed on your computer:
Setting up the MariaDB Database
Using a plain text editor, create a new file and save it with the name docker-compose.yml. The file should contain the following:
version: "3.9"
services:
mariadb:
image: alejandrodu/mariadb
environment:
- MARIADB_CREATE_DATABASE=demo
- MARIADB_CREATE_USER=user:Password123!
- MARIADB_CREATE_MAXSCALE_USER=maxscale_user:MaxScalePassword123!
maxscale:
image: alejandrodu/mariadb-maxscale
command: --admin_host 0.0.0.0 --admin_secure_gui false
ports:
- "3306:4000"
- "27017:27017"
- "8989:8989"
environment:
- MAXSCALE_USER=maxscale_user:MaxScalePassword123!
- MARIADB_HOST_1=mariadb 3306
- MAXSCALE_CREATE_NOSQL_LISTENER=user:Password123!
This is a Docker Compose file. It describes a set of services to be created by Docker. We are creating two services (or containers) — a MariaDB database server and a MaxScale database proxy. They will be running locally on your machine, but in production environments, it's common to deploy them in separate physical machines. Keep in mind that these Docker images are not suitable for production! They are intended to be suitable for quick demos and tests. You can find the source code for these images on GitHub. For the official Docker images from MariaDB, head to the MariaDB page on Docker Hub.
The previous Docker Compose file configures a MariaDB database server with a database (or schema; they are synonyms in MariaDB) called demo
. It also creates a username user
with the password Password123!
. This user has suitable privileges on the demo
database. There's an additional user with a name maxscale_user
and password MaxScalePassword123!
. This is the user that the MaxScale database proxy will use to connect to the MariaDB database.
The Docker Compose file also configures the database proxy by disabling HTTPS (don't do this in production!), exposing a set of ports (more on this in a moment), and configuring the database user and location of the MariaDB database proxy (usually an IP address, but here we can use the name of the container previously defined in the Docker file). The last line creates a NoSQL listener that we'll use to connect as a MongoDB client on the default port (27017).
To start the services (containers) using the command line, move to the directory in which you saved the Docker Compose file and run the following:
docker compose up -d
After downloading all the software and starting the containers, you'll have a MariaDB database and MaxScale proxy, both preconfigured for this experiment.
Creating a SQL Table in MariaDB
Let's connect to the relational database. In the command line, execute the following:
mariadb-shell --dsn mariadb://user:'Password123!'@127.0.0.1
Check that you can see the demo
database:
show databases;
Switch to the demo
database:
use demo;
Connecting to a database with MariaDB Shell.
Create the books
table:
CREATE TABLE books(
isbn VARCHAR(20) PRIMARY KEY,
title VARCHAR(256),
year INT
);
Insert some data. I'm going to use the cliché of inserting my own books:
INSERT INTO books(title, isbn, year)
VALUES
("Vaadin 7 UI Design By Example", "978-1-78216-226-1", 2013),
("Data-Centric Applications with Vaadin 8", "978-1-78328-884-7", 2018),
("Practical Vaadin", "978-1-4842-7178-0", 2021);
Check that the books are stored in the database by running:
SELECT * FROM books;
Inserting data with MariaDB Shell.
Creating a JSON Collection in MariaDB
We haven't installed MongoDB, yet we can use a MongoDB client (or application) to connect to create collections and documents as if we were using MongoDB, except that the data is stored in a powerful, fully ACID-compliant, and scalable relational database. Let's try that out!
In the command line, use the MongoDB shell tool to connect to the MongoDB… wait… it's actually the MariaDB database! Run the following:
mongosh
By default, this tool tries to connect to a MongoDB server (which, again, happens to be MariaDB this time) running on your local machine (127.0.0.1) using the default port (20017). If everything goes well, you should be able to see the demo
database listed when you run the following command:
show databases
Switch to the demo
database:
use demo
Connecting to MariaDB using Mongo Shell.
We are connected to a relational database from a non-relational client! Let's create the ratings
collection and insert some data into it:
db.ratings.insertMany([
{
"isbn": "978-1-78216-226-1",
"starts": 5,
"comment": "A good resource for beginners who want to learn Vaadin"
},
{
"isbn": "978-1-78328-884-7",
"starts": 4,
"comment": "Explains Vaadin in the context of other Java technologies"
},
{
"isbn": "978-1-4842-7178-0",
"starts": 5,
"comment": "The best resource to learn web development with Java and Vaadin"
}
])
Check that the ratings are persisted in the database:
db.ratings.find()
Querying a MariaDB database using Mongo Shell.
Using JSON Functions in MariaDB
At this point, we have a single database that, from the outside, looks like a NoSQL (MongoDB) database and a relational (MariaDB) database. We are able to connect to the same database and write and read data from MongoDB clients and SQL clients. All the data is stored in MariaDB, so we can use SQL to join data from MongoDB clients or applications with data from MariaDB clients or applications. Let's explore how MaxScale is using MariaDB to store MongoDB data (collections and documents).
Connect to the database using an SQL client like mariadb-shell
, and show the tables in the demo schema:
show tables in demo;
You should see both the books
and ratings
tables listed. ratings
Was created as a MongoDB collection. MaxScale translated the commands sent from the MongoDB client and created a table to store the data in a table. Let's see the structure of this table:
describe demo.ratings;
A NoSQL collection is stored as a MariaDB relational table.
The ratings
the table contains two columns:
id
: the object ID.doc
: the document in JSON format.
If we inspect the contents of the table, we'll see that all the data about ratings are stored in the doc
column in JSON format:
SELECT doc FROM demo.ratings \G
NoSQL documents are stored in a MariaDB database.
Let's get back to our original goal—show the book titles with their rating information. The following is not the case, but let's suppose for a moment that the ratings
table is a regular table with columns stars
and comment
. If that were the case, joining this table with the books
table would be easy, and our job would be done:
/* this doesn’t work */
SELECT b.title, r.stars, r.comment
FROM ratings r
JOIN books b USING(isbn)
Back to reality. We need to convert the doc
column of the actual ratings
table to a relational expression that can be used as a new table in the query. Something like this:
/* this still doesn’t work */
SELECT b.title, r.stars, r.comment
FROM ratings rt
JOIN ...something to convert rt.doc to a table... AS r
JOIN books b USING(isbn)
That something is the JSON_TABLE
function. MariaDB includes a comprehensive set of JSON functions for manipulating JSON strings. We'll use the JSON_TABLE
function to convert the doc
column into a relational form that we can use to perform SQL joins. The general syntax of the JSON_TABLE
function is as follows:
JSON_TABLE(json_document, context_path COLUMNS (
column_definition_1,
column_definition_2,
...
)
) [AS] the_new_relational_table
Where:
json_document
: a string or expression that returns the JSON documents to be used.context_path
: a JSON Path expression that defines the nodes to be used as the source of the rows.
And the column definitions (column_definition_1
, column_definition_2
, etc...) have the following syntax:
new_column_name sql_type PATH path_in_the_json_doc [on_empty] [on_error]
Combining this knowledge, our SQL query would look like the following:
SELECT b.title, r.stars, r.comment
FROM ratings rt
JOIN JSON_TABLE(rt.doc, '$' COLUMNS(
isbn VARCHAR(20) PATH '$.isbn',
stars INT PATH '$.starts',
comment TEXT PATH '$.comment'
)
) AS r
JOIN books b USING(isbn);
Joining NoSQL and SQL data in a single SQL query.
We could have used the ISBN value as the MongoDB ObjectID and, consequently, as the id
column in the ratings
table, but I'll leave that to you as an exercise (hint: use _id
instead of isbn
when inserting data using the MongoDB client or app).
A Word on Scalability
There's a misconception that relational databases don't scale horizontally (adding more nodes) while NoSQL databases do. But relational databases scale without sacrificing ACID properties. MariaDB has multiple storage engines tailored to different workloads. For example, you can scale a MariaDB database by implementing data sharding with the help of Spider. You can also use a variety of storage engines to handle different workloads on a per-table basis. Cross-engine joins are possible in a single SQL query.
Combining multiple storage engines in a single logical MariaDB database.
Another more modern alternative is distributed SQL with MariaDB Xpand. A distributed SQL database appears as a single logical relational database to applications through transparent sharding. It employs a shared-nothing architecture that scales both reading and writing.
A distributed SQL database deployment.
Conclusion
Our job here is done! Now, your systems can have an ACID-compliant scalable 360-degree view of your data independent of whether it was created by SQL or NoSQL applications. There's less need to migrate your apps from NoSQL to SQL or to make SQL apps database polyglots. If you want to learn more about other features in MaxScale, watch this video or visit the docs.
Opinions expressed by DZone contributors are their own.
Comments