Your NoSQL Database Has an Implicit Schema
There's an assumption in our industry that NoSQL or document database systems don't have a schema, and hence are easier to use. This assumption is simply wrong!
I've heard several software developers say, "Nah, I'll just use a document database, because it allows me to completely ignore schema, and I don't have to design my database up front." This assumption is positively wrong. Let me illustrate with a use case. Imagine you start your project, and your database model looks like the following.
class Customer
{
    string Name;
    string Email;
    string PhoneNumber;
}
You start using the above class to persist your objects into your Cosmos DB, Couchbase, or MongoDB database. Six months down the road, you've got 1 million customer records in your database, and you want to change your application, because you realise that a customer might have more than one phone. The customer might have a home phone, a cell phone, and a work phone. Ignoring the type of phone number, and just focusing on a subset of the problem, we could imagine the developer happily updating the database model, because there's no "need to fiddle with the database schema". The new version becomes as follows.
class Customer
{
    string Name;
    string Email;
    List<string> PhoneNumbers;
}
Congratulations! You now have one million "garbage records" in your database. Records you can no longer read, simply because your Data Access Layer (DAL) returns an instance of Customer as you query for customers, and your existing customers that are persisted as follows into your database ...
{
    "Name": "John Doe",
    "Email": "john@doe.com",
    "PhoneNumber": "555 555 5555"
}
... can no longer be read using your "new and shiny improved database model". Why? Because "PhoneNumber" no longer exists in the model, and has been replaced with an array called "PhoneNumbers" (plural). At best, this will result in you having a million customers with a "null" phone number. At worst, your DAL will throw an exception as you try to read any old records from your database.
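The "at best" case can be reproduced in a few lines. This is only a minimal sketch, assuming the DAL deserializes documents with System.Text.Json and its default options; the `OldRecordDemo` class is a hypothetical name, and the model uses public properties because System.Text.Json ignores fields by default. Other serializers or drivers may be configured to reject unknown fields instead, which corresponds to the exception case.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// The "new" model from the article, written with public properties so that
// System.Text.Json can populate it.
public class Customer
{
    public string? Name { get; set; }
    public string? Email { get; set; }
    public List<string>? PhoneNumbers { get; set; }
}

// Hypothetical demo class, not part of the article.
public static class OldRecordDemo
{
    // A document persisted by the first version of the application.
    public const string OldDocument =
        "{\"Name\":\"John Doe\",\"Email\":\"john@doe.com\",\"PhoneNumber\":\"555 555 5555\"}";

    // Reads the old document with the new model. With default options,
    // the unknown "PhoneNumber" property is skipped and the missing
    // "PhoneNumbers" property keeps its default value (null).
    public static Customer ReadWithNewModel()
    {
        return JsonSerializer.Deserialize<Customer>(OldDocument)!;
    }
}
```

Calling `OldRecordDemo.ReadWithNewModel()` returns a customer whose `Name` and `Email` are filled in, while `PhoneNumbers` is `null`. No error is raised, so the old phone number is silently lost to the application.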
NoSQL has a few advantages over relational database systems, but the lack of schema is not one of them. And if you use NoSQL for the wrong reasons, reasoning for instance that "it has no schema, so it's easier to start my app", you're in for a lot of pain further down the road ...
If you had used a relational database system, such as SQL Server, MySQL, or MariaDB, and combined it with, for instance, Entity Framework or (N)Hibernate, you'd at least get a compiler error as you updated your model from your database, or vice versa. With NoSQL, you're on your own. In this regard, NoSQL becomes the equivalent of an "untyped" (dynamically typed) programming language, such as JavaScript, compared to a "strongly typed" (statically typed) programming language, such as C#, Java, or TypeScript. And "untyping" your database is a very, very bad idea, unless you absolutely have to (for other reasons).
For the record (pun!), there are ways to fix this, such as injecting some sort of adapter logic after the database reader, which returns an untyped object, and then "mapping" from the old structure to the new using code. But this implies that you are aware that there's an implicit schema in your database, and that you are willing to accommodate it with code. In a "strongly typed" database (SQL), you could have done the entire job during development, and then created an update SQL script, converting your production database to its new structure. In NoSQL, you'll often end up with tons of "if version x, then do y" type of code in your production environment, cluttering your project with "temporary garbage code" that wouldn't need to be there if you had a "strongly typed" database. Besides, others might read some random X records from your database and extrapolate from the "average" to create a strongly typed model for reading these documents, maybe in a totally unrelated project for that matter. They might miss parts of your "implicit schema changes", making their code a ticking bomb, just waiting to explode once they read an "old record" ...
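As a rough illustration of that adapter idea, here is a minimal sketch, assuming documents arrive as raw JSON strings and are parsed with System.Text.Json. The `CustomerAdapter` class and its `FromJson` method are hypothetical names introduced for this example; a real DAL would plug equivalent logic into whatever untyped object its driver returns.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

public class Customer
{
    public string? Name { get; set; }
    public string? Email { get; set; }
    public List<string> PhoneNumbers { get; set; } = new List<string>();
}

// Hypothetical adapter (not from the article): reads a raw document and
// upgrades the old single-phone shape to the new list shape.
public static class CustomerAdapter
{
    public static Customer FromJson(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object)
        {
            throw new JsonException("Expected a JSON object for a customer.");
        }

        var customer = new Customer
        {
            Name = GetStringOrNull(root, "Name"),
            Email = GetStringOrNull(root, "Email"),
        };

        if (root.TryGetProperty("PhoneNumbers", out JsonElement numbers)
            && numbers.ValueKind == JsonValueKind.Array)
        {
            // New shape: copy every string entry of the array.
            foreach (JsonElement item in numbers.EnumerateArray())
            {
                if (item.ValueKind == JsonValueKind.String)
                {
                    customer.PhoneNumbers.Add(item.GetString()!);
                }
            }
        }
        else if (root.TryGetProperty("PhoneNumber", out JsonElement single)
                 && single.ValueKind == JsonValueKind.String)
        {
            // Old shape: wrap the single number in a one-element list.
            customer.PhoneNumbers.Add(single.GetString()!);
        }

        return customer;
    }

    // Returns the named string property, or null if it is absent or not a string.
    private static string? GetStringOrNull(JsonElement obj, string name)
    {
        return obj.TryGetProperty(name, out JsonElement value)
               && value.ValueKind == JsonValueKind.String
            ? value.GetString()
            : null;
    }
}
```

With this in place, both the old `"PhoneNumber"` documents and the new `"PhoneNumbers"` documents come back as a `Customer` with a populated list. The branch on the document's shape is exactly the kind of "if version x, then do y" code described above, and every further model change adds another branch to it.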