MongoDB to Couchbase (Part 3): Data Types
Compare and contrast MongoDB and Couchbase's data types.
In the article on MongoDB and Couchbase database objects, we saw the mapping from databases to buckets, collections to collections, documents to documents, and fields to fields. The data itself is stored within these fields. A {"field": "value"} entry is commonly called a key-value pair: every key has a value, and the value is the data. The value is self-describing under the BSON definition in MongoDB and the JSON definition in Couchbase.
MongoDB uses BSON and Couchbase uses JSON; both are document-oriented and JSON-based. Couchbase uses standard JSON as is, while MongoDB uses BSON (binary JSON), an extended, binary-encoded form of JSON.
As part of data remodeling, while moving from the relational model to the JSON model, you’ll have to consider the data type mapping. In Oracle, you’ll have to create and declare the types of each column explicitly before you load the data or write queries. In Couchbase, you simply conform to JSON syntax and the data type interpretation is automatic and implicit. Here’s the overview of mappings, conversion, and arithmetic on these data types.
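To illustrate what "automatic and implicit" typing means, here is a minimal Python sketch using only the standard json module. It is not Couchbase code; it only shows that a JSON parser infers each value's type from the syntax alone. The field names and values are made up for the illustration.

```python
import json

# A JSON document as plain text; no column types are declared anywhere.
raw = (
    '{"name": "Jane Smith", "id": 12, "amt": 2823.52, '
    '"active": true, "address": null, "tags": ["a", "b"]}'
)

doc = json.loads(raw)

# The parser infers each value's type from JSON syntax alone.
inferred = {field: type(value).__name__ for field, value in doc.items()}
print(inferred)
```

Each field ends up as a string, number, boolean, null, or array purely because of how it was written in the document.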
Here's a sample document within MongoDB.
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "created": ISODate("2019-12-19T06:01:17.171Z"),
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    },
    {
      "type" : "master",
      "cardnum" : "6274-2842-2847-3909",
      "expiry" : "2019-03"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Name" : "Joe Smith"
    },
    {
      "CustId" : "PQR823",
      "Name" : "Dylan Smith"
    },
    {
      "CustId" : "PQR823",
      "Name" : "Dylan Smith"
    }
  ],
  "Purchases" : [
    { "id":12, item: "mac", "amt": Decimal128("2823.52") },
    { "id":19, item: "ipad2", "amt": Decimal128("623.52") }
  ]
}
Let's convert that to a Couchbase document:
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "created": "2019-12-19T06:01:17.171Z",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    },
    {
      "type" : "master",
      "cardnum" : "6274-2842-2847-3909",
      "expiry" : "2019-03"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Name" : "Joe Smith"
    },
    {
      "CustId" : "PQR823",
      "Name" : "Dylan Smith"
    },
    {
      "CustId" : "PQR823",
      "Name" : "Dylan Smith"
    }
  ],
  "Purchases" : [
    { "id":12, "item": "mac", "amt": 2823.52 },
    { "id":19, "item": "ipad2", "amt": 623.52 }
  ]
}
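The conversion above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a migration tool: datetime and decimal.Decimal stand in for the values an application would hold for ISODate and Decimal128 fields, naive datetimes are assumed to be UTC, and the function name to_json_value is hypothetical.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def to_json_value(value):
    """Recursively map application-side values onto plain JSON types."""
    if isinstance(value, datetime):
        # Assumption: a naive datetime is already in UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        value = value.astimezone(timezone.utc)
        # ISO 8601 string with millisecond precision and a trailing 'Z'.
        millis = value.microsecond // 1000
        return value.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
    if isinstance(value, Decimal):
        # JSON has no decimal type; the value becomes an ordinary number.
        return float(value)
    if isinstance(value, dict):
        return {key: to_json_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_json_value(item) for item in value]
    return value


# Stand-in for part of the sample document as read from MongoDB.
mongo_doc = {
    "Name": "Jane Smith",
    "created": datetime(2019, 12, 19, 6, 1, 17, 171000),
    "Purchases": [
        {"id": 12, "item": "mac", "amt": Decimal("2823.52")},
        {"id": 19, "item": "ipad2", "amt": Decimal("623.52")},
    ],
}

couchbase_doc = to_json_value(mongo_doc)
print(json.dumps(couchbase_doc, indent=2))
```

Strings, integers, booleans, and nulls pass through unchanged; only the date and the decimals need explicit handling.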
The MongoDB and Couchbase documents look pretty close, but there are some differences; the date and decimal values are the most common ones. Let's compare each type.
MongoDB | Couchbase |
Data types | Data types https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/datatypes.html |
Stored in BSON format. BSON is MongoDB's extended, binary representation of JSON. It follows the JSON model but adds types such as Double, Date, Timestamp, and Code, plus some internal types. Here we'll only discuss the types directly used by the application. | Stored in JSON format. Physical storage compresses the JSON using Snappy compression. |
BASIC TYPES | |
{salary: 88345.4299 } | {"salary": 88345.4299 } |
32-bit and 64-bit signed integers, in addition to the Double type noted above. | Numbers can be stored in integer, decimal, floating-point, and scientific notation. |
MongoDB supports the IEEE 754 decimal128 data type and provides various operators to convert to and from it. Insert as: {"price" : NumberDecimal("2.098")} Shows as: {price: Decimal128("2.098")} | Does not support a decimal data type. You can store decimals as JSON numeric values, but perform the arithmetic in the application before storing the result so that you get the exact scale and precision. For example, you can use BigDecimal in Java. {"price" : 2.098 } |
UTF-8 encoded string {"company": "MongoDB"} | String This is the same as MongoDB. A string in JSON uses UTF-8 encoding. {"company": "Couchbase"} |
Date Uses the ISO 8601 form. The Mongo shell provides a special function, ISODate(), to encode it, and the value is stored in the BSON date format. Insert as: {created: ISODate("2012-12-19T06:01:17.171Z")} Shows up as: {created: ISODate("2012-12-19T06:01:17.171Z")} | Date as a special data type is unavailable. The best practice is to store the date as a string in ISO 8601 format and use the rich set of functions to extract, manipulate, and do arithmetic on the date value. All of the common date functionality is supported. {"created": "2012-12-19T06:01:17.171Z"} |
Timestamp A 64-bit value based on epoch time: t is the number of seconds elapsed since Jan 1st, 1970, and i is an incrementing ordinal for operations within the same second. It is mostly for internal MongoDB use. {"$timestamp":{"t":1565545664,"i":1}} | Timestamp isn't supported directly, but a rich set of functions can convert milliseconds since Jan 1st, 1970 to other formats and vice versa. In the JSON document, it's simply stored as a numeric value. |
Every document in MongoDB is identified by either a user-generated or a system-generated _id. When the user does not specify a value for the _id field, the system generates an ObjectId automatically for the document. This value is stored within the document, and an index is automatically created on this field. The user does not have the option to drop it. {_id: ObjectId("622d7fbe57fe91991b9340f5")} | Document Key Each Couchbase document must have a user-generated document key string that's unique per collection. If you don't care about the structure of the document key, you can simply generate the key using UUID() or an equivalent function. Uniqueness is enforced during insert, and the key cannot be changed afterward. The document key resides outside the document. The document can be accessed directly via the API or N1QL if you know the document key. INSERT INTO t(KEY, VALUE) VALUES("cx:123:PR", {"a":1, "b":55}); INSERT INTO t(KEY, VALUE) VALUES(UUID(), {"a":1, "b":55}); |
NULL These are known unknowns: a field defined within the document but with an unknown value. {"address": null} | NULL This is defined by JSON and means the same thing as in MongoDB. {"address": null} |
No definition of missing. | Since JSON has a flexible schema, querying a field not present in the document won't give an error. However, you can check whether the field has an unknown value or the field itself is missing by using the IS MISSING predicate. To account for MISSING, Couchbase defines a 4-valued boolean logic (TRUE, FALSE, NULL, MISSING). |
COMPLEX TYPES | |
JSON (and BSON) as a whole represents an object and is an object. These objects can contain other objects; nesting is a basic property. { _id: "SF:QB:13", team: "49ers", name: { first: "joe", last: "montana", superbowls: [ 1982, 1985, 1989, 1990 ] } } | This is the same as MongoDB. A JSON document itself is an object. You can nest an object within another object. You can have objects of scalars, objects of objects, objects of arrays, and objects of arrays of objects. This ability to nest objects without pre-planning while creating a collection gives Couchbase and MongoDB collections their "flexible schema" ability. Document key: "SF:QB:13" { "team": "49ers", "name": { "first": "joe", "last": "montana", "superbowls": [ 1982, 1985, 1989, 1990 ] } } |
Array is more of a data structure than a data type. An array can contain zero, one, or more values of any type: string, double, object, etc. Arrays can contain other arrays, forming an n-dimensional matrix. The array type makes the document model truly flexible and extensible. It is also a challenge to query and index while meeting demanding SLAs. {mytoys: []} {mytoys: ["lego", "cp360", 4842, 85]} {mytoys: [[1, 2], ["a", "y", "b"], [42, "xyz"]]} {mytoys: {x: [1, 5, 2, 4]}} {mytoys: [{a:1, b:2}, {a:8, c:43}, {a:"x", b:23}]} | Arrays in MongoDB and standard JSON are exactly the same. Couchbase provides more advanced indexing and optimization features for arrays; this will be discussed later in the series. {"mytoys": []} {"mytoys": ["lego", "cp360", 4842, 85]} {"mytoys": [[1, 2], ["a", "y", "b"], [42, "xyz"]]} {"mytoys": {"x": [1, 5, 2, 4]}} {"mytoys": [{"a":1, "b":2}, {"a":8, "c":43}, {"a":"x", "b":23}]} |
TYPE RELATED FEATURES | |
Type comparison and sort order Null has the lowest value and RegularExpression has the highest value among these types. In order: Null, Numbers, String, Object, Array, BinData, ObjectId, Boolean, Date, Timestamp, RegularExpression. | Type comparison and sort order (collation) MISSING has the lowest value and Object has the highest value among the types when comparing documents. In order: MISSING, NULL, FALSE, TRUE, number, string, array, object. |
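The Couchbase ordering and the NULL versus MISSING distinction from the table can be mimicked in application code. The Python sketch below is a simplified illustration, not Couchbase's implementation: the MISSING sentinel and the collation_key function are hypothetical names, strings are compared by code point, and objects are compared by their sorted key-value pairs, which may differ from the rules N1QL applies within a type.

```python
MISSING = object()  # sentinel: the field is absent from the document


def collation_key(value):
    """Return a sort key following the order
    MISSING, NULL, FALSE, TRUE, number, string, array, object."""
    if value is MISSING:
        return (0, 0)
    if value is None:
        return (1, 0)
    if value is False:
        return (2, 0)
    if value is True:
        return (3, 0)
    if isinstance(value, (int, float)):
        return (4, value)
    if isinstance(value, str):
        return (5, value)
    if isinstance(value, list):
        return (6, tuple(collation_key(item) for item in value))
    if isinstance(value, dict):
        # Simplification: compare objects by their sorted key-value pairs.
        pairs = sorted((key, collation_key(item)) for key, item in value.items())
        return (7, tuple(pairs))
    raise TypeError(f"not a JSON value: {value!r}")


docs = [
    {"id": 1, "address": "12 Main St"},
    {"id": 2, "address": None},      # NULL: field present, value unknown
    {"id": 3},                       # MISSING: field absent
    {"id": 4, "address": {"city": "SF"}},
    {"id": 5, "address": 42},
    {"id": 6, "address": ["a"]},
    {"id": 7, "address": True},
]

# dict.get with the sentinel separates "absent" from "present but null".
ordered = sorted(docs, key=lambda d: collation_key(d.get("address", MISSING)))
ids = [d["id"] for d in ordered]
print(ids)
```

Document 3 (field absent) sorts before document 2 (field present but null), which is the distinction the IS MISSING predicate exposes in a query.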
Published at DZone with permission of Keshav Murthy, DZone MVB. See the original article here.