Building a Scalable E-Commerce Data Model
Starting to outgrow small business e-commerce software? Use this as a starting point for building an e-commerce data model that’s scalable, flexible, and fast.
Join the DZone community and get the full member experience.
Join For FreeIntroduction
If selling products online is a core part of your business, then you need to build an e-commerce data model that’s scalable, flexible, and fast. Most off-the-shelf providers like Shopify and BigCommerce are built for small stores selling a few million dollars in orders per month, so many e-commerce retailers working at scale start to investigate creating a bespoke solution.
This article will look at what it takes to start building this infrastructure on your own. What are some of the areas to consider? What might the data model look like? How much work is involved?
Note: A full summary diagram of the e-commerce data model is at the end of the article.
Who Are Your Customers?
First, you need to consider who will be purchasing items from your e-commerce application. How might you model customer information in a database as a result? You’ll probably want to have basic information like your customer's name, email address, etc. Do you want your customers to be able to create a profile in your system? Or just fill out a form each time they want to purchase something?
Just starting out, a basic model might look like this:
If you want your customers to have a persistent profile, then you need to build some way for them to log in to your application. Moving forward with more real-world requirements, you might also want to keep track of their login attempt history and password history.
You might also want to consider whether your customers are part of a large organization; and, if so, how would they like to handle password resets? Do they need single sign-on or OAuth support?
Deep Dive: Addresses
Did you notice there’s no address tied to a customer in any of the data models shown so far? It might be your first inclination to include a customer’s address as part of those models. However, most customers will have multiple addresses and multiple kinds of addresses, like billing and shipping. B2B retailers might also have to consider multiple delivery locations based on the number of warehouses and offices they support.
What happens if the billing and shipping address are different? Well, you’ll need to do more than just add extra columns to the Customer
table! It’s not that simple.
So how does storing a billing address affect the scalability of your application?
If you were to split the payment and shipping areas into separate (micro)services, each having their own database, then putting billing and payment addresses into the Customer
area would lead to having “chatty” services. This is a well-known design smell when building microservices.
To avoid this issue, you’re better off putting the addresses within the appropriate area/service that requires them, but with that, your data model becomes more complex.
One way to avoid much of this complexity is to consider an API-first software provider. With this software, you can integrate the OMS into your data model without spending months of engineering time.
How Do You Organize Products and Catalog?
The first thing you see when you enter a store (either in-person or digitally) are products ready for you to purchase, and they are usually displayed with some thought for how you might be likely to shop.
For an e-commerce web application, you will probably want to highlight things like:
- Best-selling products
- Trending products
- New products
- The ability to browse products by search criteria
Providing customers with that information means you first need to keep track of a lot of data about your products: their prices, historical purchase data, and so on.
Let’s see what a “first shot” at creating a data model for a product catalog might look like:
Here’s a Product
table with some basic information, like a product’s name, SKU, and price. The product is also linked to another table representing various categories that the product is associated with. You might also strategically add indexes and full-text search to the Product
table to enable site visitors to efficiently search for various products.
This is a decent first attempt. However, to get an even more realistic and useful e-commerce product catalog, you’ll need to support more requirements such as:
- Tracking pricing history so site administrators can analyze trends in product pricing
- Supporting related products to display on a product’s page
- Incorporating product vendors so customers can view all products sold by an individual vendor/company
To address those extra requirements, you might end up with the following data model:
This model still isn’t perfect, as it embeds your prices into the product itself, but at least it lets you maintain a previous pricing history.
Another option is to integrate your e-commerce store with an API-first software provider like Fabric that handles pricing for you. This will let you roll out different prices to different users based on their intent, location, cart, or order history.
Deep Dive: Pricing
While the more complex product data model still has a product’s price in the same table, this may not be the best thing to do in a real large-scale application.
Consider that your organization has various departments, such as inventory/warehousing, sales, marketing, customer support, etc. You might have dedicated systems that allow merchandisers to change the price of an item since they are the experts in determining how much a product should sell for. Similar to the considerations with a customer’s billing and shipping addresses, this would lead to cross-boundary/service communication if we left the price in the core Product
table.
Therefore, you might want to store product prices under the data stores that the sales department owns. But don’t forget, there are many different kinds of “prices” that haven’t been taken into consideration yet, including:
- Price (cost) when purchasing stock from vendors
- Customer sale price
- Discounted sale prices
- Manufacturer’s suggested retail price
Handling all these in the context of your organizational structure would require even more exploration and complexity in your data model. While your engineering team could likely accomplish this task, it’s going to take time. Using ready-made solutions can shave weeks or months off your e-commerce data modeling timeline.
How Do You Streamline Orders?
Now that you have customers in your database and products available to purchase, you’ll need to think about how to design the order-taking process and data model.
The process of placing an order might look something like this:
- A customer places products into their cart while browsing
- The customer decides they want to purchase the products that are in their cart
- They proceed to purchase the order
- The customer gets an emailed receipt or confirmation number
However, it’s rarely so simple. Placing orders can be deceptively tricky as there are many moving parts:
- Products
- An active cart
- Cart converted into an order
- A finalized order with confirmation
If you were to look at a simple data model for order placement, it might look something like this:
Notice that each row in the ShoppingCartItem
table contains the “captured” price of the product. When the customer puts an item into their shopping cart should the price at that moment be “locked-in?" If so, for how long?
Note: How the price functions is a business requirement that would need to be discussed with your product owners, and so on, as mentioned in the "Deep Dive: Pricing" section earlier.
The same question applies to an unpaid order. If a customer has ordered a discounted item, should they be able to keep the promise of that discounted price forever until they pay? Or does it expire?
Other questions to consider for an orders data model might include:
- Are you tracking analytics on orders?
- What happens if a customer returns a defective item?
- Should you handle shipping within the same data model or have a dedicated shipping context/schema?
With some of these concerns in mind, you might end up with a data model that looks more like this:
Some things to take note of in this more complex orders model:
ShoppingCartItem
now supports an expiration date for a locked-in priceShoppingCartHistory
tracks when items are added, removed, etc.- An order item may be returned (this still does not handle cases where 1 out of X items of the same product are returned)
- An order may have multiple shipments (eg, how Amazon will sometimes split an order up into multiple packages/shipments)
This article also hasn’t even touched the surface of using alternative data storage methods like JSON documents or event sourcing!
Conclusion
To help you see how all the pieces fit together, here are all the diagrams shown together. I’ve removed a number of links/lines to the Customer
table to increase readability:
As I mentioned above, this article still doesn’t even cover many of the basics like payment processing and invoicing. Beyond the features covered here, you might eventually require more advanced features like:
- Coupon codes
- Taxes
- Third-party integrations with OAuth providers, other retailers, or partners
- Shipment tracking notifications
Building a data model for an e-commerce application, as you can see, is not so simple. What looks upfront to be a straightforward set of database tables is not so simple once you dig into real-world requirements.
Published at DZone with permission of James Hickey. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments