see also: part i: when to build your data warehouse part ii: building a new schema part iii: location of your data warehouse part iv: extraction, transformation, and load in part i we looked at the advantages of building a data warehouse independent of cubes/a bi system and in part ii we looked at how to architect a data warehouse’s table schema. in part iii, we looked at where to put the data warehouse tables. in part iv, we are going to look at how to populate those tables and keep them in sync with your oltp system. today, our last part in this series, we will take a quick look at the benefits of building the data warehouse before we need it for cubes and bi by exploring our reporting and other options. as i said in part i, you should plan on building your data warehouse when you architect your system up front. doing so gives you a platform for building reports, or even application such as web sites off the aggregated data. as i mentioned in part ii, it is much easier to build a query and a report against the rolled up table than the oltp tables. to demonstrate, i will make a quick pivot table using sql server 2008 r2 powerpivot for excel (or just powerpivot for short!). i have showed how to use powerpivot before on this blog , however, i usually was going against a sql server table, sql azure table, or an odata feed. today we will use a sql server table, but rather than build a powerpivot against the oltp data of northwind, we will use our new rolled up fact table. to get started, i will open up powerpivot and import data from the data warehouse i created in part ii. i will pull in the time, employee, and product dimension tables as well as the fact table. once the data is loaded into powerpivot, i am going to launch a new pivottable. powerpivot understands the relationships between the dimension and fact tables and places the tables in the designed shown below. i am going to drag some fields into the boxes on the powerpivot designer to build a powerful and interactive pivot table. for rows i will choose the category and product hierarchy and sum on the total sales. i’ll make the columns (or pivot on this field) the month from the time dimension to get a sum of sales by category/product by month. i will also drag in year and quarter in my vertical and horizontal slicers for interactive filtering. lastly i will place the employee field in the report filter pane, giving the user the ability to filter by employee. the results look like this, i am dynamically filtering by 1997, third quarter and employee name janet leverling. this is a pretty powerful interactive report build in powerpivot using the four data warehouse tables. if there was no data warehouse, this pivot table would have been very hard for an end user to build. either they or a developer would have to perform joins to get the category and product hierarchy as well as more joins to get the order details and sum of the sales. in addition, the breakout and dynamic filtering by year and quarter, and display by month, are only possible by the dimtime table, so if there were no data warehouse tables, the user would have had to parse out those dateparts. just about the only thing the end user could have done without assistance from a developer or sophisticated query is the employee filter (and even that would have taken some powerpivot magic to display the employee name, unless the user did a join.) of course pivot tables are not the only thing you can create from the data warehouse tables you can create reports, ad hoc query builders, web pages, and even an amazon style browse application. (amazon uses its data warehouse to display inventory and oltp to take your order.) i hope you have enjoyed this series, enjoy your data warehousing.
In Part I we looked at the advantages of building a data warehouse independent of cubes/a BI system and in Part II we looked at how to architect a data warehouse’s table schema. Today we are going to look at where to put your data warehouse tables. Let’s look at the location of your data warehouse. Usually as your system matures, it follows this pattern: Segmenting your data warehouse tables into their own isolated schema inside of the OLTP database Moving the data warehouse tables to their own physical database Moving the data warehouse database to its own hardware When you bring a new system online, or start a new BI effort, to keep things simple you can put your data warehouse tables inside of your OLTP database, just segregated from the other tables. You can do this a variety of ways, most easily is using a database schema (ie dbo), I usually use dwh as the schema. This way it is easy for your application to access these tables as well as fill them and keep them in sync. The advantage of this is that your data warehouse and OLTP system is self-contained and it is easy to keep the systems in sync. As your data warehouse grows, you may want to isolate your data warehouse further and move it to its own database. This will add a small amount of complexity to the load and synchronization, however, moving the data warehouse tables to their own table brings some benefits that make the move worth it. The benefits include implementing a separate security scheme. This is also very helpful if your OLTP database scheme locks down all of the tables and will not allow SELECT access and you don’t want to create new users and roles just for the data warehouse. In addition, you can implement a separate backup and maintenance plan, not having your date warehouse tables, which tend to be larger, slow down your OLTP backup (and potential restore!). If you only load data at night, you can even make the data warehouse database read only. Lastly, while minor, you will have less table clutter, making it easier to work with. Once your system grows even further, you can isolate the data warehouse onto its own hardware. The benefits of this are huge, you can have less I/O contention on the database server with the OLTP system. Depending on your network topology, you can reduce network traffic. You can also load up on more RAM and CPUs. In addition you can consider different RAID array techniques for the OLTP and data warehouse servers (OLTP would be better with RAID 5, data warehouse RAID 1.) Once you move your data warehouse to its own database or its own database server, you can also start to replicate the data warehouse. For example, let’s say that you have an OLTP that works worldwide but you have management in offices in different parts of the world. You can reduce network traffic by having all reporting (and what else do managers do??) run on a local network against a local data warehouse. This only works if you don’t have to update the date warehouse more than a few times a day. Where you put your data warehouse is important, I suggest that you start small and work your way up as the needs dictate.
The Decorator pattern is a great fit for modifying the behaviour of a microservice. Native language support can help with applying it quickly and modularly.
We will discuss file uploads with the help of the IFormFile Interface and others provided by .NET and step-by-step implementation using .NET Core 6 Web API.
Discover the key differences between Databricks and Snowflake around architecture, pricing, security, compliance, data support, data protection, performance, and more.
Learn what data ingestion is, why it matters, and how you can use it to power your analytics and activate your data as an essential part of the modern data stack.
This article will help automate the process of creating and configuring an Amazon Aurora Postgres Global Database. It also describes ways to handle fail-over scenarios.
The widely adopted SCM tools we use today, GitHub and Gitlab, are built on the dated architecture and design of git, but this has some security gaps we'll explore.
Continue on a journey into the world of cloud-native observability: go out onto the playing field to understand who the players are and what teams they form.
Explore the difference between two well-known Auth building methods: AWS Cognito and JSON Web Token. Plus, take a look at the AWS Cognito application process.
Gain context of three separate API solutions which can be used to validate name input fields within an application as you follow along with this demonstration.