I’ve hear this question many times; “Do you think we need a date dimension?”. some years ago, date dimension has been used more, but nowadays less! The main reason is that there are some tools that are using their own built-in date hierarchy. As an example Power BI has a built-in date hierarchy that is enabled by default on top of all date fields which gives you a simple hierarchy of year, quarter, month, and day. Having this hierarchy brings the question most of the time that do I need to have a date dimension still? of I can simply use this date hierarchy? In this post I’ll answer this question and explain it with reasons. Majority of this post is conceptual and can be used across all BI tools, however part of it is Power BI focused. If you like to learn more about Power BI, read Power BI online book from Rookie to Rock Star.
What is Date Dimension?
Date Dimension is a table that has one record per each day, no more, no less! Depends on the period used in the business you can define start and end of the date dimension. For example your date dimension can start from 1st of Jan 1980 to 31st of December of 2030. For every year normally you will have 365 records (one record per year), except leap years with 366 records. here is an example screenshot of a date dimension records;
Date Dimension is not a big dimension as it would be only ~3650 records for 10 years, or even ~36500 rows for 100 years (and you might never want to go beyond that). Columns will be normally all descriptive information about date, such as Date itself, year, month, quarter, half year, day of month, day of year….
Date Dimension will be normally loaded once, and used many times after it. So it shouldn’t be part of your every night ETL or data load process.
Why Date Dimension?
Date Dimension is useful for scenarios mentioned below;
- Ability to slice and dice by many date attributes such as week number, half year, day of year, and etc.
- Consistency in Analysis
- Ability to do analysis based on public holidays (Easter Monday, Good Friday, and etc)
- Some BI tools extended functions need to work with a Date Dimension
Let’s look at reasons one by one;
Powerful Slice and Dice
If you have worked with Power BI you know that there is a date hierarchy built-in which gives you fields such as Year, Quarter, Month, and Day. This hierarchy normally adds automatically to date fields. here is an example of how it looks like;
This built-in hierarchy is very good help, but what about times that you want to slice and dice date fields based on something other than these 4 fields? for example questions that you cannot answer with this hierarchy are;
- Weekday analysis of revenue (by Monday, Tuesday, …., Sunday)
- Which week in the year generates least revenue or most revenue?
- What about Fiscal? What is the revenue for Fiscal year?
- and many other questions.
The fact is; when you use the simple hierarchy of Year, Quarter, Month, and Day, you are limited to only slice and dice by these fields. If you want to step beyond that, then you have to create additional fields to do it. That’s when a date dimension with all of these fields is handy. Here are some fields that you can slice and dice based on that in Date Dimension (and this is just a very short list, a real date dimension might have twice, three times, or 4 times more fields than this! Yes more fields means more power in slicing and dicing)
You can answer questions above without a date dimension of course, for example; to answer the first question above, you will need to add Weekday Name, because you would need to sort it based on Weekday number, so you have to bring that as well. To answer second question you need to bring Week number of year. For third question you need to bring fiscal columns, and after a while you will have heaps of date related columns in your fact table!
And worst part is that this is only this fact table (let’s say Sales fact table). Next month you’ll bring Inventory fact table, and the story begins with date fields in Inventory fact table. This simply leads us to the second reason for date dimension; Consistency.
Consistency in Analysis
You don’t want to be able to slice and dice your Sales data by week number of year, but not having this feature in Inventory data! If you repeat these columns in all fact tables you will have big fact tables which are not consistent though, if you add a new column somewhere you have to add it in all other tables. What about a change in calculation? Believe me you never want to do it.
Date Dimension on the other hand is a dimension that is shared between all fact tables. You add all fields and calculations here, and fact tables are only related to this. This is a consistent approach. Here is a simple view of a data warehouse schema with a date dimension shared between two fact tables.
Public Holidays Insight
In most of businesses nowadays public holidays or special event’s insight is an important aspect of reporting and analysis. some of analysis are like;
- Was the sales in Easter this year better than last year? (you don’t want to calculate Easter in Power BI of course)
- How is the revenue is public holidays compared with weekends?
- How was the sales for store in opening day? (special event)
- and many other questions….
One of the best use cases of date dimension is public holidays and special events. I have shown previously in another post how to fetch public holidays live in Power BI. Calculating some of public holidays are time consuming and not easy. Easter for example changes every year. You can also have some fields in date dimension that mention what the special day is (opening, closing, massive sale….). Your date dimension can be enhanced a lot with insight from these fields. Here is an example of public holidays information in a date dimension;
Extended Functions for Time Intelligence
BI Tools in the market has some extended functions that gives you insight related to date, named as time intelligence functions. For examples, talking about Power BI, you can leverage heaps of DAX functions, such as TotalYTD to calculate year to date, ParallelPeriod to find the period parallel to the current period in previous year or quarter, and many other functions. These functions sometimes need to work with a date dimension, otherwise they won’t return correct result!
Yes, you’ve read it correctly, DAX time intelligence functions won’t work properly if you don’t have a proper date dimension. proper date dimension is a table that has a full date column in one of the fields, and has no gaps in dates. This means that if you are using Sales transaction table which has an OrderDate field as an input to DAX time intelligence functions, and if there is no sales transaction for 1st of January, then the result of time intelligence functions won’t be correct! It won’t give you an error, but it won’t show you the correct result as well, this is the worst type of error might happen. So having a Date Dimension can be helpful in this kind of scenarios as well.
In this post I’ve explained how to use some of DAX time intelligence functions in Power BI.
Do I Need a Date Dimension?
Now is the time to answer the question; “Do I need a date dimension”? the answer is: Yes, IMHO. Let me answer that in this way: Obviously I can do a BI solution without date dimension, but would you build a table without hammer?! Would you build a BI solution that has some date columns in each fact table, and they are not consistent with each other? would you overlook having special event or public holidays dates insight in your solution? would you only stick to year, quarter, month, and date slicing and dicing? would you like to write all your time intelligence functions yourself? Obviously you don’t. Everyone has the same answer when I explain it all.
I have been working with countless of BI solutions in my work experience, and there is always requirement for a Date Dimension. Sooner or later you get to a point that you need to add it. In my opinion add it at the very beginning so you don’t get into the hassle of some rework.
How to Build a Date Dimension?
Now that you want to build a date dimension, there are many ways to build it. Date Dimension is a dimension table that you normally load once and use it always. because 1st of Jan 2017 is always 1st of Jan 2017! data won’t change, except bringing new holiday or special event dates. I mention some of ways to build a date dimension as below (there are many more ways to do it if you do a bit of search in Google);
Date Dimension T-SQL Script
You can use a written T-SQL Script and load date dimension into a relational database such as SQL Server, Oracle, MySQL, and etc. Here is a T_SQL example I have written few years ago.
Power Query Script
I have written a series of blog posts explaining how you can do the whole date dimension in Power Query. the blog post is a bit old, I will write a new one in near future and will show an easier way of doing this in Power BI.
- Date Dimension in Power Query – Part 1: Calendar Columns
- Date Dimension in Power Query – Part 2: Fiscal Columns
- Date Dimension in Power Query – Part 3: Public Holidays
Fetch Public Holidays
I have written another blog post that mentioned how to fetch public holidays from internet and attach it to a date dimension in Power BI.
DAX Calendar Functions
You can use Calendar() function in DAX to generate a date dimension very quickly as well.
Using Date Dimension Multiple Times?
Using a dimension multiple times in a data warehouse called Role Playing dimension. I have explained how to do role playing dimension for DimDate in this blog post.