I often get this question: “Is it better that I create this calculation as a snapshot table in Power Query (or T-SQL, or the data source), or write DAX measures for it?” If you ever had a scenario of creating a report of snapshots, you know what I mean. “Should I have a table for inventory stock on hand values for each day, or should I calculate those values on the fly?”. Answering this question is an important Power BI modeling step. In this article, I’ll answer that question.
What is a Snapshot Fact Table?
The snapshot fact table is a table that each row of it represents a snapshot. The snapshots are normally time or date-based, plus other fields that might be your dimensions. The difference between the snapshot fact table and a transactional fact table is that the transactional fact table records the “happening of an action”, such as a sales transaction. The snapshot fact table, however, would have snapshot records, even if there is no action happening.
Why do we need snapshot tables?
For many scenarios, you need to calculate the values based on the snapshots of the time. For example, you are interested to see how many open tickets you have, not just for now, but also for yesterday, and the day before.
Snapshot based calculations are very popular in many business cases. That is why there is a type of fact table dedicated to it.
Grain of the Snapshot
The granularity of the snapshot fact table is based on the snapshot periods required. For example, if you want to get open orders every day, then your snapshot should have one record per day. Even if there is no order on a particular date.
The snapshots are also accompanied by other dimensions. For example, you are not only interested to know the number of open orders in each date. You are interested to see how many open orders you had for each product. This means your snapshot table, should have a column for product, and a column for the date.
Size of the snapshot table
The snapshot tables are among the biggest table in your data warehouse. The reason for that is the granularity of the table, and the cross join of the different cardinality of the items used in the grain of that table.
For example, if we are looking for a daily snapshot, for a period of 10 years, in a system with 10 warehouses, and 1000 products, this is how you can calculate the number of rows in the snapshot table:
(10 years)*(365 days in each year)*(10 warehouses)*(1000 products)=36,500,000 rows (36.5 million rows)
Storing more detailed granularity in the snapshot table, results in a bigger snapshot table.
How to create snapshot tables
Snapshot tables are pre-calculated. The technology and method that you use for populating these tables is totally depends on your skill set. Some people are good with T-SQL, and they create it with SQL scripts and stored procedures or views. Many in the world of Power BI, create the snapshot tables using Power Query. You can even use DAX expressions to create that as a calculated table.
In the process of populating these tables, normally a loop structure is involved. In each iteration of the loop based on the values of date and other dimensions, the calculations of values will be done, and then moving to the next iteration until all rows populated.
Alternative: DAX Measures
DAX measures are fantastic alternative for snapshot fact tables. Because in Power BI measures are calculated dynamically, they can replace the need for the snapshot fact table. I have written an article explaining one of those scenarios to calculate stock on hand without a snapshot fact table;
Which one? DAX measures or the Snapshot fact table
Now let’s answer the question that I posed at the begining of this article; which one is better? having a snapshot fact table with pre-calculated values, or dynamically calculating these values using DAX measures?
The short answer to the question above is: It depends. It is not one hat fits everyone. It really depends on the scenario. It depends on what you need. Let me give you a couple of examples:
Big Period Snapshots, Pre-Calculated
Let’s assume you are not interested in daily snapshots (or your report user isn’t), you want yearly snapshots. And this yearly snapshot is probably only at the product category level, not every single product level.
This means the size of our snapshot table would be very small.
(10 years)*(1 row per year)*(7 product categories)=70 rows
The size of our snapshot table in this scenario would be only 70 rows. This is such a small size for a table, which won’t really impact the model at all. You can easily add a table with that size to your model without even noticing a performance difference.
In a scenario like this, I’d recommend creating the snapshot table as a pre-calculated object.
Detailed snapshots, Dynamic
Now let’s assume that you need a very detailed snapshot. One snapshot per day, per warehouse, per product etc. And you want to do this calculation based on a selected list of slicer values in the report dynamically.
As you already seen in the calculation for the size of snapshot table, this table can easily be over 30 million rows with 10 warehouses and 1000 products over a period of 10 years. and if you have 20 warehouses, this will be 70 million rows table.
If you have only a few values to calculate over each iteration, then it makes sense to do those calculations using DAX measures dynamically rather than pre-calculation. It will save you a lot of space, and you will have a better performance.
How to choose the right method?
The two examples above are only two scenarios out of thousands of available scenarios. You should be able to figure out what is the best answer and method for you. So I’ll explain the pros and cons of each model. You can then decide best for your scenario.
DAX Measures Dynamically
- You don’t need a huge snapshot fact table
- calculations are dynamic, and work simply based on the context of your report.
- faster dataset refresh time (because building the snapshot tables might be a slow transformation process)
- slower report performance (if you have too many measures)
- snapshots are only accessible from Power BI (because they are written in DAX)
Pre-calculated Snapshot Fact Table
- Values are pre-calculated, report performance is high
- If snapshots are needed sometimes even from the data source level, it is possible. (depends on how/where you create the snapshot table)
- Creating a huge table, consuming space
- bringing the performance of the model down if the table size is so big and no aggregations have been used
- slower dataset refresh time
Important factors to consider when deciding
Based on the above lists of pros and cons, the important factors to consider are two main things below
The size of the snapshot table
Take a moment and calculate what would be the size of your snapshot table. If the size is going to be big, then think about creating that table more carefully. DAX measure might be a far better option if your table is going to be very big. Consider that every day that table grows bigger and bigger.
The number of measures
You need to also consider how many calculations are needed at every snapshot level. If you need for example 20 different calculations for every snapshot, this might not perform well as a dynamic operation as a measure.
Implement a solution that works
It is easier said that done. However, you can check it for yourself, If you tried one method and didn’t get a good performance out of it, it definitely worth trying the other one. A method that works isn’t a method that shows the calculation results after the user waiting for five minutes, it is something reasonable and workable for users.
Answering a snapshot question is depends on each scenario. You can’t find a single method that works for everyone. Based on your requirements, you need to assess the pros and cons mentioned in this article and find the right solution for you. If you can’t decide still, ping me in the comments below or using the contact us option and I’ll be more than happy to help.