Calculated tables first introduced in September 2015 update of Power BI Desktop. The name speak for itself; these are tables created by calculation. As these are in-memory tables their calculation is based on DAX (Data Analysis eXpression language). There are many benefits of using Calculated tables, such as using them for role playing dimensions (for example having more than one date dimension in a model), There are some DAX functions and expressions that returns a table as a result, and using them as table in your model sometimes is really helpful. for example you might want to create a table for top 10 customers and then use that as the main source table in many reports. In this post I’ll explain to you some use cases of calculated tables.
For running examples of this post you need to install Adventure Works DW database on SQL Server, or you can download Excel version of it from here:
Role Playing Dimension
The very first functionality that appears in mind when we talk about Calculated Tables is ability to create role play dimensions. Role Play dimensions are dimensions with the same structure and data rows that plays different roles in our data model. For example Date Dimension is a generic dimension. However in a sales transaction table you might have more than one date columns to relate with the date dimension. in example below we have three date fields in FactInternetSales table: Order Date, Ship Date, and Due Date.
There three fields should be related to three different date dimensions (because tabular model which Power BI is based on that, doesn’t support role playing dimensions built-in). So what you can do is to just load the date dimension once in Get Data section from the SQL Server (Here is the T-SQL script for creating date dimension), or by a blank query from Power Query script (Here is the Power Query script for creating the date dimension). Here is the example date dimension loaded in Power BI Desktop (through Get Data):
Now I can create role playing dimensions with creating a Calculated table:
and this will create a table in memory for me and allows me to write definition of the table
The language for table definition is DAX (Data Analysis eXpression). If you want to know more about DAX you have to read DAX section of the Power BI book. For now let’s keep it simple to see how it works in action. We want an exact copy of the DimDate table here. so I can simply use ALL function in DAX as below:
Ship Date = ALL(DimDate)
As soon as I type the expression above and press Enter I’ll see the result underneath it as data rows, and also list of columns in the Fields pane. I’ve created my role dimension as simple as that. Now I can set up the relationship;
For the relationship above I’ve created also a Due Date dimension, and renamed the original DimDate to Order Date.
In-Memory Structure, Less Refresh Time
Calculated table loads into memory, so your Power BI file size will increase. However you don’t need to read them again from the external data source. Yes, you can create multiple views in the source database and connect to them through Get Data section with Power Query, however their data need to populate from the source database every time a refresh happens (either scheduled or manual).
WITHOUT Calculated Tables: Here is example of three date tables loaded from the external data source:
WITH Calculated Tables: and here is only one date dimension loaded (for the role playing dimension example above):
As you can see this is much more efficient in terms of reducing the refresh time. However the memory consumption would be the same in both methods.
Date dimension was a narrow data set example. You might need role playing for big data tables, so Calculated tables will save you a lot of time in refreshing data in such case.
DAX Table Functions
There are some DAX functions that return a table. For example ALL function which I used in the role playing sample above. ALL was a simple example of a DAX function that returns the whole copy of the source table. Let’s have a look at some other examples and see how it works in other scenarios.
Top 100 Customers as a Calculated Table
There are many examples that a business considers top 10 or top 20 customers and filter down whole dashboard and set of reports only for them. Usually the main reason is that top 10, 20 customers will bring majority of revenue to the business. Fortunately there is a TOPN function in DAX which helps us to build such calculation. TOPN function returns a table. With TOPN we can choose how many rows we want in the result set, and the grouping function to be applied (if there is any) and the aggregation (if there is any).
In this example I want to show you how to use calculated table to generate a list of top 100 customers. As a business requirement I want to visualize the total revenue from top 100 customers and compare it with total revenue of the whole business. There might be different ways to calculate and visualize it, but I want to do it with calculated table as a sample scenario.
Summarize is a DAX function that generates a grouped by list from a table. Summarize works similar to Group By in T-SQL. So if I want to create a table with CustomerKeys and their total sales amount, I can write this expression:
Customer Sales = SUMMARIZE(FactInternetSales,FactInternetSales[CustomerKey], "Total Sales", SUM(FactInternetSales[Total Sales]))
Here are details about parameters I passed in the expression above to Summarize function:
- First parameter: Source Table. FactInternetSales is the source table that I want the group by (summarize) operation to be applied on it.
- Second Parameter: Group by Column. CustomerKey in the FactInternetSales table is the column that I want to use as the key for grouping.
- Third parameter: Output Column Name. I named the output calculated column name as Total Sales.
- Forth parameter: Output Column Calculation. Here I write the calculation for the output column, which is simply sum of Total Sales Column.
So as a result I will have a table with CustomerKey and Total Sales.
Now that we have list of customers with their total sales, it is easy to get top 100 customers. I can simply use TOPN function like this to create another calculated table (I could do this example with only one calculated table instead of two, but I only did it with two table to help you understand the logic better);
Top 10 Customers = TOPN(100,'Customer Sales','Customer Sales'[Total Sales],DESC)
And expression above means:
- First parameter: Number of rows to return. 100 for top 100 customers.
- Second parameter: Source Table. Customer Sales is the source of this operation which we want to fetch top 100 customers from it.
- Third parameter: Order By Column. I want to order the table based on Total Sales of each Customer.
- Forth parameter: Order By expression (ASC, or DESC). To get top 100 customers I have to order it by Total Sales DESC.
and here is the result:
I also renamed the Total Sales column to Top Customer Sales (as you see in screenshot above).
Now I can simply build a report in Power BI to show the difference between Total Sales and Top Customer Sales (If you don’t know how to create visualization below follow the visualization chapters of Power BI book):
Great, We’ve used calculated tables for getting some insight out of top 100 customers and compared it with the total. There are many other cases that you can use Calculated Table for. Chris Webb mentioned using Calendar function in his blog post as a calculated table to start building a date dimension.
As any other DAX related limitations, calculated tables very first limitation is memory. You need to have enough memory to use these tables. This limitation is also an advantage on the other hand, because in-memory structure makes these calculations really fast.
The other limitation which I like to mention at this stage is: Not Inheriting Formatting.
By not inheriting formatting I mean the calculated table doesn’t inherit format from the source table. In some complex scenarios where the calculation comes from many tables that might not be necessary. but for our role playing simple example above; If my original date dimension has some formatting configuration. such as setting DateKey to a “Do Not Summarize”, or some other configuration, then I would like to see the same in the calculated table fetched out of this.
The formatting applied on the calculated table columns also will be overwritten after each change in the DAX expression.
Download the Power BI file of demo from here: