Dual Storage Mode; The Most Important Configuration for Aggregations! Step 2 Power BI Aggregations

An aggregated table can be a layer on the source DirectQuery table. This table needs to have a proper relationship set up with other tables and also the proper storage mode configuration. Storage Mode configuration of the composite model is an important configuration and critical part of setting up the aggregation. The storage mode configuration of Import Data and DirectQuery is self-explanatory, but what about Dual Storage mode? I’m going to explain in detail what the Dual mode is. Let’s dig in. This is the third article about aggregations. In the first article, you learned what aggregation is, and in the second article, you learned how to create the aggregated table. This article talks about the next step, which is storage mode configuration in the model. If you want to learn more about Power BI, read the Power BI book from Rookie to Rock Star.

Sample model

In the previous step, we ended up with a data model including tables and relationships below;

Sales Agg table is an aggregated table created in the previous step, and FactInternetSales is a table with DirectQuery source. All of the tables in this model (except Sales Agg) are DirectQuery sourced tables.

What is Storage Mode

Storage mode in Power BI tables is determining where the data of that table is stored and how queries will be sent to the data source. You can find the storage mode of a table either by hovering the mouse on it in the Power BI Desktop in the fields section;

Or by right-clicking on the table and selecting properties;

In the properties pane, you can see the storage mode option drop-down list;

There are three storage modes;

  • Import
  • DirectQuery
  • Dual

Import

Import and DirectQuery are the obvious options in the above list. For example, If a table’s storage mode is Import, then it means the data of that table will be stored in the in-memory storage of the Power BI server (the machine that runs the Power BI engine). Every query to the data will be a query to the in-memory structure, not to the data source.

Suppose I have a table sourced from SQL Server but has the storage mode of Import. Then a copy of that data will be stored in the memory engine of Power BI. Whenever you refresh a visualization in the Power BI report, it will just query the in-memory structure rather than sending the query to the SQL Server data source.

DirectQuery

Tables with the DirectQuery storage mode will keep the data in the data source. FactInternetSales stores the data in the SQL Server data source in our example dataset. If we have a visualization from a table with this storage mode, Power BI will send a T-SQL query to the data source and get the result back. For example, the below visualization in the Power BI report is just a Card visual on the SalesAmount field in the FactInternetSales;

Because this table has the storage mode of DirectQuery, if we run SQL Profiler simultaneously (to capture queries sent to the data source), here is the query captured to the SQL Server database:

Query from Multiple DirectQuery tables

If we have multiple tables with DirectQuery storage mode, the result will be a query sent to the data source for that combination. Here we have a column chart by SalesAmount from FactInternetSales (DirectQuery) and CalendarYear from DimDate (DirectQuery);

And this is the query sent for that:

Caution: Combining DirectQuery and Import

So far, with the above examples, everything was expected. Nothing strange happened. However, combining fields from DirectQuery sourced tables with Import storage mode tables would be strange. As an example, The Sales Agg table is an import table. The whole purpose of creating this table was to query faster from the in-memory structure. If you create a visual that just contains something from the Sales Agg table, no query will be sent to the data source. It all happens in memory. As expected, of course.

And SQL Profiler doesn’t catch any queries sent to the data source;

However, the behaviour is different If we have a visual getting data from the Sales Agg table and a DirectQuery table such as DimDate;

This time, however, we see a T-SQL query tracked in SQL Profiler, querying the DimDate in the SQL Server database.

This is not what we expect to see. The whole purpose of the Sales Agg table is to speed up the process from DirectQuery mode, but we are still querying the DimDate from the database. So, what is the solution? Do we change the storage mode of DimDate to Import? If we do that, what about the connection between DimDate and FactInternetSales? We want that connection to work as DirectQuery, of course.

Now that you learned about the challenge, it is a good time to talk about the third storage mode; Dual.

Dual Storage Mode

The Dual storage mode is built to cover a scenario like the above. With Dual storage mode, one table can act either as DirectQuery or Import respective to the relationship to other tables. Dual storage mode is the secret sauce of Composite mode and aggregation in Power BI. Let’s see how this mode works.

If we change the storage mode of DimDate to Import, we get the issue of sending the query to the data source when we are querying FactInternetSales. If we change the storage mode of DimDate to DirectQuery, then even for Sales Agg, we are still sending the query to the data source. The solution is to change the storage mode of common dimension tables to Dual.

When you set the storage mode of a table to Dual, for example, DimDate, you get a warning that this process may take some time. The reason is that by changing the storage mode to Dual, you will get a copy of that table in the memory.

The copy of that table in the memory acts like an Import table. There will be another version of that which works through DirectQuery, though. It looks like you have two identical tables, one for Import and another for DirectQuery.

What would be the impact of that on the reporting? The same visual we had before in the previous example is now working all through the in-memory engine;

As you can see, no queries were sent to the data source for this table;

Because this table is connected to an Import table (Sales Agg), and as a result, the Dual storage mode acts like an Import table storage. Everything will be queried from memory.

If you have the Dual storage mode table, used in a visualizing alongside a DirectQuery table, similar to below:

This visualization will send a query to the data source, as you can see in the SQL Profiler track below;

In this scenario, the Dual storage mode acts like a DirectQuery because it comes with a connection to a DirectQuery sourced table;

So, now you can tell why this mode is called Dual. Sometimes it acts like Import and sometimes, like DirectQuery, Depending on the table that is combined through the visualization.

That’s too Technical; what should I do to set up the Aggregation?

Well, It was a long post, and the description was quite technical. You may say; I don’t care about the details; just tell me how to get the aggregations working. Well, the answer is easy: All tables connected to both Sales Agg (our aggregation table, which is Import), and FactInternetSales (our big fact table, which is DirectQuery), should be set to storage mode of Dual. This way, they can act both ways, depending on the situation they are used in.

All tables connected to both Import table (aggregation), and DirectQuery table (big fact table), should be set as Dual storage mode.

In our example, we have to set the Dual storage mode for; DimDate, DimProductSubcategory, and DimCustomer. However, when you apply that on DimCustomer, you will notice that a message pops up saying that the DimGeography will also be set as Storage mode. That is correct; other tables that are related to this through Many to One needs to be Dual too.

The same process will happen with DimProductSubcategory and DimProductCategory. In our sample diagram and model, all tables selected below are Dual Storage Mode;

Now your model is ready to perform Step 3, Configuring aggregations. Stay tuned for that article.

Previous and Next Steps

Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for 12 continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.
Reza is author of more than 14 books on Microsoft Business Intelligence, most of these books are published under Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, Power Query books series, Row-Level Security in Power BI and etc.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.

7 thoughts on “Dual Storage Mode; The Most Important Configuration for Aggregations! Step 2 Power BI Aggregations

  • Hi Reza Rad
    Thank you for this very informative and well explained series on aggregation. I have been trying to follow your steps using a database I have on SQL server. In your 1st article, you created the ‘Sales Agg’ table by referencing the ‘FactInternetSales’ which is a Direct Query. In this article, you mention that the ‘Sales Agg’ table is the only table in Import mode. When I created in Power query a similar aggregation table referencing a direct query table, the aggregated table is also being shown as ‘Direct Query’ mode instead of ‘Import mode’ as in your example. Am I missing something here?

    • Hi Gopa

      Sales Agg would be first defined as DirectQuery mode automatically because it is sourced from the FactInternetSales. But then after applying the change in Query Editor, when you come to Power BI Desktop, you can change the storage mode of Sales Agg to Import with right click on the table, properties, storage mode.

  • Hi Reza,
    I followed the steps you have mentioned and tried with my model but the queries aren’t hitting the agg. table. My model connects to SQL server on-prem and all tables are table views except the date table which I created using DAX. Further, could you explain the purpose of creating Sum and count of Unit Price?
    Thanks

    • Hi. Have you read my two other blog posts earlier? the very first article explains why we do aggregations. The aggregated table will be a layer on top of the DirectQuery table. Also to do the complete example, and test the functionality, you have to read the next step too. then you can test to see your queries will be hitting the agg table.

  • Hi!
    Thanks for the great tutorials on aggregations!

    I was wondering if we could setup an aggregated table on an imported table, or we can only setup aggregated tables on DirectImported tables,

    Thanks!

    • Hi Francis.
      You can build aggregated table everywhere. However, when you have Import table, most of the time, there might not be a point of doing so, because Import Data already have a high performance considering the in-memory engine of Power BI
      Cheers
      Reza

Leave a Reply