An aggregated table can be a layer on the source DirectQuery table. This table needs to have a proper relationship set up with other tables and also the proper storage mode configuration. Storage Mode configuration of the composite model is an important configuration and critical part of setting up the aggregation. The storage mode configuration of Import Data and DirectQuery is self-explanatory, but what about Dual Storage mode? I’m going to explain in detail what the Dual mode is. Let’s dig in. This is the third article about aggregations. In the first article, you learned what aggregation is, and in the second article, you learned how to create the aggregated table. This article talks about the next step, which is storage mode configuration in the model. If you want to learn more about Power BI, read the Power BI book from Rookie to Rock Star.
In the previous step, we ended up with a data model including tables and relationships below;
Sales Agg table is an aggregated table created in the previous step, and FactInternetSales is a table with DirectQuery source. All of the tables in this model (except Sales Agg) are DirectQuery sourced tables.
What is Storage Mode
Storage mode in Power BI tables is determining where the data of that table is stored and how queries will be sent to the data source. You can find the storage mode of a table either by hovering the mouse on it in the Power BI Desktop in the fields section;
Or by right-clicking on the table and selecting properties;
In the properties pane, you can see the storage mode option drop-down list;
There are three storage modes;
Import and DirectQuery are the obvious options in the above list. For example, If a table’s storage mode is Import, then it means the data of that table will be stored in the in-memory storage of the Power BI server (the machine that runs the Power BI engine). Every query to the data will be a query to the in-memory structure, not to the data source.
Suppose I have a table sourced from SQL Server but has the storage mode of Import. Then a copy of that data will be stored in the memory engine of Power BI. Whenever you refresh a visualization in the Power BI report, it will just query the in-memory structure rather than sending the query to the SQL Server data source.
Tables with the DirectQuery storage mode will keep the data in the data source. FactInternetSales stores the data in the SQL Server data source in our example dataset. If we have a visualization from a table with this storage mode, Power BI will send a T-SQL query to the data source and get the result back. For example, the below visualization in the Power BI report is just a Card visual on the SalesAmount field in the FactInternetSales;
Because this table has the storage mode of DirectQuery, if we run SQL Profiler simultaneously (to capture queries sent to the data source), here is the query captured to the SQL Server database:
Query from Multiple DirectQuery tables
If we have multiple tables with DirectQuery storage mode, the result will be a query sent to the data source for that combination. Here we have a column chart by SalesAmount from FactInternetSales (DirectQuery) and CalendarYear from DimDate (DirectQuery);
And this is the query sent for that:
Caution: Combining DirectQuery and Import
So far, with the above examples, everything was expected. Nothing strange happened. However, combining fields from DirectQuery sourced tables with Import storage mode tables would be strange. As an example, The Sales Agg table is an import table. The whole purpose of creating this table was to query faster from the in-memory structure. If you create a visual that just contains something from the Sales Agg table, no query will be sent to the data source. It all happens in memory. As expected, of course.
And SQL Profiler doesn’t catch any queries sent to the data source;
However, the behaviour is different If we have a visual getting data from the Sales Agg table and a DirectQuery table such as DimDate;
This time, however, we see a T-SQL query tracked in SQL Profiler, querying the DimDate in the SQL Server database.
This is not what we expect to see. The whole purpose of the Sales Agg table is to speed up the process from DirectQuery mode, but we are still querying the DimDate from the database. So, what is the solution? Do we change the storage mode of DimDate to Import? If we do that, what about the connection between DimDate and FactInternetSales? We want that connection to work as DirectQuery, of course.
Now that you learned about the challenge, it is a good time to talk about the third storage mode; Dual.
Dual Storage Mode
The Dual storage mode is built to cover a scenario like the above. With Dual storage mode, one table can act either as DirectQuery or Import respective to the relationship to other tables. Dual storage mode is the secret sauce of Composite mode and aggregation in Power BI. Let’s see how this mode works.
If we change the storage mode of DimDate to Import, we get the issue of sending the query to the data source when we are querying FactInternetSales. If we change the storage mode of DimDate to DirectQuery, then even for Sales Agg, we are still sending the query to the data source. The solution is to change the storage mode of common dimension tables to Dual.
When you set the storage mode of a table to Dual, for example, DimDate, you get a warning that this process may take some time. The reason is that by changing the storage mode to Dual, you will get a copy of that table in the memory.
The copy of that table in the memory acts like an Import table. There will be another version of that which works through DirectQuery, though. It looks like you have two identical tables, one for Import and another for DirectQuery.
What would be the impact of that on the reporting? The same visual we had before in the previous example is now working all through the in-memory engine;
As you can see, no queries were sent to the data source for this table;
Because this table is connected to an Import table (Sales Agg), and as a result, the Dual storage mode acts like an Import table storage. Everything will be queried from memory.
If you have the Dual storage mode table, used in a visualizing alongside a DirectQuery table, similar to below:
This visualization will send a query to the data source, as you can see in the SQL Profiler track below;
In this scenario, the Dual storage mode acts like a DirectQuery because it comes with a connection to a DirectQuery sourced table;
So, now you can tell why this mode is called Dual. Sometimes it acts like Import and sometimes, like DirectQuery, Depending on the table that is combined through the visualization.
That’s too Technical; what should I do to set up the Aggregation?
Well, It was a long post, and the description was quite technical. You may say; I don’t care about the details; just tell me how to get the aggregations working. Well, the answer is easy: All tables connected to both Sales Agg (our aggregation table, which is Import), and FactInternetSales (our big fact table, which is DirectQuery), should be set to storage mode of Dual. This way, they can act both ways, depending on the situation they are used in.
All tables connected to both Import table (aggregation), and DirectQuery table (big fact table), should be set as Dual storage mode.
In our example, we have to set the Dual storage mode for; DimDate, DimProductSubcategory, and DimCustomer. However, when you apply that on DimCustomer, you will notice that a message pops up saying that the DimGeography will also be set as Storage mode. That is correct; other tables that are related to this through Many to One needs to be Dual too.
The same process will happen with DimProductSubcategory and DimProductCategory. In our sample diagram and model, all tables selected below are Dual Storage Mode;
Now your model is ready to perform Step 3, Configuring aggregations. Stay tuned for that article.
Previous and Next Steps
- Power BI Fast and Furious with Aggregations
- Power BI Aggregation: Step 1 Create the Aggregated Table
- Dual Storage Mode; The Most Important Configuration for Aggregations! Step 2 Power BI Aggregations (the current article)
- Power BI Aggregations Step 3: Configure Aggregation Functions and Test Aggregations in Action
- Multiple Layers of Aggregations in Power BI; Model Responds Even Faster