In the early days of developing Power BI solutions, you could choose DirectQuery or Import Data connection types, but not both. In 2018, Power BI added a breakthrough feature called Composite Model. Now, using this feature, you can have part of your Power BI dataset connected directly from a data source, and another part to be imported. Along with this change, you can also choose the storage mode for each entity. In this post, you will learn about the composite model; what it is and how it works. If you want to learn more about Power BI; read the Power BI book from Rookie to Rock Star.
Composite Model; What does it mean?
A composite model in Power BI means part of your model can be a DirectQuery connection to a data source (for example, SQL Server database), and another part as Import Data (for example, an Excel file). Previously, when you used DirectQuery, you couldn’t even add another data source into the model.
Composite Model means now you can have a model, that very large tables of that are coming from the DirectQuery connection, without the need for importing, and small tables to be imported to be accessible quickly.
Why Composite Model?
The important question to ask is: “Why should you use the Composite Model?” To answer this question, you first need to review the benefits of Import and DirectQuery modes. Import Data is good for super-fast data analysis and reporting and flexibility, while the DirectQuery is good for big data tables and the data freshness.
Now the Composite Model combines the good things of both Import and DirectQuery into one model. Using the Composite Model, you can work with big data tables using DirectQuery, and still import smaller tables using Import Data.
Using the Composite model, you can have tables with different connection types, some Import Data and some DirectQuery.
Having big data tables with the DirectQuery and smaller tables with Import Data is much better than doing everything pure DirectQuery, because in the pure DirectQuery mode, even for querying data from small tables, the performance would be still slow. In the composite model, the smaller tables can be used as Import Data to improve the performance. This also comes with a new storage mode; Dual, which you will learn about it a bit later in this post.
How does the composite model work?
Here is a Power BI file using a DirectQuery connection to a SQL Server database (AdventureWorksDW in this example);
I have selected the FactInternetSales table from this database.
In Power BI Desktop, after loading the table, you can see on the right-hand side bottom corner, that we are using the DirectQuery storage mode;
Now for the second table, I get data from Excel,
And select DimCustomer from the Excel file (this excel file is the Excel version of the AdventureWorksDW database, you can download it from the link in my other blog posts). As soon as you bring a table from another data source, you will be notified that there is a potential privacy risk that the data from one data source, may be used to pass parameters into another data source.
After saying OK to the above warning, you will get the new data table loaded into Power BI, and you can see that the mode changes to Mixed Mode.
The mixed-mode of the composite model means that the fact table in our example is NOT LOADED into Power BI. It will be queried each time from the SQL Server database table. The dimension, however, is already imported into the Power BI model and needs to be refreshed time by time. If you refresh your model, you will see that the only table(s) refreshing the data are Import Data tables.
Using this option, you will be able to see the Data tab (which you couldn’t have done in mere DirectQuery mode previously), and you can see only Import Data tables there.
You can have the big data tables of the model, coming from a DirectQuery source, and it doesn’t matter if it is billions of rows, because they are not loading into the Power BI. Small tables can be imported into Power BI for better performance.
Storage Mode: Dual
Considering the Composite mode explained above, you probably ask yourself this question where the data is stored in every table, and how they are working behind the scene? Can you select a table, and understand what is the storage mode for that, the answer is yes. Each table in Power BI has a storage mode. The storage mode can be; Import, DirectQuery, or Dual.
If you go to the model tab in the Power BI Desktop, and then click on a table, in the Properties pane, under the Advanced section, you can see the Storage mode. There are three options: Import, DirectQuery, and Dual.
Import and DirectQuery in the storage modes above are understandable enough. Import means, that every time you refresh the data, it will be imported from the data source, DirectQuery means it will be every time queried directly from the source no matter what visual you use. So what is the Dual mode?
If you try to change the mode of an Import table to Dual, you see that this change is not possible.
Dual Mode is only an option for DirectQuery data sourced tables.
When you change the storage mode of a DirectQuery table to Dual, it will create a copy of that table in the Power BI model. But unlike Import Data, there is a copy in the memory, and there is still the main table in the DirectQuery source.
The Dual storage mode, works as DirectQuery or Import Data, depending on what other tables related to it are used in a visualization. If you have a visual that has columns from Import tables and a dual table, then the dual table will act like an Import. If you have a visual that has columns from DirectQuery tables and a dual table, then the dual table will act like a DirectQuery table. This ensures that you get the best performance when working with smaller tables (such as dimension tables) in a dual storage mode.
Modeling for Power BI Composite model
Here are some of the things to consider when doing data modeling with a Power BI composite model
Relationship in Power BI Composite model
You can create relationships simply between tables coming from different data sources and using different storage modes. The diagram below shows how DimCustomer (Import) is connected to FactInternetSales (DirectQuery) and also DimDate (Dual).
The cardinality of the relationship can be anything you want; One-to-many, Many-to-one, One-to-one, and Many-to-many.
Calculations and DAX
Calculated tables are available in the composite model. These tables will be in Import mode, and their expression can be anything that the DAX allows. You can also create calculated tables using Power Query Editor.
Calculated tables will be refreshed at the scheduled refresh time of the Power BI dataset.
Depending on where you create the calculated column; they can have some limitations. If you create a calculated column in an Import table or a calculated table, then there will be no limitation. However, if you create a calculated column in a DirectQuery table, then in the expression for the column only columns from the same table can be used.
DirectQuery to Power BI Datasets
In the composite model, you cannot only use DirectQuery to SQL Server, Oracle, and some other DirectQuery sources, But you can also create a DirectQuery connection to a Power BI Dataset. This is different from the Live connection to the Power BI Dataset. The DirectQuery to the Power BI Dataset will enable you to bring the Power BI dataset’s data into a composite model, which gives you the ability to do further data modeling in the chained dataset. This concept is explained in this post.
In summary, combining DirectQuery tables and Import Data tables in one Power BI model is called Composite Model or mixed-mode. With this feature, you can also leverage the option for using Dual storage mode to keep a copy of a DirectQuery table in memory for answering quick slicing and dicing questions.
DirectQuery was not a common method when it comes to working with data in Power BI, especially because with this mode you couldn’t combine multiple data sources, but with the composite model, the DirectQuery becomes faster. If you have big data tables, then I definitely recommend exploring the composite model.
The composite model can even go one step further in the performance when aggregations are used. You will learn more about that in this post.
The composite model is better than a pure-DirectQuery approach. This option gives you performance and big data both in one data model.