All You Need to Know About the Incremental Refresh in Power BI: Load Changes Only

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

The default configuration for Power BI dataset is to wipe out the entire data and re-load it again. This can be a long process if you have a big dataset. In this article, I explain how you can set up an incremental refresh in Power BI, and what are the requirements for it. Incremental Refresh is not just in Power BI datasets, but also in Dataflows too. In this article, you learn how to load only part of the data that changes instead of loading the entire data each time. To learn more about Power BI, read Power BI book from Rookie to Rock Star.

What is Incremental Refresh?

When you load data from the source into the destination (Power BI), there are two methods: Full Load, or Incremental Refresh. Full Load means fetching the entire dataset each time and wiping out the previous data. When I say the entire dataset, I mean after all Power Query transformations, because there might be some filtering in Power Query, whatever data that loads into the Power BI dataset in this sentence is considered as the entire data.

If the dataset is small, or the refresh process is not taking a long time, then the full load is not a problem at all. The problem happens when either the dataset is big, or the refresh process takes longer than an acceptable timeframe. Consider you have a large dataset including 20 years of data. From that dataset, probably the data of 20 years ago, won’t change at all anymore, or even the data for 5 years ago, sometimes even a year ago. So why re-processing it again? why re-loading a data that doesn’t update? Incremental Refresh is the process of loading only part of the data that might change, and adding it to the previous dataset which is not changing anymore.

Requirements: What do you need to set up?

Table with Date Field(s)

For setting up the incremental refresh, you do need to have a table (or more) with date field(s) in it. The date field is the field that will have the impact of the partial refresh of the data. For example; Let’s say you have a FactSales table. You want to load all Sales which made earlier than a year ago just once, but everything from a year ago to now regularly. So you do need to have a date field in your table for it. Most of the time, this field can be created date, modified date, order date, publish date, etc. You do need to have a field like that as of date data type.

Licensing Requirement

You also need to have a Power BI Premium capacity. You can set up the incremental refresh in Power BI Desktop even without that, but then when you publish it to the service, if the file is not in a workspace with premium capacity, you won’t be able to use this feature. The premium capacity at the moment is a limitation for many Power BI users. There are some discussions that the incremental refresh might be available somehow for Pro users too. However, sticking to the facts for now; Premium is the only way for the incremental refresh.

Limitations: Things to Know Beforehand

One limitation is the license as mentioned above, However, the other limitation which can be more important is that after setting up the incremental refresh, you cannot download the PBIX file from the service anymore, because the data is now partitioned. It also makes sense, because the size of the data is probably too large for downloading it as well.

How to Setup Incremental Refresh

Setting up Incremental Refresh has some steps in Power BI Desktop, and then in the Power BI Service. Let’s check them one by one.

Enable the Preview Feature

The first step is to enable the Incremental Refresh Policies in the Preview features of Power BI Desktop. If you read this article a year later, this feature might be released at that time, and you might not need this step then. To enable the preview feature, In Power BI Desktop, go to File -> Options and Settings -> Options;

Then go to Preview features tab, and select the Incremental Refresh Policies there. You do need to restart Power BI Desktop (Close and Open again) afterward.

Now you are ready to set up the Incremental Refresh.

Parameters in Power Query

For setting up the incremental refresh in Power BI, you need to use Power Query parameters. You need to create two parameters with the reserved names of RangeStart and RangeEnd (Note that Power Query is a case sensitive language). Go to Edit Queries in your Power BI Desktop solution,

And then click on New Parameters.

Then create two parameters of DateTime data type, with names of RangeStart and RangeEnd, set a default value for each too. The default value can be anything, but the name and the data type should be as mentioned here.

Filter Data Based on Parameters

After creating the two parameters, you need to filter the data of the date field based on these two parameters. For example, in the FactInternetSales, I can filter the OrderDate using the column filtering as below;

You can use the Between filter using the RangeStart and RangeEnd parameters as below;

After doing this action, your data in the table will be filtered based on the default values you have set for the RangeStart and RangeEnd parameters. However, don’t worry about that. these two parameters will be overwritten with the configuration you make in the Incremental Refresh setting of the Power BI Desktop.

Power BI Desktop Incremental Refresh Setup

The final step in Power BI Desktop is to close&apply the Power Query Editor window and set up the incremental refresh setting for the table. You have to right click on the table in the Power BI Desktop, and select Incremental Refresh.

In the Incremental Refresh settings window, you can choose the table first. If the table is a table that doesn’t have the two parameters of RangeStart and RangeEnd used in filter criteria, then you won’t be able to do the setting for it. For example, DimCustomer won’t give me the option to do the incremental refresh settings.

However, I can do the settings for FactInternetSales because I did filter the OrderDate field of this table based on the parameters. There are two things to notice at this step: If your data source coming from a data source that supports query folding, then Incremental load works as it best. If not, it is not recommended to use it. Most of the time, a huge data source is coming from a relational data store system, which supports query folding by the way.

another thing to note is that Incremental Refresh requires Premium licensing. and finally, you have to know that you cannot download a PBIX file from the service if it has incremental refresh setup on it.

Configuration for the incremental refresh is easy. You just set up the amount rows to Store (load only once, and store it), and the amount rows to Refresh (re-load every time);

Incremental Refresh for Multiple Tables

You can set up the incremental refresh for multiple tables. You don’t need more parameters, the two parameters of RangeStart and RangeEnd are enough. You just need to set up the filter in any other tables you want as well.

Then you need to set up the configuration in the Power BI Desktop for each table;

Note that although we are using the same parameters of RangeStart and RangeEnd, you can have totally different configurations of Store and Refresh for the incremental refresh setting for each table.

Publish to Service

After setting up everything, now you can publish the Power BI file to the service. Note that you can only publish it to Premium workspaces.

Detect Data Changes

Incremental Refresh will make the part of the dataset to refresh much smaller, and as a result, the process would be much faster. However, There is still one better way to do that. If you have a modified DateTime (or updated DateTime) in your table, then the process of incremental refresh can monitor that field, and only get rows that their date/time is after the latest date/time in that field in the previous refresh. To enable this process, you can enable the Detect Data Changes, and then choose the modified date or update date from the table. Notice that this is different from the OrderDate or transaction date. And not all tables have such a field.

Only Refresh Complete Period

Depends on the period you selected when you set up the incremental refresh, you can choose to only refresh when the period is complete. For example, in the FactInternetSales, I set the refresh period to Year. And then I can set the option to Only Refresh the Complete Period, which means even If we are in Feb 2019, it will only refresh the data up to Dec 2018 (because that is the latest date that we have a full complete year in it);

Incremental Refresh for Dataflows

Dataflows are ETL process in the cloud for the Power BI service. You can also set up the Incremental Refresh for Dataflows. And it is even easier to do it for the dataflow. You don’t need to create the RangeStart and RangeEnd parameters there. Just go to the dataflow Incremental Refresh setting directly.

You can see that the same set up of the incremental refresh setting is available for the dataflow;

Summary

Setting up the incremental refresh in Power BI means loading only part of the data on a regular basis, and storing the consistent data. This process will make your refresh time much faster, however, there are some requirements for it. You need to have a date field in your table, and at the moment, there is a licensing requirement for Power BI premium for it. You can do this configuration on a Power BI dataset, or in Power BI dataflows. If you do this on a dataset, then after publishing the dataset, you cannot download the PBIX file.

Have you ever considered using Incremental Refresh? If you haven’t done it so far, let me know why in the comments below, and I’m always happy to help you with that.

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for nine continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.

6 thoughts on “All You Need to Know About the Incremental Refresh in Power BI: Load Changes Only

  • Really insightful! Thank you Reza. Do you know if this feature is available to Power BI Report Server?

  • Hi Reza, great article. So as you can’t download the pbix you just need to make sure you save a local copy and redeploy it when you want to make an update? At the moment of pbix update (redeploy to service) the dataset will have to do a full refresh once again before it will start using incremental refresh? Thx for letting me know. J

  • Hi Reza,
    Did you try to work with incremental upload from SAP BW? there data extraction is based on MDX query and it is not so straight forward as you described.
    1. With power query
    The fact that parameters are used the is very helpful as we can converts is value as an input entry for MDX query. So technically we can work with that but in case we want to combine data from multiple tables, every time we are changing data model we have to publish data set again and every time refresh historical data for all tables (not only for those where model is changed). Also we cannot Refresh data for one particular table. All or nothing… So we thought that Data Flow would help us…
    2. Data Flow
    Different way of working, no parameters… date only so we cannot use our solution with parameters. And even if we have a data source with date column then the data format in SAP BW is different than in Power BI, so the incremental upload is not possible at all.

    If you have any experience with SAP BW or a smart solution how we can use incremental upload in data flow I’d really appreciate if you could share with us.
    Thanks
    Dariusz

    • Hi Dariusz
      I am not sure if Query folding is supported with SAP BW. I assume not. If no query folding, then the incremental load won’t make the refresh process that much different, because the data will be loaded fully from the source to the engine of Power Query every time, and filtered there.
      One way is to create a dataflow for previous periods manually, run it once. and then create the rest in a new dataflow.
      Cheers
      Reza

Leave a Reply to Sebastian J Cancel reply

Your email address will not be published. Required fields are marked *