In previous posts I’ve mentioned that Power Query is a self-service ETL tool. One of the most fundamental functions of an ETL tool is ability to schedule ETL run. In the other words you would require to populate data from operational data stores on a schedule basis to feed into your Data Model. Power Query extract and transforms data into the Data Model such as PowerPivot. Now the remaining piece is to schedule data refresh. Marco Russo wrote a good blog post about it (this for his great post), and I want to explain that with a little bit more elaboration.
IMPORTANT NOTE: (Thanks to Marco Russo for pointing out this)
Power Query Mashup data refresh in Power BI is not still available. So explanation of this blog post doesn’t work for power query, but only works for direct data source connections for PowerPivot models. Hopefully this feature be added soon to support Power Query as well.
Schedule Data Refresh
Once you’ve built your Power Query transformation script, you can add the data into PowerPivot Model. and then you can upload your excel document to a Power BI site (if you have Power BI for office 365 subscription). for more information about how to create Power BI site read this link:
In the Power BI site you can simply upload the document;
After uploading your document, you can schedule it for data refresh as below:
Screenshot below shows the options available to schedule data refresh:
You can check status of previous schedules and data refreshes on History tab. There are few tips to work with this schedule which explained well in Marco Russo blog post here, I strongly recommend you to read that post before creating the schedule.
As you seen scheduling data refresh is easy, and what you need is a subscription to Power BI for Office 365.
But the important note is that Scheduling data refresh only works with below data sources (in the current version of Power BI):
- Power BI Cloud Service
- Windows Azure SQL Database
- SQL Server in Windows Azure Virtual Machines
- Basic authentication
- Anonymous authentication
- SP Lists
- ProjectOnline feeds
- SQL Server 2005 and above
- Oracle 10g, 11g and 11gR2
Reference and link to study more about schedule data refresh:
For scheduling data refresh out of cloud based data source (such as Windows Azure SQL database) you won’t need any further configuration. but for scheduling data refresh for On-premises data source (means data server located physically in your company not on cloud), you would apply some steps beforehand. below is description of steps required to access on-premises data sources from Power BI to finally be able to schedule a data refresh for them.
Data Management Gateway
This gateway is a client service agent, that connects a data source connection from Power BI on Office 365 to on-premises data stores out of cloud. you should download and install Data Management Gateway. and also configure it through the Power BI admin site. you can configure more than one data management gateways on your Power BI admin site to connect to different data sources.
This link contains more information about Data Management Gateway:
And this link explains how to Create Data Management Gateway through Power BI admin site:
You can configure Data Management Gateway in Power BI admin site in the "gateways" tab
After creating the Data Management Gateway and starting it, you can create a data source connection to the On-Premises data store. This also can be done in Power BI admin site.
Once the data source connection created and started, then you can access to on-premises data store from Power BI site on cloud.
* Remember that steps for creating Data Management Gateway is only required if you want to connect to On-Premises data stores.