Scheduling Power Query ETL Runs

In previous posts I’ve mentioned that Power Query is a self-service ETL tool. One of the most fundamental functions of an ETL tool is ability to schedule ETL run. In the other words you would require to populate data from operational data stores on a schedule basis to feed into your Data Model. Power Query extract and transforms data into the Data Model such as PowerPivot. Now the remaining piece is to schedule data refresh. Marco Russo wrote a good blog post about it (this for his great post), and I want to explain that with a little bit more elaboration.

IMPORTANT NOTE: (Thanks to Marco Russo for pointing out this)

Power Query Mashup data refresh in Power BI is not still available. So explanation of this blog post doesn’t work for power query, but only works for direct data source connections for PowerPivot models. Hopefully this feature be added soon to support Power Query as well.

Schedule Data Refresh

Once you’ve built your Power Query transformation script, you can add the data into PowerPivot Model. and then you can upload your excel document to a Power BI site (if  you have Power BI for office 365 subscription). for more information about how to create Power BI site read this link:

http://office.microsoft.com/en-us/office365-sharepoint-online-enterprise-help/add-the-power-bi-sites-app-to-your-power-bi-for-office-365-site-HA104097295.aspx?CTT=5&origin=HA104180761

In the Power BI site you can simply upload the document;

After uploading your document, you can schedule it for data refresh as below:

Screenshot below shows the options available to schedule data refresh:

You can check status of previous schedules and data refreshes on History tab. There are few tips to work with this schedule which explained well in Marco Russo blog post here, I strongly recommend you to read that post before creating the schedule.

As you seen scheduling data refresh is easy, and what you need is a subscription to Power BI for Office 365.

But the important note is that Scheduling data refresh only works with below data sources (in the current version of Power BI):

 Reference and link to study more about schedule data refresh:

http://office.microsoft.com/en-us/office365-sharepoint-online-enterprise-help/schedule-data-refresh-for-workbooks-in-power-bi-for-office-365-HA104180761.aspx

On-Premises Configuration

For scheduling data refresh out of cloud based data source (such as Windows Azure SQL database) you won’t need any further configuration. but for scheduling data refresh for On-premises data source (means data server located physically in your company not on cloud), you would apply some steps beforehand. below is description of steps required to access on-premises data sources from Power BI to finally be able to schedule a data refresh for them.

Data Management Gateway

This gateway is a client service agent, that connects a data source connection from Power BI on Office 365 to on-premises data stores out of cloud. you should download and install Data Management Gateway. and also configure it through the Power BI admin site. you can configure more than one data management gateways on your Power BI admin site to connect to different data sources.

This link contains more information about Data Management Gateway:

http://office.microsoft.com/en-nz/office365-sharepoint-online-enterprise-help/introduction-to-data-management-gateway-HA104079171.aspx

And this link explains how to Create Data Management Gateway through Power BI admin site:

http://office.microsoft.com/en-nz/office365-sharepoint-online-enterprise-help/create-a-data-management-gateway-HA104093659.aspx?CTT=5&origin=HA104079171

You can configure Data Management Gateway in Power BI admin site in the "gateways" tab

After creating the Data Management Gateway and starting it, you can create a data source connection to the On-Premises data store. This also can be done in Power BI admin site.

Once the data source connection created and started, then you can access to on-premises data store from Power BI site on cloud.

* Remember that steps for creating Data Management Gateway is only required if you want to connect to On-Premises data stores.

Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for 12 continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.
Reza is author of more than 14 books on Microsoft Business Intelligence, most of these books are published under Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, Power Query books series, Row-Level Security in Power BI and etc.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.

Leave a Reply