Getting Started With Dataflow in Power BI – Part 2 of Dataflow Series

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

In the previous article, I mentioned what is the Dataflow, how it can be helpful, and mentioned some of the use cases of it in real-world Power BI implementation. In this article as the second part of the series for the Dataflow, I like to give you some hands-on experience with Dataflow and explain how a Dataflow works. If you like to learn more about what and why about dataflow, read my article about; What are the Use Cases of Dataflow for You in Power BI. If you like to learn more about Power BI; read Power BI book from Rookie to Rock Star.

Prerequisite

Understanding what dataflow is, and how it can be useful in a Power BI implementation is the first part of this blog series. read about it here: What are the Use Cases of Dataflow for You in Power BI

Things to Know Beforehand

Let’s start this example, by creating the very first dataflow straight away. The first thing you need to know is that Dataflow creating and maintenance process is happening all in Power BI Service not in the Desktop because Dataflow is not part of any report or *.pbix file.

Developing or Editing Dataflows are possible through Power BI service (not the Desktop)

The second important thing you need to know is that Dataflow can be created only in an app workspace. You cannot create a dataflow in the “My workspace”. So either create or navigate to an app workspace. As you can see in the below screenshot, I am in an app workspace called “dataflow”. The name of the app workspace can be anything you want.

Dataflow is only available in an app workspace (not in “My workspace”)

As you can see in the above screenshot, the Dataflows tab is available. Please note that at the time of writing this post, Dataflow is still a preview feature. In the Dataflows tab, you will, later on, see those dataflows that you have created. If you don’t see the Dataflow tab, even in an app workspace, then there is something else you need to consider; the administration’s control on the dataflow.

Administrative’s Control

Power BI administrator can turn off or on the creating and use of the dataflow for users in the tenant. So if you don’t see the Dataflows option, it is probably because the Power BI administrator has turned off that features. So contact your administrator to get that enabled. At the time of writing this blog post, this option can be only turned on or off for the entire organization (like many other options in the tenant setting at the very first few months of their appearance), but I believe this feature would be available for selective group of people (like many other options in the tenant settings in the future).

In the Admin Portal of Power BI service, under Tenant Settings. there is a configuration option for dataflow as shown in the screenshot below;

Creating the First Dataflow

Now that we are ready, let’s start building the very first Dataflow. Start by clicking on the Create option, and then choose Dataflow.

Each dataflow is like a Job schedule process. It has one or more of transformations in it and can be scheduled. These transformations can write data into some entities. So let’s see how you can create the Entity. In the screenshot below you see there is an option to Add new entities. Let’s start with that.

Click on Add new entities, You can see the list of all supported data sources (Recently in an update, many more data sources are added to the list). As you can see in the list below, the interface is very similar to the Get Data interface of the Power BI Desktop.

Sample Dataset

For this example, we will be using an OData source, because it won’t need gateway setup, and it doesn’t need additional setup requirements. Select OData as the type of data source, and in the URL part, enter this address: https://services.odata.org/V3/Northwind/Northwind.svc/

(OData is a dataset that is available through API, the output of that may include one or more tables. Many data transformation tools can read data from Odata. Power Query (or in this case; dataflow) also has an OData data source connection.

After this step, you should see a screen which is very similar to the Navigator window in the Power Query. In this screen, you will see a list of all tables and can start exploring them.

For this example, let’s select these tables: Customers, Employees, and Orders and then click on Next.

Power Query Editor Online

As you can see in the screenshot below, after selecting tables, you will see a screen which is very similar to the Power Query Editor, but the online version of it. Not all functions that you have in the graphical interface of Power Query Editor is available here, however, some of the main transformations are achievable. You can see the Queries pane, Steps (Query Settings) pane, Data preview pane, and Data transformation pane all available.

In this example, I just selected the three tables with no transformation and clicked on Done. You can see then all tables in this dataflow. In dataflow, we call these Entities. For every entity, you can specify Incremental Refresh if you want.

You can then Save the dataflow.

Gateway

If your data is sourced from an on-premises (local domain) source, then you would need to have a gateway setup. I have previously written all about Gateway, and you can read that information in my article here. The difference here with the Power Query in the Power BI Desktop is that; In the Power BI Desktop, because you are developing the file locally, You can connect to an on-premises data source, and start developing your solution, and then after publishing to the service, you can set up the gateway. However, in Dataflow because everything is happening in the service, you need to have a gateway setup if you are connecting to an on-premises data source, otherwise, you cannot pass the very first step. If you are connecting to an online data source (such as Azure SQL Database), you won’t need a gateway.

Blank Query as the Data Source

One way to transfer your Power Query scripts to Dataflow is to select Blank query as the data source and copy and paste the M script from your Power Query tables to the dataflow. However, be mindful that not all Power Query transformations are supported yet. You may need to make some changes.

The script below is creating a basic date dimension (with no fiscal columns or public holidays, just with calendar columns):

You can copy and paste it into the blank query in the dataflow;

Dataflow then should be able to (depends on the functions used in the script and if they are already supported in dataflow or not) show you a list of steps based on the script.

Schedule Refresh

Scheduling the refresh of dataflows can be configured at each dataflow level (Not at the entity level).

Summary

In this article, you have learned how to create a dataflow, and also the necessary steps for creating it. Dataflow, however, still have many areas to reveal; the schedule refresh, mapping outputs of the dataflow query to a standard model, common data model, external dataflow, the structure of how the dataflow is saved in the Azure Data Lake. Stay tuned for the next parts of this blog series to learn more about dataflow. In below you can read other articles in the Dataflow series;

Part 1: What are the Use Cases of Dataflow for You in Power BI?

Part 2: Getting Starting with Dataflow in Power BI (this article)

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for eight continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: http://www.radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.

Leave a Reply

Your email address will not be published. Required fields are marked *