Power BI Says Hi to 3D Maps

Published Date : November 25, 2015

1

If you’ve read my previous articles about Power BI, you know that Power BI has five main components: Power Query, Power Pivot, Power View, Power Q&A, and Power Map. So far all of these components were now supported in PowerBI.com website as part of your Power BI solution except Power Map! Power Map was the only component that you could use in Excel for Office 365 subscription, and it allowed you great 3D Geo-spatial visualization with ability to tell stories with videos based on animated tours. The great news today is that Power BI Desktop now supports 3D map visualization, but is that visualization same as what Power Map exactly does? Let’s find out.

 

3D Visualization in Power BI Desktop

I’ve heard of this features first in a session at PASS Summit 2015, and I was looking forward to see the feature be available to run some tests on it. My Particular interest was because I’ve done some interesting projects on Geo-spatial data visualization with Power Map for different clients, and I know what are their requirements in that area.

Fortunately today Microsoft Power BI team announced that 3D map feature added to Power BI through Custom Visuals. So let’s see how it looks like. For using this visual you have to download it first from Power BI Visuals Gallery:

https://app.powerbi.com/visuals

1

After downloading this visual you can import it into Power BI desktop

2

Now you see Globe Map visual added to the visualization pane

3

I show my sample Pubs map demo of Power BI here with Globe and here it is;

4

As you see a nice 3D map showed here easily. However there is no particular options for it. it is just a 3D map! If I wanted to zoom into part of the map on a particular angle I cannot do that.

The map however is able to show two measures; one as the height of bar chart, and another one as heat map. you can see that above map shows sales amount as height, and quantity of sales as heat map.

You can not yet see Globe visual in PowerBI website, I reckon that would come very soon. but for now if you publish your report you will see this;

5

One specific good feature of Power BI Globe visual is that it sync simply with other visualization items in the report (it is not isolated component as Power Map) , so if you click on column chart’s specific column you will see only related part of the data for that visualized in map.

6

 

Power Map

Well Power Map is much more mature in 3D Geo-Spatial visualization, you can have different layers of visualization (such as column chart and heat map, and region visualization). You can zoom into map on particular angle if you want to. You can have a play axis which is very important for story telling.

7

Here are some features on Power Map, and the information about either they are or not yet supported in Power BI;

tbl

 

Conclusion

Power BI team did a great step forward with adding 3D map visual in Power BI Desktop. Thank you Microsoft Power BI team because of that! It is really useful for some scenarios that users need to see visualization on 3D map. However this feature is far behind Power Map features for story telling, creating tours, play axis, and many other features. I believe that soon many of these features will be added into Power BI. So I say Power BI said hi to 3D maps, but please be quick on that Microsoft team because Power Map raised expectations of our clients to a very high level! They will be looking for same features (at least) in Power BI desktop.

 

A Question For You

Have you ever used Power Map or any other 3D Geo-Spatial data visualization tools? what are features of that (those) tools that you liked most? what are features that you didn’t used? What about Power Map? what are great features of this product from your (or your customers) point of view? and what needs to be added?


Live SSAS Multi-Dimensional Connection from Power BI

Published Date : November 21, 2015

2

Power BI Desktop comes with new updates today as well, and the best feature in this update (from my point of view) is Live connection to SSAS Multi-Dimensional. This feature is still in preview mode, but it is good to see that Microsoft Power BI team invested on this most wanted request from users. With this new feature you can connect to SQL Server Analysis Services 2012 SP 1 CU4 or higher versions from Power BI Desktop. The ability to connect from Power BI website as live connection to on-premises SSAS multi-dimensional however still not available, but I reckon it would be underway.

 

How Live Connection Works?

If you are not familiar with Live connection and might ask that what is Live or Direct connection then and what is the difference of that with regular connection, I have to point out you to this blog post about SSAS Tabular Live Connection from Power BI. However I’ll explain it here in a nutshell;

Power BI has a built-in data model engine which is Power Pivot. Everything that comes to Power BI will be loaded into Power Pivot, and will be processed there. That is why when you close your Get Data experience in Power BI Desktop then you will see messages that shows 1000 rows loaded into memory or messages like that. There is however another method of connection in Power BI. Power BI Desktop can connect to few data sources as direct / Live connection. This means that data of those data sources would be directly loaded into the report, No Power Pivot layer between! only structure of the source will be available in Power BI Desktop for building the report. This method of connection reduce one extra step which is loading the data again into Power Pivot component of Power BI. With this method if you have an existing data source (such as SSAS tabular) built and ready for reporting, then you connect it directly to Power BI without hassling of modeling it again into Power BI Desktop. However there are some cons for this approach, for some data sources the connection speed might be low and report take longer to visualize or refresh.

 

Live Connection for SSAS Multi-Dimensional

In July release of Power BI Desktop you could connect to SSAS Tabular as direct connection. Now with the new release of Power BI Desktop today you can connect to SSAS Multi-Dimensional and use it as a direct and live connection! Here is how you can do it:

You have to first set a configuration in Power BI Desktop Options;

1

In the Options window under Preview Features check the Explore live with SQL Server Analysis Services multidimensional models option.

2

Then you need to restart Power BI Desktop (Close and re-open)

3

After restart, when you go to Get Data, and then Databases, and choose Analysis Services as the source, you can choose Live Connection

4

After creating the connection you can choose which database, and cube you want to connect to

5

and then you will see all objects under that cube in list of Fields

6

and you can simply build any reports you want.

9

with this live connection you can use these SSAS objects and features;

 

Limitations of Live Connection

Live connection is great, however there are some limitations for it. Some of these limitations might be removed in later versions, but this is list of limitations at this point of the time;

7

8

This doesn’t look right, does it? because you should be able to publish your report, otherwise what would be the point of creating it?! The reason for this is that Power BI website doesn’t support live connection to on-premises SSAS multi-dimensional. However please understand that this feature is still on preview mode, and I believe product team will bring it into Power BI website very soon.

 

 

 

 


Definitive Guide to Power BI Personal Gateway

Published Date : September 17, 2015

0

If you’ve followed my Power BI from Rookie to Rockstar book so far, you know that Power BI can reach on-premises data sources as well as cloud based. With Power BI Desktop you simply connect to on-premises data stores, however when the report published into Power BI website there should be a bus connection between Power BI (on cloud) and on-premises data stores (such as SQL Server, Oracle, SSAS Multi-Dimensional and so on). Here is where Power BI Personal Gateway come to play its role. Personal Gateway create the connection path from the data set in Power BI on cloud to the data store on -premises (on your organization server, or even your laptop!). Personal Gateway creates that connection line through Azure Service Bus.

In this section you will learn

We are in Get Data section of Power BI book and this is the best time to talk about Personal Gateway because most of the data sources in organizations I believe still are on-premises, and Power BI fortunately can connect to it. Let’s start from the definition of Personal Gateway;

What Is Power BI Personal Gateway?

Power BI Personal Gateway is an application and service that creates the bus connection between Power BI data set on cloud to on-premises data store. This is an small application that you download, install and configure.  You can use Personal Gateway to connect to any on-premises data stores that is listed in Power BI Get Data section except ODBC, Active Directory, and Microsoft Exchange. Screenshot below show a diagram of gateway role for Power BI and on-premises data stores;

0

Where Power BI Personal Gateway Would be Useful?

When you get data from an on-premises data store, and your on-premises data store is not ODBC, Active Directory, or Microsoft Exchange. So don’t limit yourself, you can connect to SQL Server On-premises, SSAS multi-dimensional on-premises (You can do SSAS Tabular on-premises as well, but there is a better type of connection – live/direct connection – for it which would serve purposes much better).

What about Online Data?

If you get data from online data sources, such as CRM Online, Mailchimp and other Content packs of Power BI, data from a website and etc, then you don’t need Personal Gateway. Online data can be refreshed as long as required credential is set up correctly. However remember if you are using combination of an online data store and on-premises, then you need personal gateway to get on-premises refreshed.

 

How to Install Power BI Personal Gateway?

Installation is really easy and straight forward, you would be able to follow installation steps easily. Download it from the Power BI download menu option as screenshot below mentioned:

1

Or you can download it directly (Note that direct download link might not have the latest Power BI Personal Gateway version):

http://go.microsoft.com/fwlink/?LinkId=534231

After installation all you need to do is to configure it with a windows/Active Directory user and password. The Gateway then connected and running.

3

Do I Need More than One Instance of Gateway Installed?

No. One Gateway will provide service bus connection to all supported on-premises data stores on your machine. In other words; One Gateway to Rule Them All ;)

 

How to Set Scheduled Refresh in Power BI Website?

The whole purpose of Gateway is for being able to schedule a data refresh or be able to refresh the data set on Power BI website with Refresh Now button. You can Set a Scheduled Refresh in this way:

Click on ellipsis button on the right hand side of the data set in Power BI website

4

In the dataset setting you can see if Gateway is online or not (if there is no gateway installed, you can install it from here as well)

5

You can set credentials. Example below belongs to an Analysis Services Multi Dimensional data store. SSAS MD doesn’t support basic authentication, but don’t worry this bug has been reported, and will be fixed.

6

Then you can simply schedule refresh as you want

7

Not that you can also Refresh dataset anytime you want with Refresh Now option

8

When refresh happens you will see a refresh icon besides the data set

9

 

Limitations of Power BI Personal Gateway

file891271793979

This Gateway in its early life stages, suffer some limitations, which most of them will hopefully go away soon. Here are list of existing limitations;

 

  1. Limitation On Data Sources; As mentioned above, all data sources on-premises and supported by Power BI are supported except ODBC, Active Directory, and Microsoft Exchange.
  2. Gateway can be only installed on 64 bit machine.
  3. Gateway cannot be installed on a machine that has Power BI Analysis Services Connector Installed. (SSAS Connector will be discussed in another section)
  4. The machine that has gateway running, should not be turned off, shut down, or stand by.  (At the time of scheduled refresh happening at least)
  5. Power BI Personal Gateway would be only available for Power BI Pro. (You can try trial version of it for free for 60 days I reckon)

 

Myths and Misconceptions About Power BI Personal Gateway

Gateway doesn’t pass user credential from Power BI website

Gateway creates the service bus for cloud based Power BI data set to connect to the on-premises data stores. However the connection won’t by dynamic. What I mean by Dynamic? I’ll explain it here:

Let’s consider that you have a SQL Server Analysis Services Multi-Dimensional set up. Let’s assume that you have set up cubes for HR data marts. As you probably now HR contains sensitive data that needs to be filtered for each user/department based on their access level. someone from one department shouldn’t be allowed to see someone else’s information from another department. Fortunately in SSAS multi-dimensional this is supported through defining Roles and row level security applied on them. Then users and groups on active directory will be mapped to roles, and anytime they login through a client (let’s say Excel PivotTable for example) their credential will pass through down to SSAS engine, and data will be returned only based on their authorized permissions defined in their role in SSAS.

This scenario works perfectly when client tool or user report is on-premises, such as Excel, or even SSRS reports on your organizational servers can leverage this method using Kerberos installed and configured. However when client tool is on cloud then things should be set up differently. Power BI report is a cloud based tool which connects to on-premises data store (In this example SSAS multi-dimensional) through Personal Gateway. So you might end up with this conclusion that the user credential from the person who logged in to Power BI website will pass through the gateway down to SSAS on-premises and he/she would access only to their allowed data portion defined in SSAS Roles. This is a misconception however.

Unfortunately Power BI Personal Gateway doesn’t pass user credential from cloud to on-premises. Personal Gateway actually works with a SAVED credential (the one that you defined at the time of configuring Personal Gateway), and users will have access to any data that the saved credential does. So as a result data level Role-based security with personal gateway is not possible (I mean for on-premises data stores, and only with Personal Gateway). For SSAS Tabular things are not similar however, Fortunately SSAS Tabular supported through direct connection with Power BI Analysis Services Connector. SSAS Connector passes credentials through, and credential works with Active Directory and Azure Active Directory (I’ll explain that in separate section).

Same principal applies on any on-premises data stores that Power BI connects to it through personal gateway. In Nutshell Power BI Personal Gateway doesn’t pass credentials to data stores.

Personal Gateway is not a Live Connection

The other misconception that you might have is that the connection to on-premises with Personal Gateway is Live. The actual fact is that connection is off-line, and is not live. the data set needs to be refreshed anytime for the data to be updated.

You Can Read Data From On-Premises Data Stores Not Only Databases But Also Files

Personal Gateway is not just for databases. You can connect to a CSV file stored on-premises shared folder on your organization server or even your laptop. As long as the saved credential in personal gateway has access to it, Power BI can read it.

Personal Gateway and Analysis Services Connector Are Different!

I’ve heard this question a lot! Is Personal Gateway same as AS connector? The answer is No. You probably concluded that so far after reading this section. But they are two separate applications, here is their purpose:

 

I would like to mention the Power BI site’s article about Personal Gateway as well, where you can study more about it;

https://support.powerbi.com/knowledgebase/articles/649846-power-bi-personal-gateway

 


Power BI Get Data: From Azure SQL Database

Published Date : September 12, 2015

26

Power BI and Power Query can connect to files such as Excel, CSV, text files and on-premises databases such as SQL Server, Oracle, MySQL. Power BI can connect to many data sources on cloud such as Azure SQL Database, Azure SQL Data Warehouse and etc. In this part you will learn how to connect from Power BI Desktop to Azure SQL Database. There is also a way of connecting to Azure SQL Database with a direct connection from Power BI website which will be explored in this section as well. You will also learn how you can schedule your report to refresh data loaded from Azure SQL DB. So In general you should expect to learn everything related to Power BI relation to Azure SQL Database in this section.

In this section you will learn;

 

Preparation

For this section you need an Azure SQL Database. I use AdveutureWorksLT example, this is the sample database in Azure SQL Database templates that you can easily install and configure. If you have this database set up on Azure then you can skip this step. For creating an Azure SQL Database for AdventureWorksLT database follow below steps;

Do you need an Azure subscription to run this experiment?

Yes, but don’t worry if you don’t have it. You can have an Azure subscription free for 25 days with 200$ free credit for you to use, just use the trial version. You can start trial version by following this URL: https://azure.microsoft.com/en-us/pricing/free-trial/

1

After setting up your Azure Account, go to Azure Portal.  I have to mention here that there are two versions of management portal for Azure. the new Azure portal which is tablet friendly, with newer and better look and feel, and the old management portal. screenshots and steps described in this example all has been done in the new Azure Portal. You can go to Azure portal by using this URL: https://portal.azure.com

2

You can manage your Azure services in the management portal by creating new services, editing existing services. Talking about Azure services is out of scope for this example and you need to read books on that topic. However for this example let’s smoothly continue steps to create an Azure SQL Database. Click on New on the top left side and then  under Data + Storage choose SQL Database

3

In the SQL Database Create pane, name the database as AdventureWorks LT. You have to choose the server also. Server is like a SQL Server instance that this database will be hosted on that. You can choose from an existing server or you can create a new Server.

4

As you can see in screenshot above, after going to Configure required Settings for the Server, you can choose to create new server (numbered 3), which will redirect you to a new pane for setting up the server (numbered 4). or you can choose an existing server (numbered 5).

After setting up the server, you have to select source for the database. for this example choose Sample. after choosing sample you will see the Select Sample option below appears.

5

Choose the sample as AdventureWorksLT [V12], and then type in the sever admin login and password (you have defined that when you set up the Server)

6

You have to also choose a pricing tier, and resource group. for pricing tier just use one of the tiers (you can better choose yourself), and then for the resource group you can choose an existing one of create a new. A resource group is a grouping for Azure services, you can have a resource group and add all related azure services under that. for example you can have a resource group for Power BI Online Book and create all examples of this book under that. Please note that the resource group name should not have spaces in the name, but it can have dashes.

7

After all the configuration click on Create so the SQL DB creates. the tick on check box for pin to Startboard will bring the SQL DB on the first welcome page (start board) of the Azure Portal. It may take a bit time for the database to be created. After completion of creating database process, you will be redirected to database page in azure portal (if you didn’t, then click on the AdventureWorks LT database on start board to go to its pane). screenshot below is showing the database created

8

Now you are all set, example database is ready to be used in Power BI.

 

Get Data From Azure SQL Database

You can connect to Azure SQL Database from Power BI Desktop or Power Query for Excel. Both methods works same. Let’s go through the connection from Power BI Desktop. Before starting steps I have to mention that Power BI Desktop connection to Azure SQL Database is an off-line connection. Off-line connection here means the data from Azure SQL Database will be loaded into the Power BI model and then reports will use the data in the model, this disconnected way of connection is what I call off-line. The off-line connection to Azure SQL DB can be scheduled in the Power BI website to be refreshed to populated updated data from the database. In this section we will create the connection from Power BI Desktop to Azure SQL DB, and in the next section following you will learn how to schedule the data refresh.

Open the Power BI Desktop and Get Data from Azure SQL Database

9

In Power Query for Excel you can also follow the path mentioned in screenshot below

10

You need to enter the server name in SQL Server Database dialog box. Remember that you’ve set up the server when you created Azure SQL DB. if you don’t know what is the server for your database, simply find it through Azure portal under the Azure SQL DB pane;

11

Type in the server name in SQL Server Database dialog box in Power BI Desktop. and type in the database as AdventureWorks LT. leave the SQL statement as is. then press OK

12

It is very likely that you get to the window that says Unable to Connect. This window is saying that Power BI Desktop cannot connect to Azure SQL DB and the reason is that the Azure SQL Server didn’t allowed your IP address to pass through its firewall. I have to mention that Azure SQL Server by default doesn’t allow external IP addresses to connect to it. if you want to connect to any databases on Azure SQL Server you have to allow the IP of that machine to pass through. This is not your internal network IP, this is the IP that your internet connection has. You can find the IP easily. It is mentioned in Unable to Connect error message below!

We encountered an error while trying to connect to “…..”. Details: Microsoft SQL: Cannot open server’…’ requested by the login. Client with IP address’210.246.15.145‘ is not allowed to access the server. To enable access, use the Windows Azure Management Portal or run sp_set_firewall_rule on the master database to create a firewall rule for this IP address or address range. It may take up to five minutes for this change to take effect.”

13

So Let’s add the IP in firewall pass list. Go to Azure Portal again. if you closed it, open it again, go to Browse All, then Choose SQL Databases, then choose AdventureWorksLT database in the list, and then click on the server name of it to open the Azure SQL Server administration pane

14

Click on Show firewall settings. In the Firewall Settings pane, enter the new IP as a rule, and then save it.

15

Now you can try again to connect from Power BI Desktop (if you get that error again, just wait for few minutes and try again. Sometimes it takes few minutes for changes to take effect). After a successful connection you should be able to see Navigator dialog box with the structure of AdventureWorksLT database

16

As you can see in the navigator all views, tables and functions will be listed. You can choose multiple objects and then continue editing them in the Edit Queries or Power Query Editor window. For this example I’ve chosen these tables: Customer, Product, ProductCategory, ProductModel, SalesOrderDetail, and SalesOrderHeader.

You can then choose these tables with their fields to be used for building a report without any modification in Power Query or Data tab. You can even see that Power BI and Power Query understand the relationship in Azure SQL Database and load the same relationship in the Power BI model.

17

This section is still one of the early chapters of Power BI online book, and I don’t want to discuss visualization and modelling. However for this example I’ve build a simple chart, the chart is a clustered column chart with Color (from Product table) as Axis, and SalesPerson (from Customer) as Legend, and OrderQty (from SalesOderDetail) as Value.

18

The chart is simple, but still revealing something interesting. Jae0 did the most sales, however color wise each sales person did the best in specific color. shu0 was best at Yellow. lina3 at Silver, and Jae0 at Black. Now you can publish your report into Power BI website with the Publish menu option. after publishing the report you would be able to see that under your Power BI account in the website.

19

 

Schedule Data Refresh

In Power BI website you can set up scheduled data refresh for data sets, but not all data sets supports this feature. Fortunately Azure SQL Database supports it. To find out list of all data sets that supports data refresh read this link. To schedule a refresh in Power BI website, under Datasets click on ellipsis besides the data source that you want. and then choose Schedule Refresh.

20

Set the Data Source Credentials for Azure SQL Database

21

And then you can schedule refresh. You can choose the frequency to be daily or weekly. and you can add multiple times on the day under that.

22

If you have a scheduled refresh set up for a while then you can see the history of refresh as well

23

After setting up the schedule refresh, you can see the latest refresh and the next refresh information easily with ellipsis button on the data set

24

 

Direct Connection to Azure SQL Database from Power BI Website

So far you’ve learned how to connect from Power BI Desktop or Power Query to Azure SQL database. However the connection in that way is off-line and you need to set up a schedule refresh to keep data up-to-date. Fortunately there is another way of connection which is Direct Connection. Direct connection won’t load data into Power BI model, it directly bring data into the report, and you won’t need to schedule refresh anymore, because data is always up-to-date. There will be however a lag for loading the report depends on the data and volume required loaded in the report.

You can set Direct connection only from Power BI website at the moment. And this is one of Power BI Pro features, so you need to buy the pro plan (which costs 9.99$ per user per month at the moment). To set a direct connection, click on Get Data in Power BI website. you will be redirected to Get Data Page

25

Click on the Get option from Databases. You can see that some databases are supported through this type of connection; Azure SQL Database, Azure SQL Data Warehouse, SQL Server Analysis Services, and Spark on Azure HDInsight. Click on Azure SQL Database and then Connect.

26

Set the connection information such as server, database name, and user and password

27

After creating the connection you will see Azure SQL Database in your dashboard and also the dataset in list of datasets.

28

by click on any of these items you will be redirected to online Power BI report designer. You can see that all tables and views now are listed in the Fields pane, and ready for you to build reports from them.

29

 

Summary

In this lesson you’ve learned how to connection from Power BI Desktop or Power Query in Excel to Azure SQL Database. You’ve also learned some basics about Azure SQL Database settings in Azure Portal. Then you’ve learned that you need to set firewall in Azure Portal for server to pass your IP for connections. You then learned how you can set scheduled refresh for the data in Power BI website. At the last section you’ve learned that you can create direct connection from Power BI website to Azure SQL Database that doesn’t load data into a model and works online with the data. In next sections of this chapter you will learn about other data sources that you can connect from Power BI.

 


Power BI Get Data From Excel: Everything You Need to Know

Published Date : September 2, 2015

1

In Previous section you learned about Power Query through an example of data mash-up of movies. Also prior than that you’ve learned about Power BI and its components in Power BI online book from rookie to rockstar. In this section I would like to start exploration of different data sources in Power BI, and I want to start that with an Excel source. Excel source seems to be an easy one, but in the other hand it is one of the most common sources of the data. In this section I want to share some tips about Excel data source and then I show an example of working with Olympic data source in an Excel file.

Excel Data Source

Power Query or Power BI can connect to many data sources, one of the supported data sources in Excel. Power Query for Excel get data from Excel in this way:

1

Power BI Desktop connects to Excel through Get Data experience

2

For getting data from excel you just need to specify path of the file. After specifying the file, Power Query will analyze contents of the file and distinguish all sheets and tables in the file and list them in the Navigator dialog box as preview;

3

As you see in the Navigator dialog box screenshot above, Power Query will distinguish Tables and Sheets and show appropriate icon for any of these types (note to the highlighted section in screenshot above).

Loading Excel Tables into Power Query

If you have data in an Excel Table then it can be easily detected and picked by Power Query as you see in screenshot above, CashFlow is a table in Excel. screenshot below shows the Excel table for CashFlow;

4

As you can see in screenshot below Power Query (or Power BI) fetched the table fully

5

Loading Excel Sheets into Power Query

You can get data directly from Excel sheets as well, No matter if you have tables or not. Power Query will always read the data from Excel sheets from all cells that contains data. if you have even two data sets in one Excel sheet Power Query still read that and load it correctly. Here is an example of Excel sheet with two data sets;

6

When you load this sheet in Power Query only data range of data cells up to the last cell’s column and row will be fetched;

7

What Happens If Excel Contains Formatting?

Any formatting such as color, font, data type of cells would be ignored when it loads into Power Query. The reason is that Power Query is a data mash-up tools not a modelling or visualization tools. You can apply these formatting later in the model (Power Pivot) or in the report (Power View). As an example; if your data values contains decimal points such as 12.94 and in Excel you’ve formatted cell to have zero decimal points (Excel will show cell value as 13 in this case). Power Query still fetches the original value which is 12.94 .

What Happens to Power View Sheets in Excel?

You can have Power View sheets in Excel, and your Power View sheets can contain Data values, such as table. However Power Query won’t load anything from a Power View Sheet at this stage. Because usually Power View sheet uses a data source you can use that source directly in Power Query. Power View data source might be sources such as Pivot Table or SSAS Tabular connection that can be both used in Power Query directly.

Pivot Tables and Pivot Charts?

Obviously you cannot fetch data from Pivot Charts! Why? because chart is a visualization element, so same principal that I said for Power View in above paragraph works here, connect to Pivot Chart’s source directly. You can get data from PivotTables however. PivotTables data can be fetched exactly as they shaped in Excel file with same structure of columns and rows. For example, if you have a Pivot Table like below screenshot:

8

After loading that in Power Query, you will see the Pivot Table exactly the same (without formatting);

9

What If Your Excel Table Has Merged Cells?

Merged Cells in Excel are commonly used. You can have tables with cells merged vertically or horizontally or even both. Here is an example of a table with merged cells;

10

Power Query still reads cells in their original detailed format. It means merged cells won’t be merged in Power Query, they will be seen as separate cells.

11

You can use transformations such as Fill Down to fill null values in the remaining cells of merged cells later on. Also note in above screenshot that green highlighted column is a calculated column. This calculated value will be fetched in Power Query as static values.

 

Example Excel Data Source: Olympic Games

I’ve found list of all medalists in Olympic games from the very first game (1896) till 2008 (unfortunately 2012 london games are not included in this data set). This is fortunately a public list made available by The Guardian, that you can download here. The list is well structured with a main sheet for all medalists as below;

12

As you see in above sheet all medalists with their main sport category and discipline and detailed event are available. Name of Athletes and their gender and Medals as well as the Olympic game (year) all is listed. Countries information listed as three character code. This three character codes are avaiable as a reference in another sheet of the file named IOC Country Codes as below;

13

Data is well structured and loading it into Power BI would be just matter of seconds! Start getting this information by Get Data From Excel and then address the downloaded Excel file. In Navigator dialog box choose All Medalists and IOC Country Codes sheets both to be checked. Also note that All Medalists sheet’s data shows in non-merged cell style (as you’ve learned earlier in this section; merged cells will be un-merged in Power Query). After selecting these two sheets click on Edit.

14

There are just few changes that we need to make;

Remove Rows

All Medalists query contains four heading rows which we don’t need them (it is just title and disclaimer) so better to remove those;

Go to All Medalists query and then click on Remove Rows in the Home menu. From the Remove Rows popup menu choose Remove Top Rows

15

Remove Top 4 Rows in the dialog box by entering 4. and the result will look like below;

16

Use First Row As Headers

Headers of query as you see in screenshot above are Column1, Column2…. Fortunately in the data set the first row contains column headers. We can simply set that to be used for column headers in Power Query. in the Home tab click on Use First Row As Headers.

17

Data set will looks like screenshot below after above change

18

 

For IOC Country Codes query just set the first row as headers. No more changes is required

19

 

Close and Load

After above changes we can now load the result set into Power BI model to build report for it. You can simply click on Close and Apply menu button in Home Tab

20

Power BI will load result sets into memory for further modeling and reporting.

21

 

A bit of Modeling

This section is not about data modeling. However I want you to be able to play with this data set and build some nice reports with it (if you can’t wait till modeling and visualization chapters of this book). So Let’s create the relationship between All medalists and IOC Country Code. For doing this go to Relationship tab in Power BI Desktop and the click on Manage Relationship. Then create a new relationship as below;

22

After relationship has been created you will see it in the relationship diagram

23

 

Play Time!

Now it’s your turn to play with this data set and make some nice reports and visualization items. Here is what I’ve built with this data set:

24

 

Don’t be panic if you can’t build report above. I will go through step by step process of building this report later in the Reporting Chapters.

 

Summary

In this section you’ve learned some tips of working with Excel Data source from Power Query or Power BI Desktop. You’ve learned that merged cells would be loaded into Power Query Un-merged. Formatting won’t be considered at the time of loading data into Power Query, and Power Query can load data from Pivot Tables, Excel Tables, and Sheets. You’ve also learned a real world example of fetching Olympic medalists data from Excel file into Power BI Desktop. In next sections I will get you through the journey of Getting data from some other data sources.

 


Get Started with Power Query: Movies Data Mash-Up

Published Date : September 1, 2015

original-26681-1393460484-7

As another section of the Power BI online book: from Rookie to Rockstar, I would like to get start working with Power Query. From my point of view learning through an example is the best way to learn new technology. For this post I have decided to use movie’s data to be mashed up. I used this example because movie’s data is a fun example at the early sections of the book, you all watch movies, and you will see many familiar titles here. If you want to learn about Power Query or you need a Power Query introduction before this example, read the previous post: What Is Power Query? Introduction to Data Mash up Engine of Power BI.

You can use either Power Query for Excel or Power Query as part of the Power BI Desktop for running this example. I use two data sets for this example:

This information are available in http://www.boxofficemojo.com website, as below:

1

IMDB is the movie database on the internet that users can rate movies. list of top 250 movies rated by users listed here in the website as below:

2

Let’s Get Started

Start by getting gross sales data; Open Excel, then Power Query Tab, and then from Web;

4

Or Open Power BI Desktop and Get Data from Web

3

Then Enter the web page URL for top 100 sold movies all the time from this link: http://www.boxofficemojo.com/alltime/world/

5

Click OK, after a quick processing, you will see a Navigator window. Power Query will check for any tables in the html web page and will come back with a list of tables on the left side under the URL address;

6

Click on Table 0. You will see a preview of data in the table in main pane. Now tick the checkbox for Table 0 and click on Edit button in Navigator

7

Query Editor

After clicking on Edit you will see the Query Editor window opened. This is an editor that you will spend most of your time for data mash-up here.

8

Query Editor has four main sections (numbers matched to screenshot above);

  1. Main data set pane; This is the central area that the result set will be displayed as preview with limited number of rows
  2. List of Queries; Left hand side pane will show list of all queries in this solution or file
  3. Query Settings pane; Properties such as Name of the query can be set here. Also list of all applied steps to the current query is visible in this pane.
  4. Transformations Menu; Power Query has many transformations options in GUI that are available through the menu in top section

Rename the existing query to Top Sales 100

11

Our goal in this example is to join the data set of global gross sales with the IMDB user rating, and then analyze to see are best sellers in movie titles among the best rated movies or not? so the more data in gross sales we get would give us better analysis. The above URL only gives us top 100 sold movies. but the option to go to pages for rest of the result set are available;

9

So Let’s add list of movies from 101 to 200 in best sellers;

In the existing Query Editor window go to New Source, and then choose From Web. Enter the URL as : http://www.boxofficemojo.com/alltime/world/?pagenum=2&p=.htm

10

This will lead you to the top second 100 movies sold. Click on Table 0 in navigator window and then OK. in the Query Editor rename this query as Top Sales 200

12

Follow this process for links below;

http://www.boxofficemojo.com/alltime/world/?pagenum=3&p=.htm

http://www.boxofficemojo.com/alltime/world/?pagenum=4&p=.htm

Bring data for all top 615 movies in Power Query

13

As you see in query editor all of these queries are separate from each other. Let’s combine them altogether. in database and SQL world that can be done with UNION. Here in Power Query we can do Append Queries;

Use a Query as a Reference

First Create a reference from Top Sales 100 (because for this example I want to keep that query as is);

Right click on Top Sales 100, and from the pop up menu choose Reference

14

This will create a new query that users Top Sales 100 as the source (or reference). Rename this new query to be just “Top Sales”

15

Append Queries

Now let’s combine queries into this new query;

Click on Top Sales and then from the menu (Home) click on Append Queries

16

For append to work you need two queries; the first query is the query that you are on it (Top Sales), the second query name should be entered in the Append dialog box;

17

as you see in screenshot above you can choose other queries. For append to works best queries have to be in the same structure (number of columns, order of columns, data type of columns….). Choose Top Sales 200 in this window and click OK. This will create another step in the query setting named Appended Query. and the result set in the main pane (if you scroll down) will show you first top 200 movies sold.

18

Continue this process to add all 615 top movies into Top Sales query.

Extract First Characters

After doing this change Let’s clean the Year column data; Year column has an special character in some values as below;

19

Click on Year Column, and then from Transform menu under Text Column click on Extract, and then choose First Characters

20

Enter 4 in the Extract First Characters dialog box (because year isn’t more than 4 characters). then click on OK.

21

You will see that year column is clean now without any extra characters.That was easy data transform, wasn’t it? This option in the transformation menu (Extract First Characters) has been added recently in Power BI Desktop.

You can even now change data type of this column to whole number. Right click on Year column and then under Change Type choose Whole Number.

22

 

Great We’ve done enough with the first data set. Let’s work on the second data set (IMDB user rating);

Go to Home Tab in Query Editor again, and Get data from the New Source and from Web. Enter the URL as: http://www.imdb.com/chart/top

In the Navigator window, Table 0 contains the data that we want, so load it with clicking on OK. the data loads into Query Editor as screenshot below illustrates

23

Rename the query to IMDB Rating.

Remove Columns

You can see that there are three useless columns in the data set; first column, and last two columns. remove these columns simply by clicking on them and then right click and Remove.

24

Split Column

Now in the result set we have two columns; Rank & Title, and IMDB Rating. Rank & Title is a combined column which contains rank, title, and year of the movie. Let’s split these values;

25

Rank is separated by single dot (.). So we can use Split Column transformation to split it easily; Right click on Rank & Title column first. then Choose Split Column, and then By Delimiter

26

In the Split Column by Delimiter dialog box you can choose one of the common delimiters such as comma or color … or you can use a custom delimiter. Set it to Custom, and enter single dot (.) in the box underneath. You can also specify how the split works. The default option is At each occurrence of the delimiter. This default option might not be best for our case, because sometimes there might be a dot in the movie’s title. So select the split method as At the left-most delimiter. This option will scan text from the left, and will stop splitting after finding the first delimiter.

27

after split the result set would look like below;

28

Rename the Rank & Title.1 column to Rank.

Now Let’s split title and year. Year value is surrounded between brackets, so we can use same split column method, this time using open bracket as below;

29

The result set looks like below screenshot;

30

Replace Values

Rank & Title.2.2 column has the year value with an extra close bracket. Click on this column and then from Transform menu under Any Column click on Replace Values

31

Replace close bracket with an empty string as below;

32

Result set would have the close bracket removed. rename the column to Year, and change it’s data type to Whole number (change data type with right click on the column)

33

Trim

Also rename the Rank & Title.2.1 column to Title. Because this column might have extra spaces at the beginning and end of values (as the result of split column steps), let’s remove extra spaces;

right click on this column and then under transform choose Trim. this will remove all heading and trailing spaces from values in this column.

34

Awesome, our work with this data set has been finished as well.

Applied Steps

One of the most useful sections of Query Editor window is Applied Steps in the Query Settings Pane. This section of Query Editor window is very useful for debugging and tracking steps and changes. You can see all the steps that you’ve applied on the current data set in this pane. And this is not all of it! You can click on an step and the main pane will show you the data at that step! such an awesome way of keeping track of steps.

35

You can even remove an step, or you can change settings of an step with clicking on remove icon (on the left side of step) or setting icon (on the right side of the step, but only for steps that settings is applicable to them).

 

Final Merge

We’ve prepared both data sets for a final merge together to see how best selling movies are among top user rated films. So we are one step away from this result. we have to merge these two data sets, or Join them in other word.

Click on Top Sales query and create a reference of it, name the new query as Merge Result. Now click on Merge Result, and then from Home tab, under Combine choose Merge Queries

36

Merge Queries is equivalent to Join in SQL or database terminology.

Merging queries requires two queries; the first query is the query that you are currently on it (Merge Result), and you can choose the second query in the Merge dialog box. Choose second query as IMDB Rating. Now select joining keys as Title (you can also choose multiple joining columns with pressing ctrl keyboard key). Set also join kind to Left outer join (this will only select all records from the first query with matched rows of that from the second query)

37

Notice in screenshot above that merge dialog mentioned only 58 records out of 615 movies matched! it means only 58 of best seller movies are among top user rated list! such a pity. Screenshot showed only Inner Join result, but you choose Left Outer and then click on OK to look at the data;

Joining experience in Power Query is a bit different from database tables. as a result of join you will get first table with a new column for the new table. This new column holds table values which needs to be expanded. If you click on the column header icon, you can choose which columns of the nested table you want to expand.

38

Let’s keep all columns and click OK. You can now see some movies that are among best sellers but not in top 250 user rated list of IMDB, There are movies name ssuch as Iron Man 3, Skyfall, Furious 7 and list goes on. Play with that yourself to see what you explore!

39

 

Summary

In this section you’ve learned basics of Power Query through an example. You’ve seen how Power Query can analyze tables in a web page and load it into query editor. You’ve experienced Query Editor and you’ve learned how to apply some transformations. You’ve learned that transformations such as split column, replace values, change data type, and extract part of a text are easy transformations that can be simply done through Power Query editor. In next sections I will explain different types of data sources that Power Query or Power BI can work with through the Get Data Experience. You will see that Power Query and Power BI can get data from text files such as CSV, Text as well as database connections such as MySQL, Oracle and SQL Server, it can also bring data from on-premises data stores as well as cloud Azure based services.