It was sometimes ago that I’ve posted some data analysis that I’ve done with Power BI Designer on FIFA 2014 Worldcup results. In this post I want to take you through the process of how to achieve that analysis. This post is an introduction to Power BI features with the example of FIFA 2014 Worldcup result set. Through this post you will learn how to connect to web based data sets and mash up the data with Power Query component of Power BI Desktop. And finally how to model it and visualize it in the Power BI report. Most of the data for this example comes from data sets in FIFA 2014 Worldcup official website: http://www.fifa.com/worldcup/archive/brazil2014/index.html
For this example I use Power BI Desktop which has been announced yesterday as General Availability, you can download Power BI Desktop (32 or 64 bit) from link below:
http://www.microsoft.com/en-us/download/details.aspx?id=45331
Data Extraction
I use more than one data set for analysis, so let’s start with some basic data sets;
Goals Scored by Teams
Number of goals scored by each team listed in this URL: http://www.fifa.com/worldcup/archive/brazil2014/statistics/teams/goal-scored.html
Steps to get this data set into Power BI is;
- Open Power BI Desktop
- Choose Get Data
- In Get Data Window, under Other, click on Web
- In “From Web” dialog box, Enter the URL and click OK; http://www.fifa.com/worldcup/archive/brazil2014/statistics/teams/goal-scored.html
- a Navigator window will open as below, click on the table that contains data set of goal scorers and then click on Edit to open Query Editor window
- The Query Editor window will appear with the data set loaded and displayed in the main pane.
Query Editor is the Power Query editor window, which gives you ability to transform data and then load it into the model that Power BI can use. There are different panes in Power Query Editor window here;
- Main pane that shows a preview of the data set after applying transformation listed in the Query Setting pane.
- Query pane, which shows all queries used
- Query setting pane; here you can see steps applied on the data set, you can delete them or change them as you want.
- Transformation Pane; all transformation options listed in top menu here, you can choose simply to work with any of them.
This data set is almost ready to use, the only transformation I want to make here is to remove extra two characters at the end of each column name that came from icons used in the FIFA website. so rename every column with right click on that and choosing “rename column” option from the menu.
Also rename the Query name to TeamGoals in the right hand side pane in Query Setting section. Here is the result set after above changes;
Passes by Teams
The second data set is passed by teams which we get from this url: http://www.fifa.com/worldcup/archive/brazil2014/statistics/teams/passes.html
- Without closing Query Editor window, on the Menu (under Home, New Query) click on New Source, and choose from Web. In the From Web paste URL above and press OK.
- In Navigator window choose the table that contains passes by teams data set, and click OK.
- Now in the same Query Editor window you will see Table 0 loaded with it’s data set in the main pane
- For this data set again we have to change column names and remove two extra characters from each column name. Rename Team2 column name to Teams Abrv.
- There are some percentage columns in this data set, which considered as text column (because of % character in the value), we want to remove the character and change the data type to be Decimal number.
- Right click on Passes Completed Percentage column (this is renamed column version of Passes Completed 2), and choose Replace Values
- Replace the % character with empty string as mentioned below
- After applying above transformation, you will see % character removed. Now change the data type of this column to Decimal number
- Now we have to divide value of this column by 100 (to have a real percentage value). Click on the Passes Completed Percentage Column. and from Transform Menu, Under Number Column, Under Standard, select Divide
- Enter the value as 100 in Divide dialog box, and click OK.
- Do the same for Crosses Completed Percentage column (renamed from Crosses Completed 2)
- Rename the query to TeamPasses. The result should looks like below screenshot
Teams Defending
I would also like to add another data set to this analysis which is Teams Defending result set. URL: http://www.fifa.com/worldcup/archive/brazil2014/statistics/teams/defending.html
- Get Data again, from New Source, From Web, and enter URL above.
- After loading the data set in Query Editor apply below changes (as illustrated above)
- Rename columns and remove extra two characters
- remove % character and change the data type of percentage columns to Decimal number
- Divide column by 100 to get actual percentage value.
- rename the query to TeamDefending
- The result set should looks like below:
Data Modelling
We can add much more data sets from FIFA official website. but for this example let’s keep this simple and stick to three data sets above. In this section we would model relationship between these data sets, and model it to be used in reporting section.
in Query Editor window click on Close & Load icon in the Home tab, under Close section
Relationship Tab
In the Power BI Desktop from left pane click on Relationships tab and you will see the relationship created itself by the similarity of column name between queries.
You can also change the relationship if you want. current relationship is based on Teams column as the key column.
I’ll keep the relationship as it is.
Data Tab
In the data tab you can view the data set, and set formatting of the columns or add new measures (DAX measures) if you want. Screenshot below shows how to access data tab from the left pane. You can also see the Modeling menu that gives you ability to create and edit measures, as well as format them in appropriate way. you can also access managing relationships from this menu.
Let’s set formatting of percentage columns appropriate;
- in the Fields pane in right hand side click on TeamPasses.
- In the data set click on Passes Completed Percentage, and from Formatting section in Modeling menu choose percentage.
- Do this for all percentage columns in TeamPasses and TeamDefending
Now our model is ready enough to be used in reporting section
Visualization
Now let’s make some visualization with the Report tab based on the data in model we’ve built so far.
Goals Scored Column Chart
From the Fields section under TeamGoals, drag and drop teams into the main pane, then drag and drop Goals Scored on it. you will have a table showing values as below
Now from Visualizations click on Column chart, and you will see the result set in the chart easily.
Let’s highlight teams that scored their goals in open play, rather than penalty or own goals. We can achieve that with the fantastic new feature of Color Saturation in the Power BI as mentioned below
The result set also can be modified with some color them for background, and moving title to the center (all of these can be achieved in format section of the visualization)
Map
Double click on the map from visualization section. Then from TeamPasses Drag and drop Teams into the Location section of the map setting. Bring Passes Completed into the values section
Let’s change the color, and color saturate it by number of matches each team played.
Gauge
Now let’s create a gauge for goals scored. set the value as Goals Scored. Minimum value as Min of Goals Scored, Maximum value as Max of Goals scored, and the Target as Average of Goals Scored.
Clustered Column and Line Chart
as the last visualization item in this report I would like to show a column chart combined with line chart. the column values shows Completed Clearance from TeamDefendings, and the line value shows number of Saves. Here is the chart
Here is our completed report in the Power BI Desktop
Publish
Now let’s publish this report into Power BI website. for this section you have to create an account in Power BI website, if you don’t have it, just create it, it is free!
Click on Publish option in the menu above, and your report simply will be uploaded into your Power BI account.
Open the report in Web
You can even pin some visualization items as dashboards as illustrated below
Pinned dashboards can then be viewed separately in dashboards section
First of all: Thanks a lot for this post since I was the one who wrote down on your first post about the World Cup to go a Little bit more into detail 🙂
In the meantime I tried my own version and actually I am quite happy, but I have a big problem with showing how many ties, losses and wins one team had during the World Cup. How did you do it? And I also recognized that you had 4 queries about the matches (see here: https://radacad.com/wp-content/uploads/2015/04/pq.png – matches1, matches2, matchesreverse and allmatches).
I am pretty sure that you are spending a lot of time on other things (among the (online-)book about PowerBI, what I really appreciate), but could you please describe the problems I mentioned above a little bit more in detail?
Hi Christian,
Thanks for your kind feedback.
I’ve got matches information from a Wikipedia page:
http://en.wikipedia.org/wiki/List_of_2014_FIFA_World_Cup_matches
Then I’ve applied some transformation on it. For example it had round 16, 8… etc in different rows, so I’ve removed extra lines and columns. the data source had only team 1 on left side and team 2 on right, so it wasn’t easy to find out how many times each team won, so I’ve created a reverse matches data which team 2 sits on column team 1 and reverse. then I unioned all the data and the rest is sorting and some column selection.
I hope this explanation makes sense for you. However if it doesn’t I can share the *.pbix file for you
Cheers,
Reza
Hi Reza,
thanks for your answer, again! This is what I did, but it seems that I have some Errors in it. And maybe I am thinking in a too wide range, since I would love to add not only the data from the world Cup in 2014, but also some others (maybe not back until 1930, but to the 1970’s or 80’s???).
It would be great if you could share your .pbix file with me, that would probably help me to find the right way!
Thanks a lot!
Hi Christian,
I’ve emailed you the Power BI file.
Regarding fetching 1970 and 80… FIFA worldcup information, I strongly suggest you to use the open database for FIFA which has all the information in text format I believe. that information can be manipulated and mashed up with Power Query to have everything with nice shape and design in Power BI model.
Because that would be a bit more complex example, I might write about that a bit later in the book.
Cheers,
Reza