Power Query Not for BI: Event Date and Time Scheduler – Part 1

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

Picture1

Previously you have read how to use Power Query as the data transformation component of Power BI and you learned how to use it in BI applications. You can read more about Power Query usage in BI in Power BI online book from Rookie to Rock Star. In this series I will explain how to use Power Query for non-BI usage. This data transformation tool can be used anywhere to use input parameters, fetch the data from web or any other data sources, mash up the data to form the required shape, and automate the process. In this series I will show you a real use case of using Power Query for non-BI application.

 

Why?

As an introduction to this series, I want to take you to the path that leads me to use Power Query here. You might be aware that I am teaching Power BI courses, and most of my courses are online and Live. This means that courses are not recorded videos, it is me on the other side of the line with full interactive audio and video experience with students with Go2Meeting application. Students connecting to me from other places in the world. So I do need an event date/time scheduler that I can announce date and time of the event in different time zones.

Fortunately there is a very good website that helps to find a date/time in different time zones. In this website I can set my input parameters as the date/time of my event locally (in my city), and name of the event, and duration.

2016-05-13_11h51_05

Then the website produces a result set of important cities in different time zones with the date and time of the event in their time zones. That’s brilliant, isn’t it?

2016-05-13_11h52_26

The main issue here is that; I want to use this list in my course announcement web page, and list above is overwhelming! If I copy and paste list above, it will make my page so long. Audience will lose interest to read such a long page to find their city and local time zone. What I need is this:

  • List of all above cities with the event date time on their time zone, categorized by time. That means; Los Angeles, San Francisco, Las Vegas, Seattle, and Portland, should be all listed in one row of my result set table.

2016-05-13_12h05_36

  • I want this list to be in Excel format, so I can send it to my mailing list or attach it as an excel table
  • I don’t want to produce the data from website. I would like to change parameters in my Excel file, and it automatically fetches data from this website, and load result set back to my Excel spreadsheet.

So as a result I decided to use Power Query, because with Power Query I can;

  • Have input parameters in an Excel sheet for local date/time, and duration of the event.
  • Browse web URL and fetch result set.
  • Mash up the output data to produce my desired data set.
  • Load result set into Excel.

So Let’s now start building this solution step by step.

 

What You Will Learn?

By completing this scenario, you will learn a real-world use case of Power Query for non-BI solution. You will learn using Power Query for;

  • Defining a table for input parameters in Excel sheet
  • Using Power Query to read values from that table as input parameter and build web URL for timeanddate.com web query.
  • Querying web URL and Fetching the data
  • Data Manipulation with Power Query such as; Removing Columns, Replacing Values, Appending Tables, Grouping Rows, Adding Custom Columns, Changing Tables to Column, Columns to Lists, Lists to Tables, and Concatenating Table Columns with Comma.
  • Automating the Power Query solution to take the parameters and produce result set when you hit Refresh button.

 

Prerequisite

This example has been done with Excel 2016 which has Power Query embedded. So if you have Excel 2016 you don’t need to install anything. If you are using Excel 2013 or 2010 you need to download and install Power Query for Excel Add-in. Power Query editor experience in all of these versions of Excel are the same. The only difference is that Power Query menu options in Excel 2016 is located under Data Tab;

001

And in Excel 2013 and 2010 are located under Power Query Tab;

024

 

Get Data from Web

I’ll start by the easy part of the work, which is using a static web URL like this:

http://www.timeanddate.com/worldclock/fixedtime.html?msg=Power+BI+Training&iso=20160530T09&p1=22&ah=3

In Power Query I can fetch tables produced in this web page. Follow the menu as: Go to Data Tab (In excel 2013/2010 Power Query Tab), New Query, From Other Data Sources, and then From Web

001

Enter the URL mentioned above in the dialog box

2016-05-13_12h21_07

Power Query is smart enough to separate each table, and It shows you a preview of each table. Choose the table that has values with cities and date/times.

2016-05-13_12h22_26

I can see in preview above that there are some extra columns, and also I need to apply some transformations. So I Click on Edit to lunch query editor for further data manipulation.

2016-05-13_12h23_22

Apply Basic Transformations

When query editor lunches and Power Query loads the preview of table, it also automatically applies some data type changes as the very first step. Here is the data set when it loads in query editor for the first time;

002

There is a Header column which is not giving me any useful information, So I’ll remove it by right click on the column and Remove column

003

There are also some * characters in cities names (at the end of some cities, such as Barcelona*). So I remove them by clicking on the Location Column, then choosing Replace Values from icons under Home menu, and then in replace values dialog box I replace * with blank.

004

I’ll do the same step again Location2 Column this time. Here is what my result set looks like so far;

rep1

As you can see in screenshot above all applied steps so far are listed in the right pane (numbered 3 in screenshot above). This is one of the great features of Power Query that allows you rollback changes anytime you want and check preview of query for specific step.

In screenshot above I have to set of columns; first one numbered 1, and the second one numbered 2. These are not repetitive values. You can check city names in columns Location and Location2. The fact is that timeanddate.com website split the result set into two set of columns and produced HTML table like that. So for reading the full data set, I have to separate these two set of columns and then combine their values into a single set of columns.

So I’ll produce a result set for the second set of columns with removing the first set;

rep2

This will give me the second set of columns as below;

rep3

Then I rename the column names for this data set. The reason for this rename is that later on when I want to combine two queries together column names should be the same. I’ll rename Columns to be Location, Start Time, End Time. for rename I just right click on each column and Rename

rep4

And here is my result set for the second data set:

rep5

Insert a Step

Now I want to produce another variable for the first set of columns. Instead of creating the whole process again from Get data from web, I start from one step ago which was before removing columns. As I’ve mentioned earlier I can grab that status easily from the Applied Steps section in Query Settings pane. There is a step named Replaced Value1. This step belongs to where I replaced * characters with blank for Location2 columns.

I Click on Replaced Value1 step and then choose second set of columns to remove.

rep6

When I click on Remove columns as above, I see a message says; Are you sure you want to insert a step? I get this message because this is not the latest step in my query editor. As long as I know what I am doing adding an extra step is fine.

009

I hit Insert for the message above and a new step will be created called Removed Columns2. With the result set as below;

2016-05-15_20h23_13

Fixing Insert Step reference issue

When I insert a step in Power Query, the step after that will source from inserted step. This will cause some issues in my scenario. I inserted an step and removed Location2, Start time2, and End time2. and then in Removed Columns1 I want to remove other columns and in Renamed Columns rename columns that does not exists anymore. So I have to fix the reference issue happened by inserting the step. Here is how I fix it:

I go to View tab, Advanced Editor (or I can find Advanced Editor button in Home tab). This will open Advanced Editor with M code for me. Then I replace the highlighted part to be referenced from Replaced Value1.

rep7

Now my data sets exists as below:

Removed Columns2: the data set for first set of columns

Renamed Columns: the data set for second set of columns

Append Queries

Now It’s the time to combine these two data sets. There is a menu option that combines two data sets. However, it only appends two QUERIES. As I have everything here in one query (but different steps or let’s say variables), so I can’t use that option. Fortunately, I can do that with a single line of M scripting.

I click on Advanced Editor to see the M code generated behind the scene. Then I Insert code line below at the end of LET clause, after adding a single comma (which is line separator in Power Query) in the line before it.

Statement above creates a new data set as a variable called Appended Query. I’ve used Table.Combine to append queries (or UNION them in SQL and Database terminologies). This function is a simple function and gets data sets to be combined which are Removed Columns1 and Renamed Columns1.

Then I show this step in the result set with writing that in IN section:

Now my result set have both data sets combined into one. (I can check this includes 144 columns for 144 cities; combined list of 72 cities in each of previous data sets).

rep8

 

Next Steps

It will be really long if I mention all steps in a single post, So stay tuned. Next step would be Grouping rows and applying some transformations to get the desired output. Finally I will show you how to use parameters in the query and automate the process.

Power Query Not for BI; Event Date and Time Scheduler – Part 2

Power Query Not for BI; Event Date and Time Scheduler – Part 3

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad
Reza Rad is an Author, Trainer, Speaker and DW/BI Consultant. He has a BSc in Computer engineering; he has more than 15 years’ experience in databases, programming and development mostly on Microsoft technologies. He is a Microsoft MVP in Data Platform for seven continues years (from 2011 till now) for his dedication in Microsoft BI. He is author of some SQL Server and BI books, and also Power BI online book; from Rookie to Rock Star.

14 Comments

  • There is something wrong with the steps detailed. “Renamed Columns” step is only done once in the post, therefore, there is no generation of the step “Renamed Columns1″. Kindly, correct me if I’ve understood the same wrongly.
    Would be great if you can post the M Query Code or Excel Workbook here.

    Thanks.

    • Hi Deepak,

      Wow, I impressed by how fast you started to follow the steps. Well Done.
      Regarding the Renamed Columns1 Step; That is where I’ve mentioned: Then I rename the column names for this dataset. The reason for this rename is that later on when I want to combine two queries together column names should be the same. I’ll rename Columns to be Location, Start Time, End Time. (Right before Append Queries)
      As renaming was easy step to do, I just did that explanation. Sorry if it is confusing.
      I’ll share the whole M script at the end of series, Stay tuned.

      Cheers,
      Reza

      • Hi Reza,

        Thanks for the response. However, when I rename the columns, the step created is “Renamed Columns” and not “Renamed Columns1″, therefore, I get an error also i.e. “Location2 could not be found”. Even if I modify the query,
        #”Appended Query” = Table.Combine({#”Removed Columns1″, #”Renamed Columns1″ })
        to
        #”Appended Query” = Table.Combine({#”Removed Columns1″, #”Renamed Columns” }),
        the code throws the same error.

        However, this being said, I’ll go through the complete M script and see where is the issue. Thanks.
        -Deepak

        • Moreover, when I insert a step after “Replaced Values1″, “Removed Columns2″ comes before “Removed Columns”. However, in your part 2 of the series, “Removed Columns2″ comes after “Removed Columns1″, therefore, the step sequence in query is not same. Kindly, correct me if I’ve wrongly understood something. Thanks.
          -Deepak

          • [Typo in earlier reply]
            Moreover, when I insert a step after “Replaced Values1″, “Removed Columns2″ comes before “Removed Columns1”. However, in your part 2 of the series, “Removed Columns2″ comes after “Removed Columns1″, therefore, the step sequence in query is not same. Kindly, correct me if I’ve wrongly understood something. Thanks.
            -Deepak

        • Thanks Deepak for all your detailed comments.
          You are absolutely right about naming and order of some steps.
          However I believe naming and order of some steps are easy for readers like you to figure it out. If after renaming, your step named as Renamed Columns, then that is absolutely fine, use that in the query. The main point to consider here in this walk through is that: we have two set of columns; first one with names; Location, Start Time, End Time (name it whatever you like, I call it first data set), and second one with names; Location2, Start Time2, and End time2. rename this second data set to: Location, Start Time, End Time. and then finally append result of first data set and the second data set.
          The reason that step names and orders are a bit different in my screenshots is that I’ve built the demo first, and then I wrote and generate screenshots.
          Please let me know if it is still hard to figure out, or you face an issue yet, and I’ll do my best to correct steps with right naming convention and order.

          Sorry for inconvenience

          Cheers,
          Reza

          • Hi Reza,
            Thanks a lot for the quick response. I’m not able to go forward because I’m getting the error mentioned earlier in my reply even after correcting the step name. The issue is, how can Removed Columns2 be after Removed Columns1 when we are inserting the step after Replaced Values1, which places Removed Columns2 before Removed Columns1 and after that I follow the steps mentioned I get the same error again. Thanks.
            -Deepak

  • Hi Reza,
    Following is my M Script after Replaced Values step:
    #”Replaced Value1″ = Table.ReplaceValue(#”Replaced Value”,”*”,””,Replacer.ReplaceText,{“Location2″}),
    #”Removed Columns2″ = Table.RemoveColumns(#”Replaced Value1″,{“Location”, “Start time”, “End time”}),
    #”Renamed Columns” = Table.RenameColumns(#”Removed Columns2″,{{“Location2″, “Location”}, {“Start time2″, “Start time”}, {“End time2″, “End time”}}),
    #”Removed Columns1″ = Table.RemoveColumns(#”Renamed Columns”,{“Location2″, “Start time2″, “End time2″}),
    #”Appended Query” = Table.Combine({#”Removed Columns2″, #”Renamed Columns” })
    in
    #”Appended Query”

    The Script is not appending the tables one below the other. If I replace “Removed Columns2″ with “Removed Columns1″ in Appended Query, I get an error that Location2 could not be found. I’m not able to debug the same. Request your help. Thanks.
    -Deepak

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">