Be Fitbit BI Developer in Few Steps: Step 2 Loop Through All CSV Files

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

2

In the first step you’ve learned how to manipulate and mash up the data with few changes in Power Query M script. In spite of Power BI Desktop not being able to recognize your CSV file correctly because of the header row, you managed to write script and fetch data as you required. However this process needs to be repeated for the second file, and then the third file, and god knows what happens if you have 1000s of files. Real case scenarios contains many files, even millions. So a manual process won’t be the answer. You have to create your data transformation process in a way that works automatically not matter how many files are there.

Warning: This post contains scripting!

In this post we will write M Power Query script to loop through all CSV files in the source folder and process them one by one with the fetch table script that we produced in previous step. So be prepared to write some code, and don’t scare from it ;) It is almost impossible to create a BI solution for real world problem solving that doesn’t include scripting (even a bit), So you have to be familiar with that. Yes, Power BI is not just a nice drag and drop dashboard building tool, and has many powerful features. You have to spend time to learn it. So here you go, let’s write some codes here.

Looping Through CSV Files

Unfortunately Power Query or let’s say Power BI doesn’t have a loop structure, and that is because of the functional structure of this language. However there are data structures such as Table and List that can be easily used with each singleton function to work exactly as a loop structure does. Here in this post I will get you through the process of looping into files in a directory and processing them all, and finally combining them into a large big table. You will also learn some Power Query M functions through this process.

Where We’ve been in Previous Step

In previous step we ended up with below script

That generates table below from an exported CSV file of Fitbit activities

19

Now There are multiple files in the source folder that I want to repeat the process for all of them;

3

Get Data From Folder

Let’s start the process with reading all files in a folder. Power BI Desktop can do that easily. Through Get Data options choose from Folder and the select the folder containing source files as above.

1

This will load all files and their general information into a single table. There are columns for name of file, folder path, date created, modified, and etc. There is also a column named Content. This is an important column as it contains the whole content of each file in a Binary format.

2

Custom Function

So far we have list of all files and the script to convert their content into a desired table structure. We can create a custom function with that script and re-use it for each file’s content. The process of creating custom functions are easy. I’m not going to explains every parts of creating custom function because that’s a whole separate topic. If you are interested to read more about Creating Custom Functions in Power Query read this post.

Go into Advanced Editor, and create the function as below highlighted section;

Note that all the highlighted part from line 2 to line 16 in code above is the code for the custom function. The first line of function (line 2 in above script) contains name of the function; FetchTable, and the input parameter; table. From line 3 to line 16 is all the script we’ve copied from first step of this exercise.

To Make sure that the function works properly you can call it as below;

Add above single line of code in the IN section of script instead of Files. and you will see the result table as below:

3

So this shows the function definition and function call work perfectly. Now let’s loop through files.

EACH Singleton Function

EACH is a singleton function, which means it has one input parameter and one output. It can be applied on EACH record in a table or EACH item in a list. with using that you can apply calculations, data transformations, adding custom columns and many other operations. For our case I want to use EACH to apply FetchTable function on every Content value of the Files table. Each is easy to use as below;

As you see the single line above and specially the section each FetchTable([Content]) will apply FetchTable function for every Content value in each record of the table. and as a result we will have a new column called Data that includes tables. line 3 of script above just select Data column (because we don’t need other columns). Here is the result;

4

As you see in the result set above we have one table with one column (named Data), and it includes a Table in every item. This is the table that we want to use, so we have to combine all of these tables into one table. Fortunately there is a function for it in Power Query called Table.Combine .

Combining, Appending, or UNION of Tables

Table.Combine does the UNION of multiple tables into one table. This function get a LIST of tables (they should be all with the same structure), and returns a single table which is appended version (or UNION version) of all the table. Here is the definition of this function;

As you mentioned probably this function needs a LIST as an input parameter. And so far we have built a TABLE (single column table) which needs to be converted to list. There are different methods of converting a table into a list. I’ll show you one method here. I first fetch only a single column list from this table with Table.ToColumns function, and then I get the only first value of the list with List.First function. Finally I use the Table.Combine with the list parameter, here’s the code:

Line 4 converts the table to a list, but a list that has another list within. Because a table might have more than one column.

Line 5 gets the first item of the list, which would be the first column of the table. For our example as we have only a single column then only that column would be fetched.

Line 6 finally combines all the tables in the list into one large table.

And the result:

5

You can see that I have now all tables combined as I have all months together in this final table.

Here is the full script of this demo so far:

That ends this step. You’ve learned some scripting skills here today, and I hope you’ve enjoyed it. With few lines of scripting like this you can add massive power to your queries, you can automate things and get rid of manual extra work. Power Query is one of the most power full components of Power BI that gives you ability to apply many transformations and data mash up. The next step will be about Visualization of the data we’ve prepared so far, so stay tuned!

 

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad
Reza Rad is an Author, Trainer, Speaker and DW/BI Consultant. He has a BSc in Computer engineering; he has more than 15 years’ experience in databases, programming and development mostly on Microsoft technologies. He is a Microsoft MVP in Data Platform for seven continues years (from 2011 till now) for his dedication in Microsoft BI. He is author of some SQL Server and BI books, and also Power BI online book; from Rookie to Rock Star.

2 Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">