Watch Your Steps! Power Query Performance Caution for Power BI

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

I work with Power Query transformations every day these days, and I want to share one simple, but critical caution with you. If you use Power Query a lot, this tip can improve the performance of your transformation significantly. The number of steps that you add in a query counts in the performance of your data transformation (if you have too many steps), I like to show this to you through an example. If you like to learn more about Power BI and Power Query, read the Power BI from Rookie to Rock Star book.

Too Many Variables

This is a sample Power Query file, which in that I do a very simple transformation. The transformation is adding one to the existing number. However, in this sample, we are doing it over thousands of steps! one step at a time, we are adding thousands to a number. The main reason to do it this way is to show you what is the performance you get when you have too many variables (or let’s say steps) in Power Query.

Here is my sample query:

The query above takes 15 minutes to run on my Surface Book 2 machine with Core i7 CPU and 16GB memory! The 15 minutes that you cannot touch Power BI through it. it will not respond to your actions, you have to wait for that time to see the result after that.

Oh! That is a really long time for a query that just adds one value in every step, Isn’t it? Let’s see how the resources consumption is in the system with Power BI Desktop and Power Query running. here it is only halfway through:

Uh! Even with a simple calculation like adding one to a number I had over 10GB memory consumption and 70% CPU usage for a long period! What do you think caused it? The number of variables of course. There is nothing else in this query.

Having too many variables, or too many steps, cause performance issues. Power Query allocates memory for every variable, and memory consumption raises significantly. The processor also takes lots of time to process that number of variables through the Power Query Engine.

Of course, this example was an exaggerated example of too many variables. You would never have 1,800 variables. However, this example also was on a variable with a single simple numeric value. In most of the cases, your variables are tables with many data rows and columns. So this can happen in a real-world scenario for you even with hundreds of steps or variables. You need to watch the number of variables in other words.

Too Many Variables will cause performance issue!

What is the Solution?

Well, if the number of variables is the cause of the issue, the remedy would be reducing the number of variables in the first place. If I change the query above to this one below, I get the result in less than a second!

The query above runs in less than a second, vs the previous query that took 15 minutes to run! The only difference is the number of variables. So here is the take away for this post:

Watch the number of variables (steps) in Power Query, If too many variables, then best to combine some of them

Important Note: I am not saying that you have to combine all your variables. You will not usually see performance issues with less than a hundred variables, this problem starts to appear when the number goes beyond that. One critical point you need to consider is that if you combine variables together, then debugging or troubleshooting or maintenance of the code would be harder as well. so use this solution only if needed.

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for eight continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.

Leave a Reply

Your email address will not be published. Required fields are marked *