Watch Your Steps! Power Query Performance Caution for Power BI

I work with Power Query transformations every day these days, and I want to share one simple, but critical caution with you. If you use Power Query a lot, this tip can improve the performance of your transformation significantly. The number of steps that you add in a query counts in the performance of your data transformation (if you have too many steps), I like to show this to you through an example. If you like to learn more about Power BI and Power Query, read the Power BI from Rookie to Rock Star book.

Too Many Variables

This is a sample Power Query file, which in that I do a very simple transformation. The transformation is adding one to the existing number. However, in this sample, we are doing it over thousands of steps! one step at a time, we are adding thousands to a number. The main reason to do it this way is to show you what is the performance you get when you have too many variables (or let’s say steps) in Power Query.

Here is my sample query:

let
    a0= 0,
a1=a0+1,
a2=a1+1,
a3=a2+1,
... //
... // each variable used in the next variable with a plus one
... //
a1808=a1807+1,
a1809=a1808+1,
a1810=a1809+1,
a1811=a1810+1,
a1812=a1811+1
in
    a1812

The query above takes 15 minutes to run on my Surface Book 2 machine with Core i7 CPU and 16GB memory! The 15 minutes that you cannot touch Power BI through it. it will not respond to your actions, you have to wait for that time to see the result after that.

Oh! That is a really long time for a query that just adds one value in every step, Isn’t it? Let’s see how the resources consumption is in the system with Power BI Desktop and Power Query running. here it is only halfway through:

Uh! Even with a simple calculation like adding one to a number I had over 10GB memory consumption and 70% CPU usage for a long period! What do you think caused it? The number of variables of course. There is nothing else in this query.

Having too many variables, or too many steps, cause performance issues. Power Query allocates memory for every variable, and memory consumption raises significantly. The processor also takes lots of time to process that number of variables through the Power Query Engine.

Of course, this example was an exaggerated example of too many variables. You would never have 1,800 variables. However, this example also was on a variable with a single simple numeric value. In most of the cases, your variables are tables with many data rows and columns. So this can happen in a real-world scenario for you even with hundreds of steps or variables. You need to watch the number of variables in other words.

Too Many Variables will cause performance issue!

What is the Solution?

Well, if the number of variables is the cause of the issue, the remedy would be reducing the number of variables in the first place. If I change the query above to this one below, I get the result in less than a second!

The query above runs in less than a second, vs the previous query that took 15 minutes to run! The only difference is the number of variables. So here is the take away for this post:

Watch the number of variables (steps) in Power Query, If too many variables, then best to combine some of them

Important Note: I am not saying that you have to combine all your variables. You will not usually see performance issues with less than a hundred variables, this problem starts to appear when the number goes beyond that. One critical point you need to consider is that if you combine variables together, then debugging or troubleshooting or maintenance of the code would be harder as well. so use this solution only if needed.

Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for 12 continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.
Reza is author of more than 14 books on Microsoft Business Intelligence, most of these books are published under Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, Power Query books series, Row-Level Security in Power BI and etc.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.

2 thoughts on “Watch Your Steps! Power Query Performance Caution for Power BI

Leave a Reply