Filtering in Power Query component of Power BI is easy, however it can be misleading very easily as well. I have seen the usage of this filtering inappropriately in many cases. Many people simply believe in what they see, rather than seeing behind the scenes. Power Query filtering is totally different when you do basic filtering or advanced filtering, and the result of filtering will be different too. In this post I’ll show you through a very simple example how misleading it can be and what it the correct way to do filtering in Power Query. This post is a must read for everyone who use filtering in Power Query. If you like to learn more about Power BI, read the Power BI online book from Rookie to Rock Star.
For this example I’ll use AdventureWorksDW SQL Server database example. only one table which is DimCustomer.
Filtering in Power Query
Filtering rows is one of the most basic requirements in a data preparation tool such as Power Query, and Power Query do it strongly with an Excel like behavior for filtering. To see what it looks like open Power BI Desktop (or Excel), and Get Data from the database and table above; DimCustomer, Click on Edit to open Query Editor window.
In the Query Editor window there is a button on top right hand side of every column that gives you access to filtering rows
You can simply filter values in the search box provided.
For basic filtering all you need to do is typing in the text, number, or date you want in the search box and find values to select from the list. For example if you want to filter all data rows to be only for the FirstName of David, you can simply type it and select and click on OK.
Result then will be only records with David as first name.
Exactly as you expect. Now let’s see when it is misleading.
Misleading Behavior of Basic Filtering
If you want to do equity filtering, basic filtering is great. For example you only want to pick David, and that’s what we’ve done in above example. However what if you want to do similarity filtering? For example, let’s say you are interested in FirstNames that has three characters of “Dan” in it. You can do it in the same search box of basic filtering, and this will give you list of all first names that has “Dan” in it.
And then you can click on OK. the result will be showing you all first names with “Dan” characters in it. So everything looks exactly as expected, right?
However as a Power Query geek I always look at the M script of each step to see what is happening behind the scene. For this case, here is the Power Query M script;
The script tells the whole story. Despite the fact that you typed in “Dan” and Power Query showed you all FirstNames that has “Dan” in it. the script still use equity filters for every individual FirstName. For this data set there won’t be any issue obviously, because all FirstNames with “Dan” is already selected. However if new data rows coming in to this table in the future, and they have records with FirstNames that are not one of these values, for example Dandy, it won’t be picked! As a result the filter won’t work exactly as you expect. That’s why I say this is misleading.
Advanced Filtering: Correct Filtering
Basic filtering is very handy with that search box and suggestions of items while you typing, but it is misleading easily as you have seen above, and can’t give you what you expect in many real-world scenarios. Advanced Filtering is simply what you can see on top of the Search box, which depends on the data type of column can be Text Filters, Number Filters, Date Filters and etc. If we want to do the same “Dan” filter here, I can go to Text Filtering and select Contains.
You can even choose filters such as Begins With or Ends With or many other choices. For this example I’ll select Contains, and type in Dan in the text box of Contains.
After clicking on OK, you will see same result set (like the time that you did basic filtering with “Dan”), however the M script this time is different;
You can see the big different now, This time M script used Text.Contains function. This function will pick every new values with “Dan” in the FirstName. There will be no values missing in this way. You will be amazed how many Power Query scenarios are not working correctly because of this very simple reason. There are advanced filters for all types of data types.
Basic Filtering is good only if you want to do equity filtering for values that exists in the current data set, however it won’t work correctly if you want to check ranges, or contains or things that is not an exact equity filter. Advanced Filtering is the correct way of filtering in Power Query, and there are advanced filters for all types of data types; Numbers, Text, Date…. This is very simple fact, but not considering that will bring lots of unwanted behavior and incorrect insight to your Power BI solution.