Find Mismatch Rows with Power Query in Power BI

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail

Finding rows that are in one table, but not the other is one of the most common scenarios happening in any data related applications. You may have customer records coming from two sources, and want to find data rows that exist in one, but not the other. In Power Query, you can use Merge to combine data tables together. Merge can be also used for finding mismatch records. You will learn through this blog post, how in Power Query you can find out which records are missing with Merge, and then report it in Power BI. To learn more about Power BI, read Power BI book from Rookie to Rock Star.

Sample Data Tables

In this sample scenario, I have two customer tables; one customer table coming from the website, and another customer table coming from an application. Here is the customer table from the website:

And the other table that comes from the application:

The customer table from the website has the email address field, plus name and CustomerKey. They table coming from the application has fields such as birthdate, gender, and marital status, plus name and customer key. the link field in this scenario is CustomerKey which exists in both tables.

Merge in Power Query

Previously I have explained about What the Merge is, and what is the difference of that to Append. I also explained in another blog post, about different types of Merge Kind. In this example, you would find a couple of those join kinds useful; Left Anti Join, and Right Anti Join. In Below Power Query example, I read the data from both tables, and I merge them together;

Left Anti Join: Records Only in the First Table

When I merge these two queries, I select Left Anti Join, that gives me rows that exist only in the first table (customer table from application):

This will give you only rows that exist in the first table “Customer table from application”

Then you can remove the last column in the output because the table value would not have any rows (this are mismatch rows)

Right Anti Join: Records Only in the Second Table

The same Approach can be used for rows that exist only in the second table, using the Right Anti Join

But right Anti Join will give you a result which looks a bit weird if you do not expand the table;

You DO NEED to expand the table for the second query to get mismatch rows when you use RIGHT ANIT Join. which is an extra step, but still works fine. You can remove all columns from the first table, and expand the last column;

Left Anti with Changing Order of Tables; Works similar to Right Anti

Or alternatively, you can switch the order of tables at the time of Merge, and use Left Anti join to get the same output;

The output of this is similar to the step we have done before (it gives you the records only in the second table “customers from the website”):

Summary

In summary, this post focused on two of the least common join types in Power Query; Left Anti join, and Right Anti join. These two join types are very useful when you want to find records that exist in one of the tables, but not the other one. All you do need is to select the right order of tables and merge type in the Merge command graphical interface. If you like to know more about other types of joins, read this post.

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
rssyoutuberssyoutube
Reza Rad
Reza Rad is an Author, Trainer, Speaker and DW/BI Consultant. He has a BSc in Computer engineering; he has more than 15 years’ experience in databases, programming and development mostly on Microsoft technologies. He is a Microsoft MVP in Data Platform for seven continues years (from 2011 till now) for his dedication in Microsoft BI. He is author of some SQL Server and BI books, and also Power BI online book; from Rookie to Rock Star.

2 Comments

  • Hi,

    Great article! I’ve known the use of left and right anti-joins for a while. One case I’m working on is finding the differences between 2 tables. And these two joins alone don’t get me the right result.

    If I have 2 similar tables, table A and Table B. And 3 actions have taken place. In Table A I have removed row 1, which is still present in table B. And in Table B I have changed record 3. Lastly I removed row 5 in table B.

    Doing a Left Anti-Join, on Table A, comparing to table B, will result in row 3 (that has changed in Table B) and row 5 (that was removed in table B). Yet it misses row 1 that has been removed in Table A.

    Doing a right Anti-Join,with a similar setup, will result in showing Row 1 (that was removed in table 1) and changed record 3. Yet shows nothing of row 5, as it is removed from Table B.

    My question is. If I want to compare 2 tables, see what rows are different from one another. Rows that have been removed from each side, added on each side, and adjusted. How can I most easily do that?

    Is there something like a full-Anti Join?

    • Hi Rick.
      Best would be sending your sample dataset to me to have a look. I believe your inquiry can be done with Merge (or every maybe append depends on the structure) and some conditional expressions. please send it to my first name at the radacad website domain dot come

      Cheers
      Reza

Leave a Reply

Your email address will not be published. Required fields are marked *