Combining two queries in Power Query or in Power BI is one of the most basic and also essential tasks that you would need to do in most of data preparation scenarios. There are two types of combining queries; Merge, and Append. Database developers easily understand the difference, but majority of Power BI users are not developers. In this post I’ll explain the difference between Merge and Append, and situations that you should use each. If you want to learn more about Power BI, read Power BI online book, from Rookie to Rock Star.
Why Combine Queries?
This might be the first question comes into your mind; Why should I combine queries? The answer is that; You can do most of the things you want in a single query, however it will be very complicated with hundreds of steps very quickly. On the other hand your queries might be used in different places. For example one of them might be used as a table in Power BI model, and also playing part of data preparation for another query. Combining queries is a big help in writing better and simpler queries. I’ll show you some examples of combining queries.
Result of a combine operation on one or more queries will be only one query. You can find Append or Merge in the Combine Queries section of the Query Editor in Power BI or in Excel.
Append means results of two (or more) queries (which are tables themselves) will be combined into one query in this way:
- Rows will be appended after each other. (for example appending a query with 50 rows with another query with 100 rows, will return a result set of 150 rows)
- Columns will be same number of columns for each query*. (for example col1, col2,…, col10 in first query, after appending with same columns in the second query will result into one query with single set of col1,col2, …, col10)
There is an exception for number of columns which I’ll talk about it later. Let’s first look at what Append looks like in action;
Consider two sample data sets; one for students of each course, Students of course 1:
and Students of course 2:
To append these queries, Click on one of them and select Append Queries from the Combine section of Home tab in Query Editor
If you want to keep the existing query result as it is, and create a new query with the appended result choose Append Queries as New, otherwise just select Append Queries. In this example I’ll do Append Queries as New, because I want to keep existing queries intact.
You can also choose to append Three or more tables and add tables to the list as you wish. For this example I have only two tables, so I’ll continue with above configuration. Append Queries simply append rows after each other, and because column names are exactly similar in both queries, the result set will have same columns.
Result of Append as simple as that
Append is similar to UNION ALL in T-SQL.
What about Duplicates?
Append Queries will NOT remove duplicates. You have to use Group By or Remove Duplicate Rows to get rid of duplicates.
What if columns in source queries are not exactly match?
Append requires columns to be exactly similar to work in best condition. if columns in source queries are different, append still works, but will create one column in the output per each new column, if one of the sources doesn’t have that column the cell value of that column for those rows will be null.
Merge is another type of combining queries which is based on matching rows, rather than columns. The output of Merge will be single query with;
- There should be joining or matching criteria between two queries. (for example StudentID column of both queries to be matched with each other)
- Number of rows will be dependent on matching criteria between queries
- Number of Columns will be dependent of what columns selected in the result set. (Merge will create a structured column as a result).
Understanding how Merge works might look a bit more complicated, but it will be very easy with example, let’s have a look at that in action;
In addition to tables in first example, consider that there is another table for Course’s details as below:
Now if I want to combine Course query with the Appended result of CourseXStudents to see which students are part of which course with all details in each row, I need to use Merge Queries. Here is the appended result again;
Select Course Query first, and then Select Merge Queries (as New)
Merging Queries require joining criteria. Joining criteria is field(s) in each source query that should be matched with each other to build the result query. In this example, I want to Merge Course query with Append1, based on Title of the course.
I’ll talk about types of join later. For now continue the selection, and you will see these two queries match with each other based on Course title, result query will be same as the first query (Course in this example), plus one additional column named as NewColumn with a table in each cell. This is a structured column which can be expanded into underlying tables. if you click on an empty area of the cell containing one of these tables, you will see the sub table underneath.
Now click on Expand column icon, and expand the New Column to all underneath table structure
Result will be a table including columns from both tables, and rows matching with each other.
Columns in the left hand side are coming from Course table, columns in the right hand side are coming from Students table. Values in the rows only appears in matching criteria. First three rows are students of Math course, then two students for English course, and because there is no student for Physics course you will see null values for students columns.
Merge is similar to JOIN in T-SQL
There are 6 types of joins supported in Power BI as below, depends on the effect on the result set based on matching rows, each of these types works differently.
Explaining what each join type will do is totally different post which I will write later. For now this picture explains it very well:
Picture referenced from: http://www.udel.edu/evelyn/SQL-Class2/SQLclass2_Join.html
Download the Power BI file of demo from here: