To build a robust BI system, you need to cater for errors and handle errors carefully. If you build a reporting solution that the refresh of that fails everytime an error occurs, it is not a robust system. Errors can happen by many reasons, In this post, I’ll show you a way to catch potential errors in Power Query and how to build an exception report page to visualize the error rows for further investigation. The method that you learn here, will save your model from failing at the time of refresh. Means you get the dataset updated, and you can catch any rows caused the error in an exception report page. To learn more about Power BI, read Power BI book from Rookie to Rock Star.
I will use a sample Excel file as a data source which has 18,484 customer rows in it. In the sample Dataset, we have a BirthDate field beside all other fields, which supposed to have a date value in it. Here is what the data looks like when I bring it into Power Query:
When I get this dataset in my Power Query Editor window (as you see in the above screenshot), Power Query automatically converts the data type of the BirthDate column to Date. You can see this automatic data type conversion in the list of Steps;
Of course, you can turn off this automatic data type detection of Power Query, but that is not my point. I want the dataset to fail to show you how to deal with it. Errors happen in Power Query in the real world, and I’m here to show you how to find them.
As you can see in the Power Query Editor, I see no errors for this data type change, and everything looks great;
Now I load this dataset into Power BI, by using Close and Apply in the Query Editor window, and I expect everything to load successfully, however, this is coming out of the blue!
Does this sound familiar? Yes, if you have worked with Power BI for a while, you might have experienced it. No errors in Power Query Editor, but when we load the data into Power BI, there are errors! How that is possible? let’s first find out why this happens.
Why Power Query Editor didn’t Catch the Error?
Power Query Editor always work with a preview of the dataset, the size of the preview depends on how many columns you have, sometimes it is 1000 rows, sometimes 200 rows. If you click on a Query in the Power Query editor window, you can, in fact, see this stated down below in the status bar;
The reason for Power Query to use the preview dataset is mainly because of speeding up the transformation development process. Imagine if you have a table with 10 million rows, every single transformation that you want to apply on that dataset would take a long time, and you have to wait for it before you start doing the next step. The wait for the response each time will slow down your development process. This is the reason why working on a preview on the dataset is a preferred option. You can apply all transformations you want on the preview, and when you are happy with it, then apply it to the entire dataset. Usually, the first 1000 rows or the first 200 rows are a good sample of the entire dataset, and you can expect to see most of the data challenges there. Usually, not always of course.
How the transformation will be applied to the entire dataset then? When you load the data into Power BI. That means when you click on Close and APPLY in the Power Query Editor window. That APPLY means apply those transformations now on the entire dataset. That is the reason, why the load process may take longer especially if the dataset is big.
Power Query Editor always works with a preview of the data, to make the development process fast. When you load the data in Power BI, transformations will be applied on the entire dataset.
Now that you know how Power Query Editor deals with the preview of the data, you can guess why the error above happened? The reason is; The preview of the data (which was about 1000 rows) had no issues with the transformations applied (in this case automatic data type change to Date for the BirthDate column). However, the entire dataset (which is about 18K rows) have problems with that transformation! When you see the error above in the Power BI Desktop, then you can click on View errors and go to Power Query editor and see those rows, and deal with them somehow, and fix it. However, that is not enough.
What if the error doesn’t happen in Power BI Desktop, but happens in a scheduled refresh in the Power BI Service?
True! Fixing errors in Power BI Desktop is easy, but consider that the error didn’t happen in the desktop too, and you got your Power BI report published to the website, and scheduled it to refresh. Then the next day you see the report failed to refresh with an error! You have to learn how to deal with the error rows beforehand before it cause the scheduled refresh to fail. Let’s see how to deal with it then.
Dealing with Errors: Catching the Error Rows
To deal with errors, you have to catch the error before it loads into Power BI. One way to do it is to create two references of the same table, one as the final query, and the other one as Error Rows.
in the screenshot above, I renamed the DimCustomer table to DimCustomer – Original, and then created a Reference from it. If you would like to learn what Reference is, read my article about Reference vs Duplicate here. The new referenced query can be called as DimCustomer. This would be the clean query with no errors (we will remove errors from it in the next step);
The new table is the table that would be clean with no errors and we can use it in the report. Let’s clean this from any errors
Remove Errors from the Table loading into Power BI
As DimCustomer would be the final query for us, I want to remove errors from it. Removing errors is a simple option in the Home tab, under Reduce Rows -> Remove Rows -> Remove Errors. Make sure that you select the BirthDate column before that.
You can also do this for all columns if you want; by selecting all columns and then using Remove Errors. This post is just a sample on one column and can be extended to all for sure.
Remove Errors will be a step in the data transformations step, and it means that when you click on APPLY, it will apply on the entire dataset, so as a result, when the data type change cause an error, the next step after that which is Remove Errors, will wipe the rows that caused the error. But the DimCustomer – Original still may cause the error, so we have to uncheck the Enable Load of that query.
Now we have successfully removed the errors, and if we load the data into Power BI. There would be no errors happening.
But wait! what about those error rows? How we can catch them? We need to catch those rows and investigate what happened and think about an action plan to fix them, right? So, we do need another query reference from the original query, but to keep the error rows.
Keep Errors in the Exception Table
Similar to the Remove Errors option, there is also an option to Keep Errors. If you have seen this option before, you may have wondered what is the use of such a thing? well, here is the exact use case scenario. Keep Errors will help to catch the error rows in an exception table.
Create another reference from the DimCustomer – Original.
Rename this new query as DimCustomer Error Rows. For this query, we have to Keep Errors, which can be found close to where the Remove Errors is, but under Keep Rows.
Now this table would only keep rows that cause an error. Here is a sample set;
This is not the end of the story. If you load this new table of DimCustomer – Error Rows in Power BI, you will end up with the same error again. Why? well, because this query is definitely going to return error rows! You need to remove the Error occurred from this dataset.
Getting Error Details
If you remove the error column from the exception table we have created, then you would have no details about the error happened, and it would be hard to track it back and troubleshoot. The best would be catching the error details. The error message and the value that caused the error are important details that you don’t want to miss. Follow steps below to get that information.
In the Error Rows table, add a Custom Column.
In the Custom Column editor, write “try” and then after a space, name of the field that caused the error. In our example: BirthDate;
try (all lowercase), is a keyword in M that will catch the error details. Instead of returning just an error, it will return a record containing the error details such as the source value and the error message. Below screenshot shows how the output of try would come;
The Record output of the “try” will have two fields; HasError (which we already know it is going to be true), and the Error. The Error is another record with more details. Click on Expand on the Custom column, and just select Error.
In the output column named Error, click on Expand again and this time select all columns;
It is good to have the original column name as a prefix because then you would know that these are error detail columns.
Now you would get the full details of the error as below;
The information above is your most valuable asset for the exception reporting.
Remove Error Column
Now the last step before loading the data into Power BI is to remove the column that causes the Error. In our example; the BirthDate Column should be removed (otherwise the refresh will fail again);
You can now load the data into Power BI. You will have two tables; DimCustomer, and DimCustomer – Error Rows. DimCustomer is the table that you can use for your normal reporting. DimCustomer – Error Rows is the table that you can use for exception reporting. The exception report is the report that can be used for troubleshooting and will list all the errors to users for further investigation. Make sure that there is no relationship between these two tables.
Here is a sample report visual I created that shows the errors;
Errors happen, and you have to deal with them. Instead of waiting for the error to happen and then finding it a month after it caused, it is better to catch them as soon as they happen. In this article, you learned a way to deal with error rows. In RADACAD we always create an exception report for Power BI reports. That way, we are always sure that the refresh won’t fail because of the error, and we would also have a place for investigating the errors, which is called the Exception Report. Do you have an exception report? if not, go and create one, If yes, tell us about your experience down below in the comments.