To build a robust BI system, you need to cater for errors and handle errors carefully. If you build a reporting solution that the refresh of that fails everytime an error occurs, it is not a robust system. Errors can happen by many reasons, In this post, I’ll show you a way to catch potential errors in Power Query and how to build an exception report page to visualize the error rows for further investigation. The method that you learn here, will save your model from failing at the time of refresh. Means you get the dataset updated, and you can catch any rows caused the error in an exception report page. To learn more about Power BI, read Power BI book from Rookie to Rock Star.
Sample Dataset
I will use a sample Excel file as a data source which has 18,484 customer rows in it. In the sample Dataset, we have a BirthDate field beside all other fields, which supposed to have a date value in it. Here is what the data looks like when I bring it into Power Query:
Error Happens
When I get this dataset in my Power Query Editor window (as you see in the above screenshot), Power Query automatically converts the data type of the BirthDate column to Date. You can see this automatic data type conversion in the list of Steps;
Of course, you can turn off this automatic data type detection of Power Query, but that is not my point. I want the dataset to fail to show you how to deal with it. Errors happen in Power Query in the real world, and I’m here to show you how to find them.
As you can see in the Power Query Editor, I see no errors for this data type change, and everything looks great;
Now I load this dataset into Power BI, by using Close and Apply in the Query Editor window, and I expect everything to load successfully, however, this is coming out of the blue!
Does this sound familiar? Yes, if you have worked with Power BI for a while, you might have experienced it. No errors in Power Query Editor, but when we load the data into Power BI, there are errors! How that is possible? let’s first find out why this happens.
Why Power Query Editor didn’t Catch the Error?
Power Query Editor always work with a preview of the dataset, the size of the preview depends on how many columns you have, sometimes it is 1000 rows, sometimes 200 rows. If you click on a Query in the Power Query editor window, you can, in fact, see this stated down below in the status bar;
The reason for Power Query to use the preview dataset is mainly because of speeding up the transformation development process. Imagine if you have a table with 10 million rows, every single transformation that you want to apply on that dataset would take a long time, and you have to wait for it before you start doing the next step. The wait for the response each time will slow down your development process. This is the reason why working on a preview on the dataset is a preferred option. You can apply all transformations you want on the preview, and when you are happy with it, then apply it to the entire dataset. Usually, the first 1000 rows or the first 200 rows are a good sample of the entire dataset, and you can expect to see most of the data challenges there. Usually, not always of course.
How the transformation will be applied to the entire dataset then? When you load the data into Power BI. That means when you click on Close and APPLY in the Power Query Editor window. That APPLY means apply those transformations now on the entire dataset. That is the reason, why the load process may take longer especially if the dataset is big.
Power Query Editor always works with a preview of the data, to make the development process fast. When you load the data in Power BI, transformations will be applied on the entire dataset.
Now that you know how Power Query Editor deals with the preview of the data, you can guess why the error above happened? The reason is; The preview of the data (which was about 1000 rows) had no issues with the transformations applied (in this case automatic data type change to Date for the BirthDate column). However, the entire dataset (which is about 18K rows) have problems with that transformation! When you see the error above in the Power BI Desktop, then you can click on View errors and go to Power Query editor and see those rows, and deal with them somehow, and fix it. However, that is not enough.
What if the error doesn’t happen in Power BI Desktop, but happens in a scheduled refresh in the Power BI Service?
True! Fixing errors in Power BI Desktop is easy, but consider that the error didn’t happen in the desktop too, and you got your Power BI report published to the website, and scheduled it to refresh. Then the next day you see the report failed to refresh with an error! You have to learn how to deal with the error rows beforehand before it cause the scheduled refresh to fail. Let’s see how to deal with it then.
Dealing with Errors: Catching the Error Rows
To deal with errors, you have to catch the error before it loads into Power BI. One way to do it is to create two references of the same table, one as the final query, and the other one as Error Rows.
in the screenshot above, I renamed the DimCustomer table to DimCustomer – Original, and then created a Reference from it. If you would like to learn what Reference is, read my article about Reference vs Duplicate here. The new referenced query can be called as DimCustomer. This would be the clean query with no errors (we will remove errors from it in the next step);
The new table is the table that would be clean with no errors and we can use it in the report. Let’s clean this from any errors
Remove Errors from the Table loading into Power BI
As DimCustomer would be the final query for us, I want to remove errors from it. Removing errors is a simple option in the Home tab, under Reduce Rows -> Remove Rows -> Remove Errors. Make sure that you select the BirthDate column before that.
You can also do this for all columns if you want; by selecting all columns and then using Remove Errors. This post is just a sample on one column and can be extended to all for sure.
Remove Errors will be a step in the data transformations step, and it means that when you click on APPLY, it will apply on the entire dataset, so as a result, when the data type change cause an error, the next step after that which is Remove Errors, will wipe the rows that caused the error. But the DimCustomer – Original still may cause the error, so we have to uncheck the Enable Load of that query.
Now we have successfully removed the errors, and if we load the data into Power BI. There would be no errors happening.
But wait! what about those error rows? How we can catch them? We need to catch those rows and investigate what happened and think about an action plan to fix them, right? So, we do need another query reference from the original query, but to keep the error rows.
Keep Errors in the Exception Table
Similar to the Remove Errors option, there is also an option to Keep Errors. If you have seen this option before, you may have wondered what is the use of such a thing? well, here is the exact use case scenario. Keep Errors will help to catch the error rows in an exception table.
Create another reference from the DimCustomer – Original.
Rename this new query as DimCustomer Error Rows. For this query, we have to Keep Errors, which can be found close to where the Remove Errors is, but under Keep Rows.
Now this table would only keep rows that cause an error. Here is a sample set;
This is not the end of the story. If you load this new table of DimCustomer – Error Rows in Power BI, you will end up with the same error again. Why? well, because this query is definitely going to return error rows! You need to remove the Error occurred from this dataset.
Getting Error Details
If you remove the error column from the exception table we have created, then you would have no details about the error happened, and it would be hard to track it back and troubleshoot. The best would be catching the error details. The error message and the value that caused the error are important details that you don’t want to miss. Follow steps below to get that information.
In the Error Rows table, add a Custom Column.
In the Custom Column editor, write “try” and then after a space, name of the field that caused the error. In our example: BirthDate;
try [BirthDate]
try (all lowercase), is a keyword in M that will catch the error details. Instead of returning just an error, it will return a record containing the error details such as the source value and the error message. Below screenshot shows how the output of try would come;
The Record output of the “try” will have two fields; HasError (which we already know it is going to be true), and the Error. The Error is another record with more details. Click on Expand on the Custom column, and just select Error.
In the output column named Error, click on Expand again and this time select all columns;
It is good to have the original column name as a prefix because then you would know that these are error detail columns.
Now you would get the full details of the error as below;
The information above is your most valuable asset for the exception reporting.
Remove Error Column
Now the last step before loading the data into Power BI is to remove the column that causes the Error. In our example; the BirthDate Column should be removed (otherwise the refresh will fail again);
Exception Report
You can now load the data into Power BI. You will have two tables; DimCustomer, and DimCustomer – Error Rows. DimCustomer is the table that you can use for your normal reporting. DimCustomer – Error Rows is the table that you can use for exception reporting. The exception report is the report that can be used for troubleshooting and will list all the errors to users for further investigation. Make sure that there is no relationship between these two tables.
Here is a sample report visual I created that shows the errors;
Summary
Errors happen, and you have to deal with them. Instead of waiting for the error to happen and then finding it a month after it caused, it is better to catch them as soon as they happen. In this article, you learned a way to deal with error rows. In RADACAD we always create an exception report for Power BI reports. That way, we are always sure that the refresh won’t fail because of the error, and we would also have a place for investigating the errors, which is called the Exception Report. Do you have an exception report? if not, go and create one, If yes, tell us about your experience down below in the comments.
Wow
Raza, this article is excellent. I learned several new techniques from you. Thank you for sharing your knowledge.
That’s brilliant! Thank you Reza.
I have several datasets which are based on other people’s (dubious) Excel spreadsheets – this will be a great help.
I’m off to build this everywhere 🙂
Under the Error Happens section > Power Query Editor screenshot, each column shows 3 percentages for Valid/Error/Empty. My Power Query Editor doesn’t show this info. Is this an Add-In? If so, how can obtain same?
Hi Joan
That is called Column profiling. It is a new feature added to Power Query from October release. You can enable it in the View Tab; Column Quality and Column Distribution. check both boxes
Cheers
Reza
This is quite illuminating Reza, I have just been pondering on how to deal with data refresh failures today and i just happened on this article. I can’t wait to use this. Also your series on Aggregate table was a delight. Many thanks.
Thanks Fabian
Glad it helps 🙂
Hi Fabian. This is an amazing post. Thank you for taking the time to do this.
I have one question though, I see a screenshot in the beginning of the post showing proportion of Valid, Error and Empty values. How does that work? And is there a way to get that in Power Query that comes with Excel (not Power BI Desktop)?
Hi Ejaz.
My name is Reza!
Everything in that screenshot except the window that says apply query changes, is what I have built. When you read the post you learn how to do it yourself. Yes, you can do the same in Power Query with Excel
Cheers
Reza
I am sorry the question I had was already asked and answered. I have another question though.
Do you have any advice on adding these checks to a lot of columns.
A blog post is coming explaining exactly that point, very soon. Stay tuned!
Thank God I found this article! I’ve spent countless of hours trying to figure out why power query is throwing out an error and worst thing is Power BI just closes . I’m getting data from 10 different SQL dbs and applied some data cleansing and transformations . The remove error option did it!
This is damn good, Reza. Well structured. ..
Great article. This works great when you what the erroneous column is. I have a table with 600 columns and 100,000 records. I end up with about 50 errors. Is there a way to limit the preview columns to only those with errors? How can I find which columns have the errors without having to slowly scroll through all the fields and look for the “Error” text?
Hi Mike,
There are two things I have to respond to your question:
first: Why you have 600 columns?!!! this looks like un-cleansed data to me. Do you really need them all in your analysis? all 600 columns? in one table?
second: to answer your question about which columns caused the error, I actually wrote the second part for this article here.
Cheers
Reza
Hi Reza, this post is amazing I must use it everytime get data from excel haha.
Question: I watched you on Ignite and you mention that the best way to make this productive should be a custom connector. I started the development but the thing here is that power query sdk won’t let you run code with error, no even if I add a remove error step. The code won’t run and will throw exception.
It seems that the SDK is not like Edit Queries where the error is captured in a record and the code continues.
Do you have any suggestion on how can I keep going with this?
Hi Ignacio
Great to see you here, and glad you liked the session and article.
the code using SDK should work. I can have a look into it if you share your code with me. as long as the final step doesn’t return error values you should be fine.
Cheers
Reza
Tat amazing and great support to move forward with Power BI platform.
So much appreciated.
Thank you,
Laurie Y,
Hi Reza,
In the screenshot where you are looking for errors in the birthdate column, there is a helpful-looking chart between the column header and the data cells, that shows how many errors there are, how much data is valid, etc. Where does this display come from? I can’t find it anywhere in Power BI Query Editor.
Is this because you are using Power Query as a separate application whereas I am using the Query Editor built into Power BI?
Thanks,
Ken
This is Data Quality chart. You can enable it in the View Tab using the checkbox for it.
Cheers
Reza
Found it! View –> Data Preview –> Tick boxes for “Column Quality” and “Column distribution”
Thanks Reza!
Hi Reza,
This was very interesting article, thank you! I like the way you write, very clear and no shortcuts. I already followed many of your links and found your book(s) as well, lot of bookmarking activities here.
You mentioned the followig problem in the beginning of this article:
“What if the error doesn’t happen in Power BI Desktop, but happens in a scheduled refresh in the Power BI Service?”
I have this situation now, and the error message says:
“The query referenced calculated column ‘TABLE'[FIELD] which does not hold any data because evaluation of one of the rows caused an error.”
The mentioned column refers to a column I have created with DAX (which refers to columns in two different tables). I made your error checking to all fields in these two tables, but it resulted to empty tables, i.e. no errors in Power Query state.
Any suggestions how to catch that kind of an error? I was notified by the error yesterday, and by sorting the “error column” in the Data view (normal Desktop view, not Power Query), I found two infinity values. After quick-fixing that, the report started to work again, but after last night’s refresh I got the same error. All the values seem to be ok in the Data view.
It is very confusing not to have any errors in the Power BI Desktop, but to have them in Power BI Service. It is then usually someone else who notices the problem before me…
Thank you,
Aurora
Hi Aurora
Thanks for your kind words. Glad you like the content and writing style 🙂
Regarding your scenario: Your error comes from a calculated column in DAX. Power Query cannot catch that error. Because Power Query transformations happen BEFORE processing DAX calculations.
Errors in DAX should be processed differently. I might write about it sometimes in the future.
one way to fix is to have a measure with the expression of your calculated column and the same row context applied, and show it in a visual report page so you can figure out there is an error.
Cheers
Reza
Hi Reza, thanks for the article. Would you know why Desktop would flag such an error (“X loaded queries contained errors”) but the dataset refresh in the Service does not? Thanks!
Hi Adolfo
there can be multiple reasons for this. here are a couple I am thinking:
1. the Locale of your local system is different from in the Power BI service, and data type changes behave differently
2. caching
3. the data at the time of refresh might be different
4. the connection to the data source might require some additional setup
…
Cheers
Reza
Hi Reza, came across your website completely by accident and am pleasantly surprised many great techniques here not mentioned anywhere else on the net (Or hard to find).
Something to ask – I know removing the error causing column will allow refresh in the future but is there a reason to do this? Wouldn’t we technically be deleting relevant data from the same row? This will affect our report integrity since we are ‘throwing out’ data. (Asking this question due to inexperience)
Hi.
Thanks 🙂
It depends on the situation and the system really. in some scenarios, without even having a row of data, the whole story doesn’t make sense, in some other scenarios, the system would live without having the error rows. at least for a while, and having the troubleshooting reports will show something happened, so users know that part of the data is not there. But they can still use the system partially.
Cheers
Reza
Another great video, Reza.
Thank you!!
Hi Reza,
In table Keep Error Rows, on creating Custom column .. I do not see the expand option to have ERROR column..
Can you please help.
have you created the column with “try …” yet?
Hi Reza, the technique you have explained to find the errors is great and is very simple to understand. thanks!!