Import Email Attachments Directly Into a Power BI Report using Power Query

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
FacebooktwitterlinkedinrssyoutubeFacebooktwitterlinkedinrssyoutube

Power Query has more than 80 data sources supported, and Microsoft Exchange is one of those data sources. Although it is not a new data source, however, the power of this data source combined with the data transformation power of Power Query is not clear for everyone. Using this data source, you can get all emails in your inbox (or other folders), and then filter is based on sender or subject, and then expand attachments of emails into a table and visualize it at the end in Power BI. You can do all of that using Power Query in Power BI, you won’t need to do any manual step of saving attachments into a shared folder, and then get data from that folder. In this article, I’m going to show you how it is possible. To learn more about Power BI, read Power BI book from Rookie to Rock Star.

Get Data from Email

Mailbox is a data source that can be very beneficial. There are a lot of important data coming through emails. sometimes with attachments, sometimes without. A mailbox also can have other parts such as calendar, tasks, and people which can be very helpful for a data insight solution. In Power BI, or let’s say using Power Query, you can start getting data from your mailbox, using the Get Data option from Microsoft Exchange;

Then you need to enter your email address;

and after that, you will be asked to enter credentials. If you have Office 365 mailbox, then you can use Microsoft tab to login. After successful login, you will see a list of all entities that you can get data from that;

As you can see in the above screenshot, I can get Calendar items, Mails, Meeting requests, People and Tasks from a mailbox. The screenshot above shows an example of a Calendar entity. However, in this example, I’m focusing on Mails. So I choose that to get data from and then click Edit to filter my data before loading it into Power BI;

This will bring up the Power Query Editor with a preview of emails I have in my inbox;

This would be a table including all email properties as well as a couple of fields for HasAttachments and Attachments itself!

Filtering Data

In this example, I’m focusing only on emails with specific attachments. You have to find the rules to filter the table based on it. For example, if you are going to filter based on the sender, then you can expand the Sender field (which is a record itself), and then filter based on the email of the sender;

Or if you want to filter based on the subject, then you can do the filtering based on that;

Filtering the Folder

You might also want to filter based on the folder in your email account (in case you have emails coming or going from multiple folders), These are folders in my inbox for example; which from that list I select Inbox only.

Filtering the Subject

In this example, I’m expecting emails to come with the subject line of “sample files for email PQ test”, so I filter it this way;

And I put the subject line I’m expecting there;

As a result, I only have emails with that subject line in that specific folder;

Expanding Attachments

If you scroll further right in the columns list in the table above, you will see two columns for attachments; HasAttachment, and Attachments. HasAttachments will have values of TRUE or FALSE saying that this email has attachments or not (which can be also a good filter added to the previous step too, to make sure you only get emails that have an attachment). If you click on the Attachments itself, you will see a table which has all the details about attachments;

Let me show you where this is coming from, this is one of the emails above in my mailbox; (you can see that there is an attachment for it as well, with exactly the same size, name and extension)

We are only interested in the content of attachments, so let’s expand to this folder, however, first it is better to remove all other columns as we don’t need them anymore;

Then you can expand the Attachments using the expand icon;

You will have now a table of all attachments coming from all emails that we have filtered before;

There are columns such as Name of the attachment file, Extension of it, the size and some other properties. Most importantly, we have the AttachmentContent column, which includes the file itself in the binary format. And that is the column, which we are going to focus on. So let’s remove all other columns.

Combine Files

Now that we have all files in a table column, we can combine them using the Combine Files option. I have previously explained how the Combine Files works behind the scene in this article. In my example, attachments are CSV files, and I just click on Combine Files;

Power Query, will scan the files, and consider the first one as a template structure for others (remember for this method your files should have the same structure, otherwise you need to create multiple copies of this process into multiple tables using different sets of filters). My files are CSV files, Power Query understand the structure and comes up with a preview of the default structure;

This process will create a custom function behind the scene from one of the files and will loop through it for all files, as the result, I will have a table including the content of all files together;

The dataset in my example needs some cleanup, and it would look like this afterward;

Here you go! You have expanded all attachments directly into Power BI without needing any manual download and save in a shared folder. All through Power BI and Power Query! Before you start implementing it in a real-world scenario, here is an important tip to remember!

Make it Failproof

Building a solution is one thing, and making it failproof is another. There are tons of reasons why a solution might fail somewhere in the future. for example, consider that the next email comes like this:

The email above has some CSV files which are fine, However, it has some other attachments too. For example, the image in the signature! when I refresh the model that I built so far in Power Query, I get this:

In this case, I know the reason is because of that image attachments, and I can go to the Expanded Attachments step in Power Query to find that out;

I can filter the Extension filed to .csv only to make sure I won’t get other types of attachments;

This way, the solution would work fine with no errors.

You should always look for finding where and how your implementation might break, and prevent it from breaking.

I have written a full set of articles about creating Exception Reports in Power BI and Power Query, which I strongly recommend reading and implementing;

Summary

Power Query can get data from Exchange, which helps you to automate the process of importing attachment’s data directly into Power BI or Excel. You can then use the transformative power of Power Query for filter based on all necessary fields, and expand attachments into a structured table. This method will be a fully automated solution, and the next time that you receive an email with attachments, it will bring them into the output too. However, make sure to build a failproof solution always. This method would also work with shared mailboxes, as long as there is an actual mailbox associated with it.

Hope the solution helps you in your everyday work, and if you have a question or challenge implementing it, please do let me know in the comments below.

Facebooktwittergoogle_plusredditpinterestlinkedintumblrmailFacebooktwittergoogle_plusredditpinterestlinkedintumblrmail
FacebooktwitterlinkedinrssyoutubeFacebooktwitterlinkedinrssyoutube
Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for nine continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
He wrote some books on MS SQL BI and also is writing some others, He was also an active member on online technical forums such as MSDN and Experts-Exchange, and was a moderator of MSDN SQL Server forums, and is an MCP, MCSE, and MCITP of BI. He is the leader of the New Zealand Business Intelligence users group. He is also the author of very popular book Power BI from Rookie to Rock Star, which is free with more than 1700 pages of content and the Power BI Pro Architecture published by Apress.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.

13 thoughts on “Import Email Attachments Directly Into a Power BI Report using Power Query

  • Haven’t followed it in practice, but it fits pretty well in a demand I see coming at work. Thank you!

  • Hello Reza,

    thanks for detailed explanation. could it be possible not combine attachments. I want to merge my files but couldn’t find a way for it.

    regards
    Cem

    • Hi Cem.
      The combine works best with Excel, Text, CSV files. Especially when they have a similar structure. If you can’t get the Combine to work, you either have other files. Or you have other files somewhere in your list (make sure to filter out any unnecessary files), or the structure of files is not the same.
      Cheers
      Reza

  • This is great!
    I would like to know if there is a setting that I need to change to make sure that it automatically updates, however.
    What intervals will it be checking the e-mail box for new updates? I am assuming that once you set it up correctly that it will automatically import whatever e-mails match the filters, but I don’t see anywhere to tell Power BI to check every hour or every minute, etc.

    • This is an IMPORT Data model at the end of the day. The frequency of the update would be the frequency of setting your data refresh in the Power BI service 🙂

      Cheers
      Reza

    • every file is a binary content behind the scene.
      but processing Jpeg or pdf would need a bit more effort to dig into the metadata. PDF is already supported as a data source in Power Query and it makes it much simpler, but for Jpeg, you would need a bit of extra work.
      Cheers
      Reza

  • Hi,

    How scale-able is this? If I have daily files in my email box and I want to combine them into a query. Will this slow down the query over time as the days pass?

    • of course, more files, more rows, will take more time to process.
      However, you can use another approach to make it faster. use a dataflow for emails from long time ago, which won’t change.
      and then use normal Power Query in PBI Desktop for new emails.
      then you can merge them together.
      because the dataflow one stores the output in CSV intermediate storage, it can help us speeding up the refresh time.
      Cheers
      Reza

  • Thanks for the explanation. Is it also possible to have the attachment combined data together with email meta data like sender? So that every row in the attachment has also the sender info

    • Yes, It is
      At one of the early stages, before removing all other columns and keeping only Attachments column, you can also expand the Sender column, and keep that column too.
      Cheers
      Reza

Leave a Reply

Your email address will not be published. Required fields are marked *