Power BI Desktop file stores many metadata information in it, one of the most important piece of information stored there is Power Query M scripts. It is always easy to open a PBIX file and realize the queries in Query Editor Window. However, if you have many queries and you want a way to get the script of all of them then this post is for you. I will explain a bit about the structure of PBIX file at the beginning and then I’ll explain how you can access the M script from it. If you like to learn more about Power BI and Power Query; read Power BI book from Rookie to Rock Star.
Structure of a PBIX File
Gilbert recently written a great blog post about the structure of PBIX file, and I strongly recommend everyone to read it. For this blog post, my focus is on M script section of that. so, I’ll go through other parts of the structure briefly. A Power BI Desktop file is a file with *.PBIX extension. This file is a renamed version of a ZIP file. You can simply rename the file to *.zip to see the structure of it. Then you can simply unzip it into a folder.
Files under this folder are;
Here is a very brief explanation of each file or folder;
- Report Folder: Layout of the Power BI visual reports (Including two files in the folder)
- [Content_Types].xml: Includes the content structure of this folder in XML format (older PBIX files might have different content structure)
- DataMashup: This is the file which is main topic of this blog post. It includes everything related to Power Query side of the file; M scripts, structure of queries, parameters, and functions.
- DataModel: This actually includes the data in the model (in a compressed format).
- DiagramState: It seems it stores the table and Matrix information.
- Metadata: Contains all labels (Names). In Power BI Desktop, everything has a GUID assigned to it. what we see as a table name, has a GUID assigned to it. This file is mapping of GUIDs and names that we see.
- SecurityBindings: I believe this is Row Level Security configuration in the Power BI stored in binary format.
- Settings: I believe it is some of the settings applied in Power BI settings menu options.
- Version: Version of the file.
in the future I’ll write some other posts with detailed explanation of every file and folder. For now, let’s focus on DataMashaup File.
DataMashup File: Everything You Need
Talking about Power Query; DataMashup file is all you need. It includes everything from the structure of queries, tables, parameters, list, to the actual M scripts behind the scene. You can Fetch all of these information from this single file. Let’s look at the structure of this file. If you open this file with a text editor. you will see some binary things first (which are related to the zipped nature of this file), and also some XML information. Yes, this is a zipped file. Let’s start with unzipping it into a folder. I’ve done that with 7-zip application.
In the folder you will have these files:
Here are details of each file;
- [Content_Types].xml: explains the content of this section, very high level information. it is just saying that M script are somewhere in this folder
- Config Folder: Includes one file: Package.xml, which is the information about settings of Power Query. For example the culture, versioning, and some other configurations.
- Formulas Folder: Includes M scripts! files with *.m extension.
You might end up having more than one file if you have more than 1 section, but this is unlikely to happen in usual cases. Usually everything you do in Query Editor goes under single section. If you are a M geek and know how to write multiple sections, then you might end up with having multiple files 😉 I will write another post in the future explaining Sections and their usages. For now, let’s look at this *.m file with a text editor:
M Script File
Section1.m File gives you all M script in well formatted text;
If you need to send your M scripts to someone this is the place to copy them from. This includes every single M script all in one place (as long as they are all under one section). Here is the actual Power Query’s Query Editor list:
all of these queries, parameters, and functions are in the section1.m file. You can easily understand if they are query, parameter of function. Look at screenshot below;
Queries are just starting with let expressions. However, for parameters there is a meta information:
meta [IsParameterQuery=true, Type="Text", IsParameterQueryRequired=true]
for functions; there is a FunctionQueryBinding section which explain where the function is sourced from;
[ FunctionQueryBinding = "{""exemplarFormulaName"":""Public Holidays in Victoria in 2017""}" ]
Green section above is a simple query
Red section is a parameter
and Yellow section is a function
So, very easily you can access all queries pane from this area. However, this doesn’t tell you the folder structure, or some other query editor settings. For example, many of these queries are not loading in Power BI and disabled for load. This *.m file doesn’t tell you that. That information, however, exists somewhere else. let’s look at that.
Metadata Information in XML Format
If you open the DataMashup file in a text editor (like Notepad++) you will see another story. content starts with some binary characters, but then some XML codes appears.
Starting binary data are related to the zipped nature of this file (which we’ve already went through it in previous part of this post). After binary characters, some XML code starts, which includes some schema definition. You will easily find all the metadata information between these tags;
all useful information are under LocalPackageMetadataFile tags. you can simply remove everything else. After removing everything else and keeping only the content between these tags, then you will have XML structure that includes all metadata information about Power Query’s Query Editor;
As you can see in above screenshot, there are XML tags for all attributes; name of every step and query (highlighted yellow), and their properties (highlighted green) such as LoadToReportDisabled, ResultType and Value.
If you be a Power Query geek like myself, you would dig into this structure with Power Query Itself! I have opened another Power BI Desktop file, and got data from this file; DataMashup File. I have done some steps, such as removing everything before and after main xml tags, then drilled into different sections of it, and I’ve got some results such as this:
As you can see it is fairly obvious that is the query Parameter, Function or a normal Query(highlighted yellow). Is the query inside a group or not (highlighted red), or is it enabled load in the report (highlighted green), and many other information.
Here you go. You have all you need in terms of understanding the metadata of tables, queries, parameters, and functions.
Summary
DataMashup file in *.pbix file included all you need to access to Query Editor’s metadata information, and also the M scripts. You can simply use this file to access to the whole code behind of Power Query component of your Power BI solution. In future posts I’ll go through steps of exploring data from the XML content in DataMashup file and also explain other files under *.pbix file. There are still a lot of information in XML content of DataMashup file, I encourage you all to explore it with Power Query and open your eyes to the world of metadata of Power Query 🙂
Tell me what part of this DataMashup file and metadata information you like most in the comments.
When I import/read XML files FROM FOLDER , they sometimes have different “culture versioning”.
or . I need a function to change the TextEncoding.Utf8
Hi,
Where could I find the calculated field expressions and measures field expression in the PBIX file?
Hi Cinna
If you use Power BI Helper, in the Model tab, you can get all expressions information and document it out to HTML format
Cheers
Reza
Can we modify the contents of the DataMashup file?
Anything can be changed.
however, it is not supported after your change, and you have to take the risk and responsibility of that change yourself.
Cheers
Reza
Is it still working for you in March 2020 update? I tried this last time with file saved with Feb 2020 update and it worked but now it does not.
Hi Jakub, the new format of the dataset is different. You can connect to it using Power BI helper, and then from the DMV tab, use the PARTITIONS schema to get M code.
Cheers
Reza
Hi Reza, very interesting post!
Do you know a way to extract from pbix file the last refresh date of each individual table? That is the date and time that pops up when you hoover on the table name in the filed section of PBI Desktop
Many thanks
Andrea
one way is to add a custom column in Power Query to each table with DateTime.LocalNow() and then use that for reporting