Dynamic SQL Using Power Query

Every so often I find myself needing to import data, but only want data relevant to values already existing in my data model.  This is common where I build smaller models based on a subset of fact or transaction records.  Rather than bring in entire dimension tables, I may only need the dimension records that cover the transactions I already imported.

Recently I worked on a  model that only contained 250 members of a dimension with over 6 million members.  Naturally I wasn’t keen to import 6 million records when I knew 99.9% would be irrelevant.

There are a few ways this can be achieved but I will highlight here a nifty and speedy technique that works when your data source accepts SQL statements.

The quick summary is :

  • Convert the column containing the data you’d like to use to a Power Query List
  • Use the Table.Combine function to turn the above list into a string that you can drop into an SQL  query

The step by step guide is ;

1. Grab data from a sample DB.  I downloaded and restored WideWorldImportersDW-Full.bak to my local machine.

I used the  SQL statement and named the imported table Sales.  This query retrieves all sales for just 1 salesperson, so not all stock items are covered.

SELECT 
	[invoice date key] , 
	[stock item key] , 
	count(*) AS Sales
FROM FACT.Sale 
WHERE 
	[Salesperson Key] = 4
GROUP BY 
	[invoice date key] , 
	[stock item key]

Step1

 

The query is deliberately hard-coded to Salesperson = 4 to limit the number of stock items.

For my next table, I only want to import rows that belong to stock items from the above table.  This could be handy if my source dimension table has millions of records and I only want to query the relevant items, rather than bring ALL records into Power Query or Power BI to filter.

My approach is bring another table in from the database and again use an SQL statement as the query.  I’m start with the following hard-coded query where I specify some stock items which I will change later.

	SELECT
		[Stock Item Key] ,
		[Stock Item] ,
		[Color] ,
		[Brand]
	FROM [Dimension].[Stock Item]
	WHERE
		[Stock Item Key] IN (
					1,2,3
					)

Step 2

 

This results in 3 rows as expected, but these are not the items I actually want.  I need to make the above query dynamic using a generated list of Stock items we actually need.

Right click the Sales table in the Queries bar and choose the reference option from the context menu.  This creates a copy of the Sales table in its final state. I name this new table mySQL.

Step 3

To convert this table to a List, right click the column header of the stock item key column and choose Remove Other Columns which should result in a single column Table.

Step 4

On the Home Tab, click Remove Rows and then Remove duplicate Row for the single column table.

Step 4a

From the Transform tab, click the Convert to List which should show the same data but as a list.

 

Step 5

The underlying M should look like this :

let 
    Source = Sales,
    #"Removed Other Columns" = Table.SelectColumns(Source,{"stock item key"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns"),
    #"stock item key" = #"Removed Duplicates"[stock item key]
in
    #"stock item key"

We now have a List which can be dropped into our Items Query.

Select the Items table from the Queries list and click the Advanced Editor from the Home tab.

Scroll to the right and find the highlighted text ….

IN (#(lf)#(tab)#(tab)#(tab)#(tab)#(tab)1,2,3#(lf)#(tab)#(tab)#(tab)#(tab)...

and replace with… (replace the red text)

IN (#(lf)#(tab)#(tab)#(tab)#(tab)#(tab)" & Text.Combine(mySQL,",") & "#(lf)#(tab)#(tab)#(tab)#(tab)...

The Text.Combine function converts our list into a string of comma separated values, with no need to remove any trailing or leading commas.

One more hurdle.

You may encounter a Formula.Firewall error such as :

firewall

You can fix this back in the Power BI options and settings under Privacy by selecting the option to Ignore the Privacy Levels and potentially improve performance.

ignore firewall

This should result in an Items table that only contains data based on results from another table.  This can be adapted in many ways and offers a faster alternative to a row-by-row option using the Invoke Custom Function approach.  However this does rely on an underlying data source that accepts SQL statements and more specifically, leveraging the IN operator.

final

You can view the query dependency in the query editor to see the order the queries execute.

dependancy

This could be extended to use parameters which I may cover a future post.

Philip Seamark on EmailPhilip Seamark on LinkedinPhilip Seamark on Twitter
Philip Seamark
Phil is Microsoft Data Platform MVP and an experienced database and business intelligence (BI) professional with a deep knowledge of the Microsoft B.I. stack along with extensive knowledge of data warehouse (DW) methodologies and enterprise data modelling. He has 25+ years experience in this field and an active member of Power BI community.

11 thoughts on “Dynamic SQL Using Power Query

  • Phillip,

    Thanks, that’s very interesting – a technique with a lot of potential uses. Should the graphic underneath “and replace with… (replace the red text)” be a different one? They currently appear to be identical rather than the expected ‘before and after’.

    • Hi Steve,
      They aren’t identical, I just didn’t scrolling was required (and I’d lost my red highlighting).
      Will fix now and thanks for the heads-up! 🙂

  • Hi Philip,
    Very interesting and useful article.
    Whenever you get around to paremeterized queries could you imagine a situation where the parameters came from Excel. We use Power BI as a datamodel generator until we have converted our Excel Power Pivot models. But I expect those models will have a life of their own besides Power BI. Some of these models use the idea of generating a dynamic dataset, so it could be great to let people work in their usual environment, and then grab the parameters from that selection. Is that a possibility?

  • I tried with another database and got the following error message when replacing the string with ” & Text.Combine(mySQL,”,”) & “. The column in the database is “numeric”.

    Expression.Error: We cannot convert the value 1 to type Text.
    Details:
    Value=1
    Type=Type

    Do you know how I can fix that? Thanks.

    • Same to me.
      The only solution was to transform data type in Sales from numeric to text. Then it works.

  • This doesn’t quite work for me. I get the “Permission is required to run this native database query.” notification, but when I choose Edit Permission -> Run, nothing seems to happen (The Alert never goes away and no data is returned).

    Too bad. This looks like the perfect solution to my problem.

  • I am populating two seperate List Items (single item value) with a date, eg. startDate and endDate.
    These will be variable (coming from Excel).

    I want to feed these start and end dates into the WHERE date BETWEEN startDate and endDate, but I can’t get it to work. Can we use this outside of the IN clause ?

  • This works so good! But there is an issue. You can’t auto refresh the data if there is a function in you Power Query “Text.Combine()”. Have you found a way around this?
    My problem is I generate a list in one system that can’t communicate directly with the other system… (the perfect Power BI problem), but if I have to filter in PowerBI I have to bring back millions of records, so I want to do the filter in the query.

  • I was just wrapping up doing this when I ran across Derrick’s comment above only minutes after he posted. When you go to setup auto refresh it states “You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh”. Extremely disappointing since that negates basically what Power BI is used for. Very cool application though and learned a bit about M.

Leave a Reply