One of the common patterns of a data model is the subscription pattern. In this pattern, we have subscriptions (or tickets, or issues, or whatever else you want to call it) open date and close date. The way that this data is stored in a table makes it a bit challenging to get informative insight out of it. In this post (first of the series), I am going to explain about the subscription pattern and one of the common calculations needed for it; which is active subscribers at any given date, or open tickets at any given date.
What is a Subscription Pattern
If we talk about individual sales patterns, the data is stored mostly with one or more directly actionable date fields; an example is the order date. You can use the Order Date directly in a slicer and find all sales transactions that happened on that date. Subscription pattern is when we have a table with FROM and TO date fields. these also can be called as START and END, or OPEN and CLOSE. The FROM or TO fields together convey the meaning, not individually.
For example; subscribers to an online product or service, are stored in a table like this:
If I connect a date table to this table, let’s say to the Start Date, then what I am going to say? I can analyze subscriptions that have started on a specific day. If I connect a date table to End Date, then I can find out subscriptions that IF ended on a specific date. But what about all the other dates in between?
As an example, this is very common to ask what were the number of active subscription on 10th of Feb 2020? Or open tickets on that date?
So, the subscription pattern is when the two date fields convey a meaning together. You need to get insight FROM this field TO that field, you need the range in between, the meaning is not just the two values themselves, it is more than that.
Snapshot Table Can be Huge
One way to work with a subscription pattern is to build a snapshot table. Snapshot table is a table that has a record per day (even if that day had no action happening), per product, per customer, and per all other dimension’s key attributes. Although this can be an effective method, the snapshot table can be soon become one of the largest tables in your data model and consume a lot of resources. Imagine how big a snapshot table can get over time:
(365 days) * (20 years) * (10 products) * (1000 customers)= 73,000,000 records
And this is just for 10 products and 1000 customers. This table would be usually much bigger than this.
I will write another blog article and explain the method to create a snapshot table in that, However, I believe there is more value in doing this calculation using DAX measure, as you won’t need such a big snapshot table. So, here we talk about that second method.
Sample Dataset
My sample dataset, which you can download it down below this post, includes tables like below:
You already have seen the sample content of the Subscription table in the screenshot earlier. The date table is also created as a general date table using the script here.
You might wonder why the Date table is not connected to the Subscription table? Well, think about it this way: Date table should be connected to a table that individual date fields of that table have a meaning just by themselves. Otherwise, the relationship doesn’t make sense. For example; if I connect the Date table to Start Date, then I can have charts and visuals that tell me subscriptions that have STARTED at any given date.
Having a relationship from the Date table to the Start Date or End Date is absolutely fine, and I won’t discourage you from doing so. That relationship will help you to create charts tell you that this is when subscriptions have started and ended. However, that will never give you the active subscribers at any given date. That is what we are going to build through a DAX measure.
DAX Measure: Active Subscribers
The goal for this measure is to tell us how many subscribers have been active on any given date. Let’s go through that using an example;
If I want to see how many active subscribers we had on 10th of Feb 2020, then these are rules I need to apply:
- Find all subscriptions that their End Date is blank, these are subscriptions that are active even just right now.
- Find all subscriptions that their End Date is AFTER 10th of Feb 2020, these are subscriptions that are not active anymore but have been active at that date.
- Both of the above operations should be done only on subscriptions that have started BEFORE 10th of Feb 2020.
My DAX measure then is the below script:
Active Subscribers = var _currDate=SELECTEDVALUE('Date'[Date]) var _firstStartDate=MIN(Subscription[Start Date]) var _subscribers= FILTER( Subscription, ( ISBLANK(Subscription[End Date]) // still subscribers || Subscription[End Date]>=_currDate ) // was active that date but ended now && Subscription[Start Date]<=_currDate // started before that date ) var _countRows=COUNTROWS(_subscribers) return if( _currDate<=TODAY() && _currDate>=_firstStartDate, if(ISBLANK(_countRows),0,_countRows) )
This is how the expression above works:
Getting the current date from the visual:
var _currDate=SELECTEDVALUE('Date'[Date])
Getting the first start date in the subscription table:
var _firstStartDate=MIN(Subscription[Start Date])
Create a virtual table with all the rules that I mentioned above:
var _subscribers= FILTER( Subscription, ( ISBLANK(Subscription[End Date]) // still subscribers || Subscription[End Date]>=_currDate ) // was active that date but ended now && Subscription[Start Date]<=_currDate // started before that date )
returning the count of rows of the virtual table
var _countRows=COUNTROWS(_subscribers)
Because we want the calculation only to be done up until today’s date, checking the current date against today’s date, and also to be after the first start date in the subscription table
return if( _currDate<=TODAY() && _currDate>=_firstStartDate, if(ISBLANK(_countRows),0,_countRows) )
Altogether, this simple DAX measure helps us to get all we need without the need to a super large snapshot table:
Active Subscribers = var _currDate=SELECTEDVALUE('Date'[Date]) var _firstStartDate=MIN(Subscription[Start Date]) var _subscribers= FILTER( Subscription, ( ISBLANK(Subscription[End Date]) // still subscribers || Subscription[End Date]>=_currDate ) // was active that date but ended now && Subscription[Start Date]<=_currDate // started before that date ) var _countRows=COUNTROWS(_subscribers) return if( _currDate<=TODAY() && _currDate>=_firstStartDate, if(ISBLANK(_countRows),0,_countRows) )
You can see that I have used Subscription without the ALL in my expression, and the reason is that I don’t have any relationship between the Date table and the subscription table, If you have that type of relationship, then here, you would need to use ALL to change the context of filtering from relationship-based, to be custom based on your DAX code.
If you want to slice and dice this data by other dimensions, you can. Here you can select a customer and see the subscription history of this customer:
As you can see, this customer, was an active subscriber until the 14th of Feb, and then re-subscribed on the 1st of March.
Download Sample Power BI File
Download the sample Power BI report here:
Hi Reza,
this is a greag blog post. Thank you for that.
I was wondering how you would set up the measure if I do have a relationshop between my date table and my subscription table. Because normally that would be case. I would have an active relationshop between the date and my start date, so I can measure how many subscribers have started on a particular date. Then I would also create an Inactive relationshop between the date and my end date so I can measure how many subscribers ended on a particular date (I use this relationshop with the userelationship function in DAX). So I need both of my relationships. But I would also want to know how many subscribers I had on every single date.
So how would you use the ALL-function?
Thanks in advance for your response.
Hi Peter,
Here is the code using ALL;
Active Subscribers =
var _currDate=SELECTEDVALUE('Date'[Date])
var _firstStartDate=MIN(Subscription[Start Date])
var _subscribers=
FILTER(
ALL(Subscription),
( ISBLANK(Subscription[End Date]) // still subscribers
|| Subscription[End Date]>=_currDate ) // was active that date but ended now
&& Subscription[Start Date]<=_currDate // started before that date ) var _countRows=COUNTROWS(_subscribers) return if( _currDate<=TODAY() && _currDate>=_firstStartDate,
if(ISBLANK(_countRows),0,_countRows)
)
Cheers
Reza
Hi Reza,
I didn’t realize it was that simple.
It works flawless! Very nice solution which comes in handy in a los of cases.
Does it work flawlessly on big models too?
Thanks again and keep up all you good work in the community!
Best regards,
Peter
Hi Peter.
You’re welcome.
On a very huge table, it might be a bit slower, especially because the calculation happens runtime. However, comparing it with the snapshot scenario which you need to create a super huge table, this method usually performs better.
Cheers
Reza
Really great solution, thank you. I have added a Revenue column to your data and with a similar measure, replacing the countrows dax with a sum dax, I get the Revenue per day.
Hi Paul
Great to see it comes to use 🙂
Cheers
Reza
Hi Reza,
I am trying to use this method to show active employees grouped using a time intelligence measure (prior month, prior month last year, etc) is joined to a date dimension table. I employee counts should also group by sites, so my employee data includes a site_id and joins on this to a sites table. The start & end dates are represented by hire_date and termination_date. Unfortunately the measure currently returns zeros. Would I need to change something to handle the sites and time intelligence groupings? Here is my measure DAX…
Active_Employees =
var _currDate=SELECTEDVALUE(‘Time Intelligence'[Date])
var _firstHireDate=MIN(‘HR_Employees'[hireddate])
var _employees=
FILTER(
ALL(‘HR_Employees’),
( ISBLANK(‘HR_Employees'[termdate]) // still employees
|| ‘HR_Employees'[termdate] >=_currDate ) // was active that date but terminated now
&& ‘HR_Employees'[hireddate] <=_currDate // started before that date
)
var _countRows=COUNTROWS(_employees)
return
if(
_currDate=_firstHireDate,
if(ISBLANK(_countRows),0,_countRows)
)
Hi John
Is your Date dimension Marked as Date table? or you are using the default date table of Power BI? depends on that, you might need to add “.[Date]” in some parts of the expression to get it working.
Cheers
Reza
Thanks for the Response Reza,
My Time Intellgence DAX create a periods table with Prior Month, Prior Month -1, Prior Month Prior Year, Year to Date & Prior Year to Date. All the columns and measures in my data are by Date and are at a Site level, so they rollup to these groups. Unfortunetly Headcount is something we could get at a Date level but don’t want to summarize, we just want the end of period value. I am not sure how to acheive this?
Do you get the slicing and dicing by the current date in your visual already?
can you send me your PBIX file?
Cheers
Reza
Hi Reza, this was the most comprehensive article on this matter so far: got me the closest to the desired results. I am stuck though where: the active results, when filtered by date – it shows what was active as long as it was raised on that day. I would like when I filter by day to see everything that was (is) active: what was raised on that day as well as what was already raised previously, but still open.
Any advice?
Much appreciated
This should give you that result.
To make sure we are talking about the same problem, can you share some screenshots of your tables and the data and the desired output?
Cheers
Reza
Hi Reza,
I downloaded your file and tried to get the active subscribers by month, instead of the Date as you visualized here. However the measure fails to aggregate over the months using the date table, and the graph returns blank. I have the same issue in my own data set. Is it not possible to use the other columns in the date table with this measure, to be able to see the subscribers per month/year/week instead of per day?
Hi
some of the functions that I used here, requires a date dimension, which might not work for you if your data is at the month level
can you share screenshots of your tables?
Cheers
Reza
HI I have the same issue as when I want to filter based on quarter no or month no the data is not broken down, how shall I adapt the calculation?
Hi
Many of these calculations are written on a monthly basis. if you want quarterly the calculations has to be modified to leverage the quarterly context
Cheers
Reza
Hi Reza,
(apologies if this is duplicated from the other day, my post appears to have been lost)
Firstly, thank you for this. It’s working great and provided me with new insights. I’m using it to count open tech support tickets.
Is this solution something that can also handle time of day? I’d really like the cutoff to be at 17:30 rather than midnight, as automated tickets come into our desk thoruhg the night.
And do you know if it would be possible to utilise a date heirachy with this? When I try using anything but my Dates[Date] column, I see no data.
Thank you so much,
Ben
Hi Ben
Very interesting use case
Do you have a time dimension? I have an article about that. you need to have filters on both dimensions in that case
Cheers
Reza
Hi. How do I use this pattern if I want to count active subscribers (distinct) if they have several subscriptions.
Hi Roger,
A general piece of advice is that your distinct count should not consider the subscription ID. it should be based on the userID for active subscriptions.
for more detailed advice, please share your dataset, and we can write a blog article explaining that.
Cheers
Reza