Customer retention is one of the most common scenarios of data analysis, which is very useful for business. As a business, you would like to know how many new customers you have in each month, how many returning, and how many lost customers. Knowing these, would help you to focus on the growth plan, and bring more new customers, keep more returning customers, and reduce the number of lost customers. I have written previously about a simple way to find out which customers are new, or lost. However, in this post, I am going to share more details about some other analysis that can be useful, especially when you want to analyze the customer retention for each period (when the date is an axis of the chart).
Customer Retention Over Period
It is useful to know if a customer is new, lost or returning right now, which is what I have explained here:
However, it is even more useful to know that at which period that customer became a new, lost, or returning customer:
The visual above is doing the calculation of new, lost, and returning customers for each period.
the data model for this sample dataset is like this:
Below is a list of expressions that are useful for this analysis.
Staring Easy: This period’s Customers
Getting this period’s customers, is a simple discount count as below:
This period's customers = DISTINCTCOUNT(FactInternetSales[CustomerKey])
this is just showing how many unique customers we have in each period (active customers who made a purchase in this period)
Running Total Customers: All Customers Until This Period
Another useful calculation for customers is to find out the total of customers from the beginning of this business until now. regardless of the fact that they are active or not. This is a running total calculation, which you can achieve either using quick measures or writing the statement like below;
Running Total Customers = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] return if(_salesthisperiod>0, CALCULATE( [This period's customers], FILTER( ALLSELECTED(DimDate[FullDateAlternateKey].[Date]), ISONORAFTER(DimDate[FullDateAlternateKey].[Date], _currdate, DESC) ) ) )
The reason for using IF statement and checking the sales value is that if there is no sales in this period, we consider it as a month that nothing yet happened (these are probably months in the future in our date table), we are not interested to do our calculation for those months.
Running Total Sales
In addition to the running total of the count of customers, it is also good to have the running total of sales. the Sales Measure itself has this calculation:
Sales = SUM(FactInternetSales[SalesAmount])
and the running total sales comes as below;
Running Total Sales = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) return CALCULATE( [Sales], FILTER( ALLSELECTED(DimDate[FullDateAlternateKey].[Date]), ISONORAFTER(DimDate[FullDateAlternateKey].[Date], _currdate, DESC) ) )
This is a measure that we use in calculating the count of new, lost, and returning customers.
New Customers – Fixed Period
for calculating the new, lost and returning customers. I first, explain how it works with a fixed period. Let’s say we consider a customer new, if that customer purchased everything in his/her transaction history only the current active period, and let’s say the active period is one year (fixed). So, if there is a customer who purchased in total (up until that given date) $250, and in the period of 1 year from the given date in the table, it is also $250, then that customer is new in that period, otherwise, not. Here is one of the ways you can calculate it:
New Customers - Fixed Period = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1, YEAR)), 'Running Total Sales', [Running Total Sales] ) var _newcustomers= FILTER(_customers, [Running Total Sales]=[Sales Until Now] && [Sales Until Now]>0) return if(_salesthisperiod>0, COUNTROWS(_newcustomers))
The code above is creating a virtual table with two new columns. the table has one row per customer (because it is created on the basis of DimCustomer, and has one column for the customer’s total sales, and another one for the sales until now, which means the sales until the given date in the context of the table visual. The return result is filtered for those records that their total sales are similar to their sales until now.
my sample dataset has most of the customers only purchased in one period. However, further down in the screenshot above, you can see that there are customers who purchased in multiple periods, so the count of this period’s customers is not always exactly similar to the New Customers – Fixed Period.
Returning Customers – Fixed Period
if the sales amount coming from a customer in the last active period is less than the total sales of that customer, then that customer is returning. It means the customer is not purchasing in this period for the first time, the customer purchased in prior periods too.
Returning Customers - Fixed period = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1, YEAR)), 'Running Total Sales', [Running Total Sales]) var _returningcustomers= FILTER(_customers, [Sales Until Now]>0 && [Sales Until Now]<[Running Total Sales]) return if(_salesthisperiod>0, COUNTROWS(_returningcustomers))
You might think, whey we don’t create the _customer variable above as a calculated table instead? the reason is that if you do it as a calculated table, then the calculation is done at the time of refresh, and the date of the table visual won’t impact the result. For this type of calculation, especially if the period is not fixed (further down in this article) it is better to do calculations dynamically.
Lost Customers – Fixed Period
You can calculate Lost customers from the three prior measures: [Running Total Customers] – [New Customers – Fixed Period] – [Returning Customers – Fixed Period]. However, sometimes, you might just need New customers and Lost customers. So better to have an independent calculation for the Lost customers itself.
Lost Customers - Fixed Period = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1, YEAR)), 'Running Total Sales', [Running Total Sales] ) var _lostcustomers= FILTER(_customers, [Running Total Sales]>0 && [Sales Until Now]=0) return if(_salesthisperiod>0, COUNTROWS(_lostcustomers) )
A customer is lost that has no sales transaction in the last active period (one year from the given date) but had sales transactions up until that date (running total sales).
The measure above using in a visual as below:
Making the Period Flexible; What If Parameter
You can also make the period flexible using the What-If parameter for Period. let’s say in this example, period is the number of months;
This will create a calculated table with a Period Value measure as the selected number of months. Now, this can be used in a calculation like below examples.
New Customers – Flexible Period
The only change in the below expression is the use of the [Period Value] measure and changing the interval to month.
New Customers = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1*[Period Value], MONTH)), 'Running Total Sales', [Running Total Sales] ) var _newcustomers= FILTER(_customers, [Running Total Sales]=[Sales Until Now] && [Sales Until Now]>0) return if(_salesthisperiod>0, COUNTROWS(_newcustomers))
Returning Customers – Flexible Period
Returning Customers = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1*[Period Value], MONTH)), 'Running Total Sales', [Running Total Sales]) var _returningcustomers= FILTER(_customers, [Sales Until Now]>0 && [Sales Until Now]<[Running Total Sales]) return if(_salesthisperiod>0, COUNTROWS(_returningcustomers))
Lost Customers – Flexible Period
Lost Customers = var _currdate=MAX(DimDate[FullDateAlternateKey].[Date]) var _salesthisperiod=[Sales] var _customers= ADDCOLUMNS( DimCustomer, 'Sales Until Now', CALCULATE( [Sales], DATESINPERIOD( DimDate[FullDateAlternateKey].[Date], _currdate, -1*[Period Value], MONTH)), 'Running Total Sales', [Running Total Sales] ) var _lostcustomers= FILTER(_customers, [Running Total Sales]>0 && [Sales Until Now]=0) return if(_salesthisperiod>0, COUNTROWS(_lostcustomers) )
Results
As a result, now I have flexible customer retention by each period, which can be used in charts and table visual and any other visual by date:
Download Sample Power BI File
Download the sample Power BI report here:
Dear Raza,
Thank you for a very helpful post, When I am trying to create a running sales and customers using DAX variable getting below date error. what is [Date] referring in the query.
Column reference to ‘FullDateAlternateKey’ in table ‘DimDate’ Cannot be used with a variation ‘Date’ because it does not have any.
Running Sales_Var =
var _currdt=MAX(DimDate[FullDateAlternateKey].[Date])
return
CALCULATE(
[Sales],
FILTER(
ALLSELECTED(DimDate[FullDateAlternateKey].[Date]),
ISONORAFTER(DimDate[FullDateAlternateKey].[Date], _currdt, DESC)
)
)
.[Date] should be only used for scenarios that you are using the Auto time intelligence of Power BI feature.
if you are not using that, then remove “.[date]” in any of those expressions
Cheers
Reza
Hi Reza,
very nice post, thank you! I am using modified patterns of The Italians but I really like the flexible version, as I am showing data using different Time Intelligence, e.g. YTD and MAT and I need to adjust the definition of “lost and found” dynamically 🙂
My question is – could you please give a hint how to modify the last formulae in order to show sales of new (lost, returning) customers?
Thank you
Martin
If you want to get the Sales for new customers, for example, you can return that as the result of Calculate, something like this:
Calculate ([Sales],_newCustomers)
Because the _newCustomers is a subset of customers and filters the FactSales table, then you would get the sale of the new customers only.
Cheers
Reza
Hi Reza I have a big sales data of over 600 thousand rows for last three months only, when i am applying your template when new customer dax is writted it takes a lot of time and shows an error either “Not enough Memory” or “Locking Conflict”,
is there a way out of this?
having large amount of data in the table visual with these heavy DAX expression will cause performance issue. Here I wrote about it.
I recommend filtering your table using slicers
Cheers
Reza
Hi Reza , excllent post . This is what I was looking for; however when I put city in the filter context to see where those new customers bought , it’s reevaluating the measure to show the new customer for that city only. So let’s say for London, that customer is considered “New” , although he already bought in another city . How would you go about this ? and only show a customer as new if he never bought anywhere ? thanks for your help.
Hi Dave
Can you please share a screenshot of your tables and the data in it?
Cheers
Reza
Hi Reza,
How about in DAX counting new customer (with first purchase within the period) who are returning and making purchase within a period (say 60 days) after first purchase?
Illustrative table result:
Period | # New Cust. | # Cust. Returning within 60 days
2021-Jan | 100 | 60
2021-Feb | 120 | 80
2021-Mar | 90 | 50
..
..
Thanks in advance for the help!
Hi Danny
You can achieve that using another measure with the date filter of DatesInPeriod and use 60 days as the parameter for it.
Cheers
Reza
Hello, excelent post, i would like to know how to fix the fact that my calendar table has dates greater than last sales date,
it results than when I move for example, -3 months in the slicer, it happens that my calendar table has last date 31/12 so, new customers are mistaken calculated since -3 is not exactly – 3 for sales!
Ihope i am clear because english is not my native language best regards, thanks
Hi
You can write your measure in a way that considers that last day of sales rather than the last day of the date table. this can be done by something like max(date field from fact table)
Cheers
Reza
Can you explain more where is the “Sales Unitil Now” come from in the New Customer – Fixed Period DAX. I am just newbie and it quite hard for me to keep track 🙁
Sales until now is actually a column added to a virtual table using ADDColumns inside a measure expression
Learn more about addcolumns here.