This is the second of my RFM analysis how-to. In this step we'll build the actual model for your data in Google Sheets (for additional screenshots feel free to check out my blog).
Creating a RFM analysis example step by step
First step is to prepare the data and to calculate the following metrics for each customer:
- The most recent transaction
- The number of transactions per month for each customer
- The average amount purchased each month by each customer
In the following we will work in the sheet Tutorial of the provided workbook.
We’ll start by finding out the number of transactions each individual customer had. This is easily done by copying the formula =COUNTIF(‘Sample Data’!B:B,A2)from B2 down to B3403. It will count how often the value from the referenced cell occurs in column B from the sheet Sample Data.
Next step is to identify the most recent transaction for each customer. You can do so by copying the following formula from C2 down to C3403:
=MAXIFS(‘Sample Data’!C$2:C,‘Sample Data’!B$2:B,$A2)
The formula uses cell A2 as a reference to filter the corresponding rows in the sheet Sample Data with column B and returns the highest (=most recent) date from column C respectively.
Similar to this we use the formula =MINIFS(‘Sample Data’!C$2:C,‘Sample Data’!B$2:B,$A2) in in cells D2 to C3403 to get the date of the first transactions of each customer.
For the final RFM model you’ll need the amount of time the customer has been with the business. In our example we’ll use months for this. As such put =DATEDIF(D2,now(),“M”)into E2 and drag it down E3403.
Next we want to know how much each customer spends on average each month. Plug in =SUMIF(‘Sample Data’!B:B,A3,‘Sample Data’!D:D)/E3 from F2 to F3403.
The last step is to calculate the average number of transactions per month for each customer. So simply write =B2/E2 into G2 and copy it down.
For all above instead of “M” you could also use “Y” or “D” to set the time unit to years or days respectively. It doesn’t really matter what you choose as we we’ll be coding each data point into a 1-5 scale for the RFM analysis later on anyway. However to make your data more vividly choose a unit, which make sense to your business model, e.g. if you are selling cars it would make sense to choose years, while it might make more sense to choose days when you are selling coffee.
Calculating R,F and M
Two steps are necessary to calculate the R, F and M scores:
- Determine how each customer ranks for recency, frequency and monetary
- Assign a score to each recency, frequency and monetary rank
Luckily Google Sheets has a handy formula for returning the rank of a specified value in a dataset.
As such plug in and copy down the following formulas.
For the recency rank: =RANK(C2,C$2:C$4296,1) in H2 to H3403
For the frequency rank: =RANK(G2,G$2:G$4296,1) in I2 to I3403
For the monetary rank: =RANK(F2,F$2:F$4296,1) in J2 to J3403
The last argument (1) in the formulas ensures that the highest values in the respective dataset gets a higher rank and vice versa. E.g. a customer with an average order value of 10$ would get a higher rank than a customer with an order value of 5$.
Next we’ll create a RFM rank matrix to convert a customer’s ranks on recency, frequency and monetary into the wanted 1-5 rating. For this you will use the formula PERCENTILE to get the minimum rank a customer has to have on each for the three factors to get a certain rating. Since we have ratingd from 1-5 we’ll need five percentiles which equal to 20% percent steps.
Put =PERCENTILE(H$2:H$4296,0.8) into Q2 to get the lowest possible rank for the highest R score, similar put =PERCENTILE(H$2:H$4296,0.6) into Q3 to get the lowest possible rank for a R score of 4 and so on. Do the same for the F score in column R and and for the M score in column S.
No we’ll do the actual conversion for each customer. Put the formula =if(H2>=Q$2,$T$2,if(H2>=Q$3,$T$3,if(H2>=Q$4,$T$4,if(H2>=Q$5,$T$5,$T$6)))) in K2 and copy it down to K3403 to determine the R score. The conditional statements will determine if the respektive rank is above one of the thresholds of the RFM rank matrix and then assign an according score. Do the same for the F score by copying =if(I2>=R$2,$T$2,if(I2>=R$3,$T$3,if(I2>=R$4,$T$4,if(I2>=R$5,$T$5,$T$6)))) from L2 to L3404 and for the M score by copying =if(J2>=S$2,$T$2,if(J2>=S$3,$T$3,if(J2>=S$4,$T$4,if(J2>=S$5,$T$5,$T$6)))) from M2 to M3404.
As a last step we want count how often each of the RFM combination (there are 5x5x5 = 125 combinations) occurred. Again we will use the COUNTIF formula for this. Write =COUNTIF(N:N,Q10)into R10 and copy it down to R134.
You are done! ….at least technically. You gave each of your customers a RFM score based on the recency, frequency and monetary value of past purchases. However in order to make your analysis actionable you should group your 125 segments into more meaningful clusters and define strategies for each of them. You'll do that in the next part.:)