Next Best product modelling is used to determine what is the best-suited product or most likely product the customer is going to purchase in his next visit based on the past purchase history of similar-looking customer or similar customer segments.

Purpose of the analysis

Identifying the correct set of customers for campaigns of a certain target product is crucial. We can determine this by looking at the transaction history of customers over a certain time period. Seeing what product categories people have purchased from before in consecutive transactions can tell us: 

1) What are the most popular consecutive product purchases and 

2) What is a new customer likely to buy next based on their most recent purchase. 

Based on a certain target product category, the Next Best Product Modelling (NBPM) notebook is used to determine what customers to target depending on their last transaction. 

Also to decide how this targeting is to be done, we have seen for different products what is the best customer segment type to target (based on recency, frequency and monetary segments created). For different customer types, the preferred products would be different which is taken into consideration while making the model.

Sample Output of Next Best product to Plan a customers Lifecycle Journey:

Components of Notebook

There are 5 main parts (excel sheets) that will need to be downloaded. Note that, if below the data it says ‘Showing the first 1000 rows’ then use the button next to the download button and select ‘Download full results (max 1 million rows)’, which will result in the command running again. These 5 sections are:

  1. RF Segment-wise Best Products - For each product category, what RF segment (Active, Lapsed etc.) is the best for targeting. (Cmd3)
  2. RFM Segment-wise Best Products - For each product category, what RFM segment (Top, Mid, Base etc.) is the best for targeting. (Cmd4) (Can be skipped if RFM segmentation not done earlier)
  3. Overall Repeaters - Average of all visits over the time period for Repeaters. (Cmd5)
  4. One-Timers - First and second visit of Repeaters over the time period. (Cmd6)
  5. Last 2 visits (Repeaters) - Second last and last visits of Repeaters over the time period. (Cmd7)

Inputs to be provided

  • Org_id of the brand
  • Start Date and End Date - Duration for which the notebook is to be run. Format: yyyy-mm-dd. (example: 2020-01-16)
  • Active Period and Lapsed Period of brand.
  • Product Database - Name of the database where the Product table is stored (examples: read_api_100, ds_capillary).
  • Product Table - Name of the Product table in the Product database.
  • Product Itemcode - Column name which would be joined with bill_lineitems on item_code.
  • Main Product Category - Choose the main product category.
  • Sub Product Category - Choose the sub-product category from which segments are to be defined. 
  • RFM Database - Name of the database where RFM table is stored (examples: read_api_100, ds_capillary).
  • RFM Table - Name of the RFM table in the RFM database.
  • RFM User_id - Column name which would be joined with bill_summary on dim_event_user_id.
  • RFM Segment - Column name which has the RFM segments (values in the column would be like Top, Mid, Base etc.).
  • RFM Frequency Column - Column name with the Frequency value i.e number of visits.


Notebook links:  India Cluster | APAC Cluster | EMEA Cluster

  1. For RF and RFM segment-wise best products (Cmds 3,4) - Download the data to excel. Create a pivot table with Product (Sub Category) on the rows. and Frequency Tag and Recency Tag on the columns. Select Sum of Quantity as values, and under the value field settings choose show values as % of Column Total. Row-wise for each product, select the segments with % values above average for that product. This can be done by Conditional formatting, inside which we use a new formula. Select the first row % values and apply a new rule as “=B6>AVERAGE($B6:$G6)”, and select a colour inside format. Use format painter to apply this rule to all other rows. How the pivot would look like for RF segments is shown below:
  2.  The pivot table would give product-wise the best Segments to target. Example from the above table: ACTION FIGURES are purchased most by Repeaters. For Repeaters use Overall Repeaters and/or Last 2 visits data, for OneTimers use the One-Timers data. If there is an equal distribution among the Repeaters and OneTimers (like for BARBIE & ACCESSORIES shown above) then use both.
  3. For Overall Repeaters, One Timers, Last 2 visits (Cmds 5,6,7) - Download the data to excel. Columns 1,2,3 contain the previous visit/ first visit/ second last visit data while Columns 4,5,6 contain the next visit/ second visit/ last visit data. Columns 1,4 are the Main Categories. Columns 2,5 are the Sub Categories. Columns 3,6 is the average rate of products sold from the corresponding subcategory. Column 7 (Customers) is the number of customers who have made these consecutive transactions. Column 8 (Latency) is the average number of days between these consecutive transactions.
  4. Column 5 (Sub Category) will be the target product. Filter out this column for the target product category. Column 2 would then give the products bought in previous transactions in descending order of popularity. Those customers who have purchased these Column 2 products are identified for that particular target product category.
  5. Choose only those products (customer purchases) from Column 2 where the customer count (Column 7) is not too low. Ideally, select at least 55-60% customers.
  6. More filters can be used. Filtering Column 1 for a particular Main product category (can be the same or different as that of the target product). Or filter the Latency column, where for example if we want the number of days between the consecutive purchases to be less than 60 days then what customers to target.
  7. The final product categories in Column 2 would thus correspond to the most recent i.e final transaction of customers, and these customers are to be selected for targeting.

Screenshots of sample of final results:

RF Product Preference final table (Similarly for RFM product preference)

Overall reporters table

Overall Repeaters table is shown with a filter on target product ACTION FIGURES