Customer Segmentation is a method of grouping customers based on business KPIs such as recency, frequency, and latency. These segments are used to target specific groups of customers with more relevant campaign offers.

  • Customer Snapshot: Gives an overall view of all existing customers of a brand, grouping them into buckets such as One-timers and Repeaters (based on visits). It also provides a snapshot of the captured customer demographics.
  • Recency & Frequency Buckets: Customers are divided into segments based on the recency and frequency of their visits. These segments are presented as bar graphs.
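
The One-timer and Repeater split in the Customer Snapshot can be pictured with a minimal Python sketch. It is illustrative only and is not the notebook's code: the sample `visits` data, the `visit_type` helper, and the rule that exactly one visit means One-timer are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical sample data: one (customer_id, visit_date) pair per visit.
visits = [
    ("C1", "2023-01-05"),
    ("C1", "2023-02-10"),
    ("C2", "2023-01-20"),
    ("C3", "2023-01-02"),
    ("C3", "2023-01-15"),
    ("C3", "2023-03-01"),
]


def visit_type(visit_count):
    """Label a customer by visit count (assumed rule: 1 visit = One-timer)."""
    return "One-timer" if visit_count == 1 else "Repeater"


# Count visits per customer, then label each customer.
visits_per_customer = Counter(customer_id for customer_id, _ in visits)
snapshot = {cid: visit_type(n) for cid, n in visits_per_customer.items()}

# Number of customers in each bucket.
bucket_sizes = Counter(snapshot.values())

print(snapshot)
print(dict(bucket_sizes))
```

In the actual notebook the visit counts would come from the single view data for the selected Org_id and date range rather than from a hard-coded list.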

Steps to run Notebook

The Databricks notebook is organized into commands (cells) that contain the code. Cmd {number} identifies the command you need to refer to in each step.


  1. Open the cluster-specific link provided for the Notebook: SEA Cluster, India Cluster, or EMEA Cluster.

  2. Clone the Notebook into your workspace.
  3. Cmd 1: Import all the required Python libraries.
  4. Cmd 2: Read instructions on how to use the notebook.
  5. Enter the following data in the text boxes: Org_id, Start date, End date, and Active Period (run the Lapsation Notebook to get this number).
  6. Cmd 6: Run the single view command.
  7. Cmd 7-12: Customer Snapshot  
    1. Cmd 7: Create recency buckets (Active, Lapsed, and Lost) based on the Lapsation period defined earlier. Provide the periods for the recency buckets.
    2. Cmd 8-9: Data manipulation (no change required).
    3. Cmd 10-11: Visualization of the distribution of customers across recency and frequency buckets (make changes in Cmd 12 as per absolute values).
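
The recency bucketing in Cmd 7 can be sketched as follows. This is a rough illustration under stated assumptions, not the notebook's implementation: the `recency_bucket` function, the `lost_after_days` parameter, and the threshold values (90 and 180 days) are hypothetical. In practice the Active Period comes from the Lapsation Notebook and the bucket periods are the ones you provide in Cmd 7.

```python
from datetime import date


def recency_bucket(last_visit, end_date, active_period_days, lost_after_days):
    """Assign a recency bucket from the number of days since the last visit.

    Assumed rule (not taken from the notebook):
      Active: last visit within active_period_days of end_date
      Lapsed: beyond the active period but within lost_after_days
      Lost:   more than lost_after_days since the last visit
    """
    if lost_after_days <= active_period_days:
        raise ValueError("lost_after_days must exceed active_period_days")
    days_since = (end_date - last_visit).days
    if days_since < 0:
        raise ValueError("last_visit is after end_date")
    if days_since <= active_period_days:
        return "Active"
    if days_since <= lost_after_days:
        return "Lapsed"
    return "Lost"


# Hypothetical inputs, mirroring the End date and Active Period text boxes.
end_date = date(2023, 12, 31)
active_period_days = 90
lost_after_days = 180

# Hypothetical last visit date per customer.
last_visits = {
    "C1": date(2023, 12, 1),
    "C2": date(2023, 8, 15),
    "C3": date(2023, 1, 10),
}

recency = {
    cid: recency_bucket(d, end_date, active_period_days, lost_after_days)
    for cid, d in last_visits.items()
}
print(recency)
```

Keeping the thresholds as parameters makes it easy to rerun the bucketing when the Lapsation period changes.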

Understanding the Sample Output

The graph displays the distribution of customers based on their frequency of visits. You can also see the distribution of customers based on their recency.
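
To make the idea of a distribution concrete, the sketch below counts customers per bucket and computes each bucket's share, which is roughly what each bar in such a graph represents. The `customers` sample and the `distribution` helper are hypothetical and exist only for this example. The notebook's own visualization code may differ.

```python
from collections import Counter

# Hypothetical sample: (customer_id, recency_bucket, frequency_bucket).
customers = [
    ("C1", "Active", "Repeater"),
    ("C2", "Active", "One-timer"),
    ("C3", "Lapsed", "One-timer"),
    ("C4", "Lost", "One-timer"),
    ("C5", "Active", "Repeater"),
]


def distribution(labels):
    """Return {label: (count, percent of total)} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


recency_dist = distribution(r for _, r, _ in customers)
frequency_dist = distribution(f for _, _, f in customers)

# Text rendering of the bars: one '#' per customer in the bucket.
for title, dist in (("Recency", recency_dist), ("Frequency", frequency_dist)):
    print(title)
    for label, (n, pct) in dist.items():
        print(f"  {label:<10} {'#' * n} {n} ({pct}%)")
```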

The following image gives a simple interpretation of the graph.