SQL Interview Question: Retrieving Top Customers by Total Spend

Learn to solve the query to retrieve top customers by total spend - a must-know for data roles at MAANG companies like Amazon and Microsoft. Detailed solution included.

Mentor

Blog

For any business, knowing who are the biggest spending customers is super important.

It helps you figure out who to focus your marketing efforts on, what products or services they love, and how to keep them coming back for more.

That's why questions around analysing customer spend data are so common in interviews, especially at big tech companies like MAANG (Meta, Amazon, Apple, Netflix, and Google).

One question you're likely to come across is something along the lines of: "Retrieve the top N customers based on how much they've spent over a certain period, and include their total spend amount."

It's a classic query that tests your SQL skills and your ability to turn a business requirement into an efficient database query.

In this article, we will tackle this SQL interview question for data science roles. Let’s get started!

Question:

You have a database with two tables: Customers and Orders.

The Customers table contains customer information such as CustomerID, CustomerName, and CustomerEmail.

The Orders table contains order details like OrderID, CustomerID, OrderDate, and TotalAmount.

Write a SQL query to retrieve the top 3 customers who have spent the most money on orders in the last year, along with their total spend during that period.

Solution:

SELECT
    c.CustomerName,
    SUM(o.TotalAmount) AS TotalSpent
FROM
    Customers c
INNER JOIN
    Orders o ON c.CustomerID = o.CustomerID
WHERE
    o.OrderDate >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
GROUP BY
    c.CustomerID
ORDER BY
    TotalSpent DESC
LIMIT 3;

Here's a breakdown of the query:

  1. SELECT c.CustomerName, SUM(o.TotalAmount) AS TotalSpent: This part retrieves the CustomerName from the Customers table and calculates the total amount spent by each customer using the SUM aggregate function on the TotalAmount column from the Orders table. The total spend is aliased as TotalSpent.
    1. FROM Customers c INNER JOIN Orders o ON c.CustomerID = o.CustomerID: This line joins the Customers and Orders tables based on the CustomerID column, allowing us to connect customer information with their order details.
      1. WHERE o.OrderDate >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR): This WHERE clause filters the Orders table to include only orders placed within the last year. The DATE_SUB function subtracts one year from the current date (CURDATE( ))to get the cutoff date for the last year.
        1. GROUP BY c.CustomerID: This line groups the results by the CustomerID, so that the SUM function can calculate the total spend for each individual customer.
          1. ORDER BY TotalSpent DESC: The results are ordered in descending order based on the TotalSpent column, ensuring that the customers with the highest total spend appear first.
            1. LIMIT 3: This limits the output to the top 3 rows, giving us the top 3 customers who have spent the most money in the last year, along with their respective total spends.

              The query first joins the Customers and Orders tables to connect customer information with their order details.

              It then filters the Orders table to include only orders placed within the last year, groups the results by CustomerID, calculates the total spend for each customer using SUM(TotalAmount), orders the results by the total spend in descending order, and finally limits the output to the top 3 rows.

              This query provides the top 3 customers based on their total spend in the last year, along with their respective CustomerName and TotalSpent values.


              If you'd like to learn more about tackling such SQL interview questions or want to practice with additional examples, feel free to connect with me.

              I'd be happy to share my insights and help you prepare for success in your upcoming interviews.