Solving Walmart's Data Science SQL Question: Finding Managers with Large Teams

Solving Walmart's Data Science SQL Question: Finding Managers with Large Teams

Dive into a common Walmart Data Science interview question. Learn how to use SQL to identify managers with large teams and boost your interview skills.

Hey there! As a senior data engineer at Walmart, I've seen my fair share of interesting interview questions.

Today’s question touches on a few important concepts in SQL and organisational structure.

We're dealing with self-joins, aggregations, and conditional logic. So, let's dive into this question and break down how to solve it.

Even if you're not interviewing at Walmart, this is a great problem to practice your SQL skills. Ready? Here's the question:

Problem Statement

Find managers with at least 7 direct reporting employees. In situations where user is reporting to himself/herself, count that also. Output first names of managers.

Solution

To solve this problem, we need to identify managers who have at least 7 direct reports, including cases where an employee reports to themselves.

Here's a step-by-step approach to achieve this:

Step-by-Step Solution:

πŸ“Œ Identify Direct Reports: We need to count the number of direct reports for each manager. This can be done by grouping the employees by their manager_id and counting the number of employees in each group.

πŸ“Œ Filter Managers with at Least 7 Direct Reports: After counting the direct reports, we filter out the managers who have fewer than 7 direct reports.

πŸ“Œ Select Manager Details: Finally, we select the first names of the managers who meet the criteria.

SQL Query:

Assuming we have a table named Employees with the columns as described, here's how you can write the SQL query:

SELECT
    e.first_name
FROM
    Employees e
JOIN (
    SELECT
        manager_id,
        COUNT(*) AS direct_reports
    FROM
        Employees
    GROUP BY
        manager_id
    HAVING
        COUNT(*) >= 7
) m ON e.id = m.manager_id;

Explanation:

πŸ“Œ Subquery to Count Direct Reports:

The subquery SELECT manager_id, COUNT(*) AS direct_reports FROM Employees GROUP BY manager_id HAVING COUNT(*) >= 7 groups the employees by their manager_id and counts the number of employees in each group. The HAVING COUNT(*) >= 7 clause filters out groups with fewer than 7 employees.

πŸ“Œ Join to Get Manager Details:

We join the Employees table with the subquery result on the id of the employee and the manager_id from the subquery. This ensures that we only get the managers who have at least 7 direct reports.

πŸ“Œ Select Required Columns:

We select the first_name of the managers from the joined result.

This query ensures that we get the first names of managers who have at least 7 direct reports, including cases where an employee reports to themselves.


**************

And there you have it!

I hope you found this walkthrough helpful and learned something new along the way. 

Got questions about this problem? Want to practice more Walmart-style interview questions? Connect with me on a free trial session!