Debug Long SQL Queries in Interviews

Quick summary

Summarize this blog with AI

Some SQL interviews do not start with a blank editor. The interviewer gives you a long query with several CTEs, joins, filters, business rules, and maybe one suspicious metric. Then they ask: "What does this do?" or "How would you change it?"

This format can feel harder than writing SQL from scratch because you are holding someone else's assumptions in your head. You have to understand the output, trace the data flow, spot possible bugs, and make a safe change without breaking the business logic.

The wrong move is to start editing immediately. The right move is to read the query like a system. This guide gives you a repeatable workflow for long SQL query review in analyst, BI, analytics engineering, and data engineering interviews.

The Rule: Do Not Edit First

When you see a long query, resist the urge to fix the first thing that looks odd. A filter that looks wrong may be intentional. A CTE that looks redundant may protect grain. A DISTINCT may be hiding a join issue, but removing it without understanding the downstream metric can make the result worse.

Start with a short verbal reset:

"Before changing anything, I want to understand the final grain, then trace each CTE and join so I do not accidentally change the metric."

That sentence signals maturity. You are treating SQL as production logic, not a puzzle.

The Six-Pass Method

Use six passes. Each pass answers one question.

Output pass: What does one final row represent?
Input pass: What source tables are involved?
CTE pass: What does each block produce?
Join pass: Which joins can multiply or drop rows?
Filter pass: Are filters applied at the right stage?
Validation pass: What checks would prove the change is safe?

This method keeps you from reading line by line forever. You are building a map of the query.

Pass One: Final Output Grain

The first question is always: what does one final row represent?

Possible grains include:

One row per customer.
One row per customer per month.
One row per order.
One row per product per day.
One row per experiment variant.
One row per dashboard KPI.

If the final query is:

SELECT
  customer_id,
  COUNT(*) AS orders,
  SUM(revenue) AS revenue
FROM enriched_orders
GROUP BY customer_id;

The output appears to be customer-level. But whether COUNT(*) really means orders depends on the grain of enriched_orders. If enriched_orders is actually one row per order item, the count is wrong.

Say the uncertainty out loud: "This appears to produce one row per customer. I need to confirm whether the upstream CTE is still order-level before trusting COUNT(*)."

Pass Two: Source Tables

Before tracing CTEs, identify the source tables and what role each plays.

Fact tables: Events, orders, sessions, payments, tickets.
Dimension tables: Customers, products, accounts, campaigns.
Mapping tables: User-to-account, product-to-category, campaign-to-channel.
History tables: Status changes, plan changes, subscription periods.

This matters because fact tables often have many rows per entity, dimensions are often one row per entity, mapping tables may be many-to-many, and history tables often need effective-date logic.

If the query joins orders to order items, the grain probably changes. If it joins customers to a subscription history table, the result may duplicate customers unless the query chooses a single active subscription row.

Pass Three: Label Each CTE

Long SQL becomes manageable when each CTE gets a plain-English label. You do not need to memorize every column. You need to understand each block's purpose and grain.

WITH first_orders AS (
  SELECT
    customer_id,
    MIN(order_date) AS first_order_date
  FROM orders
  GROUP BY customer_id
),
orders_with_customer_type AS (
  SELECT
    o.order_id,
    o.customer_id,
    o.order_date,
    o.revenue,
    CASE
      WHEN o.order_date = f.first_order_date THEN 'new'
      ELSE 'returning'
    END AS customer_type
  FROM orders o
  JOIN first_orders f
    ON f.customer_id = o.customer_id
)
SELECT
  DATE_TRUNC('month', order_date) AS order_month,
  customer_type,
  COUNT(*) AS orders,
  SUM(revenue) AS revenue
FROM orders_with_customer_type
GROUP BY 1, 2;

A strong interview summary is:

first_orders: one row per customer, finds each customer's first order date.
orders_with_customer_type: one row per order, labels orders as new or returning.
Final query: one row per order month per customer type.

That is much better than reading syntax aloud. It shows you understand the transformation.

Pass Four: Join Risk

Most long-query bugs are join bugs. For each join, ask whether it preserves, expands, or filters the current grain.

Use this checklist:

Is the right-side key unique?
Is the relationship one-to-one, one-to-many, or many-to-many?
Should unmatched rows remain?
Could a filter on the right table undo a LEFT JOIN?
Does the final aggregation count rows or business entities?

This common bug removes zero-activity customers:

SELECT
  c.customer_id,
  COUNT(o.order_id) AS shipped_orders
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.customer_id
WHERE o.status = 'shipped'
GROUP BY c.customer_id;

The WHERE clause filters out customers with no matching shipped orders. If the requirement is to keep customers with zero shipped orders, move the filter into the join:

SELECT
  c.customer_id,
  COUNT(o.order_id) AS shipped_orders
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.customer_id
 AND o.status = 'shipped'
GROUP BY c.customer_id;

Explain the reason: "The status condition belongs in the join because the left table defines the population."

Pass Five: Filter Timing

A filter can be correct in one place and wrong in another. Ask what the filter is supposed to do:

Define the input population.
Limit the metric window.
Choose current records.
Remove invalid data.
Filter groups after aggregation.

Suppose the task is to find customers whose first-ever order was in May. This query is wrong:

WITH may_orders AS (
  SELECT *
  FROM orders
  WHERE order_date >= DATE '2026-05-01'
    AND order_date <  DATE '2026-06-01'
),
first_orders AS (
  SELECT
    customer_id,
    MIN(order_date) AS first_order_date
  FROM may_orders
  GROUP BY customer_id
)
SELECT *
FROM first_orders;

It finds each customer's first May order, not their first-ever order. The filter was applied too early. Calculate first-ever order first, then filter:

WITH first_orders AS (
  SELECT
    customer_id,
    MIN(order_date) AS first_order_date
  FROM orders
  GROUP BY customer_id
)
SELECT *
FROM first_orders
WHERE first_order_date >= DATE '2026-05-01'
  AND first_order_date <  DATE '2026-06-01';

In an interview, call this out as filter timing. It is a clearer explanation than "move the WHERE clause."

Pass Six: Validation Checks

Before changing the query, propose checks. Validation is not optional polish. It is how you prove the query still answers the same question.

Useful checks include:

Row count after each major CTE.
Distinct count of the expected grain key.
Duplicate count for supposedly unique keys.
Null count for join keys and required fields.
Minimum and maximum dates after filters.
Comparison against a known customer, product, or date.
Reconciliation to a simpler source aggregate.

SELECT
  COUNT(*) AS rows,
  COUNT(DISTINCT order_id) AS distinct_orders,
  COUNT(*) - COUNT(DISTINCT order_id) AS duplicate_order_rows,
  MIN(order_date) AS min_order_date,
  MAX(order_date) AS max_order_date
FROM orders_with_customer_type;

A candidate who proposes this check is showing production judgment: the answer is not done until it is plausible.

How to Modify a Query Safely

When asked to change a long query, follow this order:

Restate the new requirement.
Identify which CTE owns the relevant grain.
Make the smallest change in that CTE.
Update downstream aliases or grouping only if needed.
Run validation checks against the old and new output.

Example prompt: "This query reports revenue by order month. Change it to revenue by customer signup month."

A safe response:

"I would first confirm that signup month should come from the customer table, while revenue still comes from orders. Then I would join customers before the final aggregation, group by DATE_TRUNC('month', customers.signup_date), and keep the revenue logic unchanged. I would validate that total revenue still reconciles to the old query unless the population definition changed."

That answer separates dimension change from metric change. It avoids rewriting everything just because the prompt changed one grouping field.

Full Practice Example

Review this query:

WITH recent_orders AS (
  SELECT *
  FROM orders
  WHERE order_date >= DATE '2026-05-01'
),
joined AS (
  SELECT
    c.customer_id,
    c.signup_date,
    o.order_id,
    o.order_date,
    oi.product_id,
    oi.quantity,
    oi.price
  FROM customers c
  LEFT JOIN recent_orders o
    ON o.customer_id = c.customer_id
  LEFT JOIN order_items oi
    ON oi.order_id = o.order_id
)
SELECT
  DATE_TRUNC('month', signup_date) AS signup_month,
  COUNT(order_id) AS orders,
  SUM(quantity * price) AS revenue
FROM joined
WHERE order_date < DATE '2026-06-01'
GROUP BY 1;

A structured read:

Final grain appears to be one row per signup month.
The query starts from customers, so it may intend to preserve customers with no orders.
The join to order_items changes grain from order-level to item-level.
COUNT(order_id) counts item rows, not distinct orders.
The WHERE order_date < DATE '2026-06-01' filter removes customers with no order.
The May start filter is in recent_orders, but the June end filter is later, making the time-window logic split across stages.

A safer version aggregates item revenue to order grain first and keeps date filters together:

WITH order_revenue AS (
  SELECT
    o.order_id,
    o.customer_id,
    o.order_date,
    SUM(oi.quantity * oi.price) AS revenue
  FROM orders o
  JOIN order_items oi
    ON oi.order_id = o.order_id
  WHERE o.order_date >= DATE '2026-05-01'
    AND o.order_date <  DATE '2026-06-01'
  GROUP BY o.order_id, o.customer_id, o.order_date
)
SELECT
  DATE_TRUNC('month', c.signup_date) AS signup_month,
  COUNT(DISTINCT r.order_id) AS orders,
  COALESCE(SUM(r.revenue), 0) AS revenue
FROM customers c
LEFT JOIN order_revenue r
  ON r.customer_id = c.customer_id
GROUP BY 1;

This version protects order grain before aggregating by signup month. Whether it is the final answer depends on the business requirement, but the reasoning is sound.

Phrases That Work Under Pressure

Use clear phrases when you are unsure:

"I want to confirm the output grain before editing."
"This join may multiply rows because the right side may not be unique."
"This filter changes the population, so I want to verify whether that is intended."
"I would validate with row counts before and after this CTE."
"The query may be correct syntactically but wrong for the metric definition."
"I would test one known customer manually before trusting the aggregate."

These are not filler. They show the interviewer how you think.

Common Interview Prompts

What does this query do?

Answer with final output first, then CTE summaries. Do not read line by line.

Why are the numbers too high?

Look for join multiplication, item-level joins, many-to-many mapping tables, and missing DISTINCT where the business entity requires it.

Why did zero rows disappear?

Look for filters on the right side of a LEFT JOIN in the WHERE clause.

How would you add a new metric?

Add it at the grain where it belongs. Do not calculate item-level, order-level, and customer-level metrics in the same layer without clear aggregation boundaries.

How would you make this easier to maintain?

Suggest clearer CTE names, centralized date filters, explicit business definitions, fewer repeated expressions, and validation queries. Formatting helps only after the logic is understandable.

Final Checklist

Use this checklist in any long SQL review:

Name the final output grain.
Identify source table roles.
Label each CTE by purpose and grain.
Find joins that multiply rows.
Find filters that remove rows too early.
Check whether counts match the business entity.
Check timestamp boundaries.
Propose validation before editing.
Make the smallest safe change.

A long SQL interview is a maintenance test. The interviewer wants to see whether you can enter existing logic, understand it, debug it, and change it without damaging the metric. A structured read is how you do that reliably.