SQL NULL Logic for Interviews: NOT IN, NOT EXISTS

Quick summary

Summarize this blog with AI

Introduction

SQL interview questions about missing data are dangerous because the wrong query usually still runs. You do not get a syntax error. You get a believable answer that is quietly wrong.

That is why people keep getting burned by NULL, NOT IN, LEFT JOIN, and aggregate behavior. An interviewer asks for users with no purchase, customers without a completed order, or rows with missing values. The candidate writes something that looks reasonable, but one NULL in the wrong place changes the result set.

The good news is that these are not random traps. There are a few repeatable rules. If you know how SQL evaluates NULL, how anti-joins work, and where join filters belong, you can solve these prompts cleanly and explain your reasoning without sounding memorized.

Why NULL Logic Breaks Good Candidates

Most languages train you to think in two states: TRUE and FALSE. SQL adds a third state: UNKNOWN.

That one difference explains most NULL confusion.

NULL means unknown or missing, not zero and not an empty string.
Comparisons against NULL usually evaluate to UNKNOWN.
The WHERE clause keeps only rows where the predicate is TRUE.

The last line matters most. WHERE does not keep FALSE, and it does not keep UNKNOWN either. So once a predicate becomes UNKNOWN, the row disappears.

This is why col = NULL fails, why NOT IN can collapse when a subquery contains NULL, and why a LEFT JOIN can stop behaving like a left join if you filter the right table in WHERE.

Rule 1: Use IS NULL and IS NOT NULL Explicitly

If a column can contain NULL, do not compare it with equality operators.

-- Wrong
SELECT *
FROM orders
WHERE shipped_at = NULL;

That predicate never becomes TRUE. The correct versions are explicit:

SELECT *
FROM orders
WHERE shipped_at IS NULL;

SELECT *
FROM orders
WHERE shipped_at IS NOT NULL;

This sounds basic, but strong candidates still miss it when the prompt layers NULL logic inside a larger join or subquery.

COUNT(*) and COUNT(column) Are Not the Same Metric

Another common interview mistake is assuming all counts behave the same.

COUNT(*) counts rows.
COUNT(column) counts non-null values in that column.

SELECT
    COUNT(*) AS total_rows,
    COUNT(shipped_at) AS rows_with_ship_time
FROM orders;

If shipped_at is null for unshipped orders, those rows still contribute to COUNT(*) but not to COUNT(shipped_at).

That matters anytime the question uses language like row count, completed-user count, or rows with a valid timestamp. If the business question is row count, use COUNT(*). If the business question is populated-value count, COUNT(column) may be right. Say the distinction out loud in the interview.

Why NOT IN Breaks When NULL Appears

Suppose the prompt is: find customers who have never placed an order.

SELECT c.customer_id
FROM customers c
WHERE c.customer_id NOT IN (
    SELECT o.customer_id
    FROM orders o
);

That works only if the subquery returns non-null customer IDs. If the subquery includes even one NULL, the logic becomes unsafe.

Conceptually, SQL has to evaluate something like this:

c.customer_id <> 101
AND c.customer_id <> 205
AND c.customer_id <> NULL

The last comparison is UNKNOWN, so the whole predicate can no longer cleanly evaluate to TRUE.

If you insist on NOT IN, remove nulls inside the subquery:

SELECT c.customer_id
FROM customers c
WHERE c.customer_id NOT IN (
    SELECT o.customer_id
    FROM orders o
    WHERE o.customer_id IS NOT NULL
);

That version can be valid. But in interview settings, there is usually a more durable pattern.

Why NOT EXISTS Is Usually the Safer Anti-Join

If the real question is "give me rows from table A with no matching row in table B," think anti-join first.

NOT EXISTS is usually the safest and clearest expression of that logic.

SELECT c.customer_id
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

This reads like the business question: start from customers, look for a matching order, and keep the customer only when no match exists.

It avoids the NULL list-membership problem because you are testing for the existence of a matching row, not comparing against a set that might contain NULL.

This is a strong default for prompts like users with no purchase, customers without a completed order, employees with no manager record, and accounts with no successful login in the last 30 days.

How LEFT JOIN Filters Accidentally Become Inner Joins

Another classic failure mode looks like this:

SELECT c.customer_id, o.order_id
FROM customers c
LEFT JOIN orders o
    ON c.customer_id = o.customer_id
WHERE o.status = 'completed';

Candidates write this because they want all customers plus completed orders. But WHERE o.status = 'completed' removes rows where o.status is NULL, which are exactly the unmatched rows introduced by the left join.

So the query no longer preserves all customers. It behaves like an inner join over completed orders.

If your intention is "keep every customer, attach completed orders when they exist," keep the right-side filter in the ON clause:

SELECT c.customer_id, o.order_id
FROM customers c
LEFT JOIN orders o
    ON c.customer_id = o.customer_id
   AND o.status = 'completed';

Now unmatched customers remain, with null order columns.

This distinction matters because these prompts are often asking one of two different questions:

Return all customers, plus a matching completed order if one exists.
Return only customers who have at least one completed order.

Those are different business questions, so they need different SQL.

Dialect Notes for Null-Safe Comparisons

If the conversation shifts from interview logic to dialect specifics, these are useful details to know:

PostgreSQL supports IS DISTINCT FROM and IS NOT DISTINCT FROM for null-safe comparison.
MySQL has the null-safe equality operator <=>.
Many warehouses support IS DISTINCT FROM, but function names and edge behavior still vary.

Example in PostgreSQL:

SELECT *
FROM prices
WHERE old_price IS DISTINCT FROM new_price;

That can be cleaner than manual NULL handling when the question is whether a value changed. But if the interviewer is testing core SQL reasoning, IS NULL, EXISTS, and careful join logic matter more than dialect trivia.

Worked Example: Customers Without a Completed Order

Suppose the prompt is: "Find customers who do not have any completed order."

The fragile version is:

SELECT c.customer_id
FROM customers c
WHERE c.customer_id NOT IN (
    SELECT o.customer_id
    FROM orders o
    WHERE o.status = 'completed'
);

That can fail if orders.customer_id is nullable.

A safer answer is:

SELECT c.customer_id
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
      AND o.status = 'completed'
);

You can also express the anti-join with LEFT JOIN and IS NULL:

SELECT c.customer_id
FROM customers c
LEFT JOIN orders o
    ON c.customer_id = o.customer_id
   AND o.status = 'completed'
WHERE o.order_id IS NULL;

That is valid too, but it is easier to break if you move the status filter into WHERE. That is why NOT EXISTS is often the safer interview answer.

What to Say While You Solve It

A strong explanation is short and precise:

"This is an anti-join problem. I want rows from customers where no completed order exists, so I am using NOT EXISTS instead of NOT IN because nulls in the subquery can break NOT IN. If I used a left join version, I would keep the order-status filter in the ON clause so unmatched customers stay in the result."

That explanation sounds senior because it ties the SQL pattern to the business meaning.

Validation Checklist Before You Finalize the Query

Can the subquery return NULL values?
Am I counting rows or non-null values?
Did a right-table filter accidentally turn a LEFT JOIN into an inner join?
Am I using anti-join logic because the prompt is really "no matching row exists"?
If I replaced nulls with COALESCE, did I preserve the real business meaning?

Those checks catch most silent bugs in this topic area.

FAQ

Is NOT IN always wrong?

No. NOT IN is fine when the subquery cannot return NULL. The problem is that people often do not verify that assumption.

When is LEFT JOIN ... IS NULL a good answer?

It is good when you want an anti-join and you keep the right-table match conditions inside the ON clause. It becomes fragile when those filters drift into WHERE.

Should I use COALESCE everywhere to avoid null problems?

No. Use COALESCE only when the fallback value matches the business meaning. Replacing unknown with zero too early often hides the real logic problem instead of solving it.

SQL NULL Logic for Interviews: NOT IN, NOT EXISTS, and LEFT JOIN Filters Without Silent Bugs

Summarize this blog with AI

Introduction

Why NULL Logic Breaks Good Candidates

Rule 1: Use IS NULL and IS NOT NULL Explicitly

COUNT(*) and COUNT(column) Are Not the Same Metric

Why NOT IN Breaks When NULL Appears

Why NOT EXISTS Is Usually the Safer Anti-Join

How LEFT JOIN Filters Accidentally Become Inner Joins

Dialect Notes for Null-Safe Comparisons

Worked Example: Customers Without a Completed Order

What to Say While You Solve It

Validation Checklist Before You Finalize the Query

FAQ

Is NOT IN always wrong?

When is LEFT JOIN ... IS NULL a good answer?

Should I use COALESCE everywhere to avoid null problems?

Begin Your SQL, Python, and R Journey

AI in SQL Interviews: How to Prepare Without Crossing the Line

SQL CASE WHEN for Interviews: Conditional Aggregation, NULLs, and the COUNT Trap

SQL Funnel Queries for Event Data: Daily Conversion Rates Without Inflated Counts

SQL Retention Queries: Cohorts, Return Windows, and the Edge Cases That Break Them

How to Solve Gaps and Islands SQL Interview Questions with LAG and ROW_NUMBER

SQL NULL Logic for Interviews: NOT IN, NOT EXISTS, and LEFT JOIN Filters Without Silent Bugs

Summarize this blog with AI

Introduction

Why NULL Logic Breaks Good Candidates

Rule 1: Use IS NULL and IS NOT NULL Explicitly

COUNT(*) and COUNT(column) Are Not the Same Metric

Why NOT IN Breaks When NULL Appears

Why NOT EXISTS Is Usually the Safer Anti-Join

How LEFT JOIN Filters Accidentally Become Inner Joins

Dialect Notes for Null-Safe Comparisons

Worked Example: Customers Without a Completed Order

What to Say While You Solve It

Validation Checklist Before You Finalize the Query

Related Reading

FAQ

Is NOT IN always wrong?

When is LEFT JOIN ... IS NULL a good answer?

Should I use COALESCE everywhere to avoid null problems?

Begin Your SQL, Python, and R Journey

AI in SQL Interviews: How to Prepare Without Crossing the Line

SQL CASE WHEN for Interviews: Conditional Aggregation, NULLs, and the COUNT Trap

SQL Funnel Queries for Event Data: Daily Conversion Rates Without Inflated Counts

SQL Retention Queries: Cohorts, Return Windows, and the Edge Cases That Break Them

How to Solve Gaps and Islands SQL Interview Questions with LAG and ROW_NUMBER