SQL CTE vs Subquery vs Temp Table: How to Choose the Right Pattern

SQL Updated Apr 9, 2026 7 mins read Leon Leon
SQL CTE vs Subquery vs Temp Table: How to Choose the Right Pattern cover image

Quick summary

Summarize this blog with AI

Introduction

A lot of SQL discussions sound more ideological than technical. One person says CTEs are cleaner. Another says subqueries are faster. Someone else says serious workloads should use temp tables. That argument usually goes nowhere because the three tools solve different problems.

The right question is not "which one is best?" The right question is "what do I need this intermediate step to do?"

The short decision framework

Use this When you want Why it fits
Subquery A small, one-use intermediate result that stays close to the consuming logic It keeps the query local and avoids introducing extra named steps
CTE A readable multi-step query, reusable derived logic inside one statement, or recursion It lets you name stages and reason about them one at a time
Temp table A materialized intermediate result you want to inspect, index, reuse, or join multiple times It breaks a large problem into stable physical steps

That is the answer most people are looking for. The rest is nuance.

Start from the job each pattern does

A subquery is still just a query expression. It is useful when the intermediate logic is simple and only matters in one place.

A CTE is a named query expression. The main benefit is clarity. It gives a complex statement checkpoints.

A temp table is different. It is not only a readability tool. It creates a physical intermediate result for the session. That means you can inspect it, reuse it in later statements, and sometimes improve performance by indexing or by giving the optimizer a simpler next step.

One business question, three implementations

Suppose you need users who placed at least two paid orders in their first 30 days after signup.

Subquery version

select u.user_id
from users u
join (
    select
        o.user_id,
        count(*) as paid_orders_first_30_days
    from orders o
    join users u2
      on u2.user_id = o.user_id
    where o.status = 'paid'
      and o.created_at >= u2.signup_at
      and o.created_at < u2.signup_at + interval '30 day'
    group by o.user_id
) x
  on u.user_id = x.user_id
where x.paid_orders_first_30_days >= 2;

This is fine if the derived result is single-use and still easy to read.

CTE version

with paid_orders_first_30_days as (
    select
        o.user_id,
        count(*) as paid_orders_first_30_days
    from orders o
    join users u
      on u.user_id = o.user_id
    where o.status = 'paid'
      and o.created_at >= u.signup_at
      and o.created_at < u.signup_at + interval '30 day'
    group by o.user_id
)
select user_id
from paid_orders_first_30_days
where paid_orders_first_30_days >= 2;

The result is the same, but the logic is easier to name, review, and extend.

Temp table version

create temporary table paid_orders_first_30_days as
select
    o.user_id,
    count(*) as paid_orders_first_30_days
from orders o
join users u
  on u.user_id = o.user_id
where o.status = 'paid'
  and o.created_at >= u.signup_at
  and o.created_at < u.signup_at + interval '30 day'
group by o.user_id;

create index on paid_orders_first_30_days (paid_orders_first_30_days);

select user_id
from paid_orders_first_30_days
where paid_orders_first_30_days >= 2;

This version makes sense when that intermediate set is large, reused later, or worth indexing and inspecting.

When a subquery is the cleanest answer

Subqueries are underrated because many people only remember the ugly correlated ones that run once per outer row. But a normal derived table or exists subquery is often the most direct expression of the logic.

Use a subquery when:

  • The intermediate result is used once.
  • Naming the step would not add much clarity.
  • The logic belongs tightly next to the join or filter that consumes it.
  • exists expresses the question better than a join.

For example, this is often cleaner than introducing a CTE:

select c.customer_id
from customers c
where exists (
    select 1
    from orders o
    where o.customer_id = c.customer_id
      and o.status = 'paid'
);

The danger is not "subquery." The danger is repeated or correlated work that scales poorly.

When a CTE is the best default

CTEs shine when the problem has natural stages. If you are deduplicating, then classifying, then aggregating, naming each step reduces mistakes.

Use a CTE when:

  • The statement has multiple conceptual phases.
  • You want to separate grain-changing steps from filters.
  • You need recursion.
  • You want teammates to understand the query without mentally untangling nested parentheses.

A good CTE query reads like a pipeline, not a maze.

with base_events as (
    select user_id, event_time, event_name
    from events
    where event_time >= date '2026-01-01'
),
first_purchase as (
    select user_id, min(event_time) as first_purchase_at
    from base_events
    where event_name = 'purchase'
    group by 1
),
retained_users as (
    select distinct e.user_id
    from base_events e
    join first_purchase p
      on e.user_id = p.user_id
    where e.event_time >= p.first_purchase_at + interval '7 day'
)
select count(*) as retained_users
from retained_users;

Each CTE answers one question. That is the main reason to use it.

CTEs are not a guaranteed performance trick

This is where a lot of confusion starts. A CTE can improve readability without improving runtime at all. In some engines it may be inlined into the outer query. In others it may be materialized. In some workloads the same CTE structure can help one query and hurt another.

The safe rule is:

Use CTEs for clarity first. Verify performance with the execution plan instead of assuming the optimizer treated the CTE the way you hoped.

That is the practical answer both in real work and in interviews.

When temp tables earn their extra ceremony

Temp tables are more work, so they need a real reason. Good reasons include:

  • You need to inspect the intermediate result before continuing.
  • The intermediate step is expensive and reused multiple times.
  • The next step benefits from indexing the intermediate rows.
  • The optimizer keeps producing unstable plans for one giant statement.
  • You want to break a hard-to-debug query into deterministic phases.

This is common in ETL, messy interview take-homes, and performance debugging. A temp table turns a giant moving target into a concrete dataset you can query directly.

A common anti-pattern: stacking CTEs just to look organized

Not every query should become a ladder of ten CTEs. If each step is referenced once and only renames columns without changing the logic, the query becomes verbose without becoming clearer.

Good named steps answer real questions. Bad named steps force the reader to hop between blocks that do not add meaning.

Ask of each CTE:

  • Does it isolate a change in grain?
  • Does it make a filter safer?
  • Does it make a business rule obvious?
  • Would debugging be harder without it?

If the answer is no, the extra layer may be noise.

A common anti-pattern: using temp tables to hide uncertainty

Temp tables are not a substitute for understanding the grain. If your counts are wrong because the join explodes rows, moving the broken step into a temp table only preserves the mistake in a new place.

Use temp tables when materialization helps. Do not use them as a ritual.

Interview answer: how to explain the tradeoff well

A strong interview answer sounds like this:

I pick the tool based on the job. I use a subquery when the intermediate logic is small and single-use, a CTE when the query has multiple logical stages or recursion, and a temp table when I need to materialize, inspect, reuse, or index an expensive intermediate result. I do not assume one is always faster; I check the plan.

That answer is concise, technical, and mature. It shows you understand both readability and execution.

Performance checklist before switching patterns

If a query is slow, do not change the syntax blindly. Check these first:

  1. Is the join order causing row explosion?
  2. Are filters applied at the right step?
  3. Is the expensive part repeated because of a correlated subquery?
  4. Would materializing the intermediate result reduce repeated work?
  5. Would an index on a temp table help the next join or filter?
  6. Does the execution plan show the optimizer already inlined the CTE?

Sometimes the right fix is a temp table. Sometimes the right fix is a better predicate. Sometimes the right fix is changing in to exists. Syntax alone is rarely the whole story.

Practical rules that hold up well

  • Default to the clearest form you can explain.
  • Use exists for existence checks.
  • Use CTEs for multi-step logic and recursion.
  • Use temp tables when the intermediate result becomes a real asset.
  • Validate performance with explain, not folklore.

FAQ

Are CTEs always slower than subqueries?

No. In many engines the optimizer can transform them into similar plans. The right answer depends on the database, the query shape, and whether the intermediate result is reused or materialized.

Should I replace every subquery with a CTE for readability?

No. If the subquery is small and only used once, keeping it local can be clearer than scrolling through extra named blocks.

When should I switch from a CTE to a temp table?

Switch when you need to inspect the intermediate rows, reuse them across multiple steps, add indexes, or stabilize a query plan that performs poorly as one large statement.

Where do views fit into this decision?

Views are for reusable logic across sessions and users. CTEs and subqueries are statement-local. Temp tables are session-local physical intermediates. They solve different scope problems.

Interview Prep

Begin Your SQL, Python, and R Journey

Master 230 interview-style coding questions and build the data skills needed for analyst, scientist, and engineering roles.

Related Articles

All Articles