Quick summary
Summarize this blog with AI
Introduction
Recursive CTEs confuse people for a predictable reason: they are one of the few places where a SQL query refers to its own output while it is still being built. If the anchor set is wrong, the recursion walks the wrong graph. If the join points in the wrong direction, the tree goes sideways. If termination is weak, the query keeps expanding until the engine stops it or your patience runs out.
The goal is not to memorize the syntax. The goal is to build a mental model you can debug under pressure.
The mental model that makes recursion easier
A recursive CTE has two parts:
- The anchor member: the starting rows.
- The recursive member: the rule that finds the next rows from the current rows.
The engine runs the anchor first, then repeatedly feeds the growing result back into the recursive member until no new rows are produced or a recursion limit is hit.
If you can answer these three questions, you can usually write the query:
- What are my starting rows?
- How do I move one step outward?
- What prevents the walk from continuing forever?
Basic hierarchy example
Suppose employees has employee_id, manager_id, and full_name. You want the full org tree beneath one manager.
with recursive org as (
select
employee_id,
manager_id,
full_name,
0 as depth
from employees
where employee_id = 101
union all
select
e.employee_id,
e.manager_id,
e.full_name,
o.depth + 1 as depth
from employees e
join org o
on e.manager_id = o.employee_id
)
select *
from org
order by depth, employee_id;
Read it in English:
- Start at employee
101. - Then find rows whose
manager_idequals the current employee. - Repeat.
That is the whole pattern.
Direction matters more than syntax
A lot of bugs come from joining the relationship backward.
e.manager_id = o.employee_idwalks down the hierarchy.e.employee_id = o.manager_idwalks up the hierarchy.
If the output feels inverted, check the edge direction before changing anything else.
Adding paths so the result becomes explainable
Depth is useful. Paths are better. They let you see exactly how the recursion reached each row.
with recursive org as (
select
employee_id,
manager_id,
full_name,
0 as depth,
cast(full_name as varchar(500)) as path
from employees
where employee_id = 101
union all
select
e.employee_id,
e.manager_id,
e.full_name,
o.depth + 1,
concat(o.path, ' > ', e.full_name) as path
from employees e
join org o
on e.manager_id = o.employee_id
)
select *
from org
order by path;
When debugging recursion, a path column is one of the fastest ways to spot duplicates, loops, and wrong joins.
Why recursive CTEs loop forever
A recursive query runs forever when the recursive member can keep reintroducing rows that were already seen or when there is a cycle in the data.
Common causes:
- Parent-child data contains a cycle.
- The join condition is too broad.
- The anchor starts with too many rows.
union allkeeps adding duplicates that the recursion can revisit.- The recursive member does not shrink the search space.
You need an explicit guard.
Cycle protection pattern
In Postgres, one convenient guard is to carry a visited array.
with recursive org as (
select
employee_id,
manager_id,
full_name,
array[employee_id] as visited_ids,
0 as depth
from employees
where employee_id = 101
union all
select
e.employee_id,
e.manager_id,
e.full_name,
o.visited_ids || e.employee_id,
o.depth + 1
from employees e
join org o
on e.manager_id = o.employee_id
where not e.employee_id = any(o.visited_ids)
)
select *
from org;
If your dialect does not support arrays, the fallback is usually a path string plus a containment check, or a hard depth limit while you validate the data.
Date expansion is recursion too
Hierarchies are the classic use case, but recursive CTEs also help with expanding ranges into repeated units when the engine lacks a helper like generate_series.
with recursive subscription_days as (
select
subscription_id,
start_date as day,
end_date
from subscriptions
union all
select
subscription_id,
day + interval '1 day',
end_date
from subscription_days
where day < end_date
)
select subscription_id, day
from subscription_days
order by subscription_id, day;
This is the same mental model:
- Start at
start_date. - Move forward one day.
- Stop at
end_date.
That is why people who understand recursive CTEs can also reason better about date spines and range expansion.
Debugging recursive CTEs step by step
When a recursive query breaks, do not stare at the full statement. Debug it in order.
1. Test the anchor by itself
Run only the anchor member. If it returns the wrong start rows, the rest of the query never had a chance.
2. Test one recursion step manually
Take one known row from the anchor and run the recursive join as a normal query. Make sure the next hop is correct.
3. Add a depth column immediately
A depth column tells you whether the recursion is growing the way you expect.
4. Add a path or visited set
If rows repeat, the path makes the repetition visible.
5. Add a temporary depth cap while debugging
where o.depth < 10
That does not solve the real problem, but it protects you while you inspect the output.
6. Check whether union or union all is correct
union all is usually faster and more transparent, but it preserves duplicates. union can remove duplicate rows, which sometimes masks a logic issue and sometimes helps terminate a recursion that naturally revisits identical rows. Choose intentionally.
Performance rules that matter in practice
Recursive CTEs can be expensive because each step may join back to a large base table.
The biggest practical wins are:
- Index the join key, such as
manager_idorparent_id. - Keep the anchor narrow.
- Filter early inside the recursive member when possible.
- Carry only columns you need.
- Avoid expensive expressions inside every recursive step.
- Materialize results in a temp table if you need to reuse them downstream.
If the recursive walk is deep and the graph is large, inspect the execution plan. Sometimes the right answer is still recursion. Sometimes the right answer is precomputed hierarchy tables or a different model entirely.
Dialect notes worth remembering
- Postgres: recursive CTEs are flexible and work well with arrays for cycle checks.
- MySQL 8: supports recursive CTEs, but you may need string-based visited tracking instead of arrays.
- SQL Server: supports recursive CTEs and also gives you
option (maxrecursion n)for protection during debugging. - SQLite: supports recursion, but some recursive references are only valid in certain parts of the query, so placement matters.
The syntax looks similar across engines, but the safety tools and limits differ.
A strong interview explanation
If you need to explain recursive CTEs quickly, say this:
A recursive CTE has a starting set and a rule that finds the next set from the current result. It keeps iterating until no new rows are found or a recursion guard stops it. I use it for hierarchies, graph walks, and range expansion, and I always think about cycle protection and termination.
That answer shows you understand both the mechanism and the risks.
When not to use recursive CTEs
Do not reach for recursion if the data is not actually recursive. A normal self join is simpler for fixed-depth problems. A calendar table is simpler for repeated business-day logic. A temp table or batch process may be better when the hierarchy is huge and reused constantly.
Recursive CTEs are powerful, but they are not a badge of sophistication. They are a tool for problems that genuinely expand step by step.
FAQ
What is the difference between the anchor member and the recursive member?
The anchor member returns the starting rows. The recursive member uses those rows to find the next rows. The engine keeps rerunning the recursive member against the growing result until termination.
Why does my recursive CTE return duplicates?
Usually because the join condition is too broad, the data contains multiple paths to the same row, or you are using union all without a visited-set guard.
How do I stop a recursive CTE from looping forever?
Use a cycle check, validate the relationship data, and add a termination condition such as a visited set or a depth cap while debugging.
Should I use recursion for date generation?
Use recursion when your database does not have a simpler series generator and the range is bounded. If you repeatedly need business calendars or long reusable ranges, a permanent calendar table is usually better.