Recursive CTEs in SQL: Hierarchies, Paths, Cycle Checks, and Debugging

SQL Updated Apr 9, 2026 7 mins read Leon Leon
Recursive CTEs in SQL: Hierarchies, Paths, Cycle Checks, and Debugging cover image

Quick summary

Summarize this blog with AI

Introduction

Recursive CTEs confuse people for a predictable reason: they are one of the few places where a SQL query refers to its own output while it is still being built. If the anchor set is wrong, the recursion walks the wrong graph. If the join points in the wrong direction, the tree goes sideways. If termination is weak, the query keeps expanding until the engine stops it or your patience runs out.

The goal is not to memorize the syntax. The goal is to build a mental model you can debug under pressure.

The mental model that makes recursion easier

A recursive CTE has two parts:

  1. The anchor member: the starting rows.
  2. The recursive member: the rule that finds the next rows from the current rows.

The engine runs the anchor first, then repeatedly feeds the growing result back into the recursive member until no new rows are produced or a recursion limit is hit.

If you can answer these three questions, you can usually write the query:

  • What are my starting rows?
  • How do I move one step outward?
  • What prevents the walk from continuing forever?

Basic hierarchy example

Suppose employees has employee_id, manager_id, and full_name. You want the full org tree beneath one manager.

with recursive org as (
    select
        employee_id,
        manager_id,
        full_name,
        0 as depth
    from employees
    where employee_id = 101

    union all

    select
        e.employee_id,
        e.manager_id,
        e.full_name,
        o.depth + 1 as depth
    from employees e
    join org o
      on e.manager_id = o.employee_id
)
select *
from org
order by depth, employee_id;

Read it in English:

  • Start at employee 101.
  • Then find rows whose manager_id equals the current employee.
  • Repeat.

That is the whole pattern.

Direction matters more than syntax

A lot of bugs come from joining the relationship backward.

  • e.manager_id = o.employee_id walks down the hierarchy.
  • e.employee_id = o.manager_id walks up the hierarchy.

If the output feels inverted, check the edge direction before changing anything else.

Adding paths so the result becomes explainable

Depth is useful. Paths are better. They let you see exactly how the recursion reached each row.

with recursive org as (
    select
        employee_id,
        manager_id,
        full_name,
        0 as depth,
        cast(full_name as varchar(500)) as path
    from employees
    where employee_id = 101

    union all

    select
        e.employee_id,
        e.manager_id,
        e.full_name,
        o.depth + 1,
        concat(o.path, ' > ', e.full_name) as path
    from employees e
    join org o
      on e.manager_id = o.employee_id
)
select *
from org
order by path;

When debugging recursion, a path column is one of the fastest ways to spot duplicates, loops, and wrong joins.

Why recursive CTEs loop forever

A recursive query runs forever when the recursive member can keep reintroducing rows that were already seen or when there is a cycle in the data.

Common causes:

  • Parent-child data contains a cycle.
  • The join condition is too broad.
  • The anchor starts with too many rows.
  • union all keeps adding duplicates that the recursion can revisit.
  • The recursive member does not shrink the search space.

You need an explicit guard.

Cycle protection pattern

In Postgres, one convenient guard is to carry a visited array.

with recursive org as (
    select
        employee_id,
        manager_id,
        full_name,
        array[employee_id] as visited_ids,
        0 as depth
    from employees
    where employee_id = 101

    union all

    select
        e.employee_id,
        e.manager_id,
        e.full_name,
        o.visited_ids || e.employee_id,
        o.depth + 1
    from employees e
    join org o
      on e.manager_id = o.employee_id
    where not e.employee_id = any(o.visited_ids)
)
select *
from org;

If your dialect does not support arrays, the fallback is usually a path string plus a containment check, or a hard depth limit while you validate the data.

Date expansion is recursion too

Hierarchies are the classic use case, but recursive CTEs also help with expanding ranges into repeated units when the engine lacks a helper like generate_series.

with recursive subscription_days as (
    select
        subscription_id,
        start_date as day,
        end_date
    from subscriptions

    union all

    select
        subscription_id,
        day + interval '1 day',
        end_date
    from subscription_days
    where day < end_date
)
select subscription_id, day
from subscription_days
order by subscription_id, day;

This is the same mental model:

  • Start at start_date.
  • Move forward one day.
  • Stop at end_date.

That is why people who understand recursive CTEs can also reason better about date spines and range expansion.

Debugging recursive CTEs step by step

When a recursive query breaks, do not stare at the full statement. Debug it in order.

1. Test the anchor by itself

Run only the anchor member. If it returns the wrong start rows, the rest of the query never had a chance.

2. Test one recursion step manually

Take one known row from the anchor and run the recursive join as a normal query. Make sure the next hop is correct.

3. Add a depth column immediately

A depth column tells you whether the recursion is growing the way you expect.

4. Add a path or visited set

If rows repeat, the path makes the repetition visible.

5. Add a temporary depth cap while debugging

where o.depth < 10

That does not solve the real problem, but it protects you while you inspect the output.

6. Check whether union or union all is correct

union all is usually faster and more transparent, but it preserves duplicates. union can remove duplicate rows, which sometimes masks a logic issue and sometimes helps terminate a recursion that naturally revisits identical rows. Choose intentionally.

Performance rules that matter in practice

Recursive CTEs can be expensive because each step may join back to a large base table.

The biggest practical wins are:

  • Index the join key, such as manager_id or parent_id.
  • Keep the anchor narrow.
  • Filter early inside the recursive member when possible.
  • Carry only columns you need.
  • Avoid expensive expressions inside every recursive step.
  • Materialize results in a temp table if you need to reuse them downstream.

If the recursive walk is deep and the graph is large, inspect the execution plan. Sometimes the right answer is still recursion. Sometimes the right answer is precomputed hierarchy tables or a different model entirely.

Dialect notes worth remembering

  • Postgres: recursive CTEs are flexible and work well with arrays for cycle checks.
  • MySQL 8: supports recursive CTEs, but you may need string-based visited tracking instead of arrays.
  • SQL Server: supports recursive CTEs and also gives you option (maxrecursion n) for protection during debugging.
  • SQLite: supports recursion, but some recursive references are only valid in certain parts of the query, so placement matters.

The syntax looks similar across engines, but the safety tools and limits differ.

A strong interview explanation

If you need to explain recursive CTEs quickly, say this:

A recursive CTE has a starting set and a rule that finds the next set from the current result. It keeps iterating until no new rows are found or a recursion guard stops it. I use it for hierarchies, graph walks, and range expansion, and I always think about cycle protection and termination.

That answer shows you understand both the mechanism and the risks.

When not to use recursive CTEs

Do not reach for recursion if the data is not actually recursive. A normal self join is simpler for fixed-depth problems. A calendar table is simpler for repeated business-day logic. A temp table or batch process may be better when the hierarchy is huge and reused constantly.

Recursive CTEs are powerful, but they are not a badge of sophistication. They are a tool for problems that genuinely expand step by step.

FAQ

What is the difference between the anchor member and the recursive member?

The anchor member returns the starting rows. The recursive member uses those rows to find the next rows. The engine keeps rerunning the recursive member against the growing result until termination.

Why does my recursive CTE return duplicates?

Usually because the join condition is too broad, the data contains multiple paths to the same row, or you are using union all without a visited-set guard.

How do I stop a recursive CTE from looping forever?

Use a cycle check, validate the relationship data, and add a termination condition such as a visited set or a depth cap while debugging.

Should I use recursion for date generation?

Use recursion when your database does not have a simpler series generator and the range is bounded. If you repeatedly need business calendars or long reusable ranges, a permanent calendar table is usually better.

Interview Prep

Begin Your SQL, Python, and R Journey

Master 230 interview-style coding questions and build the data skills needed for analyst, scientist, and engineering roles.

Related Articles

All Articles