SQL Retention Queries: Cohorts, Return Windows, and the Edge Cases That Break Them

SQL Updated Mar 25, 2026 7 mins read Leon Leon
SQL Retention Queries: Cohorts, Return Windows, and the Edge Cases That Break Them cover image

Quick summary

Summarize this blog with AI

Introduction

Retention questions look simple until you force the metric to become precise. “What is day 7 retention?” sounds straightforward, but the answer changes immediately once you ask how the cohort is defined, what counts as a return, and whether the window is exact-day or within-range.

That is why retention SQL shows up so often in analytics interviews. It is not only testing syntax. It is testing whether you can turn a fuzzy product metric into a defensible query.

If you skip the definition step, the SQL may run perfectly and still answer the wrong question.

Why Retention Answers Disagree So Often

Most broken retention queries fail for one of these reasons:

  • the cohort event was never defined clearly
  • the return window is ambiguous
  • the activity table duplicates users inside the numerator
  • incomplete current periods are mixed with mature cohorts
  • events are counted instead of users
  • the query uses calendar months when the business question is really “months since signup”

That is why two people can both write “correct” SQL and still disagree on the answer.

Define the Metric Before You Write SQL

Lock down these four choices first:

  • Cohort event: signup, first purchase, first feature use, or something else?
  • Return event: any activity, a purchase, a session, or a specific action?
  • Time grain: days, weeks, or months?
  • Window rule: exact-day, within-window, rolling, or fixed calendar period?

A lot of confusion disappears once you write the metric in plain English.

Common Retention Definitions Are Not Interchangeable

Metric label What it usually means Typical SQL rule
Day 7 retention user returned exactly 7 days later activity_date = cohort_date + 7 days
Days 1-7 retention user returned at least once during days 1 through 7 activity_date BETWEEN cohort_date + 1 AND cohort_date + 7
Week 1 retention user was active during the first post-cohort week compare user activity to cohort-relative week number
Month 2 retention user was active two months after cohort month compare month_number = 2

If you do not pin this down before you type, the result will be unstable.

Build a Clean Base Table First

Do not try to solve retention directly from a noisy event stream. Build a clean base first.

For most interview questions, that means:

  1. one row per user with the cohort date or cohort month
  2. one distinct activity row per user per reporting period
  3. a derived offset from cohort period to activity period

Example starting point:

WITH signups AS (
    SELECT
        user_id,
        signup_at::date AS cohort_date
    FROM users
), activity_days AS (
    SELECT DISTINCT
        user_id,
        event_at::date AS activity_date
    FROM events
    WHERE event_name = 'app_open'
)
SELECT *
FROM signups s
JOIN activity_days a
  ON a.user_id = s.user_id;

The DISTINCT is important. Retention is usually a user-level metric. Multiple events from the same user in the same window should not inflate the numerator.

Worked Example: Exact Day 7 Retention by Signup Date

Suppose the prompt is:

“For each signup date, calculate the percentage of users who came back exactly 7 days later.”

WITH signups AS (
    SELECT
        user_id,
        signup_at::date AS cohort_date
    FROM users
), activity_days AS (
    SELECT DISTINCT
        user_id,
        event_at::date AS activity_date
    FROM events
    WHERE event_name = 'app_open'
), retained_users AS (
    SELECT DISTINCT
        s.user_id,
        s.cohort_date
    FROM signups s
    JOIN activity_days a
      ON a.user_id = s.user_id
     AND a.activity_date = s.cohort_date + INTERVAL '7 days'
), cohort_sizes AS (
    SELECT
        cohort_date,
        COUNT(*) AS cohort_users
    FROM signups
    GROUP BY cohort_date
), retained_counts AS (
    SELECT
        cohort_date,
        COUNT(*) AS retained_users
    FROM retained_users
    GROUP BY cohort_date
)
SELECT
    c.cohort_date,
    c.cohort_users,
    COALESCE(r.retained_users, 0) AS retained_users,
    ROUND(COALESCE(r.retained_users, 0)::numeric / c.cohort_users, 4) AS day_7_retention
FROM cohort_sizes c
LEFT JOIN retained_counts r
  ON c.cohort_date = r.cohort_date
ORDER BY c.cohort_date;

The quality checks built into this version are important:

  • cohort sizes are aggregated separately
  • returning users are deduplicated at user-day grain
  • each user contributes at most one row to the numerator
  • the final ratio is calculated after the counts are stable

Worked Example: Within-Window Day 7 Retention

Now change the metric slightly:

“For each signup date, what percentage of users returned at least once during days 1 through 7?”

That is not the same query. The return condition changes.

WITH signups AS (
    SELECT
        user_id,
        signup_at::date AS cohort_date
    FROM users
), activity_days AS (
    SELECT DISTINCT
        user_id,
        event_at::date AS activity_date
    FROM events
    WHERE event_name = 'app_open'
), retained_users AS (
    SELECT DISTINCT
        s.user_id,
        s.cohort_date
    FROM signups s
    JOIN activity_days a
      ON a.user_id = s.user_id
     AND a.activity_date >= s.cohort_date + INTERVAL '1 day'
     AND a.activity_date <= s.cohort_date + INTERVAL '7 days'
)
SELECT
    s.cohort_date,
    COUNT(DISTINCT s.user_id) AS cohort_users,
    COUNT(DISTINCT r.user_id) AS retained_users,
    ROUND(COUNT(DISTINCT r.user_id)::numeric / COUNT(DISTINCT s.user_id), 4) AS day_1_7_retention
FROM signups s
LEFT JOIN retained_users r
  ON s.user_id = r.user_id
 AND s.cohort_date = r.cohort_date
GROUP BY s.cohort_date
ORDER BY s.cohort_date;

Same topic, different metric. That is exactly why retention questions are so easy to get wrong if you do not confirm the definition first.

Worked Example: Monthly Cohort Retention

Monthly retention is usually easier to reason about when you compute a cohort-relative month number.

WITH signups AS (
    SELECT
        user_id,
        DATE_TRUNC('month', signup_at)::date AS cohort_month
    FROM users
), activity_months AS (
    SELECT DISTINCT
        user_id,
        DATE_TRUNC('month', event_at)::date AS activity_month
    FROM events
    WHERE event_name = 'app_open'
), user_month_offsets AS (
    SELECT
        s.user_id,
        s.cohort_month,
        a.activity_month,
        (
            DATE_PART('year', age(a.activity_month, s.cohort_month)) * 12
            + DATE_PART('month', age(a.activity_month, s.cohort_month))
        )::int AS month_number
    FROM signups s
    JOIN activity_months a
      ON a.user_id = s.user_id
     AND a.activity_month >= s.cohort_month
)
SELECT
    cohort_month,
    month_number,
    COUNT(DISTINCT user_id) AS retained_users
FROM user_month_offsets
GROUP BY cohort_month, month_number
ORDER BY cohort_month, month_number;

This handles skipped periods correctly. A user can disappear in month 1 and return in month 2. Whether that counts depends on the metric, but the SQL now makes that behavior explicit.

Exclude Incomplete Cohorts When the Metric Needs Maturity

One of the most common real-world mistakes is comparing cohorts that have not had enough time to mature.

If you are calculating day 7 retention, a user who signed up yesterday should not be included yet. The cohort has not had the chance to produce a day 7 return.

A simple maturity filter can save you from fake performance drops:

WHERE cohort_date <= CURRENT_DATE - INTERVAL '7 days'

The same idea applies to weekly and monthly retention. Never compare a partial current period against fully matured older periods unless the analysis explicitly intends to do that.

What to Say in the Interview

A strong explanation sounds like this:

“Before I write the SQL, I want to confirm the cohort event, the return event, and whether day 7 means exactly seven days later or anytime during the first seven days. Then I will deduplicate activity to one row per user per window, calculate cohort sizes separately, and make sure the numerator is user-level rather than event-level.”

That explanation does two things:

  • it proves you understand why the metric is ambiguous
  • it shows you know how to protect the numerator and denominator from duplication

Validation Checklist

Before trusting the output, ask these questions:

  • Is the numerator ever larger than the cohort size?
  • Are users counted once per window or once per event?
  • Are incomplete cohorts excluded when the metric requires maturity?
  • Does the query include activity before the cohort event?
  • Am I grouping by calendar period or by periods since cohort, and is that what the prompt wants?

These checks catch most retention bugs quickly.

If you want the adjacent interview skills around this topic:

FAQ

Is day 7 retention the same as returning anytime in the first seven days?

No. Exact day 7 retention means activity exactly seven days after the cohort event. Within-window retention means at least one return during a defined range such as days 1 through 7.

Do I need a date spine for every retention query?

No. A date spine is helpful when you need a dense reporting table with explicit zero-activity periods. For many interview questions, a clean cohort table plus deduplicated activity periods is enough.

How should I handle multiple return events in the same window?

Count the user once per cohort-window combination unless the prompt explicitly asks for event frequency. Retention is usually a user metric, not an event-count metric.

Interview Prep

Begin Your SQL, Python, and R Journey

Master 230 interview-style coding questions and build the data skills needed for analyst, scientist, and engineering roles.

Related Articles

All Articles