Quick summary
Summarize this blog with AI
Introduction
Window functions become much easier once you separate three ideas: partition, order, and frame. Most confusing results come from the third one.
People learn SUM() OVER (...) or LAST_VALUE() from examples that happen to work, then get surprised when adding ORDER BY changes the answer. In many cases the database is not misbehaving. The frame changed, and the query author never made that frame explicit.
This guide explains what a window frame actually is, why ROWS and RANGE behave differently, when the default frame quietly changes your result, and how to write running totals and moving averages without guesswork.
Partition, Order, and Frame Are Different Decisions
Take this expression:
SUM(revenue) OVER (
PARTITION BY account_id
ORDER BY event_date
)
It contains three ideas:
| Piece | What it controls | Question it answers |
|---|---|---|
PARTITION BY |
Which rows belong to the same logical window | Which rows are even comparable here? |
ORDER BY |
The sequence inside that window | In what order should the rows be read? |
| Frame clause | Which ordered rows are visible from the current row | How far backward or forward can this function see? |
If you understand only partition and order, you still do not know what rows the function can see at each step.
The Default Frame That Surprises People
Exact behavior varies by dialect, but in many engines an ordered aggregate window like this:
SUM(revenue) OVER (
PARTITION BY account_id
ORDER BY event_date
)
behaves roughly like this:
SUM(revenue) OVER (
PARTITION BY account_id
ORDER BY event_date
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
That means the function usually sees rows from the start of the partition through the current row, not the entire partition. So the result is often cumulative, not partition-wide.
This is why adding ORDER BY can change SUM(), MAX(), or LAST_VALUE() from a whole-partition answer into a running answer.
ROWS vs RANGE
The biggest difference shows up when the sort key has duplicates.
Suppose the ordered rows are:
event_date revenue
---------- -------
2026-03-01 100
2026-03-01 50
2026-03-02 80
Now compare these two expressions:
SUM(revenue) OVER (
ORDER BY event_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_sum_rows
SUM(revenue) OVER (
ORDER BY event_date
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_sum_range
The results are different:
| event_date | revenue | ROWS result |
RANGE result |
|---|---|---|---|
| 2026-03-01 | 100 | 100 | 150 |
| 2026-03-01 | 50 | 150 | 150 |
| 2026-03-02 | 80 | 230 | 230 |
ROWS uses physical row position after sorting. RANGE groups peers that share the same ordered value. That is why both 2026-03-01 rows can receive the same cumulative total under RANGE.
If you expected a row-by-row running sum and got repeated totals instead, this is usually the reason.
If duplicate sort keys are common and you need one-row-at-a-time behavior, ROWS is usually the safer choice.
| Frame mode | Meaning of CURRENT ROW |
Why it matters |
|---|---|---|
ROWS |
The literal current row position | Good for row-by-row running totals and exact trailing-row windows |
RANGE |
The current row's peer group in the ORDER BY |
Duplicate sort values can share the same result |
GROUPS where supported |
The current peer group as a unit | Useful when you want peer-group steps instead of row steps |
CURRENT ROW does not mean the same thing in every frame mode.
What CURRENT ROW Means in Different Modes
That is why frame confusion often shows up in running totals and LAST_VALUE(), not in row numbering itself.
Ranking functions like ROW_NUMBER() and RANK() are driven primarily by partitioning, ordering, and peer groups. They still depend on a clear ORDER BY, but the frame is usually not the thing that surprises people there.
The frame matters most for aggregate window functions and value functions such as FIRST_VALUE(), LAST_VALUE(), and NTH_VALUE(). Those functions can return very different answers when the frame changes.
Not Every Window Function Cares About the Frame
Safe Patterns for Common Tasks
| Task | Recommended pattern | Why |
|---|---|---|
| Row-by-row running total | ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
Counts physical rows in order and avoids peer grouping surprises |
| Trailing N-row moving average | ROWS BETWEEN N PRECEDING AND CURRENT ROW |
Makes the exact row window explicit |
| Whole-partition max or min repeated on every row | Usually omit ORDER BY |
Avoids turning a partition-wide metric into a running metric |
| Final value in the partition | Use a frame ending with UNBOUNDED FOLLOWING |
Lets the function see future rows too |
If you need deterministic row-by-row behavior, prefer ROWS and add a stable tie-breaker to the window ORDER BY.
Running Totals and Moving Averages Without Guesswork
For a true row-by-row cumulative sum, write the frame explicitly:
SUM(revenue) OVER (
PARTITION BY account_id
ORDER BY event_date, event_id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_revenue
The extra event_id tie-breaker matters when multiple rows share the same date. It makes the order deterministic in the same way ranking queries do in the latest-row-per-group pattern.
For a trailing three-row moving average:
AVG(revenue) OVER (
PARTITION BY account_id
ORDER BY event_date, event_id
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS avg_last_3_rows
That means exactly what it says: current row plus the two rows immediately before it in the ordered sequence.
Why LAST_VALUE() Looks Broken
LAST_VALUE() confuses people because the default frame often ends at the current row. So this query:
LAST_VALUE(status) OVER (
PARTITION BY account_id
ORDER BY status_ts
) AS last_status
often returns the current row's status, not the final status in the partition. The function is doing exactly what the frame allowed it to do.
If you want the true final value across the whole partition, widen the frame:
LAST_VALUE(status) OVER (
PARTITION BY account_id
ORDER BY status_ts
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS final_status_in_partition
That one line changes the meaning from "last visible value so far" to "last value in the full partition."
When to Omit ORDER BY
If the business question is partition-wide and does not depend on sequence, do not add ordering just because window functions often include it in examples.
MAX(revenue) OVER (
PARTITION BY account_id
) AS max_revenue_in_partition
This is usually clearer and safer than adding ORDER BY and then compensating for the frame behavior later.
Dialect Notes Worth Remembering
- Exact frame defaults can vary by database, so check your dialect if a result feels surprising.
- Support for some
RANGEforms is more limited than support forROWS, especially when using offsets. - If you need deterministic row-by-row behavior across duplicate sort keys,
ROWSplus a stable secondary sort key is usually the safest choice.
The goal is not to memorize every dialect difference. The goal is to stop relying on implicit behavior when the frame affects the result.
A Practical Debug Checklist
- Ask what rows the current row should be able to see.
- Check whether you accidentally changed a partition-wide metric into a cumulative metric by adding
ORDER BY. - Look for duplicate sort values that make
RANGEgroup peers together. - Add a tie-breaker to the ordered window when row sequence matters.
- Write the frame explicitly when the result is important enough that default behavior would be risky.
Most window-function confusion disappears once the frame is treated as a first-class design choice instead of an invisible default.
FAQ
Why did adding ORDER BY change my SUM() or MAX() result?
Because in many engines the ordered window changed from a whole-partition aggregate to a frame that runs from the beginning of the partition through the current row. That makes the result cumulative unless you define a different frame.
When should I use ROWS instead of RANGE?
Use ROWS when you care about physical row positions after sorting, such as running totals and moving averages over exact rows. Use RANGE only when peer grouping by equal sort values is actually what you want.
Why does LAST_VALUE() keep returning the current row?
Because the default frame often stops at the current row. If the function cannot see future rows, the last visible value is the current one. Extend the frame to UNBOUNDED FOLLOWING when you want the final value in the partition.