SQL Cookbook Lesson

How to select the first row in each group by group with SQL

Learn how to select the first row in each group by group with SQL with examples and explanations from SQLPad.

Problem

How can I select the first row of each group in a SQL query?

Sample data

Consider a sales table as below:

id product sale_date
1 A 2020-01-01
2 A 2020-01-02
3 B 2020-01-01
4 B 2020-01-03
5 C 2020-01-01
6 C 2020-01-04

Here, we have multiple sales records for each product.

MySQL Solution

SELECT s1.*
FROM sales s1
LEFT JOIN sales s2
ON s1.product = s2.product AND s1.sale_date > s2.sale_date
WHERE s2.id IS NULL;

Explanation (MySQL)

In MySQL, the problem of selecting the first row in each group can be solved using a self join. The query joins the sales table to itself where product matches and the first table's sale date is later than the second's. The WHERE clause excludes rows where there is no smaller sale date, which means it only keeps the earliest sale for each product.

PostgreSQL Solution

SELECT DISTINCT ON (product) *
FROM sales
ORDER BY product, sale_date;

Explanation (PostgreSQL)

PostgreSQL provides a DISTINCT ON statement that returns the first record for each group of rows, where the groups are defined by the columns listed inside the parentheses. In this case, it will return the first record for each group of rows that have the same product. The ORDER BY clause after the DISTINCT ON statement is mandatory and it will determine which row is considered the "first" in each group. Here, we are ordering by sale_date, so the first sale for each product will be returned.

Keep Practicing

Related Cookbook Lessons

View all lessons

Interview Prep

Begin Your SQL, Python, and R Journey

Master 230 interview-style coding questions and build the data skills needed for analyst, scientist, and engineering roles.