Forum

Posted by Luming, Nov. 29, 2021, 3:04 p.m.

SQL 141

Hi Leon,
I have gone through your lectures of Outer Joins and still feels like I am a bit confuse about the difference between putting something in the 'on' clause and in the 'where 'clause. For example,
WITH daily_revenue AS
(SELECT D.date,
SUM(CASE WHEN total_amt IS NOT NULL THEN total_amt ELSE 0 END) AS daily_revenue
FROM dates D
LEFT JOIN food_order fo
ON DATE(fo.created_at)=D.date AND restaurant_id =100011
WHERE D.date >='2021-08-01' AND D.date <='2021-08-31'
GROUP BY 1
)
SELECT date,
SUM(daily_revenue)OVER(ORDER BY date)AS cumulative_daily_rev
FROM daily_revenue
My question here is why restaurant_id =100011 has to be in the 'on' clause? Why I cannot put it in the 'WHERE' clause?

Answers

Hey Luming,
Great question, glad to hear you went through all the OUTER JOIN lectures.
This is exactly why we need to learn by practicing real coding problems. : )
This boils down to the ordering of how the series of commands are executed by the database engine, if you put the condition (restaurant_id=100011) in the WHERE clause, since WHERE clause is executed later, if there are days when there were no orders for this restaurant, it will filter them out: but we still want to keep a record of 0 revenue days.
Instead, if we put the condition in the LEFT JOIN predicate, if a date has no order, all of the columns on food_order table will be NULL (which was later converted into 0 through COALESCE), if we add this condition in the where clause, this row will simply disappear from your results.
If this is still confusing, I would suggest breaking down the query into multiple steps, and manually checking the results until you fully get it.
This is actually a very important (advanced) usage of JOIN, and this question is very tricky.
Keep up the good work Luming!
Cheers,
Leon
SQLPad user avatar

Leon (949)

Nov. 30, 2021, 9:05 a.m.