Forum

Posted by Sylvia, April 7, 2022, 1:22 p.m.

Question #59 Out of Memory Error

I think this is the most efficient way to get the percentage of revenue for each of the following films: film_id <= 10 by its category. However, I get the "Out of Memory Error". I am trying to understand why the solution is MORE efficient than my codes. Can anyone please help? 

My solution: 

SELECT X.* FROM
(

SELECT distinct a.film_id, 
c.name AS category_name, 
SUM(f.amount) OVER (PARTITION BY a.film_id) *100.0/SUM(f.amount) OVER (PARTITION BY c.name) AS revenue_percent_category
FROM film a
INNER JOIN film_category b
ON a.film_id = b.film_id
INNER JOIN category c
ON b.category_id = c.category_id
INNER JOIN inventory d
ON b.film_id = d.film_id
INNER JOIN rental e
ON d.inventory_id = e.inventory_id
INNER JOIN payment f
ON e.rental_id = f.rental_id
ORDER BY a.film_id
) X
WHERE X.film_id <= 10

 

The answer: 

WITH movie_revenue AS (
    SELECT
        I.film_id, SUM(P.amount) revenue
    FROM payment P
    INNER JOIN rental R
    ON R.rental_id = P.rental_id
    INNER JOIN inventory I
    ON I.inventory_id = R.inventory_id    
    GROUP BY I.film_id
)
SELECT 
    MR.film_id, 
    C.name category_name,
    revenue * 100.0 / SUM(revenue) OVER(PARTITION BY C.name) revenue_percent_category
FROM movie_revenue MR
INNER JOIN film_category FC
  ON FC.film_id = MR.film_id
INNER JOIN category C
  ON C.category_id = FC.category_id
ORDER BY film_id
LIMIT 10
;

Answers

Also, I think the "LIMIT 10" in the solution may be misleading in the case when some films with film_id <= 10 do not have any rental revenue. 

For example, if the film with film_id = 2 does not have any rental revenue, then film_id = 11 will show up in the result of "LIMIT 10" but it will not show in my solution below. Let me know whether I'm missing anything. Thanks! 

 

WITH revenue AS
(SELECT distinct a.film_id, 
 sum(d.amount) AS revenue
 FROM film a
INNER JOIN inventory b
ON a.film_id = b.film_id
INNER JOIN rental c
ON b.inventory_id = c.inventory_id
INNER JOIN payment d
ON c.rental_id = d.rental_id
GROUP BY a.film_id
ORDER BY a.film_id
 )

SELECT * FROM
(
SELECT distinct a.film_id, c.name AS category_name, 
a.revenue * 100.0/ SUM(a.revenue) OVER (PARTITION BY c.name)
AS revenue_percent_category
FROM revenue a
INNER JOIN film_category b
ON a.film_id = b.film_id 
INNER JOIN category c
ON b.category_id = c.category_id
ORDER BY a.film_id
) X
WHERE X.film_id <= 10

Hi Sylvia,

Your code looks good but I am afraid it's not efficient due to the DISTINCT operation.

The trick is instead of do SELECT DISTINCT, you select all the columns then group by all of them like the following:

SELECT X.film_id, category_name, revenue_percent_category FROM
(

SELECT  a.film_id, 
c.name AS category_name, 
SUM(f.amount) OVER (PARTITION BY a.film_id) *100.0/SUM(f.amount) OVER (PARTITION BY c.name) AS revenue_percent_category
FROM film a
INNER JOIN film_category b
ON a.film_id = b.film_id
INNER JOIN category c
ON b.category_id = c.category_id
INNER JOIN inventory d
ON b.film_id = d.film_id
INNER JOIN rental e
ON d.inventory_id = e.inventory_id
INNER JOIN payment f
ON e.rental_id = f.rental_id
ORDER BY a.film_id
) X
WHERE X.film_id <= 10
GROUP BY 1,2,3

 

 

Leon, April 7, 2022, 2:39 p.m.
SQLPad user avatar

Sylvia (34)

April 7, 2022, 1:40 p.m.