Question 74

The most productive actors by category

Level: Hard

Instructions:

An actor’s productivity is defined as the number of movies he or she has appeared in.
Write a query that returns the category_id, actor_id, and number of movies for the most productive actor in each category.
For example, if John Doe appeared in more action movies than any other actor, your query should return John's actor_id for the action category.
Do this for every movie category.

Table 1: actor

actor_id	first_name	last_name
1	PENELOPE	GUINESS
2	NICK	WAHLBERG
3	ED	CHASE
4	JENNIFER	DAVIS
5	JOHNNY	LOLLOBRIGIDA

  col_name   | col_type
-------------+--------------------------
 actor_id    | integer
 first_name  | text
 last_name   | text

Table 2: film_actor

actor_id	film_id	last_update
1	1	2017-02-15 10:05:03-08
1	23	2017-02-15 10:05:03-08
1	25	2017-02-15 10:05:03-08
1	106	2017-02-15 10:05:03-08
1	140	2017-02-15 10:05:03-08

Films and their casts

  col_name   | col_type
-------------+--------------------------
 actor_id    | smallint
 film_id     | smallint

Table 3: film_category

film_id	category_id	last_update
1	6	2017-02-15 10:07:09-08
2	11	2017-02-15 10:07:09-08
3	6	2017-02-15 10:07:09-08
4	11	2017-02-15 10:07:09-08
5	8	2017-02-15 10:07:09-08

A film can only belong to one category

  col_name   | col_type
-------------+--------------------------
 film_id     | smallint
 category_id | smallint

Sample results

 category_id | actor_id | num_movies
-------------+----------+------------
           1 |       50 |          6
           2 |      150 |          6
           3 |       17 |          7
           4 |       86 |          6
           5 |      196 |          6
           6 |       48 |          6
           7 |        7 |          7

Solution

postgres

-- Step 1: count the films each actor has appeared in, per category.
WITH actor_movies AS (
  SELECT
    FC.category_id,
    FA.actor_id,
    COUNT(DISTINCT F.film_id) AS num_movies
  FROM film_actor FA
  INNER JOIN film F
    ON F.film_id = FA.film_id
  INNER JOIN film_category FC
    ON FC.film_id = F.film_id
  GROUP BY FC.category_id, FA.actor_id
)
-- Step 3: keep only the top-ranked actor in each category.
SELECT category_id, actor_id, num_movies
FROM (
  -- Step 2: rank actors within each category, most films first.
  SELECT
    category_id,
    actor_id,
    num_movies,
    ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY num_movies DESC) AS productivity_idx
  FROM actor_movies
) X
WHERE productivity_idx = 1;

Explanation

This query finds the most productive actor in each film category, measured by the number of films the actor has appeared in. It first defines a CTE (common table expression) called actor_movies that joins three tables: film_actor, film, and film_category. The film table is not described above; the query uses only its film_id, as a bridge between film_actor and film_category. Grouping by category_id and actor_id, the CTE counts the distinct film_id values for each actor and category combination, which gives num_movies.
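The aggregation step can be tried without the database. The sketch below is only an illustration: it shadows film_actor and film_category with a few hypothetical inline rows (not taken from the real data) and leaves out the film join, since the inline data has no film table.

```sql
-- Hypothetical sample rows; the CTE names shadow the real tables for this query only.
WITH film_actor (actor_id, film_id) AS (
  VALUES (1, 1), (1, 3), (2, 1), (2, 2), (2, 4)
),
film_category (film_id, category_id) AS (
  VALUES (1, 6), (2, 11), (3, 6), (4, 11)
)
-- One row per (category, actor) with the number of films in that category.
SELECT
  fc.category_id,
  fa.actor_id,
  COUNT(DISTINCT fa.film_id) AS num_movies
FROM film_actor fa
INNER JOIN film_category fc
  ON fc.film_id = fa.film_id
GROUP BY fc.category_id, fa.actor_id
ORDER BY fc.category_id, fa.actor_id;
-- Rows returned: (6, 1, 2), (6, 2, 1), (11, 2, 2)
```

Actor 1 appears in films 1 and 3, both in category 6, so that pair gets a count of 2; actor 2 has one film in category 6 and two in category 11.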

Then, an inner subquery selects category_id, actor_id, and num_movies from actor_movies and adds a calculated column called productivity_idx. This column comes from ROW_NUMBER(), which numbers the rows within each category sequentially in descending order of num_movies, so the most productive actor in each category gets a productivity_idx of 1. When several actors tie for the top count, ROW_NUMBER() still assigns 1 to only one of them, and the query does not specify which.
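To see how the ranking functions in this topic set treat ties, here is a small sketch over hypothetical per-category counts (the rows are made up for illustration). The actor_id tie-breaker in the ROW_NUMBER() ordering is an addition that is not in the solution above; it is there only to make the numbering deterministic.

```sql
-- Hypothetical counts: actors 50 and 51 tie at 6 films in category 1.
WITH actor_movies (category_id, actor_id, num_movies) AS (
  VALUES (1, 50, 6), (1, 51, 6), (1, 52, 4)
)
SELECT
  category_id,
  actor_id,
  num_movies,
  -- Sequential numbering; actor_id breaks the tie so the result is stable.
  ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY num_movies DESC, actor_id) AS row_num,
  -- Tied rows share a rank; the next rank skips ahead.
  RANK() OVER (PARTITION BY category_id ORDER BY num_movies DESC) AS rnk,
  -- Tied rows share a rank; the next rank does not skip.
  DENSE_RANK() OVER (PARTITION BY category_id ORDER BY num_movies DESC) AS dense_rnk
FROM actor_movies
ORDER BY category_id, row_num;
-- Rows returned:
--   (1, 50, 6, 1, 1, 1)
--   (1, 51, 6, 2, 1, 1)
--   (1, 52, 4, 3, 3, 2)
```

Filtering on rnk = 1 instead of row_num = 1 would return both tied actors, which may or may not be what the question expects.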

Finally, the outer query keeps only the rows where productivity_idx equals 1, i.e., one most productive actor per category.
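As an alternative to the subquery and filter, PostgreSQL's DISTINCT ON can keep the first row per category directly. This is a sketch of a different technique, not the site's reference solution. It assumes film_actor can be joined straight to film_category (every film_id in film_actor is a valid film), and it adds actor_id as a tie-breaker, so on ties it may pick a different actor than the solution above.

```sql
WITH actor_movies AS (
  SELECT
    fc.category_id,
    fa.actor_id,
    COUNT(DISTINCT fa.film_id) AS num_movies
  FROM film_actor fa
  INNER JOIN film_category fc
    ON fc.film_id = fa.film_id   -- assumption: skipping the film table is safe
  GROUP BY fc.category_id, fa.actor_id
)
-- DISTINCT ON keeps the first row of each category_id group,
-- where "first" is decided by the ORDER BY below.
SELECT DISTINCT ON (category_id)
  category_id,
  actor_id,
  num_movies
FROM actor_movies
ORDER BY category_id, num_movies DESC, actor_id;
```

DISTINCT ON is a PostgreSQL extension, so this form is not portable to other engines; the ROW_NUMBER() version is.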

More ROW_NUMBER, RANK, DENSE_RANK Questions

Question	Level	Company	FTPR
#212. Highest spender's issuers	Hard	visa	0%
#211. Top issuer by category	Medium	visa	30%
#208. Top 3 urls by testing groups	Hard	apple	10%
#203. Top product by country by month	Hard	apple	100%
#201. Top 3 students for each subject	Hard	snap	67%
#188. Top 1 popular question by department	Hard	google	14%
#166. Top 10 customers based on spend growth	Medium	walmart	11%
#165. Session stitching	Hard	walmart	17%
#158. First trip completion rate	Medium	lyft	22%
#157. Number of trips before a driver got banned	Medium	lyft	10%
#148. Most popular video category	Medium	google	13%
#145. Returning customers after first buy	Hard	afterpay	16%
#144. Third order	Medium	afterpay	9%
#142. First order date	Easy	afterpay	40%
#139. Poor first delivery experience	Medium	doordash	17%
#137. Extremely late first orders	Medium	doordash	9%
#135. Unlucky employees	Easy	robinhood	16%
#127. Average rating after 10th trip	Hard	uber	11%
#123. Top listing in the United States, United Kingdom and Canada	Medium	airbnb	10%
#122. Top country by wow growth	Hard	airbnb	13%
#120. First ever booking	Hard	airbnb	10%
#119. Top 2 countries by bookings	Hard	airbnb	10%
#116. Top answers day by device	Easy	amazon	14%
#110. Most popular product by category	Medium	amazon	12%