Data Analyst Portfolio Projects: SQL, Power BI, and Python Workflows That Get Interviews

CAREER Updated May 30, 2026 9 mins read Leon Leon
Data Analyst Portfolio Projects: SQL, Power BI, and Python Workflows That Get Interviews cover image

Quick summary

Summarize this blog with AI

A data analyst portfolio is not stronger because it has more dashboards, more certificates, or more tools listed in the README. It is stronger when a hiring manager can open one project and quickly see that you can turn an unclear business problem into trustworthy analysis.

That expectation matters more in 2026 because many entry-level candidates now have the same visible checklist: SQL course, Python course, Power BI certificate, a few public datasets, and a dashboard screenshot. The candidates who stand out show the workflow behind the final chart. They define the business question, clean imperfect data, choose the right metric, write defensible SQL, explain tradeoffs, and make a recommendation that a stakeholder could act on.

This guide gives you a practical portfolio structure for data analyst roles. Use it to build fewer projects with more evidence.

What a strong analyst portfolio has to prove

A good portfolio project should prove five things:

  • You can frame a business question. The project should not be "I analyzed a dataset." It should answer a question like "Which customer segment is driving margin decline?" or "Where is checkout abandonment increasing?"
  • You can work with imperfect data. Real analyst work includes missing values, duplicate keys, inconsistent categories, timezone issues, and metric definitions that need clarification.
  • You can use SQL as the analytical backbone. Even if the final output is in Power BI, Tableau, Excel, or Python, SQL should create clean, auditable tables or views.
  • You can communicate the answer. A dashboard without a short written decision memo is usually just decoration. Include what changed, why it matters, and what you would do next.
  • You can explain your judgment. Hiring teams want to know why you used a join, why you counted users instead of events, and what assumptions could change the result.

The portfolio workflow to use

For each project, build the same repeatable workflow. This makes your portfolio easier to review and gives you a clean interview story.

1. Start with a decision, not a dataset

Pick a scenario where someone needs to decide something. For example:

  • A retail manager wants to know which categories deserve more inventory.
  • A subscription company wants to know why trial conversion dropped.
  • A support leader wants to know which ticket types are creating avoidable backlog.
  • A finance team wants to know whether discounting is helping revenue or hurting margin.

Write the decision in one sentence at the top of the project:

Goal: identify which product categories are growing revenue but losing margin so the merchandising team can adjust pricing or promotion strategy.

That sentence gives every query a purpose.

2. Choose data with enough mess to be credible

Clean sample datasets are useful for learning syntax, but a portfolio should show that you can handle data that does not arrive interview-ready. Public government data, marketplace data, company filings, local agency data, public APIs, and multi-file CSV exports are often better than a single polished tutorial dataset.

You do not need huge data. A few thousand rows across three or four related tables can be enough if the data requires real choices: duplicated IDs, inconsistent date formats, missing customer attributes, products that changed category, or records that need to be excluded from a metric.

3. Model the data in SQL before dashboarding

A common beginner mistake is loading raw CSV files directly into a BI tool and building charts immediately. That can work for exploration, but it hides the analytical work. Instead, use SQL to create the cleaned layer and the metric layer.

A simple structure is enough:

-- 1. Clean raw order records.
create view cleaned_orders as
select
    order_id,
    customer_id,
    cast(order_date as date) as order_date,
    lower(trim(status)) as status
from raw_orders
where order_id is not null;

-- 2. Define one row per order item.
create view fact_order_items as
select
    oi.order_id,
    o.customer_id,
    o.order_date,
    p.category,
    p.product_name,
    oi.quantity,
    oi.unit_price,
    oi.quantity * oi.unit_price as gross_sales
from raw_order_items oi
join cleaned_orders o
    on oi.order_id = o.order_id
join raw_products p
    on oi.product_id = p.product_id
where o.status not in ('cancelled', 'refunded');

-- 3. Build a metric table for reporting.
create view category_weekly_metrics as
select
    date_trunc('week', order_date)::date as week_start,
    category,
    count(distinct order_id) as orders,
    count(distinct customer_id) as customers,
    sum(gross_sales) as revenue,
    sum(gross_sales) / nullif(count(distinct order_id), 0) as average_order_value
from fact_order_items
group by 1, 2;

This approach shows that you understand grain. It also makes the dashboard easier to trust because every chart can trace back to a defined metric table.

4. Add validation checks

Validation queries are one of the easiest ways to look more job-ready than other candidates. They show that you do not blindly trust a chart just because it rendered.

-- Check for order IDs that appear more than once in the order header table.
select order_id, count(*) as rows_per_order
from cleaned_orders
group by order_id
having count(*) > 1;

-- Check whether revenue changed unexpectedly after joining products.
select
    (select sum(quantity * unit_price) from raw_order_items) as raw_item_revenue,
    (select sum(gross_sales) from fact_order_items) as modeled_revenue;

-- Check missing product mappings.
select oi.product_id, count(*) as order_items
from raw_order_items oi
left join raw_products p
    on oi.product_id = p.product_id
where p.product_id is null
group by oi.product_id
order by order_items desc;

Include a short "data quality notes" section in the README. Explain what you found, what you fixed, and what you would ask the business owner before using the result in production.

Three portfolio projects worth building

Project 1: Retail category performance

This is one of the best portfolio projects for analyst roles because it naturally combines SQL, business metrics, and dashboarding. Build a dataset with orders, order items, products, customers, and optional inventory.

Answer questions like:

  • Which categories grew revenue month over month?
  • Which categories have high revenue but declining margin?
  • Which products are top sellers within each category?
  • Are discounts increasing units sold enough to justify lower margin?
  • Which categories should the team investigate for stockouts or overstock?

The final deliverables should include SQL views, a Power BI or Tableau dashboard, and a one-page recommendation memo. The memo matters because the goal is not to prove that you can make charts. The goal is to prove that you can guide a business decision.

Project 2: Subscription funnel and retention

This project is useful for product analyst, growth analyst, and business analyst roles. Use an events table with signup, activation, trial start, payment, cancellation, and product usage events.

Good SQL patterns to demonstrate include cohort definition, conversion windows, deduplication, and retention logic. A strong project might show that users who activate within 24 hours are much more likely to convert, but only for a specific acquisition channel.

Include a warning about incomplete periods. For example, do not compare a fully matured January cohort to a May cohort that has only had seven days to return.

Project 3: Operations backlog and service quality

This project works well for career switchers because many people have domain experience in operations, support, healthcare, logistics, finance, education, or retail. Use tickets, cases, claims, shipments, appointments, or requests.

Questions to answer:

  • Which request types create the longest resolution times?
  • Where is backlog growing fastest?
  • Which queues have the highest reopen rate?
  • Which customer segment is most affected?
  • What operational change would reduce the problem?

This project lets you show SQL, time-based metrics, business interpretation, and stakeholder empathy.

How to present the project

For each project, use this structure in your README or project page:

  • Business question: one sentence.
  • Data source: where the data came from and what tables are included.
  • Metric definitions: revenue, margin, conversion, retention, backlog, or any KPI used.
  • SQL workflow: raw tables, cleaned views, metric tables, and validation checks.
  • Dashboard: link or screenshots with the key views.
  • Recommendation: what you would do based on the result.
  • Limitations: what could make the analysis wrong or incomplete.

The limitation section is not a weakness. It is a signal that you understand real analysis. A project that says "this analysis assumes refunds are complete and product categories were historically stable" is more credible than one that pretends the data is perfect.

A 30-day plan

If you are starting from zero, do not try to build six projects at once. Build one strong end-to-end project first.

  • Days 1-3: choose one business question, collect data, and define the tables.
  • Days 4-8: load the data into a database and create cleaned SQL views.
  • Days 9-13: write metric queries and validation checks.
  • Days 14-18: build the dashboard or notebook.
  • Days 19-22: write the decision memo and limitations.
  • Days 23-26: clean the README, organize files, and make the project easy to review.
  • Days 27-30: practice explaining the project in a mock interview.

One project built this way is more useful than five disconnected dashboards.

How to talk about the project in interviews

Use a simple story:

I started with a business question, defined the metric grain, cleaned the raw data in SQL, validated the joins and row counts, built a dashboard from the metric table, and summarized the recommendation. The biggest data issue was duplicate order rows, so I handled that before calculating category revenue.

That explanation tells the interviewer you can do the job. It also gives them places to ask follow-up questions, which is good. Strong candidates can explain their work under pressure.

FAQ

Do I need Power BI and Python in every project?

No. Use the tools that fit the question. For many analyst roles, one SQL-plus-dashboard project and one SQL-plus-Python project is enough. The important part is showing a complete workflow, not forcing every tool into every project.

Is a CSV-based project acceptable?

Yes, if you turn the CSV into a realistic analytical workflow. Load it into tables, clean it, document assumptions, define metrics, validate outputs, and explain the business recommendation. A CSV screenshot alone is weak; a CSV transformed into a credible analysis is useful.

Should I use popular datasets like Superstore?

You can, but you need to add original thinking. If you use a common dataset, change the question, add data quality checks, define new metrics, or combine it with another source. A common dataset with a generic dashboard will not stand out.

What matters most to recruiters?

Recruiters scan for relevance quickly. Make the project title, business question, tools, and outcome obvious. Hiring managers will look deeper for SQL quality, metric judgment, and whether your recommendation follows from the analysis.

Interview Prep

Begin Your SQL, Python, and R Journey

Master 230 interview-style coding questions and build the data skills needed for analyst, scientist, and engineering roles.

Related Articles

All Articles