Quick summary
Summarize this blog with AI
SQL is the backbone of database operations. Prepping for an interview or just brushing up? This cheat sheet highlights essential SQL commands and concepts often tackled in interviews. Dive in for a quick SQL refresher.
1. Basic SQL Syntax
SELECT
SELECT first_name, last_name
FROM employees;
WHERE
SELECT first_name, last_name
FROM employees
WHERE department_id = 3;
DISTINCT
SELECT DISTINCT job_title
FROM jobs;
ORDER BY
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC;
2. Advanced SQL Syntax
GROUP BY and HAVING
SELECT department_id, COUNT(*)
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 5;
INNER JOIN
SELECT employees.first_name, departments.department_name
FROM employees
INNER JOIN departments
ON employees.department_id = departments.department_id;
LEFT JOIN
SELECT employees.first_name, departments.department_name
FROM employees
LEFT JOIN departments
ON employees.department_id = departments.department_id;
SUB-QUERY
SELECT first_name
FROM employees
WHERE department_id = (SELECT department_id FROM departments WHERE department_name = 'Sales');
3. Functions and Operations
Aggregate Functions
SELECT AVG(salary), MIN(salary), MAX(salary)
FROM employees;
String Functions
SELECT UPPER(first_name), LOWER(last_name)
FROM employees;
Date Functions
SELECT CURRENT_DATE, EXTRACT(YEAR FROM hire_date)
FROM employees;
4. SQL Constraints
PRIMARY KEY
ALTER TABLE employees
ADD PRIMARY KEY (employee_id);
FOREIGN KEY
ALTER TABLE employees
ADD FOREIGN KEY (department_id) REFERENCES departments(department_id);
NOT NULL
ALTER TABLE employees
MODIFY first_name VARCHAR(50) NOT NULL;
5. Subqueries
Using Subquery with IN
SELECT first_name, last_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE location = 'New York');
Using Subquery with EXISTS
SELECT department_name
FROM departments d
WHERE EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.department_id AND e.salary > 5000);
6. Database Design
Normalization
CREATE TABLE orders (
order_id INT PRIMARY KEY,
product_name VARCHAR(50),
customer_name VARCHAR(50),
address VARCHAR(250)
);
Denormalization
CREATE TABLE orders (
order_id INT PRIMARY KEY,
product_id INT,
product_name VARCHAR(50), -- denormalized
customer_id INT,
customer_name VARCHAR(50) -- denormalized
);
7. Best Practices & Optimization Techniques
Using indexes efficiently
SELECT * FROM employees WHERE department_id = 5;
Avoiding SELECT *
SELECT first_name, last_name, salary FROM employees;
Filtering data as early as possible
SELECT e.first_name, d.department_name
FROM (SELECT * FROM employees WHERE salary > 5000) e
JOIN departments d ON e.department_id = d.department_id;
8. Common Interview Scenarios & Questions
Difference between WHERE and HAVING
SELECT department_id
FROM employees
WHERE salary > 5000
GROUP BY department_id;
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 5000;
Difference between INNER JOIN and LEFT JOIN
SELECT e.first_name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;
SELECT e.first_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;
Explain normalization and denormalization
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them. The main aim is to isolate data so that additions, deletions, or modifications of data in one table don’t corrupt or wrongly impact the other tables.
Denormalization, on the other hand, is the process of intentionally introducing redundancy into a database for improving read performance. While normalization focuses on minimizing redundancy, denormalization does the opposite. It's used when the read operation's performance is more crucial than the write operation's efficiency.
When to use a sub-query vs. a JOIN?
Sub-queries and JOINs are powerful SQL tools, but they serve slightly different purposes:
- Sub-query: Useful when the results of one query depend on another separate query. It allows you to select rows that have data based on the results of another query. If you need to retrieve a single value or compute a summary value before getting to the main query, a sub-query might be a good fit.
- JOIN: Useful when you need to combine rows from two or more tables based on a related column. If you're retrieving data from multiple tables in a single command and there's a clear relationship between those tables, a JOIN is usually more efficient.
While both can achieve similar results, the choice often comes down to readability and performance. A rule of thumb is to start with JOINs, and only use sub-queries when necessary, as they can be less performant and harder to read, especially when nested multiple times.
9. Additional Resources
For those looking to delve deeper into SQL and database management, here are some recommended resources:
- W3Schools SQL Tutorial
- SQLZoo
- PostgreSQL Official Documentation
- More SQL Interview Questions on SQLPad
10. Conclusion
SQL remains a fundamental skill for data professionals, developers, and many other roles. This cheat sheet has covered the essential SQL topics that are often brought up in interviews. While it's essential to understand these concepts, real expertise comes from hands-on experience and continuous learning. Always ensure you practice and keep updating your knowledge.
Good luck with your interviews!