Basic Queries & Aggregation
Filtering, grouping, and aggregation are the foundation of every SQL interview. These 6 challenges cover the patterns you will encounter at every company, from startups to FAANG.
Shared Schema for This Lesson
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
product VARCHAR(50),
category VARCHAR(30),
amount DECIMAL(10,2),
quantity INT,
order_date DATE,
status VARCHAR(20)
);
INSERT INTO orders VALUES
(1, 101, 'Laptop', 'Electronics', 999.99, 1, '2024-01-15', 'completed'),
(2, 102, 'Phone', 'Electronics', 699.99, 2, '2024-01-16', 'completed'),
(3, 103, 'Desk Chair', 'Furniture', 349.99, 1, '2024-01-17', 'pending'),
(4, 101, 'Monitor', 'Electronics', 449.99, 1, '2024-01-18', 'completed'),
(5, 104, 'Keyboard', 'Electronics', 89.99, 3, '2024-01-19', 'shipped'),
(6, 102, 'Bookshelf', 'Furniture', 199.99, 1, '2024-01-20', 'completed'),
(7, 105, 'Laptop', 'Electronics', 1299.99, 1, '2024-01-21', 'cancelled'),
(8, 103, 'Mouse', 'Electronics', 29.99, 2, '2024-01-22', 'completed'),
(9, 101, 'Desk', 'Furniture', 599.99, 1, '2024-01-23', 'completed'),
(10, 106, 'Tablet', 'Electronics', 549.99, 1, '2024-01-24', 'pending'),
(11, 104, 'Headphones', 'Electronics', 199.99, 1, '2024-01-25', 'completed'),
(12, 102, 'Phone Case', 'Accessories', 19.99, 5, '2024-01-26', 'completed');
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
city VARCHAR(50),
signup_date DATE
);
INSERT INTO customers VALUES
(101, 'Alice Johnson', 'alice@email.com', 'New York', '2023-06-15'),
(102, 'Bob Smith', 'bob@email.com', 'San Francisco', '2023-08-20'),
(103, 'Charlie Brown', 'charlie@email.com', 'Chicago', '2023-11-01'),
(104, 'Diana Lee', 'diana@email.com', 'New York', '2024-01-05'),
(105, 'Eve Davis', 'eve@email.com', 'Austin', '2024-01-10'),
(106, 'Frank Wilson', 'frank@email.com', 'San Francisco', '2024-01-20');
Challenge 1: Multi-Condition Filtering
order_id, product, amount, and order_date, sorted by amount descending.Solution
SELECT order_id, product, amount, order_date
FROM orders
WHERE status = 'completed'
AND category = 'Electronics'
AND amount > 100
ORDER BY amount DESC;
-- Result:
-- order_id | product | amount | order_date
-- ---------+------------+---------+------------
-- 1 | Laptop | 999.99 | 2024-01-15
-- 2 | Phone | 699.99 | 2024-01-16
-- 4 | Monitor | 449.99 | 2024-01-18
-- 11 | Headphones | 199.99 | 2024-01-25
Key insight: The order of WHERE conditions does not affect performance — the query optimizer reorders them. But listing the most selective condition first improves readability.
Challenge 2: GROUP BY with Multiple Aggregations
Solution
SELECT
category,
SUM(amount * quantity) AS total_revenue,
ROUND(AVG(amount), 2) AS avg_order_value,
COUNT(*) AS num_orders,
COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE status = 'completed'
ORDER BY total_revenue DESC;
-- Wait - this is wrong! We need GROUP BY.
SELECT
category,
SUM(amount * quantity) AS total_revenue,
ROUND(AVG(amount), 2) AS avg_order_value,
COUNT(*) AS num_orders,
COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE status = 'completed'
GROUP BY category
ORDER BY total_revenue DESC;
-- Result:
-- category | total_revenue | avg_order_value | num_orders | unique_customers
-- ------------+---------------+-----------------+------------+-----------------
-- Electronics | 2679.93 | 475.99 | 5 | 4
-- Furniture | 799.98 | 399.99 | 2 | 2
-- Accessories | 99.95 | 19.99 | 1 | 1
Common mistake: Forgetting GROUP BY when using aggregation functions with non-aggregated columns. This is the single most common SQL error in interviews. Some databases (MySQL with ONLY_FULL_GROUP_BY off) silently return wrong results instead of an error.
Challenge 3: HAVING to Filter Aggregated Results
customer_id, order count, and total spending.Solution
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(amount * quantity) AS total_spending
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 2
AND SUM(amount * quantity) >= 500
ORDER BY total_spending DESC;
-- Result:
-- customer_id | order_count | total_spending
-- ------------+-------------+---------------
-- 101 | 3 | 2049.97
-- 102 | 3 | 1599.91
Key insight: WHERE filters individual rows before grouping. HAVING filters groups after aggregation. You cannot use column aliases in HAVING in standard SQL — repeat the aggregation expression.
Challenge 4: Subquery in WHERE
order_id, product, amount, and how much above the average each order is.Solution
-- Approach 1: Subquery
SELECT
order_id,
product,
amount,
ROUND(amount - (SELECT AVG(amount) FROM orders), 2) AS above_avg
FROM orders
WHERE amount > (SELECT AVG(amount) FROM orders)
ORDER BY amount DESC;
-- Approach 2: CTE (more readable)
WITH avg_calc AS (
SELECT AVG(amount) AS avg_amount FROM orders
)
SELECT
o.order_id,
o.product,
o.amount,
ROUND(o.amount - a.avg_amount, 2) AS above_avg
FROM orders o
CROSS JOIN avg_calc a
WHERE o.amount > a.avg_amount
ORDER BY o.amount DESC;
-- Result:
-- order_id | product | amount | above_avg
-- ---------+----------+---------+----------
-- 7 | Laptop | 1299.99 | 866.16
-- 1 | Laptop | 999.99 | 566.16
-- 2 | Phone | 699.99 | 266.16
-- 9 | Desk | 599.99 | 166.16
-- 10 | Tablet | 549.99 | 116.16
-- 4 | Monitor | 449.99 | 16.16
Performance note: The subquery (SELECT AVG(amount) FROM orders) is a scalar subquery — it runs once and returns a single value. This is efficient. Correlated subqueries (which reference the outer query) run once per row and can be slow on large tables.
Challenge 5: CASE WHEN for Conditional Logic
price_tier that labels orders as 'Budget' (under $100), 'Mid-Range' ($100-$499), 'Premium' ($500-$999), or 'Luxury' ($1000+). Also add a revenue_contribution column that calculates what percentage of total revenue each order represents.Solution
SELECT
order_id,
product,
amount,
CASE
WHEN amount < 100 THEN 'Budget'
WHEN amount < 500 THEN 'Mid-Range'
WHEN amount < 1000 THEN 'Premium'
ELSE 'Luxury'
END AS price_tier,
ROUND(100.0 * amount / SUM(amount) OVER (), 2) AS revenue_contribution_pct
FROM orders
ORDER BY amount DESC;
-- Result (first 5 rows):
-- order_id | product | amount | price_tier | revenue_contribution_pct
-- ---------+------------+---------+------------+-------------------------
-- 7 | Laptop | 1299.99 | Luxury | 24.74
-- 1 | Laptop | 999.99 | Premium | 19.03
-- 2 | Phone | 699.99 | Premium | 13.32
-- 9 | Desk | 599.99 | Premium | 11.42
-- 10 | Tablet | 549.99 | Premium | 10.46
Key insight: CASE WHEN evaluates conditions top-to-bottom and returns the first match. Order your conditions from most specific to least specific. The window function SUM(amount) OVER () computes the total across all rows without collapsing the result set — far cleaner than a subquery for this use case.
Challenge 6: COALESCE and NULL Handling
Solution
SELECT
c.name,
c.city,
COALESCE(SUM(o.amount * o.quantity), 0) AS total_spent,
COALESCE(CAST(MAX(o.order_date) AS VARCHAR), 'No orders') AS last_order_date
FROM customers c
LEFT JOIN orders o
ON c.customer_id = o.customer_id
AND o.status = 'completed'
GROUP BY c.customer_id, c.name, c.city
ORDER BY total_spent DESC;
-- Result:
-- name | city | total_spent | last_order_date
-- ---------------+---------------+-------------+----------------
-- Alice Johnson | New York | 2049.97 | 2024-01-23
-- Bob Smith | San Francisco | 1599.91 | 2024-01-26
-- Charlie Brown | Chicago | 409.97 | 2024-01-22
-- Diana Lee | New York | 469.96 | 2024-01-25
-- Eve Davis | Austin | 0 | No orders
-- Frank Wilson | San Francisco | 0 | No orders
Key insight: The filter o.status = 'completed' is in the JOIN condition, not in WHERE. If we put it in WHERE, the LEFT JOIN becomes an INNER JOIN because NULL status values (from unmatched rows) would be filtered out. This is the most commonly tested NULL trap in SQL interviews.
COALESCE(a, b) returns the first non-NULL argument. It is not the same as IFNULL (MySQL-specific) or NVL (Oracle-specific). Use COALESCE for portability. Remember that NULL = NULL is NULL (not TRUE), NULL <> 1 is NULL (not TRUE), and NULL IN (1, 2, NULL) is NULL.
Lilly Tech Systems