Beginner

NumPy for ML Interviews

NumPy is the single most important library for ML coding interviews. Google, Meta, and Amazon expect ML engineer candidates to think in arrays, not loops. This lesson explains why, what interviewers look for, and the patterns you must know.

Why NumPy Dominates ML Interviews

Every ML framework — PyTorch, TensorFlow, JAX, scikit-learn — is built on top of NumPy's array semantics. When an interviewer asks you to implement softmax, batch normalization, or a distance matrix, they are testing whether you can express mathematical operations as vectorized array computations instead of nested Python loops.

Performance Signal

Writing np.dot(X, W) instead of a triple-nested loop shows you understand that NumPy delegates to optimized C/BLAS routines. A vectorized solution can be 100x faster — and interviewers know this.
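The speed claim is easy to verify yourself. Here is a rough timing sketch using `timeit` (absolute numbers vary by hardware and array size; the gap between the two is what matters):

```python
import timeit
import numpy as np

def dot_product_loop(a, b):
    # Pure-Python element-by-element accumulation
    result = 0.0
    for i in range(len(a)):
        result += a[i] * b[i]
    return result

a = np.random.randn(10_000)
b = np.random.randn(10_000)

loop_time = timeit.timeit(lambda: dot_product_loop(a, b), number=10)
vec_time = timeit.timeit(lambda: np.dot(a, b), number=10)
print(f"loop: {loop_time:.5f}s  vectorized: {vec_time:.5f}s  "
      f"speedup: {loop_time / vec_time:.0f}x")
```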

Mathematical Fluency

Translating math notation directly to NumPy code (e.g., softmax(x) = exp(x) / sum(exp(x)) becomes np.exp(x) / np.exp(x).sum()) shows you can bridge theory and implementation.
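As a worked example of that translation, here is a sketch of softmax. Note the shift by max(x): it cancels in the ratio, so the result is unchanged, but it prevents overflow for large inputs (a point interviewers probe, as discussed below):

```python
import numpy as np

def softmax(x):
    # softmax(x) = exp(x) / sum(exp(x)), shifted by max(x)
    # for numerical stability; the shift cancels in the ratio
    shifted = np.exp(x - np.max(x))
    return shifted / shifted.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))
```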

Production Readiness

Production ML code is vectorized. If you write loops in an interview, the interviewer questions whether you can write code that scales to millions of samples in production.

Framework Transfer

NumPy's API is the template for PyTorch tensors, TensorFlow tensors, and JAX arrays. Master NumPy and you can immediately work in any framework.

The Vectorization Philosophy

The core principle: express computation as operations on entire arrays, not element-by-element loops. This is not just about speed — it is about thinking at the right level of abstraction for ML.

import numpy as np

# ---- THE WRONG WAY: Python loops ----
# Computing dot product with loops
def dot_product_loop(a, b):
    result = 0
    for i in range(len(a)):
        result += a[i] * b[i]
    return result

# Computing matrix multiply with loops
def matmul_loop(A, B):
    n, m = A.shape[0], B.shape[1]
    k = A.shape[1]
    result = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                result[i, j] += A[i, p] * B[p, j]
    return result

# ---- THE RIGHT WAY: NumPy vectorized ----
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
A = np.random.randn(3, 4)
B = np.random.randn(4, 2)

dot = np.dot(a, b)            # 32 - one call, runs in C
matmul = A @ B                # matrix multiply operator, shape (3, 2)
norm = np.linalg.norm(a)      # L2 norm
dist = np.linalg.norm(a - b)  # Euclidean distance
💡
Interview rule of thumb: If you write a for loop over array elements in a NumPy problem, you are almost certainly doing it wrong. The interviewer expects you to find the vectorized equivalent. The only exception is when the problem has inherent sequential dependencies (like RNN forward pass).

Common Interview Patterns

These are the NumPy patterns that appear most frequently in ML coding interviews at top companies.

Pattern               Example                              NumPy Approach
Element-wise ops      Apply activation function            np.maximum(0, x) for ReLU
Reduction             Compute loss over batch              np.mean(losses, axis=0)
Broadcasting          Subtract mean from features          X - X.mean(axis=0)
Matrix multiply       Linear layer forward pass            X @ W + b
Boolean masking       Filter predictions above threshold   preds[preds > 0.5]
Fancy indexing        Select specific classes              probs[np.arange(n), labels]
Pairwise computation  Distance matrix                      Broadcasting with [:, None]
Einsum                Batch matrix multiply                np.einsum('bij,bjk->bik', A, B)
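The fancy-indexing and pairwise-computation patterns trip candidates up most often. A short sketch of both (the array values here are made up for illustration):

```python
import numpy as np

# Fancy indexing: pick each sample's probability for its true class
probs = np.array([[0.1, 0.7, 0.2],
                  [0.3, 0.3, 0.4]])
labels = np.array([1, 2])
picked = probs[np.arange(len(labels)), labels]   # [0.7, 0.4]

# Pairwise Euclidean distance matrix via broadcasting:
# X[:, None, :] has shape (n, 1, d), X[None, :, :] has shape (1, n, d);
# they broadcast to (n, n, d) before the norm reduces over the last axis
X = np.random.randn(5, 3)
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # (5, 5)
```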

What Interviewers Evaluate

Based on publicly shared interview experiences from Google, Meta, and Amazon ML roles, here is what interviewers look for in NumPy coding challenges:

# Interviewer evaluation rubric (what they look for):

evaluation_criteria = {
    "vectorization": {
        "weight": "HIGH",
        "description": "Can the candidate avoid Python loops?",
        "red_flag": "Nested for-loops over array elements",
        "green_flag": "Broadcasting, einsum, matrix ops"
    },
    "numerical_stability": {
        "weight": "HIGH",
        "description": "Does the candidate handle edge cases?",
        "red_flag": "np.exp(x) / np.sum(np.exp(x))  # overflow!",
        "green_flag": "np.exp(x - np.max(x)) / np.sum(np.exp(x - np.max(x)))"
    },
    "axis_awareness": {
        "weight": "MEDIUM",
        "description": "Does the candidate understand axis parameter?",
        "red_flag": "Confusion about which axis to reduce along",
        "green_flag": "Clear reasoning: axis=0 is across samples, axis=1 is across features"
    },
    "broadcasting_rules": {
        "weight": "MEDIUM",
        "description": "Can the candidate use broadcasting correctly?",
        "red_flag": "Manual reshaping when broadcasting would work",
        "green_flag": "Knows that (n,1) and (1,m) broadcast to (n,m)"
    },
    "memory_efficiency": {
        "weight": "LOW-MEDIUM",
        "description": "Does the candidate consider memory?",
        "red_flag": "Creating unnecessary intermediate arrays",
        "green_flag": "Using in-place ops, views, and out= parameter"
    }
}
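The broadcasting and memory-efficiency rows of this rubric can be made concrete with a short sketch (the shapes chosen here are arbitrary):

```python
import numpy as np

# Broadcasting: a (n, 1) column against a (1, m) row yields an (n, m) grid
col = np.arange(3).reshape(3, 1)      # shape (3, 1)
row = np.arange(4).reshape(1, 4)      # shape (1, 4)
grid = col * row                      # shape (3, 4): a multiplication table

# Memory efficiency: reuse a preallocated buffer instead of allocating
# a fresh result array on every call
x = np.random.randn(1000)
out = np.empty_like(x)
np.exp(x, out=out)                    # writes results into `out` in place
x *= 2                                # in-place scaling, no new allocation
```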

NumPy Fundamentals Refresher

Before diving into the challenges, make sure you have these fundamentals solid.

import numpy as np

# ---- Array Creation ----
a = np.array([1, 2, 3])              # from list
b = np.zeros((3, 4))                  # 3x4 zeros
c = np.ones((2, 3))                   # 2x3 ones
d = np.eye(3)                         # 3x3 identity
e = np.arange(0, 10, 2)              # [0, 2, 4, 6, 8]
f = np.linspace(0, 1, 5)             # [0, 0.25, 0.5, 0.75, 1.0]
g = np.random.randn(3, 4)            # 3x4 standard normal

# ---- Shape Manipulation ----
x = np.arange(12)
x_2d = x.reshape(3, 4)               # 3 rows, 4 cols
x_auto = x.reshape(3, -1)            # -1 infers the dimension
x_T = x_2d.T                         # transpose: (4, 3)
x_flat = x_2d.ravel()                # flatten to 1D (view when possible)
x_flat2 = x_2d.flatten()             # flatten to 1D (always a copy)

# ---- Axis Convention ----
# For a 2D array of shape (samples, features):
#   axis=0 -> reduce down the rows (collapses samples -> one value per feature)
#   axis=1 -> reduce across the columns (collapses features -> one value per sample)
data = np.array([[1, 2, 3],
                 [4, 5, 6]])
data.mean(axis=0)   # [2.5, 3.5, 4.5]  mean per feature
data.mean(axis=1)   # [2.0, 5.0]        mean per sample
data.sum()          # 21                 sum of everything

# ---- Key Properties ----
print(x_2d.shape)     # (3, 4)
print(x_2d.dtype)     # int64 (platform dependent)
print(x_2d.ndim)      # 2
print(x_2d.size)      # 12
print(x_2d.strides)   # memory layout info

Course Roadmap

This course is structured from foundational array operations to advanced ML implementations. Each lesson contains real interview challenges with full solutions.

Course Structure:

Lesson 2: Array Operations (6 challenges)
  - Reshape, broadcasting, fancy indexing
  - Boolean masking, stacking, splitting
  - Foundation for all subsequent lessons

Lesson 3: Matrix Mathematics (6 challenges)
  - Dot product, matrix multiply, inverse
  - Determinant, eigenvalues, SVD
  - Core linear algebra for ML

Lesson 4: Statistical Operations (6 challenges)
  - Mean/std/var, percentiles, correlation
  - Normalization, z-scores, covariance
  - Feature engineering fundamentals

Lesson 5: ML Implementations (5 challenges)
  - Softmax, cross-entropy, gradient descent
  - Batch normalization, cosine similarity matrix
  - The challenges Google/Meta actually ask

Lesson 6: Distance & Similarity (5 challenges)
  - Euclidean, Manhattan, cosine distances
  - Pairwise distance matrix, KNN prediction
  - Core to retrieval and recommendation

Lesson 7: Performance & Vectorization (5 challenges)
  - Replace loops with vectorized ops
  - Einsum, memory efficiency, broadcasting tricks
  - Optimization patterns for production

Lesson 8: Quick Reference & Tips
  - Cheat sheet, common patterns, FAQ

Key Takeaways

💡
  • NumPy fluency is the most tested technical skill in ML coding interviews
  • Vectorization is not optional — loops over array elements are a red flag
  • Numerical stability (e.g., log-sum-exp trick) separates strong from weak candidates
  • Understanding axis semantics and broadcasting rules is essential
  • The patterns you learn in NumPy transfer directly to PyTorch, TensorFlow, and JAX