
Quick Reference & Tips

Your one-page cheat sheet for statistics and probability interviews. Print this, review it on the morning of your interview, and walk in confident.

Essential Probability Formulas

| Formula | Expression | When to Use |
|---|---|---|
| Addition Rule | P(A ∪ B) = P(A) + P(B) - P(A ∩ B) | Probability of A or B (or both) |
| Multiplication Rule | P(A ∩ B) = P(A\|B) · P(B) | Probability of A and B together |
| Conditional Probability | P(A\|B) = P(A ∩ B) / P(B) | Probability of A given B occurred |
| Bayes' Theorem | P(A\|B) = P(B\|A) · P(A) / P(B) | Updating beliefs with evidence |
| Law of Total Probability | P(A) = ∑ P(A\|Bᵢ) · P(Bᵢ) | Overall probability via conditioning |
| Independence Test | P(A ∩ B) = P(A) · P(B) | Checking if events are independent |
| Complement | P(A′) = 1 - P(A) | "At least one" problems (complement counting) |
| Combinations | C(n,k) = n! / (k!(n-k)!) | Counting unordered selections |
| Permutations | P(n,k) = n! / (n-k)! | Counting ordered arrangements |
| Expected Value | E[X] = ∑ xᵢ · P(xᵢ) | Average outcome over many trials |
| Variance | Var(X) = E[X²] - (E[X])² | Spread of the distribution |
| Linearity of Expectation | E[X+Y] = E[X] + E[Y] (always) | Even when X, Y are dependent |
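
To see how Bayes' theorem and the law of total probability fit together, here is a minimal Python sketch. The function name `bayes_posterior` and the numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are illustrative assumptions, not values from this course.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E) from a prior and the two conditional likelihoods."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|H') P(H')
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    # Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
    return p_evidence_given_h * prior / p_evidence


# Hypothetical screening test: 1% prevalence, 95% sensitivity, 5% false positives.
posterior = bayes_posterior(prior=0.01,
                            p_evidence_given_h=0.95,
                            p_evidence_given_not_h=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

With these assumed inputs the posterior is only about 16%, because the rare condition is swamped by false positives. That is the kind of sanity check worth saying out loud in an interview.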

Common Distributions Reference

| Distribution | Type | Mean | Variance | Use Case |
|---|---|---|---|---|
| Bernoulli(p) | Discrete | p | p(1-p) | Single yes/no trial (click, convert) |
| Binomial(n,p) | Discrete | np | np(1-p) | Number of successes in n trials |
| Geometric(p) | Discrete | 1/p | (1-p)/p² | Trials until first success |
| Poisson(λ) | Discrete | λ | λ | Events per time interval (errors, arrivals) |
| Uniform(a,b) | Continuous | (a+b)/2 | (b-a)²/12 | Equal probability over range |
| Normal(μ,σ²) | Continuous | μ | σ² | Sample means, errors (CLT) |
| Exponential(λ) | Continuous | 1/λ | 1/λ² | Time between events (memoryless); λ is the rate |
| Beta(α,β) | Continuous | α/(α+β) | αβ/((α+β)²(α+β+1)) | Prior for probabilities (CTR, conversion) |
| Gamma(α,β) | Continuous | α/β | α/β² | Prior for rates, waiting times; β is the rate |
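
If you forget a mean or variance from the table, you can rebuild it from the PMF using E[X] and Var(X) = E[X²] - (E[X])². The sketch below does this for a binomial; the helper `moments` and the parameters n = 10, p = 0.3 are chosen only for illustration.

```python
from math import comb


def moments(pmf):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * prob for x, prob in pmf.items())
    second_moment = sum(x * x * prob for x, prob in pmf.items())
    return mean, second_moment - mean ** 2  # Var(X) = E[X^2] - (E[X])^2


n, p = 10, 0.3  # hypothetical parameters
binomial_pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean, var = moments(binomial_pmf)
print(f"mean: computed {mean:.4f}, formula np = {n * p:.4f}")
print(f"variance: computed {var:.4f}, formula np(1-p) = {n * p * (1 - p):.4f}")
```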

Hypothesis Testing Quick Reference

| Concept | Key Points |
|---|---|
| p-value | P(data at least this extreme \| H₀ true). NOT P(H₀ true \| data). |
| Type I Error (α) | False positive. Rejecting H₀ when true. Typically 0.05. |
| Type II Error (β) | False negative. Failing to reject H₀ when false. Typically 0.20. |
| Power (1 - β) | P(detect real effect). Target: 0.80. Depends on n, α, effect size. |
| z-test | Known σ or large n. z = (X̄ - μ₀)/(σ/√n). |
| t-test | Unknown σ. t = (X̄ - μ₀)/(s/√n). Use t-distribution with n-1 df. |
| Chi-squared | Categorical data. χ² = ∑(O-E)²/E. Goodness-of-fit or independence. |
| ANOVA | 3+ group means. F = MS_between/MS_within. Follow up with post-hoc tests. |
| Bonferroni | Multiple testing: use α/m per test. Conservative but simple. |
| BH (FDR) | Multiple testing: controls false discovery rate. More powerful than Bonferroni. |
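
The difference between Bonferroni and Benjamini-Hochberg is easiest to see on a concrete list of p-values. The following is a minimal sketch of the standard BH step-up procedure; the function names and the ten p-values are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """BH step-up: find the largest rank k with p_(k) <= (k / m) * alpha,
    then reject the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical p-values from ten simultaneous tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejections:", sum(bonferroni(p_values)))
print("BH rejections:", sum(benjamini_hochberg(p_values)))
```

On this made-up input Bonferroni rejects one hypothesis and BH rejects two, which matches the "more powerful" note in the table.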

Critical Values to Memorize

Normal distribution (z-scores):
• 68% of data within ±1σ | z = 1.00 → p = 0.317 (two-tailed)
• 90% confidence → z = 1.645 | 95% confidence → z = 1.96 | 99% confidence → z = 2.576
• 95% of data within ±1.96σ | 99.7% within ±3σ
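
If you blank on a critical value, it helps to know where the numbers come from. This sketch recomputes them with Python's standard-library `statistics.NormalDist`; the helper names `two_sided_z` and `coverage` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def two_sided_z(confidence):
    """z such that P(-z < Z < z) equals the given confidence level."""
    return std_normal.inv_cdf(1 - (1 - confidence) / 2)


def coverage(z):
    """P(-z < Z < z) for a standard normal Z."""
    return std_normal.cdf(z) - std_normal.cdf(-z)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {two_sided_z(level):.3f}")
for z in (1, 1.96, 3):
    print(f"within ±{z}σ: {coverage(z):.4f}")
```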

Sample size rules of thumb:
• n ≥ 30 is usually enough for the CLT approximation (moderately skewed data; heavier skew needs more)
• n ≥ 5 expected per cell for chi-squared test
• Sample size ∝ 1/δ² (halving detectable effect quadruples required n)
• Quick sample size: n ≈ 16σ²/δ² per group for 80% power at α = 0.05 (two-sided, two-group comparison)
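
The factor of 16 is not magic. Assuming a two-sided comparison of two group means with equal variance σ² (an assumption made here, since the rule above does not spell out the setup), the usual formula is n = 2(z_{α/2} + z_β)²σ²/δ² per group, and 2 × (1.96 + 0.84)² ≈ 15.7. A minimal sketch, with the function name and inputs chosen for illustration:

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample z-test on means (equal variances)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z(power)           # about 0.84 for 80% power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


sigma, delta = 1.0, 0.1  # hypothetical standard deviation and detectable effect
exact = sample_size_per_group(sigma, delta)
rule_of_thumb = 16 * sigma ** 2 / delta ** 2
halved_effect = sample_size_per_group(sigma, delta / 2)

print(f"formula: {exact} per group, rule of thumb: {rule_of_thumb:.0f}")
print(f"halving delta multiplies n by about {halved_effect / exact:.2f}")
```

The rule of thumb slightly overshoots the formula, which is the safe direction when planning an experiment.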

Interview Communication Tips for Math Questions

  1. Always Start with Intuition

    Before writing any formula, explain the concept in plain English. "Bayes' theorem lets us update our beliefs when we get new evidence" is a better opening than jumping to P(A|B) = P(B|A)P(A)/P(B). The formula should support your explanation, not replace it.

  2. State Your Assumptions

    Before solving, say "I will assume independence" or "I will assume a normal distribution because n is large enough for the CLT." This shows mathematical maturity. If the interviewer wants different assumptions, they will tell you.

  3. Narrate Your Work

    Do not solve silently. Say "First I will find P(F) using the law of total probability, then apply Bayes' theorem." This lets the interviewer follow your reasoning and give hints if you go off track.

  4. Sanity Check Your Answer

    After computing a probability, check: Is it between 0 and 1? Do boundary cases make sense? Does the answer change in the expected direction when you vary inputs? Say these checks out loud.

  5. Connect to Practice

    After solving a theoretical problem, add a sentence about real-world relevance. "This is essentially the same calculation we do when evaluating a spam filter on imbalanced data." This shows you can bridge theory and application.

  6. Know When to Approximate

    Interviewers do not expect you to compute C(100,47) by hand. Say "For large n, the binomial is well-approximated by a normal distribution" and use the approximation. Knowing when and how to approximate is a strength, not a weakness.
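
As a rough illustration of how good the approximation is, the sketch below compares the exact binomial probability of 47 successes in 100 trials with the normal approximation using a continuity correction. The success probability p = 0.5 is an assumption picked for the example.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 100, 0.5, 47  # p = 0.5 is a hypothetical choice

# Exact binomial probability P(X = 47).
exact = comb(n, k) * p**k * (1 - p)**(n - k)

# Normal approximation: X ~ Normal(np, np(1-p)), with continuity correction
# P(X = k) ≈ P(k - 0.5 < Y < k + 0.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(f"exact: {exact:.4f}, normal approximation: {approx:.4f}")
```

In an interview you would do the approximate version by hand: mean 50, standard deviation 5, so 47 sits 0.6 standard deviations below the mean.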

Pro Tip: Practice 5 probability problems per day for two weeks before your interview. Do them on paper (not a computer), narrating your reasoning out loud. Time yourself: most interview problems should be solvable in 5-10 minutes. If you cannot solve a problem, study the solution, then redo it from memory the next day.

Frequently Asked Questions

How much math do I need to memorize for a statistics interview?

You should have the key formulas internalized (Bayes' theorem, binomial PMF, normal 68-95-99.7 rule, sample size formula) but you do not need to memorize proofs. Interviewers care more about when to apply each formula and why it works than about perfect recall. If you forget a specific formula, say "I know this involves the ratio of between-group and within-group variance" and the interviewer will usually help with the exact notation. Understanding beats memorization every time.

What is the difference between a data scientist and an ML engineer statistics interview?

Data Scientist: Heavy emphasis on hypothesis testing, A/B testing, experimental design, causal inference, and communicating statistical results to stakeholders. You will likely be asked to design an experiment, choose metrics, calculate sample sizes, and interpret results. Companies like Meta and Airbnb weight this heavily.

ML Engineer: More focus on probability (distributions, Bayes, information theory) and how it connects to ML algorithms. Less experimental design, more mathematical foundations. Questions might involve deriving loss functions, understanding probabilistic models, or explaining why certain algorithms work. Google and DeepMind lean this direction.

Should I learn Bayesian or frequentist statistics for interviews?

Both. Most companies test frequentist methods (hypothesis testing, p-values, confidence intervals) because that is what their A/B testing platforms use. However, Bayesian concepts (priors, posteriors, Bayes' theorem) come up in algorithm questions and are increasingly used in industry (Thompson sampling, Bayesian optimization, Bayesian A/B testing). A strong candidate can explain the same concept from both perspectives and articulate when each is more appropriate. Start with frequentist if you are short on time, but do not skip Bayes' theorem.

I am weak at probability puzzles. How should I practice?

Start with the 15 puzzles in Lesson 6 of this course. For each puzzle: (1) try to solve it yourself for 10 minutes, (2) read the solution carefully, (3) identify the technique used (complement counting, recursion, symmetry, conditional probability), (4) redo it from memory the next day. After mastering these, work through "Fifty Challenging Problems in Probability" by Mosteller. The key is recognizing patterns: most puzzles use one of about 8 standard techniques. Once you see the pattern, the solution becomes straightforward.

How important is A/B testing knowledge for tech interviews?

Extremely important for data science roles. At Meta, Google, Amazon, Netflix, and Uber, nearly every data scientist is involved in A/B testing. You should be able to: (1) design an experiment from scratch (hypothesis, metrics, sample size, duration), (2) identify common pitfalls (peeking, multiple testing, network effects, novelty effects), (3) interpret results correctly (effect size vs statistical significance), and (4) explain advanced techniques (CUPED, sequential testing, multi-armed bandits). For ML engineering roles, basic A/B testing knowledge is still expected, though in less depth.

What if the interviewer asks me to derive something I do not remember?

Stay calm and work from first principles. For example, if asked to derive the variance of the sample mean: "I know variance measures spread. The sample mean averages n observations. If observations are independent with variance σ², then Var(sum) = nσ², and Var(mean) = Var(sum/n) = nσ²/n² = σ²/n." Walking through the logic step by step is more impressive than reciting a memorized derivation. If you are truly stuck, say "I know the result is σ²/n but let me think about the derivation..." Most interviewers will give you a hint to move forward.
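
If you want to convince yourself of the σ²/n result, you can check it exactly on a small case. The sketch below enumerates every outcome of three rolls of a fair die (an example chosen here for illustration) and uses exact fractions, so there is no simulation noise.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # a fair six-sided die


def variance(values):
    """Exact population variance of equally likely values."""
    values = list(values)
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** 2 for v in values) / len(values)


sigma_sq = variance(faces)  # variance of a single roll

n = 3  # number of independent rolls averaged together
sample_means = [Fraction(sum(rolls), n) for rolls in product(faces, repeat=n)]
var_of_mean = variance(sample_means)

print(f"Var(single roll) = {sigma_sq}")
print(f"Var(mean of {n} rolls) = {var_of_mean}, sigma^2 / n = {sigma_sq / n}")
```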

Are there statistics questions specific to certain industries?

Ad Tech (Google, Meta): CTR modeling, auction theory, bid optimization, attribution modeling.
E-commerce (Amazon, Shopify): Demand forecasting, price elasticity, conversion funnel analysis, inventory optimization.
Ride-sharing (Uber, Lyft): Surge pricing, marketplace experiments (two-sided), spatial statistics, ETA estimation.
Healthcare: Survival analysis, clinical trial design, multiple endpoint testing, regulatory requirements for statistical evidence.
Finance: Value at Risk (VaR), heavy-tailed distributions, time series analysis, Monte Carlo simulation.

For all industries, the core probability and hypothesis testing skills are the same; only the applications differ.

Final Checklist Before Your Interview

Probability: Can you solve Bayes' theorem problems step by step? Explain independence vs mutual exclusivity? Apply the law of total probability?

Distributions: Can you name 5 distributions, state when to use each, and give the mean and variance? Can you explain the Central Limit Theorem?

Hypothesis Testing: Can you correctly define a p-value? Explain Type I vs Type II errors? Calculate sample sizes? Choose between t-test, chi-squared, and ANOVA?

Bayesian: Can you explain prior, likelihood, posterior, and evidence? Describe conjugate priors? Compare Bayesian vs frequentist approaches?

A/B Testing: Can you design an experiment end-to-end? Handle multiple testing? Explain CUPED? Discuss novelty and network effects?

Puzzles: Can you solve the Monty Hall problem, Birthday Problem, and Coupon Collector from scratch? Do you know the key techniques (complement counting, recursion, symmetry)?
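
To check your puzzle answers, here is a compact sketch of the three classics, one technique each. It assumes the standard textbook setups: 365 equally likely birthdays, equally likely coupons, and a Monty Hall host who always opens a goat door. The function names are illustrative.

```python
from fractions import Fraction


def birthday_collision(people, days=365):
    """Complement counting: P(at least one shared birthday) = 1 - P(all distinct)."""
    p_distinct = 1.0
    for i in range(people):
        p_distinct *= (days - i) / days
    return 1 - p_distinct


def coupon_collector_expected(n):
    """Linearity of expectation: E[draws] = n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


def monty_hall_switch_win():
    """Symmetry: fix the first pick at door 0 and enumerate the car's position.
    Switching wins exactly when the first pick was not the car."""
    pick = 0
    wins = sum(1 for car in range(3) if car != pick)
    return Fraction(wins, 3)


print(f"Birthday, 23 people: {birthday_collision(23):.4f}")
print(f"Coupon collector, 6 coupons: {float(coupon_collector_expected(6)):.1f} draws")
print(f"Monty Hall, switch: {monty_hall_switch_win()}")
```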

Remember: The goal is not to memorize 69 answers. It is to build the statistical intuition that lets you tackle any question, including ones you have never seen before. If you understand the why behind each formula, you can reconstruct any how on the spot. That is what separates good candidates from great ones.