Intermediate

Mathematical Problems

Mathematics is the backbone of competitive programming and machine learning alike. This lesson presents five contest-style problems covering modular arithmetic, matrix exponentiation, number theory, combinatorics, and probability. Each problem includes a full problem statement, optimal approach, complete Python solution, and complexity analysis.

Problem 1: Modular Arithmetic — Training Loss Accumulator

📝
Problem Statement: You are training a neural network for N epochs. After each epoch i (1-indexed), the loss decreases by a factor of base^i, where base is given. Compute the sum of all loss factors across all N epochs, modulo 10^9 + 7.

Input: Two integers N, base (1 ≤ N ≤ 10^18, 2 ≤ base ≤ 10^9)
Output: One integer — the sum (base^1 + base^2 + ... + base^N) mod (10^9 + 7)

Approach: Geometric Series with Modular Inverse

The sum of a geometric series is S = base * (base^N - 1) / (base - 1). In modular arithmetic, division by (base - 1) is multiplication by its modular inverse. We use Fermat's little theorem: if p is prime and gcd(a, p) = 1, then a^(-1) mod p = a^(p-2) mod p.

MOD = 10**9 + 7

def mod_pow(base, exp, mod):
    """Fast modular exponentiation: O(log exp)"""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        exp >>= 1
        base = base * base % mod
    return result

def mod_inverse(a, mod):
    """Modular inverse using Fermat's little theorem (mod must be prime)"""
    return mod_pow(a, mod - 2, mod)

def solve(N, base):
    """Sum of base^1 + base^2 + ... + base^N mod MOD"""
    # Use geometric series formula: S = base * (base^N - 1) / (base - 1)
    base_mod = base % MOD

    if base_mod == 1:
        # Special case: every term is 1, so the sum is N mod MOD
        return N % MOD

    numerator = base_mod * (mod_pow(base_mod, N, MOD) - 1) % MOD
    denominator_inv = mod_inverse(base_mod - 1, MOD)
    return (numerator * denominator_inv) % MOD

# Example tests
print(solve(3, 2))   # 2^1 + 2^2 + 2^3 = 2 + 4 + 8 = 14
print(solve(10, 3))  # 3^1 + 3^2 + ... + 3^10 = 88572

Complexity: O(log N) time for modular exponentiation, O(1) space.
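As a quick sanity check (illustrative, not part of the contest solution), the closed form can be compared against a direct O(N) summation for small N. The helper `geometric_sum` below restates the solution compactly using Python's built-in three-argument `pow`:

```python
# Sanity check: closed-form geometric sum vs. direct summation (small N).
MOD = 10**9 + 7

def geometric_sum(N, base):
    """base^1 + ... + base^N mod MOD via the closed form."""
    b = base % MOD
    if b == 1:
        return N % MOD
    numerator = b * (pow(b, N, MOD) - 1) % MOD
    return numerator * pow(b - 1, MOD - 2, MOD) % MOD

for N, base in [(3, 2), (10, 3), (50, 7)]:
    direct = sum(pow(base, i, MOD) for i in range(1, N + 1)) % MOD
    assert geometric_sum(N, base) == direct
print("closed form matches direct summation")
```

The direct loop is O(N) and only viable for tiny N, which is exactly why the closed form matters at N up to 10^18.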

Problem 2: Matrix Exponentiation — Neural Network State Transitions

📝
Problem Statement: A recurrent neural network has K hidden states. At each timestep, state transitions follow a K×K transition matrix T. Given the initial state vector v (length K) and the number of timesteps N, compute the final state vector T^N * v, with all values modulo 10^9 + 7.

Input: K (1 ≤ K ≤ 100), N (1 ≤ N ≤ 10^18), followed by K×K matrix T and vector v
Output: The resulting vector of K integers

Approach: Matrix Fast Exponentiation

Just like scalar exponentiation uses repeated squaring to compute a^N in O(log N), matrix exponentiation uses repeated squaring of matrices to compute T^N in O(K^3 log N). This is one of the most powerful techniques in competitive programming, applicable to linear recurrences, graph path counting, and state machine problems.

MOD = 10**9 + 7

def mat_mul(A, B, mod):
    """Multiply two matrices modulo mod: O(K^3)"""
    K = len(A)
    C = [[0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            for k in range(K):
                C[i][j] = (C[i][j] + A[i][k] * B[k][j]) % mod
    return C

def mat_pow(M, n, mod):
    """Matrix exponentiation: O(K^3 * log n)"""
    K = len(M)
    # Identity matrix
    result = [[1 if i == j else 0 for j in range(K)] for i in range(K)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, M, mod)
        M = mat_mul(M, M, mod)
        n >>= 1
    return result

def mat_vec_mul(M, v, mod):
    """Multiply matrix by vector: O(K^2)"""
    K = len(M)
    result = [0] * K
    for i in range(K):
        for j in range(K):
            result[i] = (result[i] + M[i][j] * v[j]) % mod
    return result

def solve(K, N, T, v):
    """Compute T^N * v mod MOD"""
    T_N = mat_pow(T, N, MOD)
    return mat_vec_mul(T_N, v, MOD)

# Example: 2-state RNN, 10 timesteps
T = [[1, 1],
     [1, 0]]  # Fibonacci transition matrix
v = [1, 0]
print(solve(2, 10, T, v))  # [89, 55] = Fib(11), Fib(10)

Complexity: O(K^3 log N) time, O(K^2) space.

💡
ML connection: Matrix exponentiation is directly related to computing powers of transition matrices in Markov chains, which underpin many ML models including Hidden Markov Models and PageRank. The ability to compute M^N efficiently is essential when N is astronomically large (e.g., steady-state distributions).
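To make the connection concrete, here is a small illustrative sketch (the 2-state matrix P is an assumed example, not part of the problem): powering a row-stochastic transition matrix by repeated squaring drives every row toward the chain's stationary distribution, using only about log2(N) matrix multiplications even for huge N.

```python
# Illustrative sketch: repeated squaring of a Markov transition matrix
# (floating point, no modulus) converges to the stationary distribution.

def mat_mul_f(A, B):
    """Plain float matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow_f(M, e):
    """M^e by repeated squaring."""
    n = len(M)
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            R = mat_mul_f(R, M)
        M = mat_mul_f(M, M)
        e >>= 1
    return R

# Row-stochastic matrix: P[i][j] = P(next state = j | current state = i)
P = [[0.9, 0.1],
     [0.5, 0.5]]
P_big = mat_pow_f(P, 1 << 20)  # over a million steps in ~20 squarings
print([round(x, 4) for x in P_big[0]])  # both rows approach (5/6, 1/6)
```

Here the stationary distribution (5/6, 1/6) follows from solving pi * P = pi; every row of P^N converges to it because the second eigenvalue of P has magnitude below 1.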

Problem 3: Number Theory — Feature Hash Collision Analysis

📝
Problem Statement: You are designing a feature hashing scheme for a recommender system. Given N feature IDs (integers), you want to find the largest prime P ≤ M such that mapping each feature to (feature_id mod P) produces zero collisions. If no such prime exists, output -1.

Input: N (1 ≤ N ≤ 10^5), M (2 ≤ M ≤ 10^7), followed by N distinct integers
Output: The largest valid prime P, or -1

Approach: Sieve of Eratosthenes + Collision Check

Generate all primes up to M using the sieve. Then iterate from the largest prime downward, checking if all N features produce distinct values modulo that prime. Early termination makes this efficient in practice.

def sieve_of_eratosthenes(limit):
    """Generate all primes up to limit: O(N log log N)"""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit**0.5) + 1):
        if is_prime[i]:
            for j in range(i*i, limit + 1, i):
                is_prime[j] = False
    return [i for i in range(2, limit + 1) if is_prime[i]]

def check_collision_free(features, p):
    """Check if all features are distinct modulo p: O(N)"""
    seen = set()
    for f in features:
        r = f % p
        if r in seen:
            return False
        seen.add(r)
    return True

def solve(N, M, features):
    """Find largest prime P <= M with no hash collisions"""
    primes = sieve_of_eratosthenes(M)

    # Iterate from largest prime downward
    for p in reversed(primes):
        if p < N:
            # Pigeonhole: can't have N distinct values mod p if p < N
            return -1
        if check_collision_free(features, p):
            return p

    return -1

# Example
features = [10, 23, 45, 67, 89]
print(solve(5, 50, features))   # 47: largest prime <= 50 giving distinct residues
print(solve(5, 100, features))  # 97: all five features are distinct modulo 97

Complexity: O(M log log M) for the sieve, O(N * number_of_primes_checked) for collision checking. In practice, a valid prime is found quickly for sparse feature sets.
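As a quick correctness check of the sieve itself (illustrative, not part of the solution), its output for a small limit can be compared against the known list of primes:

```python
def sieve(limit):
    """Sieve of Eratosthenes, condensed from the solution above."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            # Mark multiples of i starting at i*i; smaller multiples
            # were already marked by smaller prime factors.
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, p in enumerate(is_prime) if p]

assert sieve(30) == [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print("sieve verified for limit 30")
```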

Problem 4: Combinatorics — Model Ensemble Selection

📝
Problem Statement: You have N trained models and want to create ensembles. An ensemble must contain exactly K models, and each ensemble must include at least one model from category A (the first A models) and at least one from category B (the remaining N-A models). How many valid ensembles can you form? Output the answer modulo 10^9 + 7.

Input: N, K, A (1 ≤ K ≤ N ≤ 10^6, 1 ≤ A < N)
Output: One integer — the count of valid ensembles mod 10^9 + 7

Approach: Inclusion-Exclusion with Precomputed Factorials

Total ensembles of size K = C(N, K). Invalid ensembles = those with no A models + those with no B models. By inclusion-exclusion: Valid = C(N, K) - C(N-A, K) - C(A, K) + C(0, K). The last term is 0 when K > 0.

MOD = 10**9 + 7

def precompute_factorials(n, mod):
    """Precompute factorial and inverse factorial arrays: O(N)"""
    fact = [1] * (n + 1)
    for i in range(1, n + 1):
        fact[i] = fact[i - 1] * i % mod

    inv_fact = [1] * (n + 1)
    inv_fact[n] = pow(fact[n], mod - 2, mod)
    for i in range(n - 1, -1, -1):
        inv_fact[i] = inv_fact[i + 1] * (i + 1) % mod

    return fact, inv_fact

def comb(n, r, fact, inv_fact, mod):
    """Compute C(n, r) mod p in O(1) with precomputed factorials"""
    if r < 0 or r > n:
        return 0
    return fact[n] * inv_fact[r] % mod * inv_fact[n - r] % mod

def solve(N, K, A):
    """Count valid ensembles using inclusion-exclusion"""
    B = N - A
    fact, inv_fact = precompute_factorials(N, MOD)

    total = comb(N, K, fact, inv_fact, MOD)
    no_a = comb(B, K, fact, inv_fact, MOD)   # All from B, none from A
    no_b = comb(A, K, fact, inv_fact, MOD)   # All from A, none from B

    result = (total - no_a - no_b) % MOD
    return result

# Example: 10 models, pick 3, first 4 are category A
print(solve(10, 3, 4))  # C(10,3) - C(6,3) - C(4,3) = 120 - 20 - 4 = 96
print(solve(6, 2, 3))   # C(6,2) - C(3,2) - C(3,2) = 15 - 3 - 3 = 9

Complexity: O(N) precomputation, O(1) per query.
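For small inputs, the inclusion-exclusion count can be cross-checked by brute-force enumeration (`brute` below is an illustrative helper, far too slow for contest limits):

```python
# Brute-force cross-check: enumerate all size-K subsets and count those
# containing at least one model from each category.
from itertools import combinations

def brute(N, K, A):
    labels = ['A'] * A + ['B'] * (N - A)
    return sum(1 for subset in combinations(labels, K)
               if 'A' in subset and 'B' in subset)

assert brute(10, 3, 4) == 96   # matches C(10,3) - C(6,3) - C(4,3)
assert brute(6, 2, 3) == 9
print("enumeration agrees with inclusion-exclusion")
```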

Problem 5: Probability — Random Model Convergence

📝
Problem Statement: You are training a model with stochastic gradient descent. At each step, the model improves (moves forward one position) with probability p = A/B, or stays in place with probability 1 - p. The model starts at position 0 and converges when it reaches position N. What is the expected number of steps to converge? Output the answer as a fraction modulo 10^9 + 7 (i.e., output numerator * modular_inverse(denominator) mod 10^9 + 7).

Input: N (1 ≤ N ≤ 10^18), A, B (1 ≤ A ≤ B ≤ 10^9)
Output: Expected steps mod 10^9 + 7

Approach: Linearity of Expectation

The number of steps needed to advance one position is geometrically distributed with success probability p, so its expectation is 1/p = B/A. Since the model must advance N times, linearity of expectation gives total expected steps = N * B/A. In modular arithmetic, this is N * B * inverse(A) mod MOD.

MOD = 10**9 + 7

def solve(N, A, B):
    """Expected steps = N * B / A mod MOD"""
    # E[steps per advance] = 1/p = B/A
    # E[total steps] = N * B/A
    # In modular arithmetic: N * B * A^(-1) mod MOD

    N_mod = N % MOD
    B_mod = B % MOD
    A_inv = pow(A, MOD - 2, MOD)

    return N_mod * B_mod % MOD * A_inv % MOD

# Example: reach position 5, probability 1/2 per step
# Expected = 5 * 2 / 1 = 10
print(solve(5, 1, 2))   # 10

# Reach position 10, probability 3/4 per step
# Expected = 10 * 4 / 3 = 40/3
print(solve(10, 3, 4))  # 40 * inverse(3) mod MOD

# Verify: reach position 1, probability 1/1
print(solve(1, 1, 1))   # 1

Complexity: O(log MOD) for modular inverse, O(1) otherwise.
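Since the answer encodes a fraction modulo a prime, a handy sanity check is to multiply back by the denominator: an output x representing 40/3 must satisfy 3 * x ≡ 40 (mod 10^9 + 7). A condensed restatement of the solution:

```python
# Verify the modular fraction: x encodes 40/3 iff 3 * x ≡ 40 (mod MOD).
MOD = 10**9 + 7

def expected_steps(N, A, B):
    """E[steps] = N * B / A, encoded as N * B * A^(-1) mod MOD."""
    return N % MOD * (B % MOD) % MOD * pow(A, MOD - 2, MOD) % MOD

x = expected_steps(10, 3, 4)      # encodes the fraction 40/3
assert 3 * x % MOD == 40
assert expected_steps(5, 1, 2) == 10
print("modular fraction checks pass")
```

The same multiply-back trick works for any rational answer output in this "numerator times modular inverse" form.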

💡
Contest tip: Modular arithmetic problems are among the most common in competitive programming. Master these three operations: (1) fast modular exponentiation via repeated squaring, (2) modular inverse via Fermat's little theorem, and (3) precomputed factorials for combinatorics. Together they cover the vast majority of the math problems you will encounter in contests.

Key Takeaways

  • Modular exponentiation (repeated squaring) computes a^n mod m in O(log n) — essential for any problem with large exponents.
  • Matrix exponentiation extends this to linear recurrences and state transitions in O(K^3 log N).
  • The Sieve of Eratosthenes generates all primes up to N in O(N log log N) — the standard approach for number theory problems.
  • Precompute factorials and inverse factorials for O(1) combination queries in combinatorics problems.
  • Modular inverse via Fermat's little theorem handles division in modular arithmetic when the modulus is prime.