Intermediate

Mathematical Problems

Mathematics is the backbone of competitive programming and machine learning alike. This lesson presents five contest-style problems covering modular arithmetic, matrix exponentiation, number theory, combinatorics, and probability. Each problem includes a full problem statement, optimal approach, complete Python solution, and complexity analysis.

Problem 1: Modular Arithmetic — Training Loss Accumulator

📝
Problem Statement: You are training a neural network for N epochs. After each epoch i (1-indexed), the loss decreases by a factor of base^i, where base is given. Compute the sum of all loss factors across all N epochs, modulo 10^9 + 7.

Input: Two integers N, base (1 ≤ N ≤ 10^18, 2 ≤ base ≤ 10^9)
Output: One integer — the sum (base^1 + base^2 + ... + base^N) mod (10^9 + 7)

Approach: Geometric Series with Modular Inverse

The sum of a geometric series is S = base * (base^N - 1) / (base - 1). In modular arithmetic, division by (base - 1) is multiplication by its modular inverse. We use Fermat's little theorem: if p is prime and gcd(a, p) = 1, then a^(-1) mod p = a^(p-2) mod p.

MOD = 10**9 + 7

def mod_pow(base, exp, mod):
    """Fast modular exponentiation: O(log exp)"""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        exp >>= 1
        base = base * base % mod
    return result

def mod_inverse(a, mod):
    """Modular inverse using Fermat's little theorem (mod must be prime)"""
    return mod_pow(a, mod - 2, mod)

def solve(N, base):
    """Sum of base^1 + base^2 + ... + base^N mod MOD"""
    # Use geometric series formula: S = base * (base^N - 1) / (base - 1)
    base_mod = base % MOD

    if base_mod == 1:
        # Special case: every term is 1, so the sum is N mod MOD
        return N % MOD

    numerator = base_mod * (mod_pow(base_mod, N, MOD) - 1) % MOD
    denominator_inv = mod_inverse(base_mod - 1, MOD)
    return (numerator * denominator_inv) % MOD

# Example tests
print(solve(3, 2))   # 2^1 + 2^2 + 2^3 = 2 + 4 + 8 = 14
print(solve(10, 3))  # 3^1 + 3^2 + ... + 3^10 = 88572

Complexity: O(log N) time for modular exponentiation, O(1) space.
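As a quick sanity check (illustrative, not part of the contest solution), the closed form can be compared against a direct O(N) summation for small N. The helper `geometric_sum` below restates the solution compactly using Python's built-in three-argument `pow`:

```python
# Sanity check: closed-form geometric sum vs. direct summation (small N).
MOD = 10**9 + 7

def geometric_sum(N, base):
    """base^1 + ... + base^N mod MOD via the closed form."""
    b = base % MOD
    if b == 1:
        return N % MOD
    numerator = b * (pow(b, N, MOD) - 1) % MOD
    return numerator * pow(b - 1, MOD - 2, MOD) % MOD

for N, base in [(3, 2), (10, 3), (50, 7)]:
    direct = sum(pow(base, i, MOD) for i in range(1, N + 1)) % MOD
    assert geometric_sum(N, base) == direct
print("closed form matches direct summation")
```

The direct loop is O(N) and only viable for tiny N, which is exactly why the closed form matters at N up to 10^18.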

Problem 2: Matrix Exponentiation — Neural Network State Transitions

📝
Problem Statement: A recurrent neural network has K hidden states. At each timestep, state transitions follow a K×K transition matrix T. Given the initial state vector v (length K) and the number of timesteps N, compute the final state vector T^N * v, with all values modulo 10^9 + 7.

Input: K (1 ≤ K ≤ 100), N (1 ≤ N ≤ 10^18), followed by K×K matrix T and vector v
Output: The resulting vector of K integers

Approach: Matrix Fast Exponentiation

Just like scalar exponentiation uses repeated squaring to compute a^N in O(log N), matrix exponentiation uses repeated squaring of matrices to compute T^N in O(K^3 log N). This is one of the most powerful techniques in competitive programming, applicable to linear recurrences, graph path counting, and state machine problems.

MOD = 10**9 + 7

def mat_mul(A, B, mod):
    """Multiply two matrices modulo mod: O(K^3)"""
    K = len(A)
    C = [[0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            for k in range(K):
                C[i][j] = (C[i][j] + A[i][k] * B[k][j]) % mod
    return C

def mat_pow(M, n, mod):
    """Matrix exponentiation: O(K^3 * log n)"""
    K = len(M)
    # Identity matrix
    result = [[1 if i == j else 0 for j in range(K)] for i in range(K)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, M, mod)
        M = mat_mul(M, M, mod)
        n >>= 1
    return result

def mat_vec_mul(M, v, mod):
    """Multiply matrix by vector: O(K^2)"""
    K = len(M)
    result = [0] * K
    for i in range(K):
        for j in range(K):
            result[i] = (result[i] + M[i][j] * v[j]) % mod
    return result

def solve(K, N, T, v):
    """Compute T^N * v mod MOD"""
    T_N = mat_pow(T, N, MOD)
    return mat_vec_mul(T_N, v, MOD)

# Example: 2-state RNN, 10 timesteps
T = [[1, 1],
     [1, 0]]  # Fibonacci transition matrix
v = [1, 0]
print(solve(2, 10, T, v))  # [89, 55] = Fib(11), Fib(10)

Complexity: O(K^3 log N) time, O(K^2) space.

💡
ML connection: Matrix exponentiation is directly related to computing powers of transition matrices in Markov chains, which underpin many ML models including Hidden Markov Models and PageRank. The ability to compute M^N efficiently is essential when N is astronomically large (e.g., steady-state distributions).
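To make the connection concrete, here is a small illustrative sketch (the 2-state matrix P is an assumed example, not part of the problem): powering a row-stochastic transition matrix by repeated squaring drives every row toward the chain's stationary distribution, using only about log2(N) matrix multiplications even for huge N.

```python
# Illustrative sketch: repeated squaring of a Markov transition matrix
# (floating point, no modulus) converges to the stationary distribution.

def mat_mul_f(A, B):
    """Plain float matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow_f(M, e):
    """M^e by repeated squaring."""
    n = len(M)
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            R = mat_mul_f(R, M)
        M = mat_mul_f(M, M)
        e >>= 1
    return R

# Row-stochastic matrix: P[i][j] = P(next state = j | current state = i)
P = [[0.9, 0.1],
     [0.5, 0.5]]
P_big = mat_pow_f(P, 1 << 20)  # over a million steps in ~20 squarings
print([round(x, 4) for x in P_big[0]])  # both rows approach (5/6, 1/6)
```

Here the stationary distribution (5/6, 1/6) follows from solving pi * P = pi; every row of P^N converges to it because the second eigenvalue of P has magnitude below 1.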

Problem 3: Number Theory — Feature Hash Collision Analysis

📝
Problem Statement: You are designing a feature hashing scheme for a recommender system. Given N feature IDs (integers), you want to find the largest prime P ≤ M such that mapping each feature to (feature_id mod P) produces zero collisions. If no such prime exists, output -1.

Input: N (1 ≤ N ≤ 10^5), M (2 ≤ M ≤ 10^7), followed by N distinct integers
Output: The largest valid prime P, or -1

Approach: Sieve of Eratosthenes + Collision Check

Generate all primes up to M using the sieve. Then iterate from the largest prime downward, checking if all N features produce distinct values modulo that prime. Early termination makes this efficient in practice.

def sieve_of_eratosthenes(limit):
    """Generate all primes up to limit: O(N log log N)"""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit**0.5) + 1):
        if is_prime[i]:
            for j in range(i*i, limit + 1, i):
                is_prime[j] = False
    return [i for i in range(2, limit + 1) if is_prime[i]]

def check_collision_free(features, p):
    """Check if all features are distinct modulo p: O(N)"""
    seen = set()
    for f in features:
        r = f % p
        if r in seen:
            return False
        seen.add(r)
    return True

def solve(N, M, features):
    """Find largest prime P <= M with no hash collisions"""
    primes = sieve_of_eratosthenes(M)

    # Iterate from largest prime downward
    for p in reversed(primes):
        if p < N:
            # Pigeonhole: can't have N distinct values mod p if p < N
            return -1
        if check_collision_free(features, p):
            return p

    return -1

# Example
features = [10, 23, 45, 67, 89]
print(solve(5, 50, features))   # 47: largest prime <= 50 giving distinct residues
print(solve(5, 100, features))  # 97: all five features are distinct modulo 97

Complexity: O(M log log M) for the sieve, O(N * number_of_primes_checked) for collision checking. In practice, a valid prime is found quickly for sparse feature sets.
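As a quick correctness check of the sieve itself (illustrative, not part of the solution), its output for a small limit can be compared against the known list of primes:

```python
def sieve(limit):
    """Sieve of Eratosthenes, condensed from the solution above."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            # Mark multiples of i starting at i*i; smaller multiples
            # were already marked by smaller prime factors.
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, p in enumerate(is_prime) if p]

assert sieve(30) == [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print("sieve verified for limit 30")
```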

Problem 4: Combinatorics — Model Ensemble Selection

📝
Problem Statement: You have N trained models and want to create ensembles. An ensemble must contain exactly K models, and each ensemble must include at least one model from category A (the first A models) and at least one from category B (the remaining N-A models). How many valid ensembles can you form? Output the answer modulo 10^9 + 7.

Input: N, K, A (1 ≤ K ≤ N ≤ 10^6, 1 ≤ A < N)
Output: One integer — the count of valid ensembles mod 10^9 + 7

Approach: Inclusion-Exclusion with Precomputed Factorials

Total ensembles of size K = C(N, K). Invalid ensembles = those with no A models + those with no B models. By inclusion-exclusion: Valid = C(N, K) - C(N-A, K) - C(A, K) + C(0, K). The last term is 0 when K > 0.

MOD = 10**9 + 7

def precompute_factorials(n, mod):
    """Precompute factorial and inverse factorial arrays: O(N)"""
    fact = [1] * (n + 1)
    for i in range(1, n + 1):
        fact[i] = fact[i - 1] * i % mod

    inv_fact = [1] * (n + 1)
    inv_fact[n] = pow(fact[n], mod - 2, mod)
    for i in range(n - 1, -1, -1):
        inv_fact[i] = inv_fact[i + 1] * (i + 1) % mod

    return fact, inv_fact

def comb(n, r, fact, inv_fact, mod):
    """Compute C(n, r) mod p in O(1) with precomputed factorials"""
    if r < 0 or r > n:
        return 0
    return fact[n] * inv_fact[r] % mod * inv_fact[n - r] % mod

def solve(N, K, A):
    """Count valid ensembles using inclusion-exclusion"""
    B = N - A
    fact, inv_fact = precompute_factorials(N, MOD)

    total = comb(N, K, fact, inv_fact, MOD)
    no_a = comb(B, K, fact, inv_fact, MOD)   # All from B, none from A
    no_b = comb(A, K, fact, inv_fact, MOD)   # All from A, none from B

    result = (total - no_a - no_b) % MOD
    return result

# Example: 10 models, pick 3, first 4 are category A
print(solve(10, 3, 4))  # C(10,3) - C(6,3) - C(4,3) = 120 - 20 - 4 = 96
print(solve(6, 2, 3))   # C(6,2) - C(3,2) - C(3,2) = 15 - 3 - 3 = 9

Complexity: O(N) precomputation, O(1) per query.
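For small inputs, the inclusion-exclusion count can be cross-checked by brute-force enumeration (`brute` below is an illustrative helper, far too slow for contest limits):

```python
# Brute-force cross-check: enumerate all size-K subsets and count those
# containing at least one model from each category.
from itertools import combinations

def brute(N, K, A):
    labels = ['A'] * A + ['B'] * (N - A)
    return sum(1 for subset in combinations(labels, K)
               if 'A' in subset and 'B' in subset)

assert brute(10, 3, 4) == 96   # matches C(10,3) - C(6,3) - C(4,3)
assert brute(6, 2, 3) == 9
print("enumeration agrees with inclusion-exclusion")
```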

Problem 5: Probability — Random Model Convergence

📝
Problem Statement: You are training a model with stochastic gradient descent. At each step, the model improves (moves forward one position) with probability p = A/B, or stays in place with probability 1 - p. The model starts at position 0 and converges when it reaches position N. What is the expected number of steps to converge? Output the answer as a fraction modulo 10^9 + 7 (i.e., output numerator * modular_inverse(denominator) mod 10^9 + 7).

Input: N (1 ≤ N ≤ 10^18), A, B (1 ≤ A ≤ B ≤ 10^9)
Output: Expected steps mod 10^9 + 7

Approach: Linearity of Expectation

The number of steps needed to advance one position is geometrically distributed with success probability p, so its expectation is 1/p = B/A. Since the model must advance N times, linearity of expectation gives total expected steps = N * B/A. In modular arithmetic, this is N * B * inverse(A) mod MOD.

MOD = 10**9 + 7

def solve(N, A, B):
    """Expected steps = N * B / A mod MOD"""
    # E[steps per advance] = 1/p = B/A
    # E[total steps] = N * B/A
    # In modular arithmetic: N * B * A^(-1) mod MOD

    N_mod = N % MOD
    B_mod = B % MOD
    A_inv = pow(A, MOD - 2, MOD)

    return N_mod * B_mod % MOD * A_inv % MOD

# Example: reach position 5, probability 1/2 per step
# Expected = 5 * 2 / 1 = 10
print(solve(5, 1, 2))   # 10

# Reach position 10, probability 3/4 per step
# Expected = 10 * 4 / 3 = 40/3
print(solve(10, 3, 4))  # 40 * inverse(3) mod MOD

# Verify: reach position 1, probability 1/1
print(solve(1, 1, 1))   # 1

Complexity: O(log MOD) for modular inverse, O(1) otherwise.
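Since the answer encodes a fraction modulo a prime, a handy sanity check is to multiply back by the denominator: an output x representing 40/3 must satisfy 3 * x ≡ 40 (mod 10^9 + 7). A condensed restatement of the solution:

```python
# Verify the modular fraction: x encodes 40/3 iff 3 * x ≡ 40 (mod MOD).
MOD = 10**9 + 7

def expected_steps(N, A, B):
    """E[steps] = N * B / A, encoded as N * B * A^(-1) mod MOD."""
    return N % MOD * (B % MOD) % MOD * pow(A, MOD - 2, MOD) % MOD

x = expected_steps(10, 3, 4)      # encodes the fraction 40/3
assert 3 * x % MOD == 40
assert expected_steps(5, 1, 2) == 10
print("modular fraction checks pass")
```

The same multiply-back trick works for any rational answer output in this "numerator times modular inverse" form.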

💡
Contest tip: Modular arithmetic problems are among the most common in competitive programming. Master these three operations: (1) fast modular exponentiation via repeated squaring, (2) modular inverse via Fermat's little theorem, and (3) precomputed factorials for combinatorics. Together they cover the vast majority of the math problems you will encounter in contests.

Key Takeaways

  • Modular exponentiation (repeated squaring) computes a^n mod m in O(log n) — essential for any problem with large exponents.
  • Matrix exponentiation extends this to linear recurrences and state transitions in O(K^3 log N).
  • The Sieve of Eratosthenes generates all primes up to N in O(N log log N) — the standard approach for number theory problems.
  • Precompute factorials and inverse factorials for O(1) combination queries in combinatorics problems.
  • Modular inverse via Fermat's little theorem handles division in modular arithmetic when the modulus is prime.