Advanced

String Backtracking

String backtracking problems partition, transform, or generate strings by exploring all valid placements of characters or segments. These problems test your ability to define constraints precisely and prune invalid branches early. In ML, similar techniques appear in tokenization algorithms (BPE, WordPiece) and sequence generation in language models.

💡
String Pattern: Most string backtracking problems iterate over substring partitions: for each position, try all possible segment lengths, validate the segment, then recurse on the remainder.

Problem 1: Palindrome Partitioning (LC 131)

Problem: Given a string s, partition it such that every substring in the partition is a palindrome. Return all possible partitions.

Example: s = "aab"[["a","a","b"], ["aa","b"]]

Brute Force: Generate All Partitions, Filter

def partition_brute(s: str) -> list[list[str]]:
    """Generate all possible partitions, keep palindromic ones.

    Generates 2^(n-1) partitions (each gap can be a cut or not).
    Then checks each partition for palindrome property.

    Time: O(2^n * n^2), Space: O(2^n * n)
    """
    def is_palindrome(t):
        return t == t[::-1]

    def all_partitions(s):
        if not s:
            return [[]]
        result = []
        for i in range(1, len(s) + 1):
            for rest in all_partitions(s[i:]):
                result.append([s[:i]] + rest)
        return result

    return [p for p in all_partitions(s)
            if all(is_palindrome(seg) for seg in p)]

Optimal: Backtracking with Palindrome Check Before Recursion

def partition(s: str) -> list[list[str]]:
    """Palindrome partitioning with early palindrome pruning.

    At each position, try all possible prefix lengths.
    Only recurse if the prefix is a palindrome. This prunes
    many branches that would fail validation later.

    Optimization: precompute palindrome lookup table using DP.

    Time: O(n * 2^n) worst case
    Space: O(n) recursion depth + O(n^2) for palindrome table
    """
    n = len(s)
    # Precompute: is_pal[i][j] = True if s[i:j+1] is palindrome
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if s[i] == s[j] and (j - i <= 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True

    result = []

    def backtrack(start, path):
        if start == n:
            result.append(path[:])
            return

        for end in range(start, n):
            if is_pal[start][end]:
                path.append(s[start:end + 1])
                backtrack(end + 1, path)
                path.pop()

    backtrack(0, [])
    return result

# partition("aab") -> [["a","a","b"], ["aa","b"]]

Problem 2: Generate Parentheses (LC 22)

Problem: Given n pairs of parentheses, generate all combinations of well-formed parentheses.

Example: n = 3["((()))","(()())","(())()","()(())","()()()"]

Brute Force: Generate All 22n Strings, Validate

def generate_parenthesis_brute(n: int) -> list[str]:
    """Generate all strings of length 2n from '(' and ')', filter valid.

    Time: O(2^(2n) * n), Space: O(2^(2n))
    Most generated strings are invalid.
    """
    def is_valid(s):
        count = 0
        for c in s:
            count += 1 if c == '(' else -1
            if count < 0:
                return False
        return count == 0

    result = []

    def generate(pos, current):
        if pos == 2 * n:
            if is_valid(current):
                result.append(current)
            return
        generate(pos + 1, current + '(')
        generate(pos + 1, current + ')')

    generate(0, '')
    return result

Optimal: Backtracking with Open/Close Counts

def generate_parenthesis(n: int) -> list[str]:
    """Generate valid parentheses via constrained backtracking.

    Track open_count and close_count:
    - Can add '(' if open_count < n
    - Can add ')' if close_count < open_count

    This ensures every generated string is valid by construction.
    No post-validation needed.

    Time: O(4^n / sqrt(n)) - the nth Catalan number
    Space: O(n) recursion depth
    """
    result = []

    def backtrack(path, open_count, close_count):
        if len(path) == 2 * n:
            result.append("".join(path))
            return

        if open_count < n:
            path.append('(')
            backtrack(path, open_count + 1, close_count)
            path.pop()

        if close_count < open_count:
            path.append(')')
            backtrack(path, open_count, close_count + 1)
            path.pop()

    backtrack([], 0, 0)
    return result

# generate_parenthesis(3)
# -> ["((()))","(()())","(())()","()(())","()()()"]
ML Context: Generate parentheses is structurally identical to generating valid arithmetic expression trees. In symbolic regression (a form of AutoML), the algorithm generates candidate mathematical expressions by backtracking through operator/operand choices with syntactic constraints.

Problem 3: Letter Combinations of a Phone Number (LC 17)

Problem: Given a string containing digits 2-9, return all possible letter combinations the number could represent (like a phone keypad).

Example: digits = "23"["ad","ae","af","bd","be","bf","cd","ce","cf"]

Brute Force: Iterative Cartesian Product

def letter_combinations_brute(digits: str) -> list[str]:
    """Build combinations iteratively using nested expansion.

    Start with [''], then for each digit, combine every existing
    string with every letter the digit maps to.

    Time: O(4^n * n), Space: O(4^n * n)
    """
    if not digits:
        return []

    phone = {'2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
             '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}

    result = ['']
    for digit in digits:
        new_result = []
        for existing in result:
            for letter in phone[digit]:
                new_result.append(existing + letter)
        result = new_result

    return result

Optimal: Backtracking

def letter_combinations(digits: str) -> list[str]:
    """Phone number letter combinations via backtracking.

    At each digit position, try all letters that digit maps to.
    Recurse to the next digit position.

    Time: O(4^n * n) where n = number of digits
    Space: O(n) recursion depth
    """
    if not digits:
        return []

    phone = {'2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
             '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}

    result = []

    def backtrack(idx, path):
        if idx == len(digits):
            result.append("".join(path))
            return

        for letter in phone[digits[idx]]:
            path.append(letter)
            backtrack(idx + 1, path)
            path.pop()

    backtrack(0, [])
    return result

# letter_combinations("23")
# -> ["ad","ae","af","bd","be","bf","cd","ce","cf"]

Problem 4: Restore IP Addresses (LC 93)

Problem: Given a string containing only digits, return all possible valid IP addresses formed by inserting dots.

Example: s = "25525511135"["255.255.11.135", "255.255.111.35"]

Brute Force: Three Nested Loops for Dot Positions

def restore_ip_brute(s: str) -> list[str]:
    """Try all possible positions for 3 dots using nested loops.

    Time: O(n^3 * n), Space: O(1) extra
    Checks many invalid configurations.
    """
    result = []
    n = len(s)

    def is_valid_part(part):
        if not part or len(part) > 3:
            return False
        if len(part) > 1 and part[0] == '0':
            return False
        return 0 <= int(part) <= 255

    for i in range(1, min(4, n)):
        for j in range(i + 1, min(i + 4, n)):
            for k in range(j + 1, min(j + 4, n)):
                p1, p2, p3, p4 = s[:i], s[i:j], s[j:k], s[k:]
                if all(is_valid_part(p) for p in [p1, p2, p3, p4]):
                    result.append(f"{p1}.{p2}.{p3}.{p4}")

    return result

Optimal: Backtracking with Constraint Pruning

def restore_ip_addresses(s: str) -> list[str]:
    """Restore valid IPs via backtracking with pruning.

    At each step, try segment lengths 1, 2, or 3.
    Prune if:
    - Segment has leading zero (except "0" itself)
    - Segment value > 255
    - Remaining characters cannot fill remaining segments

    Time: O(3^4) = O(81) constant! (at most 4 segments of length 1-3)
    Space: O(1) extra (fixed depth of 4)
    """
    result = []

    def backtrack(start, parts):
        if len(parts) == 4:
            if start == len(s):
                result.append(".".join(parts))
            return

        remaining_digits = len(s) - start
        remaining_parts = 4 - len(parts)

        # Pruning: remaining digits must fit remaining parts
        if remaining_digits < remaining_parts or remaining_digits > remaining_parts * 3:
            return

        for length in range(1, 4):
            if start + length > len(s):
                break

            segment = s[start:start + length]

            # No leading zeros (except "0" itself)
            if len(segment) > 1 and segment[0] == '0':
                break

            if int(segment) > 255:
                break

            parts.append(segment)
            backtrack(start + length, parts)
            parts.pop()

    backtrack(0, [])
    return result

# restore_ip_addresses("25525511135")
# -> ["255.255.11.135", "255.255.111.35"]

Problem 5: Word Break II (LC 140)

Problem: Given a string s and a dictionary of words, add spaces to s to construct sentences where each word is in the dictionary. Return all such sentences.

Example: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]["cats and dog", "cat sand dog"]

Brute Force: Try All Partitions

def word_break_brute(s: str, wordDict: list[str]) -> list[str]:
    """Try all possible word breaks without memoization.

    At each position, try every possible prefix. If the prefix
    is in the dictionary, recurse on the suffix.

    Time: O(2^n * n), Space: O(n) recursion depth
    Very slow due to overlapping subproblems.
    """
    word_set = set(wordDict)
    result = []

    def backtrack(start, path):
        if start == len(s):
            result.append(" ".join(path))
            return

        for end in range(start + 1, len(s) + 1):
            word = s[start:end]
            if word in word_set:
                path.append(word)
                backtrack(end, path)
                path.pop()

    backtrack(0, [])
    return result

Optimal: Backtracking with Memoization

def word_break_ii(s: str, wordDict: list[str]) -> list[str]:
    """Word break with memoization to avoid recomputation.

    Cache the list of valid sentences for each starting index.
    This prevents exponential blowup when the same suffix
    appears in multiple decomposition paths.

    Time: O(n^2 * 2^n) worst case, typically much faster
    Space: O(n * number_of_sentences)
    """
    word_set = set(wordDict)
    memo = {}

    def backtrack(start):
        if start in memo:
            return memo[start]

        if start == len(s):
            return [""]

        sentences = []
        for end in range(start + 1, len(s) + 1):
            word = s[start:end]
            if word in word_set:
                rest = backtrack(end)
                for sentence in rest:
                    if sentence:
                        sentences.append(word + " " + sentence)
                    else:
                        sentences.append(word)

        memo[start] = sentences
        return sentences

    return backtrack(0)

# word_break_ii("catsanddog", ["cat","cats","and","sand","dog"])
# -> ["cat sand dog", "cats and dog"]
ML Context: Word Break is directly related to tokenization in NLP. BPE (Byte Pair Encoding) and WordPiece tokenizers segment text into subword units from a vocabulary. The Viterbi algorithm used in WordPiece is essentially memoized backtracking over all possible segmentations, choosing the one with the highest probability.

Complexity Summary

ProblemBrute ForceOptimalKey Optimization
Palindrome PartitionO(2n × n2)O(n × 2n)Precomputed palindrome table
Generate ParenthesesO(22n × n)O(Cn) CatalanOpen/close count constraints
Letter CombinationsO(4n × n)O(4n × n)Both approaches are equivalent
Restore IP AddressesO(n3)O(1) constantFixed depth + length/range pruning
Word Break IIO(2n × n)O(n2 × results)Memoize suffix decompositions