Advanced

Hard Problems (41-50)

Hard problems combine multiple patterns and require deep understanding. In interviews, you are not always expected to reach the optimal solution — demonstrating clear problem-solving process and arriving at a working solution matters more. These ten problems test binary search, heap, stack, DP, and sliding window at their most challenging.

Problem 41: Median of Two Sorted Arrays

LeetCode #4 — Given two sorted arrays nums1 and nums2, return the median of the two sorted arrays. The overall run time complexity should be O(log(m+n)).

Pattern: Binary Search on Partition — Binary search on the shorter array to find the correct partition point.

ML Context: Finding medians and percentiles of distributed datasets (data stored across multiple machines) requires this merge-and-partition approach. It is used in distributed quantile computation for feature normalization.

def findMedianSortedArrays(nums1, nums2):
    """
    Binary search on the shorter array.
    Time: O(log(min(m,n)))
    Space: O(1)
    """
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1  # ensure nums1 is shorter
    m, n = len(nums1), len(nums2)
    low, high = 0, m

    while low <= high:
        i = (low + high) // 2       # partition in nums1
        j = (m + n + 1) // 2 - i    # partition in nums2

        left1 = nums1[i - 1] if i > 0 else float('-inf')
        right1 = nums1[i] if i < m else float('inf')
        left2 = nums2[j - 1] if j > 0 else float('-inf')
        right2 = nums2[j] if j < n else float('inf')

        if left1 <= right2 and left2 <= right1:
            if (m + n) % 2 == 0:
                return (max(left1, left2) + min(right1, right2)) / 2
            return max(left1, left2)
        elif left1 > right2:
            high = i - 1
        else:
            low = i + 1

# findMedianSortedArrays([1,3], [2]) -> 2.0
# findMedianSortedArrays([1,2], [3,4]) -> 2.5

Problem 42: Merge K Sorted Lists

LeetCode #23 — Merge k sorted linked lists and return it as one sorted list.

Pattern: Min Heap — Push the head of each list into a heap, pop the smallest, push its next.

ML Context: Merging k sorted streams is fundamental to external merge sort for large-scale data preprocessing, multi-source data ingestion in feature pipelines, and merging results from distributed inference.

import heapq

def mergeKLists(lists):
    """
    Time: O(N log k) where N = total elements, k = number of lists
    Space: O(k) for the heap
    """
    dummy = ListNode(0)
    current = dummy
    heap = []
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst.val, i, lst))
    while heap:
        val, i, node = heapq.heappop(heap)
        current.next = node
        current = current.next
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next

Problem 43: Trapping Rain Water

LeetCode #42 — Given n non-negative integers representing an elevation map, compute how much water it can trap after raining.

Pattern: Two Pointers — Track left_max and right_max, fill from the shorter side.

ML Context: The "bounded by neighbors" concept appears in pooling operations in CNNs (max pooling, average pooling) and in computing local statistics for feature engineering over spatial data.

def trap(height):
    """
    Two pointer approach.
    Time: O(n), Space: O(1)
    """
    left, right = 0, len(height) - 1
    left_max = right_max = 0
    water = 0
    while left < right:
        if height[left] < height[right]:
            if height[left] >= left_max:
                left_max = height[left]
            else:
                water += left_max - height[left]
            left += 1
        else:
            if height[right] >= right_max:
                right_max = height[right]
            else:
                water += right_max - height[right]
            right -= 1
    return water

# trap([0,1,0,2,1,0,1,3,2,1,2,1]) -> 6
# trap([4,2,0,3,2,5]) -> 9

Problem 44: Longest Palindromic Substring

LeetCode #5 — Given a string s, return the longest palindromic substring.

Pattern: Expand Around Center — For each character (and each pair), expand outward while palindromic.

ML Context: Palindrome detection and string pattern analysis appear in DNA/RNA sequence analysis in bioinformatics, where palindromic sequences indicate restriction enzyme recognition sites.

def longestPalindrome(s):
    """
    Expand around center approach.
    Time: O(n^2), Space: O(1)
    """
    start, max_len = 0, 1

    def expand(left, right):
        nonlocal start, max_len
        while left >= 0 and right < len(s) and s[left] == s[right]:
            if right - left + 1 > max_len:
                start = left
                max_len = right - left + 1
            left -= 1
            right += 1

    for i in range(len(s)):
        expand(i, i)      # odd length palindromes
        expand(i, i + 1)  # even length palindromes
    return s[start:start + max_len]

# longestPalindrome("babad") -> "bab" (or "aba")
# longestPalindrome("cbbd") -> "bb"

Problem 45: Edit Distance

LeetCode #72 — Given two strings word1 and word2, return the minimum number of operations (insert, delete, replace) to convert word1 to word2.

Pattern: 2D DP — dp[i][j] = min operations to convert word1[0:i] to word2[0:j].

ML Context: Edit distance (Levenshtein distance) is directly used in spell correction, fuzzy string matching in NLP, evaluating machine translation quality (WER), and DNA sequence alignment in bioinformatics.

def minDistance(word1, word2):
    """
    Time: O(m * n), Space: O(n) with space optimization
    """
    m, n = len(word1), len(word2)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                curr[j] = prev[j-1]  # no operation needed
            else:
                curr[j] = 1 + min(
                    prev[j],      # delete
                    curr[j-1],    # insert
                    prev[j-1]     # replace
                )
        prev = curr
    return prev[n]

# minDistance("horse", "ros") -> 3
# minDistance("intention", "execution") -> 5

Problem 46: Word Ladder

LeetCode #127 — Given beginWord, endWord, and a dictionary, find the length of the shortest transformation sequence where each step changes exactly one letter and the intermediate word must exist in the dictionary.

Pattern: BFS on Implicit Graph — Each word is a node; edges connect words that differ by one character.

ML Context: BFS on implicit graphs is used in knowledge graph traversal for question answering, word embedding neighbor search, and finding shortest paths in entity relationship graphs.

from collections import deque

def ladderLength(beginWord, endWord, wordList):
    """
    Time: O(n * m * 26) where n = words, m = word length
    Space: O(n * m)
    """
    word_set = set(wordList)
    if endWord not in word_set:
        return 0
    queue = deque([(beginWord, 1)])
    visited = {beginWord}

    while queue:
        word, length = queue.popleft()
        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                next_word = word[:i] + c + word[i+1:]
                if next_word == endWord:
                    return length + 1
                if next_word in word_set and next_word not in visited:
                    visited.add(next_word)
                    queue.append((next_word, length + 1))
    return 0

# ladderLength("hit", "cog", ["hot","dot","dog","lot","log","cog"]) -> 5
# hit -> hot -> dot -> dog -> cog

Problem 47: Sliding Window Maximum

LeetCode #239 — Given an array nums and a sliding window of size k, return the max value in each window as the window slides from left to right.

Pattern: Monotonic Deque — Maintain a deque of indices in decreasing order of their values.

ML Context: Sliding window maximum is used in max pooling layers in CNNs, computing rolling statistics for time-series features, and peak detection in signal processing for audio ML models.

from collections import deque

def maxSlidingWindow(nums, k):
    """
    Monotonic decreasing deque.
    Time: O(n) - each element added/removed from deque at most once
    Space: O(k) for the deque
    """
    dq = deque()  # stores indices
    result = []
    for i, num in enumerate(nums):
        # Remove elements outside window
        while dq and dq[0] < i - k + 1:
            dq.popleft()
        # Remove smaller elements (they will never be max)
        while dq and nums[dq[-1]] < num:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(nums[dq[0]])
    return result

# maxSlidingWindow([1,3,-1,-3,5,3,6,7], 3) -> [3,3,5,5,6,7]

Problem 48: Largest Rectangle in Histogram

LeetCode #84 — Given an array of integers heights representing the histogram's bar heights, find the area of the largest rectangle in the histogram.

Pattern: Monotonic Stack — Maintain a stack of increasing heights; pop and compute area when a shorter bar is found.

ML Context: Histogram-based algorithms appear in gradient boosting (histogram-based splitting in LightGBM and XGBoost), where finding optimal split points efficiently in histograms is a core operation.

def largestRectangleArea(heights):
    """
    Monotonic stack approach.
    Time: O(n), Space: O(n)
    """
    stack = []  # stores indices
    max_area = 0
    heights.append(0)  # sentinel to flush remaining bars
    for i, h in enumerate(heights):
        while stack and heights[stack[-1]] > h:
            height = heights[stack.pop()]
            width = i if not stack else i - stack[-1] - 1
            max_area = max(max_area, height * width)
        stack.append(i)
    heights.pop()  # restore original array
    return max_area

# largestRectangleArea([2,1,5,6,2,3]) -> 10
# (rectangle of height 5, width 2 at indices 2-3)

Problem 49: Regular Expression Matching

LeetCode #10 — Given an input string s and a pattern p, implement regular expression matching with support for '.' (matches any single character) and '*' (matches zero or more of the preceding element).

Pattern: 2D DP — dp[i][j] = whether s[0:i] matches p[0:j]. Handle '*' by trying zero or more matches.

ML Context: Pattern matching and regular expressions are used in text preprocessing pipelines, log parsing for ML monitoring systems, and tokenization rules in NLP models.

def isMatch(s, p):
    """
    Time: O(m * n) where m = len(s), n = len(p)
    Space: O(m * n)
    """
    m, n = len(s), len(p)
    dp = [[False] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = True

    # Handle patterns like a*, a*b*, a*b*c* that can match empty string
    for j in range(2, n + 1):
        if p[j-1] == '*':
            dp[0][j] = dp[0][j-2]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[j-1] == '*':
                # Zero occurrences of preceding element
                dp[i][j] = dp[i][j-2]
                # One or more occurrences
                if p[j-2] == '.' or p[j-2] == s[i-1]:
                    dp[i][j] = dp[i][j] or dp[i-1][j]
            elif p[j-1] == '.' or p[j-1] == s[i-1]:
                dp[i][j] = dp[i-1][j-1]
    return dp[m][n]

# isMatch("aa", "a") -> False
# isMatch("aa", "a*") -> True
# isMatch("ab", ".*") -> True

Problem 50: Minimum Window Substring

LeetCode #76 — Given two strings s and t, return the minimum window substring of s such that every character in t (including duplicates) is included in the window.

Pattern: Sliding Window + Hash Map — Expand right to include all characters, shrink left to minimize window.

ML Context: The sliding window with constraint satisfaction pattern appears in attention mechanism implementations (finding relevant context windows) and in corpus search for relevant training examples.

from collections import Counter

def minWindow(s, t):
    """
    Time: O(|s| + |t|)
    Space: O(|s| + |t|)
    """
    if not t or not s:
        return ""
    t_count = Counter(t)
    required = len(t_count)
    formed = 0
    window_counts = {}
    left = 0
    min_len = float('inf')
    min_start = 0

    for right in range(len(s)):
        char = s[right]
        window_counts[char] = window_counts.get(char, 0) + 1
        if char in t_count and window_counts[char] == t_count[char]:
            formed += 1

        while formed == required:
            if right - left + 1 < min_len:
                min_len = right - left + 1
                min_start = left
            left_char = s[left]
            window_counts[left_char] -= 1
            if left_char in t_count and window_counts[left_char] < t_count[left_char]:
                formed -= 1
            left += 1

    return "" if min_len == float('inf') else s[min_start:min_start + min_len]

# minWindow("ADOBECODEBANC", "ABC") -> "BANC"
# minWindow("a", "a") -> "a"

Key Takeaways

  • Binary search on answer space (Median of Two Sorted Arrays) is the most advanced binary search pattern. When the search space is not an array but a range of possible answers, binary search still applies.
  • Monotonic stack/deque (Sliding Window Max, Largest Rectangle, Trapping Rain Water) maintains elements in sorted order for O(1) access to extremes in a sliding context.
  • 2D DP on strings (Edit Distance, Regular Expression) requires careful state definition and transition logic. Drawing the DP table on paper is essential for getting these right.
  • BFS on implicit graphs (Word Ladder) treats problem states as nodes and valid transitions as edges, without explicitly building the graph.
  • For hard problems, focus on your problem-solving process: clarify constraints, identify the pattern, start with brute force, then optimize. Interviewers value clear thinking over perfect code.