Intermediate

Arrays & Strings

Arrays and strings are the most frequently tested data structures in coding interviews. This lesson covers 6 essential problems with brute force and optimal solutions, complete with time/space complexity analysis.

💡

Why this matters for AI/ML: NumPy arrays are the foundation of all ML computation. Feature vectors, embedding matrices, batch tensors — they are all arrays. Understanding array manipulation patterns helps you write efficient data preprocessing pipelines and custom training loops.

Problem 1: Two Sum

📝

Problem: Given an array of integers nums and an integer target, return the indices of the two numbers that add up to target. Assume exactly one solution exists and you cannot use the same element twice.

Brute Force: O(n²) Time, O(1) Space

def two_sum_brute(nums, target):
    """Check every pair of numbers."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

# Test
print(two_sum_brute([2, 7, 11, 15], 9))  # [0, 1]
print(two_sum_brute([3, 2, 4], 6))        # [1, 2]

Optimal: O(n) Time, O(n) Space

def two_sum(nums, target):
    """
    Use a hash map to store complement values.
    For each number, check if (target - number) was seen before.
    """
    seen = {}  # value -> index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

# Test
print(two_sum([2, 7, 11, 15], 9))   # [0, 1]
print(two_sum([3, 2, 4], 6))         # [1, 2]
print(two_sum([3, 3], 6))            # [0, 1]

AI/ML context: This hash map pattern is used in feature crossing for recommendation systems — finding pairs of features whose combined value meets a threshold.

Problem 2: Maximum Subarray (Kadane's Algorithm)

📝

Problem: Given an integer array nums, find the contiguous subarray with the largest sum and return its sum.

Brute Force: O(n²) Time, O(1) Space

def max_subarray_brute(nums):
    """Try every possible subarray."""
    max_sum = float('-inf')
    for i in range(len(nums)):
        current_sum = 0
        for j in range(i, len(nums)):
            current_sum += nums[j]
            max_sum = max(max_sum, current_sum)
    return max_sum

# Test
print(max_subarray_brute([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6

Optimal (Kadane's): O(n) Time, O(1) Space

def max_subarray(nums):
    """
    Kadane's Algorithm: At each position, decide whether to
    extend the current subarray or start a new one.

    Key insight: If current_sum becomes negative, it can only
    hurt future subarrays, so reset it.
    """
    max_sum = nums[0]
    current_sum = nums[0]

    for i in range(1, len(nums)):
        # Either extend or start fresh
        current_sum = max(nums[i], current_sum + nums[i])
        max_sum = max(max_sum, current_sum)

    return max_sum

# Test
print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6  (subarray [4,-1,2,1])
print(max_subarray([1]))                                  # 1
print(max_subarray([-1, -2, -3]))                         # -1

AI/ML context: Kadane's algorithm is structurally similar to computing running statistics in online learning. The “extend or reset” decision mirrors how gradient accumulation works in mini-batch training — deciding when accumulated gradients help vs. hurt convergence.

Problem 3: Rotate Array

📝

Problem: Given an array, rotate it to the right by k steps in-place.

Brute Force: O(n*k) Time, O(1) Space

def rotate_brute(nums, k):
    """Rotate one step at a time, k times."""
    k = k % len(nums)
    for _ in range(k):
        last = nums[-1]
        for i in range(len(nums) - 1, 0, -1):
            nums[i] = nums[i - 1]
        nums[0] = last

# Test
arr = [1, 2, 3, 4, 5, 6, 7]
rotate_brute(arr, 3)
print(arr)  # [5, 6, 7, 1, 2, 3, 4]

Optimal (Reverse): O(n) Time, O(1) Space

def rotate(nums, k):
    """
    Three-step reverse approach:
    1. Reverse entire array
    2. Reverse first k elements
    3. Reverse remaining elements

    Example: [1,2,3,4,5,6,7], k=3
    Step 1: [7,6,5,4,3,2,1]
    Step 2: [5,6,7,4,3,2,1]
    Step 3: [5,6,7,1,2,3,4]
    """
    def reverse(arr, left, right):
        while left < right:
            arr[left], arr[right] = arr[right], arr[left]
            left += 1
            right -= 1

    n = len(nums)
    k = k % n  # Handle k > n

    reverse(nums, 0, n - 1)
    reverse(nums, 0, k - 1)
    reverse(nums, k, n - 1)

# Test
arr = [1, 2, 3, 4, 5, 6, 7]
rotate(arr, 3)
print(arr)  # [5, 6, 7, 1, 2, 3, 4]

arr2 = [-1, -100, 3, 99]
rotate(arr2, 2)
print(arr2)  # [3, 99, -1, -100]

AI/ML context: Array rotation is used in circular buffer implementations for streaming data in real-time ML systems and in data augmentation pipelines where you cyclically shift time-series data.

Problem 4: Reverse Words in a String

📝

Problem: Given a string s, reverse the order of words. A word is a sequence of non-space characters. Return the string with words in reverse order, separated by single spaces.

Pythonic Solution: O(n) Time, O(n) Space

def reverse_words(s):
    """
    Split by whitespace, filter empties, reverse, join.
    Python's split() without args handles multiple spaces.
    """
    return " ".join(s.split()[::-1])

# Test
print(reverse_words("the sky is blue"))        # "blue is sky the"
print(reverse_words("  hello world  "))        # "world hello"
print(reverse_words("a good   example"))       # "example good a"

Manual Approach (No split): O(n) Time, O(n) Space

def reverse_words_manual(s):
    """
    Build words character by character, then reverse.
    Shows understanding of the underlying process.
    """
    words = []
    current_word = []

    for char in s:
        if char != ' ':
            current_word.append(char)
        elif current_word:
            words.append(''.join(current_word))
            current_word = []

    # Don't forget the last word
    if current_word:
        words.append(''.join(current_word))

    # Reverse and join
    words.reverse()
    return ' '.join(words)

# Test
print(reverse_words_manual("the sky is blue"))  # "blue is sky the"

AI/ML context: Text preprocessing is the first step in any NLP pipeline. Tokenization, word reversal, and string manipulation are core operations in building vocabulary indices, processing text corpora, and implementing custom tokenizers.

Problem 5: Valid Anagram

📝

Problem: Given two strings s and t, return True if t is an anagram of s, and False otherwise.

Brute Force (Sort): O(n log n) Time, O(n) Space

def is_anagram_sort(s, t):
    """Sort both strings and compare."""
    return sorted(s) == sorted(t)

print(is_anagram_sort("anagram", "nagaram"))  # True
print(is_anagram_sort("rat", "car"))          # False

Optimal (Frequency Count): O(n) Time, O(1) Space

def is_anagram(s, t):
    """
    Count character frequencies. If both strings have
    identical frequency distributions, they are anagrams.
    Space is O(1) because alphabet size is fixed (26 letters).
    """
    if len(s) != len(t):
        return False

    freq = {}
    for char in s:
        freq[char] = freq.get(char, 0) + 1

    for char in t:
        if char not in freq:
            return False
        freq[char] -= 1
        if freq[char] < 0:
            return False

    return True

# Test
print(is_anagram("anagram", "nagaram"))  # True
print(is_anagram("rat", "car"))          # False
print(is_anagram("", ""))                # True

Pythonic One-Liner

from collections import Counter

def is_anagram_counter(s, t):
    return Counter(s) == Counter(t)

print(is_anagram_counter("listen", "silent"))  # True

AI/ML context: Frequency-based comparisons are fundamental in NLP. Bag-of-words models, TF-IDF vectors, and character n-gram features all rely on counting element frequencies — the same technique used in anagram detection.

Problem 6: Sliding Window Maximum

📝

Problem: Given an array nums and a window size k, return the maximum value in each sliding window of size k as the window moves from left to right.

Brute Force: O(n*k) Time, O(n-k+1) Space

def max_sliding_window_brute(nums, k):
    """Find max in each window by scanning all k elements."""
    if not nums or k == 0:
        return []

    result = []
    for i in range(len(nums) - k + 1):
        window_max = max(nums[i:i + k])
        result.append(window_max)
    return result

# Test
print(max_sliding_window_brute([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [3, 3, 5, 5, 6, 7]

Optimal (Monotonic Deque): O(n) Time, O(k) Space

from collections import deque

def max_sliding_window(nums, k):
    """
    Use a monotonic decreasing deque to track the maximum.

    The deque stores indices. We maintain the invariant that
    elements in the deque are in decreasing order of their values.
    The front of the deque is always the index of the maximum
    in the current window.

    For each new element:
    1. Remove elements from back that are smaller (they can
       never be the max while the new element is in the window)
    2. Add current index to back
    3. Remove front if it's outside the window
    4. Front of deque is the current window's max
    """
    if not nums or k == 0:
        return []

    dq = deque()  # stores indices
    result = []

    for i in range(len(nums)):
        # Remove elements smaller than current from back
        while dq and nums[dq[-1]] <= nums[i]:
            dq.pop()

        dq.append(i)

        # Remove front if outside window
        if dq[0] <= i - k:
            dq.popleft()

        # Window is fully formed starting at index k-1
        if i >= k - 1:
            result.append(nums[dq[0]])

    return result

# Test
print(max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [3, 3, 5, 5, 6, 7]

print(max_sliding_window([1], 1))   # [1]
print(max_sliding_window([9, 11], 2))  # [11]

AI/ML context: Sliding window operations are everywhere in ML — moving average in time series, max pooling in CNNs, and windowed attention in transformers. The monotonic deque technique is exactly how efficient max pooling is implemented in deep learning frameworks.

Summary: Complexity Comparison

Problem	Brute Force	Optimal	Key Technique
Two Sum	O(n²)	O(n)	Hash map for complement lookup
Max Subarray	O(n²)	O(n)	Kadane's: extend or reset
Rotate Array	O(n*k)	O(n)	Three reverses
Reverse Words	O(n)	O(n)	Split + reverse + join
Valid Anagram	O(n log n)	O(n)	Frequency counting
Sliding Window Max	O(n*k)	O(n)	Monotonic deque

← Previous DSA for ML Engineers Next → Hash Maps & Sets