HeapProblem 12 of 18

Median in a stream of Integers

Brute Force
Time: O(n² log n)
Space: O(n)
Optimal
Time: O(n log n)
Space: O(n)

Problem Statement

Given a stream of integers, find the median after each new element is added.

Example:

  • Input stream: [5, 15, 1, 3, 2, 8]
  • Output medians: [5, 10, 5, 4, 3, 4]
  • Explanation:
    • [5] → median = 5
    • [5, 15] → median = (5+15)/2 = 10
    • [1, 5, 15] → median = 5
    • [1, 3, 5, 15] → median = (3+5)/2 = 4
    • [1, 2, 3, 5, 15] → median = 3
    • [1, 2, 3, 5, 8, 15] → median = (3+5)/2 = 4

Approach 1: Maintain Sorted Array (Brute Force)

Intuition

Keep elements in a sorted array. Insert each new element in its correct position. Median is the middle element(s).

Algorithm

  1. For each new element, find its position using binary search
  2. Insert at that position (shifting elements)
  3. Return median based on array length
java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MedianFinder {
    private List<Integer> sorted;
    
    public MedianFinder() {
        sorted = new ArrayList<>();
    }
    
    public void addNum(int num) {
        // Binary search for position
        int pos = Collections.binarySearch(sorted, num);
        if (pos < 0) {
            pos = -(pos + 1);
        }
        sorted.add(pos, num);
    }
    
    public double findMedian() {
        int n = sorted.size();
        if (n % 2 == 1) {
            return sorted.get(n / 2);
        } else {
            return (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
        }
    }
}

Complexity Analysis

  • Time Complexity: O(n) per insertion (shifting elements), O(1) for median
  • Space Complexity: O(n) for storing all elements

Approach 2: Two Heaps (Optimal)

Intuition

Use two heaps to maintain the lower and upper halves of the data. Max-heap for lower half, min-heap for upper half. Median is derived from the roots of these heaps.

Algorithm

  1. Max-heap (lower half): Stores smaller half of numbers
  2. Min-heap (upper half): Stores larger half of numbers
  3. Balance heaps: |max-heap size - min-heap size| ≤ 1
  4. For odd count: median = top of larger heap
  5. For even count: median = average of both tops
java
import java.util.PriorityQueue;
import java.util.Collections;

class MedianFinder {
    private PriorityQueue<Integer> maxHeap;  // Lower half
    private PriorityQueue<Integer> minHeap;  // Upper half
    
    public MedianFinder() {
        maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        minHeap = new PriorityQueue<>();
    }
    
    public void addNum(int num) {
        // Add to maxHeap first
        maxHeap.offer(num);
        
        // Balance: move max of lower half to upper half
        minHeap.offer(maxHeap.poll());
        
        // Ensure maxHeap has same or one more element
        if (maxHeap.size() < minHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }
    
    public double findMedian() {
        if (maxHeap.size() > minHeap.size()) {
            return maxHeap.peek();
        }
        return (maxHeap.peek() + minHeap.peek()) / 2.0;
    }
}

Complexity Analysis

  • Time Complexity: O(log n) per insertion, O(1) for median
  • Space Complexity: O(n) for storing all elements

Approach 3: Two Heaps - Alternative Balance Strategy

Intuition

Instead of always adding to one heap first, decide based on the value and heap tops.

Algorithm

  1. If num ≤ max of lower half, add to max-heap
  2. Else add to min-heap
  3. Rebalance if needed
java
import java.util.PriorityQueue;
import java.util.Collections;

class MedianFinder {
    private PriorityQueue<Integer> maxHeap;
    private PriorityQueue<Integer> minHeap;
    
    public MedianFinder() {
        maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        minHeap = new PriorityQueue<>();
    }
    
    public void addNum(int num) {
        if (maxHeap.isEmpty() || num <= maxHeap.peek()) {
            maxHeap.offer(num);
        } else {
            minHeap.offer(num);
        }
        
        // Rebalance
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.offer(maxHeap.poll());
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }
    
    public double findMedian() {
        if (maxHeap.size() > minHeap.size()) {
            return maxHeap.peek();
        }
        return (maxHeap.peek() + minHeap.peek()) / 2.0;
    }
}

Complexity Analysis

  • Time Complexity: O(log n) per insertion, O(1) for median
  • Space Complexity: O(n) for storing all elements

Approach 4: Self-Balancing BST (Multiset)

Intuition

Use a balanced BST (like multiset in C++) that allows O(log n) insertion and O(log n) access to median.

Algorithm

  1. Insert element into multiset
  2. Maintain iterator pointing to median
  3. Update iterator after each insertion
java
import java.util.TreeMap;

class MedianFinder {
    private TreeMap<Integer, Integer> tree;
    private int count;
    
    public MedianFinder() {
        tree = new TreeMap<>();
        count = 0;
    }
    
    public void addNum(int num) {
        tree.merge(num, 1, Integer::sum);
        count++;
    }
    
    public double findMedian() {
        int target = (count + 1) / 2;
        int accumulated = 0;
        Integer first = null, second = null;
        
        for (var entry : tree.entrySet()) {
            accumulated += entry.getValue();
            if (first == null && accumulated >= target) {
                first = entry.getKey();
            }
            if (accumulated >= target + (count % 2 == 0 ? 1 : 0)) {
                second = entry.getKey();
                break;
            }
        }
        
        if (count % 2 == 1) {
            return first;
        }
        return (first + second) / 2.0;
    }
}

Complexity Analysis

  • Time Complexity: O(log n) per insertion, O(log n) or O(1) for median (depending on implementation)
  • Space Complexity: O(n) for storing all elements

Key Takeaways

  1. Two heaps is the classic solution - O(log n) insert, O(1) median
  2. Max-heap for lower half, min-heap for upper half is the key insight
  3. Keep heaps balanced - size difference ≤ 1
  4. This is LeetCode 295 - Find Median from Data Stream
  5. For follow-up with data in range [0, 100], counting sort / bucket approach works
  6. Multiset with iterator tracking is elegant but implementation is tricky
  7. The two-heap approach extends to finding any percentile
  8. Memory efficient for streaming - we keep all elements but access is O(1)