Searching & SortingProblem 36 of 36

Partitioning and Sorting Arrays with Many Repeated Entries

Brute Force
Time: O(n^2)
Space: O(1)
Optimal
Time: O(n)
Space: O(1)

Problem Statement

Given an array containing many repeated/duplicate elements, sort the array efficiently. Standard sorting algorithms may not be optimal when there are many duplicates because they don't take advantage of this property.

Constraints:

  • 1 ≤ N ≤ 10^6
  • Number of distinct elements is much smaller than N
  • Array elements can be any comparable type

Example:

  • Input: arr = [4, 9, 4, 4, 1, 9, 4, 4, 9, 4, 4, 1, 4]
  • Output: [1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 9]

Example 2 (Dutch National Flag):

  • Input: arr = [2, 0, 2, 1, 1, 0]
  • Output: [0, 0, 1, 1, 2, 2]

Approach 1: Standard Sorting (Suboptimal for Many Duplicates)

Intuition

Standard comparison-based sorts like quicksort with 2-way partitioning perform poorly with many duplicates because equal elements cause unbalanced partitions.

Complexity Analysis

  • Time Complexity: O(n²) for arrays with many duplicates
  • Space Complexity: O(log n) to O(n) for recursion

Approach 2: Dutch National Flag (3-Way Partitioning)

Intuition

Partition array into three parts: elements less than pivot, equal to pivot, and greater than pivot. This handles duplicates efficiently by grouping equal elements together.

Algorithm

  1. Choose a pivot
  2. Maintain three regions: <pivot, =pivot, >pivot
  3. Use three pointers: low, mid, high
  4. Recursively sort only < and > regions (skip = region)
java
public class ThreeWayPartition {
    private int[] partition3Way(int[] arr, int low, int high) {
        int pivot = arr[low];
        int lt = low;
        int gt = high;
        int i = low;
        
        while (i <= gt) {
            if (arr[i] < pivot) {
                swap(arr, lt, i);
                lt++;
                i++;
            } else if (arr[i] > pivot) {
                swap(arr, i, gt);
                gt--;
            } else {
                i++;
            }
        }
        
        return new int[]{lt, gt};
    }
    
    private void swap(int[] arr, int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }
    
    public void quickSort3Way(int[] arr, int low, int high) {
        if (low >= high) return;
        
        int[] bounds = partition3Way(arr, low, high);
        
        quickSort3Way(arr, low, bounds[0] - 1);
        quickSort3Way(arr, bounds[1] + 1, high);
    }
    
    public void sort(int[] arr) {
        quickSort3Way(arr, 0, arr.length - 1);
    }
}

Dry Run Example

arr = [4, 9, 4, 4, 1, 9, 4] pivot = 4 Initial: lt=0, gt=6, i=0 arr = [4, 9, 4, 4, 1, 9, 4] i=0: arr[0]=4 == pivot, i++ lt=0, gt=6, i=1 i=1: arr[1]=9 > pivot, swap(1,6), gt-- arr = [4, 4, 4, 4, 1, 9, 9] lt=0, gt=5, i=1 i=1: arr[1]=4 == pivot, i++ lt=0, gt=5, i=2 i=2: arr[2]=4 == pivot, i++ lt=0, gt=5, i=3 i=3: arr[3]=4 == pivot, i++ lt=0, gt=5, i=4 i=4: arr[4]=1 < pivot, swap(0,4), lt++, i++ arr = [1, 4, 4, 4, 4, 9, 9] lt=1, gt=5, i=5 i=5: arr[5]=9 > pivot, swap(5,5), gt-- arr = [1, 4, 4, 4, 4, 9, 9] lt=1, gt=4, i=5 i > gt, done. Result: lt=1, gt=4 arr = [1, 4, 4, 4, 4, 9, 9] ^-----------^ equal to pivot Recursively sort [0, 0] and [5, 6] Both are already sorted or single elements. Final: [1, 4, 4, 4, 4, 9, 9]

Complexity Analysis

  • Time Complexity: O(n) average for arrays with many duplicates
  • Space Complexity: O(log n) for recursion stack

Approach 3: Dutch National Flag (3 Colors)

Special case when there are exactly 3 distinct values (like 0, 1, 2):


Approach 4: Counting Sort (For Limited Range)

When elements are in a small known range:

Complexity

  • Time: O(n + k) where k is the range
  • Space: O(k)

Approach 5: Bucket Sort by Key Frequency

Group elements by value, then reconstruct:

Complexity

  • Time: O(n + k log k) where k is number of distinct elements
  • Space: O(k)

Comparison of Approaches

| Approach | Time | Space | Best When | |----------|------|-------|-----------| | Standard QuickSort | O(n²) worst | O(log n) | Few duplicates | | 3-Way QuickSort | O(n) avg | O(log n) | Many duplicates | | Dutch Flag (3 colors) | O(n) | O(1) | Exactly 3 values | | Counting Sort | O(n+k) | O(k) | Small range of values | | Bucket by Frequency | O(n + k log k) | O(k) | Few distinct values |


Key Takeaways

  1. 3-Way Partitioning: Essential for handling duplicates in quicksort
  2. Dutch National Flag: Classic algorithm for 3-value sorting
  3. Counting Sort: O(n) when range is small
  4. Equal Elements Region: Skip sorting elements equal to pivot
  5. Pivot Selection: Random pivot helps avoid worst case
  6. Applications: Database sorting, radix sort, string sorting
  7. Interview Classic: Dutch National Flag is a common interview question
  8. Stability: 3-way quicksort is not stable; counting sort can be made stable