Partitioning and Sorting Arrays with Many Repeated Entries
Problem Statement
Given an array containing many repeated/duplicate elements, sort the array efficiently. Standard sorting algorithms may not be optimal when there are many duplicates because they don't take advantage of this property.
Constraints:
- 1 ≤ N ≤ 10^6
- Number of distinct elements is much smaller than N
- Array elements can be any comparable type
Example:
- Input:
arr = [4, 9, 4, 4, 1, 9, 4, 4, 9, 4, 4, 1, 4] - Output:
[1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 9]
Example 2 (Dutch National Flag):
- Input:
arr = [2, 0, 2, 1, 1, 0] - Output:
[0, 0, 1, 1, 2, 2]
Approach 1: Standard Sorting (Suboptimal for Many Duplicates)
Intuition
Standard comparison-based sorts like quicksort with 2-way partitioning perform poorly with many duplicates because equal elements cause unbalanced partitions.
Complexity Analysis
- Time Complexity: O(n²) for arrays with many duplicates
- Space Complexity: O(log n) to O(n) for recursion
Approach 2: Dutch National Flag (3-Way Partitioning)
Intuition
Partition array into three parts: elements less than pivot, equal to pivot, and greater than pivot. This handles duplicates efficiently by grouping equal elements together.
Algorithm
- Choose a pivot
- Maintain three regions: <pivot, =pivot, >pivot
- Use three pointers: low, mid, high
- Recursively sort only < and > regions (skip = region)
public class ThreeWayPartition {
private int[] partition3Way(int[] arr, int low, int high) {
int pivot = arr[low];
int lt = low;
int gt = high;
int i = low;
while (i <= gt) {
if (arr[i] < pivot) {
swap(arr, lt, i);
lt++;
i++;
} else if (arr[i] > pivot) {
swap(arr, i, gt);
gt--;
} else {
i++;
}
}
return new int[]{lt, gt};
}
private void swap(int[] arr, int i, int j) {
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
public void quickSort3Way(int[] arr, int low, int high) {
if (low >= high) return;
int[] bounds = partition3Way(arr, low, high);
quickSort3Way(arr, low, bounds[0] - 1);
quickSort3Way(arr, bounds[1] + 1, high);
}
public void sort(int[] arr) {
quickSort3Way(arr, 0, arr.length - 1);
}
}Dry Run Example
arr = [4, 9, 4, 4, 1, 9, 4]
pivot = 4
Initial: lt=0, gt=6, i=0
arr = [4, 9, 4, 4, 1, 9, 4]
i=0: arr[0]=4 == pivot, i++
lt=0, gt=6, i=1
i=1: arr[1]=9 > pivot, swap(1,6), gt--
arr = [4, 4, 4, 4, 1, 9, 9]
lt=0, gt=5, i=1
i=1: arr[1]=4 == pivot, i++
lt=0, gt=5, i=2
i=2: arr[2]=4 == pivot, i++
lt=0, gt=5, i=3
i=3: arr[3]=4 == pivot, i++
lt=0, gt=5, i=4
i=4: arr[4]=1 < pivot, swap(0,4), lt++, i++
arr = [1, 4, 4, 4, 4, 9, 9]
lt=1, gt=5, i=5
i=5: arr[5]=9 > pivot, swap(5,5), gt--
arr = [1, 4, 4, 4, 4, 9, 9]
lt=1, gt=4, i=5
i > gt, done.
Result: lt=1, gt=4
arr = [1, 4, 4, 4, 4, 9, 9]
^-----------^ equal to pivot
Recursively sort [0, 0] and [5, 6]
Both are already sorted or single elements.
Final: [1, 4, 4, 4, 4, 9, 9]
Complexity Analysis
- Time Complexity: O(n) average for arrays with many duplicates
- Space Complexity: O(log n) for recursion stack
Approach 3: Dutch National Flag (3 Colors)
Special case when there are exactly 3 distinct values (like 0, 1, 2):
Approach 4: Counting Sort (For Limited Range)
When elements are in a small known range:
Complexity
- Time: O(n + k) where k is the range
- Space: O(k)
Approach 5: Bucket Sort by Key Frequency
Group elements by value, then reconstruct:
Complexity
- Time: O(n + k log k) where k is number of distinct elements
- Space: O(k)
Comparison of Approaches
| Approach | Time | Space | Best When | |----------|------|-------|-----------| | Standard QuickSort | O(n²) worst | O(log n) | Few duplicates | | 3-Way QuickSort | O(n) avg | O(log n) | Many duplicates | | Dutch Flag (3 colors) | O(n) | O(1) | Exactly 3 values | | Counting Sort | O(n+k) | O(k) | Small range of values | | Bucket by Frequency | O(n + k log k) | O(k) | Few distinct values |
Key Takeaways
- 3-Way Partitioning: Essential for handling duplicates in quicksort
- Dutch National Flag: Classic algorithm for 3-value sorting
- Counting Sort: O(n) when range is small
- Equal Elements Region: Skip sorting elements equal to pivot
- Pivot Selection: Random pivot helps avoid worst case
- Applications: Database sorting, radix sort, string sorting
- Interview Classic: Dutch National Flag is a common interview question
- Stability: 3-way quicksort is not stable; counting sort can be made stable