ArrayProblem 11 of 36
Find duplicate in an array of N+1 Integers
Problem Statement
Given an array of integers containing n + 1 integers where each integer is in the range [1, n] inclusive, there is exactly one repeated number. Find and return this duplicate number.
Constraints:
- You must not modify the array
- You must use only constant extra space O(1)
Example:
- Input:
[1, 3, 4, 2, 2] - Output:
2 - Explanation: 2 appears twice
Approach 1: Brute Force (Nested Loops)
Intuition
Compare each element with every other element to find duplicates.
Algorithm
- For each element at index i
- Check all elements at index j > i
- If arr[i] == arr[j], return arr[i]
java
public class Solution {
public int findDuplicate(int[] nums) {
int n = nums.length;
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
if (nums[i] == nums[j]) {
return nums[i];
}
}
}
return -1; // No duplicate found
}
}Complexity Analysis
- Time Complexity: O(n²) - Two nested loops
- Space Complexity: O(1) - No extra space
Approach 2: Sorting
Intuition
Sort the array, then duplicates will be adjacent.
Complexity Analysis
- Time Complexity: O(n log n) - Due to sorting
- Space Complexity: O(1) or O(n) - Depends on whether we copy the array
Approach 3: Using HashSet
Intuition
Use a set to track seen numbers. If we see a number twice, it's the duplicate.
java
import java.util.HashSet;
import java.util.Set;
public class Solution {
public int findDuplicate(int[] nums) {
Set<Integer> seen = new HashSet<>();
for (int num : nums) {
if (seen.contains(num)) {
return num;
}
seen.add(num);
}
return -1;
}
}Complexity Analysis
- Time Complexity: O(n) - Single pass
- Space Complexity: O(n) - HashSet storage
Approach 4: Floyd's Cycle Detection (Optimal)
Intuition
Treat the array as a linked list where arr[i] points to index arr[i]. Since there's a duplicate, there must be a cycle. Use Floyd's Tortoise and Hare algorithm to find the cycle entry point.
Why this works:
- Array indices: 0 to n
- Values: 1 to n
- If we follow
next = arr[current], values 1-n create a valid traversal - A duplicate value means two indices point to the same "node" → cycle exists
- The duplicate number is the entry point of the cycle
Algorithm
-
Phase 1: Detect cycle (find meeting point)
- Slow pointer moves 1 step:
slow = arr[slow] - Fast pointer moves 2 steps:
fast = arr[arr[fast]] - They will meet inside the cycle
- Slow pointer moves 1 step:
-
Phase 2: Find cycle entry (duplicate)
- Reset slow to start
- Move both pointers 1 step at a time
- They meet at the cycle entry = duplicate number
java
public class Solution {
public int findDuplicate(int[] nums) {
// Phase 1: Find intersection point
int slow = nums[0];
int fast = nums[0];
do {
slow = nums[slow]; // Move 1 step
fast = nums[nums[fast]]; // Move 2 steps
} while (slow != fast);
// Phase 2: Find cycle entry point (duplicate)
slow = nums[0];
while (slow != fast) {
slow = nums[slow];
fast = nums[fast];
}
return slow;
}
}Dry Run Example
nums = [1, 3, 4, 2, 2]
Visualize as linked list:
Index: 0 -> 1 -> 3 -> 2 -> 4 -> 2 (cycle!)
Value: 1 3 2 4 2
0 → 1 → 3 → 2 → 4
↑ ↓
└───────┘
Phase 1: Find meeting point
slow = nums[0] = 1
fast = nums[0] = 1
Step 1: slow = nums[1] = 3
fast = nums[nums[1]] = nums[3] = 2
Step 2: slow = nums[3] = 2
fast = nums[nums[2]] = nums[4] = 2
slow == fast == 2, STOP
Phase 2: Find cycle entry
slow = nums[0] = 1
fast = 2
Step 1: slow = nums[1] = 3
fast = nums[2] = 4
Step 2: slow = nums[3] = 2
fast = nums[4] = 2
slow == fast == 2, STOP
Duplicate: 2
Complexity Analysis
- Time Complexity: O(n) - Linear traversal
- Space Complexity: O(1) - Only two pointers
Why Floyd's Algorithm Works Here
- Array as Graph:
index ipoints toindex nums[i] - Guaranteed Cycle: Since n+1 numbers in range [1,n], by Pigeonhole Principle, at least one number repeats
- Cycle Entry = Duplicate: The duplicate number has two "incoming edges" (two indices pointing to it)
- Mathematical Proof:
- Let cycle entry be at distance
μfrom start - Let cycle length be
λ - When they meet:
2(μ + k) = μ + k + mλ→μ + k = mλ - Starting from meeting point, after
μmore steps, we reach entry
- Let cycle entry be at distance
Key Takeaways
- Floyd's Cycle Detection is the key to O(n) time and O(1) space
- The problem cleverly maps to a linked list cycle problem
- Constraints (values 1-n, n+1 elements) guarantee the cycle exists
- This technique is used in many problems: detecting cycles in linked lists, finding start of cycle
- Alternative: Binary search on value range - O(n log n) time, O(1) space
- If array modification is allowed, negative marking achieves O(n) time, O(1) space