ArrayProblem 11 of 36

Find duplicate in an array of N+1 Integers

Brute Force
Time: O(n²)
Space: O(1)
Optimal
Time: O(n)
Space: O(1)

Problem Statement

Given an array of integers containing n + 1 integers where each integer is in the range [1, n] inclusive, there is exactly one repeated number. Find and return this duplicate number.

Constraints:

  • You must not modify the array
  • You must use only constant extra space O(1)

Example:

  • Input: [1, 3, 4, 2, 2]
  • Output: 2
  • Explanation: 2 appears twice

Approach 1: Brute Force (Nested Loops)

Intuition

Compare each element with every other element to find duplicates.

Algorithm

  1. For each element at index i
  2. Check all elements at index j > i
  3. If arr[i] == arr[j], return arr[i]
java
public class Solution {
    public int findDuplicate(int[] nums) {
        int n = nums.length;
        
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (nums[i] == nums[j]) {
                    return nums[i];
                }
            }
        }
        
        return -1; // No duplicate found
    }
}

Complexity Analysis

  • Time Complexity: O(n²) - Two nested loops
  • Space Complexity: O(1) - No extra space

Approach 2: Sorting

Intuition

Sort the array, then duplicates will be adjacent.

Complexity Analysis

  • Time Complexity: O(n log n) - Due to sorting
  • Space Complexity: O(1) or O(n) - Depends on whether we copy the array

Approach 3: Using HashSet

Intuition

Use a set to track seen numbers. If we see a number twice, it's the duplicate.

java
import java.util.HashSet;
import java.util.Set;

public class Solution {
    public int findDuplicate(int[] nums) {
        Set<Integer> seen = new HashSet<>();
        
        for (int num : nums) {
            if (seen.contains(num)) {
                return num;
            }
            seen.add(num);
        }
        
        return -1;
    }
}

Complexity Analysis

  • Time Complexity: O(n) - Single pass
  • Space Complexity: O(n) - HashSet storage

Approach 4: Floyd's Cycle Detection (Optimal)

Intuition

Treat the array as a linked list where arr[i] points to index arr[i]. Since there's a duplicate, there must be a cycle. Use Floyd's Tortoise and Hare algorithm to find the cycle entry point.

Why this works:

  • Array indices: 0 to n
  • Values: 1 to n
  • If we follow next = arr[current], values 1-n create a valid traversal
  • A duplicate value means two indices point to the same "node" → cycle exists
  • The duplicate number is the entry point of the cycle

Algorithm

  1. Phase 1: Detect cycle (find meeting point)

    • Slow pointer moves 1 step: slow = arr[slow]
    • Fast pointer moves 2 steps: fast = arr[arr[fast]]
    • They will meet inside the cycle
  2. Phase 2: Find cycle entry (duplicate)

    • Reset slow to start
    • Move both pointers 1 step at a time
    • They meet at the cycle entry = duplicate number
java
public class Solution {
    public int findDuplicate(int[] nums) {
        // Phase 1: Find intersection point
        int slow = nums[0];
        int fast = nums[0];
        
        do {
            slow = nums[slow];          // Move 1 step
            fast = nums[nums[fast]];    // Move 2 steps
        } while (slow != fast);
        
        // Phase 2: Find cycle entry point (duplicate)
        slow = nums[0];
        while (slow != fast) {
            slow = nums[slow];
            fast = nums[fast];
        }
        
        return slow;
    }
}

Dry Run Example

nums = [1, 3, 4, 2, 2] Visualize as linked list: Index: 0 -> 1 -> 3 -> 2 -> 4 -> 2 (cycle!) Value: 1 3 2 4 2 0 → 1 → 3 → 2 → 4 ↑ ↓ └───────┘ Phase 1: Find meeting point slow = nums[0] = 1 fast = nums[0] = 1 Step 1: slow = nums[1] = 3 fast = nums[nums[1]] = nums[3] = 2 Step 2: slow = nums[3] = 2 fast = nums[nums[2]] = nums[4] = 2 slow == fast == 2, STOP Phase 2: Find cycle entry slow = nums[0] = 1 fast = 2 Step 1: slow = nums[1] = 3 fast = nums[2] = 4 Step 2: slow = nums[3] = 2 fast = nums[4] = 2 slow == fast == 2, STOP Duplicate: 2

Complexity Analysis

  • Time Complexity: O(n) - Linear traversal
  • Space Complexity: O(1) - Only two pointers

Why Floyd's Algorithm Works Here

  1. Array as Graph: index i points to index nums[i]
  2. Guaranteed Cycle: Since n+1 numbers in range [1,n], by Pigeonhole Principle, at least one number repeats
  3. Cycle Entry = Duplicate: The duplicate number has two "incoming edges" (two indices pointing to it)
  4. Mathematical Proof:
    • Let cycle entry be at distance μ from start
    • Let cycle length be λ
    • When they meet: 2(μ + k) = μ + k + mλμ + k = mλ
    • Starting from meeting point, after μ more steps, we reach entry

Key Takeaways

  1. Floyd's Cycle Detection is the key to O(n) time and O(1) space
  2. The problem cleverly maps to a linked list cycle problem
  3. Constraints (values 1-n, n+1 elements) guarantee the cycle exists
  4. This technique is used in many problems: detecting cycles in linked lists, finding start of cycle
  5. Alternative: Binary search on value range - O(n log n) time, O(1) space
  6. If array modification is allowed, negative marking achieves O(n) time, O(1) space