StringProblem 29 of 43

Find the first repeated word in string

Brute Force
Time: O(n²)
Space: O(n)
Optimal
Time: O(n)
Space: O(n)

Problem Statement

Given a string containing words separated by spaces, find the first word that appears more than once.

Example:

  • Input: "Ravi had been saying that he had been there"
  • Output: "had"
  • Explanation: "had" is the first word that repeats

Example 2:

  • Input: "he had had he"
  • Output: "had"
  • Explanation: "had" appears at positions 2 and 3, while "he" appears at 1 and 4. "had" repeats first.

Approach 1: Brute Force (Compare Each Word with All Following)

Intuition

For each word, check if it appears again in the remaining part of the string. Return the first word that has a duplicate.

Algorithm

  1. Split the string into words
  2. For each word at index i
  3. Check if it appears at any index j > i
  4. If found, return that word
  5. If no repeats found, return empty string
java
public class Solution {
    public String firstRepeatedWord(String str) {
        String[] words = str.split("\\s+");
        int n = words.length;
        int minSecondIdx = n;
        String result = "";
        
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (words[i].equals(words[j])) {
                    if (j < minSecondIdx) {
                        minSecondIdx = j;
                        result = words[i];
                    }
                    break;
                }
            }
        }
        
        return result;
    }
}

Complexity Analysis

  • Time Complexity: O(n² * m) where n is number of words and m is average word length
  • Space Complexity: O(n * m) for storing words

Approach 2: Using HashSet (Optimal)

Intuition

Use a HashSet to track seen words. The first word we encounter that's already in the set is our answer.

Algorithm

  1. Split string into words
  2. Create empty HashSet
  3. For each word:
    • If word is in set, return it (this is the first repeat)
    • Otherwise, add word to set
  4. If no repeats found, return empty string
java
import java.util.*;

public class Solution {
    public String firstRepeatedWord(String str) {
        Set<String> seen = new HashSet<>();
        
        for (String word : str.split("\\s+")) {
            if (seen.contains(word)) {
                return word;
            }
            seen.add(word);
        }
        
        return "";  // No repeated word
    }
}

Complexity Analysis

  • Time Complexity: O(n * m) where n is number of words, m is average word length
  • Space Complexity: O(n * m) for HashSet storage

Approach 3: Handle Punctuation and Case

Intuition

In real scenarios, we need to handle punctuation and case sensitivity. Words like "Hello" and "hello," should be treated as the same word.

java
import java.util.*;
import java.util.regex.*;

public class Solution {
    public String firstRepeatedWord(String str) {
        Set<String> seen = new HashSet<>();
        
        // Extract words using regex, convert to lowercase
        Pattern pattern = Pattern.compile("\\b\\w+\\b");
        Matcher matcher = pattern.matcher(str.toLowerCase());
        
        while (matcher.find()) {
            String word = matcher.group();
            if (seen.contains(word)) {
                return word;
            }
            seen.add(word);
        }
        
        return "";
    }
}

Complexity Analysis

  • Time Complexity: O(n * m)
  • Space Complexity: O(n * m)

Approach 4: Using Trie

Intuition

For very large texts or streaming data, a Trie can be more memory-efficient than storing full words in a HashSet.

java
import java.util.*;

class TrieNode {
    TrieNode[] children = new TrieNode[26];
    boolean isEndOfWord = false;
}

class Trie {
    TrieNode root = new TrieNode();
    
    // Returns true if word already exists
    boolean insertAndCheck(String word) {
        TrieNode current = root;
        
        for (char c : word.toLowerCase().toCharArray()) {
            if (!Character.isLetter(c)) continue;
            int idx = c - 'a';
            
            if (current.children[idx] == null) {
                current.children[idx] = new TrieNode();
            }
            current = current.children[idx];
        }
        
        if (current.isEndOfWord) {
            return true;
        }
        
        current.isEndOfWord = true;
        return false;
    }
}

public class Solution {
    public String firstRepeatedWord(String str) {
        Trie trie = new Trie();
        
        for (String word : str.split("\\s+")) {
            if (trie.insertAndCheck(word)) {
                return word;
            }
        }
        
        return "";
    }
}

Complexity Analysis

  • Time Complexity: O(n * m) where n is number of words, m is max word length
  • Space Complexity: O(total unique characters) - can be less than HashSet for similar prefixes

Key Takeaways

  1. HashSet is the standard approach for O(n) time complexity
  2. Clean input before processing (handle punctuation, case)
  3. Trie can be useful for streaming data or prefix-related queries
  4. The problem definition matters: "first repeated" means first word whose second occurrence appears earliest
  5. String splitting is O(n) and adds to preprocessing time
  6. Consider memory efficiency for very large texts