Binary TreesProblem 34 of 35Important

Find all Duplicate subtrees in a Binary tree [IMP]

Brute Force
Time: O(n^2)
Space: O(n^2)
Optimal
Time: O(n)
Space: O(n)

Problem Statement

Given a binary tree, find all duplicate subtrees. For each duplicate subtree, return the root node of any one of them. Two trees are duplicates if they have the same structure with same node values.

Example:

1 / \ 2 3 / / \ 4 2 4 / 4 Duplicate subtrees: - Subtree [2, 4] appears twice - Subtree [4] appears three times Output: [2, 4] (return root nodes of any one occurrence) / 4

Approach 1: Brute Force (Compare All Subtrees)

Intuition

For every pair of nodes, check if their subtrees are identical. Collect all nodes whose subtrees have duplicates.

Algorithm

  1. Get all nodes in the tree
  2. For each pair of nodes, check if subtrees are identical
  3. If identical, add one to result (avoid adding both)
  4. Return all such roots
java
import java.util.*;

class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int x) { val = x; }
}

public class Solution {
    private void getAllNodes(TreeNode root, List<TreeNode> nodes) {
        if (root == null) return;
        nodes.add(root);
        getAllNodes(root.left, nodes);
        getAllNodes(root.right, nodes);
    }
    
    private boolean areIdentical(TreeNode t1, TreeNode t2) {
        if (t1 == null && t2 == null) return true;
        if (t1 == null || t2 == null) return false;
        
        return t1.val == t2.val &&
               areIdentical(t1.left, t2.left) &&
               areIdentical(t1.right, t2.right);
    }
    
    public List<TreeNode> findDuplicateSubtreesBruteForce(TreeNode root) {
        List<TreeNode> nodes = new ArrayList<>();
        getAllNodes(root, nodes);
        
        List<TreeNode> duplicates = new ArrayList<>();
        Set<TreeNode> added = new HashSet<>();
        
        for (int i = 0; i < nodes.size(); i++) {
            if (added.contains(nodes.get(i))) continue;
            
            for (int j = i + 1; j < nodes.size(); j++) {
                if (areIdentical(nodes.get(i), nodes.get(j))) {
                    if (!added.contains(nodes.get(i))) {
                        duplicates.add(nodes.get(i));
                        added.add(nodes.get(i));
                        added.add(nodes.get(j));
                    }
                }
            }
        }
        
        return duplicates;
    }
}

Complexity Analysis

  • Time Complexity: O(n^2) in best case, O(n^3) if many comparisons needed
  • Space Complexity: O(n^2) - Storing all nodes and comparison overhead

Approach 2: Optimal (Serialization with HashMap)

Intuition

Serialize each subtree into a unique string. Use a HashMap to count occurrences of each serialization. When a serialization appears for the second time, add that subtree's root to the result.

Algorithm

  1. Perform postorder traversal
  2. Serialize each subtree as: "left,val,right"
  3. Store serializations in HashMap with count
  4. When count becomes 2, add root to result
  5. Return all collected roots
java
import java.util.*;

class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int x) { val = x; }
}

public class Solution {
    private Map<String, Integer> serialCount = new HashMap<>();
    private List<TreeNode> result = new ArrayList<>();
    
    private String serialize(TreeNode root) {
        if (root == null) return "#";
        
        // Postorder: process children first
        String left = serialize(root.left);
        String right = serialize(root.right);
        
        // Create serialization for current subtree
        String serial = left + "," + root.val + "," + right;
        
        // Count occurrences
        int count = serialCount.getOrDefault(serial, 0) + 1;
        serialCount.put(serial, count);
        
        // Add to result only when count becomes 2
        if (count == 2) {
            result.add(root);
        }
        
        return serial;
    }
    
    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        serialCount.clear();
        result.clear();
        serialize(root);
        return result;
    }
}

Optimization with Unique IDs

Complexity Analysis

  • Time Complexity: O(n) - Each node is visited once
  • Space Complexity: O(n) - HashMap storage for serializations

Key Takeaways

  1. Serialization for tree comparison: Convert tree to string for easy hashing
  2. Postorder traversal: Ensures children are processed before parent
  3. Count == 2 condition: Add to result only on first duplicate, not on every occurrence
  4. ID optimization: Use integer IDs instead of full strings for better performance
  5. This is a classic problem for demonstrating tree serialization technique