# Trees

## Terminology

1. **Neighbor:** parent or child of a node
2. **Ancestor:** node reachable by traversing its parent chain
3. **Descendant:** node in the node’s subtree
4. **Degree:** number of children of a node
5. **Degree of a tree:** maximum degree of nodes in the tree
6. **Distance:** number of edges along the shortest path between two nodes
7. **Level/depth:** number of edges along the unique path between a node and the root node
8. **Height:** maximum number of edges between node till last leaf
9. **Width:** number of nodes in a level
10. **Complete binary tree:** every level except possibly last is completely filled, left first
11. **Balanced:** height differs no more than 1

## Traversals

1. Pre-order: SELF, left, right
2. In-order: left, SELF, right
3. Post-order: left, right, SELF

## Runtime analysis

Unbalanced trees have `h = n` in worst case, balanced trees have `h = log n`

1. Access: $$O(h)$$
2. Search: $$O(h)$$
3. Insert: $$O(h)$$
4. Remove: $$O(h)$$

## Corner cases

1. Empty tree
2. Single node
3. Two nodes
4. Skewed tree

## Take note…

1. When using BFS/DFS, `visited` array is not needed since tree has only 1 direction (assuming directed tree)

## Common operations

1. Insert value
   * Find appropriate position and insert
2. Delete value
   * Cases:
     1. No children: delete as per usual
     2. 1 child: swap node with child and delete
     3. 2 child: swap with successor and delete
3. Count number of nodes in tree
4. Whether value is in tree
   * Tree searching: think of conditions to branch left and right
5. Calculate height of the tree
   * `NULL` nodes have height of `-1` while leafs have height of `0`

{% code fullWidth="false" %}

```python
def height(node):
    if not node:
        return -1
    return max(height(node.left), height(node.right)) + 1
```

{% endcode %}

## Techniques

### Recursion

Most tree problems will require [recursion](https://interviews.woojiahao.com/algorithms/recursion "mention") to some degree. Some tips to think about recursion for trees:

* Think of each recursive call as returning the value we want for all subsequent calls (often thinking of it as returning value from the left/right sub-tree)
* Try replacing the return result with multiple values (using a tuple or array) or returning another tree node

### Level-order traversal

Some problems can also be solved by traversing in level-order, much like the "virus" traversal for graphs

### Summation of nodes

Nodes can be summed to verify if nodes are negative

### Modifying existing traversal algorithms

Some problems such as [Binary Tree Right Side View](https://leetcode.com/problems/binary-tree-right-side-view/) requires some modification to the predefined traversal orders in [#traversals](#traversals "mention"). As such, think about possibilities of reversing the order of traversal to start from the right first

### Merkle hashing

This is a relatively unused technique but it is useful when you're trying to do tree comparisons to detect mismatches

* `SHA-256` is used to hash every leaf
* Combining the hashes of the left and right forms the hash of the parent's hash
* Repeat this process till the root of the tree to form the tree hash

I have only seen this technique used for [Subtree of Another Tree](https://leetcode.com/problems/subtree-of-another-tree/)

### Split tree problems

This is a unique class of tree problems where we break the problem into two smaller sub-problems:

1. Finding the overall answer
2. Finding the recursive answer

These problems often arise when the optimal answer can be found by either taking a path through a node (thus voiding the remaining path) or by continuing upwards through the traversal (as seen in [Binary Tree Maximum Path Sum](https://leetcode.com/problems/binary-tree-maximum-path-sum/) and [Diameter of Binary Tree](https://leetcode.com/problems/diameter-of-binary-tree/))

The first sub-problem focuses on solving the overall answer (by assuming a path is taken through the current node) while the second sub-problem focuses on solving the recursive answer, often optimizing the optimal value along a linear path (i.e. does not pass through a sub-tree's root)

```python
ans = 0
def split(node):
    nonlocal ans
    if not node:
        return -10**7
    
    left = split(node.left)
    right = split(node.right)
    ans = max(ans, left + right + node.val)
    return max(left, right) + node.val    
```

### Tree construction

For most tree construction questions, you will need both the pre-order/post-order and in-order traversal to re-construct the tree:

```python
# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right
class Solution:
    def buildTree(self, preorder: List[int], inorder: List[int]) -> Optional[TreeNode]:
        indices = {}
        N = len(preorder)
        for i in range(N):
            indices[inorder[i]] = i

        def build(i, l, r):
            if l > r:
                return None

            if l == r:
                return TreeNode(preorder[i])
            
            root_value = preorder[i]
            node = TreeNode(preorder[i])
            node.left = build(i + 1, l, indices[root_value] - 1)
            node.right = build(i + indices[root_value] - l + 1, indices[root_value] + 1, r)
            return node
        
        return build(0, 0, N - 1)
```

This is based on the property that the values in a pre-order traversal represent the root of each sub-tree throughout the traversal. The in-order traversal is split along the root's value, the left being the values found in the left sub-tree and the right being the values found in the right sub-tree

Another interesting property is that if given the choice to serialize a tree, you can achieve a complete reconstruction using only the pre-order traversal given that you also serialize the `NULL` nodes into fixed characters like `#`. This way, we can just continue removing elements from the top traversal and taking those as the left and right values

### Finding the center of a tree

A center node of a tree is one whose sub-trees have the minimum height, causing nodes to be distributed evenly on the left and right. Additionally, a tree can contain at most two centers

To find the center of a tree, topological sort (found under [graph](https://interviews.woojiahao.com/algorithms/graph "mention")) can be used, eliminating all leaf nodes at most two nodes remain

```python
class Solution:
    def findMinHeightTrees(self, n: int, edges: List[List[int]]) -> List[int]:
        if n <= 2:
            return list(range(n))

        graph = {i: set() for i in range(n)}
        for a, b in edges:
            graph[a].add(b)
            graph[b].add(a)
        
        frontier = []
        for i, s in graph.items():
            if len(s) == 1:
                frontier.append(i)
            
        while n > 2:
            n -= len(frontier)
            next_frontier = []
            for i in frontier:
                for o in graph[i]:
                    graph[o].remove(i)
                    if len(graph[o]) == 1:
                        next_frontier.append(o)
            frontier = next_frontier

        return frontier
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://interviews.woojiahao.com/data-structures/graphs/trees.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
