CSCI 104 - Fall 2019 Data Structures and Design

BSTs

1 - BSTs

Before we go over binary search trees, let's start by going over binary trees. What does it mean for a tree to be binary?

A BST is a specific type of binary tree. In a BST, left children (the left subtree) hold values that are less than the parent's value, and right children (the right subtree) hold values greater than the parent's value.

2 - Searching

BSTs exist to enable (potentially) fast searches.

Why do we say potentially? Can someone think of an example in which the search is really slow, even if we have a valid BST?

Our search function will simply return true or false depending on whether or not our search parameter exists in the tree. Another reasonable return value of a search function could be an iterator pointing to the found element (see std::map find).

To search for key X in a BST, we compare X to the current node.

2.1 - Example

Take a look at this example:

Operation: find(6) // We begin at the root

Let's walk through this.

Now, here's an example where we try to find a node that does not exist in the tree:

Operation: find(0) // We begin at the root

Let's walk through this one too.

The best-case runtime for searching a value X in a BST with N elements is O(logN). What is the worst-case runtime?

2 - Traversals

A traversal is a methodology for stepping through a structure (such as using Breadth-First Traversal as opposed to Depth-First Traversal on a graph). Because nodes in a tree can have multiple children, there are a few different tree traversals. Do linked lists also have multiple ways to traverse them? Why or why not?

The three main tree traversals are Pre-Order Traversal, In-Order Traversal, and Post-Order Traversal. In each of these traversals, we must eventually visit every node, they're just being visited in a different order.

Which one is this?

// Recurse left
// Operate on current node
// Recurse right

How do the other two differ?

For a BST, what is special about operating on elements using an in-order traversal? If we were printing integers using this traversal, what would the output look like?

If we wanted to delete the whole tree, which traversal would we use?

What's a balanced Binary Search Tree?

Most operations on a BST take time directly proportional to the height of the tree, so we want to keep the height small.

A balanced BST is a tree that automatically keeps its height small (guaranteed to be logarithmic) for a sequence of insertions and deletions. This structure provide efficient implementations for abstract data structures.

A tree is considered balanced if it conforms to the Height-Balancing property. A node in a tree is height-balanced if the heights of its subtrees differ by no more than 1.

Here is an example of balanced vs. non balanced trees.

bst

How can we maintain these properties?

The BST property is maintained by smart insertion and deletion. In an insert, you traverse the tree based on the key to be inserted. Once you encounter a situation where you can't traverse any further, you know that the key can be placed there. Because we are traversing based on the key value, we are inherently upholding the BST property.

The same thing can be said about a deletion. This is done by choosing which node to promote. Either the predecessor, if the node has two children, or the child if the node has 1 child. By doing this, the BST property is being maintained.

A BST that maintains its balance throughout all insertions and deletions is called a self-balancing BST. These types of trees that auto-balance or self balance inherently with the insertion are called Self-Balancing Binary Search Trees. Examples are:

  1. Splay Trees
  2. AVL Trees
  3. Red Black Trees
  4. B-Trees
  5. 2-3 Trees

For all of these trees, the height-balancing property is upheld by the nature of an insert or remove. The best way to do so is with rotation, or series of rotations. This is something you'll have to know for your last homework!

Assignment

For the assignment you will have to uphold check the two properties of a balanced BSTs.

A node is defined as such:

class Node {
    Node *left;
    Node *right;
    int key;
}

1. BST Property: Is this a BST?

Given a binary tree, determine if it is a valid binary search tree (BST).

Remember that a BST is defined as follows:

bool isBST(Node *root)

2. Height-balancing property: Is it balanced?

Given a binary tree, determine if it obeys the height-balancing property.

For this problem, a height-balanced binary tree is defined as:

bool isBalanced(Node *root)

Check off

Use make to run both tests.