7.0 Data Structures – Binary Search Trees

A tree is a data structure that has many variants.  From our human perceptive, trees have too many branches and leafs. They are very useful in sunny days, where we walk under their shadows. In Computer Science world, they have another usage. We use their leafs to store data. Yes!! their leafs are a place where we can store data (numbers, characters, …etc.). So, let’s start discovering one of the most powerful data structures used in real applications.
Before we launch, make sure that you understand the word Recursion and how this concept is applied in programming. Refer to this blog post to review what you have learned before continuing with this post.

Trees Specification

As we have just mentioned, a tree has many branches and leafs. However, when it comes to the computer world, we restrict our definition of a tree for simplicity. Our tree data structure specifications are customized to the Binary Search Tree as follows:

– We want to group some items together.
– We graphically represent the data as tree nodes.
– A tree starts with a root node and has branches.
– Every node (including the root) has at most two branches. This is why it’s called a Binary Search Tree.
– A branch represents another node. (we will call this a child node to its parent).
– A node that has no branches (no children) is called leave node.

– Before we continue, have a look at the following figure that demonstrates tree nodes (green circles) and what they represent (the red illustration)
– A tree must have at least one node, which is the root node.
– A left child node value must be less than the value of the parent node. In the above figure, the node with the value 4 is a left child node to the root node, which has a value of 6 (4<6).
– A right child node value must be greater that the value of the parent node. In the above figure, the node with the value 8 is a right child node to the root node, which has a value of 6 (8>6).
– Look again at the figure and notice that all left child nodes value are less than the parent node and the right child nodes value are greater than the parent node.
– This restriction is to make the search operation very easy and direct (we will see this soon in the implementation). This is why it’s called a Binary Search Tree.

Finally, we want to be able to insert and delete nodes. In addition, we want to search the data structure for a specific node in an efficient way.
Let’s break it to pieces and see these specifications in code

template<class ItemType>
class Tree{
public:
   Tree();
   void InsertItem(ItemType item);
   void DeleteItem(ItemType item);
   void RetrieveItem(ItemType &item,bool &found);
   ~Tree();
};

We have a constructor that builds the tree starting from the root element and some helping functions that play the role of inserting, deleting and searching for items.

Binary Search Trees Implementation

Binary Search Trees implementation uses pointers and recursion extensively. To represent tree nodes, we create a structure that contains data about the node’s value as well as pointers to the left and right children. Here is how this is represented in code:

struct Node{
    ItemType info;
    Node<ItemType>* right;
    Node<ItemType>* left;
};

And as we have mentioned in the specifications above, a tree has at least one node which is the root node. So, the private data members will be as follows

class Tree{
private:
    Node<ItemType>* root;
};

This will represent a pointer to the root of the tree. From this root, we can traverse all the node of the tree. Traversing a tree simply means to start from the root node and visit all the nodes of the tree. A tree at first will have no root. So, we initialize the constructor as follows

Tree<ItemType>::Tree(){
    root = NULL;
}

After that, we make a helper function that is used to insert a value in a binary search tree given a pointer to its root node. This function definition is as follows

void Insert(Node<ItemType>* &tree,ItemType item){
    if(tree == NULL){
        tree = new Node<ItemType>;
        tree->right = NULL;
        tree->left = NULL;
        tree->info = item;
    }
    else if(item < tree->info)
        Insert(tree->left,item);
    else
        Insert(tree->right,item);
}

Let’s tackle this function code. First, the function takes a pointer to a tree (indeed it’s a pointer to the tree root node) and an item to insert in the tree. Next, we want to put this new value in the correct node. Returning to the tree specification above, we see that the Binary Search Tree has a very important characteristic. The right child node value must be greater than its parent and the left child node value must be smaller than the parent. This is the key to the above code.
We have three cases when we insert nodes to a binary search tree:
– The first case is when the tree is empty (that is it has been just created). We know this case by checking whether the root node pointer is null or not. If it is, then we create a new node that holds the new value (the value to be inserted) and has no left nor right children.
– The second case is when the pointer to the tree passed to the function already contains a value. We then have to compare the value of this node with the to be inserted value. If the node value is greater than the our new value, we insert the new value to the left of the node. If not, we insert it the right of the node.
The last two cases is implemented with recursion. But Why?!! Why don’t we simply create a new node for the new value, and if it’s smaller than our node, we make the left child points to the newly created node and else we make the right child points to the newly created node?!
The answer is that we MUST use recursion and the following picture illustrate the case.

After we have coded that helper function, we use it in our class as follows

void Tree<ItemType>::InsertItem(ItemType item){
    Insert(root,item);
}

The same scenario, we will create helper functions that is used to delete a node from a binary search tree given its root node. But the tree operation is a little bit tricky. The following picture illustrates the three cases of deleting a node.

To apply the above scenarios, we need to create three helper functions, the first is used to get the predecessor of the node to be deleted. The second is used to delete a node from the tree given its pointer and the third is used to to search for the value in the tree and call the second function to delete that node. Seems confusing? Yes! Let’s give it a try. The first function is

void GetPredecessor(Node<ItemType>* tree,ItemType &data){
    while(tree->right != NULL)
        tree = tree->right;
    data = tree->info;
}

Given a pointer to a tree node, we want to get the value of its (predecessor). As we indicated above, we continue traversing the right children of the node till it has none. Then, we return its value in the passed-by-reference variable.
The second function is

void DeleteNode(Node<ItemType>* &tree){
    ItemType data;
    Node<ItemType>* tempPtr;
    tempPtr = tree;
    if(tree->left == NULL){
        tree = tree->right;
        delete tempPtr;
    }
    else if(tree->right == NULL){
        tree = tree->left;
        delete tempPtr;
    }
    else{
        GetPredecessor(tree->left,data);
        tree->info = data;
        Delete(tree->left,data);
   }
}

The Delete node function is a straightforward implementation of the scenario in the picture. If there is no left child, we make the parent points to the right one and delete the selected node. If there is no right child, we make the parent points to the left one and delete the selected node. If there is both right and left children , we get the predecessor, replace the selected node with the value of that predecessor and finally delete that predecessor.
The third function is

void Delete(Node<ItemType>* &tree,ItemType item){
    if(item < tree->info)
        Delete(tree->left,item);
    else if(item > tree->info)
        Delete(tree->right,item);
    else
        DeleteNode(tree);
}

This is the last function used to search for a value to be deleted. We search both the left and right trees of the root. And finally if the node value is equal to the value we are searching for, we call the second function. Let’s see how this function is called from within the class context.

void Tree<ItemType>::DeleteItem(ItemType item){
    Delete(root,item);
}

Let’s move to the next operation we want to implement, the retrieve item one. It’s designed to check whether a value exists in a tree or not (search the tree for a value). We follow the same scenario, in which we first create a helper function to traverse the tree and check whether the item is found or not. This function definition is as follows

void Retrieve(Node<ItemType>* root,ItemType &item,bool &found){
    if(root == NULL)
        found = false;
    else if(item < root->info)
        return Retrieve(root->left,item,found);
    else if(item > root->info)
        return Retrieve(root->right,item,found);
    else
        found = true;
}

The function also takes the tree root and the item to search for as a parameter, the boolean parameter will then indicate whether it’s found or not (passed-by-reference). First, we check whether the pointer to the tree is null, in this case the item is not found. If it’s not null, we compare the given item with the tree node value. If the item is smaller, we search the left branch of the tree. If it’s greater, we search the right branch of the tree. The item is found if the item is equal to the tree node value.
We use this helper function in the class Tree as follows

void Tree<ItemType>::RetrieveItem(ItemType &item,bool &found){
    Retrieve(root,item,found);
}

The de-constructor of the class is to release all memory spaces occupied by the pointers we created. So, it’s not simply making the tree root equal to null. To traverse a tree and release (that is delete) nodes of a tree, we need a helper function to do the task. The function definition is as follows

void Destroy(Node<ItemType>*& tree){
    if(tree != NULL)
    {
        Destroy(tree->left);
        Destroy(tree->right);
        delete tree;
    }
}

We first check whether the tree pointer is not null. After that, we call the function recursively to destroy both left and right branches. Then, we delete the tree node itself. We call this function within the class context as follows

Tree<ItemType>::~Tree(){
    Destroy(root);
}

You can download the full Binary Search Tree implementation here. The file contains some other functions the we haven’t discussed here. The most important one is PrintTree which you will use to print all the elements of a tree. Try thinking about how it is functioning.

Test Driver

The test driver is written to monitor the operation of the tree, by inserting some elements printing them and searching for any value. I will leave the test driver for you as it is very straightforward. Comment below for any help.

Conclusion

By now, we have covered the idea behind Binary Search Trees. We use trees in many applications that requires fast searching techniques. Also, trees are used in indexing. Try searching for other applications of search trees as well as trees that are not binary.

The next post will discuss some advanced tips in Lists. Stay tuned!

Your feedback is very important to me. Don’t hesitate to comment for any help :).

4 thoughts on “7.0 Data Structures – Binary Search Trees

leave your feedback

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s