Unlocking the Potential of Binary Search Trees with C# Programming

The article explores the features of binary search trees and the benefits they provide for searching and sorting data.

Denis Kondratev

Mar. 31, 23 · Analysis

Likes (4)

Comment

Save

4.5K Views

Sorting can take a lot of time when dealing with large amounts of data. It would be great if, instead of sorting the data every time, we could directly write them into memory in the correct position where they would already be sorted. This would allow us to always know in advance where to search for them, for example, starting from the center.

We would know exactly where to go; to the left, discarding half of the data on the right, or to the right, discarding half of the data on the left. This means that the number of elements on each search operation would be halved. This gives us nothing but fast logarithmic complexity.

But if we really want to insert it into the correct position, it will work quite slowly. After all, we first need to allocate new space in memory for the required number of elements. Then copy a part of the old data there, then put the new data, and then copy all remaining elements to the new locations. In other words, we get a fast search but slow insertion.

In such a situation, a linked list is more suitable.

The insertion in it really happens instantly in O(1) time without any copying. However, before we can do this, we will have to find the place for insertion by going through most of the nodes of the linked list, which will again lead us to the linear complexity of O(N).

To make these operations fast, a binary search tree was invented. It is a hierarchical data structure consisting of nodes, each of which stores data, a key-value pair, and can have a maximum of two children.

In a simple binary tree, data can be stored in any order. In a binary search tree, however, data can only be added according to special rules that allow the operations mentioned above to work quickly.

To understand these rules, let's first create a foundation for this tree and its nodes.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    private Node? _head;

    private class Node
    {
        public readonly T Value;
        private Node? _left;
        private Node? _right;

        public Node(T value)
        {
            Value = value;
        }
    }
} 
  

Insertion of a New Value

So, let's implement the insertion of a new value. First of all, if the root does not exist initially, that is, the tree is completely empty, we simply add a new node, which becomes the root. Next, if the new element being added is less than the current node we are standing on, we recursively move to the left. If it is greater than or equal to the current node, we recursively move to the right.

When we find the place where the child's parent is NULL, that is, it points to NULL, we perform the insertion.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    ...

    public void Add(T value)
    {
        if (_head is null)
        {
            _head = new Node(value);
        }
        else
        {
            _head.Insert(value);
        }
    }

    ...
}


private class Node
{
    ...
    
    public void Insert(T value)
    {
        ref var branch = ref value.CompareTo(Value) < 0 
            ? ref _left 
            : ref _right;

        if (branch is null)
        {
            branch = new Node(value);
        }
        else
        {
            branch.Insert(value);
        }
    }
} 
  

Thanks to recursion, all the elements that we traverse while searching for the correct location are stored in a stack. Once a new node is added and recursion begins to unwind, we will ascend through each ancestor node until we reach the root. This feature greatly simplifies the code in the future. Therefore, keep this in mind.

If we continue to insert new elements in this way, we will notice that they will all eventually be sorted. Small nodes will always be on the left, while larger nodes will always be on the right. Thanks to this, we always quickly find the correct path for both insertions of new elements and searching without affecting the other nodes in the tree.

So now we have search and insertion happening in O(log(N)) time. This also allows us to quickly find the minimum and maximum elements of the tree. The minimum will always be the lowest element on the left, while the maximum will be the lowest element on the right.

Therefore, in the first case, we simply always recursively descend to the left until we hit NULL. In the second case, it is similar, but we descend to the right. It is important to understand that such nodes cannot have more than one child. Otherwise, they would not be the minimum or maximum.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    ...

    public T Min()
    {
        if (_head is null)
        {
            throw new InvalidOperationException("The tree is empty.");
        }

        return _head.Min().Value;
    }
    
    public T Max()
    {
        if (_head is null)
        {
            throw new InvalidOperationException("The tree is empty.");
        }

        return _head.Max().Value;
    }

    ...
}


private class Node
{
    ...
    
    public Node Min()
    {
        return _left is null ? this : _left.Min();
    }

    public Node Max()
    {
        return _right is null ? this : _right.Max();
    }
} 
  

Item Search

We will create a simple search function to find a specific node in the tree based on its value. Due to the structure of the tree, the search process is relatively straightforward. If the current node contains the given value, we will return it. If the value being searched for is less than the value of the current node, we will recursively search in the left subtree. Conversely, if the search value is greater than the current node's value, we will look in the right subtree. If we reach an empty subtree, i.e., it points to NULL, then the node with the searched value does not exist. The algorithm is similar to the insertion algorithm, and we will implement the Contains method for the tree and the Find method for the nodes.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    ...

    public bool Contains(T value)
    {
        return _head?.Find(value) is not null;
    }

    ...
}


private class Node
{
    ...
    
    public Node? Find(T value)
    {
        var comparison = value.CompareTo(Value);

        if (comparison == 0)
        {
            return this;
        }
        
        return comparison < 0 
            ? _left?.Find(value) 
            : _right?.Find(value);
    }
} 
  

Removing Values

Now, in order for our tree not to break in case we want to delete some element from it, we need some special deletion rules. Firstly, if the element being deleted is a leaf node, then we simply replace it with NULL.

If this element had one child, then we replace the deleted node not with NULL but with that child.

So, in these two cases, deletion simply boils down to replacing the deleted node with its child, which can be either a regular existing node or NULL. Therefore, we simply check that the parent definitely does not have two children and overwrite the deleted node with one of its children, which, I repeat, can turn out to be either an existing node or NULL.

The last case is when the deleted node had two children. In this case, the node that will come in place of the deleted node must be greater than all nodes in the left subtree of the deleted node and smaller than all nodes in the right subtree of the deleted node.

Therefore, first, we need to find either the largest element in the left subtree or the smallest element in the right subtree. After we have found it, we overwrite the deleted node's data and recursively delete the node that moved to the deleted node's position. It will be deleted according to the same rules: it will either be replaced with NULL or its only child.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    ...
    
    public bool Remove(T value)
    {
        return _head?.Remove(value, out _head) ?? false;
    }

    ...
}


private class Node
{
    ...
    
    public bool Remove(T value, out Node? root)
    {
        var comparison = value.CompareTo(Value);

        if (comparison < 0)
        {
            root = this;
            return _left?.Remove(value, out _left) ?? false;
        }

        if (comparison > 0)
        {
            root = this;
            return _right?.Remove(value, out _right) ?? false;
        }

        if (_left is null || _right is null)
        {
            root = _left ?? _right;
            return true;
        }
        
        var leftMax = _left.Max();
        _left.Remove(leftMax.Value, out _left);
        Value = leftMax.Value;
        root = this;
        return true;
    }
} 
  

Tree Traversal

To check that the deletion was successful and the order of all nodes has been preserved, there is a special tree traversal method that allows us to output all nodes in ascending order. This method is called an in-order traversal. It involves recursively outputting first the left child, then the parent, and then the right child. Let's convert the tree to a regular list using an in-order traversal.

     C# 
   
 
 
   public class BinarySearchTree<T> where T : IComparable<T>
{
    ...
    
    public List<T> ToList()
    {
        var list = new List<T>();
        _head?.AddTo(list);

        return list;
    }

    ...
}


private class Node
{
    ...
    
    public void AddTo(ICollection<T> list)
    {
        _left?.AddTo(list);
        list.Add(Value);
        _right?.AddTo(list);
    }
} 
  

Now we have a simple way to output the tree to the console. Let's do that and make sure that the deletion works correctly.

     C# 
   
 
 
   var tree = new BinarySearchTree<int>();
tree.Add(50);
tree.Add(40);
tree.Add(30);
tree.Add(45);
tree.Add(35);
Print(tree.ToList());
tree.Remove(35);
tree.Remove(40);
Print(tree.ToList());

void Print(List<int> list)
{
    foreach (var value in list)
    {
        Console.Write(value);
        Console.Write(" ");
    }
    
    Console.WriteLine();
} 
  

Another type of tree traversal is called a pre-order traversal. It involves outputting first the parent, then the left child, and then the right child. This can be useful, for example, when copying a tree in memory, because we traverse the nodes in the exact same order as they were placed in the tree from top to bottom.

There are other types of binary search tree traversals, but their implementation differs little.

Conclusion

Finally, we have a data structure that can do everything quickly. Let's take a moment to think and create a tree from elements 1 to 5. If each subsequent node is always greater or always less than the previous node, then we again get an ordinary linked list with operations of complexity O(N).

Therefore, our tree turns out to be completely useless. Fortunately, people quickly realized this and came up with a more advanced tree called AVL with self-balancing, which will again allow us to achieve logarithmic complexity regardless of the incoming data.

But we will cover this type of tree and its balancing implementation in the next article Understanding AVL Trees in C#: A Guide to Self-Balancing Binary Search Trees.

Binary search tree Data structure Data (computing) Element Tree (data structure) csharp

Opinions expressed by DZone contributors are their own.

Related

Trending