Balanced Binary Search Trees

Analyzing the binary search tree

•      Fact: the running time of the fundamental operations (insert, delete, find) is O(height).

•      If we assume that all insertions and deletions are equally likely, it can be proved that the average depth of all nodes is O(log n)

•      This does not necessarily mean that the running time is O(log n),

–    because deletions always replace a deleted node with a node from the right subtree, making the left subtree deeper than the right.

•      One proposed fix is to alternate between replacing a deleted node with the largest element in its left subtree and the smallest element in its right subtree, but nobody has proved that this actually improves the average time.

 

Problems with Binary Search Trees

•      Analysis has shown that the find, insert, and erase methods of a BST are quite efficient on average

–    But, the tree can become badly unbalanced

–    This leads to a worst case time of O(n) for searching

–    This is no better than a linear search through an unsorted array or linked list

•      If we could keep the tree balanced (automatically) the search time would always be O(log n)

•      AVL trees and red-black trees are BSTs in which the height is always logarithmic in n; splay trees do not bound the height, but guarantee O(log n) amortized time per operation
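The worst case described above is easy to reproduce. The following is a minimal sketch, assuming a plain unbalanced BST; the names Node, insert, height, and build are illustrative and not taken from the slides. Inserting keys in sorted order produces a chain, while the same keys in a friendlier order produce a short tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) BST insert, written iteratively; duplicates are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
        else:
            break
    return root


def height(node):
    """Height counted in edges; the empty tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_tree = build(range(1, 16))  # keys arrive in increasing order
mixed_tree = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])
print(height(sorted_tree), height(mixed_tree))  # 14 3
```

Both trees hold the same 15 keys, but a find in the first may visit all 15 nodes, while a find in the second visits at most 4.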

 

How to balance a tree

•      What do you think should be the definition of a balanced tree?

•      A balance condition must be easy to maintain, and ensure the depth of the tree is O(log n)

•      Ways to balance a tree

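–    Left and right subtrees of the root must have the same height (too weak: each subtree can still be a long chain)

–    Every node must have left and right subtrees of the same height, so all of the leaves are at the same depth (too strict: only possible when n is one less than a power of 2)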

•      The requirements for AVL balancing and red-black ordering are different.

 

Balancing algorithms

•      There are quite a few general algorithms to keep a BST balanced

•      Most are quite a bit more complicated than the basic BST

•      All of them take longer on average to update a tree

–    Updates are inserts and deletions

•      One of the oldest forms of balanced search trees is the AVL tree

–    AVL stands for Adelson-Velskii and Landis, the developers of the algorithm.

•      A popular alternative to the AVL tree is the red-black tree

–    A nonrecursive implementation for insertion can be done relatively effortlessly compared with AVL trees.

 

AVL trees

•      An AVL tree is a binary search tree that is either empty or has the following two properties

–   1) the heights of the left and right subtrees differ by at most 1

–   2) the left and right subtrees are AVL trees

•      The BST ordering property must be maintained

•      As data elements are added, rotations are necessary to keep the tree balanced
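The recursive definition above translates almost directly into a checker. This is a sketch under stated assumptions: Node and check_avl are hypothetical names, height is counted in edges with the empty tree at -1, and the function tests only the balance condition, not the BST ordering.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def check_avl(node):
    """Return (balanced, height) for the subtree rooted at node."""
    if node is None:
        return True, -1  # the empty tree is an AVL tree
    left_ok, left_h = check_avl(node.left)
    right_ok, right_h = check_avl(node.right)
    # Property 1: heights differ by at most 1.
    # Property 2: both subtrees are themselves AVL trees.
    ok = left_ok and right_ok and abs(left_h - right_h) <= 1
    return ok, 1 + max(left_h, right_h)


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
print(check_avl(balanced))  # (True, 1)
print(check_avl(chain))     # (False, 2)
```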

 

Rotations

•      The basic mechanism that keeps a binary search tree balanced is the rotation

–   This is an adjustment to the tree, around a node, that maintains the required ordering

•      Single rotations – if an insertion causes a tree or subtree to lose the balance property, a rotation should be done at that node

 

Balancing the tree

•      Build this BST
80  60  90  85  120  now add 100

•      Find the root of the lowest subtree that is unbalanced

–    Remember, a node is unbalanced when the heights of its left and right subtrees differ by more than 1

•      In this case, we will rotate the 90 up and to the left (it becomes the new root)

•      80 (the node rotated around) becomes the new left child of the new root

•      This leaves the former left child of the new root (85) stranded

–    85 becomes the right child of 80
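The first step of this example, finding the root of the lowest unbalanced subtree, can be sketched as follows. The helper names (Node, height, lowest_unbalanced) are assumptions of this sketch, and it recomputes heights on every call, so it is meant for checking small examples by hand rather than for use inside a real insert.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def height(node):
    """Height counted in edges; the empty tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def lowest_unbalanced(node):
    """Return an unbalanced node that has no unbalanced descendants, or None."""
    if node is None:
        return None
    # Search the children first, so a violation lower in the tree is
    # reported in preference to one at this node.
    found = lowest_unbalanced(node.left) or lowest_unbalanced(node.right)
    if found is not None:
        return found
    if abs(height(node.left) - height(node.right)) > 1:
        return node
    return None


# The tree built from 80 60 90 85 120, before and after adding 100.
before = Node(80, Node(60), Node(90, Node(85), Node(120)))
after = Node(80, Node(60), Node(90, Node(85), Node(120, Node(100))))
print(lowest_unbalanced(before))     # None
print(lowest_unbalanced(after).key)  # 80
```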

 

Single rotations

•      Left rotation around x

–    x is moved to where its left child is, and x’s right child is moved to where x was

–    The left subtree of x’s right child becomes the right subtree of x

•      A right rotation is the mirror image: swap left and right in the description above

•      Build this BST
57  30  75  20  40  60  83  78  88  92

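–    The inner subtree of the node going up must change its parent (here the tree is unbalanced at 75; rotating left around 75 moves 83 up, and 78 moves from 83 to 75)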

–    The outer subtrees keep the same parent and are unchanged
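Both single rotations can be sketched in a few lines. The code below is one possible rendering, assuming nodes without parent pointers, so each function returns the node that takes the place of x and the caller re-links it. The names Node, rotate_left, rotate_right, and inorder are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_left(x):
    """Left rotation around x; returns the node that takes x's place."""
    y = x.right       # x's right child moves up
    x.right = y.left  # the inner subtree of y changes parent from y to x
    y.left = x        # x moves down to become y's left child
    return y


def rotate_right(x):
    """Right rotation around x: the mirror image of rotate_left."""
    y = x.left
    x.left = y.right
    y.right = x
    return y


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# The tree built from 57 30 75 20 40 60 83 78 88 92 (unbalanced at 75).
root = Node(57,
            Node(30, Node(20), Node(40)),
            Node(75, Node(60), Node(83, Node(78), Node(88, None, Node(92)))))
before = inorder(root)
root.right = rotate_left(root.right)  # the caller re-links the parent, 57
after = inorder(root)
print(root.right.key, root.right.left.key, root.right.left.right.key)  # 83 75 78
print(before == after)  # True
```

The unchanged in-order sequence is a quick way to confirm that the rotation kept the BST ordering.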

 

AVL balancing

•      If an insert violates the balancing property, the insert is not complete until the property is restored

•      Rotations should always be done at the lowest node where the balancing is violated.

•      Only nodes on the path from insertion point to the root might have to be rebalanced.  No other nodes are affected.

 

One more example

•      Build this BST
57 30 75 20 40  10

 

Look at this binary search tree

•      Build an AVL tree with this input
90  100  50   30   70   80

•      How would you use right and left rotations to balance this tree?

•      Try a right rotation around 90

–    No improvement

•      Try a left rotation around 50

–    Looks like no improvement

•      But now do a right rotation around 90

•      This is an example of a case where a single rotation will not balance the tree

•      It requires a double rotation.

 

Double rotations

•      There are some situations where a single rotation will not improve the balance of the tree

•      In this case, a double rotation must be done.

•      A double rotation is one of two mirror-image sequences of single rotations

–   Left rotation around the left child of an item, followed by a right rotation around the item itself

–   Right rotation around the right child of an item, followed by a left rotation around the item itself
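Each double rotation is just two single rotations applied in sequence. A minimal sketch follows, assuming the same return-the-new-root convention for rotations; the names rotate_left_right and rotate_right_left are hypothetical labels for the two sequences listed above. The example is the tree built from 90 100 50 30 70 80.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y


def rotate_left_right(x):
    """Left rotation around x's left child, then right rotation around x."""
    x.left = rotate_left(x.left)
    return rotate_right(x)


def rotate_right_left(x):
    """Right rotation around x's right child, then left rotation around x."""
    x.right = rotate_right(x.right)
    return rotate_left(x)


def preorder(node):
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


root = Node(90, Node(50, Node(30), Node(70, None, Node(80))), Node(100))
root = rotate_left_right(root)
print(preorder(root))  # [70, 50, 30, 90, 80, 100]
```

The result has 70 at the root with subtrees 50 (child 30) and 90 (children 80 and 100), which matches the outcome of the two manual rotations around 50 and then 90.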

 

 

The need for double rotations

•      50   10  80  70  90  75

•      If you first try a left rotation around 50, nothing is gained

•      You must make the longer subtree of 80 the right or “outside” subtree

–   So rotate right around 80

–   Now rotate left around 50

 

When to do a double rotation

•      If the inserted value that unbalances the tree lies between the values of the two nodes where the rotation should take place (the unbalanced node and its child on the taller side), a double rotation is required

•      Look again at this tree: 50   10  80  70  90  75

–   The rotation needs to be done between the 50 and 80 nodes

–   The insertion that forces the rotation is 75, a value between 50 and 80

–   This indicates a double rotation is necessary
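The rule above reduces to a single comparison. The function below is a sketch of that test only; rotation_kind and its parameter names are assumptions of this example, and child_key is taken to be the child of the unbalanced node on the side where the insertion happened.

```python
def rotation_kind(unbalanced_key, child_key, inserted_key):
    """Classify the fix needed after an insertion unbalances a node."""
    low, high = sorted((unbalanced_key, child_key))
    # A value strictly between the two nodes landed in the inner subtree.
    return "double" if low < inserted_key < high else "single"


print(rotation_kind(50, 80, 75))   # double: 75 lies between 50 and 80
print(rotation_kind(90, 50, 80))   # double: 80 lies between 50 and 90
print(rotation_kind(80, 90, 100))  # single: 100 is outside 80..90
print(rotation_kind(57, 30, 10))   # single: 10 is outside 30..57
```

The four calls correspond to the four example trees used earlier in these notes.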

 

Build this AVL tree

•      25, 16, 45, 12, 20, 22, 8

 

Major features of rotations

•      There are four kinds of rotations

–    Left rotation

–    Right rotation

–    Left rotation around the left child of an item, followed by a right rotation around the item itself

–    Right rotation around the right child of an item, followed by a left rotation around the item itself

•      Nodes not in the subtree of the item rotated about are unaffected by the rotation

•      A rotation takes constant time (not dependent on the size of the tree)

•      Before and after a rotation, the tree is still a BST
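Putting the four kinds of rotation together, an AVL insert might be sketched as below. This is one common recursive formulation, not necessarily the one intended by these notes: storing a height field in every node, and the helper names h, update, and rebalance, are assumptions of the sketch. Rebalancing happens on the way back up the recursion, so only nodes on the insertion path are examined.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # height in edges of the subtree rooted here


def h(node):
    return -1 if node is None else node.height


def update(node):
    node.height = 1 + max(h(node.left), h(node.right))


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)  # x is now below y, so fix its height first
    update(y)
    return y


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y


def rebalance(node):
    update(node)
    balance = h(node.left) - h(node.right)
    if balance > 1:  # left subtree too tall
        if h(node.left.left) < h(node.left.right):  # inner side is taller
            node.left = rotate_left(node.left)      # first half of a double rotation
        return rotate_right(node)
    if balance < -1:  # right subtree too tall
        if h(node.right.right) < h(node.right.left):
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def insert(node, key):
    """Insert key and return the (possibly new) root of this subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicates are ignored in this sketch
    return rebalance(node)


def preorder(node):
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


root = None
for key in [50, 10, 80, 70, 90, 75]:
    root = insert(root, key)
print(preorder(root))  # [70, 50, 10, 80, 75, 90]
```

Inserting 75 triggers the double rotation worked through earlier: a right rotation around 80 followed by a left rotation around 50.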

 

The Height of an AVL tree

•      It can be shown that the height of an AVL tree is at most about 1.44 log n (logarithm base 2), only slightly more than log n

•      Since the height of the tree is O(log n), the find function runs in O(log n)

•      Since rotations take constant time, insert and delete are also O(log n)

•      Even though this is the same time complexity as the average case of an ordinary BST, individual updates are slower because of the overhead of balancing.

–    These are the constants and multiplicative factors that are dropped when figuring Big O
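The logarithmic height follows from counting the fewest nodes an AVL tree of a given height can have. The sparsest tree of height h has one subtree of height h - 1 and one of height h - 2, which gives the recurrence N(h) = N(h-1) + N(h-2) + 1. The sketch below tabulates it and compares h with the commonly quoted bound of about 1.44 log2(n + 2); the function name min_nodes, the convention that height is counted in edges, and the exact form of the bound are assumptions not stated in the notes above.

```python
import math


def min_nodes(height):
    """Fewest nodes in any AVL tree of the given height (counted in edges)."""
    a, b = 1, 2  # N(0) = 1, N(1) = 2
    for _ in range(height):
        a, b = b, a + b + 1  # N(h) = N(h-1) + N(h-2) + 1
    return a


for height in range(0, 21, 5):
    n = min_nodes(height)
    bound = 1.44 * math.log2(n + 2)
    print(height, n, round(bound, 2))
```

Because the minimum node count grows exponentially with the height, the height can only grow logarithmically with the number of nodes.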

 

Build this tree, balancing every time it is necessary

•      4  2  6  1  3  5  7  16  15  14  13  12  8  10

 

•      Double rotations

–    Left rotation around the left child of an item, followed by a right rotation around the item itself

–    Right rotation around the right child of an item, followed by a left rotation around the item itself