Dynamic Programming

Optimal Binary Search Trees

Section 3.5

Some redefinitions of BST

• The text, “Foundations of Algorithms,” defines the level, height, and depth of a tree a little differently than Carrano/Prichard

• The depth of a node is the number of edges in the path from the root to the node

– This is also the level of the node

• So the root is at level zero, instead of level one as defined in Carrano/Prichard

Optimal binary search tree

• The goal is to organize the keys in a BST so that the average time to locate a key is minimized.

• We want to optimize a search where the keys do not all have the same probability of success

• An example would be a search in one of the trees on the overhead (Fig 3.10) for a name picked at random from people in the US

– Since Tom is a more common name than Ursula, it would be assigned a greater probability

• We will consider only those cases where what we are searching for is in the tree

– So the trees have to be built knowing all the keys that may be searched for, as well as the probability for each key

– Searching for a key that is not in the tree is a slightly different problem.

Search Time

• The number of comparisons done by a procedure to locate a key is called the search time.

• The goal is to determine a tree for which the average search time is minimal.

– We must know the probability that each key is the one being searched for

– The probability will always be between 0 and 1

– The sum of the probabilities of all the keys will be 1

• The search time for a given key is depth(key) + 1 where depth(key) is the depth of the node in the tree

Average Search Time

• Let k₁, k₂, k₃, …k_n be the keys in a BST

• Let p_i be the probability that k_i is the search key

– Each key has an associated probability that it is the key being searched for

• If c_i is the number of comparisons needed to find k_i in a given tree, the average search time for that tree is $\sum_{i=1}^{n} c_i p_i$

• The number of comparisons depends on the depth of the key in the tree, which in turn depends on the shape of the tree

– So, we are not talking about balanced binary search trees
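The definitions above can be checked with a short sketch. The nested-tuple tree representation `(key, left, right)`, the function name `average_search_time`, and the sample probabilities are assumptions made for illustration; they are not taken from the text. The search time of a key is its depth plus one, with the root at depth zero.

```python
def average_search_time(tree, prob, depth=0):
    """Return the sum of c_i * p_i over all keys in the tree.

    tree is None or a tuple (key, left, right); prob maps key -> probability.
    The search time c_i of a key is its depth + 1, with the root at depth 0.
    """
    if tree is None:
        return 0.0
    key, left, right = tree
    return ((depth + 1) * prob[key]
            + average_search_time(left, prob, depth + 1)
            + average_search_time(right, prob, depth + 1))


# Hypothetical three-key tree: key 2 at the root, keys 1 and 3 as its children.
prob = {1: 0.7, 2: 0.2, 3: 0.1}
balanced = (2, (1, None, None), (3, None, None))
print(average_search_time(balanced, prob))  # approximately 1.8
```

The same function works for any tree shape, which makes it easy to compare a balanced tree against a skewed one for the same probabilities.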

A simple example

• Suppose we have three keys we want to put in an optimal BST

• The probabilities are
p₁ = 0.7 p₂ = 0.2 p₃ = 0.1

• Find the average search time, $\sum_{i=1}^{n} c_i p_i$, for the tree on the overhead (Fig 3.11). This is an in-class assignment to hand in

– Tree 1 is 3(0.7) + 2(0.2) + 1(0.1) = 2.6

• On the next overhead are four other possibilities of trees. Find the search time for each; then state which tree has the optimal average search time

– Tree 2 is 2(0.7) + 3(0.2) + 1(0.1) = 2.1

– Tree 3 is 2(0.7) + 1(0.2) + 2(0.1) = 1.8

– Tree 4 is 1(0.7) + 3(0.2) + 2(0.1) = 1.5

– Tree 5 is 1(0.7) + 2(0.2) + 3(0.1) = 1.4
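The five totals can be recomputed from the comparison counts that appear in each sum. In this sketch the names `counts` and `weighted_search_time` are assumptions chosen for illustration; the counts themselves are copied from the calculations above, since the figures are not reproduced here.

```python
p = [0.7, 0.2, 0.1]  # p1, p2, p3 from the example

# Comparisons (c1, c2, c3) needed to reach each key, read off the sums above.
counts = {
    "Tree 1": (3, 2, 1),
    "Tree 2": (2, 3, 1),
    "Tree 3": (2, 1, 2),
    "Tree 4": (1, 3, 2),
    "Tree 5": (1, 2, 3),
}


def weighted_search_time(c, p):
    """Average search time: the sum of c_i * p_i."""
    return sum(ci * pi for ci, pi in zip(c, p))


times = {name: weighted_search_time(c, p) for name, c in counts.items()}
best = min(times, key=times.get)
for name, t in times.items():
    print(f"{name}: {t:.1f}")
print("Optimal:", best)
```

Tree 5, which places the most probable key at the root, has the smallest average search time of the five.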

Create this Optimal BST

• Draw two different possible BSTs using the following data, then calculate the average search time for each tree

• Try for an optimal BST in the second tree

• The probability for each key is given in parentheses
case(.05), else(.35), end(.05), if(.15), of(.05), then(.35)

• Be sure to maintain BST ordering

How to find the optimal BST

• It is not practical to find the optimal BST by considering all possibilities, since that is at least exponential in n

• Instead, we divide the problem into smaller problems, and keep the results in a table to use when solving bigger problems

• This is the classic dynamic programming solution: solve the problem for the smallest case, save that result to solve a problem one step bigger, and so on

• We want the nodes with the lowest probability to be lowest in our tree
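To see why trying every tree is impractical: the number of distinct binary search trees on n keys is the n-th Catalan number, C(2n, n) / (n + 1). This is a standard counting result that the slides do not state. A minimal sketch, with the function name `count_bsts` chosen here for illustration:

```python
from math import comb


def count_bsts(n):
    """Number of distinct BST shapes on n distinct keys (the n-th Catalan number)."""
    return comb(2 * n, n) // (n + 1)


for n in (3, 6, 10, 20):
    print(n, count_bsts(n))
```

For n = 3 the count is 5, which matches the five trees in the simple example. For n = 20 it already exceeds six billion.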

Optimal solutions

• Suppose we have an optimal BST with Key_k at the root

– We know that the left subtree must also be optimal

• The average search time for the left subtree is A[1][k-1]

– The right subtree must also be optimal

• The average search time for the right subtree is A[k+1][n]

• Since trees are defined recursively, the root of each subtree can be treated the same way, and its subtrees must also be optimal.

– So starting from the bottom up, we find the minimum average search time for each subtree, and keep it in a table
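It helps to write out why the subtree results can be reused. With Key_k at the root, every key in either subtree sits one level deeper than it does in the subtree taken alone, so it costs one extra comparison, and the root itself costs one comparison. Writing c'_m for the number of comparisons needed inside the subtree alone (a symbol introduced here for illustration), the average search time of the whole tree is:

```latex
\begin{aligned}
\sum_{m=1}^{k-1} p_m (c'_m + 1) \;+\; p_k \;+\; \sum_{m=k+1}^{n} p_m (c'_m + 1)
  &= \underbrace{\sum_{m=1}^{k-1} p_m c'_m}_{A[1][k-1]}
   \;+\; \underbrace{\sum_{m=k+1}^{n} p_m c'_m}_{A[k+1][n]}
   \;+\; \sum_{m=1}^{n} p_m
\end{aligned}
```

The last sum does not depend on how the subtrees are arranged, so the total is smallest exactly when each subtree's own term is smallest. Minimizing over the choice of k then gives A[1][n].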

Storing the data

• The probability of each key is kept in a 1-D array

– The probability of key₁ is kept at index 1, and so on

• The minimum average search time of each subtree is kept in a 2-D array

• The data for each subtree is kept in A[i][j], with K_i being the smallest key in the subtree and K_j the largest key in the subtree

– A[i][i] = p_i

– A[i][i-1] and A[j+1][j] are defined to be 0

Example

• p₁ = 0.4 p₂ = 0.1 p₃ = 0.2 p₄ = 0.3

• Put the probabilities in the 1-D array

• Build the 2-D array on the board

– A[i][i] = p_i

– A[i][i-1] and A[j+1][j] are defined to be 0

• To find the other A[i][j] we use the formula

• $A[i][j] = \min_{i \le k \le j} \bigl( A[i][k-1] + A[k+1][j] \bigr) + \sum_{m=i}^{j} p_m$, where the sum runs over the keys in the subtree (m = i to j)

• In order to actually build the optimal BST, we have to keep track of the k that was used for the minimum average time

– This is also kept in a 2-D array
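The table-filling procedure for this example can be sketched as follows. Each A[i][j] is the minimum over k from i to j of A[i][k-1] + A[k+1][j], plus the sum of p_i through p_j, and the k that achieves the minimum is recorded in a second table. The names `optimal_bst`, `build_tree`, and `R`, the padding entry `p[0]`, and the nested-tuple tree format are assumptions made for this sketch, not the textbook's own code.

```python
def optimal_bst(p):
    """Bottom-up DP for the minimum average search time.

    p[1..n] are the key probabilities (p[0] is unused padding so that
    indices match the 1-based notation in the text).
    Returns (A, R): A[i][j] is the minimum average search time for keys
    i..j, and R[i][j] is the index k of the root that achieves it.
    """
    n = len(p) - 1
    # Rows 0..n+1 and columns 0..n, so A[i][i-1] and A[j+1][j] exist and are 0.
    A = [[0.0] * (n + 1) for _ in range(n + 2)]
    R = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 1):
        A[i][i] = p[i]
        R[i][i] = i
    for diagonal in range(1, n):              # diagonal = j - i
        for i in range(1, n - diagonal + 1):
            j = i + diagonal
            # Try every key k in i..j as the root and keep the cheapest.
            best_k = min(range(i, j + 1),
                         key=lambda k: A[i][k - 1] + A[k + 1][j])
            A[i][j] = (A[i][best_k - 1] + A[best_k + 1][j]
                       + sum(p[i:j + 1]))
            R[i][j] = best_k
    return A, R


def build_tree(R, i, j):
    """Rebuild the optimal tree for keys i..j as nested (k, left, right) tuples."""
    if i > j:
        return None
    k = R[i][j]
    return (k, build_tree(R, i, k - 1), build_tree(R, k + 1, j))


p = [0.0, 0.4, 0.1, 0.2, 0.3]   # p[1..4] from the example; p[0] is padding
A, R = optimal_bst(p)
print(round(A[1][4], 2), R[1][4])
print(build_tree(R, 1, 4))
```

For the example probabilities this gives a minimum average search time of about 1.9 with key 3 at the root. Key 1 is its left child, key 2 is the right child of key 1, and key 4 is the right child of the root.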

Analyzing the algorithm

• This algorithm is much like the algorithm for chained matrix multiplication

• So it has the same time complexity, O(n³)

• One of the reasons for presenting this algorithm is to show that a balanced BST is not necessarily the “best” BST

• If you have the necessary data to build an optimal BST, then you can speed up searches.

• What is necessary to build an optimal BST?

– Know all the keys to be searched

– Know the probability of each key
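The cubic bound stated above can be checked by counting how many candidate roots k are tried while the table is filled. The sketch below assumes the entries A[i][i] are set directly and the remaining entries are filled diagonal by diagonal; the function name `count_root_trials` is hypothetical. Under that assumption the count has the closed form n(n - 1)(n + 4) / 6, which grows as n³.

```python
def count_root_trials(n):
    """Count the candidate roots k tried when filling A[i][j] for all i < j."""
    trials = 0
    for diagonal in range(1, n):              # diagonal = j - i
        for i in range(1, n - diagonal + 1):
            j = i + diagonal
            trials += j - i + 1               # one trial per k with i <= k <= j
    return trials


for n in (4, 10, 100):
    print(n, count_root_trials(n), n * (n - 1) * (n + 4) // 6)
```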