Reading: Deitel & Deitel, Exercise 15.22
Deleting Nodes from a Binary Search Tree
Any container class has a small number of basic operations, including
inserting, deleting, and searching. The last worksheet discussed
insertion into a binary tree and searching a binary search tree for a
particular element, but did not mention the operation of deletion from
a binary search tree. Here is one of a number of possible prototypes
for this function.
bool DeleteNode(T elt)
This would return true if there was a node with the value elt
in the tree which was successfully deleted, and false if there
were no such node in the tree.
There are several ways to delete an element from a binary search tree. One trivial method which can be used in cases where deletions are rarely done is to simply add an additional field of type bool called deleted. When a node is created, deleted is set to FALSE, and if that node is deleted, it is set to TRUE. Operations such as traversal or find ignore nodes in which deleted is TRUE. This is called a logical delete, because the node is still in the tree.
Logical deletes are not practical for most purposes because they slow down the performance of insertions, searches, and other operations, and so most implementations of binary search trees use a physical delete in which the node is actually deleted from the tree and its memory returned to the heap. This is a non-trivial operation because it is important to maintain the binary search tree structure.
As with the insertion and find operation, start at the root and traverse the tree to locate the node to be deleted. However, deleting a node must be done from the parent of the node to be deleted, because it will be necessary to change the value of one of the pointers in the parent.
The difficulty in deleting a node from a binary search tree depends on how many children the node to be deleted has. There are three cases:
No Children If the node to be deleted has no children, then deleting it is trivial. Just return its memory to the heap with delete and set the pointer to it in its parent to NULL. Note that in order to delete a node, you must have a pointer to the parent of the node to be deleted.
One Child If the node to be deleted has only one child, set the pointer in the parent so that it points to this child, and return the memory to the heap. In other words, the child takes the place of its parent in the tree. This can be diagrammed like this:
Two children The algorithm for deleting a node with two children from a binary search tree is more complex. Rather than actually deleting the node, copy the data values of the rightmost node in the left subtree (or the leftmost node in the right subtree) to the node to be deleted, and then delete the node from which the data was copied. This deletion is easy since this node has either no children or only one child. Since the rightmost node of the left subtree is guaranteed to be the immediate inorder predecessor (the node which would be visited immediately before the node to be deleted in an inorder traversal), copying its value to the node to be deleted will retain the binary search tree structure. Here is an example:
See also Deitel & Deitel, Exercise 15.22, which gives a different algorithm for deletion in the case of deleting a node with two children. That algorithm does actually delete the node and relinks the leftmost node of the right subtree in its place, instead of copying the value from that node.
Destructor Functions
Our code for a Binary Search Tree does not have a destructor function. The destructor function for a tree has to return all allocated memory to the heap. This involves traversing the tree and calling delete at each node after traversing both children (which order of traversal would you use?).
Exercise 1: Write a destructor function for a binary search tree.
Using a Tree for Searching and Sorting
Since finding an element in a binary search tree is relatively fast (What is the order of the find operation for a balanced tree?), binary search trees are often used as container classes to implement the ADT set. They can also be used to sort data. The algorithm for using a binary tree to sort data is called tree sort. Here it is:
Inserting an element into a balanced binary search tree is
where
is the number of elements in the tree.
However, if the tree becomes unbalanced--as it does for example if the
nodes are inserted in an already-sorted order--then the time for
inserting the
-th element is linear in
and the total time can grow
as order
. To avoid this problem, adjustments in the tree can
be made after each insertion as necessary to restore balance; an ADT
that includes such rebalancing is called a balanced binary
search tree.
If you take more courses in computer science you
will study how the rebalancing operations can be efficiently implemented.
Breadth First Traversal
We have seen three ways to traverse a tree, inorder, preorder, and postorder. These are all instances of depth first traversal in which a path is followed all the way to its leaf before other paths are taken. An alternate approach is breadth first traversal (sometimes called level order traversal) in which all nodes at a particular level are visited before visiting nodes at a lower level. That is, the root is visited first, then all of the nodes at level 2, then all of the nodes at level 3, and so on. For the following tree, the output of breadth first traversal looks like this:
The algorithm is simple. It requires a queue of nodes.
1. Add the root to the queue
2. While the queue is not empty
a. Remove a node from the head of the queue
b. Display the data for this node
c. Add all children of this node to the queue
Exercise 2: Write a member function void BreadthFirst()
for the class BinTree which prints the element value of each node.
Other types of trees
A binary tree is subset of the general category of nary trees
in which each node can have up to
children. (A binary tree is an
n-ary tree where
is
.) The node for an
-ary tree where
is
would look like this:
template <class T>
class node {
private:
T data;
node<T> *children[6];
// etc
An
-ary tree is a special case of a general tree in which
there is no upper bound to the number of children that a node may have.
In this case, each node contains a vector or list of its children.
template <class T>
class node {
private:
T data;
vector<node<T>* > children;
// etc
Exercise 3:
Define a template class GeneralTree which has one
private member, the root node. Each node can have any number of children,
so you should use an STL vector class of children in each node.
This class should have a
constructor which takes one argument of type T, and this
will be assigned to the root. There should be the following two
public member functions:
bool InsertChild(T parent, T child)
which inserts a new node into the tree with the value child.
This node should be a child of the node with the value parent.
The function should return true if the node is successfully
inserted, and false if the node was not inserted because
there was no node in the tree with the value parent.
Since there is no particular order specified, it will be necessary to
search the entire tree to find the node with the value parent.
void PrintTree() which prints out all members of the
tree using breadth first traversal.
Note that you need to search to insert, and searching is a little tricky. At each node, call the insert helper function for each child. This should be a boolean function, returning true if it found and inserted the new node, false otherwise. If one of your calls returns true, then you can return true, otherwise go on to the next child. If you call your inserthelper function on all of the children, and this returns false in all cases (or if there are no children), return false.
There is a main to test your program in
/dept/cs/cs2/wksht21/main.cpp This references an include
file, employee.h, which defines a class employee. This is in the
same directory.
Some of the quizzes will build on this code, so make sure that you have a working implementation before you start the quiz.