next up previous
Next: About this document ...

CSCI-1200 Computer Science II
Summer, 2002
Worksheet 19
The Standard Template Library

Reading: Deitel & Deitel, Chapter 20

Right now you are probably asking yourself ``If there are certain container classes like lists, queues, stacks, and binary trees that everyone uses, why not just code them up correctly once and put them in a library so that they won't need to be rewritten each time they are needed?'' The answer is ``They have.'' There have been a number of attempts to develop such a library, but now there is a standard which has been adopted by the ANSI/ISO standards committee which which is in the process of being made available for all of the standard C++ compilers along with other standard libraries like iostream.h or math.h. This is called the Standard Template Library or STL, and much of the code for it was developed right here at Rensselaer by Professor Musser and some of his students.

There are some very important facts about STL that make it different from previous software libraries and from code that appears in textbooks:

With most programming languages, it would not be possible to provide a library designed like STL. STL makes extensive use of template classes and template functions--hence the name, Standard Template Library.1Templates or generic units are crucial to achieving generality without sacrificing efficiency, because they allow programs to be specialized for particular tasks at compile time, rather than at run time as is necessary in C or most other older languages.

The STL library consists of a set of standard container classes with member functions to perform all of the important operations on them, and a set of generic algorithms such as sorting which can be done on these container classes. Each of the classes has its own header file which must be included. Most of the generic algorithms are defined in the header file algorithm.

Introduction to the STL vector container

Two of the most useful of the STL container classes are vector, which provides operations similar to those of arrays, but without the limitations of arrays; and list, which provides many useful operations on linked lists. In the next worksheet we'll look at the list container, but let's start with vectors, which are both simpler and more familiar due to their resemblance to arrays.

One of the main advantages of an STL vector over an array is automatic expansion. Although a vector has a finite size at any given time, it can be expanded as needed. In fact the expansion happens automatically: you don't need to worry about allocating new storage and copying elements into it, because that's taken care of by code internal to the class.

Here is a short example of the vector class.

     0  #include <iostream>
     1  #include <vector> 
     2  using namespace std;
     3  void print_vec(vector<int> &vec)
     4  {
     5      vector<int>::iterator i;
     6      cout << " The vector is ";
     7      for (i = vec.begin(); i != vec.end(); i++)
     8          cout << *i << ' ';
     9      cout << endl;
    10  }
    11    
    12  int main()
    13  {
    14     vector<int> v;
    15     int values[] = {21, 56, 78, 44, 17};
    16     for (int i=0;i<5;i++) 
    17         v.insert(v.end(),values[i]);
    18     print_vec(v);
    19     return 0;
    20  }

This program simply creates a vector of integers at line 14, inserts five values (each value is inserted at the end), and then prints out the values.

All container classes have an iterator type defined for them. An iterator is like a pointer in that it can be used to designate a particular element in the container. If i is an iterator, such as the vector<int>::iterator declared in line 5, *i means dereferencing i to get its location within the container, either to store into it (for example, with *i = x;) or retrieve its value (for example, as in line 8 with cout « *i). And ++i means incrementing it to point to the next location in the container, as in line 7. Iterators are more general than pointers, though, in that dereferencing and ``next'' can be given different implementations, appropriate to the data structure used in implementing the container. (As we'll see in the next worksheet, list iterators are implemented quite differently from vector iterators.)

This program uses three member functions for the class vector:

iterator insert(iterator position, const T& elt) This function inserts a new element elt into the vector at the place pointed to by the iterator position.

iterator begin() This function returns an iterator pointing to the first element in the vector.

iterator end() This function returns an iterator pointing to the location one past the last element actually stored in the vector.

Alert: A general convention in STL is that the end member functions return an iterator pointing one step past the last element in the container, not to the last element. Since this iterator doesn't actually point to an element, trying to dereference it--i.e., applying the * operator to it--will result in an error.

With this background, you should be able to read and understand this program. Pay careful attention to how the iterator is declared in the print_vector function, and how it is used to iterate through the vector from beginning to end.

The class vector is defined in the header file vector.

Exercise 1: Add a function reverse_print_vector to this program which prints the values in reverse order; i.e., starting with the last. It will be helpful for you to know that the $--$ operator is defined on a vector iterator, and it works in the usual way. Add a line to main which calls this new function.

Vector member functions and operators

Here is a list of member functions which are defined on a vector. This list is not complete.

vector() The default constructor. Takes constant time.

vector(int n, const T& value) A constructor which creates a vector of size n and puts the value value in each cell. Takes $O(n)$ time.

~vector() A destructor. Takes $O(n)$ time, where $n$ is the size of the vector at the time it is destroyed.

bool operator==(const vector<T>& x, const vector<T>& y) Returns true if the sequences of elements in x and y are the same, false otherwise. Takes linear time.

iterator begin() returns an iterator value pointing to the first element in the vector. Takes constant time.

iterator end() This function returns an iterator pointing to the cell one past the end of the vector. Takes constant time.

iterator insert(iterator position, const T& elt) This function inserts a new element elt into the vector at the place pointed to by the iterator position. The time is $O(m)$ where $m$ is the number of elements past the insertion point. See also push_back.

int size() returns the number of elements currently in the vector. Takes constant time.

bool empty() Returns true if the vector contains no elements, false otherwise. Takes constant time.

void push_back(const T& x) adds the element x at the end of the vector. Takes constant time.

void pop_back() erases the last element of the vector. Takes constant time.

void erase(iterator position) erases the element pointed to by position. The time is $O(m)$ where $m$ is the number of elements past the erasure point. See also pop_back.

T& front() returns the first element of the vector. Takes constant time.

T& back() returns the last element of the vector, the element pointed to by the iterator (end()-1). Takes constant time.

Note that erase actually removes the element from the vector so that the size is reduced by one, and insert adds a new element into the vector so that the size is increased by one. You can also change the value of a particular cell in a vector by using the * operator on an iterator. Here is code to change the value of the first element in a vector to 7.

    ...
    vector<int> vec;
    vector<int>::iterator i;
    ... some code to put at least 8 elements in vec
    i = vec.begin();
    *i = 7;
You can also use square brackets and an integer index on a vector as you would with an array. Here is are examples, continuing the above example:
    vec[7] = 23;
    int x = vec[7];
Note that these dereferencing and indexing operations will only work successfully if the size of the vector is at least 8. Otherwise, the results are indeterminate.

Random access iterators

The following statement prints the value of the element 700 past where the iterator i is pointing:
cout << *(i+700);

This works, and works efficiently, because vector iterators, like array pointers, are ``random access'' iterators. Roughly this means that they can be moved around large distances in the container in constant time. The notation used for these ``big hop'' operations is just what it is for array pointers: i += $k$ or i -= $k$, where $k$ is any integer (not so large that it takes us outside the bounds of the container). Operations i + $k$ and i - $k$ must also defined; they return the iterators that would be $k$ elements away but don't change i (again, just as for array pointers). And all of these operations must take only constant time; i.e., the time they take is not allowed to depend on how big $k$ is. For example, i += 10000 must be just as fast as i += 3. (Again, we're assuming that we stay within the bounds of the container.)

Exercise 2 Write a complete program that creates a vector of type double, inserts the following elements into it
4.5, 6.7, 2.3, 9.8, 5.5, 1.2
finds an iterator which is pointing to the highest value, erases that value, and prints out the vector.

Introduction to STL's Generic Algorithms

STL defines 32 generic algorithms, which are called generic because they can operate on more than one kind of container. In fact their interfaces are not directly in terms of containers but in terms of iterators. So far we've only mentioned random access iterators, but STL defines several other categories of iterators that are not as powerful as random access iterators but for which many useful generic algorithms are still possible: input, output, forward, and bidirectional iterators. See Deitel & Deitel 20.1.2 for their definitions. Bidirectional iterators, for example, which are provided by STL lists, have constant time ++ and - operations defined on them but not +=, -=, +, or -.

Here are some examples of generic algorithms. Note that in their description we include performance guarantees and iterator requirements (both of which are omitted in Deitel & Deitel).

template <class iterator, class T>
iterator find(iterator first, iterator last, const T& value)
This function returns an iterator which points to the first instance of value as an element in the container. Takes $O(n)$ time where $n$ is the distance from first to the position where the element is found, or to last if the element isn't in the range. Input iterators are required.

template <class iterator>
void sort(iterator first, iterator last)
This function sorts the elements between (and including) first and last (not including last. Why?). Takes $O(n \log n)$ time where $n$ is the distance from first to last. Random access iterators are required.

template <class iterator, class T>
iterator lower_bound(iterator first, iterator last, const T& value)
This function returns an iterator which points to the first position in the sorted range from first to last which is not less than value. Note that this is the first position into which value may be inserted while keeping the range sorted. Takes $O(\log n)$ time where $n$ is the distance from first to last. Random access iterators are required.

template <class iterator, class T>
iterator upper_bound(iterator first, iterator last, const T& value)
This function returns an iterator which points to the first position in a sorted range which is greater than value. Note that this is the last position into which value may be inserted while keeping the range sorted. Takes $O(\log n)$ time where $n$ is the distance from first to last. Random access iterators are required.

template <class iterator, class T>
void replace(iterator first, iterator last, const T& old, const T& new)
replaces all instances of old with new in the range between first and last. Takes $O(n)$ time where $n$ is the distance from first to last. Forward iterators are required.

To use any of the generic algorithms, you must include the algorithm header file.

Here is a sample program which applies several generic algorithms to a vector. It creates a vector of points, inserts some values in it, sorts them, displays the sorted values, and then prompts the user to enter a value of x. If the value entered is between the lowest and highest values in the array, the program uses a simple linear interpolation function to estimate the corresponding value of y. This continues until the user enters 0.

#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;

class point {
public:
    double x, y;
    point() {x = y = 0.0;}
    point (double xx, double yy) {x = xx; y = yy;}
};

bool operator==(const point &p, const point &q) {return p.x == q.x;}
bool operator<(const point &p, const point &q) {return p.x < q.x;}

//**** prototypes of functions used in main program
void get(vector<point>&);
void print(vector<point>&);
void read_and_interpolate(vector<point>&);

int main()
{
    vector<point> points;
    get(points);
    sort(points.begin(), points.end());
    cout << endl << "After sorting, the points are:" << endl;
    print(points);
    point leftmost = *(points.begin());
    point rightmost = *(points.end()-1);
    cout << "The range of x values is " << leftmost.x 
         << " to " << rightmost.x << endl;
    cout << "Now enter some points for interpolation" << endl;
    read_and_interpolate(points);
    return 0;
}

void print(vector<point>& the_points)
{
    vector<point>::iterator i;
    for (i = the_points.begin(); i != the_points.end(); ++i)
        cout << "x = " << (*i).x << " y = " << (*i).y << endl;
    cout << endl;
}

void get(vector<point>& the_points)
{
  point temp;
  
  double xvals[]={4.5, 6.7, 3.1, 8.7, 5.5, 5.1, 9.5, 2.4};
  for (int i = 0; i < 8; ++i) {
      temp.x = xvals[i];
      temp.y = temp.x * temp.x;
      the_points.push_back(temp);
  }
}

double linear_interpolate(double x, const point& p0, const point& p1)
{
  double slope = (p1.y - p0.y)/(p1.x - p0.x);
  return p0.y + slope * (x - p0.x);
}

void read_and_interpolate(vector<point>& the_points)
{
  point target;  
  vector<point>::iterator above; 
  char* prompt = " > ";
   
  cout << prompt;
  cin >> target.x;
  while (target.x != 0) {
    // Use the upper_bound function, from algorithm, to compute an 
    // iterator referring to the first point in vector the_points 
    // which has an x value greater than target.x:
    above = upper_bound(the_points.begin(), the_points.end(), target);
    // Now check to see if target.x is in range for interpolation:
    if (above == the_points.begin()) 
      cout << "Value x = " << target.x << " is too low, try again\n";
    else if (above == the_points.end()) 
      cout << "Value x = " << target.x << " is too high, try again\n";
    else {
      target.y = linear_interpolate(target.x, *(above-1), *above);
      cout << "For x = " << target.x 
           << " the interpolated y value is " << target.y << endl;
    }
    cout << prompt;
    cin >> target.x;
  }
  cout << endl;
}
This program is in /dept/cs/cs2/wksht20/interp.cpp. Compile it, run it, and study it to make sure that you know how it works. We will return to this example in the next worksheet and explore whether it would work just as well to use a list container instead of a vector.

Exercise 3: Write a program that creates a vector of Students, sorts the vector by student number (note that you will need to overload the < operator for the Student class to do this), and then asks the user to enter an upper and lower bound of student numbers. You should print out the names of all students in the vector in this range. The class Student is started for you. There is a file called Students in /dept/cs/cs2/wksht20 which contains a list of of students one per line. Your program should open this file and read the students into the vector. Notice that the upper_bound, lower_bound, and find functions all take the element type of the vector as the last argument, and this means that you have to pass in an instance of type Student as the last argument, not an integer.

class Student {
private: 
   char sname[32];
   int snum;
public:
   Student(char *name, int n) {
       strcpy(sname,name);
       snum = n;
   }
   int getnum() {return snum;}
   char *getname(){return sname;}
};

The implementation of the vector class template

As we've stressed a number of times in this worksheet, and in the course as a whole, you don't need to know how a library component is implemented in order to use it. However, it's interesting to see how some of the things we've been studying and practicing come into play in the implementations of some of the STL components. A vector is generally implemented2 as a block of contiguous memory and has three pointers: one to the start of the block, one to the end of the block, and one pointing one step past the last memory location of the block in which an element is actually stored. When the last two pointers are equal, the vector is full. The code implementing push_back and insert includes a test for this case, and, when necessary, expands the storage by doubling the size of the memory block by allocating new memory and copying the old values into this new memory, deletes the old memory, and adjusts the pointers to point to the new memory.

Since contiguous storage is used, inserting somewhere in the middle of a vector requires a loop to move each of the elements that are past the insertion point over by one, to make room for the new element. This loop requires $m$ steps where $m$ is the number of elements moved, which is why insertion is a linear time operation except when it is done at the end of the vector. A similar story holds for erasure of an element.

Since the storage is in a contiguous block, a vector iterator can be implemented simply as a C++-pointer, because pointers have all of the required operations and all operations have the random-access property (i.e., they take only constant time).

You can examine all of the STL source code by looking in the header files.




next up previous
Next: About this document ...
Paul Lalli 2002-05-17