
CSCI-1200 - Computer Science II
Summer, 2002
Worksheet 1
Data Types, Arrays

Reading: Deitel & Deitel, 1.22, 4.1 - 4.4

The definition of a data type

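All data in a computer program has a type. There are a number of fundamental data types that are defined by the C and C++ languages. Examples include integers, characters, and floating point numbers. A data type has two components: a set of values, together with their representation in memory, and a set of operations that can be performed on those values.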

Consider the type integer. On the computers in our lab, the hardware instructions and memory provide a type containing integers occupying four bytes (32 bits) of memory. The C++ declaration
int x, y;
in a program creates two such integer variables with the symbolic names x and y. Each of these represents a memory location consisting of four bytes.

A bit can have one of two values, zero or one. Since there are 32 bits allocated for each integer, there are $2^{32}$ possible values for an integer. However, typically the leftmost bit of an integer is a sign bit; a zero in this position represents zero or a positive number and a one represents a negative number.

Here are some examples of the representation of integers in memory.

Value Representation
0 00000000 00000000 00000000 00000000
1 00000000 00000000 00000000 00000001
2 00000000 00000000 00000000 00000010
3 00000000 00000000 00000000 00000011
4 00000000 00000000 00000000 00000100
5 00000000 00000000 00000000 00000101
8 00000000 00000000 00000000 00001000
10 00000000 00000000 00000000 00001010
20 00000000 00000000 00000000 00010100

The largest positive value which can be represented in an int variable on our computers is $2^{31} - 1$ which has the value $2,147,483,647$ and is represented in memory as
01111111 11111111 11111111 11111111
The smallest value is $-2^{31}$ or $-2,147,483,648$. These are not symmetrical because of the way that negative integers are represented. Appendix C of Deitel & Deitel discusses number systems and their computer representation further.

We will not go into the details of such representations in this course, other than to note a common kind of programming error: forgetting that computer representations of numeric types are limited to a finite range, so the results of some arithmetic operations may not be representable. With integer types this happens when a result is larger than the largest representable value ($2,147,483,647$ in our case) or smaller than the smallest; this is called overflow. Such numbers are much easier to generate than you might think, even with simple programs, and exactly what happens when an int overflows is not defined by the C or C++ language rules. Most implementations of the language resort to whatever the hardware instructions do, which can be different on different computers. To avoid overflow, and the uncertain results that it can cause, you must check that the numbers you are computing with remain within the bounds of the representation.

Here are some of the operations which can be performed on integers.

= assignment (copies a value)
+ addition
- subtraction
* multiplication
/ division (note that integer division truncates, for example 17 / 3 returns 5.)
% modulus or remainder. 17 % 3 returns 2, the remainder of the division operation
++ increment (adds one to the value)
-- decrement (subtracts one from the value)
> greater than
< less than
== is equal to (different from =)
+= addition assignment shortcut. Set the left operand to its current value plus the value of
  the right operand. The statements  x += 5;  and  x = x + 5;  are equivalent.
-= subtraction assignment shortcut
*= multiplication assignment shortcut
/= division assignment shortcut
%= remainder assignment shortcut

In the <iostream> header file, the following two operations are defined on integers:
cout << displays the integer on standard output (the terminal)  
cin >> reads an integer value from standard input (the keyboard)  

The ++ and -- operations require special attention. These are unary operators, which means that they take only one operand, as opposed to binary operators, which take two operands (all of the other operators are binary). These operators can either precede or follow their operands. If they precede the operand, then the value returned is the new value the operand has after the operation, while if they follow their operand, the value returned is the original value of the operand, before the operation.

This can best be explained with some examples.

#include <iostream>
using namespace std;
int main()
{
   int x,y;
   x = 7;
   y = ++x;
   cout << x << " " << y << endl; // prints 8 8
   x = 7;
   y = x++;
   cout << x << " " << y << endl; // prints 8 7
   x = 7;
   y = x--;
   cout << x << " " << y << endl; // prints 6 7  
   x = 7;
   y = ++x * 2;
   cout << x << " " << y << endl; // prints 8 16
   x = 7;
   y = x++ * 2;   
   cout << x << " " << y << endl; // prints 8 14
   return 0;
}
You cannot assume that an integer will be four bytes long on all computers. On older PCs, an integer is 2 bytes or 16 bits, which means that the largest value that it can hold is $2^{15} -1$ or 32,767. There are two other declarations of integers which are guaranteed to allocate enough space. A long int is guaranteed to be at least 32 bits and a short int is guaranteed to be at least 16 bits on all computers. These are generally shortened to long and short. For example:
int main()
{
    long x;
    short y;
    ...
If you are a little confused by the discussion of how integers are represented in a computer, don't worry about it because you do not need to know all of the details about how a data type represents its values in order to use that type. The important thing is to learn the meaning of the various operations on the type (and enough about the internal representation to know when it might impact the correctness of your program, such as the integer overflow issue already discussed). This concept, that a data type is defined by its operations, is an important one (often called the data abstraction principle) and we will return to it repeatedly throughout this course.

Exercise 1: What would the following program print? (The answers are at the end of this worksheet.)

#include <iostream>
using namespace std;
int main()
{
   int i,j,k;
   i = 5;
   j = i++;
   cout << i << " " << j << endl;
   i = 5;
   j = ++i;
   cout << i << " " << j << endl;
   i = 17;
   j = 7;
   k = i % j;
   cout << k << endl;
   i = 17;
   j = 7;
   j += i;
   cout << i << " " << j << endl;
   i = 17;
   j = 7;
   k = i / j;
   cout << k << endl;
   i = 17;
   j = 7;
   i %= j;
   cout << i << endl;
   i = 17;
   j = 7;
   i -= j + 2;
   cout << i << endl;
   return 0;
}
Real Numbers

Just as integers come in two sizes (long and short), real numbers (often called floating point numbers) come in two sizes: float (32 bits) and double (64 bits). The representation of these values in memory is beyond the scope of this course. Remember, though, that you do not need to know how values are represented in memory in order to use a data type.

All of the operations listed above that can be performed on integers can also be performed on variables of type float and double except % (modulus or remainder) (Why not?).

C and C++ will often perform implicit conversion (called type coercion) between integers and real numbers. For example, you can add a variable of type float to a variable of type int and get the correct answer. The value returned will be of type float. Likewise, comparison operators like >, < or == can be used with two operands of different types and will behave correctly.

You can also explicitly change a variable's type within an arithmetic expression by casting. This is done by putting the new type in parentheses before the variable name. For example, if you want to divide two integers to get a floating-point result, you must cast at least one of the integers to a float. Here is an example:

int x, y;
x = 17;
y = 3;
cout << x/y << endl;          // this prints 5
cout << (double)x/y << endl;   // this prints 5.66667
cout << (double)(x/y) << endl; // this prints 5 (Why?)
You must be careful when writing assignment statements using mixed mode arithmetic. All expressions on the right hand side of the assignment operator are evaluated before the assignment (and possible implicit type coercion) occurs. Some examples:

double f;
f = 9/10;    // integer division yields 0, which is then coerced into double 0.0
int x;
x = .75 + .75; // right hand side is 1.5, which is truncated to integer 1
x = 1/2 + 1/2; // each of the "1/2" expressions is 0, so x is assigned 0
Precedence and Associativity

All operations have a precedence and an associativity. Precedence refers to the order in which operations are done. For example, multiplication has higher precedence than addition. This means that in an arithmetic expression, the multiplication operation is done before the addition operation. For example:

int x=7, y=5, z;
z = x + y * 2; 
// z now has the value 17, not 24
Associativity refers to the order in which two operations of the same precedence are done. +, -, *, /, and % are all left associative, which means that the left operation is done first. For example, the expression 12 - 4 - 3 produces 5 because the left subtraction is done before the right subtraction. If the right subtraction were done first, the answer would have been 11.

An example of an operation which is right associative is the assignment operator. One sometimes sees statements like the following in a program:
x = y = z = 0;
The rightmost assignment (z = 0) is done first, and this returns the value zero, which is then assigned to y, and this operation also returns 0, which is assigned to x. Precedence and associativity can be changed with parentheses. For example:

int x,y,z;
x = 7;
y = 5;
z = (x + y) * 2; 
// z now has the value 24
z = 12 - (4 - 3);
// z now has the value 11
A complete table of operator precedence and associativity is in appendix A of the Deitel textbook.

Exercise 2: Write C++ statements that would compute the following equations:

\begin{displaymath}
a = b(c-1) \qquad\qquad a = \frac{b + c / 2}{d - e}
\end{displaymath}

The Boolean Data Type

The bool data type is one of the newest additions to the C++ language. Its size differs from compiler to compiler. It holds one of two values: true (which is equivalent to a non-zero integer value) or false (an integer zero).

You can assign the value true or false to a boolean variable, but boolean expressions, which resolve to either true or false, are used mainly in predicates, such as an if-expression or while-expression. They can be very useful in situations such as improving the readability of a complicated predicate. As an example, the following code fragments are equivalent:

if ( (ht > 68 && ht < 72) && (wt > 165 && wt < 200) )
  cout << "You are a size L" << endl;


bool ht_ok, wt_ok;
ht_ok = ht > 68 && ht < 72;
wt_ok = wt > 165 && wt < 200;
if (ht_ok && wt_ok)
  cout << "You are a size L" << endl;

The Character Data Type

The char data type takes up one byte (8 bits) so there are $2^8$ or 256 possible values. The values 0 .. 127 are ASCII (American Standard Code for Information Interchange) codes. ASCII is a mapping between the value in a byte and a character. Values in the range 32 .. 126 represent printable characters.

ASCII Codes for printable characters
 32       33  !    34  "    35  #    36  $    37  %    38  &    39  '   
 40  (    41  )    42  *    43  +    44  ,    45  -    46  .    47  /   
 48  0    49  1    50  2    51  3    52  4    53  5    54  6    55  7   
 56  8    57  9    58  :    59  ;    60  <    61  =    62  >    63  ?   
 64  @    65  A    66  B    67  C    68  D    69  E    70  F    71  G   
 72  H    73  I    74  J    75  K    76  L    77  M    78  N    79  O   
 80  P    81  Q    82  R    83  S    84  T    85  U    86  V    87  W   
 88  X    89  Y    90  Z    91  [    92  \    93  ]    94  ^    95  _   
 96  `    97  a    98  b    99  c   100  d   101  e   102  f   103  g   
104  h   105  i   106  j   107  k   108  l   109  m   110  n   111  o   
112  p   113  q   114  r   115  s   116  t   117  u   118  v   119  w   
120  x   121  y   122  z   123  {   124  |   125  }   126  ~
Notice that the upper case letters (A .. Z) are consecutive (65 .. 90), the lower case letters (a .. z) are consecutive (97 .. 122) and the digits (0 .. 9) are consecutive (48 .. 57). Note also that ASCII 32 is the space character.

The ASCII values below 32 are used for various non-printable characters, only a few of which are of interest to us. ASCII 10 is the newline, ASCII 13 is the carriage return, and ASCII 9 is the tab character.

Character constants are represented in single quotes. Here is an example:

    char c1, c2;
    c1 = 'A';  // sets the value of the variable c1 to 65
    c2 = '$';  // sets the value of the variable c2 to 36
    ...

It is easy to get confused between the digits 0 .. 9 which are simply characters, and the numeric values 0 .. 9. The following statement sets the value of c3 to 51, the ASCII code for the character 3

char c3;
c3 = '3';

The newline character is represented like this '\n' and the tab character is represented like this '\t'. The single quote character (ASCII 39) is represented as '\'' and the double quote character (ASCII 34) is represented as '\"'.

C and C++ are weakly typed languages, which means that for certain purposes, some types can be used as other types. For instance, variables of type char can often be treated like integers:

    char c4, c5;
    c4 = 89; // sets the value of c4 to 'Y'
    c5 = 51;  // sets the value of c5 to '3'
    cout << c4 << c5 << endl;  // prints Y3
    c4++;
    cout << c4 << endl; // prints Z
    c4++;
    cout << c4 << endl; // prints [
    c4 += 10;
    cout << c4 << endl; // prints e
The << operator used with cout checks the type of each variable and "knows" how to display each type. If the type is an integer, it displays an integer value; if the type is a char, it displays the character that the value represents. To display the ASCII code of a char variable, you can cast it to an integer. For example:
char c6;
c6 = '?';
cout << (int) c6 << endl; // prints 63

The same is true of the cin statement and its associated operator >>.

Exercise 3: Write a program that displays the printable ASCII codes and the characters that they represent. Your output should start out looking something like this:

32   
33   !
34   "
35   #
36   $

Exercise 4: What would the following program print?

#include <iostream>
using namespace std;
int main()
{
    char b, c;
    int x;
    b = 'a';
    c = 32;
    cout << b << c << 'b' << c << (int) b << endl;
    b += 25;
    cout << b << c << (int) c << c << (int) b << endl;
    b++;
    cout << b << endl;
    x = 65;
    cout << x << c << (char)x << endl;
    return 0;
}

Arrays

An array is a set of elements of the same type in contiguous memory. You can declare an array of any predefined type. In C and C++, arrays are declared by appending the size of the array in square brackets to the name of the variable. Here are some examples:
int x[10]; // an array of ten integers
double q[43]; // an array of 43 real double precision numbers
char c[128]; // an array of 128 characters
short s[12]; // an array of 12 16-bit integers

You can do anything with an element of an array that you can do with a simple variable of that type. To access a particular element in one of these arrays, you must put a subscript in square brackets following the array name. Here are some examples:
x[3] = 17;
q[10] = 3.1416;
cout << x[3];
int y;
y = x[3];

Alert: The subscript for an array always ranges from 0 to $n-1$ where $n$ is the size of the array. For example, the valid subscripts for array x are 0 to 9. The first element is x[0], the second element is x[1] and the last element is x[9]. If you violate this, you will not get an error message, but your program will behave strangely. For example, the following statement in your program:
x[10] = 17;
would not generate an error message, but it would assign the value 17 to a memory location beyond x, which might be another variable in your program, so you would be inadvertently changing the value of some other variable in your program. $\Box$

You can also put an arithmetic expression in the square brackets. However, you must make sure that it evaluates to an integer in the range 0 to $n-1$. The following are all valid statements:
int x[10];
int y = 2;
x[5] = 6;
x[y] = x[5];
x[y+2] = 3;
x[x[2]] = 4;
for(y=7;y<10;y++) x[y]=y;

After executing these statements, the array x has the following values (U means undefined):


Value      U  U  6  U  3  6  4  7  8  9
Subscript  0  1  2  3  4  5  6  7  8  9

Arrays of characters are so common that they have a special name, strings, and there are some special conventions that apply to them.

If a string is to be initialized with a string constant, it is not necessary to put the size inside the square brackets; the compiler will automatically compute the correct size. Here is an example:
char president[] = "Bill Clinton";
Individual letters in this string can be accessed in the usual way. For example:
cout << president[5];
prints the character C on the screen.

Alert: String constants are quite different than character constants. Syntactically, character constants are enclosed within single quotes. Semantically, character constants represent exactly one character, while string constants can be used to represent zero or more characters. Remember: Only single characters go inside single quotes. The only exceptions are the special characters noted above, which all use the backslash \ character (\n, \', etc.) $\Box$

char ch = "x";     // Error: a string constant cannot initialize a single char
char str[] = 'x';  // Error: a character constant cannot initialize a char array
cout treats character arrays differently from other arrays. Using cout << with an array variable will print out the address of the array in hexadecimal (not usually very useful) for any array except a character array. The following statements
   int x[10];
   // ... code here to assign values to elements of x 
   cout << x << endl;
print out something like this:
0xf7fffa50
but these statements
   char president[] = "Bill Clinton";
   cout << president << endl;
print Bill Clinton.

There is a large library of string functions in <cstring> that compare two strings, copy one string to another, and so on. All of these are based on the convention that a string is terminated with a null character, that is, a character with the ASCII value 0 (written as '\0'). This means that any array of characters should be at least one character longer than the length of the longest string that it might contain. String constants automatically append the terminal '\0'. In the preceding example, the array named president has enough room for 13 characters, allocated like so:


Character  B  i  l  l     C  l  i  n  t  o   n   \0
Subscript  0  1  2  3  4  5  6  7  8  9  10  11  12

(Subscript 4 holds the space character.)

Here is another example:

#include<iostream>
using namespace std;

int main()
{
   char s[100];       // all 100 characters are uninitialized to start
   s[0]='c';
   s[1]='a';
   s[2]='t';
   s[3]='\0';
   s[4]='x';
   cout << s << endl; // this prints cat
                     // nothing after the terminal null is printed
   return 0;
}
When a string is read into a character array using cin >>, a '\0' is automatically appended onto the end of the string that is read in. You should be sure that the array is large enough to hold the string which is entered.

The length of a string stored in an array does not have to be the same size as the array! You are allowed to store a string in an array much larger than the length of the string. Any characters following the first null character are ignored when the array is used as a string. Consider what we promise to be the final example:

char bbb[]="Beanie babies bite";  // array bbb holds 19 characters (right?)
cout << bbb;                      // prints:  Beanie babies bite
bbb[15] = '\0';                   // replace a character with the null character
cout << bbb;                      // prints:  Beanie babies b
   // the string now has 16 characters (counting the '\0'), but the array still holds 19
bbb[15] = 'y';
cout << bbb;                      // prints:  Beanie babies byte

Exercise 5: Write a program that reads a string of characters from the keyboard and then prints out the number of times that each character appears in that string. Note that if you use cin >> s; the program will place in the array s all characters typed until the first space or until the return key is pressed. Your program should then print out the number of times that each printable character appears in the string. Hint: create an array of integers with each cell corresponding to a character (this is easy because each character has a numeric value associated with it). Set all cells in the array to zero. As each character is read in, increment the cell corresponding to that character by 1. For example if the first character was 'A', then the cell in the array which corresponds to ASCII 65 would be increased by 1.

Answers to Exercises

Exercise 1:
6 5
6 6
3
17 24
2
3
8

Exercise 2:
a = b * (c - 1);
a = (b + c / 2) / (d - e);

Exercise 4:
a b 97
z 32 122
{
65 A




Paul Lalli 2002-05-20