Reading: Deitel & Deitel, 1.22, 4.1 - 4.4
The definition of a data type
All data in a computer program has a type. There are a number of fundamental data types that are defined by the C and C++ languages. Examples include integers, characters, and floating point numbers. A data type has two components: a set of values, and a set of operations that can be performed on those values.
Consider the type integer. On the computers in our lab,
the hardware instructions and memory provide a type containing
integers occupying four bytes (32 bits)1 of memory. The C++ declaration
int x, y;
in a program creates two such integer variables with the symbolic
names x and y. Each of these represents a memory location
consisting of four bytes.
A bit can have one of two values, zero or one. Since there are 32
bits allocated for each integer, there are $2^{32}$ possible values
for an integer. However, typically the leftmost bit of an integer is a
sign bit; a zero in this position represents a non-negative number
and a one represents a negative number.
Here are some examples of the representation of integers in memory.
| Value | Representation |
| 0 | 00000000 00000000 00000000 00000000 |
| 1 | 00000000 00000000 00000000 00000001 |
| 2 | 00000000 00000000 00000000 00000010 |
| 3 | 00000000 00000000 00000000 00000011 |
| 4 | 00000000 00000000 00000000 00000100 |
| 5 | 00000000 00000000 00000000 00000101 |
| 8 | 00000000 00000000 00000000 00001000 |
| 10 | 00000000 00000000 00000000 00001010 |
| 20 | 00000000 00000000 00000000 00010100 |
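The bit patterns in the table above can be reproduced with a short sketch. It assumes a 32-bit int (as on the lab machines) and non-negative values, and it uses std::bitset from the standard library. The function name to_bits is made up for this illustration.

```cpp
#include <bitset>
#include <string>

// Returns the 32-bit pattern of value as four space-separated bytes,
// in the same layout as the table above. Assumes int is 32 bits wide.
std::string to_bits(int value)
{
    // bitset<32> keeps the low 32 bits, most significant bit first.
    std::string bits =
        std::bitset<32>(static_cast<unsigned long>(value)).to_string();

    std::string grouped;
    for (std::string::size_type i = 0; i < bits.size(); i += 8) {
        if (i > 0) {
            grouped += ' ';          // separate the bytes with a space
        }
        grouped += bits.substr(i, 8);
    }
    return grouped;
}
```

For instance, the statement cout << to_bits(20) << endl; should display the last row of the table.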
The largest positive value which can be represented in an int
variable on our computers is $2^{31} - 1$, which has the value 2,147,483,647
and is represented in memory as
01111111 11111111 11111111 11111111
The smallest value is $-2^{31}$, or -2,147,483,648. These are not symmetrical because of
the way that negative integers are represented. Appendix C of Deitel
& Deitel discusses number systems and their computer representation
further. We will not go into the details of such representations in
this course, other than to note that a common kind of programming
error is the failure to remember that computer representations of
numeric types are typically limited to a finite range, and therefore
the results of some arithmetic operations may not be representable.
For example, with integer types this phenomenon occurs when a result
is larger than the largest representable value ($2^{31} - 1$ in our
case); this is called overflow. Such numbers are much easier
to generate than you might think, even with simple programs, and
exactly what happens when such overflows occur is not defined by C or
C++ language rules. Most implementations of the language resort to
whatever the hardware instructions do, which can be different on
different computers. To avoid overflow, and the uncertain results
that it can cause, you generally must be careful to check that the
size of the numbers you are computing with remains within the bounds
of the representation.
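One way to make such a check concrete is to test the operands before adding them. The following is only a sketch: the helper name checked_add is hypothetical, and it relies on std::numeric_limits from the standard header <limits> to find the bounds of int on whatever machine compiles it.

```cpp
#include <limits>

// Hypothetical helper: stores a + b in result and returns true only when
// the sum fits in an int; otherwise leaves result alone and returns false.
bool checked_add(int a, int b, int& result)
{
    const int largest  = std::numeric_limits<int>::max();
    const int smallest = std::numeric_limits<int>::min();

    if (b > 0 && a > largest - b)  return false;  // sum would be too large
    if (b < 0 && a < smallest - b) return false;  // sum would be too small

    result = a + b;   // safe: the sum is known to be representable
    return true;
}
```

The comparisons are arranged so that the test itself never computes a value outside the representable range.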
Here are some of the operations which can be performed on integers.
| Operator | Meaning |
| = | assignment (copies a value) |
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division (note that integer division truncates; for example, 17 / 3 returns 5) |
| % | modulus or remainder (17 % 3 returns 2, the remainder of the division operation) |
| ++ | increment (adds one to the value) |
| -- | decrement (subtracts one from the value) |
| > | greater than |
| < | less than |
| == | is equal to (different from =) |
| += | addition assignment shortcut. Sets the left operand to its current value plus the value of the right operand. The statements x += 5; and x = x + 5; are equivalent. |
| -= | subtraction assignment shortcut |
| *= | multiplication assignment shortcut |
| /= | division assignment shortcut |
| %= | remainder assignment shortcut |
In the <iostream> header file, the following two operations are defined on integers:
| cout << | displays the integer on standard output (the terminal) |
| cin >> | reads an integer value from standard input (the keyboard) |
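A small sketch of how integer division, remainder, and an assignment shortcut are typically used together. Both helper names (split_minutes and sum_to) are invented for this illustration.

```cpp
// Hypothetical helper: splits a number of minutes into whole hours and
// leftover minutes using integer division and remainder.
void split_minutes(int total, int& hours, int& minutes)
{
    hours   = total / 60;   // integer division truncates: 135 / 60 is 2
    minutes = total % 60;   // remainder: 135 % 60 is 15
}

// Hypothetical helper: sums the whole numbers 1 through n.
int sum_to(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;           // same as sum = sum + i;
    }
    return sum;
}
```

Note that hours * 60 + minutes always gives back the original total, which is a handy way to check a division and remainder pair.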
The ++ and -- operations require special attention. These
are unary operators, which means that they take only one
operand, as opposed to
binary operators, which take two operands (all of the
other operators are binary).
These operators can either precede or follow their operands. If
they precede the operand, then the value returned is the new value
the operand has after the operation, while if they follow their operand,
the value returned is the original value of the operand, before
the operation.
This can best be explained with some examples.
#include <iostream>
using namespace std;
int main()
{
int x,y;
x = 7;
y = ++x;
cout << x << " " << y << endl; // prints 8 8
x = 7;
y = x++;
cout << x << " " << y << endl; // prints 8 7
x = 7;
y = x--;
cout << x << " " << y << endl; // prints 6 7
x = 7;
y = ++x * 2;
cout << x << " " << y << endl; // prints 8 16
x = 7;
y = x++ * 2;
cout << x << " " << y << endl; // prints 8 14
return 0;
}
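You cannot assume that an integer will be four bytes long on all
computers. On older PCs, an integer is
2 bytes or 16 bits, which means that the largest value that it can
hold is $2^{15} - 1$, or 32,767. C++ also provides the integer types
long and short, which are declared like this: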
int main()
{
long x;
short y;
...
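To find out what a particular machine actually provides, you can ask the compiler rather than guess. This sketch uses std::numeric_limits from <limits>; the three function names are hypothetical and exist only to keep the example short.

```cpp
#include <limits>

// Largest value each integer type can hold on the machine that compiles this.
short largest_short() { return std::numeric_limits<short>::max(); }
int   largest_int()   { return std::numeric_limits<int>::max(); }
long  largest_long()  { return std::numeric_limits<long>::max(); }
```

Printing sizeof(short), sizeof(int) and sizeof(long) alongside these values shows the corresponding sizes in bytes. The results differ from machine to machine, which is exactly why a program should not assume them.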
If you are a little confused by the discussion of how integers are
represented in a computer, don't worry about it because you do
not need to know all of the details about how a data type represents
its values in order to use that type. The important thing is to
learn the meaning of the various operations on the type (and enough
about the internal representation to know when it might impact the
correctness of your program, such as the integer overflow issue already
discussed). This concept, that a data type is
defined by its operations, is an important one (often called
the data abstraction principle) and we will
return to it repeatedly throughout this course.
Exercise 1:
What would the following program print? (answer on last page)
#include <iostream>
using namespace std;
int main()
{
int i,j,k;
i = 5;
j = i++;
cout << i << " " << j << endl;
i = 5;
j = ++i;
cout << i << " " << j << endl;
i = 17;
j = 7;
k = i % j;
cout << k << endl;
i = 17;
j = 7;
j += i;
cout << i << " " << j << endl;
i = 17;
j = 7;
k = i / j;
cout << k << endl;
i = 17;
j = 7;
i %= j;
cout << i << endl;
i = 17;
j = 7;
i -= j + 2;
cout << i << endl;
return 0;
}
Real Numbers
Just as integers come in two sizes (long and short), real numbers (often called floating point numbers) come in two sizes: float (32 bits) and double (64 bits). The representation of these values in memory is beyond the scope of this course. Remember, though, that you do not need to know how values are represented in memory in order to use a data type.
All of the operations listed above that can be performed on integers
can also be performed on variables of type float and double except
% (modulus or remainder). (Why not?)
C and C++ will often perform implicit conversion (called type
coercion) between integers and real numbers. For example, you can
add a variable of type float to a variable of type int and
get the correct answer. The value returned will be of type float. Likewise, comparison operators like
<, > or == can be
used with two operands of different types and will behave correctly.
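A minimal sketch of type coercion in action. The function names mixed_sum and is_less are made up for this example; the conversions themselves are performed automatically by the compiler.

```cpp
// Adds an int to a float. The int is converted to float before the
// addition, so the result of i + f is a float.
float mixed_sum(int i, float f)
{
    return i + f;
}

// Compares an int with a double. The int is converted to double first,
// so 3 compared with 3.5 behaves as 3.0 compared with 3.5.
bool is_less(int i, double d)
{
    return i < d;
}
```

For example, mixed_sum(2, 0.5f) gives 2.5, and is_less(3, 3.5) is true.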
You can also explicitly change a variable's type within an arithmetic expression by casting. This is done by putting the new type in parentheses before the variable name. For example, if you want to divide two integers to get a floating-point result, you must cast at least one of the integers to a float. Here is an example:
int x, y;
x = 17;
y = 3;
cout << x/y << endl;             // this prints 5
cout << (double)x/y << endl;     // this prints 5.66667
cout << (double)(x/y) << endl;   // this prints 5 (Why?)
You must be careful when writing assignment statements using mixed mode arithmetic. All expressions on the right hand side of the assignment operator are evaluated before the assignment (and possible implicit type coercion) occurs. Some examples:
double f;
f = 9/10;        // integer division yields 0, which is then coerced into double 0.0
int x;
x = .75 + .75;   // right hand side is 1.5, which is truncated to integer 1
x = 1/2 + 1/2;   // each of the "1/2" expressions is 0, so x is assigned 0
Precedence and Associativity
All operations have a precedence and an associativity. Precedence refers to the order in which operations are done. For example, multiplication has higher precedence than addition. This means that in an arithmetic expression, the multiplication operation is done before the addition operation. For example:
int x=7, y=5, z;
z = x + y * 2;   // z now has the value 17, not 24
Associativity refers to the order in which two operations of the same precedence are done. For example, subtraction is left associative:
12 - 4 - 3 produces 5 because the left subtraction is
done before the right subtraction. If the right subtraction were
done first, the answer would have been 11.
An example of an operation which is right associative is the
assignment operator. One sometimes sees statements like the
following in a program:
x = y = z = 0;
The rightmost assignment (z = 0) is done first, and
this returns the value zero, which is then assigned to y,
and this operation also returns 0, which is assigned to x.
Precedence and associativity can be changed with parentheses.
For example:
int x,y,z;
x = 7;
y = 5;
z = (x + y) * 2;    // z now has the value 24
z = 12 - (4 - 3);   // z now has the value 11
A complete table of operator precedence and associativity is in appendix A of the Deitel textbook.
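Because * and / have the same precedence and are left associative, the order in which they are written matters once integer division is involved. The sketch below uses the Fahrenheit-to-Celsius conversion purely as a familiar illustration; both function names are hypothetical.

```cpp
// 5 / 9 is evaluated first (left to right) and truncates to 0,
// so this version returns 0 for every input.
int to_celsius_wrong(int f)
{
    return 5 / 9 * (f - 32);
}

// Multiplying first keeps the intermediate value large, so the later
// division gives a sensible (truncated) answer.
int to_celsius(int f)
{
    return (f - 32) * 5 / 9;
}
```

With f equal to 212, the first version yields 0 and the second yields 100.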
Exercise 2: Write C++ statements that would compute the following equations:
(1) $a = b(c - 1)$
(2) $a = \dfrac{b + c/2}{d - e}$
The Boolean Data Type
The bool data type is one of the newest additions to the C++ language. Its size differs from compiler to compiler. It holds one of two values: true (which is equivalent to a non-zero integer value) or false (an integer zero).
You can assign the value true or false to a boolean variable, but boolean expressions, which resolve to either true or false, are used mainly in predicates, such as an if-expression or while-expression. They can be very useful in situations such as improving the readability of a complicated predicate. As an example, the following code fragments are equivalent:
if ( (ht > 68 && ht < 72) && (wt > 165 && wt < 200) )
    cout << "You are a size L" << endl;

bool ht_ok, wt_ok;
ht_ok = ht > 68 && ht < 72;
wt_ok = wt > 165 && wt < 200;
if (ht_ok && wt_ok)
    cout << "You are a size L" << endl;
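The same idea can be packaged as a function that returns a bool, which lets the predicate be reused and tested on its own. This is only a sketch: the function name is_size_large is invented, and the ranges are the ones from the fragment above.

```cpp
// Hypothetical helper built from the predicate in the fragment above:
// true when both the height and the weight fall in the size L ranges.
bool is_size_large(int ht, int wt)
{
    bool ht_ok = ht > 68 && ht < 72;
    bool wt_ok = wt > 165 && wt < 200;
    return ht_ok && wt_ok;
}
```

A caller can then write if (is_size_large(ht, wt)) and read the condition almost as plain English.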
The Character Data Type
The char data type takes up one byte (8 bits) so there are
$2^8$ or 256 possible values. Each value represents an ASCII
(American Standard Code for Information Interchange) code.
This is a mapping between the value in a byte and a character.
Values in the range 32 .. 126 represent printable characters.
ASCII Codes for printable characters
32 33 ! 34 " 35 # 36 $ 37 % 38 & 39 '
40 ( 41 ) 42 * 43 + 44 , 45 - 46 . 47 /
48 0 49 1 50 2 51 3 52 4 53 5 54 6 55 7
56 8 57 9 58 : 59 ; 60 < 61 = 62 > 63 ?
64 @ 65 A 66 B 67 C 68 D 69 E 70 F 71 G
72 H 73 I 74 J 75 K 76 L 77 M 78 N 79 O
80 P 81 Q 82 R 83 S 84 T 85 U 86 V 87 W
88 X 89 Y 90 Z 91 [ 92 \ 93 ] 94 ^ 95 _
96 ` 97 a 98 b 99 c 100 d 101 e 102 f 103 g
104 h 105 i 106 j 107 k 108 l 109 m 110 n 111 o
112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w
120 x 121 y 122 z 123 { 124 | 125 } 126 ~
Notice that the upper case letters (A .. Z) are consecutive (65 .. 90),
the lower case letters (a .. z) are consecutive (97 .. 122) and the
digits (0 .. 9) are consecutive (48 .. 57). Note also that ASCII 32
is the space character.
The ASCII values below 32 are used for various non-printable characters, only a few of which are of interest to us. ASCII 10 is the newline, ASCII 13 is the carriage return, and ASCII 9 is the tab character.
Character constants are represented in single quotes. Here is an example:
char c1, c2;
c1 = 'A'; // sets the value of the variable c1 to 65
c2 = '$'; // sets the value of the variable c2 to 36
...
It is easy to get confused between the digits 0 .. 9, which are simply characters, and the numeric values 0 .. 9. The following statement sets the value of c3 to 51, the ASCII code for the character 3:
char c3;
c3 = '3';
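Since the digit characters have consecutive codes, a digit character can be turned into the number it stands for by subtracting the code for '0'. A minimal sketch, with a made-up function name, assuming the argument really is one of '0' .. '9':

```cpp
// Converts a digit character ('0' .. '9') to the number it stands for.
// Works because the digit characters have consecutive codes.
int digit_value(char digit)
{
    return digit - '0';     // '3' - '0' is 51 - 48, which is 3
}
```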
The newline character is represented like this '\n'
and the tab character is represented like this '\t'.
The single quote character (ASCII 39) is represented as '\''
and the double quote character (ASCII 34) is represented as '\"'.
C and C++ are weakly typed languages, which means that for certain purposes, some types can be used as other types. For instance, variables of type char can often be treated like integers:
char c4, c5;
c4 = 89; // sets the value of c4 to 'Y'
c5 = 51; // sets the value of c5 to '3'
cout << c4 << c5 << endl; // prints Y3
c4++;
cout << c4 << endl; // prints Z
c4++;
cout << c4 << endl; // prints [
c4 += 10;
cout << c4 << endl; // prints e
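Treating characters as integers makes some common conversions very short. The sketch below converts a lower case letter to upper case by arithmetic alone. It assumes ASCII, and the function name to_upper_case is hypothetical (the standard library already offers toupper in <cctype> for real programs).

```cpp
// Converts a lower case letter to upper case; any other character is
// returned unchanged. Assumes ASCII, where 'a'..'z' and 'A'..'Z' are
// each consecutive.
char to_upper_case(char c)
{
    if (c >= 'a' && c <= 'z') {
        // c - 'a' is the letter's position in the alphabet (0 .. 25)
        return static_cast<char>(c - 'a' + 'A');
    }
    return c;
}
```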
The cout statement and its associated operator << check
the type of each variable and "know" how to display each type.
If the type is an integer, an integer value is displayed; if the type
is a char, the character that the value represents is displayed.
To display the ASCII code of a char variable, you can cast it to
an integer. For example:
char c6;
c6 = '?';
cout << (int) c6 << endl; // prints 63
The same is true of the cin statement and its associated
operator >>.
Exercise 3: Write a program that displays the printable ASCII codes and the characters that they represent. Your output should start out looking something like this:
32 33 ! 34 " 35 # 36 $
Exercise 4: What would the following program print?
#include <iostream>
using namespace std;
int main()
{
char b, c;
int x;
b = 'a';
c = 32;
cout << b << c << 'b' << c << (int) b << endl;
b += 25;
cout << b << c << (int) c << c << (int) b << endl;
b++;
cout << b << endl;
x = 65;
cout << x << c << (char)x << endl;
return 0;
}
Arrays
An array is a set of elements of the same type in contiguous memory.
You can declare an array of any predefined type. In C and C++,
arrays are declared by appending the size of the array in
square brackets to the name of the variable. Here are some examples:
int x[10]; // an array of ten integers
double q[43]; // an array of 43 real double precision numbers
char c[128]; // an array of 128 characters
short s[12]; // an array of 12 16-bit integers
You can do anything with an element of an array that you can do
with a simple variable of that type.
To access a particular element in one of these arrays, you must
put a subscript in square brackets following the array name.
Here are some examples:
x[3] = 17;
q[10] = 3.1416;
cout << x[3];
int y;
y = x[3];
Alert: The subscript for an array always ranges from 0 to $n - 1$,
where $n$ is the size of the array. For example, the valid subscripts
for array x are 0 to 9. The first element is x[0],
the second element is x[1] and the last element is x[9].
If you violate this, you will not get an error message, but your
program will behave strangely. For example, the following
statement in your program:
x[10] = 17;
would not generate an error message, but it would assign the value
17 to a memory location beyond x, which might be another
variable in your program, so you would be inadvertently changing the
value of some other variable in your program.
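Because the language does not check subscripts for you, a program can do the check itself before storing into an array. This is a sketch of one way to do so; the helper name set_element and its parameter list are assumptions made for the example.

```cpp
// Hypothetical helper: stores value in arr[index] only when index is a
// valid subscript for an array of the given size (0 to size - 1).
// Returns false, and changes nothing, when the subscript is out of range.
bool set_element(int arr[], int size, int index, int value)
{
    if (index < 0 || index >= size) {
        return false;
    }
    arr[index] = value;
    return true;
}
```

With int x[10]; the call set_element(x, 10, 10, 17) returns false instead of writing past the end of the array.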
You can also put an arithmetic expression in the square brackets.
However, you must make sure that it evaluates to an integer in the
range 0 to $n - 1$. The following are all valid statements:
int x[10];
int y = 2;
x[5] = 6;
x[y] = x[5];
x[y+2] = 3;
x[x[2]] = 4;
for(y=7;y<10;y++) x[y]=y;
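After executing these statements, the array x has the following values (U means undefined):
| Subscript | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Value | U | U | 6 | U | 3 | 6 | 4 | 7 | 8 | 9 |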
Arrays of characters are so common that they have a special name, strings, and there are some special conventions that apply to them.
If a string is
to be initialized with a string constant, it is not necessary to put
the size inside the square brackets; the compiler will automatically
compute the correct size. Here is an example:
char president[] = "Bill Clinton";
Individual letters in this string can be accessed in the usual
way. For example:
cout << president[5];
prints the character C on the screen.
Alert: String constants are quite different from character
constants. Syntactically, character constants are enclosed within
single quotes. Semantically, character constants represent exactly
one character, while string constants can be used to represent
zero or more characters. Remember: Only single characters go inside
single quotes. The only exceptions are the special characters
noted above, which all use the backslash \ character
(\n, \', etc.)
char ch = "x";      // This is a syntax error
char str[] = 'x';   // Also an error
cout treats character arrays differently from other arrays. Using
cout << with an array variable will print out
the address of the array in hexadecimal (not usually very useful) for
any array except a character array. The following statements
int x[10];
// ... code here to assign values to elements of x
cout << x << endl;
print out something like this:
0xf7fffa50
whereas the statements
char president[] = "Bill Clinton";
cout << president << endl;
print Bill Clinton.
There is a large library of string functions in <cstring> that
compare two strings, copy one string to another, and so on. All of
these are based on the convention that a string is terminated with a
NULL character, that is, a character with the ASCII value 0 (written
as '\0'). This means that any array of
characters should be at least one character longer than the length
of the longest string that it might contain. String constants
automatically append the terminal '\0'. In the preceding example,
the array named president has enough room for 13 characters, allocated
like so:
| Subscript | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| Character | B | i | l | l | (space) | C | l | i | n | t | o | n | \0 |
Here is another example:
#include<iostream>
using namespace std;
int main()
{
char s[100]; // all 100 characters are uninitialized to start
s[0]='c';
s[1]='a';
s[2]='t';
s[3]='\0';
s[4]='x';
cout << s << endl; // this prints cat
// nothing after the terminal null is printed
return 0;
}
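The terminating '\0' is what lets a function discover where a string ends. As a sketch of the convention, here is a hand-written length function; the name string_length is made up, and the standard library's strlen in <cstring> does the same job.

```cpp
// Counts the characters before the terminating '\0'.
// Assumes s really does contain a '\0' somewhere.
int string_length(const char s[])
{
    int length = 0;
    while (s[length] != '\0') {
        length++;
    }
    return length;
}
```

Applied to the array s from the example above, string_length(s) is 3, even though the array has room for 100 characters.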
When a string is read into a character array using cin >>, a
'\0' is automatically appended onto the end of the string that is
read in. You should be sure that the array is large enough to hold the
string which is entered.
The length of a string stored in an array does not have to be the same size as the array! You are allowed to store a string in an array much larger than the length of a string. Any characters following the first NULL character are ignored when the array is used as a string. Consider what we promise to be the final example:
char bbb[]="Beanie babies bite";   // array bbb holds 19 characters (right?)
cout << bbb;                       // prints: Beanie babies bite
bbb[15] = '\0';                    // replace a character with NULL character
cout << bbb;                       // prints: Beanie babies b
// the string now has 16 characters (counting the '\0'), but the array still holds 19
bbb[15] = 'y';
cout << bbb;                       // prints: Beanie babies byte
Exercise 5: Write a program that reads a string of characters
from the keyboard and then prints out the number of times that each
character appears in that string. Note that if you use
cin >> s;
the program will place in the array s all characters typed
until the first space or until the return key is pressed. Your
program should then print out the number of times that each printable
character appears in the string. Hint: create an array of
integers with each cell corresponding to a character (this is easy
because each character has a numeric value associated with it). Set
all cells in the array to zero. As each character is read in,
increment the cell corresponding to that character by 1. For example
if the first character was 'A', then the cell in the array
which corresponds to ASCII 65 would be increased by 1.
| Exercise 1: | Exercise 2: | Exercise 4: |
| 6 5 | a = b * (c - 1); | a b 97 |
| 6 6 | | z 32 122 |
| 3 | a = (b + c / 2) / (d - e); | { |
| 17 24 | | 65 A |
| 2 | | |
| 3 | | |
| 8 | | |