Thursday, October 29, 2009

11.10. Case Study: String Class



11.10. Case Study: String
Class


As a capstone to our study of overloading,
we'll build our own String class to handle
the creation and manipulation of strings (Figs. 11.9–11.11). The
C++ standard library provides a similar, more robust class string as well. We present an example of the standard class
string in Section
11.13 and study class string in detail in Chapter
18. For now, we'll make extensive use of operator
overloading to craft our own class String.


First, we present the header file for class String. We
discuss the private data used to represent String objects. Then we walk
through the class's public interface,
discussing each of the services the class provides. We discuss the
member-function definitions for the class String. For each of the overloaded operator functions, we show the
code in the program that invokes the overloaded operator function, and we
provide an explanation of how the overloaded operator function works.


String Class Definition


Now let's walk through the String class header file in
Fig. 11.9. We begin with the internal pointer-based
representation of a String. Lines 55–56 declare the private
data members of the class. Our String class has a length field, which represents the number of characters in the
string, not including the null character at the end, and has a pointer
sPtr that points to the dynamically allocated
memory representing the character string.











Fig. 11.9. String class definition with operator
overloading.


 

 1   // Fig. 11.9: String.h
2 // String class definition with operator overloading.
3 #ifndef STRING_H
4 #define STRING_H
5
6 #include <iostream>
7 using std::ostream;
8 using std::istream;
9
10 class String
11 {
12 friend ostream &operator<<( ostream &, const String & );
13 friend istream &operator>>( istream &, String & );
14 public:
15 String( const char * = "" ); // conversion/default constructor
16 String( const String & ); // copy constructor
17 ~String(); // destructor
18
19 const String &operator=( const String & ); // assignment operator
20 const String &operator+=( const String & ); // concatenation operator
21
22 bool operator!() const; // is String empty?
23 bool operator==( const String & ) const; // test s1 == s2
24 bool operator<( const String & ) const; // test s1 < s2
25
26 // test s1 != s2
27 bool operator!=( const String &right ) const
28 {
29 return !( *this == right );
30 } // end function operator!=
31
32 // test s1 > s2
33 bool operator>( const String &right ) const
34 {
35 return right < *this;
36 } // end function operator>
37
38 // test s1 <= s2
39 bool operator<=( const String &right ) const
40 {
41 return !( right < *this );
42 } // end function operator <=
43
44 // test s1 >= s2
45 bool operator>=( const String &right ) const
46 {
47 return !( *this < right );
48 } // end function operator>=
49
50 char &operator[]( int ); // subscript operator (modifiable lvalue)
51 char operator[]( int ) const; // subscript operator (rvalue)
52 String operator()( int, int = 0 ) const; // return a substring
53 int getLength() const; // return string length
54 private:
55 int length; // string length (not counting null terminator)
56 char *sPtr; // pointer to start of pointer-based string
57
58 void setString( const char * ); // utility function
59 }; // end class String
60
61 #endif




 


Overloading the Stream Insertion
and Stream Extraction Operators as friends


Lines 12–13 (Fig. 11.9) declare the overloaded stream insertion operator
function operator<< (defined in Fig. 11.10, lines 170–174) and the overloaded stream extraction
operator function operator>> (defined in Fig. 11.10, lines 177–183) as
friends of the class. The
implementation of operator<< is straightforward. Note that
operator>> restricts the total number of
characters that can be read into array temp to 99 with setw (line 180); the 100th position is reserved for the
string's terminating null character. [Note: We did not have this restriction for
operator>> in class Array (Figs.
11.6–11.7),
because that class's operator>> read
one array element at a time and stopped reading values when the end of the array
was reached. Object cin does not know how to do
this by default for input of character arrays.] Also, note the use of
operator= (line 181) to assign the C-style string
temp to the String object to which s refers. This statement invokes the conversion
constructor to create a temporary String object
containing the C-style string; the temporary String is
then assigned to s. We could eliminate the overhead of creating the
temporary String object here by providing
another overloaded assignment operator that receives a parameter of type
const char *.











Fig. 11.10. String class member-function and
friend-function definitions.


 

 1   // Fig. 11.10: String.cpp
2 // String class member-function and friend-function definitions.
3 #include <iostream>
4 using std::cerr;
5 using std::cout;
6 using std::endl;
7
8 #include <iomanip>
9 using std::setw;
10
11 #include <cstring> // strcpy and strcat prototypes
12 using std::strcmp;
13 using std::strcpy;
14 using std::strcat;
15
16 #include <cstdlib> // exit prototype
17 using std::exit;
18
19 #include "String.h" // String class definition
20
21 // conversion (and default) constructor converts char * to String
22 String::String( const char *s )
23 : length( ( s != 0 ) ? strlen( s ) : 0 )
24 {
25 cout << "Conversion (and default) constructor: " << s << endl;
26 setString( s ); // call utility function
27 } // end String conversion constructor
28
29 // copy constructor
30 String::String( const String &copy )
31 : length( copy.length )
32 {
33 cout << "Copy constructor: " << copy.sPtr << endl;
34 setString( copy.sPtr ); // call utility function
35 } // end String copy constructor
36
37 // Destructor
38 String::~String()
39 {
40 cout << "Destructor: " << sPtr << endl;
41 delete [] sPtr; // release pointer-based string memory
42 } // end ~String destructor
43
44 // overloaded = operator; avoids self assignment
45 const String &String::operator=( const String &right )
46 {
47 cout << "operator= called" << endl;
48
49 if ( &right != this ) // avoid self assignment
50 {
51 delete [] sPtr; // prevents memory leak
52 length = right.length; // new String length
53 setString( right.sPtr ); // call utility function
54 } // end if
55 else
56 cout << "Attempted assignment of a String to itself" << endl;
57
58 return *this; // enables cascaded assignments
59 } // end function operator=
60
61 // concatenate right operand to this object and store in this object
62 const String &String::operator+=( const String &right )
63 {
64 size_t newLength = length + right.length; // new length
65 char *tempPtr = new char[ newLength + 1 ]; // create memory
66
67 strcpy( tempPtr, sPtr ); // copy sPtr
68 strcpy( tempPtr + length, right.sPtr ); // copy right.sPtr
69
70 delete [] sPtr; // reclaim old space
71 sPtr = tempPtr; // assign new array to sPtr
72 length = newLength; // assign new length to length
73 return *this; // enables cascaded calls
74 } // end function operator+=
75
76 // is this String empty?
77 bool String::operator!() const
78 {
79 return length == 0;
80 } // end function operator!
81
82 // Is this String equal to right String?
83 bool String::operator==( const String &right ) const
84 {
85 return strcmp( sPtr, right.sPtr ) == 0;
86 } // end function operator==
87
88 // Is this String less than right String?
89 bool String::operator<( const String &right ) const
90 {
91 return strcmp( sPtr, right.sPtr ) < 0;
92 } // end function operator<
93
94 // return reference to character in String as a modifiable lvalue
95 char &String::operator[]( int subscript )
96 {
97 // test for subscript out of range
98 if ( subscript < 0 || subscript >= length )
99 {
100 cerr << "Error: Subscript " << subscript
101 << " out of range" << endl;
102 exit( 1 ); // terminate program
103 } // end if
104
105 return sPtr[ subscript ]; // non-const return; modifiable lvalue
106 } // end function operator[]
107
108 // return reference to character in String as rvalue
109 char String::operator[]( int subscript ) const
110 {
111 // test for subscript out of range
112 if ( subscript < 0 || subscript >= length )
113 {
114 cerr << "Error: Subscript " << subscript
115 << " out of range" << endl;
116 exit( 1 ); // terminate program
117 } // end if
118
119 return sPtr[ subscript ]; // returns copy of this element
120 } // end function operator[]
121
122 // return a substring beginning at index and of length subLength
123 String String::operator()( int index, int subLength ) const
124 {
125 // if index is out of range or substring length < 0,
126 // return an empty String object
127 if ( index < 0 || index >= length || subLength < 0 )
128 return ""; // converted to a String object automatically
129
130 // determine length of substring
131 int len;
132
133 if ( ( subLength == 0 ) || ( index + subLength > length ) )
134 len = length - index;
135 else
136 len = subLength;
137
138 // allocate temporary array for substring and
139 // terminating null character
140 char *tempPtr = new char[ len + 1 ];
141
142 // copy substring into char array and terminate string
143 strncpy( tempPtr, &sPtr[ index ], len );
144 tempPtr[ len ] = '\0';
145
146 // create temporary String object containing the substring
147 String tempString ( tempPtr );
148 delete [] tempPtr; // delete temporary array
149 return tempString; // return copy of the temporary String
150 } // end function operator()
151
152 // return string length
153 int String::getLength() const
154 {
155 return length;
156 } // end function getLength
157
158 // utility function called by constructors and operator=
159 void String::setString( const char *string2 )
160 {
161 sPtr = new char[ length + 1 ]; // allocate memory
162
163 if ( string2 != 0 ) // if string2 is not null pointer, copy contents
164 strcpy( sPtr, string2 ); // copy literal to object
165 else // if string2 is a null pointer, make this an empty string
166 sPtr[ 0 ] = '\0'; // empty string
167 } // end function setString
168
169 // overloaded output operator
170 ostream &operator<<( ostream &output, const String &s )
171 {
172 output << s.sPtr;
173 return output; // enables cascading
174 } // end function operator<<
175
176 // overloaded input operator
177 istream &operator>>( istream &input, String &s )
178 {
179 char temp[ 100 ]; // buffer to store input
180 input >> setw( 100 ) >> temp;
181 s = temp; // use String class assignment operator
182 return input; // enables cascading
183 } // end function operator>>




String Conversion Constructor


Line 15 (Fig. 11.9) declares a conversion
constructor. This constructor (defined in Fig. 11.10, lines 22–27) takes a
const char * argument (that defaults to the empty string; Fig. 11.9, line 15) and
initializes a String object containing that same character string. Any
single-argument constructor can be thought of as a conversion constructor. As we'll see,
such constructors are helpful when we are doing any String operation
using char * arguments. The conversion
constructor can convert a char * string into a String object,
which can then be assigned to the target String object. The availability of this conversion
constructor means that it is not necessary to supply an overloaded assignment
operator for specifically assigning character strings to String objects. The compiler invokes the conversion
constructor to create a temporary String
object containing the character string; then the overloaded assignment operator
is invoked to assign the temporary String object to another
String object.



Software Engineering Observation 11.8








When a
conversion constructor is used to perform an implicit conversion, C++ can apply
only one implicit constructor call (i.e., a single user-defined conversion) to
try to match the needs of another overloaded operator. The compiler will not
match an overloaded operator's needs by performing a series of implicit,
user-defined conversions.



The String conversion
constructor could be invoked in such a declaration as String s1( "happy"
)
. The conversion constructor calculates the
length of its character-string argument and assigns it to data member
length in the member-initializer list. Then, line 26 calls utility
function setString (defined in Fig. 11.10, lines 159–167), which
uses new to allocate a sufficient amount of memory to private
data member sPtr and uses strcpy to copy the character string
into the memory to which sPtr points.[4]



[4] There is a subtle issue in the implementation
of this conversion constructor. As implemented, if a null pointer (i.e.,
0 ) is passed to the constructor, the program
will fail. The proper way to implement this constructor would be to detect
whether the constructor argument is a null pointer, then "throw an exception."
Chapter
16 discusses how we can make classes more robust
in this manner. Also, note that a null pointer (0) is not the same as
the empty string (""). A null pointer is a
pointer that does not point to anything. An empty string is an actual string
that contains only a null character ('\0').


String Copy Constructor


Line 16 in Fig. 11.9
declares a copy constructor (defined in Fig. 11.10, lines
30–35) that initializes a String object by
making a copy of an existing String object. As with
our class Array (Figs.
11.6–11.7), such copying must be done carefully to avoid the pitfall
in which both String objects point to the same
dynamically allocated memory. The copy constructor operates similarly to the
conversion constructor, except that it simply copies the length member
from the source String object to the target String object. Note that the copy constructor calls
setString to create new space for the target
object's internal character string. If it simply copied the sPtr in the source object to the target object's sPtr, then both objects would point to the same dynamically
allocated memory. The first destructor to execute would then delete the
dynamically allocated memory, and the other object's sPtr would be
undefined (i.e., sPtr would be a
dangling pointer), a situation likely to cause a serious runtime error.


String Destructor


Line 17 of Fig. 11.9 declares the
String destructor (defined in Fig. 11.10, lines 38–42). The destructor uses delete
[]
to release the dynamic memory to which
sPtr points.


Overloaded Assignment Operator


Line 19 (Fig. 11.9) declares the overloaded
assignment operator function operator= (defined in Fig. 11.10, lines 45–59). When the
compiler sees an expression like string1 = string2, the compiler
generates the function call


string1.operator=( string2 );


The overloaded assignment operator function
operator= tests for self-assignment. If this is a
self-assignment, the function does not need to change the object. If this test
were omitted, the function would immediately delete the space in the target
object and thus lose the character string, such that the pointer would no longer
be pointing to valid data—a classic example of a dangling pointer. If there is
no self-assignment, the function deletes the memory and copies the
length field of the source object to the
target object. Then operator= calls setString to create new space for the target object and copy the
character string from the source object to the target object.
Whether or not this is a self-assignment, operator=
returns *this to enable cascaded assignments.


Overloaded Addition Assignment
Operator


Line 20 of Fig. 11.9
declares the overloaded string-concatenation operator += (defined in Fig. 11.10, lines 62–74). When the compiler sees the expression
s1 += s2 (line 40 of Fig. 11.11), the compiler generates the member-function
call













Fig. 11.11. String class test
program.


 

 1   // Fig. 11.11: fig11_11.cpp
2 // String class test program.
3 #include <iostream>
4 using std::cout;
5 using std::endl;
6 using std::boolalpha;
7
8 #include "String.h"
9
10 int main()
11 {
12 String s1( "happy" );
13 String s2( " birthday" );
14 String s3;
15
16 // test overloaded equality and relational operators
17 cout << "s1 is \"" << s1 << "\"; s2 is \"" << s2
18 << "\"; s3 is \"" << s3 << '\"'
19 << boolalpha << "\n\nThe results of comparing s2 and s1:"
20 << "\ns2 == s1 yields " << ( s2 == s1 )
21 << "\ns2 != s1 yields " << ( s2 != s1 )
22 << "\ns2 > s1 yields " << ( s2 > s1 )
23 << "\ns2 < s1 yields " << ( s2 < s1 )
24 << "\ns2 >= s1 yields " << ( s2 >= s1 )
25 << "\ns2 <= s1 yields " << ( s2 <= s1 );
26
27
28 // test overloaded String empty (!) operator
29 cout << "\n\nTesting !s3:" << endl;
30
31 if ( !s3 )
32 {
33 cout << "s3 is empty; assigning s1 to s3;" << endl;
34 s3 = s1; // test overloaded assignment
35 cout << "s3 is \"" << s3 << "\"";
36 } // end if
37
38 // test overloaded String concatenation operator
39 cout << "\n\ns1 += s2 yields s1 = ";
40 s1 += s2; // test overloaded concatenation
41 cout << s1;
42
43 // test conversion constructor
44 cout << "\n\ns1 += \" to you\" yields" << endl;
45 s1 += " to you"; // test conversion constructor
46 cout << "s1 = " << s1 << "\n\n";
47
48 // test overloaded function call operator () for substring
49 cout << "The substring of s1 starting at\n"
50 << "location 0 for 14 characters, s1(0, 14), is:\n"
51 << s1( 0, 14 ) << "\n\n";
52
53 // test substring "to-end-of-String" option
54 cout << "The substring of s1 starting at\n"
55 << "location 15, s1(15), is: "
56 << s1( 15 ) << "\n\n";
57
58 // test copy constructor
59 String *s4Ptr = new String( s1 );
60 cout << "\n*s4Ptr = " << *s4Ptr << "\n\n";
61
62 // test assignment (=) operator with self-assignment
63 cout << "assigning *s4Ptr to *s4Ptr" << endl;
64 *s4Ptr = *s4Ptr; // test overloaded assignment
65 cout << "*s4Ptr = " << *s4Ptr << endl;
66
67 // test destructor
68 delete s4Ptr;
69
70 // test using subscript operator to create a modifiable lvalue
71 s1[ 0 ] = 'H';
72 s1[ 6 ] = 'B';
73 cout << "\ns1 after s1[0] = 'H' and s1[6] = 'B' is: "
74 << s1 << "\n\n";
75
76 // test subscript out of range
77 cout << "Attempt to assign 'd' to s1[30] yields:" << endl;
78 s1[ 30 ] = 'd'; // ERROR: subscript out of range
79 return 0;
80 } // end main




 

Conversion (and default) constructor: happy
Conversion (and default) constructor: birthday
Conversion (and default) constructor:
s1 is "happy"; s2 is " birthday"; s3 is ""

The results of comparing s2 and s1:
s2 == s1 yields false
s2 != s1 yields true
s2 > s1 yields false
s2 < s1 yields true
s2 >= s1 yields false
s2 <= s1 yields true

Testing !s3:
s3 is empty; assigning s1 to s3;
operator= called
s3 is "happy"

s1 += s2 yields s1 = happy birthday

s1 += " to you" yields
Conversion (and default) constructor: to you
Destructor: to you
s1 = happy birthday to you

Conversion (and default) constructor: happy birthday
Copy constructor: happy birthday
Destructor: happy birthday
The substring of s1 starting at
location 0 for 14 characters, s1(0, 14), is:
happy birthday

Destructor: happy birthday
Conversion (and default) constructor: to you
Copy constructor: to you
Destructor: to you
The substring of s1 starting at
location 15, s1(15), is: to you

Destructor: to you
Copy constructor: happy birthday to you

*s4Ptr = happy birthday to you

assigning *s4Ptr to *s4Ptr
operator= called
Attempted assignment of a String to itself
*s4Ptr = happy birthday to you
Destructor: happy birthday to you

s1 after s1[0] = 'H' and s1[6] = 'B' is: Happy Birthday to you

Attempt to assign 'd' to s1[30] yields:
Error: Subscript 30 out of range




s1.operator+=( s2 )


Function operator+= calculates
the combined length of the concatenated string and stores it in local variable
newLength, then creates a temporary pointer (tempPtr) and allocates a new character array in which the
concatenated string will be stored. Next, operator+= uses
strcpy to copy the original character strings from sPtr and
right.sPtr into the memory to which tempPtr points. Note that
the location into which strcpy will copy the first character of
right.sPtr is determined by the pointer-arithmetic calculation
tempPtr + length. This calculation indicates
that the first character of right.sPtr
should be placed at location length in the array to which tempPtr
points. Next, operator+= uses delete [] to release the space
occupied by this object's original character string, assigns tempPtr to
sPtr so that this String object points to the new character
string, assigns newLength to length so that this
String object contains the new string length and returns
*this as a const String & to enable cascading of
+= operators.


Do we need a second overloaded
concatenation operator to allow concatenation of a String and a char *? No. The const char * conversion constructor converts a C-style string into a
temporary String object, which then matches
the existing overloaded concatenation operator. This is exactly what the
compiler does when it encounters line 45 in Fig. 11.11. Again, C++ can perform such conversions only one
level deep to facilitate a match. C++ can also perform an implicit
compiler-defined conversion between fundamental types before it performs the
conversion between a fundamental type and a class. Note that, when a temporary
String object is created in this case, the
conversion constructor and the destructor are called (see the output resulting
from line 45, s1 += "toyou", in Fig. 11.11).
This is an example of function-call overhead that is hidden from the client of
the class when temporary class objects are created and destroyed during implicit
conversions. Similar overhead is generated by copy constructors in call-by-value
parameter passing and in returning class objects by value.



Performance Tip 11.2








Overloading the += concatenation operator with an additional version that takes
a single argument of type const char *
executes more efficiently than having only a version that takes a
String argument. Without the const char * version of the
+= operator, a const char * argument would first be converted
to a String object with class String's conversion constructor,
then the += operator that receives a String argument would be
called to perform the
concatenation.




Software Engineering Observation 11.9








Using
implicit conversions with overloaded operators, rather than overloading
operators for many different operand types, often requires less code, which
makes a class easier to modify, maintain and
debug.



Overloaded Negation Operator


Line 22 of Fig. 11.9 declares the overloaded negation operator (defined
in Fig. 11.10, lines 77–80). This operator determines whether
an object of our String class is empty. For
example, when the compiler sees the expression !string1, it generates the function call


string1.operator!()


This function simply returns the result of testing whether
length is equal to zero.


Overloaded Equality and Relational
Operators


Lines 23–24 of Fig. 11.9 declare the overloaded
equality operator (defined in Fig. 11.10, lines 83–86) and the
overloaded less-than operator (defined in Fig. 11.10, lines 89–92) for class
String. These are similar, so let us discuss only one example, namely,
overloading the == operator. When the compiler sees the expression
string1 == string2, the compiler generates the member-function call


string1.operator==( string2 )


which returns true if string1 is equal to
string2. Each of these operators uses function strcmp (from
<cstring>) to compare the character strings in the
String objects. Many C++ programmers
advocate using some of the overloaded operator functions to implement others.
So, the !=, >, <= and >= operators
are implemented (Fig.
11.9, lines 27–48) in terms of operator== and
operator<. For example, overloaded function operator>= (implemented in lines 45–48 in the header file) uses the
overloaded < operator to determine whether one
String object is greater than or equal
to another. Note that the operator functions for !=, >,
<= and >= are
defined in the header file. The compiler inlines these definitions to eliminate
the overhead of the extra function calls.



Software Engineering Observation 11.10








By
implementing member functions using previously defined member functions, you
reuse code to reduce the amount of code that must be written and
maintained.



Overloaded Subscript Operators


Lines 50–51 in the header file declare
two overloaded subscript operators (defined in Fig. 11.10, lines 95–106 and 109–120,
respectively)—one for non-const Strings and one for
const Strings. When the compiler sees an expression like string1[ 0
]
, the compiler generates the member-function call


string1.operator[]( 0 )


(using the appropriate version of operator[] based on
whether the String is const). Each implementation of
operator[] first validates the subscript to ensure
that it is in range. If the subscript is out of range, each function prints an
error message and terminates the program with a call to exit.[5] If the subscript is in
range, the non-const version of operator[] returns a char
&
to the appropriate character of the String object; this
char & may be used as an lvalue to
modify the designated character of the String object. The
const version of operator[] returns the appropriate character
of the String object; this can be used only as an rvalue to read the value of the character.



[5]
Again, it is more appropriate when a subscript is out of range to "throw an
exception" indicating the out-of-range subscript.



Error-Prevention Tip 11.2








Returning a
non-const char reference from an
overloaded subscript operator in a String class
is dangerous. For example, the client could use this reference to insert a null
('\0') anywhere in the
string.



Overloaded Function Call
Operator


Line 52 of Fig. 11.9 declares the overloaded function call operator (defined in Fig. 11.10, lines 123–150). We overload this operator to select a
substring from a String. The two
integer parameters specify the start location and the length of the substring
being selected from the String. If the
start location is out of range or the substring length is negative, the operator
simply returns an empty String. If the
substring length is 0, then the substring is selected to the end of the
String object. For example, suppose string1 is a
String object containing the string "AEIOU". For the
expression string1( 2, 2 ), the compiler generates the member-function
call


string1.operator()( 2, 2 )


When this call executes, it produces a String object
containing the string "IO" and returns a copy of that object.


Overloading the function call operator () is powerful, because functions can take arbitrarily long and
complex parameter lists. So we can use this capability for many interesting
purposes. One such use of the function call operator is an alternate
array-subscripting notation: Instead of using C's awkward double-square-bracket
notation for pointer-based two-dimensional arrays, such as in a[b][c], some programmers prefer to overload the function call
operator to enable the notation a( b, c ).
The overloaded function call operator must be a non-static member function. This operator is used only when the
"function name" is an object of class String.


String Member Function getLength


Line 53 in Fig. 11.9 declares function
getLength (defined in Fig. 11.10,
lines 153–156), which returns the length of a String.


Notes on Our String Class


At this point, you should step
through the code in main, examine the
output window and check each use of an overloaded operator. As you study the
output, pay special attention to the implicit constructor calls that are
generated to create temporary String objects
throughout the program. Many of these calls introduce additional overhead into
the program that can be avoided if the class provides overloaded operators that
take char * arguments. However, additional
operator functions can make the class harder to maintain, modify and
debug.


 


No comments:

Post a Comment