1.2. UNIX Architecture

In a strict sense, an operating system can be defined as the software that controls the hardware resources of the computer and provides an environment under which programs can run. Generally, we call this software the kernel, since it is relatively small and resides at the core of the environment. Figure 1.1 shows a diagram of the UNIX System architecture.

Figure 1.1. Architecture of the UNIX operating system

The interface to the kernel is a layer of software called the system calls (the shaded portion in Figure 1.1). Libraries of common functions are built on top of the system call interface, but applications are free to use both. (We talk more about system calls and library functions in Section 1.11.) The shell is a special application that provides an interface for running other applications.

In a broad sense, an operating system is the kernel and all the other software that makes a computer useful and gives the computer its personality. This other software includes system utilities, applications, shells, libraries of common functions, and so on. For example, Linux is the kernel used by the GNU operating system. Some people refer to this as the GNU/Linux operating system, but it is more commonly referred to as simply Linux. Although this usage may not be correct in a strict sense, it is understandable, given the dual meaning of the phrase operating system. (It also has the advantage of being more succinct.)
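As a small, hypothetical illustration of that freedom on a POSIX system (this sketch is not from the text), the same message can be written either through the write system call directly or through the standard I/O library, which itself sits on top of the system call interface:

#include <stdio.h>    // printf: a C library function layered on top of the system calls
#include <unistd.h>   // write: the system call interface

int main()
{
    const char msg[] = "hello from write()\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);   // direct system call
    printf("hello from printf()\n");              // library function; internally calls write
    return 0;
}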
15.3 Authorization
Authorization in a J2EE environment is based on the concept of roles. A user who has been granted at least one of the required roles is allowed to access a resource, such as a method in an enterprise bean. In the J2EE deployment descriptor, role names are associated with a set of method permissions and security constraints. The containers are responsible for mapping users and groups to these roles so that an authorization decision can be made when a resource is accessed.

There are at least two ways to implement a role-based authorization engine: the role-permission model and the role-privilege model.

The role-permission interpretation of the J2EE security model considers a role to be a set of permissions and uses the role name defined in the method-permission and security-constraint descriptors as the label for a set of permissions. A permission defines a resource (the enterprise bean to which a method-permission descriptor applies, or a URI described in a security-constraint descriptor) and a set of actions (remote EJB methods or HTTP methods). For example, a Teller role may be associated with a set of permissions to invoke the getBalance() method on an AccountBean enterprise bean and to perform a GET invocation over HTTP to a /finance/account URI. If multiple method-permission descriptors refer to the same role, they are consolidated so that a single set of permissions is associated with the role name, likely within the scope of that application.

Administrators define authorization policies for the roles in their application. This is done by associating subjects with a role, an operation that grants each subject the permissions associated with that role. This effectively grants the subject access to the enterprise bean methods and to the URIs permitted by that role. The method-permission table represents the association of a role with a set of permissions (see Table 3.1 on page 84). Based on the J2EE security model, a method can be accessed by a user who has at least one of the roles associated with the method. The roles associated with a method form the set of required roles to perform an action. The roles associated with a subject form the set of granted roles for that subject. A subject will be allowed to perform an action if the subject's granted roles contain at least one of the required roles to perform that action. An authorization table, or protection matrix, represents the association of a role with subjects (see Table 3.2 on page 84). In such a table, the role is defined as the security object, and users and groups are defined as security subjects.

According to the J2EE security model, it is the responsibility of the Application Assembler to associate actions on protected resources with sets of required roles. The Deployer refines and configures the policies specified by the Application Assembler when the application is installed into a WAS. Association of roles with subjects can be performed when the application is installed in a WAS or at a later time, as part of security administration.

An alternative approach to implementing role-based authorization is to treat a role as a privilege attribute, typically a user group. For example, if a role name is Manager, a user group named Manager should be defined in the user registry. This model has advantages and disadvantages.
Regardless of which model is used to achieve role-based authorization, the container runtime will need to make authorization decisions each time an attempt is made to access a resource. For example, when the getBalance() method in the AccountBean enterprise bean is invoked, the container must make an authorization decision to verify whether the invocation is allowed. As the J2SE permission security model (see Chapter 8 on page 253) augmented by JAAS (see Chapter 9 on page 289) is rich enough to check permissions based on a subject, it is logical to use the J2SE permission model to perform access checks. Therefore, containers can invoke the java.lang.SecurityManager's checkPermission() method. This also implies that J2EE resources and methods will need to be modeled by different permission types. In Java 2, permissions are implemented as java.security.Permission objects (see Section 8.2 on page 258).

The process of authorization starts at application installation time. When an application is installed, the deployment descriptor information that is stored with the application is read by the container deployment tools. The subject-to-role mapping is managed using the tools provided by the container and/or security provider. Such tools must allow changing the permission-to-role mapping. Effectively, a set of policy management tools helps manage the security policies associated with authorization to access the various J2EE components. When a resource is accessed, the container consults the authorization runtime provided by the security provider to verify whether the user accessing the resource is allowed to perform the operation. The security provider may come from the same vendor as the container provider or from a different one. These steps are illustrated in Figure 15.3.

Figure 15.3. Authorization in a J2EE Container

The advantage of using the J2SE/JAAS permission model is that a single security provider can perform both J2SE permission checks (for example, for accesses to the file system) and J2EE permission checks (for example, to execute EJB methods). Part of the design of authorization engines is to create java.security.Policy objects. Similar to how LoginModules can be used to allow multiple authentication mechanisms to be used by a container, one of many Policy implementations can be used by the container when making an access check on a resource.
Working with Binary (BLOB) Data
A common task I'm often asked about is how to read and write binary data (typically representing a graphical image) using the ADO.NET classes. As most ADO.NET articles and books don't seem to cover this important task, I'll use this chapter's first section to illustrate two distinct ways to work with stored binary data: one with the DataReader class and the other with the DataSet class. I'll also explore the use of "chunking" to facilitate working with very large amounts of binary data in an efficient manner. The section concludes with a demo application that allows you to view the images stored for a sample database and update the database with any other image stored in the file system.

Using the DataReader Class

Most managed providers define a data reader object that allows for reading a forward-only stream of rows from a specific data store type. For example, there is a data reader class for SQL Server (SqlDataReader), Oracle (OracleDataReader), OLE DB provider support (OleDbDataReader), and ODBC driver support (OdbcDataReader). Up to this point, I've ignored the data readers for the simple reason that, for the most part, they don't provide us MFC developers with anything that we don't already have, since we can pick from a plethora of native database access technologies. However, the data reader does make the task of reading and writing binary data very simple; hence its inclusion in this section.

The first step in constructing a data reader object is to connect to a data store using one of the managed data connection objects (such as SqlConnection). Once that's done, you then construct a command object (such as SqlCommand) specifying the query to run against the data store. If the command will yield a result set, you can then call the command object's ExecuteReader method, which returns a data reader object.
Note that, unlike the disconnected nature of the DataSet, the data reader does not read all of the data in a result set into memory at once. Instead, the data reader is an object that keeps a database connection open and basically behaves like a connected, server-side, read-only cursor. It does this by reading only as much of the complete result set as necessary, thereby saving memory, especially in cases where you expect a large result set. As a result, the act of closing the connection also closes the data reader.

Reading Binary Data with a Data Reader Object

Once the data reader object has been constructed, call its Read method to advance to the result set's next record. Simply call this method successively until it returns a Boolean value of false, indicating that there are no more records to read.
While most column data can be obtained through the overloaded Item indexer (which returns the value as an Object), the data reader classes also provide a number of data-type-specific methods such as GetBoolean, GetChars, GetDateTime, and GetGuid. The data-type-specific method we're most interested in here is the GetBytes method. The GetBytes method is used to read binary data and enables you to specify the binary data's column index, the buffer into which to read the data, the index of the buffer where writing should begin, and the amount of data to copy.
Here's an example of allocating and then populating a Byte array with binary data from a data reader object:
Note that the code snippet calls GetBytes twice. This is done because GetBytes can't be called to retrieve the data until a receiving buffer of the required length is allocated. Therefore, the first call is made with the buffer (third parameter) set to a null reference in order to determine the number of bytes to allocate. Once the image buffer has been allocated, the second call to GetBytes results in the population of the buffer. Most image classes are designed to read the graphic from a specified file in order to display the graphic. Therefore, you could write the image data to disk using the FileStream object covered in Chapter 3. Here's a snippet illustrating the writing of a Byte buffer to disk.
Now that you've seen the individual steps involved in using the DataReader class to read a binary value from a database and, optionally, save it to a file, here's a generic function (GetPictureValue) that takes a SqlDataReader object, column index value, and destination file name. Using what you've just learned, this function first ensures that the column value is not null (GetBytes will throw an exception if the column is null), allocates a Byte buffer, reads the data into that buffer, and then saves the data to the specified destination file name.
You can now write something like the following where the code reads every photo in the Employees table and writes that photo out to a file named using the EmployeeID value. (The value 1 being passed to the GetPictureValue function refers to the second column, Photo, specified in the command object's constructor.)
Writing Binary Data with a Command and Parameter Object

As the data reader objects are read-only, they can't be used to insert into or update a data store. Instead, you would use a command object. You can't insert the binary data into an SQL statement; however, each command object contains a collection of parameter objects that serve as placeholders in the query. Parameters are used in situations where either the data cannot be passed in the query (as in our case) or the data to be passed won't be resolved until after the command has been constructed. To specify that a command has a parameter, you simply use the special designation @parameterName in the query passed to the command object's constructor. In the following example, I'm specifying that once executed, the command object's parameter collection will include a parameter object named Photo that will contain the data to be used in this SQL UPDATE statement.
The next thing you would do is to construct the parameter object. Most managed providers define a parameter class that is specific to a given data store. For the SQL Server managed provider, this class is called SqlParameter. Here's an example of setting up an SqlParameter object and then adding it to the command object's parameter collection, where the constructor is being used to specify such values as parameter name, data type, data length, parameter direction (input, in this case, as we're setting a database value), and the data value.
Once the parameters that you've specified for a given command have been constructed and added to the command object, you can then execute the command. As I presented a GetPictureValue function in the previous section, here's a SetPictureValue function that illustrates how to use parameter objects for both the image data and the EmployeeID column value. This allows you to set up the connection and command one time and then call SetPictureValue for each row whose image you wish to set.
Note that you could also have the client set up the command's parameters so that it's only done once and then have the SetPictureValue function update the various parameter object members, such as value and length. However, I like to minimize the work required by the client, and in the case of allocating an object such as the parameter object, the trade-off of performance vs. client work is such that I don't mind instantiating the parameter object each time. Obviously, this is reversed with regard to the connection because, depending on the environment, a connection may take some time to establish. Therefore, with this function, the client is responsible for creating the connection. In this simple test, I'm calling SetPictureValue for two employees. Also note the two parameters specified in the command object's constructor.
Using the DataSet Class

In contrast to the DataReader class, the DataSet class is much easier to use, with the trade-off being much less control than the DataReader offers. As you already saw how to connect to a data source and fill a DataSet in Chapter 6, the following code assumes that you have a DataRow object containing a column (named Photo) that contains binary data. The following code is all you need in order to read a binary value from a DataRow object and then output that data to a file.
As you can see, all that is required is to simply cast the data returned from the Item property! The reason this is so much simpler is that the DataReader gives you much more control in reading data in terms of specifying data type, precision, size, scale, and so on. This paradigm is fine for the DataReader, since with it you read one column of data at a time for each row. However, the DataSet's internal DataTable objects are filled with a single call to the data adapter's Fill method, so ADO.NET is obligated to look at the schema of the data store and download all the data at once, according to the type of each column. You can verify this by enumerating the column objects defined by each DataSet object's DataTable object, where you'll see that the schema information for that column was also downloaded and, ostensibly, used in determining how to read the data. Therefore, with the column data already downloaded and accessible via the Item property, moving the data into a variable is much easier, albeit at the cost of flexibility.

That said, writing binary data is just as easy. The following code snippet first constructs a FileStream object encapsulating the file that will contain the image data for a given record. A Byte array is then allocated for the length of the file, and the data is read into that array. A DataRow object representing the record is then updated. As with any update to disconnected data, you then need to call the adapter's Update method to commit the changes to the data store.
Demo Application to Read and Write Image Data

Assuming you've run the SQL Server script provided for the chapter's demo applications and imported the necessary data (as shown in the sidebar entitled "Creating the Sample Database for SQL Server or MSDE" at the beginning of this chapter), you should have a database named ExtendingMFCWithDotNet that includes a table called Contributors. This table defines four columns.
As mentioned, my main focus in these ADO.NET chapters is on the disconnected side of things. Therefore, this demo will illustrate using the DataSet class to load the Contributors table, displaying the Name and Role values of each record in a list view. When a particular record is selected, the associated image data (Photo value) will be displayed on the dialog. As an added bonus, you'll also see how to display an image from memory using GDI+, as opposed to saving to a temporary file first. The demo will also allow the user to select, from the file system, a different image file for the record. The record can then be saved with this new image data.
At this point, you should have a fully functional application that allows you to both read and write image data to a SQL Server database. Running this application should provide results similar to those shown in Figure 7-2.

Figure 7-2. The BLOBData demo application illustrates how to read and write image data.
Section 5.3. Basic Goals of NIC Initialization
Each network device is represented in the Linux kernel by an instance of the net_device data structure. In Chapter 8, you will see how net_device data structures are allocated and how their fields are initialized, partly by the device driver and partly by core kernel routines. In this chapter, we focus on how device drivers allocate the resources needed to establish device/kernel communication, such as:
List of Tables
Chapter 1: Basic Linux Installation
Table 1: Sources of Linux hardware information.
Table 2: Sample partition device names.
Table 3: Basic Red Hat Linux directories.
Table 4: Partitions for installing Linux on a small hard drive.
Table 5: Partitions for installing Linux on a small hard drive.
Table 6: Partitions for installing Linux as a file server with a large hard drive.
Table 7: Partitions for installing Linux in a dual-boot configuration.
Table 8: Partitions for installing Linux on a production computer.
Table 9: Linux partition types.
Table 10: Major command options at the fdisk prompt.
Table 11: Options when configuring a software RAID device.
Chapter 2: Installing Linux as a File Server
Table 1: Red Hat Linux package groups.
Table 2: Some common rpm command switches.
Table 3: Service scripts in /etc/rc.d/init.d.
Chapter 3: Setting up Your Server File System
Table 1: Computers on my network.
Table 2: Samba TCP/IP ports for configuring a firewall.
Table 3: SWAT menu options.
Table 4: Key Samba variables.
Table 5: Browser election values.
Table 6: Additional browser values, in the case of a tie.
Chapter 4: Setting up Your File Server's Users
Table 1: Creating a new user.
Table 2: Configuring a user account.
Table 3: User and group configuration files.
Table 4: Key commands in the smb.conf file to set up a Samba PDC.
Table 5: File permissions.
Table 6: File permission numeric values.
Table 7: User quotas.
Chapter 6: Connecting Windows Workstations
Table 1: Registry files for disabling encryption.
Table 2: Network properties configuration tabs.
Table 3: Tabs in the Advanced TCP/IP Settings dialog.
Table 4: Microsoft Windows user categories.
Chapter 7: Configuring Printers
Table 1: CUPS RPM packages.
Table 2: CUPS configuration files.
Table 3: Tabs in the "Edit a print queue" window.
Table 4: CUPS printer devices.
Table 5: Authorizing a connection to a shared printer.
Chapter 8: Administration and Management
Table 1: Long listing output.
Table 2: ls command examples.
Table 3: cp command examples.
Table 4: mv command examples.
Table 5: rm command examples.
Table 6: Some vi command examples.
Table 7: /etc/crontab command columns.
Table 8: /etc/fstab command columns.
Where You Have to Distribute
So you want to minimize distribution boundaries and utilize your nodes through clustering as much as possible. The rub is that there are limits to that approach; that is, there are places where you need to separate the processes. If you're sensible, you'll fight like a cornered rat to eliminate as many of them as you can, but you won't eliminate them all.
The overriding theme, in Colleen Roe's memorable phrase, is to be "parsimonious with object distribution." Sell your favorite grandma first if you possibly can.
SuSE
SuSE (pronounced sue suh) has a desktop focus. The SuSE developers have created an easy-to-use installer and simple configuration tools. SuSE has the best documentation of any distribution: thorough and understandable. SuSE is developed privately, not publicly; the software is not available during the development process. It is developed by the SuSE developers, who depend on sales of boxed sets for much of the company's revenue. SuSE includes some commercial software, in addition to the free open source software. Two boxed sets are available:
SuSE also provides business software, such as an enterprise server. SuSE offers the opportunity to install its software free of charge directly via FTP. The FTP version is largely the same, except that some of the commercial software included on the CDs is removed. This is not something you want to do over a dial-up connection. SuSE features include the following:
16.2. Controlling the Loop
What can go wrong with a loop? Any answer would have to include incorrect or omitted loop initialization, omitted initialization of accumulators or other variables related to the loop, improper nesting, incorrect termination of the loop, forgetting to increment a loop variable or incrementing the variable incorrectly, and indexing an array element from a loop index incorrectly.
C++ Example of Treating a Loop as a Black Box
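A sketch of the idea, reconstructed from the two conditions discussed just below (the body is deliberately empty; the surrounding code needs to know only the loop's control conditions):

while ( !inputFile.EndOfFile() && MoreDataAvailable ) {
}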
Cross-Reference: If you use the while ( true )-break technique described earlier, the exit condition is inside the black box. Even if you use only one exit condition, you lose the benefit of treating the loop as a black box.

What are the conditions under which this loop terminates? Clearly, all you know is that either inputFile.EndOfFile() becomes true or MoreDataAvailable becomes false.

Entering the Loop

Use these guidelines when entering a loop:

Enter the loop from one location only. A variety of loop-control structures allows you to test at the beginning, middle, or end of a loop. These structures are rich enough to allow you to enter the loop from the top every time. You don't need to enter at multiple locations.

Put initialization code directly before the loop. The Principle of Proximity advocates putting related statements together. If related statements are strewn across a routine, it's easy to overlook them during modification and to make the modifications incorrectly. If related statements are kept together, it's easier to avoid errors during modification. Keep loop-initialization statements with the loop they're related to. If you don't, you're more likely to cause errors when you generalize the loop into a bigger loop and forget to modify the initialization code. The same kind of error can occur when you move or copy the loop code into a different routine without moving or copying its initialization code. Putting initializations away from the loop (in the data-declaration section or in a housekeeping section at the top of the routine that contains the loop) invites initialization troubles.

Cross-Reference: For more on limiting the scope of loop variables, see "Limit the scope of loop-index variables to the loop itself" later in this chapter.

Use while( true ) for infinite loops. You might have a loop that runs without terminating, for example, a loop in firmware such as a pacemaker or a microwave oven. Or you might have a loop that terminates only in response to an event (an "event loop"). You could code such an infinite loop in several ways. Faking an infinite loop with a statement like for i = 1 to 99999 is a poor choice because the specific loop limits muddy the intent of the loop: 99999 could be a legitimate value. Such a fake infinite loop can also break down under maintenance. The while( true ) idiom is considered a standard way of writing an infinite loop in C++, Java, Visual Basic, and other languages that support comparable structures. Some programmers prefer to use for( ;; ), which is an accepted alternative.

Prefer for loops when they're appropriate. The for loop packages loop-control code in one place, which makes for easily readable loops. One mistake programmers commonly make when modifying software is changing the loop-initialization code at the top of a while loop but forgetting to change related code at the bottom. In a for loop, all the relevant code is together at the top of the loop, which makes correct modifications easier. If you can use the for loop appropriately instead of another kind of loop, do it.

Don't use a for loop when a while loop is more appropriate. A common abuse of the flexible for loop structure in C++, C#, and Java is haphazardly cramming the contents of a while loop into a for loop header. The following example shows a while loop crammed into a for loop header:
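Roughly, the abuse looks like this (a sketch assembled from the statements discussed just below):

// read all the records from a file
for ( inputFile.MoveToStart(), recordCount = 0; !inputFile.EndOfFile(); recordCount++ ) {
   inputFile.GetRecord();
}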
The advantage of C++'s for loop over for loops in other languages is that it's more flexible about the kinds of initialization and termination information it can use. The weakness inherent in such flexibility is that you can put statements into the loop header that have nothing to do with controlling the loop. Reserve the for loop header for loop-control statements: statements that initialize the loop, terminate it, or move it toward termination. In the example just shown, the inputFile.GetRecord() statement in the body of the loop moves the loop toward termination, but the recordCount statements don't; they're housekeeping statements that don't control the loop's progress. Putting the recordCount statements in the loop header and leaving the inputFile.GetRecord() statement out is misleading; it creates the false impression that recordCount controls the loop. If you want to use the for loop rather than the while loop in this case, put the loop-control statements in the loop header and leave everything else out. Here's the right way to use the loop header:

C++ Example of Logical if Unconventional Use of a for Loop Header

recordCount = 0;
for ( inputFile.MoveToStart(); !inputFile.EndOfFile(); inputFile.GetRecord() ) {
   recordCount++;
}

The contents of the loop header in this example are all related to control of the loop. The inputFile.MoveToStart() statement initializes the loop, the !inputFile.EndOfFile() statement tests whether the loop has finished, and the inputFile.GetRecord() statement moves the loop toward termination. The statements that affect recordCount don't directly move the loop toward termination and are appropriately not included in the loop header. The while loop is probably still more appropriate for this job, but at least this code uses the loop header logically. For the record, here's how the code looks when it uses a while loop:

C++ Example of Appropriate Use of a while Loop

// read all the records from a file
recordCount = 0;
inputFile.MoveToStart();
while ( !inputFile.EndOfFile() ) {
   inputFile.GetRecord();
   recordCount++;
}

Processing the Middle of the Loop

The following subsections describe handling the middle of a loop:

Use { and } to enclose the statements in a loop. Use code brackets every time. They don't cost anything in speed or space at run time, they help readability, and they help prevent errors as the code is modified. They're a good defensive-programming practice.

Avoid empty loops. In C++ and Java, it's possible to create an empty loop, one in which the work the loop is doing is coded on the same line as the test that checks whether the work is finished. Here's an example:

C++ Example of an Empty Loop
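Something like the following (a sketch built from the two expressions analyzed just below):

while ( ( inputChar = dataFile.GetChar() ) != CharType_Eof )
   ;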
In this example, the loop is empty because the while expression includes two things: the work of the loop (inputChar = dataFile.GetChar()) and a test for whether the loop should terminate (inputChar != CharType_Eof). The loop would be clearer if it were recoded so that the work it does is evident to the reader:

C++ Example of an Empty Loop Converted to an Occupied Loop

do {
   inputChar = dataFile.GetChar();
} while ( inputChar != CharType_Eof );

The new code takes up three full lines rather than one line and a semicolon, which is appropriate since it does the work of three lines rather than that of one line and a semicolon.

Keep loop-housekeeping chores at either the beginning or the end of the loop. Loop-housekeeping chores are expressions like i = i + 1 or j++, expressions whose main purpose isn't to do the work of the loop but to control the loop. The housekeeping is done at the end of the loop in this example:

C++ Example of Housekeeping Statements at the End of a Loop

nameCount = 0;
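while ( !inputFile.EndOfFile() ) {
   // the work of the loop (the statements here are illustrative, not from the original listing)
   inputFile >> inputString;
   names[ nameCount ] = inputString;
   // loop housekeeping, kept together at the end of the loop
   nameCount++;
}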
As a general rule, the variables you initialize before the loop are the variables you'll manipulate in the housekeeping part of the loop.

Make each loop perform only one function. The mere fact that a loop can be used to do two things at once isn't sufficient justification for doing them together. Loops should be like routines in that each one should do only one thing and do it well. If it seems inefficient to use two loops where one would suffice, write the code as two loops, comment that they could be combined for efficiency, and then wait until benchmarks show that the section of the program poses a performance problem before changing the two loops into one.

Cross-Reference: For more on optimization, see Chapter 25, "Code-Tuning Strategies," and Chapter 26, "Code-Tuning Techniques."

Exiting the Loop

These subsections describe handling the end of a loop:

Assure yourself that the loop ends. This is fundamental. Mentally simulate the execution of the loop until you are confident that, in all circumstances, it ends. Think through the nominal cases, the endpoints, and each of the exceptional cases.

Make loop-termination conditions obvious. If you use a for loop and don't fool around with the loop index and don't use a goto or break to get out of the loop, the termination condition will be obvious. Likewise, if you use a while or repeat-until loop and put all the control in the while or repeat-until clause, the termination condition will be obvious. The key is putting the control in one place.

Don't monkey with the loop index of a for loop to make the loop terminate. Some programmers jimmy the value of a for loop index to make the loop terminate early. Here's an example:
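The trick looks something like this (a sketch; the condition that triggers the early exit is hypothetical):

for ( i = 0; i < 100; i++ ) {
   // some code
   if ( someStopConditionOccurred ) {
      i = 100;   // jimmy the loop index to force early termination
   }
   // more code
}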
The intent in this example is to terminate the loop under some condition by setting i to 100, a value that's larger than the end of the for loop's range of 0 through 99. Virtually all good programmers avoid this practice; it's the sign of an amateur. When you set up a for loop, the loop counter is off limits. Use a while loop to provide more control over the loop's exit conditions.
Avoid code that depends on the loop index's final value It's bad form to use the value of the loop index after the loop. The terminal value of the loop index varies from language to language and implementation to implementation. The value is different when the loop terminates normally and when it terminates abnormally. Even if you happen to know what the final value is without stopping to think about it, the next person to read the code will probably have to think about it. It's better form and more self-documenting if you assign the final value to a variable at the appropriate point inside the loop.
This code misuses the index's final value:
C++ Example of Code That Misuses a Loop Index's Terminal Value
for ( recordCount = 0; recordCount < MAX_RECORDS; recordCount++ ) {
if ( entry[ recordCount ] == testValue ) {
break;
}
}
// lots of code
...
if ( recordCount < MAX_RECORDS ) { <-- 1
return( true );
}
else {
return( false );
}
(1)Here's the misuse of the loop index's terminal value.
In this fragment, the second test for recordCount < MAX_RECORDS makes it appear that the loop is supposed to loop through all the values in entry[] and return true if it finds the one equal to testValue and false otherwise. It's hard to remember whether the index gets incremented past the end of the loop, so it's easy to make an off-by-one error. You're better off writing code that doesn't depend on the index's final value. Here's how to rewrite the code:
C++ Example of Code That Doesn't Misuse a Loop Index's Terminal Value
found = false;
for ( recordCount = 0; recordCount < MAX_RECORDS; recordCount++ ) {
if ( entry[ recordCount ] == testValue ) {
found = true;
break;
}
}
// lots of code
...
return( found );
This second code fragment uses an extra variable and keeps references to recordCount more localized. As is often the case when an extra boolean variable is used, the resulting code is clearer.
Consider using safety counters A safety counter is a variable you increment each pass through a loop to determine whether a loop has been executed too many times. If you have a program in which an error would be catastrophic, you can use safety counters to ensure that all loops end. This C++ loop could profitably use a safety counter:
C++ Example of a Loop That Could Use a Safety Counter
do {
node = node->Next;
...
} while ( node->Next != NULL );
Here's the same code with the safety counters added:
C++ Example of Using a Safety Counter
safetyCounter = 0;
do {
node = node->Next;
...
safetyCounter++; <-- 1
if ( safetyCounter >= SAFETY_LIMIT ) {
Assert( false, "Internal Error: Safety-Counter Violation." ); <-- 1
}
...
} while ( node->Next != NULL );
(1)Here's the safety-counter code.
Safety counters are not a cure-all. Introduced into the code one at a time, safety counters increase complexity and can lead to additional errors. Because they aren't used in every loop, you might forget to maintain safety-counter code when you modify loops in parts of the program that do use them. If safety counters are instituted as a projectwide standard for critical loops, however, you learn to expect them and the safety-counter code is no more prone to produce errors later than any other code is.
Exiting Loops Early
Many languages provide a means of causing a loop to terminate in some way other than completing the for or while condition. In this discussion, break is a generic term for break in C++, C, and Java; for Exit-Do and Exit-For in Visual Basic; and for similar constructs, including those simulated with gotos in languages that don't support break directly. The break statement (or equivalent) causes a loop to terminate through the normal exit channel; the program resumes execution at the first statement following the loop.
The continue statement is similar to break in that it's an auxiliary loop-control statement. Rather than causing a loop exit, however, continue causes the program to skip the loop body and continue executing at the beginning of the next iteration of the loop. A continue statement is shorthand for an if-then clause that would prevent the rest of the loop from being executed.
Consider using break statements rather than boolean flags in a while loop In some cases, adding boolean flags to a while loop to emulate exits from the body of the loop makes the loop hard to read. Sometimes you can remove several levels of indentation inside a loop and simplify loop control just by using a break instead of a series of if tests.
Putting multiple break conditions into separate statements and placing them near the code that produces the break can reduce nesting and make the loop more readable.
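For instance, a sketch of the contrast (the record-processing routine names here are illustrative, not from a listing in this chapter):

// with a boolean flag, the exit logic is buried in nested if tests
done = false;
while ( !done ) {
   if ( !ReadRecord() ) {
      done = true;
   }
   else {
      ProcessRecord();
      if ( AtLastRecord() ) {
         done = true;
      }
   }
}

// with break, each exit sits next to the condition that causes it
while ( true ) {
   if ( !ReadRecord() ) {
      break;
   }
   ProcessRecord();
   if ( AtLastRecord() ) {
      break;
   }
}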
Be wary of a loop with a lot of breaks scattered through it. A loop's containing a lot of breaks can indicate unclear thinking about the structure of the loop or its role in the surrounding code. A proliferation of breaks raises the possibility that the loop could be more clearly expressed as a series of loops rather than as one loop with many exits.
According to an article in Software Engineering Notes, the software error that brought down the New York City phone systems for 9 hours on January 15, 1990, was due to an extra break statement (SEN 1990):
C++ Example of Erroneous Use of a break Statement Within a do-switch-if Block
do {
...
switch
...
if () {
...
break; <-- 1
...
}
...
} while ( ... );
(1)This break was intended for the if but broke out of the switch instead.
Multiple breaks don't necessarily indicate an error, but their existence in a loop is a warning sign, a canary in a coal mine that's gasping for air instead of singing as loud as it should be.
Use continue for tests at the top of a loop A good use of continue is for moving execution past the body of the loop after testing a condition at the top. For example, if the loop reads records, discards records of one kind, and processes records of another kind, you could put a test like this one at the top of the loop:
Pseudocode Example of a Relatively Safe Use of continue
while ( not eof( file ) ) do
read( record, file )
if ( record.Type <> targetType ) then
continue
-- process record of targetType
...
end while
Using continue in this way lets you avoid an if test that would effectively indent the entire body of the loop. If, on the other hand, the continue occurs toward the middle or end of the loop, use an if instead.
Use the labeled break structure if your language supports it Java supports use of labeled breaks to prevent the kind of problem experienced with the New York City telephone outage. A labeled break can be used to exit a for loop, an if statement, or any block of code enclosed in braces (Arnold, Gosling, and Holmes 2000).
Here's a possible solution to the New York City telephone code problem, with the programming language changed from C++ to Java to show the labeled break:
Java Example of a Better Use of a Labeled break Statement Within a
do-switch-if Block
do {
...
switch
...
CALL_CENTER_DOWN:
if () {
...
break CALL_CENTER_DOWN; <-- 1
...
}
...
} while ( ... );
(1)The target of the labeled break is unambiguous.
Use break and continue only with caution Use of break eliminates the possibility of treating a loop as a black box. Limiting yourself to only one statement to control a loop's exit condition is a powerful way to simplify your loops. Using a break forces the person reading your code to look inside the loop for an understanding of the loop control. That makes the loop more difficult to understand.
Use break only after you have considered the alternatives. You don't know with certainty whether continue and break are virtuous or evil constructs. Some computer scientists argue that they are a legitimate technique in structured programming; some argue that they aren't. Because you don't know in general whether continue and break are right or wrong, use them, but only with a fear that you might be wrong. It really is a simple proposition: if you can't defend a break or a continue, don't use it.
Checking Endpoints
A single loop usually has three cases of interest: the first case, an arbitrarily selected middle case, and the last case. When you create a loop, mentally run through the first, middle, and last cases to make sure that the loop doesn't have any off-by-one errors. If you have any special cases that are different from the first or last case, check those too. If the loop contains complex computations, get out your calculator and manually check the calculations.
Willingness to perform this kind of check is a key difference between efficient and inefficient programmers. Efficient programmers do the work of mental simulations and hand calculations because they know that such measures help them find errors.
Inefficient programmers tend to experiment randomly until they find a combination that seems to work. If a loop isn't working the way it's supposed to, the inefficient programmer changes the < sign to a <= sign. If that fails, the inefficient programmer changes the loop index by adding or subtracting 1. Eventually the programmer using this approach might stumble onto the right combination or simply replace the original error with a more subtle one. Even if this random process results in a correct program, it doesn't result in the programmer's knowing why the program is correct.
You can expect several benefits from mental simulations and hand calculations. The mental discipline results in fewer errors during initial coding, in more rapid detection of errors during debugging, and in a better overall understanding of the program. The mental exercise means that you understand how your code works rather than guessing about it.
Using Loop Variables
Here are some guidelines for using loop variables:
Use ordinal or enumerated types for limits on both arrays and loops Generally, loop counters should be integer values. Floating-point values don't increment well. For example, you could add 1.0 to 26,742,897.0 and get 26,742,897.0 instead of 26,742,898.0. If this incremented value were a loop counter, you'd have an infinite loop.
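A small illustration of the hazard (the numbers are illustrative; the exact threshold depends on the floating-point type):

// once x grows past the precision of a float, x + 1.0f rounds back to x,
// so x never reaches the limit and the loop never terminates
for ( float x = 0.0f; x != 100000000.0f; x = x + 1.0f ) {
   // ... loop body ...
}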
Cross-Reference
For details on naming loop variables, see "Naming Loop Indexes" in Section 11.2.
Use meaningful variable names to make nested loops readable Arrays are often indexed with the same variables that are used for loop indexes. If you have a one-dimensional array, you might be able to get away with using i, j, or k to index it. But if you have an array with two or more dimensions, you should use meaningful index names to clarify what you're doing. Meaningful array-index names clarify both the purpose of the loop and the part of the array you intend to access.
Here's code that doesn't put this principle to work; it uses the meaningless names i, j, and k instead:
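Roughly like this (a sketch that mirrors the readable version shown afterward, with the meaningful names replaced by i, j, and k):

for ( int i = 0; i < numPayCodes; i++ ) {
   for ( int j = 0; j < 12; j++ ) {
      for ( int k = 0; k < numDivisions; k++ ) {
         sum = sum + transaction[ j ][ i ][ k ];
      }
   }
}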
What do you think the array indexes in transaction mean? Do i, j, and k tell you anything about the contents of transaction? If you had the declaration of transaction, could you easily determine whether the indexes were in the right order? Here's the same loop with more readable loop variable names:
Java Example of Good Loop Variable Names
for ( int payCodeIdx = 0; payCodeIdx < numPayCodes; payCodeIdx++ ) {
for (int month = 0; month < 12; month++ ) {
for ( int divisionIdx = 0; divisionIdx < numDivisions; divisionIdx++ ) {
sum = sum + transaction[ month ][ payCodeIdx ][ divisionIdx ];
}
}
}
What do you think the array indexes in transaction mean this time? In this case, the answer is easier to come by because the variable names payCodeIdx, month, and divisionIdx tell you a lot more than i, j, and k did. The computer can read the two versions of the loop equally easily. People can read the second version more easily than the first, however, and the second version is better since your primary audience is made up of humans, not computers.
Use meaningful names to avoid loop-index cross-talk Habitual use of i, j, and k can give rise to index cross-talk: using the same index name for two different purposes. Take a look at this example:
C++ Example of Index Cross-Talk
for ( i = 0; i < numPayCodes; i++ ) { <-- 1
// lots of code
...
for ( j = 0; j < 12; j++ ) {
// lots of code
...
for ( i = 0; i < numDivisions; i++ ) { <-- 2
sum = sum + transaction[ j ][ i ][ k ];
}
}
}
(1)i is used first here….
(2)…and again here.
The use of i is so habitual that it's used twice in the same nesting structure. The second for loop controlled by i conflicts with the first, and that's index cross-talk. Using more meaningful names than i, j, and k would have prevented the problem. In general, if the body of a loop has more than a couple of lines, if it might grow, or if it's in a group of nested loops, avoid i, j, and k.
Limit the scope of loop-index variables to the loop itself Loop-index cross-talk and other uses of loop indexes outside their loops is such a significant problem that the designers of Ada decided to make for loop indexes invalid outside their loops; trying to use one outside its for loop generates an error at compile time.
C++ and Java implement the same idea to some extent�they allow loop indexes to be declared within a loop, but they don't require it. In the example on page 378, the recordCount variable could be declared inside the for statement, which would limit its scope to the for loop, like this:
C++ Example of Declaring a Loop-Index Variable Within a for loop
for ( int recordCount = 0; recordCount < MAX_RECORDS; recordCount++ ) {
// looping code that uses recordCount
}
In principle, this technique should allow creation of code that redeclares recordCount in multiple loops without any risk of misusing the two different recordCounts. That usage would give rise to code that looks like this:
C++ Example of Declaring Loop-Indexes Within for loops and Reusing Them Safely�Maybe!
for ( int recordCount = 0; recordCount < MAX_RECORDS; recordCount++ ) {
// looping code that uses recordCount
}
// intervening code
for ( int recordCount = 0; recordCount < MAX_RECORDS; recordCount++ ) {
// additional looping code that uses a different recordCount
}
This technique is helpful for documenting the purpose of the recordCount variable; however, don't rely on your compiler to enforce recordCount's scope. Section 6.3.3.1 of The C++ Programming Language (Stroustrup 1997) says that recordCount should have a scope limited to its loop. When I checked this functionality with three different C++ compilers, however, I got three different results:
The first compiler flagged recordCount in the second for loop for multiple variable declarations and generated an error.
The second compiler accepted recordCount in the second for loop but allowed it to be used outside the first for loop.
The third compiler allowed both usages of recordCount and did not allow either one to be used outside the for loop in which it was declared.
As is often the case with more esoteric language features, compiler implementations can vary.
How Long Should a Loop Be?
Loop length can be measured in lines of code or depth of nesting. Here are some guidelines:
Make your loops short enough to view all at once If you usually look at loops on your monitor and your monitor displays 50 lines, that puts a 50-line restriction on you. Experts have suggested a loop-length limit of one page. When you begin to appreciate the principle of writing simple code, however, you'll rarely write loops longer than 15 or 20 lines.
Limit nesting to three levels Studies have shown that the ability of programmers to comprehend a loop deteriorates significantly beyond three levels of nesting (Yourdon 1986a). If you're going beyond that number of levels, make the loop shorter (conceptually) by breaking part of it into a routine or simplifying the control structure.
Cross-Reference
For details on simplifying nesting, see Section 19.4, "Taming Dangerously Deep Nesting."
Move loop innards of long loops into routines If the loop is well designed, the code on the inside of a loop can often be moved into one or more routines that are called from within the loop.
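As a sketch of the idea (the routine and variable names are illustrative):

for ( transactionIdx = 0; transactionIdx < transactionCount; transactionIdx++ ) {
   // the formerly long loop body now lives in its own routine
   ProcessTransaction( transactionIdx );
}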
Make long loops especially clear Length adds complexity. If you write a short loop, you can use riskier control structures such as break and continue, multiple exits, complicated termination conditions, and so on. If you write a longer loop and feel any concern for your reader, you'll give the loop a single exit and make the exit condition unmistakably clear.