Chapter 8. Object-Oriented Programming with Functions




In the previous chapters, I concentrated on C++ language rules that define what is syntactically

legal in a C++ program and what is not. Similar to natural languages, illegal constructs should be

ruled out, not because of bad taste or ambiguity, but because the compiler will not be able to

convert them into object code. As for legal constructs, there exists a large variety of ways to "say

the same thing." In the previous chapters, I compared different ways of using legal constructs, often

from the point of view of program correctness, performance, and, yes, bad taste. But my major

concern was with program maintainability, making sure that the maintenance programmer does not

spend extra effort trying to understand what the code designer had in mind when the source code

was written.

In this chapter (and in the chapters that follow), understandability of code will be my major

concern. However, the focus of the discussion will shift from writing control constructs in a

segment of code to a higher level of programming: breaking the program into cooperating pieces

(functions and, later, classes).

I will not get into systems analysis, that is, deciding what functions should be in the program to

support the goals of the application. That would make the scope of this book too broad. So I will

assume that whatever functions are necessary for achieving the goal of the program are already

there. Instead, I will concentrate on the ways in which additional functions should be used to make

the program more maintainable and reusable.

There is always more than one way of dividing the job between client functions that cooperate with

each other to achieve the program goal. There is always more than one way of designing the server

functions that handle data and operations on behalf of their client functions. Assuming that all

versions are equivalent from the point of view of program correctness, how do you decide which

one is better?

In the past, most programmers would use program performance as the major criterion. Hardware

progress made this criterion irrelevant for many applications, especially for interactive applications.

For those applications where performance is still important, it is the choice of algorithms and data

structures that affects performance, not the way work is allocated between client and server

functions.

Another important criterion is the ease of writing code. It is still a relevant criterion for small

programs that are developed by a few people, are used for short periods of time, and then are either

discarded or replaced by totally new code. For large systems that are designed by many cooperating

developers and are maintained for long periods of time, the economics of software development

suggests a different answer. The best version of the program is the one whose parts are easier to

reuse (providing savings during development of the application or its future releases) or easier to

maintain (providing savings during program evolution).






These two characteristics, maintainability and reusability, are the most important characteristics

of software quality. However, these characteristics are too general. Indeed, it is not obvious how to

decide which version of the code is less expensive to maintain and which version of the code is less

expensive to reuse.

Reusability is related to the independence of program parts. Among several versions of C++ code,

the version that has fewer links with other segments of the program is more reusable in other

contexts. Maintainability is also related to independence of program parts. Among several versions

of C++ code, the version that takes less time to understand, preferably without studying other

segments of the program, is easier to change without side effects to other parts of code.

This is why the need to refer to other segments of the program is evidence of poor quality of the

code. This is why the potential to understand the code in isolation, without referring to other code

segments, is evidence of good quality of the code. I will often say that this version of the code is

better than another one if this version can be understood with less effort or with fewer lookups in

other parts of the code.

This is nice but is still not specific enough for the practicing programmer. The concepts of code

understandability and independence should be supported by more-specific technical criteria that are

easier to recognize and to use. In this chapter, I will offer you several technical criteria. Two

criteria, cohesion and coupling, are relatively old. Two other criteria, encapsulation and information

hiding, are relatively new, and the industry has not accumulated enough experience in using them.

In addition to encapsulation and information hiding, I will use several varieties of criteria related to

code understandability and independence:

• pushing responsibility down to server functions from client functions

• limiting knowledge shared by server and client functions

• separation of concerns for client and server functions

• avoiding tearing apart what should belong together

• passing developer's knowledge to the maintainer in code rather than in comments



I did not find one all-embracing term for these principles ("principle of maximum independence"?

"Shtern principle"? "sharing knowledge on the-need-to-know basis"? "principle of self-explanatory

code"?). As you are going to see, these principles somewhat overlap with each other and with the

criteria of cohesion, coupling, information hiding, and encapsulation. I think that practicing

programmers should be familiar with all of these principles. Their major advantage is that they are

operational: They show the programmer the directions to go in search of a better design. Using

them will help you to understand how to improve your coding practices.




The idea behind these criteria or principles is that the functions in a program cooperate doing parts

of the same job. For any division of responsibilities among them, these functions have to share

some knowledge, have common concerns, partition parts of the same job, and do it in different

functions. After all, these functions are parts of the same program. To make these functions

reusable and understandable, you should assign the responsibilities between functions (design the

system) in such a way that these dependencies among functions are minimized.

As is often the case with high-quality programming, writing a better program requires more

programming time and results in more source code lines than writing a lower quality program does.

Some programmers (and managers) might be disappointed by this increase in the amount of work

to do. I would like to persuade these programmers (and managers) with an analogy about traffic

rules.

When I sit waiting at a red light, I sometimes think that without restrictive traffic rules we all could

get to our destinations faster. And this is probably true, at least for some destinations and for some

drivers. But not for all destinations and not for all drivers. Driving without rules causes more traffic

accidents and more congestion. Those drivers that avoid accidents will indeed get to their

destinations faster without the rules. But many more drivers will be delayed either by accidents or

by congestion caused by accidents. Traffic rules force us to make the investment of time up front to

save time in the long run.

Similarly, ignoring rules of maintainability and reusability will let you write programs faster, at

least for some applications and for some programmers. But not for all applications and not for all

programmers. The time saved by writing programs that are hard to understand will be more than

offset by the time spent trying to understand what the code designers were striving to achieve when

they were writing code (and where they went wrong).

This is why the software industry pays so much attention to writing comments. Comments are an

investment that we make up front so that it pays off in the long run (when they are clear, complete,

and up-to-date). Often, line comments are obscure, incomplete, or do not reflect changes made after

the code was written. Investing in writing self-explanatory code is better than investing in

comments.

If you are writing a small program, the rules for writing self-explanatory quality code are not very

important. If you are developing a large application, investing up front in writing quality code is

crucial for reaping the benefits in the long run.



Cohesion

Cohesion describes the relatedness of the steps that the designer puts into the same segment of

code, for example, a function.




When a function has good cohesion (high cohesion), it performs one task over one computational

object or data structure; when cohesion is poor (weak cohesion), the function performs several

tasks over one object or even several tasks over several objects. When the function exhibits poor

cohesion, it contains diverse unrelated computations over unrelated computational objects. This

means that these objects belong elsewhere: The designer tore apart things that should have

belonged with something else and instead put them into the same function.

High-cohesion functions are easy to name. Usually, one uses a verb-plus-noun combination. The

active verb is used for the action that the function performs, and the noun is used for the object (or

the subject) of the action. For example, insertItem(), findAccount(), and so on (if the function

name is honest, which is not always the case).

For low-cohesion functions, one has to use several verbs or nouns, for example,

findOrInsertItem().
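The naming test can be sketched in code. In the minimal sketch below, the item type, the ItemList container, and all three function bodies are assumptions made for illustration; the chapter names these functions but does not define them. The two-verb function performs two tasks, and splitting it yields two functions that each take an honest verb-plus-noun name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical data: a list of item names (not taken from the chapter).
typedef std::vector<std::string> ItemList;

// Low cohesion: two tasks (search and update) hidden behind one call.
std::size_t findOrInsertItem(ItemList &items, const std::string &name)
{ for (std::size_t i = 0; i < items.size(); i++)
    if (items[i] == name) return i;          // task 1: find
  items.push_back(name);                     // task 2: insert
  return items.size() - 1; }

// Higher cohesion: one task per function.
// Returns the index of name, or items.size() if name is absent.
std::size_t findItem(const ItemList &items, const std::string &name)
{ for (std::size_t i = 0; i < items.size(); i++)
    if (items[i] == name) return i;
  return items.size(); }

// Appends name to the end of the list.
void insertItem(ItemList &items, const std::string &name)
{ items.push_back(name); }
```

A client that needs both steps calls findItem() and, if the item is absent, insertItem(); a client that needs only one of them is not forced to accept the other.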



Here is an example, however awkward. (All good examples of poor cohesion are awkward because

they describe poorly designed functions.)

void initializeGlobalObjects ()
{ numaccts = 0;                          // one computational object
  fstream inf("trans.dat",ios::in);      // transaction file
  numtrans = 0;                          // another computational object
  if (!inf) exit(1); }                   // transaction file again



In this example, numaccts should be initialized where accounts are processed; it belongs with

account processing. Similarly, numtrans should be initialized where transactions are processed; it

belongs with transaction processing, not with account initialization. In this function, I tore apart

what belongs with other steps of processing and put them together into a function of weak

cohesion.

The remedy is to redesign. As I mentioned in Chapter 1, "Object-Oriented Approach: What's So

Good About It?" redesign means changing the list of parts (functions) and their responsibilities. In

the case of poor cohesion, redesign usually means breaking the function with poor cohesion into

several cohesive functions. The tradeoff is that you can wind up with too many small functions.

Besides potential impact on performance, this imposes on the maintenance programmer a larger

number of things to remember (function names and their interfaces). For a small function like

initializeGlobalObjects() above, breaking up does not make sense. Such a function probably

should be eliminated.

Cohesion is not a very strong criterion; the decision to redesign by breaking up functions should not


be made lightly. In case of doubt, cohesion needs other criteria to complement it. However,

cohesion is important for evaluating designs. Make sure you use it when you evaluate design

alternatives, that is, the distribution of work between functions.



Coupling

Coupling is a much stronger and more useful criterion than cohesion. It describes the interface, or flow of

data values, between a called function (a server function) and a calling function (a client function).

Coupling can be implicit, with functions communicating through global variables, or explicit, when

the client and server functions communicate through parameters. Implicit coupling is higher: it

results in a higher degree of dependency between the client and server functions. Explicit coupling

is lower: When functions communicate through parameters, it is easier to understand, reuse, and

modify them.

The intensity of coupling is described by the number of values that flow from the client function to

the server function and back. A large number of values means strong coupling: a high degree of

dependency between functions. A small number of values means weak coupling: a low degree of

dependency between the client and server functions.
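To make the count of values concrete, here is a minimal sketch of two interfaces for the same service. The functions computeTotalWide() and computeTotal() and their parameters are hypothetical, chosen only to show how the number of values crossing the interface can differ.

```cpp
// Stronger coupling: six values cross the interface (three in, three out).
// The client receives intermediate results whether it needs them or not.
void computeTotalWide(double price, int quantity, double taxRate,
                      double &subtotal, double &tax, double &total)
{ subtotal = price * quantity;
  tax = subtotal * taxRate;
  total = subtotal + tax; }

// Weaker coupling: four values cross the interface (three in, one out).
// Intermediate values stay inside the server.
void computeTotal(double price, int quantity, double taxRate, double &total)
{ double subtotal = price * quantity;
  total = subtotal + subtotal * taxRate; }
```

If the client really uses the subtotal and the tax, the wider interface is justified; if it does not, the narrower one gives the maintainer fewer data flows to trace.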



Implicit Coupling

The client function supplies the server function with input data for computations and depends on

the results computed by the server function (server output). Coupling is implicit when the functions

communicate through global variables that are not listed in the server function interface.

Consider, for example, an interactive program that prompts the user to enter the year and prints

whether it is a leap year.

#include <iostream>
using namespace std;

int main()
{ int year, remainder; bool leap;                // program data
  cout << "Enter the year: ";                    // prompt the user
  cin >> year;                                   // accept user input
  remainder = year % 4;
  if (remainder != 0)                            // it is not divisible by 4
    leap = false;                                // hence, it is not a leap year
  else
    { if (year%100 == 0 && year%400 != 0)
        leap = false;                            // divisible by 100 but not by 400
      else
        leap = true; }                           // otherwise, it is a leap year
  if (leap)
    cout << year << " is a leap year\n";         // print results
  else
    cout << year << " is not a leap year\n";
}



This program is similar to the code I discussed in Chapter 4, "C++ Control Flow," (Listings 4.8 and

4.9). This is a small program and, in truth, it does not need any modularization. On the other hand,

a program that benefits the most from modularization should be fairly large. Studying program

details and comparing different alternatives would become a task in itself and would distract you

from the discussion of the principles of modularization, which I would like you to concentrate on. It

is these principles you will apply in real life, not the details of examples.

This is why I would like you to pretend that this is a very large and complex program and follow

me through several cycles of its redesign by breaking it into cooperating functions.

So, this is a large monolithic program that I would like to break into manageable components.

Again, for simplicity's sake, let us break it into only two functions: main(), which is responsible for

the user interface and the general flow of computations, and isLeap(), which uses the values of

year and remainder to compute the value of leap that is used by main() to print the results.

void isLeap()
{ if (remainder != 0)                            // it is not divisible by 4
    leap = false;                                // hence, it is not a leap year
  else if (year%100==0 && year%400!=0)
    leap = false;                                // divisible by 100 but not by 400
  else
    leap = true; }                               // otherwise, it is a leap year



There is one technical problem here that is related to the concept of scope discussed in Chapter 6,

"Memory Management: The Stack and the Heap." The values of year and remainder that the

function isLeap() uses are set in main(). The value of leap that the function isLeap() computes

is used by main(). However, if I define these variables in main(), they will be visible only in

main(): The C++ scope rules would prevent them from being visible in any other function, and

isLeap() will not be able to manipulate these variables. If I define these variables in isLeap(),

they would be visible only in isLeap(). The C++ scope rules would prevent these variables from

being visible in main(). To make these variables visible both in main() and isLeap(), I have to


define them as global to both of these functions.

Listing 8.1 demonstrates this solution. A sample run of this program is shown in Figure 8-1.



Figure 8-1. Output for program in Listing 8.1.



Example 8.1. Example of implicit coupling through global variables.

#include <iostream>
using namespace std;

int year, remainder;                             // global input variables
bool leap;                                       // global output variable

void isLeap()                                    // inputs: year, remainder; output: leap
{ if (remainder != 0)                            // access three global variables
    leap = false;                                // if not divisible by 4, it is not leap
  else if (year%100==0 && year%400!=0)           // access global variables
    leap = false;                                // divisible by 100 but not by 400: not leap
  else
    leap = true; }                               // otherwise, it is a leap year

int main()
{ cout << "Enter the year: ";
  cin >> year;                                   // prompt the user, enter data
  remainder = year % 4;                          // access global variables
  isLeap();                                      // define whether it is a leap year
  if (leap)
    cout << year << " is a leap year\n";         // print results
  else
    cout << year << " is not a leap year\n";
  return 0;
}



In this program, function main() calls function isLeap(). Function main() is a client that gets its

job done by calling other functions. Function isLeap() is a server that does some job for a client

that calls it. This relationship between the two functions is shown in a structure chart in Figure 8-2.

The structure chart also shows the data flow between the functions. Variables year and remainder

are set in main() and are used by isLeap() as its input values to compute its results. The value of

variable leap is produced by function isLeap() as its output and is used by main() after the call to

isLeap().






Figure 8-2. A structure chart for the program in Listing 8.1.



Notice that input variables year and remainder must have legitimate values before the function

isLeap() is called by main(). It is the responsibility of the client function to make sure that these

variables are properly initialized: The server isLeap() does not make any checks of validity; it

assumes that the client main() lives up to its obligations.

Similarly, the output variables (in this case the variable leap) do not have to have a legitimate

value before the call to the server function. It is the responsibility of the server isLeap() to set the

output value, and the client main() uses this value after (but not before) the call.

It is important to understand the data flow between the functions. If I know that variables year and

remainder are input variables for isLeap(), I would expect that the server function uses these

values but does not change them. It would be quite odd to expect that function isLeap() does

something like this:

void isLeap()
{ remainder = 4; year = 2000;                    // unexpected nonsense!
  . . .






Similarly, if I know that variable leap is an output variable for function isLeap(), I would not

expect the client main() to initialize this variable before main() calls isLeap() (or, for that

matter, change its value immediately after the call without first using it for some purpose).

int main()
{ cout << "Enter the year: ";
  cin >> year;                                   // prompt the user, enter data
  remainder = year % 4;                          // access global variables
  leap = false;                                  // misleading initialization before call
  isLeap();                                      // define whether it is a leap year
  leap = true;                                   // misleading (and incorrect) if done after call
  . . .



What will the maintainer assume reading the main() function above? After establishing the goal of

the assignment to remainder (it is used in isLeap() for computing the value of variable leap), the

maintainer will study isLeap() again, trying to figure out the purpose of the assignment to leap.

For a small function, it will only take a few seconds to figure out that the value assigned in client

main() to leap is not used in the server isLeap() or even in client main(). But this is true only

for a small function. For a large program, this will require more time, and the maintainer might

become confused and come to a wrong conclusion.

True, some programmers dislike noninitialized variables so much that they initialize variables even

when initialization is not needed. They say that this is helpful when the server function for some

reason fails to assign the value. But isLeap() is not one of those functions! Neither are the

majority of the functions you have written or will ever write. If programmers understood the data

flow between functions, the functions would never fail to assign values to their output variables.

You see that this innocent-looking "defensive" programming technique results in code that requires

more time to understand. From the point of view of quality criteria (readability and independence

of program parts), this technique invariably results in inferior code; that is, it contributes to the

software crisis we all would like to eliminate. Avoid this practice. Instead of initializing everything

in sight, tell the maintainer what values will be used as server input (by initializing them in the

client) and what values are server output variables (by not initializing them in the client).

I hope that you follow this discussion and see the importance of passing to the maintainer the

knowledge the developer has about the data flow between the functions. Let us go back to the

discussion of coupling.

Coupling describes how much one has to study to understand the data flow between the functions.




Often it requires one to study the pattern of data handling by the client and the server functions. In

Listing 8.1, for example, I notice that main() assigns values to variables year and remainder and

that isLeap() uses these values. I also notice that main() does not initialize leap, isLeap()

assigns to leap a value, and main() uses this value after the call to isLeap(). That's that.

However, to establish these simple dependencies, I have to study both the client and server

functions in their entirety. It is easy to do for this trivial example I am discussing, but it takes more

time for any function of realistic size and complexity. Can one improve this labor-intensive and

error-prone technique? Sure. The way to do that is to use explicit coupling instead of implicit

coupling.
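The error-prone side of implicit coupling can be shown with a short sketch based on Listing 8.1. The two client functions, checkYearCarelessly() and checkYear(), are hypothetical, and the namespace is added here only to keep the global names local to this example. Because the header of isLeap() lists no inputs, the compiler cannot notice when a client forgets one of them.

```cpp
namespace sketch {                     // added for this example only

int year, remainder;                   // global inputs, as in Listing 8.1
bool leap;                             // global output

void isLeap()                          // the header reveals nothing about the data flow
{ if (remainder != 0)
    leap = false;
  else if (year%100==0 && year%400!=0)
    leap = false;
  else
    leap = true; }

// Hypothetical client that forgets one of the hidden inputs.
bool checkYearCarelessly(int y)
{ year = y;                            // remainder is not updated here
  isLeap();                            // compiles anyway and uses a stale remainder
  return leap; }

// Hypothetical client that honors the whole implicit interface.
bool checkYear(int y)
{ year = y;
  remainder = y % 4;
  isLeap();
  return leap; }

}
```

If checkYear(2023) runs first, remainder is left at 3, and a later call to checkYearCarelessly(2024) reports that 2024 is not a leap year. Nothing in the call isLeap() hints at the mistake; the maintainer has to read both function bodies to find it.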



Explicit Coupling

Explicit coupling is through function parameters, when all input and output variables used by the

server function are included in the server function parameters, and no global variables are used in

the data flow between the client and the server. Listing 8.2 shows the same example as in Listing

8.1 where explicit parameters replace the use of implicit data flows through global variables. This

program executes in the same way as the program in Listing 8.1.



Example 8.2. Example of explicit coupling through parameters.

#include <iostream>
using namespace std;

void isLeap(int year, int remainder, bool &leap)  // parameters
// inputs: year, remainder; output: leap
{ if (remainder != 0)
    leap = false;
  else if (year%100==0 && year%400!=0)
    leap = false;
  else
    leap = true; }

int main()
{ int year, remainder;                           // local input variables
  bool leap;                                     // local output variable
  cout << "Enter the year: ";
  cin >> year;                                   // input variables are set
  remainder = year % 4;
  isLeap(year,remainder,leap);                   // output variable is set
  if (leap)                                      // output variable is used
    cout << year << " is a leap year\n";
  else
    cout << year << " is not a leap year\n";
  return 0;
}






In Listing 8.2, the server function isLeap() has three parameters. There are no global variables.

Variables year, remainder, and leap are defined as local in the client function main(). Why is

this possible? Because they do not have to be known in the scope of function isLeap() as they do

in Listing 8.1. Instead, function isLeap() accesses these variables as actual arguments that are

passed from the client function in the call to function isLeap().

This is a general observation: When two functions communicate through data, the components of

the data flow must either be declared as global to both functions or be defined in the

scope of the client function and passed as arguments to the server function.

As in the previous example, variables year and remainder are input variables for isLeap() and

variable leap is an output variable. How do I know? I study the header (or the

prototype, whichever is available) of the function isLeap() rather than the body of the function.

void isLeap(int year, int remainder, bool &leap)  // parameters
{ . . . }



Can you tell without studying the function code what the role of each parameter is? Sure.

Parameters year and remainder are passed by value. Hence, they cannot be output parameters.

You do not expect function isLeap() to set their values.

void isLeap(int year, int remainder, bool &leap)  // parameters
{ remainder=4; year=2000; . . .                   // useless for value parameters



Hence, you conclude that these two are input parameters. The values of the actual arguments

should be set by the client code before the function call, and these values will be used by the server

function in its computations.

Similarly, parameter leap is passed by reference. This means that it is an output parameter.

Actually, it also can be an input/output parameter; that is, the client function might set its value

initially, and then the server function might update that value. But the main point is that function

isLeap() changes the value of parameter leap.
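The difference between the two passing modes can be checked with a small sketch. The function scrambleInputs() is hypothetical and exists only to show that assignments to value parameters never reach the client; isLeap() has the same header and body as in Listing 8.2.

```cpp
// Value parameters are copies: the assignments below change only the copies.
void scrambleInputs(int year, int remainder)
{ year = 2000; remainder = 4; }

// Reference parameter: the server writes into the client's own variable.
void isLeap(int year, int remainder, bool &leap)
{ if (remainder != 0)
    leap = false;
  else if (year%100==0 && year%400!=0)
    leap = false;
  else
    leap = true; }
```

After a call such as scrambleInputs(y, r), the client's y and r hold the values they had before the call. After isLeap(y, r, leap), the client's leap holds whatever the server assigned, regardless of its value before the call.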

How much should I study to arrive at these conclusions? Not very much, just the header of the

function. The structure chart for the program in Listing 8.2 is shown in Figure 8-2. It is the same as

for the program in Listing 8.1, but implicit data flows through global variables are replaced by explicit

data flows of parameters. Does the amount of time I spend depend on the size or complexity of the
