Object-Oriented Testing: Myth and Reality

Robert V. Binder

This article appeared in Object magazine, May 1995.

Perceptions About Testing Objects

There are many effective approaches to testing object-oriented software (see bibliography). However, most have been developed recently and are not yet widely disseminated. As software developers seem to abhor a conceptual vacuum, some myths about testing objects have formed. I've heard these myths in my testing seminars and in discussions with clients, and have seen a few in print.

A common result of this testing mythology is that testing-by-poking-around becomes accepted practice. The developer tries to demonstrate that objects they've produced do something useful without crashing. They may even ask someone else to "break" them. The breaker tries a few dirty tricks. Either effort may reveal some faults. The faults are repaired and the system is then deemed "tested".

There are several problems with testing-by-poking-around. Many parts of the object or system under test are never activated. There is no systematic basis for test design, so it is not possible to determine how much of the system, its specification, and its requirements have actually been exercised. Remaining faults will surface at a later date. Then, they will surely be more difficult and costly to repair, with an operational impact ranging from nuisance to catastrophe. Finally, testing-by-poking-around misses a big opportunity. Test case design is a potent defect-prevention strategy. If tests are developed in parallel with classes, better classes will result sooner.

A test process that complements object-oriented design and programming can significantly increase reuse, quality, and productivity. Establishing such a process usually means dealing with some common misperceptions (myths) about testing object-oriented software. This article is about these perceptions. We'll explore these myths and their assumptions and then explain why each myth is at odds with reality.

Myth: Testing is unnecessary -- objects and object-oriented development are different, after all. Object-oriented development is iterative, so with each iteration, we'll find and reduce bugs -- this is an easier and more effective way (than testing) to develop trustworthy classes. Reuse by itself means that an object has been sufficiently exercised. Objects acquire stability through reuse; this is a "natural" way to obtain high quality. With iterative and incremental development we obviate the need for a separate test activity, which was really only necessary in the first place because conventional programming languages made it so easy to make mistakes.

Reality: Human error is as likely as ever. As with conventional languages, it results in simple programming mistakes, unanticipated interaction, and incorrect or missing behavior. Reuse in no way guarantees that a sufficient number of paths and states have been exercised to reveal all faults. Reuse is limited to the extent that supplier classes are trustworthy. Debugging by reuse is simply not an efficient strategy for rapid, large-scale development. As reusers, we would do well to follow President Reagan's arms-control strategy: "Trust but verify."

Myth: Testing gets in the way. Class development is an exploratory process, not unlike composing music or prose. Writing good software is like telling a good story -- it just flows and grows. The idea of testing to find faults is fundamentally wrong -- all we need to do is to keep "improving" our good ideas. The simple act of expression is sufficient to create trustworthy classes. Testing is a destructive, rote process -- it isn't a good use of a developer's creative abilities and technical skills.

Reality: Testing can be a complementary, integral part of development. On a personal level, this argument often boils down to simple distaste for what is perceived to be an unpleasant task (see "Testing is trivial"). While poorly organized testing is usually an ineffective hassle, a good testing strategy contributes to the development effort. There are two sophisticated alternatives to unit testing: Mills's Cleanroom approach and Gilb's inspections. Both assert that unit testing effort can be minimized if the formal, defined, and repeatable development practices they recommend are used. Neither advocates skipping integration and system testing.

Myth: Testing is a structured/waterfall idea -- it can't be consistent with incremental object-oriented development. Objects evolve -- they aren't just designed, thrown over the wall for coding, and over another wall for testing. What's more, if you test each class in a system separately then you have to do "big-bang" integration, which is an especially bad idea with object-oriented systems.

Reality: Testing can be incremental and iterative. While the iterative and incremental nature of object-oriented development is inconsistent with a simple, sequential test process (test each unit, then try to integration-test all of them, then do system test), it does not mean that testing is irrelevant. The boundary that defines the scope of unit and integration testing is different for object-oriented development. Tests can be designed and exercised at many points in the process. Thus "design a little, code a little" becomes "design a little, code a little, test a little." The scope of testing corresponds to the collaborations necessary to accomplish the responsibility under test. The appropriate goals and techniques shift with the scope of integration: very OO-specific for a small scope and increasingly less OO-specific for a larger scope. If the system under development is a library component (not an application system), then the scope remains small. As the scope of integration in an application system approaches all components in the system, the appropriate testing techniques become very similar to conventional system testing.

Myth: Testing is trivial. Testing is simply poking around until you run out of time. All we need to do is start the app, try each use-case, and try some garbage input. Testing is neither serious nor challenging work -- hasn't most of it already been automated?

Reality: Hunches about testing completeness are notoriously optimistic. Adequate testing requires a sophisticated understanding of the system under test. You need to be able to develop an abstract view of the dynamics of control flow, data flow, and state space in a formal model. You need an understanding of system requirements at least as good as the designer's or user's. You need to be able to define the expected results for any input and state you select as a test case. This is interesting work for which little automation is available.

Myth: Automated GUI testing is sufficient. If a system is automatically exercised by trying permutations of GUI commands supplied by a command playback tool, the underlying application objects will be sufficiently tested.

Reality: GUI-based tests may be little more than automated testing-by-poking-around. While there are many useful capture/playback products to choose from, the number of hours a script runs has no direct or necessary correlation with the extent to which the system under test has been exercised. It is quite possible to retest the same application logic over and over, resulting in inflated confidence. Further, GUI test tools are typically of little use for objects in embedded systems.

Myth: If programmers were more careful, testing would be unnecessary. Programming errors can be eliminated by extra effort, extra pressure, or extra incentive. Bugs are simply an indication of poor work habits. These poor work habits could be avoided if we'd use a better management strategy.

Reality: Many bugs only surface during integration. There are many interactions among components that cannot be easily foreseen until all or most components of a system are integrated and exercised. So, even if we could eliminate all individual sources of error, integration errors are highly likely. Compared to conventional systems, object-oriented systems have more components which must be integrated earlier in development. Since there are elements of a system that are not present until the code has been loaded into the target machine and exercised, there is no way that all faults could be removed by class or class-cluster testing alone, even if every method were subjected to a formal proof of correctness. Static methods cannot reveal interaction errors with the target or transient performance problems in hard real-time systems.

Myth: Testing is inconsistent with a commitment to quality. Testing assumes faults have escaped the design and programming process. This assumption is really just an excuse for sloppy development. All bugs are due to errors that could be avoided if different developer behavior could be induced. This perception is often a restatement of the preceding sloppy programmer myth.

Reality: Reliable software cannot be obtained without testing. Testing activities can begin and proceed in parallel with concept definition, OOA, OOD, and programming. When testing is correctly interleaved with development, it adds considerable value to the entire development process. The necessity of testing is not an indictment of anything more than the difficulty of building large systems.

Myth: Testing is too expensive -- we don't have time. To test beyond testing-by-poking-around takes too much time and costs too much. Test tools are an unnecessary luxury since all we need are a few good pokes. Besides, projects always slip -- testing time gets squeezed anyway.

Reality: Pay me now, or pay me much more later. The cost of finding and correcting errors rises as the time between fault injection and detection increases. The lowest cost results when you prevent errors. If a fault goes unnoticed, it can easily take hours or days of debugging to diagnose, locate, and correct after the component is in widespread use. Failures in operational systems can cause severe secondary problems. Proper testing is very cheap by comparison, even when done manually. Efficient testing requires automated support. The typical testing tool pays for itself after one or two projects by either increasing the number of tests that can be run with a given budget or reducing the cost to achieve a reliability goal. Reduction of testing due to schedule slippage is a frequent problem. The fact that it happens does not mean that testing is unwise. This kind of schedule compression can usually be avoided by starting test activities earlier in the project.

Myth: Testing is the same (as it is with conventional software). The only kind of testing that matters is "black-box" system testing, where we define the externally observable behavior to be produced from a given input. We don't need to use any information about the implementation to select these tests and their test inputs.

Reality: OO code structure matters. Effective testing is guided by information about likely sources of error. The combination of polymorphism, inheritance, and encapsulation is unique to object-oriented languages, presenting opportunities for error that do not exist in conventional languages. Our testing strategy should help us look for these new kinds of errors and offer criteria to help decide when we've done enough looking. Since the "fundamental paradigm shift" often touted for object-oriented development has led to some new points of view and representations, our techniques for extracting test cases from these representations must also change.

Myth: Conventional testing is useless for objects. Objects are so different, everything we know about conventional testing is useless. Conventional testing deals with statement control flow in modules; object methods are small and encapsulated, so control flow can be checked by inspection. Conventional testing doesn't deal with state and collaborations.

Reality: Conventional testing techniques can be adapted. There is a large body of knowledge about testing which is theoretically and practically proven. We know how to model paths, states, data flow, and how to efficiently select test cases. While the kinds of errors we're likely to see in object-oriented code are different than those in conventional software, basic testing techniques continue to apply, with the necessary changes. For example, I've developed a technique for testing all intra-class data flows which combines ideas from data-flow testing and state-machine testing. It offers a systematic means to exercise all possible data flows over all possible method activation sequences.
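
The article does not describe the technique itself, so the following is only a loose illustration of the underlying idea of exercising method activation sequences systematically, not the author's method. The BoundedCounter class, its invariant, and the sequence length are all hypothetical choices made for this sketch. Every sequence of up to three method calls is applied to a fresh object, and a class invariant is checked after each call.

```python
from itertools import product


class BoundedCounter:
    """Hypothetical class under test: value must stay within 0..limit."""

    def __init__(self, limit):
        self.limit = limit
        self.value = 0

    def increment(self):
        if self.value < self.limit:
            self.value += 1

    def decrement(self):
        if self.value > 0:
            self.value -= 1

    def reset(self):
        self.value = 0


METHODS = ("increment", "decrement", "reset")


def failing_sequences(max_length, limit=2):
    """Run every method sequence up to max_length on a fresh object.

    Returns the sequences after which the class invariant was violated.
    """
    failures = []
    for length in range(1, max_length + 1):
        for sequence in product(METHODS, repeat=length):
            counter = BoundedCounter(limit)
            for name in sequence:
                getattr(counter, name)()
                # Class invariant, checked after every activation.
                if not 0 <= counter.value <= counter.limit:
                    failures.append(sequence)
                    break
    return failures


# Three methods give 3 + 9 + 27 sequences of length one to three.
sequences_tried = sum(len(METHODS) ** n for n in range(1, 4))
failures = failing_sequences(3)
```

Exhaustive enumeration grows exponentially with sequence length, which is one reason a model of states and data flows is needed to select sequences rather than try them all.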

Myth: Inheritance means never having to say you're sorry. Specializing from trusted (or tested) superclasses means that their subclasses will also be correct -- we don't need to retest inherited features.

Reality: Subclasses create new ways to misuse inherited features. Each subclass is a new and different context for an inherited superclass feature. Different test cases are needed for each context. Inherited features need to be exercised in the unique context of the subclass. We need to retest inherited methods, even if they weren't changed. Although methods A and B may work perfectly well in a superclass, the action and interaction of A, B, and C (an extension) in a subclass should be tested. We need to see that C uses A and B correctly and that C does not produce any side effects which cause failures in A or B. None of this is assured by testing the superclass containing only A and B, or by only testing method C in its subclass.
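
A minimal sketch of the A, B, and C situation, using hypothetical classes invented for illustration: deposit plays the role of A, balance the role of B, and annotate the extension C. The superclass test passes on both classes when C is never called, so only a test of the interaction in the subclass context reveals the side effect.

```python
class Account:
    """Hypothetical superclass with two features: deposit (A) and balance (B)."""

    def __init__(self):
        self._entries = []

    def deposit(self, amount):  # method A
        self._entries.append(amount)

    def balance(self):  # method B
        return sum(self._entries)


class AuditedAccount(Account):
    """Hypothetical subclass adding an extension (C) that touches inherited state."""

    def annotate(self, note):  # method C
        # Seeded fault: the note lands in the list that balance() sums.
        self._entries.append(note)


def check_deposit_and_balance(account):
    """Superclass test case, written so it can be rerun on any subclass."""
    account.deposit(10)
    account.deposit(5)
    return account.balance() == 15


def check_balance_after_annotate(account):
    """Subclass-context test: do A and B still work once C has run?"""
    account.deposit(10)
    account.annotate("opening deposit")
    try:
        return account.balance() == 10
    except TypeError:
        # balance() tried to add a string to a number.
        return False


superclass_ok = check_deposit_and_balance(Account())
inherited_ok = check_deposit_and_balance(AuditedAccount())
interaction_ok = check_balance_after_annotate(AuditedAccount())
```

Here superclass_ok and inherited_ok are both true while interaction_ok is false: neither the superclass tests nor a test of annotate alone would have exposed the fault.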

Myth: Reuse means never having to say you're sorry. Reusing a trusted class (either by defining instance variables of its type or sending a message) means that the behavior of the server object is trustworthy, and obviates the need for retesting the server.

Reality: Every new usage provides ways to misuse a server. Even if many server objects of a given class function correctly, nothing prevents a new client class from using one incorrectly. Thus, every client class's use of a server needs to be exercised. An interesting corollary is that we can't automatically trust a server because it performs correctly for one client.
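
As a hedged illustration, consider a hypothetical Stack server and two hypothetical clients. The server is the same in both cases; one client respects its protocol and the other assumes something the server never promised.

```python
class Stack:
    """Hypothetical server class, assumed to have passed its own tests."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class CarefulClient:
    def drain(self, stack):
        """Checks the server's state before each pop."""
        items = []
        while not stack.is_empty():
            items.append(stack.pop())
        return items


class CarelessClient:
    def take_two(self, stack):
        """Seeded fault: assumes the stack always holds at least two items."""
        return stack.pop(), stack.pop()


def client_use_is_safe(action, stack):
    """Exercise one client's use of the server and report whether it survived."""
    try:
        action(stack)
        return True
    except IndexError:
        return False


def one_item_stack():
    stack = Stack()
    stack.push("a")
    return stack


careful_ok = client_use_is_safe(CarefulClient().drain, one_item_stack())
careless_ok = client_use_is_safe(CarelessClient().take_two, one_item_stack())
```

The trusted server behaves identically in both runs, yet careful_ok is true and careless_ok is false. The fault lives in the client's use of the server, which is why each client's usage has to be exercised.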

Myth: Black box testing is sufficient. If you do a careful job of test case design using the class interface or specification, you can be assured that the class has been fully exercised. White-box testing (looking at a method's implementation to design tests) violates the very concept of encapsulation.

Reality: OO structure matters, part II. Many studies have shown that black-box test suites thought to be excruciatingly thorough by developers exercise only one-third to one-half of the statements (let alone paths or states) in the implementation under test. There are three reasons for this. First, the inputs or states selected typically exercise normal paths, but don't force all possible paths/states. Second, black-box testing alone cannot reveal surprises. Suppose we've tested all of the specified behaviors of the system under test. To be confident there are no unspecified behaviors we need to know if any parts of the system have not been exercised by the black-box test suite. The only way this information can be obtained is by code instrumentation. Third, it is often difficult to exercise exception and error-handling without examination of the source code.
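
The role of instrumentation can be sketched with hand-placed probes. Everything here is an assumption made for illustration: the parse_age function, its supposed specification (valid ages return an integer, non-numeric text returns None), and the probe labels. In practice a coverage tool would supply the instrumentation rather than manual probe calls.

```python
executed = set()


def probe(label):
    """Instrumentation hook: records that a labelled region was reached."""
    executed.add(label)


def parse_age(text):
    """Hypothetical implementation under test."""
    probe("entry")
    try:
        value = int(text)
    except ValueError:
        probe("not_a_number")
        return None
    if value < 0:
        probe("negative")
        return None
    if value > 150:
        # Unspecified behavior: a "surprise" the assumed spec never mentions.
        probe("implausible")
        return 150
    probe("normal")
    return value


ALL_PROBES = {"entry", "not_a_number", "negative", "implausible", "normal"}

# Black-box test cases derived only from the assumed specification.
black_box_results = [
    parse_age("42") == 42,
    parse_age("0") == 0,
    parse_age("abc") is None,
]

# Regions the black-box suite never reached.
unexercised = ALL_PROBES - executed
```

All three black-box tests pass, yet the probes show that the "negative" and "implausible" regions were never reached. The second of these holds behavior the specification never mentioned, which no amount of specification-based test design would have pointed to.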

Conclusion

Testing is another powerful tool for object-oriented development. Like other powerful tools, it produces the best results when applied at the right time to the right problem.


Copyright 1995, RBSC Corporation. All rights reserved.
Last Revised: 1 December 1995.