Read the book: «Software Testing Foundations», page 3

2.1.2 Testing Terminology

Testing is not debugging

In order to remedy a software fault it has to be located. To start with, we only know the effect of the fault, but not its location within the code. The process of finding and correcting faults is called debugging and is the responsibility of the developer. Debugging is often confused with testing, although these are two distinct and very different tasks. While debugging pinpoints software faults, testing is used to reveal the effect a fault causes (see figure 2-1).

Confirmation testing

Correcting a fault improves the quality of the product (assuming the correction doesn’t cause additional, new faults). Tests used to check that a fault has been successfully remedied are called confirmation tests. Testers are often responsible for confirmation testing, whereas developers are more likely to be responsible for component testing (and debugging). However, these roles can change in an agile development environment or for other software lifecycle models.

Unfortunately, in real-world situations fault correction often leads to the creation of new faults that are only revealed when completely new input scenarios are used. Such unpredictable side effects make testing trickier. Once a fault has been corrected you need to repeat your previous tests to make sure the targeted failure has been remedied, and you also need to write new tests that check for unwanted side effects of the correction process.
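The difference between a confirmation test, a repeated earlier test, and a new test for side effects can be sketched in a few lines. The `average` function and the fault it once contained are hypothetical and serve only as an illustration; they are not taken from the VSR-II case study.

```python
def average(values):
    """Corrected version (hypothetical example).

    Assumption: the original version raised ZeroDivisionError for an empty list.
    """
    if not values:  # the correction: guard against empty input
        return 0.0
    return sum(values) / len(values)


def confirmation_test():
    # Re-run the exact input that exposed the failure before the correction.
    assert average([]) == 0.0


def repeated_earlier_tests():
    # Repeat tests that passed before the change to detect unwanted side effects.
    assert average([2, 4]) == 3.0
    assert average([5]) == 5.0


def new_side_effect_test():
    # A new input scenario that was not covered by the earlier tests.
    assert average([-1, 1]) == 0.0


for test in (confirmation_test, repeated_earlier_tests, new_side_effect_test):
    test()
    print(f"{test.__name__}: passed")
```

Only the first function is a confirmation test in the strict sense. The other two address the side effects described above.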

Objectives of testing

Static and dynamic tests are designed to achieve various objectives:

A qualitative evaluation of work products related to the requirements, the specifications, user stories, program design, and code

Prove that all specified requirements have been completely implemented and that the test object functions as expected for the users and other stakeholders

Provide information that enables stakeholders to make a solid estimate of the test object’s quality and thus generate confidence in the quality provided³

The level of quality-related risk can be reduced through identification and correction of software failures. The system will then contain fewer undiscovered faults.

Analysis of the program and its documentation in order to avoid unwanted faults, and to document and remedy known ones

Analyze and execute the program in order to reproduce known failures

Receive information about the test object in order to decide whether the component in question can be committed for integration with other components

Demonstrate that the test object adheres and/or conforms to the necessary contractual, legal and regulatory requirements and standards

Objectives depend on context

Test objectives can vary depending on the context. Furthermore, they can vary according to the development model you use (agile or otherwise) and the level of test you are performing—i.e., component, integration, system, or acceptance tests (see section 3.4).

When you are testing a component, your main objective should be to reveal as many failures as possible and to identify (i.e., debug) and remedy the underlying faults as soon as possible. Another primary objective can be to select tests that achieve the maximum possible level of code coverage (see section 2.3.1).

One objective of acceptance testing is to confirm that the system works and can be used as planned, and thus fulfills all of its functional and non-functional requirements. Another is to provide information that enables stakeholders to evaluate risk and make an informed decision about whether (or not) to go live.

Side Note: Scheme for naming different types of testing

The various names used for different types of tests can be confusing. To understand the naming of tests it is useful to differentiate between the following naming categories:

1 Test objective: A test is named after its objective (for example, a “load test”).
2 Test method/technique: A test is named according to the method or technique used to specify and/or perform the test (for example, “state transition testing”, as described in section 5.1.3).
3 Test object: A test is named according to the type of object to be tested (for example, “GUI test” or “database test”).
4 Test level: A test is named according to the corresponding level of the development model being used (for example, a “system test”).
5 Test person: A test is named after the person or group who performs the test (for example, “developer test” or “user test”).
6 Test scope: A test is named according to its scope (for example, a “partial regression test”).

As you can see, not all of these terms define a distinct type of test. Instead, the different names highlight different aspects of a test that are important or in focus in a particular context or with regard to a particular testing objective.

2.1.3 Test Artifacts and the Relationships Between Them

The previous sections have already described some types of test artifacts. The following sections provide an overview of the types of artifacts necessary to perform dynamic testing.

Test basis

The test basis is the cornerstone of the testing process. As previously noted, the test basis comprises all documents that help us to decide whether a failure has occurred during testing. In other words, the test basis defines the expected behavior of the test object. Common sense and specialist knowledge can also be seen as part of the test basis and can be used to reach a decision. In most cases a requirements document, a specification, or a user story is available, which serves as a test basis.

Test cases and test runs

The test basis is used to define test cases, and a test run takes place when the test object is fed with appropriate test data and executed on a computer. The results of the test run are checked and the team decides whether a failure has occurred—i.e., whether there is a discrepancy between the test object’s expected and actual behaviors. Usually, certain preconditions have to be met in order to run a test case—for example, the corresponding database has to be available and filled with suitable data.

Test conditions

An individual test cannot be used to test the entire test basis, so it has to focus on a specific aspect. Test conditions are therefore extrapolated from the test basis in order to pursue specific test objectives (see above). A test condition can be checked using one or more tests and can be a function, a transaction, a quality attribute, or a structural element of a component or system. Examples of test conditions in our case study VSR-II system are vehicle configuration permutations (see section 5.1.5), the look and feel of the user interface, or the system’s response time.

Test item

By the same token, a test object can rarely be tested as a complete object in its own right. Usually, we need to identify separate items that are then tested using individual test cases. For example, the test item for VSR-II’s price calculation test condition is the calculate_price() method (see section 5.1.1). The corresponding test cases are specified using appropriate testing techniques (see Chapter 5).
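As a rough illustration of how a test item, test data, and an expected result fit together, the following sketch defines two test cases and compares expected and actual behavior. The `calculate_price` function shown here is a simplified, hypothetical stand-in with assumed parameters. The actual method is specified in section 5.1.1 and is not reproduced here.

```python
from dataclasses import dataclass


def calculate_price(base_price, extras_price, discount_percent):
    """Hypothetical stand-in for the test item; signature and formula are assumptions."""
    return (base_price + extras_price) * (1 - discount_percent / 100)


@dataclass
class PriceTestCase:
    name: str
    inputs: tuple      # test data fed to the test item
    expected: float    # expected result, derived from the test basis


def run(test_case):
    """Execute one test case and report whether the result matches the expectation."""
    actual = calculate_price(*test_case.inputs)
    passed = abs(actual - test_case.expected) < 0.005  # tolerance of half a cent
    return passed, actual


cases = [
    PriceTestCase("no extras, no discount", (20000.0, 0.0, 0.0), 20000.0),
    PriceTestCase("extras with 10% discount", (20000.0, 2000.0, 10.0), 19800.0),
]

for case in cases:
    passed, actual = run(case)
    verdict = "passed" if passed else "FAILED"
    print(f"{case.name}: expected={case.expected}, actual={actual:.2f}, {verdict}")
```

A discrepancy between `expected` and `actual` indicates a failure. Locating the underlying fault is then a debugging task.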

Test suites and test execution schedules

It makes little sense to perform test cases individually. Test cases are usually combined in test suites that are executed in a test cycle. The timing of test cycles is defined in the test execution schedule.

Test scripts

Test suites are automated using scripts that contain the test sequence and all of the actions required to create the necessary preconditions for testing, and to clean up once testing is completed. If you execute tests manually, the same information has to be made available for the manual tester.
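A minimal sketch of such a script, using Python's standard `unittest` and `sqlite3` modules, might look like this. The table name and the sample data are assumptions chosen for illustration and are not part of VSR-II.

```python
import sqlite3
import unittest


class ExtrasPriceListTest(unittest.TestCase):
    def setUp(self):
        # Precondition: a database is available and filled with suitable data.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE extras (name TEXT, price REAL)")
        self.db.executemany(
            "INSERT INTO extras VALUES (?, ?)",
            [("alloy rims", 900.0), ("sound system", 450.0)],
        )

    def tearDown(self):
        # Clean-up after each test case.
        self.db.close()

    def test_total_price_of_extras(self):
        (total,) = self.db.execute("SELECT SUM(price) FROM extras").fetchone()
        self.assertEqual(total, 1350.0)

    def test_number_of_extras(self):
        (count,) = self.db.execute("SELECT COUNT(*) FROM extras").fetchone()
        self.assertEqual(count, 2)


def build_suite():
    # Combine the individual test cases into one test suite.
    return unittest.defaultTestLoader.loadTestsFromTestCase(ExtrasPriceListTest)


if __name__ == "__main__":
    unittest.TextTestRunner(verbosity=2).run(build_suite())
```

Here `setUp` creates the preconditions, `tearDown` cleans up, and the runner executes the suite and reports the results.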

Test logs

Test runs are logged and recorded in a test summary report.

Test plan

For every test object, you need to create a test plan that defines everything you need to conduct your tests (see section 6.2.1). This includes your choice of test objects and testing techniques, the definition of the test objectives and reporting scheme, and the coordination of all test-related activities.

Figure 2-2 shows the relationships between the various artifacts involved. Defining the individual activities involved in the testing process (see section 2.3) helps to clarify when each artifact is created.

Fig. 2-2 The relationships between test artifacts

2.1.4 Testing Effort

Testing effort depends on the project (environment)

Testing takes up a large portion of the development effort, even if only a part of all conceivable tests (or, more precisely, all conceivable test cases) can be considered. It is difficult to say just how much effort you should spend testing, as this depends very much on the nature of the project at hand.⁴

The importance of testing, and thus the amount of effort it requires, is often reflected in the ratio of testers to developers. In practice, ratios range from one tester for every ten developers to three testers per developer. Testing effort and budget vary massively in real-world situations.

Case Study: Testing effort and vehicle variants

VSR-II enables potential customers to configure their own vehicle on a computer screen. The extras available for specific models and the possible combinations of options and preconfigured models are subject to a complex set of rules. The old VSR System allowed customers to select combinations that were not actually deliverable. As a consequence of the VSR-II QA/Test planning requirement “Functional suitability/DreamCar = high” (see below), customers should no longer be able to select non-deliverable combinations.

The product owner responsible for the DreamCar module wants to know how much testing effort will be required to test this aspect of the module as comprehensively as possible. To do this, he makes an estimate of the maximum number of vehicle configuration options available. The results are as follows:

There are 10 vehicle models, each with 5 different engine options; 10 types of wheel rims with summer or winter tires; 10 colors, each with matt, glossy, or pearl effect options; and 5 different entertainment systems. These options result in 10×5×10×2×10×3×5 = 150,000 different variants, so testing one variant every second would take a total of about 1.7 days.

A further 50 extras (each of which is selectable or not) produce a total of 150,000×2⁵⁰ = 168,884,986,026,393,600,000 variations.
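The product owner's estimate can be reproduced with a few lines of Python. The dictionary keys are merely descriptive labels, and the conversion into years (assuming 365-day years and one tested variation per second) is an added illustration rather than a figure from the case study.

```python
from math import prod

option_counts = {
    "models": 10,
    "engines per model": 5,
    "wheel rims": 10,
    "tire types": 2,
    "colors": 10,
    "paint effects": 3,
    "entertainment systems": 5,
}

SECONDS_PER_DAY = 24 * 60 * 60

variants = prod(option_counts.values())
days = variants / SECONDS_PER_DAY                # one variant tested per second

with_extras = variants * 2**50                   # 50 extras, each selectable or not
years = with_extras / (SECONDS_PER_DAY * 365)    # assumption: 365-day years

print(f"{variants:,} variants, about {days:.1f} days")
print(f"{with_extras:,} variations, about {years:.1e} years")
```

Even at one test per second, the second figure corresponds to trillions of years, which is why testing every combination is not an option.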

The product owner intuitively knows that he doesn’t have to test for every possible combination, but rather for the rules that define which combinations of options are not deliverable. Nevertheless, possible software faults create the risk that the DreamCar module wrongly classifies some configurations as deliverable (or permissible combinations as non-deliverable).

How much testing effort is required here and how much can it effectively cost? The product owner decides to ask the QA/testing lead for advice. One possible solution to the issue is to use pairwise testing (see the side note in section 5.1.5).

Side Note: When is increased testing effort justifiable?

Is a high testing effort affordable and justifiable? Jerry Weinberg’s response to this question is: “Compared with what?” [DeMarco 93]. This response points out the risks of using a faulty software system. Risk is calculated from the likelihood of a certain situation arising and the expected costs when it does. Potential faults that are not discovered during testing can later generate significant costs.

Example: The cost of failure

In March 2016, a concatenation of software faults destroyed the space telescope Hitomi, which was built at a cost of several hundred million dollars. The satellite’s software wrongly assumed that it was rotating too slowly and tried to compensate using countermeasures. The signals from the redundant control systems were then wrongly interpreted and the speed of rotation increased continuously until the centrifugal force became too much and Hitomi disintegrated (from [URL: Error Costs]).

In 2018 and 2019 two Boeing 737 MAX 8 airplanes crashed due to design flaws in the airplane’s MCAS flight control software [URL: MAX-groundings]. Here too, the software—misdirected by incorrect sensor information—generated fatal countermeasures.

Testing effort has to remain in reasonable proportion to the results testing can achieve. “Testing makes economic sense as long as the cost of finding and remedying faults is lower than the costs produced by the corresponding failure occurring when the system is live.”⁵ [Pol 00]. Reasonable testing effort therefore always depends on the degree of risk involved in failure and an evaluation of the danger this incurs. The price of the destroyed space telescope Hitomi could have paid for an awful lot of testing.

Case Study: Risks and losses when failures occur

The DreamCar module constantly updates and displays the price of the current configuration. Registered customers with validated ID can order a vehicle online.

Once a customer clicks the Order button and enters their PIN, the vehicle is ordered and the purchase committed. Once the statutory revocation deadline has passed, the chosen configuration is automatically passed on to the production management system that initiates the build process.

Because the online purchase process is binding, if the system calculates and displays an incorrect price the customer has the right to insist on paying that price. This means that wrongly calculated prices could lead to the manufacturer selling thousands of cars at prices that are too low. Depending on the degree of miscalculation, this could lead to millions of dollars in losses. Having each purchase order checked manually is not an option, as the whole point of the VSR-II system is that vehicles can be ordered completely automatically online.

Defining test thoroughness and scope depending on risk factors

Systems or system parts with a high risk have to be tested more extensively than those that do not cause major damage in case of failure.⁶ Risk assessment has to be carried out for the individual system parts or even for individual failure modes. If there is a high risk of a system or subsystem malfunctioning, testing it requires more effort than testing less critical (sub)systems. These procedures are defined through international standards for the production of safety-critical systems. For example, the [RTC-DO 178B] Airborne Systems and Equipment Certification standard prescribes complex testing procedures for aviation systems.

Although there are no material risks involved, a computer game that saves scores incorrectly can be costly for its manufacturer, as such faults affect the public acceptance of a game and its parent company’s other products, and can lead to lost sales and damage to the company’s reputation.
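One simple way to turn such a risk assessment into a plan is to distribute the available testing effort in proportion to each part's risk. The system parts, probabilities, damage estimates, and budget below are all hypothetical, and proportional allocation is only one of several possible schemes.

```python
# name: (estimated probability of failure, estimated damage in dollars)
parts = {
    "price calculation": (0.05, 2_000_000),
    "order processing": (0.02, 1_000_000),
    "color preview": (0.10, 10_000),
}
budget_hours = 400  # assumed total testing budget

exposure = {name: p * damage for name, (p, damage) in parts.items()}
total_exposure = sum(exposure.values())

# Allocate testing hours in proportion to each part's share of the total risk.
plan = {
    name: round(budget_hours * value / total_exposure)
    for name, value in exposure.items()
}

for name in sorted(plan, key=exposure.get, reverse=True):
    print(f"{name}: risk {exposure[name]:,.0f} dollars, {plan[name]} hours")
```

Note that the part most likely to fail (the color preview) receives the least effort here, because the damage it can cause is small.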

2.1.5 Applying Testing Skills Early Ensures Success

Testing is an important factor in any success story

In software projects, it is never too early to begin preparing your tests. The following examples illustrate the benefits of involving testers with appropriate test knowledge in individual activities within the software development life cycle:

Close cooperation between developers and testers throughout the development process

If testers are involved in checking the requirements (for example, using reviews, detailed in section 4.2) or refining user stories, they can use their specialist knowledge to find and remedy ambiguities and faults very early in the work product. The identification and correction of flawed requirements reduces the risk of producing inappropriate or non-testable functionality.

Close cooperation between testers and systems designers at the design stage helps all those involved to better understand the system’s design and the corresponding tests. This increased awareness can reduce the risk of producing fundamental construction faults and makes it easier to design appropriate tests—for example, to test interfaces during integration testing (see section 3.4.2).

Developers and testers who work together at the coding stage have a better understanding of the code itself and the tests it requires. This reduces the risk of producing faulty code and of designing inappropriate tests (see false negatives in section 6.4.1).

If testers verify and validate software before release, they are sure to identify and remedy additional faults that would otherwise remain undiscovered. This increases the probability that the product fulfills its requirements and satisfies all of its stakeholders.

In addition to these examples, achieving the previously defined test objectives will also aid successful software development and maintenance.

2.1.6 The Basic Principles of Testing

The previous sections addressed software testing, whereas the following section summarizes the basics of testing in general. These are guidelines that have developed over decades of testing experience.

1 Principle #1: Testing shows the presence of defects, not their absence. Testing establishes the presence of defects and reveals the faults that cause them. Depending on the effort made and the thoroughness of the tests involved, testing reduces the probability of leaving undiscovered faults in the test object. However, testing does not enable us to prove that a test object contains no faults. Even if an object doesn’t fail during testing, this is no proof of freedom from faults or overall correctness.
2 Principle #2: Exhaustive testing is impossible. With the exception of extremely simple or trivial test objects, it is impossible to design and perform a complete test suite that covers all possible combinations of input data and their preconditions. Tests are always samples, and the effort allocated to them depends on the risks they cover and the priority assigned to them.
3 Principle #3: Early testing saves time and money. Dynamic and static testing activities should be defined and begun as early as possible in the system’s lifecycle. The term “shift left” implies early testing. Early testing reveals faults at an early stage of the development process. In a software context, this helps to avoid (or at least reduce) the increasingly costly repair of faults later in the development lifecycle.
4 Principle #4: Defects cluster together. Generally speaking, defects are not evenly distributed throughout a system. Most defects can usually be found in a small number of modules, and this (estimated or observed) clustering effect can be used to help analyze risk. Testing effort can then be concentrated on the most relevant parts of the system (see also principle #2 above).
5 Principle #5: Beware the pesticide paradox. Over time, tests become less effective the same way insects develop resistance to pesticides. If you repeat tests on an unchanged system, they won’t reveal any new failures. In order for your tests to remain effective you need to check your test cases regularly and, if necessary, modify them or add new ones. This ensures that you test previously unchecked components and previously untested input data, thus revealing any failures that these produce. The pesticide paradox can have a positive effect too. For example, if an automated regression test reveals a low number of failures, this may not be the result of high software quality but rather due to the ineffectiveness of the (possibly outdated) test cases in use.
6 Principle #6: Testing is context-dependent. Tests need to be adapted to the proposed purpose and the surrounding environment of the system in question. No two systems can be effectively tested the same way. Testing thoroughness, exit criteria, and other parameters need to be defined uniquely according to the system’s working environment. An embedded system requires different tests than, for example, an e-commerce system. Testing in an agile project will be very different from that in a project based on a sequential life-cycle model.
7 Principle #7: Absence-of-errors is a fallacy. Even if you test all requirements comprehensively and correct all the faults you find, it is still possible to develop a system that is difficult to use, that doesn’t fulfill the user’s expectations, or that is simply of poor quality compared with other, similar systems (or earlier versions of the same system). Prototyping and early involvement of a system’s users are preventive measures used to avoid this problem.
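Principle #1 can be illustrated with a deliberately faulty function that nevertheless passes a plausible-looking set of tests. The function and the chosen test data are hypothetical examples, not part of the case study.

```python
def is_leap_year(year):
    """Deliberately faulty: ignores the century rules of the Gregorian calendar."""
    return year % 4 == 0


# A small sample of test cases, all of which pass.
passing_tests = {2020: True, 2023: False, 2024: True, 2000: True}
for year, expected in passing_tests.items():
    assert is_leap_year(year) == expected
print("All tests passed")

# The fault is still there: 1900 was not a leap year,
# yet the function classifies it as one.
print("is_leap_year(1900) =", is_leap_year(1900))
```

The passing test run says nothing about inputs outside the sample. The fault only causes a failure once a year such as 1900 or 2100 is used as test data.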

Side Note