2 Software Testing Basics
This introductory chapter explains the basic concepts of software testing that are applied in the chapters that follow. Important concepts included in the syllabus are illustrated throughout the book using our practical VSR-II case study. The seven fundamental principles of software testing are introduced, and the bulk of the chapter is dedicated to explaining the details of the testing process and the various activities it involves. To conclude, we will discuss the psychological issues involved in testing, and how to avoid or work around them.
2.1 Concepts and Motivations
Quality requirements
Industrially manufactured products are usually spot-checked to make sure they fulfill the planned requirements and perform the required task. Different products have varying quality requirements and, if the final product is flawed or faulty, the production process or the design has to be modified to remedy this.
Software is intangible
What is generally true for industrial production processes is also true for the development of software. However, checking parts of the product or the finished product can be tricky because the product itself isn’t actually tangible, making “hands-on” testing impossible. Visual checks are limited and can only be performed by careful scrutiny of the development documentation.
Faulty software is a serious problem
Software that is unreliable or that simply doesn’t perform the required task can be highly problematic. Bad software costs time and money and can ruin a company’s reputation. It can even endanger human life—for example, when the “autopilot” software in a partially autonomous vehicle reacts erroneously or too late.
Testing helps to assess software quality
It is therefore extremely important to check the quality of a software product to minimize the risk of failures or crashes. Testing monitors software quality and reduces risk by revealing faults at the development stage. Software testing is therefore an essential but also highly complex task.
Case Study: The risks of using faulty software
Every release of the VSR-II system has to be suitably tested before it is delivered and rolled out. This aims to identify and remedy faults before they can do any damage. For example, if the system executes an order in a faulty way, this can cause serious financial problems for the customer, the dealer and the manufacturer, as well as damaging the manufacturer’s image. Undiscovered faults like this increase the risk involved in running the software.
Testing involves taking a spot-check approach
Testing is often understood as spot-check execution of the software in question (the test object) on a computer. The test object is fed with test data covering various test cases and is then executed. The evaluation that follows checks whether the test object fulfills its planned requirements.
Testing involves more than just executing tests on a computer
However, testing involves much more than just performing a series of test cases. The test process involves a range of separate activities, and performing tests and checking the results are just two of these. Other testing activities include test planning, test analysis, and the design and implementation of test cases. Additional activities include writing reports on test progress and results, and risk analysis. Test activities are organized differently depending on the stage in a product’s lifecycle. Test activities and documentation are often contractually regulated between the customer and the supplier, or are based on the company’s own internal guidelines. Detailed descriptions of the individual activities involved in software testing are included in sections 2.3 and 6.3.
Static and dynamic testing
Alongside the dynamic tests that are performed on a computer (see Chapter 5), documents such as requirement specifications, user stories, and source code also need to be tested as early as possible in the development process. These are known as static tests (see Chapter 4). The sooner faults in the documentation are discovered and remedied, the better it is for the future development process, as you will no longer be working with flawed source material.
Verification and validation
Testing isn’t just about checking that a system fulfills its requirements, user stories, or other specifications; it is also about ensuring that the product fulfills the wishes and expectations of its users in a real-world environment. In other words, it checks whether the system can be used as intended and fulfills its planned purpose. Testing therefore also involves validation (see Principle #7 in section 2.1.6—“Absence-of-errors is a fallacy”).
No large system is fault-free
There is currently no such thing as a fault-free software system, and this situation is unlikely to change for systems above a certain degree of complexity or with a very large number of lines of code. Many faults are caused by a failure to identify or test for exceptions during code development—things like failing to account for leap years, or not considering constraints on timing or resource allocation. It is therefore common—and sometimes unavoidable—for software systems to go live even though faults still occur for certain combinations of input data. Conversely, other systems work perfectly day in, day out in all manner of industries.
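As a minimal illustration of such an exception fault, consider the classic leap-year bug (a Python sketch of our own, not taken from the book):

```python
def is_leap_year(year: int) -> bool:
    # Faulty version: ignores the century rules of the Gregorian
    # calendar, so 1900 is wrongly classified as a leap year.
    return year % 4 == 0

def is_leap_year_correct(year: int) -> bool:
    # Correct rule: every 4th year, except full centuries,
    # except every 400th year.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# The fault stays hidden until a test case hits an exception year:
for year in (2024, 2000, 1900, 2100):
    print(year, is_leap_year(year), is_leap_year_correct(year))
```

Tests that only use ordinary years would never reveal this fault; the faulty and correct versions only diverge for years such as 1900 and 2100.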
Freedom from faults cannot be achieved through testing
With the exception of very small programs, even if every test you perform reveals no defects, you cannot be sure that additional tests won’t reveal previously undiscovered faults. It is impossible to prove complete freedom from faults by testing.
2.1.1 Defect and Fault Terminology
The test basis as a starting point for testing
A situation can only be classed as faulty if you define in advance what exactly is supposed to happen in that situation. In order to make such a definition, you need to know the requirements made of the (sub)system you are testing as well as other additional information. In this context, we talk about the test basis against which tests are performed and that serves as the basis for deciding whether a specific function is faulty.
What counts as a defect?
A defect is therefore defined as a failure to fulfill a predefined requirement, or a discrepancy between the actual behavior (observed at run time or during testing) and the expected behavior (as defined in the specifications, the requirements, or the user stories). In other words, a defect exists whenever the system’s behavior fails to conform to its actual requirements.
Unlike physical systems, software systems don’t fail due to age or wear. Every defect that occurs is present from the moment the software is coded, but only becomes apparent when the system is running.
Faults cause failures
System failures result from faults and only become apparent to the tester or the user during testing or at run time, for example when the system produces erroneous output or crashes.

We need to distinguish between the effects of a fault and its causes. A failure observed in a running system is caused by a fault (also called a defect) in the software. The word “bug” is also used to describe defects that result from coding errors, such as an incorrectly programmed or forgotten instruction in the code.
Defect masking
It is possible that a fault can be offset by one or more other faults in other parts of the program. Under these circumstances, the fault in question only becomes apparent when the others have been remedied. In other words, correcting a fault in one place can lead to unexpected side effects in others.
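A small Python sketch (invented names, not from the book) shows how two faults can cancel each other out:

```python
def net_price(gross: float, discount_rate: float) -> float:
    # Fault 1: the discount is added instead of subtracted.
    return gross * (1 + discount_rate)

def invoice_total(gross: float, discount_rate: float, rebate: float) -> float:
    # Fault 2: the caller negates the rate, accidentally compensating
    # for fault 1, so the combined result is correct.
    return net_price(gross, -discount_rate) - rebate

print(invoice_total(20_000, 0.10, 500))  # 17500.0, which happens to be right
```

Correcting either fault on its own would make the result wrong and thereby reveal the other fault.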
Not every fault causes a failure, and a given fault may manifest never, only once, or constantly for all users. Some failures also occur a long way from the place in the code where they are caused.

A fault is always the result of an error or a mistake made by a person—for example, due to a programming error at the development stage.
People make errors
Errors occur for various reasons. Some typical (root) causes are:
■ All humans make errors!
■ Time pressure is often present in software projects and is a regular source of errors.
■ The complexity of the task at hand, the system architecture, the system design, or the source code.
■ Misunderstandings between participants in the project—often in the form of differing interpretations of the requirements or other documents.
■ Misunderstandings relating to system interaction via internal and external interfaces. Large systems often have a huge number of interfaces.
■ The complexity of the technology in use, or of new technologies previously unknown to project participants that are introduced during the project.
■ Project participants are not sufficiently experienced or do not have appropriate training.
A human error causes a fault in part of the code, which in turn can cause some kind of visible system failure that, ideally, is revealed during testing (see figure 2-1; on debugging, see below). Static tests (see Chapter 4) can detect faults in the source code directly.
System failures can also be caused by environmental issues such as radiation and magnetism, or by physical pollution that causes hardware and firmware failures. We will not be addressing these types of failures here.

Fig. 2-1 The relationships between errors, faults, and failures
False positive and false negative results
Not every unexpected test result equates to a failure. A test will often indicate a failure even though the underlying fault (or its cause) isn’t part of the test object. Such a result is known as a “false positive”. The opposite effect can also occur: a test passes even though it should have revealed a failure. This type of result is known as a “false negative”. You have to bear both of these situations in mind when evaluating your test results. A result can also be a “correct positive” (a failure revealed by testing) or a “correct negative” (expected behavior confirmed by testing). For more detail on these situations, see section 6.4.1.
Learning from your mistakes
If faults—and the errors or mistakes that cause them—are revealed by testing, it is worth taking a closer look at their causes in order to learn how to avoid making the same (or similar) errors or mistakes in the future. The knowledge you gain this way can help you optimize your processes and reduce or prevent the occurrence of additional faults.
Case Study: Vague requirements as a cause of software faults
Customers can use the VSR EasyFinance module to calculate various vehicle-financing options. The interest rate the system uses is stored in a table, although the purchase of vehicles involved in promotions and special offers can be subject to differing interest rates.
VSR-II is to include the following additional requirement:
REQ: If the customer agrees to and passes an online credit check, the EasyFinance module applies an interest rate from a special bonus interest rate table.
The author of this requirement unfortunately forgot to clarify that a reduction in the interest rate is not permissible for vehicles sold as part of a special offer. This resulted in this special case not being tested in the first release. In turn, this meant that customers were offered low interest rates online but were charged higher rates when billed, resulting in complaints.
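The resulting fault might look like this hypothetical sketch (all names invented; the book does not show the actual VSR-II code):

```python
def applicable_interest_rate(vehicle, customer, standard_rates, bonus_rates):
    # Missing guard: vehicles sold as part of a special offer must never
    # receive the bonus rate, but because the requirement never mentioned
    # that case, neither the code nor the tests covered it:
    #
    # if vehicle.is_special_offer:
    #     return standard_rates[vehicle.model]
    if customer.passed_online_credit_check:
        return bonus_rates[vehicle.model]   # wrongly applied to special offers
    return standard_rates[vehicle.model]
```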
2.1.2 Testing Terminology
Testing is not debugging
In order to remedy a software fault it has to be located. To start with, we only know the effect of the fault, but not its location within the code. The process of finding and correcting faults is called debugging and is the responsibility of the developer. Debugging is often confused with testing, although these are two distinct and very different tasks. While debugging pinpoints software faults, testing is used to reveal the effect a fault causes (see figure 2-1).
Confirmation testing
Correcting a fault improves the quality of the product (assuming the correction doesn’t cause additional, new faults). Tests used to check that a fault has been successfully remedied are called confirmation tests. Testers are often responsible for confirmation testing, whereas developers are more likely to be responsible for component testing (and debugging). However, these roles can change in an agile development environment or for other software lifecycle models.
Unfortunately, in real-world situations fault correction often leads to the creation of new faults that are only revealed when completely new input scenarios are used. Such unpredictable side effects make testing trickier. Once a fault has been corrected you need to repeat your previous tests to make sure the targeted failure has been remedied, and you also need to write new tests that check for unwanted side effects of the correction process.
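As a minimal sketch of these two kinds of tests (Python/pytest style; calculate_price() here is a stand-in with an invented signature, not the book’s actual implementation):

```python
def calculate_price(base_price: float, discount_rate: float) -> float:
    # Stand-in for the corrected test object (invented for illustration).
    return base_price * (1 - discount_rate)

def test_confirmation_discount_is_subtracted():
    # Confirmation test: re-runs the exact case that failed before the fix.
    assert calculate_price(20_000, 0.10) == 18_000

def test_regression_zero_discount_unchanged():
    # Regression-style test: guards against unwanted side effects of the
    # fix on behavior that previously worked.
    assert calculate_price(20_000, 0.0) == 20_000
```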
Objectives of testing
Static and dynamic tests are designed to achieve various objectives:
■ Evaluate work products such as the requirements, the specifications, user stories, the program design, and the code
■ Verify that all specified requirements have been implemented completely and that the test object functions as expected by the users and other stakeholders
■ Provide information that enables stakeholders to make a solid estimate of the test object’s quality, and thus build confidence in that quality
■ Reduce the level of quality-related risk by finding failures and having the underlying faults corrected, leaving fewer undiscovered faults in the system
■ Analyze the program and its documentation in order to prevent faults, and to document and remedy known ones
■ Analyze and execute the program in order to reproduce known failures
■ Provide information about the test object that supports the decision on whether the component in question can be released for integration with other components
■ Demonstrate that the test object conforms to the necessary contractual, legal, and regulatory requirements and standards
Objectives depend on context
Test objectives can vary depending on the context. Furthermore, they can vary according to the development model you use (agile or otherwise) and the level of test you are performing—i.e., component, integration, system, or acceptance tests (see section 3.4).
When you are testing a component, your main objective should be to reveal as many failures as possible and to identify (i.e., debug) and remedy the underlying faults as soon as possible. Another primary objective can be to select tests that achieve the maximum possible level of code coverage (see section 2.3.1).
One objective of acceptance testing is to confirm that the system works and can be used as planned, and thus fulfills all of its functional and non-functional requirements. Another is to provide information that enables stakeholders to evaluate risk and make an informed decision about whether (or not) to go live.
Side Note: Scheme for naming different types of testing
The various names used for different types of tests can be confusing. To understand the naming of tests it is useful to differentiate between the following naming categories:
1. Test objective
The naming of a test type is based on the test objective (for example, a “load test”).
2. Test method/technique
A test is named according to the method or technique used to specify and/or perform the test (for example, “state transition testing”, as described in section 5.1.3).
3. Test object
A test is named according to the type of object to be tested (for example, a “GUI test” or “database test”).
4. Test level
A test is named according to the corresponding level of the development model being used (for example, a “system test”).
5. Test person
A test is named after the person or group who perform the test (for example, a “developer test” or “user test”).
6. Test scope
A test is named according to its scope (for example, a “partial regression test”).
As you can see, not all of these terms define a distinct type of test. Instead, the different names highlight different aspects of a test that are important or in focus in a particular context or with regard to a particular testing objective.
2.1.3 Test Artifacts and the Relationships Between Them
The previous sections have already described some types of test artifacts. The following sections provide an overview of the types of artifacts necessary to perform dynamic testing.
Test basis
The test basis is the cornerstone of the testing process. As previously noted, the test basis comprises all documents that help us to decide whether a failure has occurred during testing. In other words, the test basis defines the expected behavior of the test object. Common sense and specialist knowledge can also be seen as part of the test basis and can be used to reach a decision. In most cases a requirements document, a specification, or a user story is available, which serves as a test basis.
Test cases and test runs
The test basis is used to define test cases, and a test run takes place when the test object is fed with appropriate test data and executed on a computer. The results of the test run are checked and the team decides whether a failure has occurred—i.e., whether there is a discrepancy between the test object’s expected and actual behaviors. Usually, certain preconditions have to be met in order to run a test case—for example, the corresponding database has to be available and filled with suitable data.
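For illustration, the anatomy of a single test case can be written down as data (a sketch; the field names and values are invented, not part of VSR-II):

```python
test_case = {
    "id": "TC-PRICE-01",
    "precondition": "price database available and filled with list prices",
    "test_data": {"model": "roadster", "extras": ["alloy_rims"]},
    "expected_result": 41_200.00,   # taken from the test basis
    "postcondition": "no order record is created",
}
```

During the test run, the actual result is compared against expected_result to decide whether a failure has occurred.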
Test conditions
An individual test cannot be used to test the entire test basis, so it has to focus on a specific aspect. Test conditions are therefore extrapolated from the test basis in order to pursue specific test objectives (see above). A test condition can be checked using one or more tests and can be a function, a transaction, a quality attribute, or a structural element of a component or system. Examples of test conditions in our case study VSR-II system are vehicle configuration permutations (see section 5.1.5), the look and feel of the user interface, or the system’s response time.
Test item
By the same token, a test object can rarely be tested as a complete object in its own right. Usually, we need to identify separate items that are then tested using individual test cases. For example, the test item for VSR-II’s price calculation test condition is the calculate_price() method (see section 5.1.1). The corresponding test cases are specified using appropriate testing techniques (see Chapter 5).
Test suites and test execution schedules
It makes little sense to perform test cases individually. Test cases are usually combined in test suites that are executed in a test cycle. The timing of test cycles is defined in the test execution schedule.
Test scripts
Test suites are automated using scripts that contain the test sequence and all of the actions required to create the necessary preconditions for testing, and to clean up once testing is completed. If you execute tests manually, the same information has to be made available for the manual tester.
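A minimal sketch of such a script, using pytest and sqlite3 (the schema and test data are invented for illustration):

```python
import sqlite3
import pytest

@pytest.fixture
def order_db(tmp_path):
    # Precondition: create the database and fill it with suitable test data.
    conn = sqlite3.connect(str(tmp_path / "orders.db"))
    conn.execute("CREATE TABLE orders (id INTEGER, model TEXT, price REAL)")
    conn.execute("INSERT INTO orders VALUES (1, 'roadster', 41200.0)")
    conn.commit()
    yield conn          # the test sequence runs here
    conn.close()        # cleanup once testing is completed

def test_order_is_stored(order_db):
    row = order_db.execute("SELECT model FROM orders WHERE id = 1").fetchone()
    assert row == ("roadster",)
```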
Test logs
Each test run is logged, and the results are summarized in a test summary report.
Test plan
For every test object, you need to create a test plan that defines everything you need to conduct your tests (see section 6.2.1). This includes your choice of test objects and testing techniques, the definition of the test objectives and reporting scheme, and the coordination of all test-related activities.
Figure 2-2 shows the relationships between the various artifacts involved. Defining the individual activities involved in the testing process (see section 2.3) helps to clarify when each artifact is created.

Fig. 2-2 The relationships between test artifacts
2.1.4 Testing Effort
Testing effort depends on the project (environment)
Testing takes up a large portion of the overall development effort, even though only a fraction of all conceivable tests—or, more precisely, all conceivable test cases—can ever be considered. It is difficult to say just how much effort you should spend on testing, as this depends very much on the nature of the project at hand.

The importance of testing—and thus the amount of effort required for it—is often reflected in the ratio of testers to developers. Ratios found in practice range from one tester for every ten developers to three testers per developer. Testing effort and budget vary massively in real-world situations.
Case Study: Testing effort and vehicle variants
VSR-II enables potential customers to configure their own vehicle on a computer screen. The extras available for specific models and the possible combinations of options and preconfigured models are subject to a complex set of rules. The old VSR system allowed customers to select combinations that were not actually deliverable. As a consequence of the VSR-II QA/Test planning requirement Functional suitability/DreamCar = high (see below), customers should no longer be able to select non-deliverable combinations.
The product owner responsible for the DreamCar module wants to know how much testing effort will be required to test this aspect of the module as comprehensively as possible. To do this, he makes an estimate of the maximum number of vehicle configuration options available. The results are as follows:
There are 10 vehicle models, each with 5 different engine options; 10 types of wheel rims with summer or winter tires; 10 colors, each with matt, glossy, or pearl effect options; and 5 different entertainment systems. These options result in 10 × 5 × 10 × 2 × 10 × 3 × 5 = 150,000 different variants, so testing one variant every second would take about 1.7 days (150,000 seconds).
A further 50 extras (each of which is selectable or not) produce a total of 150,000 × 2^50 = 168,884,986,026,393,600,000 variations.
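The product owner’s arithmetic is easy to reproduce (a few lines of Python):

```python
from math import prod

# models, engines, rim types, tire seasons, colors, finishes, entertainment
counts = [10, 5, 10, 2, 10, 3, 5]
variants = prod(counts)
print(variants)                    # 150000
print(variants / (60 * 60 * 24))   # ~1.74 days at one variant per second

# 50 optional extras (each on or off) multiply the total by 2**50:
print(variants * 2 ** 50)          # 168884986026393600000
```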
The product owner intuitively knows that he doesn’t have to test for every possible combination, but rather for the rules that define which combinations of options are not deliverable. Nevertheless, possible software faults create the risk that the DreamCar module wrongly classifies some configurations as deliverable (or permissible combinations as non-deliverable).
How much testing effort is required here and how much can it effectively cost? The product owner decides to ask the QA/testing lead for advice. One possible solution to the issue is to use pairwise testing (see the side note in section 5.1.5).
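To show why pairwise testing helps, here is a naive greedy all-pairs generator (our own sketch; production tools use far more efficient algorithms than scoring the full cartesian product):

```python
from itertools import combinations, product

def pairwise_suite(params):
    """Greedy sketch: pick full combinations until every pair of values
    from any two parameters is covered at least once."""
    uncovered = set()
    for (i, a), (j, b) in combinations(list(enumerate(params)), 2):
        uncovered |= {(i, va, j, vb) for va, vb in product(a, b)}
    suite = []
    while uncovered:
        # Score every full combination by how many new pairs it covers.
        best = max(product(*params),
                   key=lambda c: sum(c[i] == va and c[j] == vb
                                     for (i, va, j, vb) in uncovered))
        suite.append(best)
        uncovered = {p for p in uncovered
                     if not (best[p[0]] == p[1] and best[p[2]] == p[3])}
    return suite

models = ["compact", "sedan", "suv"]
engines = ["petrol", "diesel"]
colors = ["red", "blue", "black"]
tests = pairwise_suite([models, engines, colors])
print(len(tests), "tests instead of", 3 * 2 * 3)  # e.g. 9 or 10 instead of 18
```

The saving grows dramatically with the number of parameters, which is what makes the DreamCar configuration testable at all.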
Side Note: When is increased testing effort justifiable?
Is a high testing effort affordable and justifiable? Jerry Weinberg’s response to this question is: “Compared with what?” [DeMarco 93]. This response points out the risks of using a faulty software system. Risk is calculated from the likelihood of a certain situation arising and the expected costs when it does. Potential faults that are not discovered during testing can later generate significant costs.
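In its simplest form this is an expected-value calculation (all numbers invented for illustration):

```python
failure_probability = 0.01   # estimated chance the failure occurs in production
failure_cost = 500_000       # estimated damage in dollars if it does
risk = failure_probability * failure_cost
print(risk)                  # 5000.0 -> testing that costs less than this pays off
```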
Example: The cost of failure
In March 2016, a concatenation of software faults destroyed the space telescope Hitomi, which was built at a cost of several hundred million dollars. The satellite’s software wrongly assumed that it was rotating too slowly and tried to compensate using countermeasures. The signals from the redundant control systems were then wrongly interpreted and the speed of rotation increased continuously until the centrifugal force became too much and Hitomi disintegrated (from [URL: Error Costs]).
In 2018 and 2019 two Boeing 737 MAX 8 airplanes crashed due to design flaws in the airplane’s MCAS flight control software [URL: MAX-groundings]. Here too, the software—misdirected by incorrect sensor information—generated fatal countermeasures.
Testing effort has to remain in reasonable proportion to the results it can achieve. “Testing makes economic sense as long as the cost of finding and remedying faults is lower than the costs produced by the corresponding failure occurring when the system is live.” [Pol 00]. Reasonable testing effort therefore always depends on the degree of risk involved in a failure and an evaluation of the damage it would cause. The price of the destroyed Hitomi space telescope could have paid for an awful lot of testing.
Case Study: Risks and losses when failures occur
The DreamCar module constantly updates and displays the price of the current configuration. Registered customers with validated ID can order a vehicle online.
Once a customer clicks the Order button and enters their PIN, the vehicle is ordered and the purchase committed. Once the statutory revocation deadline has passed, the chosen configuration is automatically passed on to the production management system that initiates the build process.
Because the online purchase process is binding, if the system calculates and displays an incorrect price, the customer has the right to insist on paying that price. This means that wrongly calculated prices could lead to the manufacturer selling thousands of cars at prices that are too low. Depending on the degree of miscalculation, this could add up to millions of dollars in losses. Having each purchase order checked manually is not an option, as the whole point of the VSR-II system is that vehicles can be ordered completely automatically online.
Defining test thoroughness and scope depending on risk factors
Systems or system parts that carry a high risk have to be tested more thoroughly than those whose failure would not cause major damage. Risk assessment therefore has to be carried out for individual system parts, or even for individual failure modes. If there is a high risk of a system or subsystem malfunctioning, testing it requires more effort than for less critical (sub)systems. For safety-critical systems, such procedures are prescribed by international standards. For example, the [RTC-DO 178B] Airborne Systems and Equipment Certification standard prescribes complex testing procedures for aviation systems.
Although there are no material risks involved, a computer game that saves scores incorrectly can be costly for its manufacturer, as such faults affect the public acceptance of a game and its parent company’s other products, and can lead to lost sales and damage to the company’s reputation.