Introduction to Risk Based Testing
Risk is the possibility of a negative or undesirable outcome or event. A specific risk is any problem that may occur that would decrease customer, user, participant, or stakeholder perceptions of product quality or project success. In testing, we’re concerned with two main types of risks.
The first type of risk is product or quality risk. When the primary impact of a potential problem is on product quality, such potential problems are called product risks. A synonym for product risks, which we use most frequently ourselves, is quality risks. An example of a quality risk is a possible reliability defect that could cause a system to crash during normal operation.
The second type of risk is project or planning risk. When the primary impact of a potential problem is on project success, such potential problems are called project risks. Some people also refer to project risks as planning risks. An example of a project risk is a possible staffing shortage that could delay completion of a project.
Of course, you can consider a quality risk as a special type of project risk. While the ISTQB definition of project risk is given here, Jamie likes the informal definition of a project risk as anything that might prevent the project from delivering the right product, on time and on budget. However, the difference is that you can run a test against the system or software to determine whether a quality risk has become an actual outcome. You can test for system crashes, for example. Other project risks are usually not testable. You can’t test for a staffing shortage.
One potentially misleading aspect of the insurance metaphor for risk-based testing is that insurance professionals and actuaries can use statistically valid data for quantitative risk analysis. Typically, risk-based testing relies on qualitative analyses because we don’t have the same kind of data insurance companies have.
During risk-based testing, you have to remain aware of many possible sources of risks. There are safety risks for some systems. There are business and economic risks for most systems. There are privacy and data security risks for many systems. There are technical, organizational, and political risks too.
Risk Management
Risk management includes three primary activities:
■ Risk identification, figuring out what the different project and quality risks are for the project
■ Risk analysis, assessing the level of risk, typically based on likelihood and impact, for each identified risk item
■ Risk mitigation, which is really more properly called “risk control” because it consists of mitigation, contingency, transference, and acceptance actions for various risks
In some sense, these activities are sequential, at least in terms of when they start. They are staged such that risk identification starts first. Risk analysis comes next. Risk control starts once risk analysis has determined the level of risk. However, since risk management should be continuous in a project, the reality is that risk identification, risk analysis, and risk control are all recurring activities.
Everyone has their own perspective on how to manage risks on a project, including what the risks are, the level of risk, and the appropriate controls to put in place for risks. Therefore, risk management should include all project stakeholders.
In many cases, though, not all stakeholders can participate or are willing to do so. In such cases, some stakeholders may act as surrogates for other stakeholders. For example, in mass-market software development, the marketing team might ask a small sample of potential customers to help identify potential defects that would affect their use of the software most heavily. In this case, the sample of potential customers serves as a surrogate for the entire eventual customer base. As another example, business analysts on IT projects can sometimes represent the users rather than involving users in potentially distressing risk analysis sessions that include conversations about what could go wrong and how bad it would be.
Technical test analysts bring particular expertise to risk management due to their defect-focused outlook, especially as it relates to technically based sources of risk and likelihood, so they should participate whenever possible. In fact, in many cases, the test manager will lead the quality risk analysis effort, with technical test analysts providing key support in the process.
Risk Identification
For proper risk-based testing, we need to identify both product and project risks. We can identify both kinds of risks using techniques like these:
■ Expert interviews
■ Independent assessments
■ Use of risk templates
■ Project retrospectives
■ Risk workshops and brainstorming
■ Calling on past experience
Conceivably, you can use a single integrated process to identify both project and product risks. We usually split them into two processes since they have separate deliverables. We include the project risk identification process in the test planning process and thus hand the bulk of the responsibility for these kinds of risks to managers, including test managers. In parallel, the quality risk identification process occurs early in the project.
That said, project risks (and not just for testing but also for the project as a whole) are often identified as by-products of quality risk analysis. In addition, if you use a requirements specification, design specification, use cases, or other documentation as inputs into your quality risk analysis process, you should expect to find defects in those documents as another set of by-products. These are valuable by-products, which you should plan to capture and escalate to the proper person.
Techniques that are more formal often look downstream to identify potential effects of the risk item if it were to become an actual negative outcome. These effects include effects on the system (or the system of systems, if applicable) as well as effects on the potential users, customers, stakeholders, and even society in general. Failure Mode and Effect Analysis is an example of such a formal risk management technique, and it is commonly used on safety-critical and embedded systems.
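To make the downstream view concrete, here is a minimal sketch of how a Failure Mode and Effect Analysis worksheet is commonly scored. It assumes the widely used convention of rating each failure mode for severity, occurrence, and detection on 1 to 10 scales and multiplying the three into a risk priority number. The class name, field names, and sample failure modes are hypothetical illustrations, not taken from any particular project.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    effect_on_system: str  # downstream effect on the system
    effect_on_users: str   # downstream effect on users or other stakeholders
    severity: int          # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int        # assumed scale: 1 (remote) to 10 (almost certain)
    detection: int         # assumed scale: 1 (surely detected) to 10 (undetectable)

    def risk_priority_number(self) -> int:
        # Conventional FMEA scoring: product of the three ratings.
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for an embedded controller.
modes = [
    FailureMode("Log file fills the disk", "Logging stops",
                "Support loses diagnostics", 4, 5, 2),
    FailureMode("Sensor reading times out", "Control loop uses stale data",
                "Operator sees a delayed alarm", 8, 3, 4),
]

# Highest risk priority number first.
ranked = sorted(modes, key=lambda m: m.risk_priority_number(), reverse=True)
for mode in ranked:
    print(mode.risk_priority_number(), mode.description)
```

The ranking is only as good as the ratings, which in practice come from the stakeholders doing the analysis.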
Risk Analysis or Risk Assessment
The next step in the risk management process is referred to in the Advanced syllabus as risk analysis. We prefer to call it risk assessment because analysis would seem to us to include both identification and assessment of risk. For example, the process of identifying risk items often includes analysis of work products such as requirements and metrics such as defects found in past projects.
Regardless of what we call it, risk analysis or risk assessment involves the study of the identified risks. We typically want to categorize each risk item appropriately and assign each risk item an appropriate level of risk.
We can use ISO 9126 or other quality categories to organize the risk items. In our opinion (and in the Pragmatic Risk Analysis and Management process described here), it usually doesn’t matter so much what category a risk item goes into, so long as we don’t forget it. However, in complex projects and for large organizations, the category of risk can determine who has to deal with the risk. A practical implication like this makes the categorization important.
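As an illustration of categorizing risk items and assigning each a level of risk, consider the following minimal sketch. It assumes 1 to 5 ordinal scales for likelihood and impact and combines them by multiplication. Both the scales and the multiplication are assumptions made for this example, as are the class, the function, and the sample risk items; other qualitative schemes work just as well.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    name: str
    category: str    # e.g. an ISO 9126 characteristic such as "Reliability"
    likelihood: int  # assumed ordinal scale: 1 (very unlikely) to 5 (very likely)
    impact: int      # assumed ordinal scale: 1 (minor) to 5 (severe)

    def level(self) -> int:
        # Assumption: combine the two ratings by multiplication.
        return self.likelihood * self.impact


def by_category(items):
    """Group risk items by category, highest risk level first in each group."""
    groups = {}
    for item in sorted(items, key=lambda i: i.level(), reverse=True):
        groups.setdefault(item.category, []).append(item)
    return groups


# Hypothetical quality risk items.
items = [
    RiskItem("System crashes during normal operation", "Reliability", 3, 5),
    RiskItem("Report totals rounded incorrectly", "Functionality", 2, 4),
    RiskItem("Memory leak after long sessions", "Reliability", 4, 3),
]

groups = by_category(items)
for category, members in groups.items():
    print(category, [(m.name, m.level()) for m in members])
```

Grouping by category makes it easy to hand each group to whoever is responsible for that kind of risk, while the level orders the items within the group.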
So what technical factors should we consider when assessing likelihood?
Here’s a list to get you started:
■ Complexity of technology and teams
■ Personnel and training issues
■ Intrateam and interteam conflict
■ Supplier and vendor contractual problems
■ Geographical distribution of the development organization, as with
■ Legacy or established designs and technologies versus new technologies
■ The quality—or lack of quality—in the tools and technology used
■ Bad managerial or technical leadership
■ Time, resource, and management pressure, especially when financial
Risk Mitigation or Risk Control
Having identified and assessed risks, we now must control them. As we mentioned earlier, the Advanced syllabus refers to this as risk mitigation, but that’s not right. Risk control is a better term. We actually have four main options for risk control:
■ Mitigation, where we take preventive measures to reduce the likelihood of the risk occurring and/or the impact of a risk should it occur
■ Contingency, where we have a plan or perhaps multiple plans to reduce the impact of a risk should it occur
■ Transference, where we get another party to accept the consequences of a risk should it occur
■ Finally, ignoring or accepting the risk and its consequences should it occur
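The four options above can be recorded against each risk in a simple risk register. The following is a minimal sketch; the enumeration, the register contents, and the helper function are all hypothetical and only illustrate one way to keep track of which control was chosen for which risk.

```python
from enum import Enum


class RiskControl(Enum):
    MITIGATION = "reduce likelihood and/or impact before the risk occurs"
    CONTINGENCY = "have a plan ready to reduce impact if the risk occurs"
    TRANSFERENCE = "have another party accept the consequences"
    ACCEPTANCE = "ignore or accept the risk and its consequences"


# Hypothetical register: risk description -> chosen control option.
register = {
    "System crashes during normal operation": RiskControl.MITIGATION,
    "Staffing shortage delays test execution": RiskControl.CONTINGENCY,
    "Cosmetic glitch on a rarely used screen": RiskControl.ACCEPTANCE,
}


def risks_by_control(register):
    """Group risks by control option, including options with no risks."""
    grouped = {option: [] for option in RiskControl}
    for risk, option in register.items():
        grouped[option].append(risk)
    return grouped


grouped = risks_by_control(register)
for option, risks in grouped.items():
    print(option.name, risks)
```

Listing the empty options as well makes it visible when, for example, no risk has been transferred to another party.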
Of course, testing is not effective against all kinds of quality risks. In some cases, you can estimate the risk reduction effectiveness of testing in general and for specific test techniques for given risk items. There’s not much point in using testing to reduce risk where there is a low level of test effectiveness. For example, code maintainability issues related to poor commenting or use of unstructured programming techniques will not tend to show up, at least not initially, during testing.
Once we get to test execution, we run tests to mitigate quality risks. Where testing finds defects, testers reduce risk by providing the awareness of defects and opportunities to deal with them before release. Where testing does not find defects, testing reduces risk by ensuring that under certain conditions the system operates correctly. Of course, running a test only demonstrates operation under certain conditions and does not constitute a proof of correctness under all possible conditions.
If, during test execution, we need to reduce the time or effort spent on testing, we can use risk as a guide. If the residual risk is acceptable, we can curtail our tests. Notice that, because tests are run in risk order, those tests not yet run are in general less important than those already run. If we do curtail further testing, that property of risk-based test execution serves to transfer the remaining risk onto the users, customers, help desk and technical support personnel, or operational staff.
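A minimal sketch of this idea follows. It assumes each planned test carries a numeric risk level and an effort estimate in hours, and that we stop at the first test that no longer fits the remaining budget. The names, the scale, and the stopping rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlannedTest:
    name: str
    risk_level: int  # higher means more important; hypothetical scale
    hours: float     # estimated execution effort


def run_in_risk_order(tests, budget_hours):
    """Split tests into those that fit the budget and those left unrun.

    Tests are taken strictly in descending risk order, and selection stops
    at the first test that no longer fits, so every unrun test is at most
    as risky as every test that ran.
    """
    ordered = sorted(tests, key=lambda t: t.risk_level, reverse=True)
    executed, remaining = [], []
    spent = 0.0
    for i, test in enumerate(ordered):
        if spent + test.hours > budget_hours:
            remaining = ordered[i:]
            break
        executed.append(test)
        spent += test.hours
    return executed, remaining


# Hypothetical test plan and a 10-hour budget.
plan = [
    PlannedTest("login", 20, 4),
    PlannedTest("report export", 6, 3),
    PlannedTest("checkout", 25, 5),
    PlannedTest("help page", 2, 1),
]
executed, remaining = run_in_risk_order(plan, budget_hours=10)
residual = max((t.risk_level for t in remaining), default=0)
print([t.name for t in executed], [t.name for t in remaining], residual)
```

Here `residual` is the highest risk level among the unrun tests. Whether that level is acceptable is a judgment for the stakeholders, not something the code can decide.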
Suppose we do have time to continue test execution? In this case, we can adjust our risk analysis, and thus our testing, for further test cycles based on what we’ve learned from our current testing. First, we revise our risk analysis. Then, we reprioritize existing tests and possibly add new tests. What should we look for to decide whether to adjust our risk analysis? We can start with the following:
■ Totally new or very much changed product risks
■ Unstable or defect-prone areas discovered during the testing
■ Risks, especially regression risk, associated with fixed defects
■ Discovery of unexpected bug clusters
■ Discovery of business-critical areas that were missed
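The revise-then-reprioritize loop described above can be sketched as follows. The sketch assumes risk levels are kept as a mapping from risk item to a numeric level and that each test traces to exactly one risk item. The function name and all sample data are hypothetical.

```python
def revise_and_reprioritize(risk_levels, updates, tests):
    """Apply revised or new risk levels, then reorder tests by risk.

    risk_levels: risk item -> level from the previous risk analysis
    updates:     risk item -> revised level (may introduce new risk items)
    tests:       test name -> risk item the test covers
    """
    revised = {**risk_levels, **updates}
    ordered = sorted(tests, key=lambda name: revised[tests[name]], reverse=True)
    return revised, ordered


# Levels from the original analysis (hypothetical).
risk_levels = {"crash on save": 12, "slow search": 6, "wrong tax total": 9}

# What the current test cycle taught us (hypothetical):
updates = {
    "slow search": 15,          # unexpected bug cluster found in search
    "data loss after fix": 10,  # new regression risk from a fixed defect
}

# Existing tests plus one new test for the new risk item.
tests = {
    "test_save": "crash on save",
    "test_search_speed": "slow search",
    "test_tax": "wrong tax total",
    "test_fix_regression": "data loss after fix",
}

revised, ordered = revise_and_reprioritize(risk_levels, updates, tests)
print(ordered)
```

A test that traces to a risk item missing from both mappings raises a `KeyError`, which is a useful signal that the risk analysis and the test set have drifted apart.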