
8.8 Summary






The exponential distribution, another special case of the Weibull distribution family, is the simplest and perhaps most widely used distribution in reliability and survival studies. In software, it is best used for modeling the defect arrival pattern at the back end of the development process, for example, the final test phase. When calendar-time (versus execution-time) data are used, a key assumption for the exponential model is that the testing effort is homogeneous throughout the testing phase. If this assumption is not met, normalization of the data with respect to test effort is needed for the model to work well.


In addition to the exponential model, numerous software reliability growth models have been proposed, each with its own assumptions, applicability, and limitations. However, relatively few have been verified in practical environments with industry data, and even fewer are in use. Based on the criterion variable they use, software reliability growth models can be classified into two major classes: time between failures models and fault count models. In this chapter we summarize several well-known models in each class and illustrate the modeling process with a real-life example. From the practitioner's vantage point, the most important criteria for evaluating and choosing software reliability growth models are predictive validity, simplicity, and quality of assumptions.



Recommendations for Small Projects


The Rayleigh model discussed in Chapter 7 and the models discussed in this chapter are used to project a software's quality performance in the field (e.g., failure rate or number of defects) based on data gathered during the development of the project. Data requirements for these models are no different from those for other metrics discussed in previous chapters. The size of the teams or the development organizations does not affect implementation of these models. They do require modeling tools and personnel with sufficient statistical modeling training. For organizations that do not have one or both of these requirements, I recommend a nonparametric method, which requires only pencils, paper, and a calculator, and which is based on the principles of simplicity and empirical validity.


The recommended method simply makes use of the test effectiveness metric discussed in Chapter 6. Assuming the level of test effectiveness is established via previous similar projects, with the volume of testing defects for the current project available, the volume of field defects can be estimated via the test effectiveness formula. Specifically, use the following three-step approach:


1. Select one or more previous projects that are comparable to the current project. Examine and establish the test effectiveness level and its variability (e.g., average, standard deviation, interval estimates) for these projects.


2. Gather the testing defect data from the current project, then estimate the number of field defects based on the test effectiveness formula and the established test effectiveness value. If possible, conduct interval estimates (using data on standard deviations) as well as point estimates.


3. Based on the pattern of defect arrivals over time from history (discussed in section 8.6), spread out the total number of estimated field defects into years, quarters, or even months for more precise maintenance planning.


As an example, the ratios of the number of defects found during the system test to the number of field defects in a year for four software projects are as follows:



  • Project A: 1.54


  • Project B: 1.55


  • Project C: 1.50


  • Project D: 1.34



In other words, the system test effectiveness for Project A is 1.54 / (1.54 + 1) = 60.6%, and so forth. The average ratio for the four projects is 1.48 and the average test effectiveness is 59.7%. Suppose the system test for the current project found 74 defects. Using the average ratio (or, equivalently, the average test effectiveness), the number of field defects of the current project within a year in the field is estimated to be 50 (74/1.48).
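The arithmetic of this point estimate can be sketched in a few lines of Python, using the ratios and defect count from the example above:

```python
# Ratio of system-test defects to first-year field defects, from four
# comparable past projects (step 1 of the three-step approach).
ratios = [1.54, 1.55, 1.50, 1.34]

avg_ratio = sum(ratios) / len(ratios)            # ~1.48
avg_effectiveness = avg_ratio / (avg_ratio + 1)  # ~0.597, i.e., 59.7%

# Step 2: the current project found 74 defects during system test.
system_test_defects = 74
estimated_field_defects = system_test_defects / avg_ratio  # ~50

print(round(avg_effectiveness * 100, 1), round(estimated_field_defects))
```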


An interval estimate is usually a better approach than a point estimate. To derive an interval estimate (with a lower bound and an upper bound), the size of the sample (in this case n = 4), its standard deviation, and the t-distribution will be involved. For details, refer to a text on statistical methods such as Snedecor and Cochran (1980, pp. 54-59).
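One way to carry out such a t-based interval estimate is sketched below. This is an illustration, not part of the text: the critical value 3.182 is the standard two-sided 95% t value for n - 1 = 3 degrees of freedom, and the bounds on the ratio are translated into bounds on the field-defect estimate.

```python
import statistics

ratios = [1.54, 1.55, 1.50, 1.34]   # ratios from the four past projects
n = len(ratios)
mean = statistics.mean(ratios)
sd = statistics.stdev(ratios)       # sample standard deviation
t_crit = 3.182                      # t(0.975, df = 3)

# 95% confidence interval for the mean ratio.
half_width = t_crit * sd / n ** 0.5
lower, upper = mean - half_width, mean + half_width

# Translate the ratio interval into field-defect bounds for 74 test defects.
# A higher ratio implies fewer field defects, so the bounds swap.
defects = 74
print(round(defects / upper), round(defects / lower))
```

Note that the interval here is for the mean ratio; with only four projects the interval is wide, which is itself useful information for maintenance planning.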


To cross-validate our estimate for this example, we also looked into a similar ratio for the four previous projects, by including the testing defects from two other tests (in addition to system test).


The ratios are as follows:



  • Project A: 2.93


  • Project B: 3.03


  • Project C: 3.26


  • Project D: 2.84



In other words, the cumulative test effectiveness of the two other tests and the system test for Project A is 2.93 / (2.93 + 1) = 74.6%. Using this new set of ratios, we came up with another estimate that is close to the first estimate. Therefore, our confidence in the robustness of the estimates increased. If the two estimates differ significantly, then some judgment has to be made.


Caution: An implicit assumption of this simple method for defect estimation is that the defect arrival patterns during testing are similar across projects. If they are not, the validity of this method will be compromised. If the pattern for the current project is in sharp contrast to comparable projects, such as the case shown in Figure 4.2, then the empirical validity of this method may not be established. In that case, we don't recommend using this method.


Once the total number of field defects is estimated, estimating the pattern of arrivals over time can be based on history, which is organization- and product-type specific. If there are considerable fluctuations in the history data, a three-point moving average can be used to come up with a smooth pattern. If no history data are available, assess the type of software and, if applicable, use the distribution pattern in Table 8.3 or Table 8.4.
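The three-point moving average mentioned above can be sketched as follows; the yearly percentages in the example are illustrative, not from the text:

```python
def moving_average_3(values):
    """Smooth a defect-arrival series with a three-point moving average.

    The first and last points are kept as-is, since they lack a full
    three-point window on both sides.
    """
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]
    for i in range(1, len(values) - 1):
        smoothed.append((values[i - 1] + values[i] + values[i + 1]) / 3)
    smoothed.append(values[-1])
    return smoothed

# Illustrative yearly field-defect percentages with some fluctuation.
history = [48, 30, 12, 7, 3]
print(moving_average_3(history))
```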


It is worth noting that for systems software, our observation is that the yearly distribution pattern is roughly similar to a half-life decay pattern, as in radioactive decay. In other words, defect volume in the first year is about half of the total defects over the life of defect arrivals of the product; in the second year, it is about half of the remaining defects, and so forth. If nothing else is available and only a high-level yearly spread is needed, this half-life decay pattern could be used for systems software.
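The half-life spread can be sketched as below; the total of 50 defects is taken from the earlier worked example, and the four-year horizon is assumed for illustration:

```python
def half_life_spread(total_defects, years):
    """Spread total field defects over years, halving the remainder each year.

    The final year simply receives whatever remains, so the yearly
    figures always sum to the total.
    """
    spread = []
    remaining = total_defects
    for _ in range(years - 1):
        this_year = remaining / 2
        spread.append(this_year)
        remaining -= this_year
    spread.append(remaining)
    return spread

print(half_life_spread(50, 4))   # [25.0, 12.5, 6.25, 6.25]
```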



Software reliability models are most often used for reliability projection when development work is complete and before the software is shipped to customers. They can also be used to model the failure pattern or the defect arrival pattern in the field and thereby provide valuable input to maintenance planning.






