# Using worksheet functions for quality control

This article is an excerpt from *Business Analysis with Microsoft® Excel, Second Edition* by Conrad Carlberg and is available in electronic format on Safari® HelpDesk Online.

**Safari** puts the power of a comprehensive Microsoft Office library right on your desktop. This book excerpt courtesy of Safari Books Online.

Applies to |
---|

Microsoft Office Excel 2003 |

Show me more Safari titles about Excel.

You can use Excel to help answer a variety of questions that occasionally arise in quality control situations. It's important to understand the nature of the tools that are available to you when such questions come up. Because statistical quality control is largely an exercise in probability—and in choosing the right tool in a given case—this chapter concludes with a discussion of some of these questions and how you can use Excel to answer them.

## Sampling units from a finite population

To this point we have discussed statistical process control and acceptance sampling in terms of theoretically infinite populations. There has been no limit to the size of the population of products that has been sampled to create X-and-S charts, P-charts or operating characteristic curves.

Things change some when you sample from a finite population. You have a finite population when you are interested only in a specific group, such as the items in a special production run, or your own company's sales staff, or customers' responses to a temporary price reduction.

When you sample from a finite population, it's usual to do so without replacement: that is, if you are going to sample two items, you select the first item, and then select the second item without putting the first item back into the pool. Suppose that you have a population of 10 items. The chance of selecting any given item at random is 1/10, or 10%. If, after selecting the item, you put it back into the pool, the chance of selecting any given item is still 10%. If, however, you do not return the item to the pool, the chance of selecting at random any given item as your second choice is 1/9, or 11%.

### Case study: manufacturing

A customer wants you to manufacture 200 coffee cups with a glazing that differs substantially from the glazing of those that you normally make. This will be a special production run, and therefore the 200 cups constitute a finite population. Your agreement with the customer allows for a 5% defect rate.

You plan to sample 20 cups, without replacement, from your production run, and to reject the run if you find unacceptable imperfections in the glazing on more than 5% of the sample. That is, you will reject the run if the sample contains two or more defectives. What is the probability that your full run has met the 5% criterion if you find zero or one defective cup?

You answer this question with Excel's HYPGEOMDIST function. It takes four arguments:

- The number of "successes" in the sample. Here, that argument is zero or one: the number of imperfect cups in your sample if you are to accept the production run.
- The sample size. Here, that argument is 20, the number of cups you sample.
- The number of "successes" in the population. Here, that argument is 10. If you tested all 200 cups, then the 5% criterion implies that not more than 10 would be imperfect.
- The size of the population. Here, that argument is 200, the number of special cups that you manufactured.

HYPGEOMDIST returns the probability that you would observe an exact number of successes, given the sample size, the successes in the population, and the population size. So, if you entered

**=HYPGEOMDIST(0,20,10,200)**

then Excel would return .34. Therefore, there's a 34% probability of finding no imperfect cups in a sample of 20, when there are exactly 10 imperfect cups in the population of 200.

You also need to know the probability of finding exactly one defective cup in your sample, so you enter

**=HYPGEOMDIST(1,20,1****0,200)**

Excel returns .40. Therefore, there's a 40% probability of finding exactly one imperfect cup in your sample. Together, these two probabilities add up to 74%. So it is more likely (74%) than not (100% – 74% = 26%) that there are 10 imperfect cups in the full production run.

**Tip ** You can use an array constant in HYPGEOMDIST, and similar functions, to get the sum of the function's results. Instead of entering

**=HYPGEOMDIST(0,20,10,200)**

and

**=HYPGEOMDIST(1,20,10,200)**

and summing the results, you can enter

**=SUM(HYPGEOMDIST({ 0,1} ,20,10,200))**

This formula executes HYPGEOMDIST twice: once for the first element in the array {0,1} and once for the second element. It then adds the results together, and in this case it returns .74, or 74%.

## Sampling units from a nonfinite population

When you monitor a nonfinite population, you are interested in a larger group than when you monitor a finite population. For example, instead of testing a special, finite production run, you might be testing your normal, ongoing, nonfinite product line. If you were testing a new invoice format, you might try it for a week before deciding to adopt it; then, sampling for invoice accuracy would involve a finite population. On the other hand, if you were monitoring invoice accuracy as a normal procedure, you would probably consider your sample to be from a nonfinite population.

To make probability statements about a sample from a nonfinite population, you use Excel's NORMSINV function.

### Case study: videotape rentals

Suppose that you are in the business of renting videotapes to the public. After tapes have been viewed some number of times, their quality deteriorates to the point that you consider them to be defective. Furthermore, some customers own tape players that are defective, and they can ruin your videotapes.

You want to maintain an inventory of tapes that is at least 85% acceptable; you would really prefer 95%, but because you can't keep your customers' machines from damaging the rental tapes, you relax the criterion a little. Although you have a finite number of tapes on any given day, your inventory is constantly changing due to the acquisition of new tapes, and the removal of old ones. You therefore consider your population of tapes to be nonfinite.

Testing video tapes is a time consuming process, and you would like to keep the sample as small as possible. A rule of thumb that works well in quality testing is to make sure both of these equations result in a number that is 5 or greater:

n * p

and

n * (p-1)

where n is the sample size, and p is the probability of an acceptable unit in the population. If your tapes meet your criterion of 85% acceptable, p is .85. To make sure that both n * p and n * (p–1) are both greater than 5, you will need n, the sample size, to be at least 43. To keep the numbers easy to work with, you decide to take a sample of 50.

** Note ** The rule of thumb described previously is due to the relationship between the binomial and the normal distributions. The sampling distribution of a binomial variable such as defective/acceptable is very close to a normal distribution when both n * p and n * (1–p) are greater than 5.

You test the random sample of 50 tapes, finding three that are defective and 47 that are acceptable, so 94% of your sample is acceptable. What is the probability that at least 85% of your population of tapes is acceptable?

You decide that you want to make a correct decision about the defects in your population of tapes 95% of the time that you test samples. The following Excel formula returns the criterion that you need if you are to have that amount of protection (it is known as a critical value):

=NORMSINV(.95)

Excel returns 1.64. This critical value is the number that your test statistic needs to exceed if you are to make a correct decision.

**Tip** NORMSINV() is easier and quicker to use than NORMINV(). However, NORMINV() gives you greater control over the characteristics of the underlying distribution.

To get the test statistic itself, enter the Excel formula

**=(0.94-0.85)/SQRT(0.15*0.85/50)**

which returns 1.78. The general formula is

= (x-p)/SQRT(P*(P-1)/N)

where x is the percent acceptable in the sample, p is the hypothetical percent acceptable in the population, n is the sample size, and the denominator is the standard error of p.

Because your test statistic of 1.78 exceeds your critical value of 1.64, you conclude that your inventory of tapes is at least 85% acceptable.

## Sampling defects in units

This chapter has so far discussed one particular attribute: whether a unit is acceptable or defective. A related measure is the number of defects in a particular unit. For example, if you were concerned with the quality of the purchase orders that your company distributes, you might want a more detailed measure than acceptable/defective. It might be important to understand the frequency of occurrence of critical defects in the purchase orders (such as account number or ship-to address) versus minor defects (such as the spelling of the supplier's street address).

To make inferences about numbers of defects, as distinct from numbers of defective units, the Excel POISSON function is often useful.

### Case study: forms

As the manager of the purchasing department for a large firm, you have noticed an unusual number of complaints that deliveries from suppliers have been late. Following up on some of the complaints, you find that some suppliers report that the deliveries have been delayed because of errors on the purchase orders relating to unit pricing, want-dates, model numbers, and contract references.

Because suppliers sometimes place inappropriate blame on buyers, you decide to examine a sample of purchase orders to see whether the overall rate of defects per form might be high enough to cause these delays.

You decide that an overall defect rate of .5 defects per purchase order is acceptable. If Purchase Order A is perfect and Purchase Order B has only one defect (therefore, the average rate is .5 defects per purchase order), there should be enough good information on B for the supplier to be able either to fill the order or to quickly resolve the incorrect information.

You also decide that you want to limit the likelihood of deciding that the average defect rate is one-half of one defect per order, when in fact it is some other number, to 5%.

You sample 10 purchase orders at random and examine them for inaccuracies. You find 12 instances of misinformation in the sample. Given this data, should you continue to believe that the average number of defects in all your purchase orders is .5?

Use Excel's POISSON function. Enter

**=1-POISSON(11,5,TRUE)**

which returns .005. The first argument, 11, is 12–1: that is, the number of inaccuracies that you found, minus 1. The second argument, 5, is the number of inaccuracies that you would expect to find in 10 purchase orders if the average number of inaccuracies were .5. The third argument, trUE, specifies the cumulative form of the Poisson distribution: that is, the sum of the probability for zero inaccuracies, plus the probability for one inaccuracy, and so on.

You decided beforehand that the level of protection you wanted against an incorrect decision was 5%, or .05. Because .005 is less than .05, you reject your hypothesis that there are .5 errors per form among all your purchase orders. You probably need to make sure that your staff is properly trained on your new system, and that the system itself is operating as it was designed.

## About the author

Conrad Carlberg is president of Network Control Systems, Inc., a software and consulting firm that specializes in the development of quality control and forecasting tools for the health industry. He holds a Ph.D. in statistics from the University of Colorado, and is a multiple recipient of Microsoft's Excel's Most Valuable Professional award.

See titles by Conrad Carlberg on Safari.

Continue reading the next section of this book, "Measuring Quality: Summary," on Safari HelpDesk Online.