This is “Sample Size Considerations”, section 7.4 from the book Beginning Statistics (v. 1.0).

7.4 Sample Size Considerations

Learning Objective

  1. To learn how to apply formulas for estimating the size sample that will be needed in order to construct a confidence interval for a population mean or proportion that meets given criteria.

Sampling is typically done with a set of clear objectives in mind. For example, an economist might wish to estimate the mean yearly income of workers in a particular industry at 90% confidence and to within $500. Since sampling costs time, effort, and money, it would be useful to be able to estimate the smallest size sample that is likely to meet these criteria.

Estimating μ

The confidence interval formulas for estimating a population mean μ have the form $\bar{x} \pm E$. When the population standard deviation σ is known,

$$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

The number $z_{\alpha/2}$ is determined by the desired level of confidence. To say that we wish to estimate the mean to within a certain number of units means that we want the margin of error E to be no larger than that number. Thus we obtain the minimum sample size needed by solving the displayed equation for n.
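The algebra behind "solving for n" is short. The following is a sketch of the steps, using only the symbols already defined in this section:

```latex
\begin{aligned}
E &= z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z_{\alpha/2}\,\sigma}{E} \\
n &= \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2} = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{E^{2}}
\end{aligned}
```

Because the margin of error shrinks as n grows, any whole number at or above this value keeps the margin of error at or below E, which is why the result is rounded up rather than to the nearest integer.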

Minimum Sample Size for Estimating a Population Mean

The estimated minimum sample size n needed to estimate a population mean μ to within E units at $100(1-\alpha)\%$ confidence is

$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2} \quad \text{(rounded up)}$$

To apply the formula we must have prior knowledge of the population in order to have an estimate of its standard deviation σ. In all the examples and exercises the population standard deviation will be given.
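For readers who want to check such calculations by computer, here is a minimal Python sketch of the formula. The function name `min_sample_size_mean` and its parameter names are illustrative choices, not part of the text. The sketch takes the critical value from the standard library's `statistics.NormalDist` rather than from a printed table, so it uses an unrounded critical value (for example about 2.5758 instead of 2.576). In borderline cases this could give a result that differs by one from a table-based calculation.

```python
import math
from statistics import NormalDist


def min_sample_size_mean(confidence, sigma, margin):
    """Estimated minimum n so that z_{alpha/2} * sigma / sqrt(n) <= margin.

    confidence: level as a proportion, e.g. 0.95 for 95%
    sigma:      known or estimated population standard deviation
    margin:     desired margin of error E, in the same units as sigma
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    alpha = 1 - confidence
    # Critical value z_{alpha/2}: the point with area alpha/2 to its right
    # under the standard normal curve.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up, since a fractional observation cannot be taken.
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical usage with sigma = 1.3, E = 0.2, 99% confidence.
print(min_sample_size_mean(0.99, 1.3, 0.2))  # 281
```

The inputs must be in the same units: if σ is given in minutes and the desired margin of error in seconds, convert one of them before calling the function.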

Example 8

Find the minimum sample size necessary to construct a 99% confidence interval for μ with a margin of error E = 0.2. Assume that the population standard deviation is σ = 1.3.

Solution:

Confidence level 99% means that $\alpha = 1 - 0.99 = 0.01$, so $\alpha/2 = 0.005$. From the last line of Figure 12.3 "Critical Values of $t$" we obtain $z_{0.005} = 2.576$. Thus

$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2} = \frac{(2.576)^2 (1.3)^2}{(0.2)^2} = 280.361536$$

which we round up to 281, since it is impossible to take a fractional observation.
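Rounding up, rather than to the nearest whole number, is what guarantees the margin of error requirement is met. As an illustrative check (the helper name `margin_of_error` is an assumption made for this sketch), the following Python snippet evaluates E = z·σ/√n with the values of Example 8 for n = 280 and n = 281:

```python
import math


def margin_of_error(z, sigma, n):
    """Margin of error E = z * sigma / sqrt(n) for a mean with known sigma."""
    return z * sigma / math.sqrt(n)


# Values taken from Example 8: z_{0.005} = 2.576, sigma = 1.3, target E = 0.2.
z, sigma, target = 2.576, 1.3, 0.2

for n in (280, 281):
    e = margin_of_error(z, sigma, n)
    # Report whether this sample size meets the target margin of error.
    print(n, round(e, 4), e <= target)
```

With n = 280 the margin of error is slightly above 0.2, while with n = 281 it falls just below 0.2, so 281 is the smallest sample size that satisfies the criterion.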

Example 9

An economist wishes to estimate, with a 95% confidence interval, the yearly income of welders with at least five years' experience to within $1,000. He estimates that the range of incomes is no more than $24,000, so using the Empirical Rule he estimates the population standard deviation to be about one-sixth as much, or about $4,000. Find the estimated minimum sample size required.

Solution:

Confidence level 95% means that $\alpha = 1 - 0.95 = 0.05$, so $\alpha/2 = 0.025$. From the last line of Figure 12.3 "Critical Values of $t$" we obtain $z_{0.025} = 1.960$.

To say that the estimate is to be “to within $1,000” means that E = 1000. Thus

$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2} = \frac{(1.960)^2 (4000)^2}{(1000)^2} = 61.4656$$

which we round up to 62.

Estimating p

The confidence interval formula for estimating a population proportion p is $\hat{p} \pm E$, where

$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

The number $z_{\alpha/2}$ is determined by the desired level of confidence. To say that we wish to estimate the population proportion to within a certain number of percentage points means that we want the margin of error E to be no larger than that number (expressed as a proportion). Thus we obtain the minimum sample size needed by solving the displayed equation for n.

Minimum Sample Size for Estimating a Population Proportion

The estimated minimum sample size n needed to estimate a population proportion p to within E at $100(1-\alpha)\%$ confidence is

$$n = \frac{(z_{\alpha/2})^2 \, \hat{p}(1-\hat{p})}{E^2} \quad \text{(rounded up)}$$

There is a dilemma here: the formula for estimating how large a sample to take contains the number $\hat{p}$, which we know only after we have taken the sample. There are two ways out of this dilemma. Typically the researcher will have some idea as to the value of the population proportion p, hence of what the sample proportion $\hat{p}$ is likely to be. For example, if last month 37% of all voters thought that state taxes are too high, then it is likely that the proportion with that opinion this month will not be dramatically different, and we would use the value 0.37 for $\hat{p}$ in the formula.

The second approach to resolving the dilemma is simply to replace $\hat{p}$ in the formula by 0.5. This is because if $\hat{p}$ is large then $1-\hat{p}$ is small, and vice versa, which limits their product to a maximum value of 0.25, which occurs when $\hat{p}=0.5$. This is called the most conservative estimate, since it gives the largest possible estimate of n.
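The bound of 0.25 on the product can be verified by completing the square. This short derivation is supplied here as a supporting step and uses no assumptions beyond 0 ≤ p̂ ≤ 1:

```latex
\hat{p}\,(1-\hat{p})
  = \hat{p} - \hat{p}^{2}
  = \frac{1}{4} - \left(\hat{p} - \frac{1}{2}\right)^{2}
  \le \frac{1}{4}
```

Equality holds exactly when the squared term is zero, that is, when p̂ = 0.5. Any other value of p̂ makes the product, and hence the estimated n, smaller.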

Example 10

Find the necessary minimum sample size to construct a 98% confidence interval for p with a margin of error E = 0.05,

  1. assuming that no prior knowledge about p is available; and
  2. assuming that prior studies suggest that p is about 0.1.

Solution:

Confidence level 98% means that $\alpha = 1 - 0.98 = 0.02$, so $\alpha/2 = 0.01$. From the last line of Figure 12.3 "Critical Values of $t$" we obtain $z_{0.01} = 2.326$.

  1. Since there is no prior knowledge of p we make the most conservative estimate that $\hat{p} = 0.5$. Then

    $$n = \frac{(z_{\alpha/2})^2 \, \hat{p}(1-\hat{p})}{E^2} = \frac{(2.326)^2 (0.5)(1-0.5)}{(0.05)^2} = 541.0276$$

    which we round up to 542.

  2. Since p ≈ 0.1 we estimate $\hat{p}$ by 0.1, and obtain

    $$n = \frac{(z_{\alpha/2})^2 \, \hat{p}(1-\hat{p})}{E^2} = \frac{(2.326)^2 (0.1)(1-0.1)}{(0.05)^2} = 194.769936$$

    which we round up to 195.
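The two parts of Example 10 can be reproduced with a short Python sketch. The function name `min_sample_size_proportion` and the default `p_hat=0.5` (the most conservative estimate) are illustrative choices for this sketch. The critical value is computed with the standard library's `statistics.NormalDist` rather than read from a table, so it is unrounded (about 2.3263 instead of 2.326). In borderline cases the result could differ by one from a table-based calculation.

```python
import math
from statistics import NormalDist


def min_sample_size_proportion(confidence, margin, p_hat=0.5):
    """Estimated minimum n for estimating a proportion to within `margin`.

    confidence: level as a proportion, e.g. 0.98 for 98%
    margin:     margin of error E as a proportion, e.g. 0.05 for five points
    p_hat:      prior guess at the proportion; 0.5 is the most conservative
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    if not 0 < p_hat < 1:
        raise ValueError("p_hat must be strictly between 0 and 1")
    alpha = 1 - confidence
    # Critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up, since a fractional observation cannot be taken.
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)


print(min_sample_size_proportion(0.98, 0.05))             # part 1: 542
print(min_sample_size_proportion(0.98, 0.05, p_hat=0.1))  # part 2: 195
```

Comparing the two results shows how much prior knowledge of p can reduce the required sample: the conservative estimate asks for nearly three times as many observations.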

Example 11

A dermatologist wishes to estimate the proportion of young adults who apply sunscreen regularly before going out in the sun in the summer. Find the minimum sample size required to estimate the proportion to within three percentage points, at 90% confidence.

Solution:

Confidence level 90% means that $\alpha = 1 - 0.90 = 0.10$, so $\alpha/2 = 0.05$. From the last line of Figure 12.3 "Critical Values of $t$" we obtain $z_{0.05} = 1.645$.

Since there is no prior knowledge of p we make the most conservative estimate that $\hat{p} = 0.5$. To estimate “to within three percentage points” means that E = 0.03. Then

$$n = \frac{(z_{\alpha/2})^2 \, \hat{p}(1-\hat{p})}{E^2} = \frac{(1.645)^2 (0.5)(1-0.5)}{(0.03)^2} = 751.6736111$$

which we round up to 752.

Key Takeaways

  • If the population standard deviation σ is known or can be estimated, then the minimum sample size needed to obtain a confidence interval for the population mean with a given maximum error of the estimate and a given level of confidence can be estimated.
  • The minimum sample size needed to obtain a confidence interval for a population proportion with a given maximum error of the estimate and a given level of confidence can always be estimated. If there is prior knowledge of the population proportion p then the estimate can be sharpened.

Exercises

    Basic

  1. Estimate the minimum sample size needed to form a confidence interval for the mean of a population having the standard deviation shown, meeting the criteria given.

    1. σ = 30, 95% confidence, E = 10
    2. σ = 30, 99% confidence, E = 10
    3. σ = 30, 95% confidence, E = 5
  2. Estimate the minimum sample size needed to form a confidence interval for the mean of a population having the standard deviation shown, meeting the criteria given.

    1. σ = 4, 95% confidence, E = 1
    2. σ = 4, 99% confidence, E = 1
    3. σ = 4, 95% confidence, E = 0.5
  3. Estimate the minimum sample size needed to form a confidence interval for the proportion of a population that has a particular characteristic, meeting the criteria given.

    1. p ≈ 0.37, 80% confidence, E = 0.05
    2. p ≈ 0.37, 90% confidence, E = 0.05
    3. p ≈ 0.37, 80% confidence, E = 0.01
  4. Estimate the minimum sample size needed to form a confidence interval for the proportion of a population that has a particular characteristic, meeting the criteria given.

    1. p ≈ 0.81, 95% confidence, E = 0.02
    2. p ≈ 0.81, 99% confidence, E = 0.02
    3. p ≈ 0.81, 95% confidence, E = 0.01
  5. Estimate the minimum sample size needed to form a confidence interval for the proportion of a population that has a particular characteristic, meeting the criteria given.

    1. 80% confidence, E = 0.05
    2. 90% confidence, E = 0.05
    3. 80% confidence, E = 0.01
  6. Estimate the minimum sample size needed to form a confidence interval for the proportion of a population that has a particular characteristic, meeting the criteria given.

    1. 95% confidence, E = 0.02
    2. 99% confidence, E = 0.02
    3. 95% confidence, E = 0.01

    Applications

  1. A software engineer wishes to estimate, to within 5 seconds, the mean time that a new application takes to start up, with 95% confidence. Estimate the minimum size sample required if the standard deviation of start up times for similar software is 12 seconds.

  2. A real estate agent wishes to estimate, to within $2.50, the mean retail cost per square foot of newly built homes, with 80% confidence. He estimates the standard deviation of such costs at $5.00. Estimate the minimum size sample required.

  3. An economist wishes to estimate, to within 2 minutes, the mean time that employed persons spend commuting each day, with 95% confidence. On the assumption that the standard deviation of commuting times is 8 minutes, estimate the minimum size sample required.

  4. A motor club wishes to estimate, to within 1 cent, the mean price of 1 gallon of regular gasoline in a certain region, with 98% confidence. Historically the variability of prices is measured by σ=$0.03. Estimate the minimum size sample required.

  5. A bank wishes to estimate, to within $25, the mean average monthly balance in its checking accounts, with 99.8% confidence. Assuming σ=$250, estimate the minimum size sample required.

  6. A retailer wishes to estimate, to within 15 seconds, the mean duration of telephone orders taken at its call center, with 99.5% confidence. In the past the standard deviation of call length has been about 1.25 minutes. Estimate the minimum size sample required. (Be careful to express all the information in the same units.)

  7. The administration at a college wishes to estimate, to within two percentage points, the proportion of all its entering freshmen who graduate within four years, with 90% confidence. Estimate the minimum size sample required.

  8. A chain of automotive repair stores wishes to estimate, to within five percentage points, the proportion of all passenger vehicles in operation that are at least five years old, with 98% confidence. Estimate the minimum size sample required.

  9. An internet service provider wishes to estimate, to within one percentage point, the current proportion of all email that is spam, with 99.9% confidence. Last year the proportion that was spam was 71%. Estimate the minimum size sample required.

  10. An agronomist wishes to estimate, to within one percentage point, the proportion of a new variety of seed that will germinate when planted, with 95% confidence. A typical germination rate is 97%. Estimate the minimum size sample required.

  11. A charitable organization wishes to estimate, to within half a percentage point, the proportion of all telephone solicitations to its donors that result in a gift, with 90% confidence. Estimate the minimum sample size required, using the information that in the past the response rate has been about 30%.

  12. A government agency wishes to estimate the proportion of drivers aged 16–24 who have been involved in a traffic accident in the last year. It wishes to make the estimate to within one percentage point and at 90% confidence. Find the minimum sample size required, using the information that several years ago the proportion was 0.12.

    Additional Exercises

  1. An economist wishes to estimate, to within six months, the mean time between sales of existing homes, with 95% confidence. Estimate the minimum size sample required. In his experience virtually all houses are re-sold within 40 months, so using the Empirical Rule he will estimate σ by one-sixth the range, or 40/6 ≈ 6.7.

  2. A wildlife manager wishes to estimate the mean length of fish in a large lake, to within one inch, with 80% confidence. Estimate the minimum size sample required. In his experience virtually no fish caught in the lake is over 23 inches long, so using the Empirical Rule he will estimate σ by one-sixth the range, or 23/6 ≈ 3.8.

  3. You wish to estimate the current mean birth weight of all newborns in a certain region, to within 1 ounce (1/16 pound) and with 95% confidence. A sample will cost $400 plus $1.50 for every newborn weighed. You believe the standard deviation of weights to be no more than 1.25 pounds. You have $2,500 to spend on the study.

    1. Can you afford the sample required?
    2. If not, what are your options?
  4. You wish to estimate a population proportion to within three percentage points, at 95% confidence. A sample will cost $500 plus 50 cents for every sample element measured. You have $1,000 to spend on the study.

    1. Can you afford the sample required?
    2. If not, what are your options?

Answers

    Basic

  1.
    1. 35
    2. 60
    3. 139
  3.
    1. 154
    2. 253
    3. 3832
  5.
    1. 165
    2. 271
    3. 4109

    Applications

  1. 23

  3. 62

  5. 955

  7. 1692

  9. 22,301

  11. 22,731

    Additional Exercises

  1. 5

  3.
    1. no
    2. decrease the confidence level