Chapter 29 Estimating a Population Rate or Proportion
We’ve focused on drawing statistical inferences about a population mean, or about the difference between two means, where we care about a quantitative outcome. Now, we’ll tackle categorical outcomes. We’ll start by estimating a confidence interval around a population proportion.
29.1 Ebola Mortality Rates through 9 Months of the Epidemic
The World Health Organization’s Ebola Response Team published an article in the October 16, 2014 issue of the New England Journal of Medicine, which contained some data I will use in this example, focusing on materials from their Table 2.
As of September 14, 2014, a total of 4,507 confirmed and probable cases of Ebola virus disease (EVD) had been reported from West Africa. In our example, we will look at a set of 1,737 cases, with definitive outcomes, reported in Guinea, Liberia, Nigeria and Sierra Leone.
Across these 1,737 cases, a total of 1,229 cases led to death. Based on these sample data, what can be said about the case fatality rate in the population of EVD cases with definitive outcomes for this epidemic?
29.2 A 100(1-\(\alpha\))% Confidence Interval for a Population Proportion
Suppose we want to estimate a confidence interval for an unknown population proportion, \(\pi\), on the basis of a random sample of n observations from that population which yields a sample proportion of p. Note that this p is the sample proportion – it’s not a p value.
A 100(1-\(\alpha\))% confidence interval for the population proportion \(\pi\) can be created by using the standard normal distribution, the sample proportion, p, and the standard error of a sample proportion, which is defined as the square root of p multiplied by (1-p) divided by the sample size, n.
Specifically, our confidence interval is \(p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}\)
where \(Z_{\alpha/2}\) is the value from a standard Normal distribution cutting off the top \(\alpha/2\) of the distribution, which can be obtained in R by substituting the desired \(\alpha/2\) value into `qnorm(alpha/2, lower.tail = FALSE)`.
- Note: This interval is reasonably accurate so long as np and n(1-p) are each at least 5.
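As a quick sketch of how this calculation looks in R, the helper below simply translates the formula above into code. The name `prop_ci_z` is mine, for illustration only; it is not a base R function.

```r
# Sketch: 100(1-alpha)% confidence interval for a population proportion,
# using the Normal approximation described above.
prop_ci_z <- function(x, n, alpha = 0.05) {
  p  <- x / n                                 # sample proportion
  se <- sqrt(p * (1 - p) / n)                 # standard error of the sample proportion
  z  <- qnorm(alpha / 2, lower.tail = FALSE)  # cuts off the top alpha/2 of N(0,1)
  c(estimate = p, lower = p - z * se, upper = p + z * se)
}
```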
29.3 Working through the Ebola Virus Disease Example
In our case, we have n = 1,737 subjects, of whom we observed death in 1,229, for a sample proportion of \(p = \frac{1229}{1737} = 0.708\). The standard error of that sample proportion will be
\(SE(p) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.708(1-0.708)}{1737}} = 0.011\)
And our 95% confidence interval (so we’ll use \(\alpha\) = 0.05) for the true population proportion, \(\pi\), of EVD cases with definitive outcomes who will die is \(p \pm Z_{.025} \sqrt{\frac{p(1-p)}{n}}\), or 0.708 \(\pm\) 1.96(0.011) = \(0.708 \pm 0.022\), or (0.686, 0.730).
Note that I simply recalled from our prior work that \(Z_{0.025} = 1.96\), but we can verify this:
qnorm(0.025, lower.tail=FALSE)
[1] 1.96
Since both np=(1737)(0.708)=1230 and n(1-p)=(1737)(1-0.708)=507 are substantially greater than 5, this should be a reasonably accurate confidence interval.
We are 95% confident that the true population proportion is between 0.686 and 0.730. Equivalently, we could say that we’re 95% confident that the true case fatality rate expressed as a percentage rather than a proportion, is between 68.6% and 73.0%.
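If you have defined the `prop_ci_z` sketch from Section 29.2 (again, an illustrative helper, not part of base R), you can check this hand calculation directly:

```r
# Verify the hand calculation for the Ebola example
prop_ci_z(x = 1229, n = 1737, alpha = 0.05)
# estimate and interval come out to approximately 0.708 and (0.686, 0.729)
```

The upper bound here is a touch lower than the 0.730 we obtained by hand, simply because the computer carries the unrounded sample proportion and standard error through the calculation.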
I am aware of at least three different procedures for estimating a confidence interval for a population proportion using R. All have minor weaknesses, and in many practical situations none is importantly different from the others.
29.4 The `prop.test` approach (Wald test)
The `prop.test` function can be used to establish a very similar confidence interval to the one we calculated above, based on something called the Wald test.
prop.test(x = 1229, n = 1737)
1-sample proportions test with continuity correction
data: 1229 out of 1737, null probability 0.5
X-squared = 300, df = 1, p-value <2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.685 0.729
sample estimates:
p
0.708
The 95% confidence interval by this method is (0.685, 0.729), which is close to, but not quite the same as, our original estimate of (0.686, 0.730).
The difference from our hand-calculated interval is attributable to rounding, plus the addition of a continuity correction, since we are using a Normal approximation to the exact binomial distribution to establish our margin of error. By default, R includes this continuity correction for this test.
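If you want to see what `prop.test` reports without that continuity correction, it accepts a `correct = FALSE` argument, and the interval alone can be pulled from the returned object:

```r
# The same interval without the continuity correction
prop.test(x = 1229, n = 1737, correct = FALSE)$conf.int

# Or just the interval from the default (corrected) call
prop.test(x = 1229, n = 1737)$conf.int
```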
29.5 The `binom.test` approach (Clopper and Pearson “exact” test)
The `binom.test` command can be used to establish an “exact” confidence interval. This uses the method of Clopper and Pearson (1934), and it is exact in the sense that it guarantees, for instance, that the coverage probability is at least 95%, but it does not guarantee that the interval is no wider than it needs to be.
binom.test(x = 1229, n = 1737)
Exact binomial test
data: 1229 and 1737
number of successes = 1000, number of trials = 2000, p-value
<2e-16
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.686 0.729
sample estimates:
probability of success
0.708
The 95% confidence interval by this method is (0.686, 0.729), which is a little closer to, but still not quite the same as, our original estimate of (0.686, 0.730).
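Like `prop.test`, the `binom.test` function takes a `conf.level` argument, and the interval can be extracted from the returned object if that is all you need:

```r
# A 90% Clopper-Pearson interval for the same data
binom.test(x = 1229, n = 1737, conf.level = 0.90)$conf.int
```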
29.6 SAIFS: single augmentation with an imaginary failure or success
SAIFS stands for “single augmentation with an imaginary failure or success”, and the method I’ll describe is one of several similar approaches. An R function for calculating the relevant confidence interval appears in Section 29.8.
An approach I like for the estimation of a confidence interval for a single population proportion/rate is to estimate the lower bound of a confidence interval with an imaginary failure added to the observed data, and estimate the upper bound of a confidence interval with an imaginary success added to the data.
Suppose we have X successes in n trials, and we want to establish a confidence interval for the population proportion of successes.
Let \(p_1 = (X+0)/(n+1), p_2 = (X+1)/(n+1), q_1 = 1 - p_1, q_2 = 1 - p_2\)
The lower bound of a 100(1-\(\alpha\))% confidence interval for the population proportion of successes using the SAIFS procedure is then \(LB_{SAIFS}(x,n,\alpha) = p_1 - t_{\alpha/2, n-1}\sqrt{\frac{p_1 q_1}{n}}\)
The upper bound of that same 100(1-\(\alpha\))% confidence interval for the population proportion of successes using the SAIFS procedure is \(UB_{SAIFS}(x,n,\alpha) = p_2 + t_{\alpha/2, n-1}\sqrt{\frac{p_2 q_2}{n}}\)
29.7 Using the SAIFS Approach in the Ebola Example
Returning to the Ebola Virus Disease example, we’ve got 1229 “successes” (it’s hard to think of death as a success, but from a calculation standpoint, that’s what we’re doing) out of 1737 “trials”, so that X = 1229 and n = 1737.
So we have \(p_1 = \frac{X + 0}{n + 1} = \frac{1229}{1738} = 0.7071\), \(p_2 = \frac{X + 1}{n + 1} = \frac{1230}{1738} = 0.7077\), and \(q_1 = 1 - p_1 = 0.2929\) and \(q_2 = 1 - p_2 = 0.2923\)
We have \(n = 1737\), so if we want a 95% confidence interval (\(\alpha = 0.05\)), then we have \(t_{\alpha/2, n-1} = t_{0.025, 1736} = 1.9613\), which I determined using R’s `qt` function:
qt(0.025, df = 1736, lower.tail=FALSE)
[1] 1.96
- Thus, our lower bound for a 95% confidence interval is \(p_1 - t_{\alpha/2, n-1}\sqrt{\frac{p_1 q_1}{n}}\), or \(0.7071 - 1.9613\sqrt{\frac{0.7071(0.2929)}{1737}}\), which is 0.7071 - 0.0214 or 0.6857
- And our upper bound is \(p_2 + t_{\alpha/2, n-1}\sqrt{\frac{p_2 q_2}{n}}\), or \(0.7077 + 1.9613\sqrt{\frac{0.7077(0.2923)}{1737}}\), which is 0.7077 + 0.0214, or 0.7291
So the 95% SAIFS confidence interval for the population proportion, \(\pi\), of Ebola Virus Disease patients with definitive outcomes who will die is (0.686, 0.729).
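The arithmetic above is easy to check line by line in R, using the formulas from the previous section:

```r
# Verify the SAIFS hand calculation for x = 1229, n = 1737, 95% confidence
x <- 1229; n <- 1737
p1 <- (x + 0) / (n + 1)                            # about 0.7071
p2 <- (x + 1) / (n + 1)                            # about 0.7077
tcut <- qt(0.025, df = n - 1, lower.tail = FALSE)  # about 1.9613
lower <- p1 - tcut * sqrt(p1 * (1 - p1) / n)       # about 0.686
upper <- p2 + tcut * sqrt(p2 * (1 - p2) / n)       # about 0.729
c(lower, upper)
```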
29.8 A Function in R to Calculate the SAIFS Confidence Interval
I built an R function called `saifs.ci`, which is contained in the Markdown for this document as well as in the `R functions by T Love.R` script on the web site. It takes as its arguments x = the number of successes, n = the number of trials, and conf.level = the confidence level, and it produces the sample proportion along with the SAIFS lower and upper bounds for the specified two-sided confidence interval for the population proportion, using the equations above.
Here, for instance, are 95%, 90% and 99% confidence intervals for the population proportion \(\pi\) that we have been studying in the example about Ebola Virus disease.
saifs.ci(x = 1229, n = 1737)
Sample Proportion 0.025 0.975
0.708 0.686 0.729
saifs.ci(x = 1229, n = 1737, conf=0.9)
Sample Proportion 0.05 0.95
0.708 0.689 0.726
saifs.ci(x = 1229, n = 1737, conf=0.99, dig=5)
Sample Proportion 0.005 0.995
0.70754 0.67898 0.73585
Note that in the final call, I asked for the results to be rounded to five decimal places rather than the default of three.
29.8.1 The `saifs.ci` function in R
`saifs.ci` <-
  function(x, n, conf.level = 0.95, dig = 3)
  {
    # x = number of successes, n = number of trials
    # conf.level = two-sided confidence level, dig = decimal places to report
    p.sample <- round(x / n, digits = dig)      # observed sample proportion
    p1 <- x / (n + 1)                           # proportion with an imaginary failure added
    p2 <- (x + 1) / (n + 1)                     # proportion with an imaginary success added
    var1 <- (p1 * (1 - p1)) / n
    se1 <- sqrt(var1)                           # standard error based on p1
    var2 <- (p2 * (1 - p2)) / n
    se2 <- sqrt(var2)                           # standard error based on p2
    lowq <- (1 - conf.level) / 2                # tail area on each side
    tcut <- qt(lowq, df = n - 1, lower.tail = FALSE)   # t cutoff with n-1 df
    lower.bound <- round(p1 - tcut * se1, digits = dig)
    upper.bound <- round(p2 + tcut * se2, digits = dig)
    res <- c(p.sample, lower.bound, upper.bound)
    names(res) <- c('Sample Proportion', lowq, 1 - lowq)
    res
  }
29.9 Comparing the Confidence Intervals for the Ebola Virus Disease Example
Our three approaches give the following results:
| Approach | 95% CI for Population Proportion |
|---|---|
| `prop.test` (Wald) | (0.685, 0.729) |
| `binom.test` (Clopper & Pearson) | (0.686, 0.729) |
| `saifs.ci` (Borkowf) | (0.686, 0.729) |
So in this case, it really doesn’t matter which one you choose. With a smaller sample, however, the choice of method can matter, as we will see next.
29.10 Can the Choice of Confidence Interval Method Matter?
With a small sample size and a probability of “success” close to either 0 or 1, the SAIFS approach will give a substantially different confidence interval from either the Wald or Clopper-Pearson approach. For instance, suppose we run 10 trials, obtain a single success, and then use these data to estimate the true proportion of successes, \(\pi\).
The 95% confidence intervals under this circumstance are very different.
| Method | R Command | 95% CI for \(\pi\) |
|---|---|---|
| Wald | `prop.test(x = 1, n = 10)` | (0.005, 0.459) |
| Clopper-Pearson | `binom.test(x = 1, n = 10)` | (0.003, 0.445) |
| SAIFS | `saifs.ci(x = 1, n = 10)` | (-0.115, 0.458) |
Note that the Wald and Clopper-Pearson procedures at least force the confidence interval to appear in the (0, 1) range. The SAIFS approach gives us some impossible values, and is thus a bit hard to take seriously – in reporting the result, we’d probably have to report the SAIFS interval as (0, 0.458).
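To reproduce comparisons like this one yourself, a small helper along the lines sketched below can be handy. The name `compare_prop_cis` is made up for this illustration, and the sketch assumes the `saifs.ci` function from Section 29.8.1 has already been defined in your session:

```r
# Gather the three intervals for a given x and n (a sketch, not a package function)
compare_prop_cis <- function(x, n, conf.level = 0.95) {
  res <- rbind(
    Wald              = prop.test(x = x, n = n, conf.level = conf.level)$conf.int,
    `Clopper-Pearson` = binom.test(x = x, n = n, conf.level = conf.level)$conf.int,
    SAIFS             = saifs.ci(x = x, n = n, conf.level = conf.level)[2:3]
  )
  colnames(res) <- c("lower", "upper")
  res
}

compare_prop_cis(x = 1, n = 10)
```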
Suppose instead that our null hypothesis is that the true proportion \(\pi\) is 0.10 (or 10%), and that we observe 4 successes in 100 trials. If we run each of these three methods to obtain a 95% confidence interval, we come to somewhat different conclusions depending on the choice of method.
| Method | R Command | 95% CI for \(\pi\) |
|---|---|---|
| Wald | `prop.test(x = 4, n = 100)` | (0.013, 0.105) |
| Clopper-Pearson | `binom.test(x = 4, n = 100)` | (0.011, 0.099) |
| SAIFS | `saifs.ci(x = 4, n = 100)` | (0.001, 0.093) |
Now, the Wald interval suggests we retain the null hypothesis, the Clopper-Pearson interval suggests we reject it (barely), and the SAIFS interval points more clearly toward rejecting \(H_0: \pi = 0.10\) in favor of \(H_A: \pi \neq 0.10\).
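Using the hypothetical `compare_prop_cis` helper sketched above, one quick way to see which of the three intervals contains the null value of 0.10:

```r
cis <- compare_prop_cis(x = 4, n = 100)
cis

# Does each 95% interval contain the null value pi = 0.10?
cis[, "lower"] <= 0.10 & 0.10 <= cis[, "upper"]
```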
None of these three approaches is always better than the others. When we have a sample size below 100, or the sample proportion of success is either below 0.10 or above 0.90, caution is warranted, although in many cases, the various methods give similar responses.
| Data | Wald 95% CI | Clopper-Pearson 95% CI | SAIFS 95% CI |
|---|---|---|---|
| 10 successes in 30 trials | (0.179, 0.529) | (0.173, 0.528) | (0.148, 0.534) |
| 10 successes in 50 trials | (0.105, 0.341) | (0.100, 0.337) | (0.083, 0.333) |
| 90 successes in 100 trials | (0.820, 0.948) | (0.824, 0.951) | (0.829, 0.960) |
| 95 successes in 100 trials | (0.882, 0.981) | (0.887, 0.984) | (0.894, 0.994) |