13  Proportions and Rates

13.1 R setup for this chapter

Note

Appendix A lists all R packages used in this book, and also provides R session information.

13.2 Data: strep_tb data from the medicaldata R package

Note

Appendix C provides further guidance on pulling data from other systems into R, while Appendix D gives more information (including download links) for all data sets used in this book. Appendix B describes the Love-431.R script, and demonstrates its use.

My source for these data is Higgins (2023). See https://higgi13425.github.io/medicaldata/ for more details.

In this chapter, we start with the strep_tb data from the medicaldata R package. We’ll look at the study arm (Streptomycin or Control) and the dichotomous outcome improved (TRUE or FALSE). The code below converts this logical TRUE/FALSE variable into a factor, which we’ll call imp_f.

strep <- medicaldata::strep_tb |>
  mutate(
    imp_f = factor(improved),
    imp_f = fct_recode(imp_f,
      "Improved" = "TRUE",
      "Worsened" = "FALSE"
    ),
    imp_f = fct_relevel(imp_f, "Improved")
  )
strep
# A tibble: 107 × 14
   patient_id arm     dose_strep_g dose_PAS_g gender baseline_condition
   <chr>      <fct>          <dbl>      <dbl> <fct>  <fct>             
 1 0001       Control            0          0 M      1_Good            
 2 0002       Control            0          0 F      1_Good            
 3 0003       Control            0          0 F      1_Good            
 4 0004       Control            0          0 M      1_Good            
 5 0005       Control            0          0 F      1_Good            
 6 0006       Control            0          0 M      1_Good            
 7 0007       Control            0          0 F      1_Good            
 8 0008       Control            0          0 M      1_Good            
 9 0009       Control            0          0 F      2_Fair            
10 0010       Control            0          0 M      2_Fair            
# ℹ 97 more rows
# ℹ 8 more variables: baseline_temp <fct>, baseline_esr <fct>,
#   baseline_cavitation <fct>, strep_resistance <fct>, radiologic_6m <fct>,
#   rad_num <dbl>, improved <lgl>, imp_f <fct>
table(strep$arm, strep$imp_f)
              
               Improved Worsened
  Streptomycin       38       17
  Control            17       35

Reading this two-by-two table, we have 38 subjects who received Streptomycin and improved, 17 who received Streptomycin and worsened, 17 in the Control arm who improved, and 35 in the Control arm who worsened, for a total of 107 subjects.
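
If we want R to confirm those counts, along with the row, column, and overall totals, one small check we could add here (not part of the original analysis, and with output omitted) is to request the table with its margins:

addmargins(table(strep$arm, strep$imp_f))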

13.3 Estimating a Proportion

Within those who received Streptomycin, 38 improved and 17 did not out of 55 subjects. Can we estimate a confidence interval for the population proportion improving across all subjects who received Streptomycin?
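
One quick way to obtain such an interval, using the binom.test() approach we demonstrate in the next section (output omitted here), would be:

binom.test(x = 38, n = 55, conf.level = 0.95)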

13.4 Standard errors and confidence intervals

Note: This section is adapted from Gelman, Hill, and Vehtari (2021), pages 51-52.

The standard error is the estimated standard deviation of an estimate, and it helps us understand how much uncertainty in that estimate can be attributed to the fact that we’re taking an (ideally random) sample from the population or process of interest.

  • When estimating a population mean using the sample mean from a simple random sample of size \(n\), the standard error of the sample mean is \(\sigma/\sqrt{n}\), where \(\sigma\) is the population standard deviation.
  • When estimating a population proportion, where we have a sample of size \(n\) with \(y\) yes and \(n-y\) no responses, the estimated proportion of the population who would answer yes is \(\hat{p} = y/n\), and the standard error of this estimate is \(\sqrt{\hat{p}(1 - \hat{p})/n}\).
    • This method works pretty well so long as \(y\) is somewhere between say 10% and 90% of \(n\), and so long as \(y\) and \(n-y\) are both at least 5.

The sampling distribution of our estimator often follows a Normal (bell-shaped) distribution thanks to the Central Limit Theorem. The usual 95% confidence interval for large samples is based on the assumption that the sampling distribution follows the Normal distribution, and is estimated by the point estimate \(\pm\) 2 standard errors. A 50% confidence interval is estimated by the point estimate \(\pm\) 2/3 of a standard error.

For example, if we had 40 successes out of 100 trials, we’d have

  • \(\hat{p} = 40/100 = 0.4\) with
  • standard error \(\sqrt{0.4(1 - 0.4)/100}\) = 0.049.
  • Our 95% confidence interval would be approximately \(0.4 \pm 2 \times 0.049\), or (.302, .498).
  • Our 50% confidence interval would be approximately \(0.4 \pm 2/3 \times 0.049\), or (.367, .433).
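
Here is a minimal sketch of those “by hand” calculations in R (using the rough multipliers of 2 and 2/3 from above, rather than exact Normal quantiles):

p_hat <- 40 / 100
se <- sqrt(p_hat * (1 - p_hat) / 100)  # standard error of the sample proportion
p_hat + c(-2, 2) * se                  # approximate 95% confidence interval
p_hat + c(-2/3, 2/3) * se              # approximate 50% confidence interval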

We can use R to calculate these results more precisely as follows…

binom.test(x = 40, n = 100, conf.level = 0.95)

    Exact binomial test

data:  40 and 100
number of successes = 40, number of trials = 100, p-value = 0.05689
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.3032948 0.5027908
sample estimates:
probability of success 
                   0.4 
binom.test(x = 40, n = 100, conf.level = 0.50)

    Exact binomial test

data:  40 and 100
number of successes = 40, number of trials = 100, p-value = 0.05689
alternative hypothesis: true probability of success is not equal to 0.5
50 percent confidence interval:
 0.3628095 0.4385965
sample estimates:
probability of success 
                   0.4 

The confidence interval represents a range of values which are roughly consistent with the data, given the assumed sampling distribution. If the model for our data is correct, then in repeated applications of the same procedure, 95% confidence intervals will include the true value 95% of the time.
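
If you would like to see that repeated-sampling property in action, here is a small simulation sketch (my own illustration, assuming a true proportion of 0.4 and samples of size 100) which estimates how often the exact binomial interval covers the truth:

set.seed(431)
true_p <- 0.4
covered <- replicate(1000, {
  y <- rbinom(1, size = 100, prob = true_p)             # simulate one sample
  ci <- binom.test(y, 100, conf.level = 0.95)$conf.int  # 95% interval for that sample
  ci[1] <= true_p & true_p <= ci[2]                     # did the interval cover the truth?
})
mean(covered)  # proportion of intervals covering 0.4; should be at least about 0.95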

13.4.1 Using a Bayesian augmentation

In some settings, especially if < 10% or > 90% of the yes/no responses are yes, we can use the estimate \(\hat{p} = (y + 2) / (n + 4)\) rather than \(y/n\), and the standard error \(\sqrt{\hat{p}(1 - \hat{p})/(n + 4)}\), while setting the minimum value of the confidence interval for \(p\) to be 0, and the maximum to 1. We can accomplish this with the following augmented call to the binom.test() function.

binom.test(x = 38 + 2, n = 55 + 4, conf.level = 0.95)

    Exact binomial test

data:  38 + 2 and 55 + 4
number of successes = 40, number of trials = 59, p-value = 0.008641
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5436200 0.7937535
sample estimates:
probability of success 
             0.6779661 
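
For comparison, here is a minimal by-hand sketch of that augmented estimate and its rough Normal-approximation interval for the 38 out of 55 example, truncated to stay within [0, 1] as described above (this sketch is my addition, not part of the Love-431.R script):

y <- 38; n <- 55
p_adj <- (y + 2) / (n + 4)                      # augmented estimate
se_adj <- sqrt(p_adj * (1 - p_adj) / (n + 4))   # augmented standard error
ci <- p_adj + c(-2, 2) * se_adj                 # rough 95% interval
pmax(0, pmin(1, ci))                            # truncate to the [0, 1] range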

13.4.2 SAIFS: single augmentation with an imaginary failure or success

Another approach to building a confidence interval for a proportion with a Bayesian-style augmentation is the saifs_ci() function I have provided below, and in Appendix B as part of the Love-431.R script.

# saifs_ci(): confidence interval for a proportion using a
# single augmentation with an imaginary failure or success.
# Requires the tibble package (loaded with the tidyverse).
saifs_ci <- 
  function(x, n, conf.level = 0.95, dig = 3)
  {
    # sample proportion, rounded to `dig` decimal places
    p.sample <- round(x / n, digits = dig)
    
    # augmented proportions: add one imaginary failure (p1) or success (p2)
    p1 <- x / (n + 1)
    p2 <- (x + 1) / (n + 1)
    
    # variances and standard errors based on the augmented proportions
    var1 <- (p1 * (1 - p1)) / n
    se1 <- sqrt(var1)
    var2 <- (p2 * (1 - p2)) / n
    se2 <- sqrt(var2)
    
    # t cutoff for the requested confidence level, with n - 1 df
    lowq <- (1 - conf.level) / 2
    tcut <- qt(lowq, df = n - 1, lower.tail = FALSE)
    
    # lower bound uses the imaginary failure; upper bound, the imaginary success
    lower.bound <- round(p1 - tcut * se1, digits = dig)
    upper.bound <- round(p2 + tcut * se2, digits = dig)
    
    tibble(
      sample_x = x,
      sample_n = n,
      sample_p = p.sample,
      lower = lower.bound,
      upper = upper.bound,
      conf_level = conf.level
    )
  }

We can use the saifs_ci() function from Love-431.R as follows with this example, restricting the result to three decimal places.

saifs_ci(x = 38, n = 55, conf.level = 0.95, dig = 3)
# A tibble: 1 × 6
  sample_x sample_n sample_p lower upper conf_level
     <dbl>    <dbl>    <dbl> <dbl> <dbl>      <dbl>
1       38       55    0.691 0.552 0.821       0.95
saifs_ci(x = 38, n = 55, conf.level = 0.50, dig = 3)
# A tibble: 1 × 6
  sample_x sample_n sample_p lower upper conf_level
     <dbl>    <dbl>    <dbl> <dbl> <dbl>      <dbl>
1       38       55    0.691 0.636 0.739        0.5

13.5 Larger Example: Ebola Mortality Rates

The World Health Organization’s Ebola Response Team published an article in the October 16, 2014 issue of the New England Journal of Medicine, which contained some data I will use in this example, focusing on materials from their Table 2.

As of September 14, 2014, a total of 4,507 confirmed and probable cases of Ebola virus disease (EVD) had been reported from West Africa. In our example, we will look at a set of 1,737 cases, with definitive outcomes, reported in Guinea, Liberia, Nigeria and Sierra Leone.

Across these 1,737 cases, a total of 1,229 cases led to death. Based on these sample data, what can be said about the case fatality rate in the population of EVD cases with definitive outcomes for this epidemic?

We have n = 1,737 subjects, of whom we observed death in 1,229, for a sample proportion of \(p = \frac{1229}{1737} = 0.708\). The standard error of that sample proportion will be

\(SE(p) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.708(1-0.708)}{1737}} = 0.011\)

And our 95% confidence interval (so that we’ll use \(\alpha\) = 0.05) for the true population proportion, \(\pi\), of EVD cases with definitive outcomes, who will die is \(p \pm Z_{.025} \sqrt{\frac{p(1-p)}{n}}\), or 0.708 \(\pm\) 1.96(0.011) = \(0.708 \pm 0.022\), or (0.686, 0.730)

Note that I simply recalled from our prior work that \(Z_{0.025} = 1.96\), but we can verify this:

qnorm(0.025, lower.tail=FALSE)
[1] 1.959964

Since both \(np = (1737)(0.708) \approx 1230\) and \(n(1-p) = (1737)(1-0.708) \approx 507\) are substantially greater than 5, this should be a reasonably accurate confidence interval.
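
As a check on that hand calculation, here is a minimal sketch in R (output omitted) that reproduces the large-sample interval and also obtains the exact binomial alternative:

p <- 1229 / 1737
se <- sqrt(p * (1 - p) / 1737)
p + c(-1, 1) * qnorm(0.975) * se                  # should match the hand calculation, up to rounding
binom.test(x = 1229, n = 1737, conf.level = 0.95) # exact binomial interval, as an alternative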

13.5.1 What about the saifs_ci() result?

saifs_ci(x = 1229, n = 1737, conf.level = 0.95)
# A tibble: 1 × 6
  sample_x sample_n sample_p lower upper conf_level
     <dbl>    <dbl>    <dbl> <dbl> <dbl>      <dbl>
1     1229     1737    0.708 0.686 0.729       0.95

13.6 Assessing a 2 x 2 table

Consider the following two-by-two table (two rows and two columns, ignoring the margins):

table(strep$arm, strep$imp_f)
              
               Improved Worsened
  Streptomycin       38       17
  Control            17       35

This table is in standard epidemiological format, which means that:

  • The rows of the table describe the “treatment” (which we’ll take here to be arm).
    • The more interesting (sometimes also the more common) “treatment” is placed in the top row. That’s Streptomycin here.
  • The columns of the table describe the “outcome” (which we’ll take here to be whether the subject improved or not.)
    • Typically, the more common or more interesting “outcome” is placed to the left. Here, we’ll use “improved” on the left.
twoby2(table(strep$arm, strep$imp_f), conf.level = 0.95)
2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Improved 
Comparing : Streptomycin vs. Control 

             Improved Worsened    P(Improved) 95% conf. interval
Streptomycin       38       17         0.6909    0.5579   0.7984
Control            17       35         0.3269    0.2139   0.4644

                                   95% conf. interval
             Relative Risk: 2.1134    1.3773   3.2429
         Sample Odds Ratio: 4.6021    2.0389  10.3877
Conditional MLE Odds Ratio: 4.5304    1.8962  11.2779
    Probability difference: 0.3640    0.1754   0.5182

             Exact P-value: 0.0002 
        Asymptotic P-value: 0.0002 
------------------------------------------------------

13.7 How should we interpret these results?

In our sample, we observe that 69.1% of the 55 Streptomycin subjects improved, as compared to 32.7% of the 52 Control subjects. We want to compare these two probabilities (represented using proportions as 0.691 vs. 0.327) to estimate the size of the difference between the proportions with a point estimate and 95% confidence interval.

But there are other comparisons we could make, too. The tricky part is that we have multiple ways to describe the relationship between treatment and outcome. We might compare outcome “risks” directly using the difference in probabilities, or the ratio of the two probabilities, or we might convert the risks to odds, and compare the ratio of those odds. In any case, we’ll get different point estimates and confidence intervals, all of which will help us draw conclusions about the evidence available in this trial regarding differences between Streptomycin and Control.

13.8 Relating a Treatment to an Outcome

The question of interest is whether the percentage of subjects who improved is different (and probably larger) in the subjects who received Streptomycin than in those in the Control arm.

Treatment Arm   Improved   Worsened   Total   Proportion who improved
Streptomycin          38         17      55                     0.691
Control               17         35      52                     0.327

In other words, what is the relationship between the treatment and the outcome?

13.9 Definitions of Probability and Odds

  • Proportion = Probability = Risk of the trait = number with trait / total
  • Odds of having the trait = (number with the trait / number without the trait) to 1

If p is the proportion of subjects with a trait, then the odds of having the trait are \(\frac{p}{1-p}\) to 1.

So, the probability of improvement in this case is \(\frac{38}{55} = 0.691\) in the Streptomycin group. The odds of a good result are thus \(\frac{0.691}{1 - 0.691} = 2.236\) to 1 in the Streptomycin group.

Treatment Arm   Improved   Worsened   Total   Pr(improved)   Odds(improved)
Streptomycin          38         17      55          0.691            2.236
Control               17         35      52          0.327            0.486
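
A minimal sketch of those probability and odds calculations in R:

p_strep <- 38 / 55             # Pr(improved | Streptomycin)
p_control <- 17 / 52           # Pr(improved | Control)
p_strep / (1 - p_strep)        # odds of improvement, Streptomycin, about 2.236
p_control / (1 - p_control)    # odds of improvement, Control, about 0.486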

13.10 Defining the Relative Risk

Among the Streptomycin subjects, the risk of improvement (a good outcome) is 69.1% or, stated as a proportion, 0.691. Among the Control subjects, the risk of a good outcome is 32.7% or, stated as a proportion, 0.327.

Our “crude” estimate of the relative risk of improvement for Streptomycin subjects as compared to Control subjects is the ratio of these two risks, or 0.691/0.327 = 2.113.

  • The fact that this relative risk is greater than 1 indicates that the probability of improving is higher for Streptomycin subjects than for Control subjects.
  • A relative risk of 1 would indicate that the probability of improving is the same for Streptomycin subjects and for Control subjects.
  • A relative risk less than 1 would indicate that the probability of improving is lower for Streptomycin subjects than for Control subjects.

13.11 Defining the Risk Difference

Our “crude” estimate of the risk difference in improvement for Streptomycin subjects as compared to Control subjects is 0.691 - 0.327 = 0.364, or 36.4 percentage points.

  • The fact that this risk difference is greater than 0 indicates that the probability of improving is higher for Streptomycin subjects than for Control subjects.
  • A risk difference of 0 would indicate that the probability of improving is the same for Streptomycin subjects and for Control subjects.
  • A risk difference less than 0 would indicate that the probability of improving is lower for Streptomycin subjects than for Control subjects.

13.12 Defining the Odds Ratio, or the Cross-Product Ratio

Among the Streptomycin subjects, the odds of improving are 2.236. Among the Control subjects, the odds of improving are 0.486.

So our “crude” estimate of the odds ratio of improving for Streptomycin subjects as compared to Control subjects is 2.236 / 0.486 = 4.60.

Another way to calculate this odds ratio is to calculate the cross-product ratio, which is equal to (a x d) / (b x c), for the 2 by 2 table with counts specified as shown:

A Generic Table      Good Outcome   Bad Outcome
Treatment Group 1               a             b
Treatment Group 2               c             d

So, for our table, we have a = 38, b = 17, c = 17, and d = 35, so the cross-product ratio is \(\frac{38 \times 35}{17 \times 17} = \frac{1330}{289} = 4.60\). As expected, this is the same as the “crude” odds ratio estimate.

  • The fact that this odds ratio is greater than 1 indicates that the odds of improving are higher for Streptomycin subjects than for Control subjects.
  • An odds ratio of 1 would indicate that the odds of improving are the same for Streptomycin subjects and for Control subjects.
  • An odds ratio less than 1 would indicate that the odds of improving are lower for Streptomycin subjects than for Control subjects.
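
Here is a minimal sketch that pulls these “crude” estimates together in R, working directly from the cell counts in our table:

p1 <- 38 / 55                        # Pr(improved | Streptomycin)
p2 <- 17 / 52                        # Pr(improved | Control)
p1 / p2                              # relative risk, about 2.113
p1 - p2                              # risk difference, about 0.364
(p1 / (1 - p1)) / (p2 / (1 - p2))    # odds ratio, about 4.60
(38 * 35) / (17 * 17)                # cross-product ratio, the same odds ratio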

So, we have several different ways to compare the outcomes across the treatments. Are these differences and ratios large enough to rule out chance?

13.13 The Chi-Square Tests

At the end of the twoby2() output, we see two hypothesis tests. The first is an “exact p-value” from Fisher’s exact test, and the second is an “asymptotic p-value” from Pearson’s chi-square test.

Both tests consider the null hypothesis that the rows are not associated in any way with the columns, so that the probability of improvement (in our case) is assumed to be exactly the same in the Streptomycin and Control groups. The small \(p\) values indicate that data like ours would be quite surprising if that null hypothesis were true, so they provide strong evidence against it, and we might reasonably reject it.
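
If you want to obtain these two p-values directly, base R provides fisher.test() and chisq.test(), which can be applied to the same table. A minimal sketch (output omitted), whose results should be close to, though not necessarily identical to, the values twoby2() reports:

tab <- table(strep$arm, strep$imp_f)
fisher.test(tab)   # exact test
chisq.test(tab)    # Pearson chi-square test (with Yates’ continuity correction, by default)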

More on chi-square tests is coming in our next chapter, when we deal with larger contingency tables (i.e., those with more than 2 rows, more than 2 columns, or both).

13.14 The twobytwo function in R

The twoby2() function comes from the Epi package. I have provided a similar function called twobytwo as part of the Love-431.R script (see Appendix B) which works well if you have the counts in a 2x2 table available to you, but not the individual data points. All that is required is a single command, and a two-by-two table in standard epidemiological format (with the outcomes in the columns, and the treatments in the rows.)

Our next example involves a study where we are interested in comparing the percentage of patients in each arm of a trial (of Ibuprofen vs. Placebo) that showed an improvement in their temperature (temp_drop > 0). Our primary interest is in comparing the percentage of Ibuprofen patients whose temperature dropped to the percentage of Placebo patients whose temperature dropped.

Here’s the data we have for the Ibuprofen vs. Placebo study.

Treatment Arm   Dropped   Did Not Drop
Ibuprofen           107             43
Placebo              80             70

The command just requires you to read off the four cell counts of the table (across the top row, then across the bottom row), followed by labels for the two treatments (the rows, or exposures), then labels for the two outcomes (the columns), and finally the confidence level you desire. We’ll use 90% here.

The resulting output follows.

twobytwo(107, 43, 80, 70, 
         "Ibuprofen", "Placebo", "Dropped", "No Drop",
         conf.level = 0.90)
2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Dropped 
Comparing : Ibuprofen vs. Placebo 

          Dropped No Drop    P(Dropped) 90% conf. interval
Ibuprofen     107      43        0.7133    0.6490   0.7701
Placebo        80      70        0.5333    0.4661   0.5993

                                   90% conf. interval
             Relative Risk: 1.3375    1.1492   1.5567
         Sample Odds Ratio: 2.1773    1.4583   3.2509
Conditional MLE Odds Ratio: 2.1716    1.4177   3.3437
    Probability difference: 0.1800    0.0881   0.2677

             Exact P-value: 0.0019 
        Asymptotic P-value: 0.0014 
------------------------------------------------------

13.14.1 Standard Epidemiological Format

Again, our table is in standard epidemiological format, which means that:

  • The rows of the table describe the “treatment” (which we’ll take here to be treat). The more interesting (sometimes also the more common) “treatment” is placed in the top row. That’s Ibuprofen here.
  • The columns of the table describe the “outcome” (which we’ll take here to be whether the subject’s temperature dropped.) Typically, the more common “outcome” is placed to the left.

13.14.2 Outcome Probabilities and Confidence Intervals Within the Treatment Groups

The twobytwo output starts with estimates of the probability (risk) of a “Dropped” outcome among subjects who fall into the two treatment groups (Ibuprofen or Placebo), along with 90% confidence intervals for each of these probabilities.

2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Dropped 
Comparing : Ibuprofen vs. Placebo 

          Dropped No Drop    P(Dropped) 90% conf. interval
Ibuprofen     107      43        0.7133    0.6490   0.7701
Placebo        80      70        0.5333    0.4661   0.5993

The conditional probability of a temperature drop, given that the subject is in the Ibuprofen group, is symbolized as Pr(Dropped | Ibuprofen) = 0.7133, and the 90% confidence interval around that proportion is (0.6490, 0.7701).

  • Note that these two confidence intervals fail to overlap, and so we expect to see a fairly large difference in the estimated probability of a temperature drop when we compare Ibuprofen to Placebo.

13.14.3 Relative Risk, Odds Ratio and Risk Difference, with Confidence Intervals

These elements are followed by estimates of the relative risk, odds ratio, and risk difference, each with associated 90% confidence intervals.

                                   90% conf. interval
             Relative Risk: 1.3375    1.1492   1.5567
         Sample Odds Ratio: 2.1773    1.4583   3.2509
Conditional MLE Odds Ratio: 2.1716    1.4177   3.3437
    Probability difference: 0.1800    0.0881   0.2677
  • The relative risk, or the ratio of P(Temperature Drop | Ibuprofen) to P(Temperature Drop | Placebo), is shown first. Note that the 90% confidence interval is entirely greater than 1.
  • The odds ratio is presented using two different definitions (the sample odds ratio is the cross-product ratio we mentioned earlier). Note that the 90% confidence interval using either approach is entirely greater than 1.
  • The probability (or risk) difference [P(Temperature Drop | Ibuprofen) - P(Temperature Drop | Placebo)] is presented last. Note that the 90% confidence interval is entirely greater than 0.
  • Note carefully that if there had been no difference between Ibuprofen and Placebo, the relative risk and odds ratios would be 1, but the probability difference would be zero.

13.15 Ebola Virus Study

Returning to the Ebola example, suppose now we want to compare the proportion of deaths among hospitalized cases with a definitive outcome to the proportion of deaths among non-hospitalized cases with a definitive outcome.

The article suggests that of the 1,737 cases with a definitive outcome, there were 1,153 hospitalized cases. Across those 1,153 hospitalized cases, 741 people (64.3%) died, which means that across the remaining 584 non-hospitalized cases, 488 people (83.6%) died.

Here is the initial contingency table, using only the numbers from the previous paragraph.

Initial Ebola Table   Deceased   Alive   Total
Hospitalized               741            1153
Not Hospitalized           488             584
Total                                     1737

Now, we can use arithmetic to complete the table, since the rows and the columns are each mutually exclusive and collectively exhaustive.

Ebola 2x2 Table       Deceased   Alive   Total
Hospitalized               741     412    1153
Not Hospitalized           488      96     584
Total                     1229     508    1737
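
That completion arithmetic is simple enough to sketch in R as a quick check:

1153 - 741    # Alive among the Hospitalized = 412
584 - 488     # Alive among the Not Hospitalized = 96
741 + 488     # total Deceased = 1229
412 + 96      # total Alive = 508
1153 + 584    # total cases with definitive outcomes = 1737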

We want to compare the fatality risk (the probability of landing in the Deceased column) for the population of people in the Hospitalized row to that for the population of people in the Not Hospitalized row. We can type the four cell counts, then the names of the rows, then the names of the columns into the twobytwo() function, like this.

twobytwo(741, 412, 488, 96, 
         "Hosp", "Not Hosp", "Dead", "Alive", 
         conf.level = 0.95)
2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Dead 
Comparing : Hosp vs. Not Hosp 

         Dead Alive    P(Dead) 95% conf. interval
Hosp      741   412     0.6427    0.6146   0.6698
Not Hosp  488    96     0.8356    0.8033   0.8635

                                    95% conf. interval
             Relative Risk:  0.7691    0.7271   0.8135
         Sample Odds Ratio:  0.3538    0.2756   0.4542
Conditional MLE Odds Ratio:  0.3540    0.2726   0.4566
    Probability difference: -0.1929   -0.2325  -0.1508

             Exact P-value: 0.0000 
        Asymptotic P-value: 0.0000 
------------------------------------------------------

Another way to accomplish the same thing would be to put the Alive outcome in the left column, like this:

twobytwo(412, 741, 96, 488, 
         "Hosp", "Not Hosp", "Alive", "Dead", 
         conf.level = 0.95)
2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Alive 
Comparing : Hosp vs. Not Hosp 

         Alive Dead    P(Alive) 95% conf. interval
Hosp       412  741      0.3573    0.3302   0.3854
Not Hosp    96  488      0.1644    0.1365   0.1967

                                   95% conf. interval
             Relative Risk: 2.1737    1.7823   2.6512
         Sample Odds Ratio: 2.8264    2.2016   3.6284
Conditional MLE Odds Ratio: 2.8248    2.1900   3.6678
    Probability difference: 0.1929    0.1508   0.2325

             Exact P-value: 0.0000 
        Asymptotic P-value: 0.0000 
------------------------------------------------------

I’ll leave the interpretation of the results from this last example as an exercise.

13.16 For More Information