Chapter 19 Modeling an Ordinal Categorical Outcome in Ohio SMART
19.1 Preliminaries
library(gmodels)
library(MASS)
library(nnet)
library(rms)
library(tidyverse)
smart_oh <- read.csv("data/smart_ohio.csv") %>% tbl_df
19.2 A subset of the Ohio SMART data
Let’s consider the following data. The outcome we’ll study now is genhealth, which has five ordered categories. I’ll include the subset of all observations in smart_oh with complete data on these 7 variables.
Variable | Description |
---|---|
SEQNO | Subject identification code |
genhealth | Five categories (1 = Excellent, 2 = Very Good, 3 = Good, 4 = Fair, 5 = Poor) on general health |
physhealth | Now thinking about your physical health, which includes physical illness and injury, for how many days during the past 30 days was your physical health not good? |
costprob | 1 indicates Yes to “Was there a time in the past 12 months when you needed to see a doctor but could not because of cost?”, and 0 otherwise. |
sleephrs | average hours of sleep the subject gets in a 24-hour period |
incomegroup | 8 income groups, from < $10,000 to $75,000 or more |
bmi | body-mass index |
To make my life easier later, I’m going to drop any subjects with missing data on these variables. I’m also going to drop the 11 subjects who have complete data, but whose listed bmi is above 60.
sm1 <- smart_oh %>%
select(SEQNO, genhealth, physhealth, costprob, sleephrs,
incomegroup, bmi) %>%
filter(bmi <= 60) %>%
drop_na
In total, we have 4721 subjects in the sm1 sample.
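If you’d like to verify those counts yourself, here is a quick sketch, re-running the same selection but isolating the complete-data subjects with bmi above 60:

smart_oh %>%
    select(SEQNO, genhealth, physhealth, costprob, sleephrs,
           incomegroup, bmi) %>%
    drop_na() %>%
    filter(bmi > 60) %>%
    nrow()    # should return 11, per the text above

nrow(sm1)     # should return 4721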
19.2.1 Several Ways of Storing Multi-Categorical data
We will store the information in our outcome, genhealth, in both a numeric form (gen_n) and an ordered factor with some abbreviated labels (genh), because we’ll have some use for each approach in this material.
sm1 <- sm1 %>%
mutate(genh = fct_recode(genhealth,
"1-E" = "1_Excellent",
"2_VG" = "2_VeryGood",
"3_G" = "3_Good",
"4_F" = "4_Fair",
"5_P" = "5_Poor"),
genh = factor(genh, ordered = TRUE),
gen_n = as.numeric(genhealth))
sm1 %>% count(genh, gen_n, genhealth)
# A tibble: 5 x 4
genh gen_n genhealth n
<ord> <dbl> <fct> <int>
1 1-E 1 1_Excellent 807
2 2_VG 2 2_VeryGood 1607
3 3_G 3 3_Good 1469
4 4_F 4 4_Fair 598
5 5_P 5 5_Poor 240
19.3 Building Cross-Tabulations
Is income group associated with general health?
19.3.1 Using base table functions
addmargins(table(sm1$incomegroup, sm1$genh))
1-E 2_VG 3_G 4_F 5_P Sum
0-9K 20 32 55 49 27 183
10-14K 11 36 65 67 37 216
15-19K 29 67 126 79 46 347
20-24K 40 115 138 108 45 446
25-34K 60 170 186 83 30 529
35-49K 92 227 252 72 23 666
50-74K 151 298 261 46 18 774
75K+ 404 662 386 94 14 1560
Sum 807 1607 1469 598 240 4721
More people answer Very Good and Good than choose the other categories. It might be easier to look at percentages here.
19.3.1.1 Adding percentages within each row
Here are the percentages of each genhealth response within each income group.
addmargins(
round(100*prop.table(
table(sm1$incomegroup, sm1$genh)
,1)
,1)
)
1-E 2_VG 3_G 4_F 5_P Sum
0-9K 10.9 17.5 30.1 26.8 14.8 100.1
10-14K 5.1 16.7 30.1 31.0 17.1 100.0
15-19K 8.4 19.3 36.3 22.8 13.3 100.1
20-24K 9.0 25.8 30.9 24.2 10.1 100.0
25-34K 11.3 32.1 35.2 15.7 5.7 100.0
35-49K 13.8 34.1 37.8 10.8 3.5 100.0
50-74K 19.5 38.5 33.7 5.9 2.3 99.9
75K+ 25.9 42.4 24.7 6.0 0.9 99.9
Sum 103.9 226.4 258.8 143.2 67.7 800.0
So, for example, 11.3% of the genhealth responses in subjects with incomes between 25 and 34 thousand dollars were Excellent.
19.3.1.2 Adding percentages within each column
Here are the percentages in each incomegroup within each genhealth response.
addmargins(
round(100*prop.table(
table(sm1$incomegroup, sm1$genh)
,2)
,1)
)
1-E 2_VG 3_G 4_F 5_P Sum
0-9K 2.5 2.0 3.7 8.2 11.2 27.6
10-14K 1.4 2.2 4.4 11.2 15.4 34.6
15-19K 3.6 4.2 8.6 13.2 19.2 48.8
20-24K 5.0 7.2 9.4 18.1 18.8 58.5
25-34K 7.4 10.6 12.7 13.9 12.5 57.1
35-49K 11.4 14.1 17.2 12.0 9.6 64.3
50-74K 18.7 18.5 17.8 7.7 7.5 70.2
75K+ 50.1 41.2 26.3 15.7 5.8 139.1
Sum 100.1 100.0 100.1 100.0 100.0 500.2
From this table, we see that 7.4% of the Excellent genhealth responses were given by people with incomes between 25 and 34 thousand dollars.
19.3.2 Using xtabs
The xtabs function provides a formula method for obtaining cross-tabulations.
xtabs(~ incomegroup + genh, data = sm1)
genh
incomegroup 1-E 2_VG 3_G 4_F 5_P
0-9K 20 32 55 49 27
10-14K 11 36 65 67 37
15-19K 29 67 126 79 46
20-24K 40 115 138 108 45
25-34K 60 170 186 83 30
35-49K 92 227 252 72 23
50-74K 151 298 261 46 18
75K+ 404 662 386 94 14
19.3.3 Storing a table in a tibble
We can store the elements of a cross-tabulation in a tibble, like this:
(sm1.tableA <- sm1 %>% count(incomegroup, genh))
# A tibble: 40 x 3
incomegroup genh n
<fct> <ord> <int>
1 0-9K 1-E 20
2 0-9K 2_VG 32
3 0-9K 3_G 55
4 0-9K 4_F 49
5 0-9K 5_P 27
6 10-14K 1-E 11
7 10-14K 2_VG 36
8 10-14K 3_G 65
9 10-14K 4_F 67
10 10-14K 5_P 37
# ... with 30 more rows
From such a tibble, we can visualize the data in many ways, but we can also return to xtabs and include the frequencies (n) in that setup.
xtabs(n ~ incomegroup + genh, data = sm1.tableA)
genh
incomegroup 1-E 2_VG 3_G 4_F 5_P
0-9K 20 32 55 49 27
10-14K 11 36 65 67 37
15-19K 29 67 126 79 46
20-24K 40 115 138 108 45
25-34K 60 170 186 83 30
35-49K 92 227 252 72 23
50-74K 151 298 261 46 18
75K+ 404 662 386 94 14
And, we can get the \(\chi^2\) test of independence, with:
summary(xtabs(n ~ incomegroup + genh, data = sm1.tableA))
Call: xtabs(formula = n ~ incomegroup + genh, data = sm1.tableA)
Number of cases in table: 4721
Number of factors: 2
Test for independence of all factors:
Chisq = 763.5, df = 28, p-value = 9.671e-143
19.3.4 Using CrossTable from the gmodels package
The CrossTable function from the gmodels package produces a cross-tabulation with various counts and proportions, like those people often generate with SPSS and SAS.
CrossTable(sm1$incomegroup, sm1$genh, chisq = T)
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 4721
| sm1$genh
sm1$incomegroup | 1-E | 2_VG | 3_G | 4_F | 5_P | Row Total |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
0-9K | 20 | 32 | 55 | 49 | 27 | 183 |
| 4.069 | 14.731 | 0.066 | 28.760 | 33.664 | |
| 0.109 | 0.175 | 0.301 | 0.268 | 0.148 | 0.039 |
| 0.025 | 0.020 | 0.037 | 0.082 | 0.112 | |
| 0.004 | 0.007 | 0.012 | 0.010 | 0.006 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
10-14K | 11 | 36 | 65 | 67 | 37 | 216 |
| 18.200 | 19.152 | 0.073 | 57.430 | 61.654 | |
| 0.051 | 0.167 | 0.301 | 0.310 | 0.171 | 0.046 |
| 0.014 | 0.022 | 0.044 | 0.112 | 0.154 | |
| 0.002 | 0.008 | 0.014 | 0.014 | 0.008 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
15-19K | 29 | 67 | 126 | 79 | 46 | 347 |
| 15.494 | 22.121 | 3.010 | 27.944 | 45.593 | |
| 0.084 | 0.193 | 0.363 | 0.228 | 0.133 | 0.074 |
| 0.036 | 0.042 | 0.086 | 0.132 | 0.192 | |
| 0.006 | 0.014 | 0.027 | 0.017 | 0.010 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
20-24K | 40 | 115 | 138 | 108 | 45 | 446 |
| 17.225 | 8.928 | 0.004 | 46.959 | 21.986 | |
| 0.090 | 0.258 | 0.309 | 0.242 | 0.101 | 0.094 |
| 0.050 | 0.072 | 0.094 | 0.181 | 0.188 | |
| 0.008 | 0.024 | 0.029 | 0.023 | 0.010 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
25-34K | 60 | 170 | 186 | 83 | 30 | 529 |
| 10.238 | 0.563 | 2.781 | 3.817 | 0.359 | |
| 0.113 | 0.321 | 0.352 | 0.157 | 0.057 | 0.112 |
| 0.074 | 0.106 | 0.127 | 0.139 | 0.125 | |
| 0.013 | 0.036 | 0.039 | 0.018 | 0.006 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
35-49K | 92 | 227 | 252 | 72 | 23 | 666 |
| 4.192 | 0.000 | 9.670 | 1.811 | 3.482 | |
| 0.138 | 0.341 | 0.378 | 0.108 | 0.035 | 0.141 |
| 0.114 | 0.141 | 0.172 | 0.120 | 0.096 | |
| 0.019 | 0.048 | 0.053 | 0.015 | 0.005 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
50-74K | 151 | 298 | 261 | 46 | 18 | 774 |
| 2.641 | 4.527 | 1.688 | 27.624 | 11.582 | |
| 0.195 | 0.385 | 0.337 | 0.059 | 0.023 | 0.164 |
| 0.187 | 0.185 | 0.178 | 0.077 | 0.075 | |
| 0.032 | 0.063 | 0.055 | 0.010 | 0.004 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
75K+ | 404 | 662 | 386 | 94 | 14 | 1560 |
| 70.730 | 32.310 | 20.360 | 54.318 | 53.777 | |
| 0.259 | 0.424 | 0.247 | 0.060 | 0.009 | 0.330 |
| 0.501 | 0.412 | 0.263 | 0.157 | 0.058 | |
| 0.086 | 0.140 | 0.082 | 0.020 | 0.003 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
Column Total | 807 | 1607 | 1469 | 598 | 240 | 4721 |
| 0.171 | 0.340 | 0.311 | 0.127 | 0.051 | |
----------------|-----------|-----------|-----------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 763.5307 d.f. = 28 p = 9.671345e-143
19.4 Graphing Categorical Data
19.4.1 A Bar Chart for a Single Variable
ggplot(sm1, aes(x = genhealth, fill = genhealth)) +
geom_bar() +
scale_fill_brewer(palette = "Set1") +
guides(fill = FALSE)
or, you might prefer to plot percentages, perhaps like this:
ggplot(sm1, aes(x = genhealth, fill = genhealth)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
geom_text(aes(y = (..count..)/sum(..count..),
label = scales::percent((..count..) /
sum(..count..))),
stat = "count", vjust = 1,
color = "white", size = 5) +
scale_y_continuous(labels = scales::percent) +
scale_fill_brewer(palette = "Dark2") +
guides(fill = FALSE) +
labs(y = "Percentage")
Use bar charts, rather than pie charts.
19.4.2 A Counts Chart for a 2-Way Cross-Tabulation
ggplot(sm1, aes(x = genhealth, y = incomegroup)) +
geom_count()
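If the counts chart feels crowded, a shaded tile plot of the same cross-tabulation is one alternative. Here is a minimal sketch, reusing the sm1.tableA counts we stored earlier; the particular fill scale is just a suggestion.

ggplot(sm1.tableA, aes(x = genh, y = incomegroup, fill = n)) +
    geom_tile() +
    scale_fill_viridis_c() +
    labs(x = "genh", y = "incomegroup", fill = "Count")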
19.5 Building a Model for genh using sleephrs
To begin, we’ll predict each subject’s genh response using just one predictor, sleephrs.
19.5.1 A little EDA
Let’s start with a quick table of summary statistics.
sm1 %>% group_by(genh) %>%
summarize(n(), mean(sleephrs), sd(sleephrs), median(sleephrs))
# A tibble: 5 x 5
genh `n()` `mean(sleephrs)` `sd(sleephrs)` `median(sleephrs)`
<ord> <int> <dbl> <dbl> <dbl>
1 1-E 807 7.05 1.29 7
2 2_VG 1607 7.05 1.18 7
3 3_G 1469 6.90 1.44 7
4 4_F 598 6.64 1.74 7
5 5_P 240 6.62 2.53 6
To actually see what’s going on, we might build a comparison boxplot, or violin plot. The plot below shows both together, with the violin plot helping to indicate the discrete nature of the sleephrs data, and the boxplot indicating quartiles and outlying values within each genhealth category.
ggplot(sm1, aes(x = genhealth, y = sleephrs)) +
geom_violin(aes(fill = genhealth), trim = TRUE) +
geom_boxplot(width = 0.2) +
guides(fill = FALSE, color = FALSE) +
theme_bw()
19.5.2 Describing the Proportional-Odds Cumulative Logit Model
To fit the ordinal logistic regression model (specifically, a proportional-odds cumulative-logit model) in this situation, we’ll use the polr
function in the MASS
library.
- Our outcome is genh, which has five ordered levels, with 1-E best and 5_P worst.
- Our model will include one quantitative predictor, sleephrs.
The model will have four logit equations:
- one estimating the log odds that genh will be less than or equal to 1 (i.e. genhealth = 1_Excellent),
- one estimating the log odds that genh \(\leq\) 2 (i.e. genhealth = 1_Excellent or 2_VeryGood),
- another estimating the log odds that genh \(\leq\) 3 (i.e. genhealth = 1_Excellent, 2_VeryGood or 3_Good), and, finally,
- one estimating the log odds that genh \(\leq\) 4 (i.e. genhealth = 1_Excellent, 2_VeryGood, 3_Good or 4_Fair).
That’s all we need to estimate the five categories, since Pr(genh \(\leq\) 5) = 1, because 5_Poor is the maximum category for genhealth.
We’ll have a total of five free parameters when we add in the slope for sleephrs, and I’ll label these parameters as \(\zeta_1, \zeta_2, \zeta_3, \zeta_4\) and \(\beta_1\). The \(\zeta\)s are read as “zeta” values, and the people who built the polr function use that term.
The four logistic equations that will be fit differ only by their intercepts. They are:
\[ logit[Pr(genh \leq 1)] = log \frac{Pr(genh \leq 1)}{Pr(genh > 1)} = \zeta_1 - \beta_1 sleephrs \]
which describes the log odds of a genh value of 1 (Excellent) as compared to a genh value greater than 1 (which includes Very Good, Good, Fair and Poor).
The second logit model is:
\[ logit[Pr(genh \leq 2)] = log \frac{Pr(genh \leq 2)}{Pr(genh > 2)} = \zeta_2 - \beta_1 sleephrs \]
which describes the log odds of a genh value of 1 (Excellent) or 2 (Very Good) as compared to a genh value greater than 2 (which includes Good, Fair and Poor).
Next we have:
\[ logit[Pr(genh \leq 3)] = log \frac{Pr(genh \leq 3)}{Pr(genh > 3)} = \zeta_3 - \beta_1 sleephrs \]
which describes the log odds of a genh value of 1 (Excellent) or 2 (Very Good) or 3 (Good) as compared to a genh value greater than 3 (which includes Fair and Poor).
Finally, we have
\[ logit[Pr(genh \leq 4)] = log \frac{Pr(genh \leq 4)}{Pr(genh > 4)} = \zeta_4 - \beta_1 sleephrs \]
which describes the log odds of a genh value of 4 or less (Excellent, Very Good, Good or Fair) as compared to a genh value greater than 4 (Poor).
Again, the intercept term is the only piece that varies across the four equations.
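That is, the model can be written compactly as

\[ logit[Pr(genh \leq j)] = \zeta_j - \beta_1 sleephrs, \quad j = 1, 2, 3, 4 \]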
In this case, a positive coefficient \(\beta_1\) for sleephrs would mean that increasing the value of sleephrs increases the genh category (describing a worse level of general health, since higher values of genh are associated with worse health).
19.5.3 Fitting a Proportional Odds Logistic Regression with polr
Our model m1 will use proportional odds logistic regression (sometimes called an ordered logit model) to predict genh on the basis of sleephrs. The polr function can help us do this. Note that we include Hess = TRUE to retain what is called the Hessian matrix, which lets R calculate standard errors more effectively in summary and other follow-up descriptions of the model.
m1 <- polr(genh ~ sleephrs,
data = sm1, Hess = TRUE)
summary(m1)
Call:
polr(formula = genh ~ sleephrs, data = sm1, Hess = TRUE)
Coefficients:
Value Std. Error t value
sleephrs -0.1335 0.01957 -6.821
Intercepts:
Value Std. Error t value
1-E|2_VG -2.5156 0.1430 -17.5887
2_VG|3_G -0.8823 0.1389 -6.3534
3_G|4_F 0.6186 0.1385 4.4666
4_F|5_P 2.0202 0.1473 13.7183
Residual Deviance: 13598.48
AIC: 13608.48
confint(m1)
Waiting for profiling to be done...
2.5 % 97.5 %
-0.17193705 -0.09522207
19.6 Interpreting Model m1
19.6.1 Looking at Predictions
Consider two individuals:
- Harry, who sleeps an average of 6 hours per night, so Harry’s sleephrs = 6, and
- Sally, who sleeps an average of 5 hours per night, so Sally’s sleephrs = 5.
We’re going to start by using our model m1 to predict the genh for Harry and Sally, so we can see the effect (on the predicted genh probabilities) of a change of one unit in sleephrs.
For example, what are the log odds that Harry, who sleeps 6 hours, will describe his genh as Excellent (genh \(\leq\) 1)?
\[ logit[Pr(genh \leq 1)] = \zeta_1 - \beta_1 sleephrs \\ logit[Pr(genh \leq 1)] = -2.5156 - (-0.1335) sleephrs \\ logit[Pr(genh \leq 1)] = -2.5156 - (-0.1335) (6) = -1.7146 \]
That’s not much help. So we’ll convert it to a probability by taking the inverse logit. The formula is
\[ Pr(genh \leq 1) = \frac{exp(\zeta_1 - \beta_1 sleephrs)}{1 + exp(\zeta_1 - \beta_1 sleephrs)} = \frac{exp(-1.7146)}{1 + exp(-1.7146)} = \frac{0.180}{1.180} = 0.15 \]
So the model estimates a 15% probability that Harry will describe his genh as Excellent.
OK. Now, what are the log odds that Harry, who sleeps 6 hours, will describe his genh as either Excellent or Very Good (genh \(\leq\) 2)?
\[ logit[Pr(genh \leq 2)] = \zeta_2 - \beta_1 sleephrs \\ logit[Pr(genh \leq 2)] = -0.8823 - (-0.1335) sleephrs \\ logit[Pr(genh \leq 2)] = -0.8823 - (-0.1335) (6) = -0.0813 \]
Again, we’ll convert this to a probability by taking the inverse logit.
\[ Pr(genh \leq 2) = \frac{exp(\zeta_2 - \beta_1 sleephrs)}{1 + exp(\zeta_2 - \beta_1 sleephrs)} = \frac{exp(-0.0813)}{1 + exp(-0.0813)} = \frac{0.922}{1.922} = 0.48 \]
So, the model estimates a probability of .48 that Harry will describe his genh as either Excellent or Very Good, so by subtraction, that’s a probability of .48 - .15 = .33 that Harry describes his genh as Very Good.
Happily, that’s the last time we’ll calculate this by hand.
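In fact, R’s plogis function computes this inverse logit directly, so we can check the hand calculations above using the m1$zeta and m1$coefficients estimates from the fitted model (both are displayed in Section 19.6.5):

plogis(m1$zeta[1] - m1$coefficients * 6)   # Pr(genh <= 1) for Harry: about 0.15
plogis(m1$zeta[2] - m1$coefficients * 6)   # Pr(genh <= 2) for Harry: about 0.48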
19.6.2 Making Predictions for Harry (and Sally) with predict
Suppose Harry sleeps for 6 hours on average, and Sally for 5.
temp.dat <- data.frame(name = c("Harry", "Sally"),
sleephrs = c(6,5))
predict(m1, temp.dat, type = "p")
1-E 2_VG 3_G 4_F 5_P
1 0.1525552 0.3271236 0.3255890 0.1385406 0.05619160
2 0.1360859 0.3104153 0.3369786 0.1528152 0.06370498
The predicted probabilities of falling into each category of genh, shown as percentages, are:

Subject | sleephrs | Pr(1_E) | Pr(2_VG) | Pr(3_G) | Pr(4_F) | Pr(5_P) |
---|---|---|---|---|---|---|
Harry | 6 | 15.3 | 32.7 | 32.6 | 13.9 | 5.6 |
Sally | 5 | 13.6 | 31.0 | 33.7 | 15.3 | 6.4 |
- Harry has a higher predicted probability of lower (healthier) values of genh. Specifically, Harry has a higher predicted probability than Sally of falling into the Excellent and Very Good categories, and a lower probability than Sally of falling into the Good, Fair and Poor categories.
- This means that Harry, with a higher sleephrs, is predicted to have, on average, a lower (that is to say, healthier) value of genh.
- As we’ll see, this association will be indicated by a negative coefficient of sleephrs in the proportional odds logistic regression model.
19.6.3 Predicting the actual classification of genh
The default prediction approach actually returns the predicted genh classification for Harry and Sally, which is just the classification with the largest predicted probability. Here, for Harry that is Very Good, and for Sally, that’s Good.
predict(m1, temp.dat)
[1] 2_VG 3_G
Levels: 1-E 2_VG 3_G 4_F 5_P
19.6.4 A Cross-Tabulation of Predictions?
addmargins(table(predict(m1), sm1$genh))
1-E 2_VG 3_G 4_F 5_P Sum
1-E 2 0 3 1 2 8
2_VG 750 1494 1279 471 155 4149
3_G 55 113 187 126 83 564
4_F 0 0 0 0 0 0
5_P 0 0 0 0 0 0
Sum 807 1607 1469 598 240 4721
The m1 model classifies all subjects in the sm1 sample as either Excellent, Very Good or Good, and most subjects as Very Good.
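Summing the diagonal of that table, 2 + 1494 + 187 = 1683 of the 4721 subjects (about 36%) are classified into their observed category. A one-line check of that rate:

mean(predict(m1) == sm1$genh)   # should be about 0.36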
19.6.5 The Fitted Model Equations
summary(m1)
Call:
polr(formula = genh ~ sleephrs, data = sm1, Hess = TRUE)
Coefficients:
Value Std. Error t value
sleephrs -0.1335 0.01957 -6.821
Intercepts:
Value Std. Error t value
1-E|2_VG -2.5156 0.1430 -17.5887
2_VG|3_G -0.8823 0.1389 -6.3534
3_G|4_F 0.6186 0.1385 4.4666
4_F|5_P 2.0202 0.1473 13.7183
Residual Deviance: 13598.48
AIC: 13608.48
The first part of the output provides coefficient estimates for the sleephrs predictor, and these are followed by the estimates for the various model intercepts. Plugging in the estimates, we have:
\[ logit[Pr(genh \leq 1)] = -2.5156 - (-0.1335) sleephrs \\ logit[Pr(genh \leq 2)] = -0.8823 - (-0.1335) sleephrs \\ logit[Pr(genh \leq 3)] = 0.6186 - (-0.1335) sleephrs \\ logit[Pr(genh \leq 4)] = 2.0202 - (-0.1335) sleephrs \]
Note that we can obtain these pieces separately as follows:
m1$zeta
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
-2.5156272 -0.8822576 0.6186215 2.0202276
shows the boundary intercepts, and
m1$coefficients
sleephrs
-0.133488
shows the regression coefficient for sleephrs.
19.6.6 Interpreting the sleephrs coefficient
The first part of the output provides coefficient estimates for the sleephrs predictor.
- The estimated slope for sleephrs is -0.133.
  - Remember Harry and Sally, where Harry gets one more hour of sleep than Sally does. We noted that Harry is predicted by the model to have a smaller (i.e. healthier) genh response than Sally.
  - So a negative coefficient here means that higher values of sleephrs are associated with more of the probability distribution falling in lower values of genh.
  - We usually don’t interpret this slope (on the log odds scale) directly, but rather exponentiate it.
19.6.7 Exponentiating the Slope Coefficient to facilitate Interpretation
We can compute the odds ratio associated with sleephrs and its confidence interval as follows…
exp(coef(m1))
sleephrs
0.8750379
exp(confint(m1))
Waiting for profiling to be done...
2.5 % 97.5 %
0.8420322 0.9091710
- So, if Harry gets one more hour of sleep than Sally, our model predicts that Harry will have 88% of the odds of Sally of having a larger genh score. That means that Harry is likelier to have a smaller genh score.
  - Since genh gets larger as a person’s general health gets worse (moves from Excellent towards Poor), this means that since Harry is predicted to have smaller odds of a larger genh score, he is also predicted to have smaller odds of worse general health.
  - Our 95% confidence interval around that estimated odds ratio of 0.875 is (0.842, 0.909). Since that interval is entirely below 1, the odds of having the larger (worse) genh for Harry are statistically significantly lower than the odds for Sally.
  - So, an increase in sleephrs is associated with smaller (better) genh scores.
19.6.8 Comparison to a Null Model
We can fit a model with intercepts only to test the significance of sleephrs in our model m1, using the anova function.
m0 <- polr(genh ~ 1, data = sm1)
anova(m1, m0)
Likelihood ratio tests of ordinal regression models
Response: genh
Model Resid. df Resid. Dev Test Df LR stat. Pr(Chi)
1 1 4717 13645.70
2 sleephrs 4716 13598.48 1 vs 2 1 47.22141 6.340373e-12
We could also compare model m1 to the null model m0 with AIC or BIC.
AIC(m1, m0)
df AIC
m1 5 13608.48
m0 4 13653.70
BIC(m1,m0)
df BIC
m1 5 13640.78
m0 4 13679.54
19.7 The Assumption of Proportional Odds
Let us calculate the odds for all levels of genh if a person gets six hours of sleep. First, we’ll get the cumulative probabilities, in another way, to demonstrate how to do so…
(prob.6 <- exp(m1$zeta - 6*m1$coefficients)/(1 + exp(m1$zeta - 6*m1$coefficients)))
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
0.1525552 0.4796789 0.8052678 0.9438084
(prob.5 <- exp(m1$zeta - 5*m1$coefficients)/(1 + exp(m1$zeta - 5*m1$coefficients)))
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
0.1360859 0.4465012 0.7834798 0.9362950
Now, we’ll calculate the odds, first for a subject getting six hours of sleep:
(odds.6 = prob.6/(1-prob.6))
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
0.1800179 0.9218900 4.1352580 16.7962532
And here are the odds, for a subject getting five hours of sleep:
(odds.5 = prob.5/(1-prob.5))
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
0.1575225 0.8066887 3.6185077 14.6973588
Now, let’s take the ratio of the odds for someone who sleeps 6 hours over the odds for someone who sleeps 5.
odds.6/odds.5
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
1.142808 1.142808 1.142808 1.142808
They are all the same. The odds ratios are equal, which means they are proportional. For any level of genh, the estimated odds that a person who sleeps 6 hours has better (lower) genh are about 1.14 times the odds for someone who sleeps 5 hours. Those who sleep more have higher odds of better (lower) genh. (An odds ratio below 1 would mean lower odds; above 1 means greater odds.)
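A little algebra shows why these ratios must match. For any cutpoint j, the difference in cumulative log odds between a six-hour and a five-hour sleeper is

\[ logit[Pr(genh \leq j | 6)] - logit[Pr(genh \leq j | 5)] = (\zeta_j - 6 \beta_1) - (\zeta_j - 5 \beta_1) = -\beta_1 \]

so the \(\zeta_j\) cancels, and the log odds ratio is the same at every cutpoint. That is the proportional odds assumption at work.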
Now, let’s take the log of the odds ratios:
log(odds.6/odds.5)
1-E|2_VG 2_VG|3_G 3_G|4_F 4_F|5_P
0.133488 0.133488 0.133488 0.133488
That should be familiar. It is the slope coefficient in the model summary, without the minus sign. polr parameterizes the model as \(\zeta_j - \beta_1 x\), so that a positive coefficient corresponds to the outcome falling at the higher end of the ordinal scale as the predictor grows. If we exponentiate the slope estimated by R (-0.133), we get 0.88. If we have two people, and A sleeps one more hour on average than B, then the estimated odds of A having a higher genh (i.e. worse general health) are about 88% of the odds for B.
19.7.1 Testing the Proportional Odds Assumption
One way to test the proportional odds assumption is to compare the fit of the proportional odds logistic regression to a model that does not make that assumption. A natural candidate is a multinomial logit model, which is typically used to model unordered multi-categorical outcomes, and which here fits a separate slope for each non-reference level of the genh outcome, as opposed to the proportional odds logit, which fits only one slope across all levels.
Since the proportional odds logistic regression model is nested in the multinomial logit, we can perform a likelihood ratio test. To do this, we first fit the multinomial logit model, with the multinom function from the nnet package.
(m1_multi <- multinom(genh ~ sleephrs, data = sm1))
# weights: 15 (8 variable)
initial value 7598.156385
iter 10 value 6803.195314
final value 6797.178003
converged
Call:
multinom(formula = genh ~ sleephrs, data = sm1)
Coefficients:
(Intercept) sleephrs
2_VG 0.6981800 -0.001330807
3_G 1.0941754 -0.070982695
4_F 1.0640161 -0.199214133
5_P 0.2115135 -0.208324914
Residual Deviance: 13594.36
AIC: 13610.36
The multinomial logit fits four intercepts and four slopes, for a total of 8 estimated parameters. The proportional odds logit, as we’ve seen, fits four intercepts and one slope, for a total of 5. The difference is 3, and we use that number in the sequence below to build our test of the proportional odds assumption.
LL_1 <- logLik(m1)
LL_1m <- logLik(m1_multi)
(G <- -2 * (LL_1[1] - LL_1m[1]))
[1] 4.126867
pchisq(G, 3, lower.tail = FALSE)
[1] 0.2480866
The p value is fairly large, so it indicates that the proportional odds model fits about as well as the more complex multinomial logit. A non-significant p value here isn’t always the best way to assess the proportional odds assumption, but it does provide some evidence of model adequacy.
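Another informal check is to fit a separate binary logistic regression at each cutpoint of genh and ask whether the sleephrs slopes look similar across cutpoints, as the proportional odds assumption implies they should. A minimal sketch, using the numeric gen_n version of the outcome:

# slope of sleephrs in the binary model for Pr(genh > j), at each cutpoint j
sapply(1:4, function(j) {
    coef(glm((gen_n > j) ~ sleephrs, data = sm1,
             family = binomial))["sleephrs"]
})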
19.8 Can model m1 be fit using rms tools?
Yes.
d <- datadist(sm1)
options(datadist = "d")
m1_lrm <- lrm(genh ~ sleephrs, data = sm1, x = T, y = T)
m1_lrm
Logistic Regression Model
lrm(formula = genh ~ sleephrs, data = sm1, x = T, y = T)
Frequencies of Responses
1-E 2_VG 3_G 4_F 5_P
807 1607 1469 598 240
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 4721 LR chi2 47.22 R2 0.011 C 0.546
max |deriv| 4e-13 d.f. 1 g 0.198 Dxy 0.092
Pr(> chi2) <0.0001 gr 1.219 gamma 0.119
gp 0.049 tau-a 0.068
Brier 0.248
Coef S.E. Wald Z Pr(>|Z|)
y>=2_VG 2.5156 0.1430 17.59 <0.0001
y>=3_G 0.8823 0.1389 6.35 <0.0001
y>=4_F -0.6186 0.1385 -4.47 <0.0001
y>=5_P -2.0202 0.1473 -13.72 <0.0001
sleephrs -0.1335 0.0196 -6.82 <0.0001
The model is highly significant (remember the large sample size) but very weak, with a Nagelkerke R2 of 0.01, and a C statistic of 0.55.
summary(m1_lrm)
Effects Response : genh
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
sleephrs 6 8 2 -0.26698 0.039138 -0.34368 -0.19027
Odds Ratio 6 8 2 0.76569 NA 0.70915 0.82674
A two-hour change in sleephrs is associated with an odds ratio of 0.77, with 95% confidence interval (0.71, 0.83). Since these values are all below 1, we have a clear indication of a significant effect of sleephrs, with higher sleephrs associated with lower genh, which means, in this case, better health.
There is also a tool in rms called orm which may be used to fit a wide array of ordinal regression models. I suggest you read Frank Harrell’s book on Regression Modeling Strategies if you want to learn more.
19.9 Building a Three-Predictor Model
Now, we’ll model genh using sleephrs, bmi and costprob.
19.9.1 Scatterplot Matrix
GGally::ggpairs(sm1 %>%
select(bmi, sleephrs, costprob, genh))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
We might choose to plot the costprob data as a binary factor, rather than the raw 0-1 numbers included above, but not at this time.
19.9.2 Our Three-Predictor Model, m2
m2 <- polr(genh ~ sleephrs + bmi + costprob, data = sm1)
summary(m2)
Re-fitting to get Hessian
Call:
polr(formula = genh ~ sleephrs + bmi + costprob, data = sm1)
Coefficients:
Value Std. Error t value
sleephrs -0.10480 0.01967 -5.329
bmi 0.08015 0.00422 18.996
costprob 0.77515 0.09532 8.132
Intercepts:
Value Std. Error t value
1-E|2_VG -0.0576 0.1872 -0.3079
2_VG|3_G 1.6671 0.1880 8.8684
3_G|4_F 3.2778 0.1929 16.9939
4_F|5_P 4.7365 0.2021 23.4374
Residual Deviance: 13165.90
AIC: 13179.90
This model contains four intercepts (to cover the five genh categories) and three slopes (one each for sleephrs, bmi and costprob).
19.9.3 Does the three-predictor model outperform m1?
anova(m1, m2)
Likelihood ratio tests of ordinal regression models
Response: genh
Model Resid. df Resid. Dev Test Df LR stat.
1 sleephrs 4716 13598.48
2 sleephrs + bmi + costprob 4714 13165.90 1 vs 2 2 432.5806
Pr(Chi)
1
2 0
There is a statistically significant improvement in fit from model 1 to model 2. The AIC and BIC are also better for the three-predictor model than they were for the model with sleephrs alone.
AIC(m1, m2)
df AIC
m1 5 13608.48
m2 7 13179.90
BIC(m1, m2)
df BIC
m1 5 13640.78
m2 7 13225.12
19.9.4 Wald tests for individual predictors
To obtain the appropriate Wald tests, we can use lrm to fit the model instead.
d <- datadist(sm1)
options(datadist = "d")
m2_lrm <- lrm(genh ~ sleephrs + bmi + costprob,
data = sm1, x = T, y = T)
m2_lrm
Logistic Regression Model
lrm(formula = genh ~ sleephrs + bmi + costprob, data = sm1, x = T,
y = T)
Frequencies of Responses
1-E 2_VG 3_G 4_F 5_P
807 1607 1469 598 240
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 4721 LR chi2 479.80 R2 0.102 C 0.634
max |deriv| 1e-08 d.f. 3 g 0.644 Dxy 0.267
Pr(> chi2) <0.0001 gr 1.903 gamma 0.267
gp 0.149 tau-a 0.198
Brier 0.231
Coef S.E. Wald Z Pr(>|Z|)
y>=2_VG 0.0577 0.1872 0.31 0.7580
y>=3_G -1.6672 0.1880 -8.87 <0.0001
y>=4_F -3.2779 0.1929 -16.99 <0.0001
y>=5_P -4.7365 0.2021 -23.44 <0.0001
sleephrs -0.1048 0.0197 -5.33 <0.0001
bmi 0.0802 0.0042 19.00 <0.0001
costprob 0.7751 0.0953 8.13 <0.0001
It appears that each of the added predictors (bmi and costprob) adds statistically detectable value to the model.
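We can also ask rms for a predictor-by-predictor breakdown of those Wald tests with:

anova(m2_lrm)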
19.9.5 A Cross-Tabulation of Predictions?
addmargins(table(predict(m2), sm1$genh))
1-E 2_VG 3_G 4_F 5_P Sum
1-E 0 0 3 0 1 4
2_VG 685 1152 840 266 111 3054
3_G 119 448 610 310 115 1602
4_F 3 7 15 19 11 55
5_P 0 0 1 3 2 6
Sum 807 1607 1469 598 240 4721
At least the m2 model predicts that a few of the cases will fall in the Fair and Poor categories, but still, this isn’t impressive.
19.9.6 Interpreting the Effect Sizes
We can do this in two ways:
- By exponentiating the polr output, which shows the effect of increasing each predictor by a single unit.
  - Increasing sleephrs by 1 hour is associated with reducing the odds (by a factor of 0.90, with 95% CI 0.87, 0.94) of higher values of genh: hence increasing sleephrs is associated with increasing the odds of a response indicating better health.
  - Increasing bmi by 1 kg/m2 is associated with increasing the odds (by a factor of 1.08, with 95% CI 1.07, 1.09) of higher values of genh: hence increasing bmi is associated with reducing the odds of a response indicating better health.
  - Increasing costprob from 0 to 1 is associated with an increase (by a factor of 2.17, with 95% CI 1.80, 2.62) in the odds of a higher genh value. Since higher genh values indicate worse health, those with costprob = 1 are modeled to have generally worse health.
exp(coef(m2))
sleephrs bmi costprob
0.900506 1.083454 2.170910
exp(confint(m2))
Waiting for profiling to be done...
Re-fitting to get Hessian
2.5 % 97.5 %
sleephrs 0.866378 0.9358106
bmi 1.074553 1.0924755
costprob 1.801263 2.6174576
- Or by looking at the summary provided by lrm, which, like all such summaries produced by rms, shows the impact of moving from the 25th to the 75th percentile of each continuous predictor.
summary(m2_lrm)
Effects Response : genh
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
sleephrs 6.00 8.0 2.00 -0.20960 0.039333 -0.28669 -0.13250
Odds Ratio 6.00 8.0 2.00 0.81091 NA 0.75075 0.87590
bmi 23.99 31.5 7.51 0.60196 0.031688 0.53986 0.66407
Odds Ratio 23.99 31.5 7.51 1.82570 NA 1.71580 1.94270
costprob 0.00 1.0 1.00 0.77515 0.095318 0.58833 0.96197
Odds Ratio 0.00 1.0 1.00 2.17090 NA 1.80100 2.61680
plot(summary(m2_lrm))
19.9.7 Quality of the Model Fit
Model m2, as we can see from the m2_lrm output, is still weak, with a Nagelkerke R2 of 0.10, and a C statistic of 0.63.
19.9.8 Validating the Summary Statistics in m2_lrm
set.seed(43203); validate(m2_lrm)
index.orig training test optimism index.corrected n
Dxy 0.2672 0.2688 0.2666 0.0022 0.2650 40
R2 0.1023 0.1036 0.1017 0.0018 0.1005 40
Intercept 0.0000 0.0000 -0.0043 0.0043 -0.0043 40
Slope 1.0000 1.0000 0.9926 0.0074 0.9926 40
Emax 0.0000 0.0000 0.0024 0.0024 0.0024 40
D 0.1014 0.1028 0.1008 0.0020 0.0995 40
U -0.0004 -0.0004 -1.4830 1.4826 -1.4830 40
Q 0.1018 0.1032 1.5838 -1.4806 1.5825 40
B 0.2305 0.2304 0.2307 -0.0003 0.2308 40
g 0.6435 0.6478 0.6414 0.0064 0.6371 40
gp 0.1493 0.1501 0.1489 0.0012 0.1481 40
As in our work with binary logistic regression, we can convert the Dxy to C with C = 0.5 + (Dxy/2). Both the R2 and C statistics are pretty consistent with what we saw above.
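For example, the optimism-corrected Dxy of 0.2650 corresponds to a corrected C of 0.5 + (0.2650/2) = 0.6325, just below the apparent C of 0.634 from the m2_lrm output.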
19.9.9 Testing the Proportional Odds Assumption
Again, we’ll fit the analogous multinomial logit model, with the multinom function from the nnet package.
(m2_multi <- multinom(genh ~ sleephrs + bmi + costprob,
data = sm1))
# weights: 25 (16 variable)
initial value 7598.156385
iter 10 value 6840.882689
iter 20 value 6564.250202
final value 6561.801819
converged
Call:
multinom(formula = genh ~ sleephrs + bmi + costprob, data = sm1)
Coefficients:
(Intercept) sleephrs bmi costprob
2_VG -1.311071 0.0009017081 0.07486152 0.2213598
3_G -2.513028 -0.0601155794 0.12769461 0.6992653
4_F -3.967370 -0.1635021625 0.16732351 1.1336569
5_P -4.102140 -0.1678127362 0.14138136 1.4202490
Residual Deviance: 13123.6
AIC: 13155.6
The multinomial logit fits four intercepts and 12 slopes, for a total of 16 estimated parameters. The proportional odds logit in model m2, as we’ve seen, fits four intercepts and three slopes, for a total of 7. The difference is 9, and we use that number in the sequence below to build our test of the proportional odds assumption.
LL_2 <- logLik(m2)
LL_2m <- logLik(m2_multi)
(G <- -2 * (LL_2[1] - LL_2m[1]))
[1] 42.29865
pchisq(G, 9, lower.tail = FALSE)
[1] 2.898116e-06
The result is highly significant, suggesting that we have a problem somewhere with the proportional odds assumption. When this happens, I suggest you build the following plot of score residuals:
par(mfrow = c(2,2))
resid(m2_lrm, 'score.binary', pl=TRUE)
par(mfrow= c(1,1))
From this plot, bmi and costprob look pretty different as we move from the Very Good toward the Poor cutpoints, relative to sleephrs, which is more stable.
19.9.10 Plotting the Fitted Model
19.9.10.1 Nomogram
# the nomogram's linear predictor includes the first intercept (for y>=2_VG);
# each helper below swaps in the intercept for a later cutpoint,
# then applies the inverse logit
fun.ge3 <- function(x) plogis(x - m2_lrm$coef[1] + m2_lrm$coef[2])
fun.ge4 <- function(x) plogis(x - m2_lrm$coef[1] + m2_lrm$coef[3])
fun.ge5 <- function(x) plogis(x - m2_lrm$coef[1] + m2_lrm$coef[4])
plot(nomogram(m2_lrm, fun=list('Prob Y >= 2 (VG or worse)' = plogis,
'Prob Y >= 3 (Good or worse)' = fun.ge3,
'Prob Y >= 4 (Fair or Poor)' = fun.ge4,
'Prob Y = 5 (Poor)' = fun.ge5)))
19.9.10.2 Using Predict and showing mean prediction on 1-5 scale
ggplot(Predict(m2_lrm, fun = Mean(m2_lrm, code = TRUE)))
The nomogram and Predict results would be more interesting, of course, if we included a spline or interaction term. Let’s do that in model m3_lrm, and also add the incomegroup information.
19.10 A Larger Model, including income group
m3_lrm <- lrm(gen_n ~ rcs(sleephrs,3) + rcs(bmi, 4) +
incomegroup + catg(costprob) +
bmi %ia% costprob,
data = sm1, x = T, y = T)
m3_lrm
Logistic Regression Model
lrm(formula = gen_n ~ rcs(sleephrs, 3) + rcs(bmi, 4) + incomegroup +
catg(costprob) + bmi %ia% costprob, data = sm1, x = T, y = T)
Frequencies of Responses
1 2 3 4 5
807 1607 1469 598 240
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 4721 LR chi2 1160.29 R2 0.231 C 0.706
max |deriv| 9e-12 d.f. 14 g 1.097 Dxy 0.413
Pr(> chi2) <0.0001 gr 2.995 gamma 0.413
gp 0.236 tau-a 0.305
Brier 0.209
Coef S.E. Wald Z Pr(>|Z|)
y>=2 4.7820 0.5771 8.29 <0.0001
y>=3 2.9158 0.5756 5.07 <0.0001
y>=4 1.1212 0.5738 1.95 0.0507
y>=5 -0.4625 0.5741 -0.81 0.4205
sleephrs -0.4146 0.0420 -9.87 <0.0001
sleephrs' 0.2805 0.0324 8.67 <0.0001
bmi -0.0037 0.0223 -0.16 0.8699
bmi' 0.4889 0.1047 4.67 <0.0001
bmi'' -1.3870 0.2851 -4.86 <0.0001
incomegroup=10-14K 0.3691 0.1895 1.95 0.0515
incomegroup=15-19K -0.0258 0.1728 -0.15 0.8813
incomegroup=20-24K -0.2095 0.1667 -1.26 0.2088
incomegroup=25-34K -0.6927 0.1628 -4.25 <0.0001
incomegroup=35-49K -0.8811 0.1585 -5.56 <0.0001
incomegroup=50-74K -1.3346 0.1574 -8.48 <0.0001
incomegroup=75K+ -1.5934 0.1514 -10.53 <0.0001
costprob=1 1.0717 0.4345 2.47 0.0137
bmi * costprob -0.0236 0.0147 -1.60 0.1093
Another option here would have been to consider building incomegroup as a scored variable, with an order on its own, but I won’t force that here. Here’s the polr version…
m3 <- polr(genh ~ rcs(sleephrs,3) + rcs(bmi, 4) +
incomegroup + costprob +
bmi %ia% costprob, data = sm1)
19.10.1 Cross-Tabulation of Predicted/Observed Classifications
addmargins(table(predict(m3), sm1$genh))
1-E 2_VG 3_G 4_F 5_P Sum
1-E 0 0 0 0 0 0
2_VG 628 1097 693 138 38 2594
3_G 171 493 728 386 154 1932
4_F 6 15 43 63 38 165
5_P 2 2 5 11 10 30
Sum 807 1607 1469 598 240 4721
This model predicts more Fair results, but still far too many Very Good with no Excellent at all.
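To quantify that impression, here is a rough comparison of the raw classification accuracy of our three polr fits. (Accuracy is a crude summary for an ordinal outcome, but it makes the point.)

sapply(list(m1 = m1, m2 = m2, m3 = m3),
       function(mod) mean(predict(mod) == sm1$genh))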
19.10.2 Nomogram
fun.ge3 <- function(x) plogis(x - m3_lrm$coef[1] + m3_lrm$coef[2])
fun.ge4 <- function(x) plogis(x - m3_lrm$coef[1] + m3_lrm$coef[3])
fun.ge5 <- function(x) plogis(x - m3_lrm$coef[1] + m3_lrm$coef[4])
plot(nomogram(m3_lrm, fun=list('Prob Y >= 2 (VG or worse)' = plogis,
'Prob Y >= 3 (Good or worse)' = fun.ge3,
'Prob Y >= 4 (Fair or Poor)' = fun.ge4,
'Prob Y = 5 (Poor)' = fun.ge5)))
19.10.3 Using Predict and showing mean prediction on 1-5 scale
ggplot(Predict(m3_lrm, fun = Mean(m3_lrm, code = TRUE)))
Here, we’re plotting the mean score on the 1-5 gen_n scale.
19.10.4 Validating the Summary Statistics in m3_lrm
set.seed(43221); validate(m3_lrm)
index.orig training test optimism index.corrected n
Dxy 0.4128 0.4144 0.4104 0.0040 0.4088 40
R2 0.2307 0.2341 0.2280 0.0061 0.2246 40
Intercept 0.0000 0.0000 -0.0095 0.0095 -0.0095 40
Slope 1.0000 1.0000 0.9837 0.0163 0.9837 40
Emax 0.0000 0.0000 0.0052 0.0052 0.0052 40
D 0.2456 0.2497 0.2423 0.0075 0.2381 40
U -0.0004 -0.0004 -1.4336 1.4332 -1.4336 40
Q 0.2460 0.2502 1.6759 -1.4257 1.6717 40
B 0.2089 0.2086 0.2095 -0.0009 0.2098 40
g 1.0970 1.1054 1.0869 0.0185 1.0786 40
gp 0.2358 0.2366 0.2338 0.0028 0.2330 40
Still not very impressive, but much better than where we started. It’s not crazy to suggest that in new data, we might expect a Nagelkerke R2 of 0.22 and a C statistic of 0.5 + (0.4088/2) = 0.7044.
19.11 References for this Chapter
Some of the material here is adapted from http://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/.
I also found great guidance at http://data.library.virginia.edu/fitting-and-interpreting-a-proportional-odds-model/
Other parts are based on Chapter 10 of Jeffrey S. Simonoff (2003), Analyzing Categorical Data. Related data and R code are available at http://people.stern.nyu.edu/jsimonof/AnalCatData/Splus/.
Another good source for a simple example is https://onlinecourses.science.psu.edu/stat504/node/177.
Also helpful is https://onlinecourses.science.psu.edu/stat504/node/178 which shows a more complex example nicely.