# How to output crude and adjusted models

## Background

It is a common practice in epidemiological papers to present regression estimates in both adjusted and unadjusted format. The unadjusted format is just that particular variable without any of the covariates and is often referred to as the crude estimate. The advantage with the two is that you provide the reader with a better understanding of how much the variables affect each other and also a sense of how much confounding is present in the model.

## The printCrudeAndAdjusted function

The printCrudeAndAdjusted was designed for this purpose. It provides a table with the coef and confint estimates for the full model, then reruns for each variable through an update call on the model providing outcome ~ variable as input. Below is a basic example:

library(datasets)
data(mtcars)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
fit <- lm(mpg ~ cyl + disp + hp + am, data = mtcars)
library(Greg)
printCrudeAndAdjustedModel(fit)
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.09 17.92 to 22.26   30.48 24.60 to 36.36
cyl -2.88 -3.53 to -2.22   -0.83 -2.39 to 0.72
disp -0.04 -0.05 to -0.03   -0.01 -0.03 to 0.01
hp -0.07 -0.09 to -0.05   -0.03 -0.06 to 0.00
Manual 7.24 3.64 to 10.85   3.45 0.46 to 6.43

As the variable names often aren’t that pretty you can use the rowname.fn in order to get the row names desired. Here’s a simple substitution example together with some limits to the digits, and a reference group:

printCrudeAndAdjustedModel(fit,
digits = 1,
rowname.fn = function(n){
if (n == "disp")
return("Displacement (cu.in.)")
if (n == "hp")
return("Gross horsepower")
if (n == "cyl")
return("No. cylinders")
if (n == "am")
return("Transmission")
return(n)
})
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in.) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
Automatic 0.0 ref   0.0 ref
Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4

You can achieve the same effect with the label function in the Hmisc package:

library(Hmisc)
label(mtcars$disp) <- "Displacement (cu.in)" label(mtcars$cyl) <- "No. cylinders"
label(mtcars$hp) <- "Gross horsepower" label(mtcars$am) <- "Transmission"

digits = 1,
add_references = TRUE)
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
Automatic 0.0 ref   0.0 ref
Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4

If we want to style the table we can use htmlTable::addHtmlTableStyle

library(htmlTable)

digits = 1,
# We can also style the output as shown here
addHtmlTableStyle(css.rgroup = "")
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
Automatic 0.0 ref   0.0 ref
Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4

The style can be added to all the tables by setting a theme option and activating the style for all the table elements.

setHtmlTableTheme(css.rgroup = "")

## Binding columns/rows

The returned variable from printCrudeAndAdjustedModel works just like any matrix, i.e. you can bind it both on rows and on columns:

fit_mpg <- lm(mpg ~ cyl + disp + hp + am, data = mtcars)
fit_weight <- lm(wt ~ cyl + disp + hp + am, data = mtcars)
rbind("Miles per gallon" = p_mpg,
"Weight (1000 lbs)" = p_weight)
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
Miles per gallon
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
Automatic 0.0 ref   0.0 ref
Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4
Weight (1000 lbs)
(Intercept) 3.2 2.9 to 3.6   2.3 1.5 to 3.2
No. cylinders 0.4 0.3 to 0.6   -0.1 -0.3 to 0.2
Displacement (cu.in) 0.0 0.0 to 0.0   0.0 0.0 to 0.0
Gross horsepower 0.0 0.0 to 0.0   0.0 0.0 to 0.0
Transmission
Automatic 0.0 ref   0.0 ref
Manual -1.4 -1.9 to -0.8   -0.6 -1.0 to -0.1
cbind("Miles per gallon" = p_mpg,
"Weight (1000 lbs)" = p_weight)
Variable Crude 2.5 % to 97.5 % Adjusted 2.5 % to 97.5 % Crude 2.5 % to 97.5 % Adjusted 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3 30.5 24.6 to 36.4 3.2 2.9 to 3.6 2.3 1.5 to 3.2
No. cylinders -2.9 -3.5 to -2.2 -0.8 -2.4 to 0.7 0.4 0.3 to 0.6 -0.1 -0.3 to 0.2
Displacement (cu.in) 0.0 -0.1 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0 0.0 -0.1 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0
Transmission
Automatic 0.0 ref 0.0 ref 0.0 ref 0.0 ref
Manual 7.2 3.6 to 10.8 3.4 0.5 to 6.4 -1.4 -1.9 to -0.8 -0.6 -1.0 to -0.1

## Selecting column/rows using [

It is also possible to select individual rows/columns if so desired using the [ operator:

p_mpg[,1:2]
Variable Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3
No. cylinders -2.9 -3.5 to -2.2
Displacement (cu.in) 0.0 -0.1 to 0.0
Gross horsepower -0.1 -0.1 to 0.0
Transmission
Automatic 0.0 ref
Manual 7.2 3.6 to 10.8
p_mpg[1:2,]
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7

The function is prepared for use with splines and strata but is currently not tested for mixed models. As the spline estimates are best interpreted in a graph they are omitted from the output.

library("survival")

set.seed(10)
n <- 500
ds <- data.frame(
ftime = rexp(n),
fstatus = sample(0:1, size = n, replace = TRUE),
y = rnorm(n = n),
x1 = factor(sample(LETTERS[1:4], size = n, replace = TRUE)),
x2 = rnorm(n, mean = 3, 2),
x3 = rnorm(n, mean = 3, 2),
x4 = factor(sample(letters[1:3], size = n, replace = TRUE)),
stringsAsFactors = FALSE)

library(survival)
library(splines)
fit <- coxph(Surv(ds$ftime, ds$fstatus == 1) ~ x1 + ns(x2, 4) + x3 + strata(x4), data = ds)

printCrudeAndAdjustedModel(fit, add_references = TRUE)
# Note that the crude is with the strata
a["x3", "Crude"] == exp(coef(coxph(Surv(ds$ftime, ds$fstatus == 1) ~ x3 + strata(x4), data = ds)))
##   x3
## TRUE