1 Introduction

This page describes the MI_boot method implemented in the psfmi package, which combines multiple imputation with bootstrapping for the internal validation of logistic regression prediction models. Internal validation is always done of the last model that is selected by the function psfmi_lr. The method is explained below, followed by examples of its use.

2 Method MI_boot

With this method, bootstrap samples are drawn from each multiply imputed dataset, and the same cases are drawn in each imputed dataset. The pooled model is analyzed in each bootstrap training dataset and subsequently tested in the original multiply imputed data. The method can be performed in combination with backward or forward selection.

These steps are visualized in the figure below.

Figure 2.1: Schematic overview of the MI_boot method
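
The resampling step described in this section can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used inside psfmi: the toy data frame `imp_data`, its columns `id` and `x`, and the object names `boot_id` and `boot_data` are all hypothetical. Only the imputation variable name `Impnr` is borrowed from the examples on this page. The sketch draws one set of case positions with replacement and reuses it in every imputed dataset, so that the same cases appear in each bootstrap training dataset.

```r
set.seed(1)

n    <- 6   # number of cases per imputed dataset (toy value)
nimp <- 3   # number of imputed datasets (toy value)

# Hypothetical stacked imputed data: one block of n rows per imputation
imp_data <- data.frame(
  Impnr = rep(seq_len(nimp), each = n),
  id    = rep(seq_len(n), times = nimp),
  x     = rnorm(n * nimp)
)

# Draw ONE set of case positions with replacement ...
boot_id <- sample(seq_len(n), size = n, replace = TRUE)

# ... and apply it within every imputed dataset
boot_data <- do.call(
  rbind,
  lapply(split(imp_data, imp_data$Impnr), function(d) d[boot_id, ])
)

# The drawn case ids are identical across imputed datasets
ids_per_imp <- split(boot_data$id, boot_data$Impnr)
ids_per_imp
```

Drawing the positions once, outside the loop over imputed datasets, is what keeps the bootstrap samples aligned across imputations.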

3 Examples

3.1 Method MI_boot

Internal validation is done of the last model that is selected by the function psfmi_lr. In the example below, psfmi_lr is used with p.crit set at 1, and the same setting is used in the psfmi_perform function. This means that the full model is pooled first and that internal validation is subsequently done of the full model, without variable selection.

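library(psfmi)

# Pool the full model over the 5 imputed datasets (p.crit = 1: no variables are removed)
pool_lr <- psfmi_lr(data = lbpmilr,
                    formula = Chronic ~ Pain + JobDemands + rcs(Tampascale, 3) +
                      factor(Satisfaction) + Smoking,
                    p.crit = 1, direction = "FW",
                    nimp = 5, impvar = "Impnr", method = "D1")

# Internal validation of the full model with 5 bootstrap samples, without selection
set.seed(200)
res <- psfmi_perform(pool_lr, val_method = "MI_boot", nboot = 5, p.crit = 1)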
## 
## Boot 1
## 
## Boot 2
## 
## Boot 3
## 
## Boot 4
## 
## Boot 5
## 
## p.crit = 1, validation is done without variable selection
res
## $stats_val
##                   Orig  Apparent      Test  Optimism Corrected
## AUC          0.8871000 0.9131800 0.8786800 0.0345000 0.8526000
## R2           0.5605521 0.6354404 0.5332539 0.1021865 0.4583656
## Brier Scaled 0.4514569 0.5382256 0.4128859 0.1253397 0.3261172
## Slope        1.0000000 1.0000000 0.7449554 0.2550446 0.7449554
## 
## $intercept_test
##  intercept 
## -0.1683646 
## 
## $res_boot
##        ROC_app ROC_test    R2_app   R2_test Brier_sc_app Brier_sc_test
## Boot 1  0.8708   0.8767 0.5261128 0.5275393    0.4123378     0.4042778
## Boot 2  0.9099   0.8813 0.6249260 0.5439424    0.5238909     0.4342868
## Boot 3  0.9447   0.8793 0.7183270 0.5353093    0.6457725     0.4158834
## Boot 4  0.9225   0.8714 0.6620081 0.5166176    0.5715999     0.4284101
## Boot 5  0.9180   0.8847 0.6458284 0.5428610    0.5375269     0.3815714
##           intercept     Slope
## Boot 1 -0.153194405 0.9741632
## Boot 2 -0.008403925 0.7697086
## Boot 3 -0.121148496 0.5440115
## Boot 4 -0.074413523 0.6987898
## Boot 5 -0.484662668 0.7381040
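
The columns of $stats_val can be related to the per-bootstrap results in $res_boot. The sketch below is inferred from the printed numbers and is not taken from the package documentation. It assumes that Apparent and Test are plain means over the bootstrap samples, that Optimism is Apparent minus Test, and that Corrected is Orig minus Optimism. The object names are hypothetical and the values are copied by hand from the output above. Under these assumptions the calculation appears to reproduce the AUC row.

```r
# Per-bootstrap AUC values copied from the $res_boot output above
roc_app  <- c(0.8708, 0.9099, 0.9447, 0.9225, 0.9180)
roc_test <- c(0.8767, 0.8813, 0.8793, 0.8714, 0.8847)

# AUC of the pooled model in the original imputed data (column Orig)
auc_orig <- 0.8871

auc_apparent  <- mean(roc_app)              # mean AUC in the bootstrap training data
auc_test      <- mean(roc_test)             # mean AUC when tested in the original data
auc_optimism  <- auc_apparent - auc_test    # assumed definition of Optimism
auc_corrected <- auc_orig - auc_optimism    # assumed definition of Corrected

round(c(Orig = auc_orig, Apparent = auc_apparent, Test = auc_test,
        Optimism = auc_optimism, Corrected = auc_corrected), 4)
```

The same arithmetic can be applied to the R2, Brier Scaled and Slope rows using the corresponding columns of $res_boot.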


3.2 Method MI_boot including BW selection

Internal validation is done of the last model that is selected by the function psfmi_lr. In the example below, psfmi_lr is used with p.crit set at 1, so pooling is done without variable selection, i.e. the full model is pooled. When internal validation is subsequently done with the psfmi_perform function including BW, BW is applied in each bootstrap sample from the full model.

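library(psfmi)

# Pool the full model first (p.crit = 1: no variables are removed)
pool_lr <- psfmi_lr(data = lbpmilr,
                    formula = Chronic ~ Pain + JobDemands + rcs(Tampascale, 3) +
                      factor(Satisfaction) + Smoking,
                    p.crit = 1, direction = "FW",
                    nimp = 5, impvar = "Impnr", method = "D1")

# Internal validation with backward selection (p.crit = 0.05) in each bootstrap sample
set.seed(200)
res <- psfmi_perform(pool_lr, val_method = "MI_boot", nboot = 5, p.crit = 0.05, direction = "BW")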
## 
## Boot 1
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - JobDemands
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 2
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - rcs(Tampascale,3)
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 3
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - rcs(Tampascale,3)
## Removed at Step 3 is - JobDemands
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 4
## Removed at Step 1 is - rcs(Tampascale,3)
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - JobDemands
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 5
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - JobDemands
## Removed at Step 3 is - rcs(Tampascale,3)
## 
## Selection correctly terminated, 
## No more variables removed from the model
res
## $stats_val
##                   Orig  Apparent      Test   Optimism Corrected
## AUC          0.8871000 0.9001200 0.8740600 0.02606000 0.8610400
## R2           0.5605521 0.6012708 0.5231900 0.07808081 0.4824713
## Brier Scaled 0.4514569 0.5105213 0.4331223 0.07739902 0.3740579
## Slope        1.0000000 1.0000000 0.8603053 0.13969467 0.8603053
## 
## $intercept_test
##   intercept 
## -0.06307424 
## 
## $res_boot
##        ROC_app ROC_test    R2_app   R2_test Brier_sc_app Brier_sc_test
## Boot 1  0.8680   0.8783 0.5132543 0.5362514    0.3997598     0.4122129
## Boot 2  0.8941   0.8730 0.5801429 0.5236584    0.4867631     0.4385677
## Boot 3  0.9300   0.8730 0.6831782 0.5173532    0.5958985     0.4446954
## Boot 4  0.9156   0.8730 0.6410638 0.5169447    0.5565305     0.4441806
## Boot 5  0.8929   0.8730 0.5887150 0.5217423    0.5136547     0.4259549
##           intercept     Slope
## Boot 1 -0.181845909 1.0237503
## Boot 2 -0.066031545 0.8350387
## Boot 3  0.057587096 0.7083871
## Boot 4 -0.009979645 0.7793790
## Boot 5 -0.115101220 0.9549715
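
The selection log above shows that the removed variables differ between bootstrap samples. As a small illustration, the sketch below tallies how often each variable was removed. The variable names are transcribed by hand from the log, and the object names `removed` and `removal_count` are hypothetical; this is not a psfmi function or output element.

```r
# Variables removed by backward selection in each bootstrap sample,
# transcribed from the log printed above
removed <- list(
  boot1 = c("Smoking", "JobDemands"),
  boot2 = c("JobDemands", "Smoking", "rcs(Tampascale,3)"),
  boot3 = c("Smoking", "rcs(Tampascale,3)", "JobDemands"),
  boot4 = c("rcs(Tampascale,3)", "Smoking", "JobDemands"),
  boot5 = c("Smoking", "JobDemands", "rcs(Tampascale,3)")
)

# Number of bootstrap samples in which each variable was removed
removal_count <- table(unlist(removed, use.names = FALSE))
removal_count

# Proportion of bootstrap samples in which each variable was removed
removal_count / length(removed)
```

With only 5 bootstrap samples these proportions are very rough; they are shown here only to summarize the log.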


3.3 Method MI_boot including FW selection

Internal validation is done of the last model that is selected by the function psfmi_lr. In the example below, psfmi_lr is used with p.crit set at 1, so pooling is done without variable selection, i.e. the full model is pooled. When internal validation is subsequently done with the psfmi_perform function including FW, FW is applied in each bootstrap sample from the full model.

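library(psfmi)

# Pool the full model first (p.crit = 1: no variables are removed)
pool_lr <- psfmi_lr(data = lbpmilr,
                    formula = Chronic ~ Pain + JobDemands + rcs(Tampascale, 3) +
                      factor(Satisfaction) + Smoking,
                    p.crit = 1, direction = "FW",
                    nimp = 5, impvar = "Impnr", method = "D1")

# Internal validation with forward selection (p.crit = 0.05) in each bootstrap sample
set.seed(200)
res <- psfmi_perform(pool_lr, val_method = "MI_boot", nboot = 5, p.crit = 0.05, direction = "FW")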
## 
## Boot 1
## Entered at Step 1 is - Pain
## Entered at Step 2 is - factor(Satisfaction)
## Entered at Step 3 is - rcs(Tampascale,3)
## 
## Selection correctly terminated, 
## No new variables entered the model
## 
## Boot 2
## Entered at Step 1 is - Pain
## Entered at Step 2 is - factor(Satisfaction)
## 
## Selection correctly terminated, 
## No new variables entered the model
## 
## Boot 3
## Entered at Step 1 is - rcs(Tampascale,3)
## Entered at Step 2 is - Pain
## Entered at Step 3 is - factor(Satisfaction)
## 
## Selection correctly terminated, 
## No new variables entered the model
## 
## Boot 4
## Entered at Step 1 is - Pain
## Entered at Step 2 is - factor(Satisfaction)
## 
## Selection correctly terminated, 
## No new variables entered the model
## 
## Boot 5
## Entered at Step 1 is - Pain
## Entered at Step 2 is - factor(Satisfaction)
## 
## Selection correctly terminated, 
## No new variables entered the model
res
## $stats_val
##                   Orig  Apparent       Test Optimism  Corrected
## AUC          0.8871000 0.9024200  0.6622000 0.240220  0.6468800
## R2           0.5605521 0.6061594  0.2127155 0.393444  0.1671082
## Brier Scaled 0.4514569 0.5153547 -0.7659031 1.281258 -0.8298009
## Slope        1.0000000 1.0000000 -0.3117138 1.311714 -0.3117138
## 
## $intercept_test
## intercept 
## -4.969914 
## 
## $res_boot
##        ROC_app ROC_test    R2_app    R2_test Brier_sc_app Brier_sc_test
## Boot 1  0.8680   0.8170 0.5132543 0.34548078    0.3997598    -0.7666667
## Boot 2  0.8941   0.7290 0.5801429 0.25446803    0.4867631    -0.7665664
## Boot 3  0.9415   0.5627 0.7076212 0.02626783    0.6200654    -0.7666667
## Boot 4  0.9156   0.7940 0.6410638 0.39033595    0.5565305    -0.7666581
## Boot 5  0.8929   0.4083 0.5887150 0.04702472    0.5136547    -0.7629576
##        intercept       Slope
## Boot 1 -5.389639 -0.14785309
## Boot 2 -5.887225 -0.52051344
## Boot 3 -3.596809 -0.07416648
## Boot 4 -7.748762 -0.55147565
## Boot 5 -2.227136 -0.26456015
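
The calibration values in the summary can be checked against the per-bootstrap results in the same way. The sketch below assumes, based only on the printed numbers, that $intercept_test and the Test value of the Slope row are plain means of the intercept and Slope columns of $res_boot. The object names are hypothetical and the values are copied by hand from the output above.

```r
# Per-bootstrap calibration intercepts and slopes copied from $res_boot above
boot_intercept <- c(-5.389639, -5.887225, -3.596809, -7.748762, -2.227136)
boot_slope     <- c(-0.14785309, -0.52051344, -0.07416648, -0.55147565, -0.26456015)

# Assumed summaries: plain means over the bootstrap samples
mean_intercept <- mean(boot_intercept)
mean_slope     <- mean(boot_slope)

# Compare with $intercept_test and the Test column of the Slope row
round(c(intercept = mean_intercept, slope = mean_slope), 6)
```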
