`simpr` provides a general, simple, and tidyverse-friendly framework for generating simulated data, fitting models on simulations, and tidying model results. The full workflow can happen in a single tidy pipeline without external functions, global values, or loops. It’s useful for power analysis, design analysis, simulation studies, and for teaching statistics.

- **Easily readable simulation specifications.** You can specify simulations in a few lines, including referring to other simulation variables and to simulation parameters that you’re varying (such as sample size). `simpr` takes care of all the details of generating your simulation across varying parameters.
- **Sensibly handle errors.** `simpr` has various options to keep going even when simulation or model-fitting hits errors, so that you don’t need to start over if a single iteration hits fatal numerical issues.
- **Reproducible workflows.** Individual simulations can be reproduced exactly without needing to perform the whole simulation again.
- **Easy-to-use parallel processing.** Building on `furrr`, parallel processing for `simpr` can usually be turned on with a couple of lines of code.
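
As a rough sketch of the parallel-processing point above: `furrr` runs on the `future` framework, so declaring a `future` plan before the pipeline is typically what enables parallel execution. The worker count, the variable `y`, the object name `par_results`, and the design values are illustrative assumptions, not recommendations from the package.

```r
library(simpr)
library(future)

## Assumption: two background R sessions; adjust to your machine
plan(multisession, workers = 2)

set.seed(100)
## Hypothetical small design: a one-sample t-test at two sample sizes
par_results <- specify(y = ~ rnorm(n)) %>%
  define(n = c(20, 40)) %>%
  generate(10) %>%
  fit(t_test = ~ t.test(y)) %>%
  tidy_fits()

## Return to sequential processing when done
plan(sequential)
```

For a design this small the overhead of starting workers likely outweighs any speedup; the pattern is meant for larger simulations.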

The hardest part of any simulation is designing the data-generating process and deciding what values of parameters you want to explore. `simpr` takes care of the rest so you can focus on these central issues.

```
## Install stable CRAN version
install.packages("simpr")
## Install latest development version
remotes::install_github("statisfactions/simpr")
library(simpr)
```

The `simpr` workflow, inspired by the `infer` package, distills a simulation study into five primary steps:

1. `specify()` your data-generating process
2. `define()` parameters that you want to systematically vary across your simulation design (e.g. *n*, effect size)
3. `generate()` the simulation data
4. `fit()` models to your data (e.g. `lm()`)
5. `tidy_fits()` for further processing using `broom::tidy()`, such as computing power or Type I Error rates
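
To make the five steps concrete, here is a minimal sketch that runs them one at a time and stores each intermediate result, rather than chaining them in a pipeline. The object names (`spec`, `design`, `sims`, `fits`, `results`), the variable `y`, and the design values are hypothetical choices for illustration.

```r
library(simpr)

set.seed(1)

## 1. specify(): one variable y drawn from a standard normal of size n
spec <- specify(y = ~ rnorm(n))

## 2. define(): vary the sample size n over two values
design <- define(spec, n = c(10, 20))

## 3. generate(): 5 simulated datasets per value of n
sims <- generate(design, 5)

## 4. fit(): a one-sample t-test on each simulated dataset
fits <- fit(sims, mean_test = ~ t.test(y))

## 5. tidy_fits(): one row of test results per simulated dataset
results <- tidy_fits(fits)
```

Each function takes the previous step's result as its first argument, which is why the same steps can be chained with `%>%`.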

`simpr` makes no assumptions about your data and is not specialized to any particular type of data-generating process or model. If R can generate the data and fit the model, you can use `simpr` to run your simulation. (The tidying step is limited to the models supported by `broom::tidy()`, although you can also supply your own tidying function or expression.)

Suppose we are calculating the power for a two-sample *t*-test where the data is log-normally distributed, which can be generated by `stats::rlnorm()`.

```
set.seed(100)
## Data-generating mechanism
specify(a = ~ rlnorm(n, meanlog = 0),
        b = ~ rlnorm(n, meanlog = 0.5)) %>%
  ## Vary n from 30 to 100
  define(n = seq(30, 100, by = 10)) %>%
  ## 100 repetitions
  generate(100) %>%
  ## fit t-tests
  fit(t_test = ~ t.test(a, b)) %>%
  ## bring model results into a tidy tibble
  tidy_fits()
#> # A tibble: 800 × 14
#> .sim_id n rep Source estimate estimate1 estimate2 statistic p.value
#> <int> <dbl> <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 30 1 t_test -0.953 1.73 2.68 -1.60 0.117
#> 2 2 40 1 t_test -0.249 1.64 1.89 -0.581 0.563
#> 3 3 50 1 t_test -0.616 1.67 2.29 -1.19 0.237
#> 4 4 60 1 t_test -1.75 1.28 3.03 -3.30 0.00146
#> 5 5 70 1 t_test -0.876 1.61 2.48 -1.96 0.0525
#> 6 6 80 1 t_test -0.780 1.71 2.49 -2.13 0.0352
#> 7 7 90 1 t_test -0.818 1.60 2.42 -2.51 0.0129
#> 8 8 100 1 t_test -0.878 1.51 2.38 -2.61 0.00988
#> 9 9 30 2 t_test -0.487 1.96 2.44 -0.713 0.479
#> 10 10 40 2 t_test -2.29 1.37 3.66 -1.76 0.0851
#> # … with 790 more rows, and 5 more variables: parameter <dbl>, conf.low <dbl>,
#> # conf.high <dbl>, method <chr>, alternative <chr>
```

`specify()` creates two variables `a` and `b` that are distributed lognormally (any R expression that generates data can be used here). The `specify()` expressions refer to the sample size, `n`. `define()` clarifies that `n` varies between 30 and 100 by 10s. `generate()` actually does the data generation, with 100 simulated datasets for each value of `n` set in `define()`. `fit()` applies an arbitrary R expression to each simulated dataset, and `tidy_fits()` brings things together in a tidy tibble that can be easily aggregated and plotted to calculate bias, power, etc.
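
As one possible way to do that aggregation, the sketch below estimates power at each sample size as the share of simulated *t*-tests with a *p*-value below 0.05. It assumes the `dplyr` package is installed; the 0.05 threshold and the name `power_by_n` are illustrative choices rather than anything fixed by `simpr`.

```r
library(simpr)
library(dplyr)

set.seed(100)
power_by_n <- specify(a = ~ rlnorm(n, meanlog = 0),
                      b = ~ rlnorm(n, meanlog = 0.5)) %>%
  define(n = seq(30, 100, by = 10)) %>%
  generate(100) %>%
  fit(t_test = ~ t.test(a, b)) %>%
  tidy_fits() %>%
  ## One row per sample size: proportion of rejections at alpha = 0.05
  group_by(n) %>%
  summarize(power = mean(p.value < 0.05))

power_by_n
```

The result has one row per value of `n`, so it can be passed directly to a plotting function to draw a power curve.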

See `vignette("simpr")` to get started on using the package, or view the `simpr` showcase for several applied examples.