- Hypothesis tests
  - One numerical variable (mean)
  - One numerical variable (standardized mean \(t\))
  - One numerical variable (median)
  - One categorical (one proportion)
  - One categorical variable (standardized proportion \(z\))
  - Two categorical (2 level) variables
  - Two categorical (2 level) variables (z)
  - One categorical (>2 level) - GoF
  - Two categorical (>2 level): Chi-squared test of independence
  - One numerical variable, one categorical (2 levels) (diff in means)
  - One numerical variable, one categorical (2 levels) (t)
  - One numerical variable, one categorical (2 levels) (diff in medians)
  - One numerical, one categorical (>2 levels) - ANOVA
  - Two numerical vars - SLR
  - Two numerical vars - correlation
  - Two numerical vars - SLR (t)
  - Multiple explanatory variables

- Confidence intervals
  - One numerical (one mean)
  - One numerical (one mean - standardized)
  - One categorical (one proportion)
  - One categorical variable (standardized proportion \(z\))
  - One numerical variable, one categorical (2 levels) (diff in means)
  - One numerical variable, one categorical (2 levels) (t)
  - Two categorical variables (diff in proportions)
  - Two categorical variables (z)
  - Two numerical vars - SLR
  - Two numerical vars - correlation
  - Two numerical vars - t
  - Multiple explanatory variables

This vignette is intended to provide a set of examples that nearly
exhaustively demonstrate the functionalities provided by `infer`.
Commentary on these examples is limited; for more discussion of the
intuition behind the package, see the “Getting to Know infer” vignette,
accessible by calling `vignette("infer")`.

Throughout this vignette, we’ll make use of the `gss` dataset supplied by
`infer`, which contains a sample of data from the General Social Survey.
See `?gss` for more information on the variables included and their
source. Note that this data (and our examples on it) are for
demonstration purposes only, and will not necessarily provide accurate
estimates unless weighted properly. For these examples, let’s suppose
that this dataset is a representative sample of a population we want to
learn about: American adults. The data looks like this:

```
# load the package that supplies the dataset and the verbs used below
library(infer)
# load in the dataset
data(gss)
# take a look at its structure
dplyr::glimpse(gss)
```

```
## Rows: 500
## Columns: 11
## $ year <dbl> 2014, 1994, 1998, 1996, 1994, 1996, 1990, 2016, 2000, 1998, 20…
## $ age <dbl> 36, 34, 24, 42, 31, 32, 48, 36, 30, 33, 21, 30, 38, 49, 25, 56…
## $ sex <fct> male, female, male, male, male, female, female, female, female…
## $ college <fct> degree, no degree, degree, no degree, degree, no degree, no de…
## $ partyid <fct> ind, rep, ind, ind, rep, rep, dem, ind, rep, dem, dem, ind, de…
## $ hompop <dbl> 3, 4, 1, 4, 2, 4, 2, 1, 5, 2, 4, 3, 4, 4, 2, 2, 3, 2, 1, 2, 5,…
## $ hours <dbl> 50, 31, 40, 40, 40, 53, 32, 20, 40, 40, 23, 52, 38, 72, 48, 40…
## $ income <ord> $25000 or more, $20000 - 24999, $25000 or more, $25000 or more…
## $ class <fct> middle class, working class, working class, working class, mid…
## $ finrela <fct> below average, below average, below average, above average, ab…
## $ weight <dbl> 0.8960, 1.0825, 0.5501, 1.0864, 1.0825, 1.0864, 1.0627, 0.4785…
```

Calculating the observed statistic,

```
x_bar <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
x_bar <- gss %>%
  observe(response = hours, stat = "mean")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000) %>%
  calculate(stat = "mean")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = x_bar, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = x_bar, direction = "two-sided")
```

| p_value |
|---|
| 0.038 |
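To make the simulation steps for a single mean more concrete, here is a minimal base-R sketch of one way a point-null distribution can be built by hand: recenter the sample at the hypothesized mean, resample with replacement, and compare the observed mean to the resampled means. The data are simulated stand-ins (not `gss`), all variable names are hypothetical, and the recentering and two-sided rule shown are assumptions about the general technique rather than a description of `infer`'s internals.

```r
# Hypothetical data standing in for a numeric response such as `hours`
set.seed(1)
x <- rnorm(200, mean = 41, sd = 14)
mu0 <- 40  # hypothesized mean

# Observed statistic
x_bar <- mean(x)

# Recenter the sample so its mean equals mu0, then bootstrap:
# resample with replacement and record each resample's mean
x_null <- x - mean(x) + mu0
null_means <- replicate(1000, mean(sample(x_null, replace = TRUE)))

# Two-sided p-value (assumed rule): twice the smaller tail proportion, capped at 1
p_left  <- mean(null_means <= x_bar)
p_right <- mean(null_means >= x_bar)
p_value <- min(1, 2 * min(p_left, p_right))
p_value
```

Because the resamples are drawn at random, the p-value will vary slightly from run to run unless a seed is set.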

Calculating the observed statistic,

```
t_bar <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
t_bar <- gss %>%
  observe(response = hours, null = "point", mu = 40, stat = "t")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000) %>%
  calculate(stat = "t")
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(response = hours) %>%
  assume("t")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = t_bar, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = t_bar, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = t_bar, direction = "two-sided")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = t_bar, direction = "two-sided")
```

| p_value |
|---|
| 0.028 |

Alternatively, using the `t_test` wrapper:

```
gss %>%
  t_test(response = hours, mu = 40)
```

| statistic | t_df | p_value | alternative | estimate | lower_ci | upper_ci |
|---|---|---|---|---|---|---|
| 2.085 | 499 | 0.0376 | two.sided | 41.38 | 40.08 | 42.68 |
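For readers who want to see where a one-sample \(t\) statistic and its degrees of freedom come from, the following base-R sketch computes them by hand and checks the result against `stats::t.test()`. It assumes the classical one-sample formula; the data are simulated and the variable names are hypothetical, so the numbers will not match the table above.

```r
# Hypothetical data standing in for a numeric response such as `hours`
set.seed(2)
x <- rnorm(100, mean = 41, sd = 14)
mu0 <- 40  # hypothesized mean
n <- length(x)

# Classical one-sample t statistic: (sample mean - mu0) / standard error
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check against base R's built-in test
ref <- t.test(x, mu = mu0)
all.equal(t_stat, unname(ref$statistic))
all.equal(p_value, ref$p.value)
```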

`infer` does not support testing on one numerical variable via the `z`
distribution.

Calculating the observed statistic,

```
x_tilde <- gss %>%
  specify(response = age) %>%
  calculate(stat = "median")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
x_tilde <- gss %>%
  observe(response = age, stat = "median")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = age) %>%
  hypothesize(null = "point", med = 40) %>%
  generate(reps = 1000) %>%
  calculate(stat = "median")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = x_tilde, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = x_tilde, direction = "two-sided")
```

| p_value |
|---|
| 0.008 |

Calculating the observed statistic,

```
p_hat <- gss %>%
  specify(response = sex, success = "female") %>%
  calculate(stat = "prop")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
p_hat <- gss %>%
  observe(response = sex, success = "female", stat = "prop")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 1000) %>%
  calculate(stat = "prop")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = p_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = p_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.254 |
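The idea behind a simulated null distribution for a single proportion can be sketched in base R: draw many samples of the same size from a population where the hypothesized proportion holds, and see how often the simulated proportions are at least as extreme as the observed one. The sample size matches the 500 rows of `gss`, but the observed proportion below is a hypothetical value chosen for illustration, and the two-sided rule is an assumption rather than a statement about how `infer` computes it.

```r
set.seed(3)
n <- 500       # sample size (gss has 500 rows)
p0 <- 0.5      # hypothesized proportion
p_hat <- 0.53  # hypothetical observed proportion, for illustration only

# Each draw simulates n Bernoulli(p0) outcomes and records the success proportion
null_props <- rbinom(1000, size = n, prob = p0) / n

# Two-sided p-value (assumed rule): twice the smaller tail proportion, capped at 1
p_left  <- mean(null_props <= p_hat)
p_right <- mean(null_props >= p_hat)
p_value <- min(1, 2 * min(p_left, p_right))
p_value
```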

Note that logical variables will be coerced to factors:

```
null_dist <- gss %>%
  dplyr::mutate(is_female = (sex == "female")) %>%
  specify(response = is_female, success = "TRUE") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 1000) %>%
  calculate(stat = "prop")
```

Calculating the observed statistic,

```
p_hat <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = .5) %>%
  calculate(stat = "z")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
p_hat <- gss %>%
  observe(response = sex, success = "female", null = "point", p = .5, stat = "z")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = .5) %>%
  generate(reps = 1000, type = "draw") %>%
  calculate(stat = "z")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = p_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = p_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.252 |

The package also supplies a wrapper around `prop.test` for tests of a
single proportion on tidy data.

```
prop_test(gss,
          college ~ NULL,
          p = .2)
```

| statistic | chisq_df | p_value | alternative |
|---|---|---|---|
| 635.6 | 1 | 0 | two.sided |

`infer` does not support testing two means via the `z` distribution.

The `infer` package provides several statistics to work with data of this
type. One of them is the statistic for difference in proportions.

Calculating the observed statistic,

```
d_hat <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  calculate(stat = "diff in props", order = c("female", "male"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
d_hat <- gss %>%
  observe(college ~ sex, success = "no degree",
          stat = "diff in props", order = c("female", "male"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000) %>%
  calculate(stat = "diff in props", order = c("female", "male"))
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = d_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = d_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.994 |
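A null hypothesis of independence between two categorical variables is commonly simulated by permutation: shuffle one variable so that any association with the other is broken, then recompute the statistic. The base-R sketch below illustrates this for a difference in proportions. The data are simulated, the helper `diff_in_props()` is hypothetical, and the level names merely mirror those used above; it is meant as an illustration of the general technique, not of `infer`'s implementation.

```r
set.seed(4)
n <- 300
# Hypothetical two-level explanatory and response variables
group <- sample(c("female", "male"), n, replace = TRUE)
outcome <- sample(c("degree", "no degree"), n, replace = TRUE,
                  prob = c(0.35, 0.65))

# Hypothetical helper: proportion of "no degree" among females minus males
diff_in_props <- function(outcome, group) {
  mean(outcome[group == "female"] == "no degree") -
    mean(outcome[group == "male"] == "no degree")
}

# Observed statistic
d_hat <- diff_in_props(outcome, group)

# Permutation null: shuffle the response while keeping the groups fixed
null_diffs <- replicate(1000, diff_in_props(sample(outcome), group))

# Two-sided p-value (assumed rule): twice the smaller tail proportion, capped at 1
p_value <- min(1, 2 * min(mean(null_diffs <= d_hat), mean(null_diffs >= d_hat)))
p_value
```

The same loop works for a ratio of proportions or an odds ratio by swapping out the helper function.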

`infer` also provides functionality to calculate ratios of proportions.
The workflow looks similar to that for `diff in props`.

Calculating the observed statistic,

```
r_hat <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  calculate(stat = "ratio of props", order = c("female", "male"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
r_hat <- gss %>%
  observe(college ~ sex, success = "no degree",
          stat = "ratio of props", order = c("female", "male"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000) %>%
  calculate(stat = "ratio of props", order = c("female", "male"))
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = r_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = r_hat, direction = "two-sided")
```

| p_value |
|---|
| 1 |

In addition, the package provides functionality to calculate odds ratios.
The workflow also looks similar to that for `diff in props`.

Calculating the observed statistic,

```
or_hat <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  calculate(stat = "odds ratio", order = c("female", "male"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000) %>%
  calculate(stat = "odds ratio", order = c("female", "male"))
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = or_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = or_hat, direction = "two-sided")
```

| p_value |
|---|
| 1 |

Finding the standardized observed statistic,

```
z_hat <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  hypothesize(null = "independence") %>%
  calculate(stat = "z", order = c("female", "male"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
z_hat <- gss %>%
  observe(college ~ sex, success = "no degree",
          stat = "z", order = c("female", "male"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000) %>%
  calculate(stat = "z", order = c("female", "male"))
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(college ~ sex, success = "no degree") %>%
  assume("z")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = z_hat, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = z_hat, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = z_hat, direction = "two-sided")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = z_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.992 |

Note the similarities in this plot and the previous one.

The package also supplies a wrapper around `prop.test` to allow for tests
of equality of proportions on tidy data.

```
prop_test(gss,
          college ~ sex,
          order = c("female", "male"))
```

| statistic | chisq_df | p_value | alternative | lower_ci | upper_ci |
|---|---|---|---|---|---|
| 0 | 1 | 0.9964 | two.sided | -0.1009 | 0.0917 |

Calculating the observed statistic,

Note the need to add in the hypothesized values here to compute the observed statistic.

```
Chisq_hat <- gss %>%
  specify(response = finrela) %>%
  hypothesize(null = "point",
              p = c("far below average" = 1/6,
                    "below average" = 1/6,
                    "average" = 1/6,
                    "above average" = 1/6,
                    "far above average" = 1/6,
                    "DK" = 1/6)) %>%
  calculate(stat = "Chisq")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
Chisq_hat <- gss %>%
  observe(response = finrela,
          null = "point",
          p = c("far below average" = 1/6,
                "below average" = 1/6,
                "average" = 1/6,
                "above average" = 1/6,
                "far above average" = 1/6,
                "DK" = 1/6),
          stat = "Chisq")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(response = finrela) %>%
  hypothesize(null = "point",
              p = c("far below average" = 1/6,
                    "below average" = 1/6,
                    "average" = 1/6,
                    "above average" = 1/6,
                    "far above average" = 1/6,
                    "DK" = 1/6)) %>%
  generate(reps = 1000, type = "draw") %>%
  calculate(stat = "Chisq")
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(response = finrela) %>%
  assume("Chisq")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = Chisq_hat, direction = "greater")
```

| p_value |
|---|
| 0 |

Alternatively, using the `chisq_test` wrapper:

```
chisq_test(gss,
           response = finrela,
           p = c("far below average" = 1/6,
                 "below average" = 1/6,
                 "average" = 1/6,
                 "above average" = 1/6,
                 "far above average" = 1/6,
                 "DK" = 1/6))
```

| statistic | chisq_df | p_value |
|---|---|---|
| 488 | 5 | 0 |
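To show what a goodness-of-fit chi-squared statistic measures, the base-R sketch below computes it by hand as the sum of squared differences between observed and expected counts, scaled by the expected counts, and checks the result against `stats::chisq.test()`. The four-level variable is simulated and its level names are hypothetical, so the numbers are unrelated to the `finrela` results above.

```r
set.seed(5)
# Hypothetical categorical variable with four levels
lvls <- c("a", "b", "c", "d")
x <- factor(sample(lvls, 200, replace = TRUE, prob = c(0.4, 0.3, 0.2, 0.1)),
            levels = lvls)

# Hypothesized proportions: all levels equally likely
p0 <- rep(1 / 4, 4)

observed <- as.vector(table(x))
expected <- length(x) * p0

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chisq_stat <- sum((observed - expected)^2 / expected)

# Upper-tail p-value with (number of levels - 1) degrees of freedom
p_value <- pchisq(chisq_stat, df = length(p0) - 1, lower.tail = FALSE)

# Cross-check against base R's built-in test
ref <- chisq.test(table(x), p = p0)
all.equal(chisq_stat, unname(ref$statistic))
all.equal(p_value, ref$p.value)
```

Only the upper tail is used because large values of the statistic are the ones that indicate departure from the hypothesized proportions, which is why the examples above pass `direction = "greater"`.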

Calculating the observed statistic,

```
Chisq_hat <- gss %>%
  specify(formula = finrela ~ sex) %>%
  hypothesize(null = "independence") %>%
  calculate(stat = "Chisq")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
Chisq_hat <- gss %>%
  observe(formula = finrela ~ sex, stat = "Chisq")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(finrela ~ sex) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "Chisq")
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(finrela ~ sex) %>%
  assume(distribution = "Chisq")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = Chisq_hat, direction = "greater")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = Chisq_hat, direction = "greater")
```

| p_value |
|---|
| 0.119 |

Alternatively, using the wrapper to carry out the test,

```
gss %>%
  chisq_test(formula = finrela ~ sex)
```

| statistic | chisq_df | p_value |
|---|---|---|
| 9.105 | 5 | 0.1049 |

Calculating the observed statistic,

```
d_hat <- gss %>%
  specify(age ~ college) %>%
  calculate(stat = "diff in means", order = c("degree", "no degree"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
d_hat <- gss %>%
  observe(age ~ college,
          stat = "diff in means", order = c("degree", "no degree"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(age ~ college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("degree", "no degree"))
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = d_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = d_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.414 |

Finding the standardized observed statistic,

```
t_hat <- gss %>%
  specify(age ~ college) %>%
  hypothesize(null = "independence") %>%
  calculate(stat = "t", order = c("degree", "no degree"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
t_hat <- gss %>%
  observe(age ~ college,
          stat = "t", order = c("degree", "no degree"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(age ~ college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "t", order = c("degree", "no degree"))
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(age ~ college) %>%
  assume("t")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = t_hat, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = t_hat, direction = "two-sided")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = t_hat, direction = "two-sided")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = t_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.404 |

Note the similarities in this plot and the previous one.

Calculating the observed statistic,

```
d_hat <- gss %>%
  specify(age ~ college) %>%
  calculate(stat = "diff in medians", order = c("degree", "no degree"))
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
d_hat <- gss %>%
  observe(age ~ college,
          stat = "diff in medians", order = c("degree", "no degree"))
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(age ~ college) %>% # alt: response = age, explanatory = college
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in medians", order = c("degree", "no degree"))
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = d_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = d_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.136 |

Calculating the observed statistic,

```
F_hat <- gss %>%
  specify(age ~ partyid) %>%
  calculate(stat = "F")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
F_hat <- gss %>%
  observe(age ~ partyid, stat = "F")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(age ~ partyid) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "F")
```

Alternatively, finding the null distribution with theoretical methods using the `assume()` verb,

```
null_dist_theory <- gss %>%
  specify(age ~ partyid) %>%
  hypothesize(null = "independence") %>%
  assume(distribution = "F")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = F_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using the theory-based null distribution,

```
visualize(null_dist_theory) +
  shade_p_value(obs_stat = F_hat, direction = "greater")
```

Alternatively, visualizing the observed statistic using both of the null distributions,

```
visualize(null_dist, method = "both") +
  shade_p_value(obs_stat = F_hat, direction = "greater")
```

Note that the above code makes use of the randomization-based null distribution.

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = F_hat, direction = "greater")
```

| p_value |
|---|
| 0.066 |
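As a reminder of what a one-way ANOVA \(F\) statistic compares, the following base-R sketch computes it by hand as the ratio of between-group to within-group mean squares and checks it against `anova(lm(...))`. The data are simulated with three hypothetical groups, so the values are unrelated to the `age ~ partyid` result above.

```r
set.seed(7)
# Hypothetical grouping variable with three levels and a numeric response
party <- factor(rep(c("dem", "ind", "rep"), each = 40))
age <- rnorm(120, mean = 40, sd = 12)

k <- nlevels(party)  # number of groups
n <- length(age)     # number of observations

# Each observation's own group mean (ave() defaults to the mean)
group_mean <- ave(age, party)

# Sums of squares between groups and within groups
ss_between <- sum((group_mean - mean(age))^2)
ss_within  <- sum((age - group_mean)^2)

# F statistic: ratio of the two mean squares
F_hat <- (ss_between / (k - 1)) / (ss_within / (n - k))

# Upper-tail p-value from the F distribution
p_value <- pf(F_hat, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

# Cross-check against base R's ANOVA table
ref <- anova(lm(age ~ party))
all.equal(F_hat, ref$`F value`[1])
all.equal(p_value, ref$`Pr(>F)`[1])
```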

Calculating the observed statistic,

```
slope_hat <- gss %>%
  specify(hours ~ age) %>%
  calculate(stat = "slope")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
slope_hat <- gss %>%
  observe(hours ~ age, stat = "slope")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(hours ~ age) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "slope")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = slope_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = slope_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.842 |
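The permutation approach for a regression slope can also be sketched in base R: the least-squares slope equals the covariance of the two variables divided by the variance of the explanatory variable, and shuffling the response simulates the case of no relationship. The data below are simulated, the helper `slope()` is hypothetical, and the two-sided rule is an assumption about the general technique rather than a description of `infer`'s internals.

```r
set.seed(6)
# Hypothetical numeric explanatory and response variables
age <- runif(150, min = 20, max = 65)
hours <- 40 + rnorm(150, sd = 12)

# Hypothetical helper: least-squares slope of y on x
slope <- function(y, x) cov(x, y) / var(x)

# Observed statistic, which should agree with lm()
slope_hat <- slope(hours, age)
all.equal(slope_hat, unname(coef(lm(hours ~ age))[2]))

# Permutation null: shuffle the response, keep the explanatory variable fixed
null_slopes <- replicate(1000, slope(sample(hours), age))

# Two-sided p-value (assumed rule): twice the smaller tail proportion, capped at 1
p_value <- min(1, 2 * min(mean(null_slopes <= slope_hat),
                          mean(null_slopes >= slope_hat)))
p_value
```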

Calculating the observed statistic,

```
correlation_hat <- gss %>%
  specify(hours ~ age) %>%
  calculate(stat = "correlation")
```

Alternatively, using the `observe()` wrapper to calculate the observed statistic,

```
correlation_hat <- gss %>%
  observe(hours ~ age, stat = "correlation")
```

Then, generating the null distribution,

```
null_dist <- gss %>%
  specify(hours ~ age) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "correlation")
```

Visualizing the observed statistic alongside the null distribution,

```
visualize(null_dist) +
  shade_p_value(obs_stat = correlation_hat, direction = "two-sided")
```

Calculating the p-value from the null distribution and observed statistic,

```
null_dist %>%
  get_p_value(obs_stat = correlation_hat, direction = "two-sided")
```

| p_value |
|---|
| 0.854 |

Not currently implemented since \(t\) could refer to standardized slope or standardized correlation.

Calculating the observed fit,

```
obs_fit <- gss %>%
  specify(hours ~ age + college) %>%
  fit()
```

Generating a distribution of fits with the response variable permuted,

```
null_dist <- gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  fit()
```

Generating a distribution of fits where each explanatory variable is permuted independently,

```
null_dist2 <- gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute", variables = c(age, college)) %>%
  fit()
```

Visualizing the observed fit alongside the null fits,

```
visualize(null_dist) +
  shade_p_value(obs_stat = obs_fit, direction = "two-sided")
```