In this vignette, we will explain how to compute a Bayes factor for informed hypotheses on multiple binomial parameters.

As example for an inequality-constrained hypothesis on multiple independent binomials, we will use the dataset, `journals`

, which is included in the package `multibridge`

. The dataset is based on a study by Nuijten et al. (2016) which evaluated the prevalence of statistical reporting errors (i.e., inconsistencies between the reported test statistic and degrees of freedom, and the reported p-value) in the field of psychology. In total, the dataset contains a summary of statistical reporting errors of 16,695 research articles reporting results from null hypothesis significance testing (NHST). The selected articles were published in eight major journals in psychology between 1985 to 2013: *Developmental Psychology* (DP), *Frontiers in Psychology* (FP), the *Journal of Applied Psychology* (JAP), the *Journal of Consulting and Clinical Psychology* (JCCP), *Journal of Experimental Psychology: General* (JEPG), the *Journal of Personality and Social Psychology* (JPSP), the *Public Library of Science* (PLoS), and *Psychological Science* (PS).

```
# load the package and data
library(multibridge)
data(journals)
journals
```

```
## journal articles_downloaded articles_with_NHST perc_articles_with_NHST
## 1 DP 3379 2607 77.2
## 2 FP 2139 702 32.8
## 3 JAP 2782 1638 58.9
## 4 JCCP 3519 2413 68.6
## 5 JEPG 1184 821 69.3
## 6 JPSP 5108 4346 85.1
## 7 PLOS 10299 2487 24.1
## 8 PS 2307 1681 72.9
## nr_NHST mean_nr_NHST_per_article_with_NHST
## 1 37658 14.44
## 2 10149 14.46
## 3 15134 9.24
## 4 27429 11.37
## 5 18921 23.05
## 6 101621 23.38
## 7 31539 12.68
## 8 15654 9.31
## mean_nr_NHST_per_article_all_included errors dec_errors perc_errors
## 1 11.14 3821 654 10.15
## 2 4.74 1045 134 10.30
## 3 5.44 1146 302 7.57
## 4 7.79 3396 463 12.38
## 5 15.98 1495 210 7.90
## 6 19.89 8484 1139 8.35
## 7 3.06 4151 488 13.16
## 8 6.79 1423 191 9.09
## perc_dec_errors perc_articles_with_errors perc_articles_with_dec_errors
## 1 1.74 50.90 15.15
## 2 1.32 50.85 11.97
## 3 2.00 33.64 12.45
## 4 1.69 48.90 11.56
## 5 1.11 54.81 12.91
## 6 1.12 57.62 15.81
## 7 1.55 49.70 11.50
## 8 1.22 39.74 6.48
## APAfactor
## 1 0.7827710
## 2 0.7408296
## 3 0.7321825
## 4 0.7710022
## 5 0.8031367
## 6 0.8106632
## 7 0.6701959
## 8 0.7795777
```

The model that we will use assumes that the elements in the vector of successes \(x_1, \cdots, x_K\) and the elements in the vector of total number of observations \(n_1, \cdots, n_K\) in the \(K\) categories follow independent binomial distributions. The parameter vector of the binomial success probabilities, \(\theta_1, \cdots, \theta_K\), contains the probabilities of observing a value in a particular category; here, it reflects the probabilities of a statistical reporting error in one of the 8 journals. The parameter vector \(\theta_1, \cdots, \theta_K\) are drawn from independent beta distributions with parameters \(\alpha_1, \cdots, \alpha_K\) and \(\beta_1, \cdots, \beta_K\). The model can be described as follows:

\[\begin{align} x_1 \cdots x_K & \sim \prod_{k = 1}^K \text{Binomial}(\theta_k, n_k) \\ \theta_1 \cdots \theta_K &\sim \prod_{k = 1}^K \text{Beta}(\alpha_k, \beta_k) \\ \end{align}\]

Here, we test the inequality-constrained hypothesis \(\mathcal{H}_r\) formulated by Nuijten et al. (2016) that the prevalence for statistical reporting errors for articles published in social psychology journals (i.e., JPSP) is higher than for articles published in other journals. We will test this hypothesis against the encompassing hypothesis \(\mathcal{H}_e\) without any constraints. In addition, this hypothesis will be tested against the null hypothesis \(\mathcal{H}_0\) that all journals have the same prevalence to include an article with a statistical reporting error:

\[\begin{align*} \mathcal{H}_r &: (\theta_{\text{DP}}, \theta_{\text{FP}}, \theta_{\text{JAP}} , \theta_{\text{JCCP}} , \theta_{\text{JEPG}} , \theta_{\text{PLoS}}, \theta_{\text{PS}}) < \theta_{\text{JPSP}} \\ \mathcal{H}_e &: \theta_{\text{DP}} , \theta_{\text{FP}} , \cdots , \theta_{\text{JPSP}}\\ \mathcal{H}_0 &: \theta_{\text{DP}} = \theta_{\text{FP}} = \cdots = \theta_{\text{JPSP}}. \end{align*}\]

To evaluate the inequality-constrained hypothesis, we need to specify (1) a vector with observed successes, and (2) a vector containing the total number of observations, (3) the informed hypothesis, (4) a vector with prior parameters alpha for each binomial proportion, (5) a vector with prior parameters beta for each binomial proportion, and (6) the labels of the categories of interest (i.e., journal names):

```
# since percentages are rounded to two decimal values, we round the articles
# with an error to receive integer values (step 1)
x <- round(journals$articles_with_NHST * (journals$perc_articles_with_errors/100))
# total number of articles (step 2)
n <- journals$articles_with_NHST
# Specifying the informed Hypothesis (step 3)
Hr <- c('JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP')
# Prior specification (step 4 and 5)
# We assign a uniform beta distribution to each binomial propotion
a <- rep(1, 8)
b <- rep(1, 8)
# categories of interest (step 6)
journal_names <- journals$journal
```

With this information, we can now conduct the analysis with the function `binom_bf_informed()`

. Since we are interested in quantifying evidence in favor of the restricted hypothesis, we set the Bayes factor type to `BFre`

. For reproducibility, we are also setting a seed with the argument `seed`

:

```
ineq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFre',
seed = 2020)
m1 <- summary(ineq_results)
m1
```

```
## Bayes factor analysis
##
## Hypothesis H_e:
## All parameters are free to vary.
##
## Hypothesis H_r:
## JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP
##
## Bayes factor estimate BFre:
##
## 7.4289
##
## Based on 1 independent inequality-constrained hypothesis.
##
## Relative Mean-Square Error:
##
## 0.066
##
## Posterior Median and Credible Intervals Of Marginal Beta Distributions:
## alpha beta 2.5% 50% 97.5%
## 1 JAP 1 + 551 1 + 1087 0.314 0.337 0.360
## 2 PS 1 + 668 1 + 1013 0.374 0.397 0.421
## 3 JCCP 1 + 1180 1 + 1233 0.469 0.489 0.509
## 4 PLOS 1 + 1236 1 + 1251 0.477 0.497 0.517
## 5 DP 1 + 1327 1 + 1280 0.490 0.509 0.528
## 6 FP 1 + 357 1 + 345 0.472 0.509 0.545
## 7 JEPG 1 + 450 1 + 371 0.514 0.548 0.582
## 8 JPSP 1 + 2504 1 + 1842 0.561 0.576 0.591
```

From the summary of the results we can see that the overall relative mean-square error of \(0.066\) is quite high, which might suggest unstable results. This becomes apparent if we look at the result as percentage error which can be extracted from the output object:

`ineq_results$bf_list$error_measures`

```
## re2 cv percentage
## 1 0.06597936 0.2568645 25.6864%
```

Here the percentage error is at almost 25.6864%. For that reason, we will rerun the `binom_bf_informed()`

with more samples.

```
ineq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFre',
seed = 2020,
niter = 2e4)
m2 <- summary(ineq_results)
```

With more samples, the percentage error is considerably smaller:

`ineq_results$bf_list$error_measures`

```
## re2 cv percentage
## 1 0.003796574 0.06161634 6.1616%
```

Now, the overall relative mean-square error is \(0.0038\), which translates to a percentage error of 6.1616%.

The data are more likely under the informed hypothesis than under the alternative hypothesis; in fact we collected moderate evidence for the informed hypothesis. The results suggest that the data are 7.29 more likely under the informed hypothesis than under the hypothesis that all parameters are free to vary.

If we would like to compare the inequality-constrained hypothesis \(\mathcal{H}_r\) against the null hypothesis \(\mathcal{H}_0\) which states that the probability for a statistical reporting error is equal across all journals, we can set the Bayes factor type in `binom_bf_equality()`

to `BFr0`

.

```
eq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFr0',
seed = 2020,
niter = 2e4)
m3 <- summary(eq_results)
m3
```

```
## Bayes factor analysis
##
## Hypothesis H_0:
## All parameters are exactly equal.
##
## Hypothesis H_r:
## JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP
##
## Bayes factor estimate BFr0:
##
## 5.3802e+68
##
## Based on 1 independent inequality-constrained hypothesis.
##
## Relative Mean-Square Error:
##
## 0.0038
##
## Posterior Median and Credible Intervals Of Marginal Beta Distributions:
## alpha beta 2.5% 50% 97.5%
## 1 JAP 1 + 551 1 + 1087 0.314 0.337 0.360
## 2 PS 1 + 668 1 + 1013 0.374 0.397 0.421
## 3 JCCP 1 + 1180 1 + 1233 0.469 0.489 0.509
## 4 PLOS 1 + 1236 1 + 1251 0.477 0.497 0.517
## 5 DP 1 + 1327 1 + 1280 0.490 0.509 0.528
## 6 FP 1 + 357 1 + 345 0.472 0.509 0.545
## 7 JEPG 1 + 450 1 + 371 0.514 0.548 0.582
## 8 JPSP 1 + 2504 1 + 1842 0.561 0.576 0.591
```

The resulting Bayes factor that compares the informed to the null hypothesis provides extreme evidence for the informed hypothesis; the data are 5.3810^{68} more likely under the informed hypothesis than under the null hypothesis. But since the data provided only moderate evidence against the encompassing hypotheses (i.e., BFre=7.29), it would be more sensible to say that this result suggests a misspecification of the null model rather than a well specified informed hypothesis.

Nuijten, Michèle B, Chris HJ Hartgerink, Marcel ALM van Assen, Sacha Epskamp, and Jelte M Wicherts. 2016. “The Prevalence of Statistical Reporting Errors in Psychology (1985–2013).” *Behavior Research Methods* 48: 1205–26.