This vignette shows how the **jfa** package
supports auditors in the standard audit sampling workflow (hereafter
“audit workflow”). In this example of the audit workflow, we will
consider the case of BuildIt. BuildIt is a fictional construction
company in the United States that is being audited by an external
auditor for a fictional audit firm. At the end of the year, BuildIt has
provided a summary of its financial situation in the financial
statements. The objective of the auditor is to formulate an opinion
about the fairness of BuildIt’s financial statements.

The auditor needs to obtain sufficient and appropriate evidence for the hypothesis that the misstatement in the financial statements is lower than a certain amount: the materiality. If the financial statements contain misstatements that are considered material, this means that the errors in the financial statements are large enough that they might influence the decision of stakeholders relying on these financial statements. The performance materiality is the materiality that applies to each of the populations on which the financial statements are based. For this example, the performance materiality is set at 5% of the total value of the population.

In this example, we focus on the `BuildIt` data set that
comes with the **jfa** package.
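The data can be loaded and previewed along the following lines. This is a minimal sketch that assumes the **jfa** package is installed; it uses only `library()`, `data()`, and `head()`.

```r
library(jfa)

# Load the example data set shipped with the package.
data("BuildIt")

# Show the first six rows: item ID, booked value, and audit (true) value.
head(BuildIt)
```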

```
## ID bookValue auditValue
## 1 82884 242.61 242.61
## 2 25064 642.99 642.99
## 3 81235 628.53 628.53
## 4 71769 431.87 431.87
## 5 55080 620.88 620.88
## 6 93224 501.76 501.76
```

The population of interest consists of 3500 items, each with a booked value. Let’s assume that, before performing audit sampling, the auditor has assessed the quality of BuildIt’s internal control systems and found that they were working properly.

In order to formulate an opinion about the misstatement in the population, the auditor separates their audit workflow into four stages. First, they will need to plan the minimum size of a sample they need to inspect to perform inference about the population. Second, they will need to select the required sample from the population. Third, they will need to inspect the selected sample and determine the audit (true) value of the items it contains. Fourth, they will need to use the information from the inspected sample to perform inference about the misstatement in the population.

The auditor wants to make a statement that, with 95% confidence, the misstatement in the population is lower than the performance materiality of 5%. Based on last year’s audit at BuildIt, where the upper bound of the misstatement turned out to be 2.5%, they want to tolerate at most 2.5% errors in the intended sample. The auditor can therefore restate their goal as follows: if at most 2.5% errors are found in the sample, they want to be able to conclude with 95% confidence that the misstatement in the population is lower than the performance materiality of 5%.

Below, the auditor defines the performance materiality, confidence level, and expected misstatements in the sample.

```
# Specify the confidence, materiality, and expected misstatements.
confidence <- 0.95 # 95%
materiality <- 0.05 # 5%
expected <- 0.025 # 2.5%
```

Many audits are performed according to the *audit risk model
(ARM)*, which states that the uncertainty about the auditor’s
statement as a whole (1 - the confidence) is the product of three terms:
the inherent risk, the control risk, and the detection risk. Inherent
risk is the risk posed by an error in BuildIt’s financial statements that
could be material, before consideration of any related control systems
(e.g., computer systems). Control risk is the risk that a material
misstatement is not prevented or detected by BuildIt’s internal control
systems. Detection risk is the risk that the auditor will fail to find
material misstatements that exist in BuildIt’s financial statements.
The *ARM* is practically useful because, for a given level of
audit risk, the tolerable detection risk bears an inverse relation to
the other two risks. It therefore enables the auditor to incorporate
existing information on BuildIt’s organization to increase the
detection risk they can tolerate, that is, the risk that they will fail
to find a material misstatement.

\[ \text{Audit risk} = \text{Inherent risk} \,\times\, \text{Control risk} \,\times\, \text{Detection risk}\]

Usually the auditor judges inherent risk and control risk on a three-point scale consisting of low, medium, and high. Different audit firms use different standard percentages for these categories. The auditor’s firm defines the probabilities of low, medium, and high as 50%, 60%, and 100%, respectively. Because the auditor assessed BuildIt’s internal control systems and found them to be working properly, they assess the control risk as medium (60%).
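As a worked illustration, the audit risk model can be rearranged for the detection risk. The control risk of 60% comes from the assessment above; the inherent risk of 100% (high) is an assumption made here for illustration, since the text does not assess it explicitly.

\[ \text{Detection risk} = \frac{\text{Audit risk}}{\text{Inherent risk} \,\times\, \text{Control risk}} = \frac{0.05}{1 \,\times\, 0.6} \approx 0.083 \]

Under that assumption, the auditor could tolerate a detection risk of roughly 8.3% instead of 5%.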

The auditor can choose to either perform a frequentist analysis,
where they use the increased detection risk as their level of
uncertainty, or perform a Bayesian analysis, where they incorporate the
information about the control risk into a prior distribution. For this
example, we will show how to perform a Bayesian analysis. A frequentist
analysis can easily be done through the following functions by setting
`prior = FALSE`. In a frequentist analysis, the auditor uses
the value `c.adj` as their new value for `confidence`.

```
ir <- 1 # Inherent risk: assumed high (100%)
cr <- 0.6 # Control risk: medium (60%)
# Adjust the required confidence for a frequentist analysis.
c.adj <- 1 - ((1 - confidence) / (ir * cr))
```

However, in a Bayesian audit, the auditor starts at step 0 by
defining the prior distribution that corresponds to their assessment of
the control risk. Using the `auditPrior()` function, they can
create a prior distribution that incorporates the information in the
risk assessments from the *ARM*. For more information on how this
is done, see Derks et al. (2019).

```
# Step 0: Create a prior distribution according to the audit risk model.
prior <- auditPrior(
  method = "arm", likelihood = "poisson", expected = expected,
  materiality = materiality, ir = ir, cr = cr
)
```

The prior distribution can be shown by using the `plot()` function.
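To get a feel for what such a plot conveys without relying on the package’s plotting method, the prior can also be drawn with base R. The sketch below assumes the gamma(α = 2.325, β = 53) parameters that the planning summary in this vignette reports for the prior; the variable names are chosen here for illustration.

```r
# Prior parameters as reported in the planning summary of this vignette.
alpha <- 2.325
beta <- 53

# Mode of a gamma(alpha, beta) density (valid for alpha > 1).
prior_mode <- (alpha - 1) / beta

# Draw the prior density over a plausible range of misstatement.
curve(dgamma(x, shape = alpha, rate = beta),
  from = 0, to = 0.2,
  xlab = "Population misstatement", ylab = "Density"
)
abline(v = prior_mode, lty = 2) # mark the most likely value
```

The mode of this prior works out to 0.025, which is the expected misstatement of 2.5% specified by the auditor.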

Now that the prior distribution is specified, the auditor can
calculate the required sample size for their desired inference by using
the `planning()` function. They use the `prior`
object as input for the `planning()` function.

```
# Step 1: Calculate the required sample size.
stage1 <- planning(materiality = materiality, expected = expected, conf.level = confidence, prior = prior)
```

The auditor can then inspect the result from their planning procedure
by using the `summary()` function. The results show that,
given the prior distribution, the auditor needs to select a sample of
178 items so that, when at most 4.45 misstatements are found, they can
conclude with 95% confidence that the misstatement in BuildIt’s
financial statements is lower than the performance materiality of 5%.

```
##
## Bayesian Audit Sample Planning Summary
##
## Options:
## Confidence level: 0.95
## Materiality: 0.05
## Hypotheses: H₀: Θ > 0.05 vs. H₁: Θ < 0.05
## Expected: 0.025
## Likelihood: poisson
## Prior distribution: gamma(α = 2.325, β = 53)
##
## Results:
## Minimum sample size: 178
## Tolerable errors: 4.45
## Posterior distribution: gamma(α = 6.775, β = 231)
## Expected most likely error: 0.025
## Expected upper bound: 0.049981
## Expected precision: 0.024981
## Expected BF₁₀: 9.6614
```

The auditor can inspect how the prior distribution compares to the
expected posterior distribution by using the `plot()`
function. The expected posterior distribution is the posterior
distribution that would occur if the auditor actually observed a sample
of 178 items, of which 4.45 were misstated.
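The numbers in the planning summary can be traced by hand. The following is a sketch of the arithmetic, assuming the standard conjugate update in which a gamma(α, β) prior combined with k misstatements in n items under a Poisson likelihood gives a gamma(α + k, β + n) posterior; the symbols k and n are introduced here for illustration.

\[ k = 0.025 \times 178 = 4.45, \qquad \alpha + k = 2.325 + 4.45 = 6.775, \qquad \beta + n = 53 + 178 = 231 \]

\[ \text{Mode} = \frac{(\alpha + k) - 1}{\beta + n} = \frac{5.775}{231} = 0.025 \]

This reproduces the gamma(α = 6.775, β = 231) posterior and the expected most likely error of 0.025 from the summary, and it shows why the tolerable number of errors is fractional: it is the expected error rate applied to the sample size.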

The auditor is now ready to select the required 178 items from the
population. They can choose to do this according to one of two
statistical methods. In *record sampling*
(`units = "items"`), inclusion probabilities are assigned on
the item level, treating items with a high value and a low value the
same: an item of $5,000 is equally likely to be selected as an item of
$1,000. In *monetary unit sampling*
(`units = "values"`), inclusion probabilities are assigned on
the level of individual monetary units (e.g., a dollar). When a dollar
is selected to be in the sample, the item that includes that dollar is
selected. This favors items with a higher value, as an item with a value
of $5,000 is now five times more likely to be selected than an item with
a value of $1,000.
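The difference between the two methods can be illustrated with a small base R sketch. The three-item population below is hypothetical and is not taken from the BuildIt data; the probabilities refer to a single draw.

```r
# Hypothetical toy population of three items (not the BuildIt data).
book_values <- c(item_a = 1000, item_b = 5000, item_c = 4000)

# Record sampling: every item has the same chance on a single draw.
p_items <- rep(1 / length(book_values), length(book_values))
names(p_items) <- names(book_values)

# Monetary unit sampling: the chance is proportional to the book value.
p_values <- book_values / sum(book_values)

# Under monetary unit sampling, item_b is five times as likely as item_a.
ratio <- unname(p_values["item_b"] / p_values["item_a"])
ratio
```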

The auditor chooses to use *monetary unit sampling*, as they
want to include more high-valued items. The `selection()`
function enables them to select the sample from the population. They use
the `stage1` object as an input for the
`selection()` function, since this passes the calculated
sample size to the function.

```
# Step 2: Draw a sample from the financial statements.
stage2 <- selection(data = BuildIt, size = stage1, units = "values", values = "bookValue", method = "interval")
```

Like before, the auditor can inspect the outcomes of their sampling
procedure by using the `summary()` function.

```
##
## Audit Sample Selection Summary
##
## Options:
## Requested sample size: 178
## Sampling units: monetary units
## Method: fixed interval sampling
## Starting point: 1
##
## Data:
## Population size: 3500
## Population value: 1403221
## Selection interval: 7883.3
##
## Results:
## Selected sampling units: 178
## Proportion of value: 0.062843
## Selected items: 178
## Proportion of size: 0.050857
```
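The selection interval in the summary is the population value divided by the requested sample size. The base R sketch below checks that figure and then illustrates the idea of fixed interval selection on a hypothetical five-item population. It is an illustration of the technique under simple assumptions (a starting point of 1 and no special handling of items larger than the interval), not the package’s internal implementation.

```r
# Selection interval for BuildIt: population value / sample size.
interval_buildit <- 1403221 / 178

# Hypothetical toy population (not the BuildIt data).
book_values <- c(200, 50, 900, 350, 500)
n <- 2

# Select every interval-th monetary unit, beginning at the starting point.
interval <- sum(book_values) / n
start <- 1
selected_units <- start + (0:(n - 1)) * interval

# Map each selected monetary unit to the item that contains it.
upper_limits <- cumsum(book_values)
selected_items <- findInterval(selected_units, upper_limits, left.open = TRUE) + 1
selected_items
```

In the toy population, the first selected unit falls in item 1 and the second in item 3, the item with the largest book value.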

The selected sample can be isolated by indexing the
`sample` object from the sampling result.

Next, the auditor can execute the audit by annotating the items in
the sample with their audit values (for example, by writing the sample
to a *.csv* file using `write.csv()`). They can then
load the annotated sample back into the R session for further
evaluation. For this example, the audit values of the sample items are
already included in the `auditValue` column of the data
set.
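A round trip through a *.csv* file could look as follows. This is a minimal base R sketch; the two-row data frame is a hypothetical stand-in for the selected sample, and the temporary file path is used only so that the example leaves no files behind.

```r
# Hypothetical stand-in for the selected sample (two rows, for illustration).
sample_items <- data.frame(
  ID = c(82884, 25064),
  bookValue = c(242.61, 642.99)
)

# Write the sample to a temporary .csv file for annotation.
path <- tempfile(fileext = ".csv")
write.csv(sample_items, file = path, row.names = FALSE)

# ... the auditor adds an auditValue column to the file outside of R ...

# Read the file back into the session.
annotated <- read.csv(path)
```

Setting `row.names = FALSE` keeps R from writing an extra column of row numbers, so the file contains only the columns the auditor needs to annotate.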

Using the annotated sample, the auditor can perform inference about
the misstatement in the population via the `evaluation()`
function. When the `prior` object is passed to the function, it
automatically sets `method = "poisson"` to be consistent
with the prior distribution.

```
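# Isolate the selected (and annotated) sample from the selection result.
sample <- stage2[["sample"]]
# Step 4: Evaluate the sample.
stage4 <- evaluation(
  materiality = materiality, conf.level = confidence, data = sample,
  values = "bookValue", values.audit = "auditValue", prior = prior
)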
```

The auditor can inspect the outcomes of their inference by using the
`summary()` function. The resulting upper bound is 2.278%,
which is lower than the performance materiality of 5%. Therefore, the
auditor can conclude that there is a 95% probability that the
misstatement in BuildIt’s population is lower than 2.278%.

```
##
## Bayesian Audit Sample Evaluation Summary
##
## Options:
## Confidence level: 0.95
## Materiality: 0.05
## Hypotheses: H₀: Θ > 0.05 vs. H₁: Θ < 0.05
## Method: poisson
## Prior distribution: gamma(α = 2.325, β = 53)
##
## Data:
## Sample size: 178
## Number of errors: 0
## Sum of taints: 0
##
## Results:
## Posterior distribution: gamma(α = 2.325, β = 231)
## Most likely error: 0.0057359
## 95 percent credible interval: [0, 0.022781]
## Precision: 0.017045
## BF₁₀: 2179.8
```
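The key figures in this summary can be cross-checked with base R. The sketch below assumes the standard conjugate update, in which the sum of taints is added to the prior’s α and the sample size to its β; the variable names are chosen here for illustration.

```r
# Prior parameters and observed data, as reported in the summary above.
alpha_prior <- 2.325
beta_prior <- 53
n <- 178
sum_taints <- 0

# Conjugate gamma update for a Poisson likelihood.
alpha_post <- alpha_prior + sum_taints
beta_post <- beta_prior + n

# Most likely error (posterior mode) and 95% upper credible bound.
most_likely_error <- (alpha_post - 1) / beta_post
upper_bound <- qgamma(0.95, shape = alpha_post, rate = beta_post)

# Precision: distance between the upper bound and the most likely error.
precision <- upper_bound - most_likely_error
```

Because no misstatements were found, α is unchanged and only β grows, which pulls the posterior toward zero relative to the prior.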

They can inspect the prior and posterior distributions by using the
`plot()` function.

Since the 95% upper credible bound on the misstatement in the population
is lower than the performance materiality, the auditor has obtained
sufficient evidence to conclude that the population does not contain
material misstatements. The auditor can create an `html` or
`pdf` report of the statistical results using the
`report()` function.