Algorithm auditing: Get started

Introduction

Welcome to the ‘Algorithm auditing’ vignette of the jfa package. This page gives an overview of the functions in the package that facilitate the auditing of algorithms and predictive models. For a detailed explanation of each function, see the other vignettes on the package website.

Functions and intended usage

Below you can find an explanation of the available algorithm auditing functions in jfa.

Assess algorithmic fairness with model_fairness()

The model_fairness() function assesses fairness in algorithmic decision-making systems. Given a set of true labels and the predictions of an algorithm, it computes one of several model-agnostic fairness metrics for each protected class and tests whether that metric is equal across classes. The ratio of a metric for an unprivileged protected class to the same metric for the privileged protected class is called parity, and it quantifies relative fairness in the algorithm's predictions. Available parity metrics include predictive rate parity, proportional parity, accuracy parity, false negative rate parity, false positive rate parity, true positive rate parity, negative predicted value parity, specificity parity, and demographic parity. The function returns an object of class jfaFairness that can be used with the associated summary() and plot() methods.
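To make the idea of parity concrete, the following is a minimal base R sketch that computes a predictive rate parity by hand. The toy data frame, the precision() helper, and the group labels are hypothetical, and the sketch assumes that the predictive rate is the precision, TP / (TP + FP). It illustrates the ratio described above and is not the package's implementation.

```r
# Hypothetical toy data: one row per person, group "A" is treated as privileged.
toy <- data.frame(
  group     = c(rep("A", 6), rep("B", 6)),
  truth     = c("yes", "yes", "yes", "no", "no", "no",
                "yes", "yes", "yes", "no", "no", "no"),
  predicted = c("yes", "yes", "yes", "yes", "no", "no",
                "yes", "yes", "yes", "yes", "yes", "no"),
  stringsAsFactors = FALSE
)

# Precision within one group: TP / (TP + FP)
# (assumed here to be the metric behind predictive rate parity).
precision <- function(df, positive = "yes") {
  flagged <- df$predicted == positive
  sum(flagged & df$truth == positive) / sum(flagged)
}

# Metric per protected class: A = 0.75, B = 0.60
by_group <- sapply(split(toy, toy$group), precision)

# Parity: each group's metric divided by the privileged group's metric,
# so the privileged group is 1 by construction and B is about 0.8.
parity <- by_group / by_group[["A"]]
print(parity)
```

A parity of 1 means the metric is identical in both groups. Values further from 1 indicate a larger relative difference in the algorithm's predictions between the groups.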

Full function with default arguments:

model_fairness(
  data,
  protected,
  target,
  predictions,
  privileged = NULL,
  positive = NULL,
  metric = c(
    "prp", "pp", "ap", "fnrp", "fprp",
    "tprp", "npvp", "sp", "dp"
  ),
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  prior = FALSE
)
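The vector-valued defaults for metric and alternative follow a common R idiom in which the first element of the vector acts as the default and any other value must be one of the listed choices. A minimal sketch of that idiom with base R's match.arg() is shown below. The pick_metric() helper is hypothetical, and whether model_fairness() resolves its arguments in exactly this way is an assumption.

```r
# Hypothetical helper illustrating the "vector of choices" default idiom.
pick_metric <- function(metric = c("prp", "pp", "ap")) {
  # match.arg() returns the first choice when the argument is not supplied
  # and raises an error for values outside the listed choices.
  match.arg(metric)
}

default_choice  <- pick_metric()      # first element of the choices
explicit_choice <- pick_metric("ap")  # a value supplied by the caller

# An unknown value is rejected with an error.
invalid <- tryCatch(pick_metric("xyz"), error = function(e) "error")
```

Under this idiom, omitting metric would select "prp" and omitting alternative would select "two.sided".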

Example usage:

model_fairness(
  data = compas,
  protected = "Ethnicity",
  target = "TwoYrRecidivism",
  predictions = "Predicted",
  privileged = "Caucasian",
  positive = "yes",
  metric = "prp"
)
##
##  Classical Algorithmic Fairness Test
##
## data: compas
## n = 6172, X-squared = 18.799, df = 5, p-value = 0.002095
## alternative hypothesis: fairness metrics are not equal across groups
##
## sample estimates:
##   African_American: 1.1522 [1.1143, 1.1891], p-value = 5.4523e-05
##   Asian: 0.86598 [0.11706, 1.6149], p-value = 1
##   Hispanic: 1.0229 [0.87836, 1.1611], p-value = 0.78393
##   Native_American: 1.0392 [0.25396, 1.6406], p-value = 1
##   Other: 1.0596 [0.86578, 1.2394], p-value = 0.5621
## alternative hypothesis: true odds ratio is not equal to 1
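One way to read the output above is sketched below in base R. The numbers are transcribed by hand from the printed output rather than extracted from the returned object, and the est data frame and its column names are hypothetical. The sketch assumes that a parity of 1 corresponds to equal metrics, so an interval that excludes 1 points to a group whose metric differs from that of the privileged group, and that the reported overall p-value is the upper tail of a chi-squared distribution.

```r
# Estimates and interval bounds transcribed from the printed output above
# (parity relative to the privileged group "Caucasian").
est <- data.frame(
  group  = c("African_American", "Asian", "Hispanic",
             "Native_American", "Other"),
  parity = c(1.1522, 0.86598, 1.0229, 1.0392, 1.0596),
  lower  = c(1.1143, 0.11706, 0.87836, 0.25396, 0.86578),
  upper  = c(1.1891, 1.6149, 1.1611, 1.6406, 1.2394),
  stringsAsFactors = FALSE
)

# Flag the groups whose interval does not contain 1.
est$excludes_one <- est$lower > 1 | est$upper < 1
flagged <- est$group[est$excludes_one]
print(flagged)

# Upper-tail chi-squared probability for the reported statistic
# (X-squared = 18.799, df = 5); it should be close to the printed 0.002095.
p_value <- pchisq(18.799, df = 5, lower.tail = FALSE)
print(p_value)
```

In this output only the African_American interval excludes 1, which agrees with it being the only group with a small p-value.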

Benchmarks

To validate the statistical results, jfa’s automated unit tests regularly verify the main output from the package against the following benchmarks:

References

• Büyük, S. (2023). Automatic Fairness Criteria and Fair Model Selection for Critical ML Tasks. Master's thesis, Utrecht University.
• Friedler, S. A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E. P., & Roth, D. (2019). A comparative study of fairness-enhancing interventions in machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency.
• Pessach, D. & Shmueli, E. (2022). A review on fairness in machine learning. ACM Computing Surveys, 55(3), 1-44.