---
title: "Codelist diagnostics"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{a02_CodelistDiagnostics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE,
  fig.width = 7
)

library(CDMConnector)
if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = tempdir())
if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER"))
if (!eunomia_is_available()) downloadEunomiaData(datasetName = "synpuf-1k")
```

## Introduction

In this example we're going to summarise the characteristics of individuals with an ankle sprain, ankle fracture, forearm fracture, or a hip fracture using the Eunomia synthetic data.

We'll begin by creating our study cohorts.

```{r}
library(CDMConnector)
library(CohortConstructor)
library(CodelistGenerator)
library(PhenotypeR)
library(dplyr)
library(ggplot2)

con <- DBI::dbConnect(duckdb::duckdb(),
                      CDMConnector::eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con,
                                cdmName = "Eunomia Synpuf",
                                cdmSchema = "main",
                                writeSchema = "main",
                                achillesSchema = "main")

cdm$injuries <- conceptCohort(cdm = cdm,
  conceptSet = list(
    "ankle_sprain" = 81151,
    "ankle_fracture" = 4059173,
    "forearm_fracture" = 4278672,
    "hip_fracture" = 4230399
  ),
  name = "injuries")
cdm$injuries |>
  glimpse()
```

## Summarising code use

To get a good understanding of the codes we've used to define our cohorts we can use the `codelistDiagnostics()` function.

```{r}
code_diag <- codelistDiagnostics(cdm$injuries)
```

From our results we can see a table of counts of our codes in our database based on the achilles results.

```{r}
tableAchillesCodeUse(code_diag)
```

And we can also see orphan codes, which are codes that we did not include in our cohort definition but maybe could have.
These are codes being used in the database that are associated with codes we included in our definition. So although many can be false positives, we may identify some codes that we may want to add to our cohort definitions.

```{r}
tableOrphanCodes(code_diag)
```
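
To make the idea of an orphan code concrete, here is a toy sketch in base R. Every concept ID and count in it is invented, and the `related` and `code_use` tables are hypothetical stand-ins rather than anything taken from Eunomia. It is not how `codelistDiagnostics()` works internally. It only shows the underlying logic: keep codes that are linked to an included code and recorded in the data, but that are missing from the codelist.

```{r}
# Toy illustration only: all IDs and counts below are invented.
codelist <- c(1, 2)  # codes included in the cohort definition

# Hypothetical lookup linking each included code to associated codes
related <- data.frame(
  included_code = c(1, 1, 2, 2),
  related_code  = c(2, 10, 11, 12)
)

# Hypothetical record counts for codes observed in the database
code_use <- data.frame(
  concept_id   = c(1, 2, 10, 12),
  record_count = c(50, 30, 8, 3)
)

# Orphan candidates: associated with an included code but not in the codelist
candidates <- setdiff(related$related_code, codelist)

# Keep only candidates that are actually used in the data
orphans <- code_use[code_use$concept_id %in% candidates, ]
orphans
```

In this toy data, code 11 is associated with an included code but never recorded, so it is not reported. Only associated codes that are in use are worth reviewing as possible additions to a cohort definition.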