Skip to contents

An S3 generic function that can be used on database queries from Aurum or GOLD extracts. Combine a database query with a cohort, only retaining observations between time_prev days prior to indexdt, and time_post days after indexdt, and for test data with values between lower_bound and upper_bound. The most recent numobs observations will be returned. cohort must contain variables patid and indexdt. The type of query must be specified for appropriate data manipulation. Input type = med if interested in medical diagnoses from the observation file, and type = test if interseted in test data from the observation file.

Usage

combine_query(
  db_query,
  cohort,
  query_type = c("med", "drug", "test", "hes_primary", "death"),
  time_prev = Inf,
  time_post = Inf,
  lower_bound = -Inf,
  upper_bound = Inf,
  numobs = 1,
  value_na_rm = TRUE,
  earliest_values = FALSE,
  reduce_output = TRUE
)

Arguments

db_query

Output from database query (ideally obtained through db_query).

cohort

Cohort to combine with the database query.

query_type

Type of query

time_prev

Number of days prior to index date to look for codes.

time_post

Number of days after index date to look for codes.

lower_bound

Lower bound for returned values when query_type = "test".

upper_bound

Upper bound for returned values when query_type = "test".

numobs

Number of observations to be returned.

value_na_rm

If TRUE will remove data with NA in the value column of the queried data and remove values outside of lower_bound and upper_bound when query_type = "test".

earliest_values

If TRUE will return the earliest values as opposed to most recent.

reduce_output

If TRUE will reduce output to just patid, event date, medical/product code, and test value.

Value

A data.table with observations that meet specified criteria.

Details

value_na_rm = FALSE may be of use when extracting variables like smoking status, where we want test data for number of cigarettes per day, but do not want to remove all observations with NA in the value column, because the medcodeid itself may indicate smoking status.

Examples

## Create connection to a temporary database
aurum_extract <- connect_database(file.path(tempdir(), "temp.sqlite"))

## Add observation data from all observation files in specified directory
cprd_extract(db = aurum_extract,
filepath = system.file("aurum_data", package = "rcprd"),
filetype = "observation")
#> 
  |                                                                            
  |                                                                      |   0%
#> Adding /home/runner/work/_temp/Library/rcprd/aurum_data/aurum_allpatid_set1_extract_observation_001.txt 2024-11-15 11:32:14.392743
#> 
  |                                                                            
  |=======================                                               |  33%
#> Adding /home/runner/work/_temp/Library/rcprd/aurum_data/aurum_allpatid_set1_extract_observation_002.txt 2024-11-15 11:32:14.429381
#> 
  |                                                                            
  |===============================================                       |  67%
#> Adding /home/runner/work/_temp/Library/rcprd/aurum_data/aurum_allpatid_set1_extract_observation_003.txt 2024-11-15 11:32:14.444564
#> 
  |                                                                            
  |======================================================================| 100%

## Query database for a specific medcode
db_query <- db_query(db_open = aurum_extract,
tab ="observation",
codelist_vector = "187341000000114")

## Define cohort
pat<-extract_cohort(filepath = system.file("aurum_data", package = "rcprd"))

### Add an index date to pat
pat$indexdt <- as.Date("01/01/2020", format = "%d/%m/%Y")

## Combine query with cohort retaining most recent three records
combine_query(cohort = pat,
db_query = db_query,
query_type = "med",
numobs = 3)
#>     patid       medcodeid obsdate
#>    <char>          <char>   <num>
#> 1:      1 187341000000114   -5373
#> 2:      2 187341000000114   -5769
#> 3:      6 187341000000114  -14727

## clean up
RSQLite::dbDisconnect(aurum_extract)
unlink(file.path(tempdir(), "temp.sqlite"))