CRDW Dataset Availability Guide

This guide outlines commonly requested datasets from the OU Health Clinical Research Data Warehouse (CRDW) and data availability across three major electronic health record systems: Epic, Meditech, and Centricity/GECB.

Each entry below lists the dataset type, its grain (one row per), comments/typical workflow, and availability in Epic, Meditech, and Centricity/GECB.

Patient/Demographics*
  Grain (one row per): Patient
  Comments/Typical Workflow: Includes most recently updated values for patient details and demographics across all systems. Narrowed down by inclusion criteria. The most commonly requested inclusion criteria are based on:
    • Diagnoses
    • Age
    • Procedures
    • Visit at specific location in date range
  Can be added to demographics upon request: RUCA codes, insurance (most recent available)
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Visit/Encounter
  Grain (one row per): Visit
  Comments/Typical Workflow: Includes details on encounters, such as admission/visit start and end datetimes, discharge disposition, length of stay, reason for visit, etc. Encounters can be linked to or used to limit values from other tables (Meditech accounts, Epic encounter keys).
  Epic: ✓ Yes (all encounters, scheduled or otherwise)
  Meditech: ✓ Yes (visits and admissions occurring at the hospital)
  Centricity/GECB: ✓ Yes (scheduled visits, often limited to 'Arrived')

Diagnosis*
  Grain (one row per): Patient diagnosis
  Comments/Typical Workflow: Details about documented diagnoses, such as start and end dates. Typically one row per diagnosis per patient. We can also create one row per set of comorbidities per patient using the Elixhauser or Charlson comorbidity lists.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Medication*+
  Grain (one row per): Med administration
  Comments/Typical Workflow: Details about medications administered by the hospital or prescribed by clinics. No data are available on prescriptions filled at outside providers. No data are currently available on at-home medication lists.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Document*+
  Grain (one row per): Document
  Comments/Typical Workflow: Includes documents, reports, and notes. Can be filtered based on the presence of a specific keyword/key phrase. Can be limited by document type. The most commonly requested document types include progress notes, HPI, operative notes, and discharge summaries. Because of document size, this sometimes requires us to develop a REDCap project or an extra script to write individual documents to a shared server location.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Observation*+
  Grain (one row per): Observation
  Comments/Typical Workflow: Includes values that do not fit into other tables, collected in Epic via 'flowsheet'. The most commonly requested include vitals (blood pressure, heart rate, BMI, height, weight) and scores on standardized measures (e.g., PHQ-9, AUDIT, Glasgow Coma Scale, MoCA).
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Laboratory Value*+
  Grain (one row per): Lab value
  Comments/Typical Workflow: One row per lab value. In Epic, it may be possible to limit to sets. The most common requests include analytes from CBC, CMP, and urinalysis.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✓ Yes

Image Impression*+
  Grain (one row per): Image impression
  Comments/Typical Workflow: Impression from an image. At this time, we do not have access to the images themselves, but they should be available in the PACS system.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✗ No

Procedure/Order*
  Grain (one row per): Procedure/order
  Comments/Typical Workflow: Details about dates, attending provider, etc. related to orders and/or procedures. Typically limited based on CPT code.
  Epic: ✓ Yes
  Meditech: ✓ Yes (appears in billing)
  Centricity/GECB: ✓ Yes

Birth/Delivery
  Grain (one row per): Birth/baby
  Comments/Typical Workflow: Details about births and deliveries performed at OU.
  Epic: ✓ Yes
  Meditech: ✓ Yes
  Centricity/GECB: ✗ No
Legend: * May require metadata files ("sweep and specify") | + If a data request crosses 2023-06-03, multiple metadata files per table may be needed
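
When scoping a request across systems, the availability listed above can be encoded as a small lookup. The following Python sketch is illustrative only: the `AVAILABILITY` layout and the `systems_with` helper are hypothetical names, not part of any CRDW tooling, and the Yes/No values are transcribed from the entries above.

```python
# Source systems, in the order used by the guide.
SYSTEMS = ("Epic", "Meditech", "Centricity/GECB")

# Availability flags transcribed from the guide (True = available).
# The data structure itself is an illustrative assumption.
AVAILABILITY = {
    "Patient/Demographics": (True, True, True),
    "Visit/Encounter": (True, True, True),
    "Diagnosis": (True, True, True),
    "Medication": (True, True, True),
    "Document": (True, True, True),
    "Observation": (True, True, True),
    "Laboratory Value": (True, True, True),
    "Image Impression": (True, True, False),
    "Procedure/Order": (True, True, True),
    "Birth/Delivery": (True, True, False),
}


def systems_with(dataset_type):
    """Return the source systems that provide the given dataset type."""
    flags = AVAILABILITY[dataset_type]
    return [system for system, ok in zip(SYSTEMS, flags) if ok]


print(systems_with("Image Impression"))  # ['Epic', 'Meditech']
```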

Important Notes

Metadata Files: Some datasets require the development of metadata files, a process we refer to as "sweep and specify": we use key terms, ranges, or other information to assemble the available values that match the study team's request (e.g., ICD codes matching specific keywords, medications in a specific class). Study teams are expected to review and return these files so we can apply them to filter datasets.
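
As a rough illustration of the "sweep" step, the Python sketch below collects every ICD code whose description contains a key term. The three codes are a tiny illustrative sample, and the `sweep` function and its field names are hypothetical rather than the CRDW team's actual implementation.

```python
def sweep(code_descriptions, key_terms):
    """Return one row per code whose description contains any key term.

    Matching is case-insensitive substring matching (an assumption).
    """
    terms = [term.lower() for term in key_terms]
    rows = []
    for code, description in code_descriptions.items():
        if any(term in description.lower() for term in terms):
            # 'desired' is left blank for the study team to fill in on review.
            rows.append({"code": code, "description": description, "desired": ""})
    return rows


# Illustrative sample only; a real sweep would run against warehouse data.
icd10_sample = {
    "E11.9": "Type 2 diabetes mellitus without complications",
    "E10.9": "Type 1 diabetes mellitus without complications",
    "I10": "Essential (primary) hypertension",
}

candidates = sweep(icd10_sample, ["diabetes"])
print([row["code"] for row in candidates])  # ['E11.9', 'E10.9']
```

The sweep deliberately returns every match, including values the study may not want, so that the study team makes the inclusion decision during review.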

Data Source Transition: If a data request crosses 2023-06-03, study teams may need to validate multiple metadata files per table (one per source).
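
To make the transition rule concrete, the Python sketch below splits a requested date range at 2023-06-03 and shows how many per-source segments result. It is illustrative only: the function name is hypothetical, and it assumes that dates on or after 2023-06-03 belong to the newer source, which this guide does not specify.

```python
from datetime import date, timedelta

TRANSITION = date(2023, 6, 3)  # transition date stated in this guide


def split_at_transition(start, end):
    """Split the inclusive range [start, end] into per-source segments.

    Assumption: dates on or after TRANSITION belong to the newer source.
    """
    if start > end:
        raise ValueError("start must not be after end")
    if end < TRANSITION or start >= TRANSITION:
        return [(start, end)]  # entirely within one source
    return [(start, TRANSITION - timedelta(days=1)), (TRANSITION, end)]


segments = split_at_transition(date(2022, 1, 1), date(2024, 12, 31))
print(len(segments))  # 2 -> one metadata file per table for each segment
```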

Metadata File ("Sweep and Specify") Workflow

  1. Research team provides: inclusion criteria, key terms, date ranges
  2. Data team creates: ss-*.csv files with ALL matching values
  3. Research team reviews: marks each value desired=TRUE/FALSE
  4. Data team applies: filters and generates analysis-ready datasets
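
Steps 3 and 4 can be sketched as follows: read a reviewed metadata file and keep only the dataset rows whose value was marked desired=TRUE. This is a minimal Python sketch using only the standard library. The column names (`code`, `description`, `desired`), the in-memory CSV, and the sample rows are assumptions for illustration, not the actual ss-*.csv layout.

```python
import csv
import io

# Stand-in for a reviewed ss-*.csv file (layout assumed for illustration).
reviewed_csv = io.StringIO(
    "code,description,desired\n"
    "E11.9,Type 2 diabetes mellitus without complications,TRUE\n"
    "E10.9,Type 1 diabetes mellitus without complications,FALSE\n"
)


def desired_values(handle, key="code"):
    """Return the set of key values the study team marked desired=TRUE."""
    return {
        row[key]
        for row in csv.DictReader(handle)
        if row["desired"].strip().upper() == "TRUE"
    }


# Hypothetical diagnosis rows to be filtered.
diagnoses = [
    {"patient_id": 1, "code": "E11.9"},
    {"patient_id": 2, "code": "E10.9"},
    {"patient_id": 3, "code": "E11.9"},
]

keep = desired_values(reviewed_csv)
filtered = [row for row in diagnoses if row["code"] in keep]
print([row["patient_id"] for row in filtered])  # [1, 3]
```

Normalizing the `desired` column with `strip().upper()` tolerates small review-time variations such as `true` or trailing spaces; anything other than TRUE is treated as not desired.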

Typical Project Timeline

Timeline varies with project complexity, team priorities, deadlines, etc. Alert CRDW of any critical upcoming deadlines related to your project.

  • Week 1: Initial consult, define inclusion criteria
  • Weeks 2-3: CRDW team creates metadata files and sends them to the study team
  • Weeks 4-5: Study team reviews metadata (CRITICAL: delays at this step delay the rest of the CRDW data pull)
  • Week 6: Data team generates cohort
  • Week 7+: Iterative refinement as needed

Total: 6-8 weeks from initiation to analysis-ready dataset

Last updated: February 4, 2025