Building Your CDM

Building your CDM takes careful planning and execution, and we are here to help. Successful use of an observational data network requires a collaborative, interdisciplinary approach that includes:

  • Local knowledge of the source data: the underlying data capture process and its role in the healthcare system
  • Clinical understanding of medical products and disease
  • Domain expertise in the analytical use cases: epidemiology, pharmacovigilance, health economics and outcomes research
  • Command of advanced statistical techniques for large-scale modeling and exploratory analysis
  • Informatics experience with ontology management and leveraging standard terminologies for analysis
  • Technical/programming skills to implement the design and develop a scalable solution

Getting Started

Ready to get started on the conversion (ETL) process? Here are some recommended steps:

  1. Train on OMOP CDM and Vocabulary
  2. Discuss analysis opportunities (Why are we doing this? What do you want to be able to do once the CDM is done?)
  3. Evaluate technology requirements and infrastructure
  4. Discuss the data dictionary and documentation for the raw database
  5. Perform a systematic scan of the raw database
  6. Draft business logic
    a. Table level
    b. Variable level
    c. Value level (mapping)
    d. Document what will be lost (not captured) in the transformation
  7. Create a data sample to allow initial development

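Steps 5 and 6 can be illustrated with a small sketch. It is not an OHDSI tool and not a prescribed method; it only shows the idea of scanning a raw column for the values it actually contains, mapping each value to a concept, and recording what the mapping loses. The table `raw_patient`, the column `sex`, and the helper names are hypothetical. The concept IDs 8507 (male) and 8532 (female) are OMOP standard gender concepts, and 0 is the OMOP convention for "no matching concept".

```python
import sqlite3
from collections import Counter

# Illustrative value-level mapping for a hypothetical "sex" column.
# 8507 (male) and 8532 (female) are OMOP standard gender concepts;
# 0 is the OMOP convention for "no matching concept".
SEX_TO_CONCEPT = {"M": 8507, "F": 8532}
UNMAPPED_CONCEPT_ID = 0


def scan_column(conn, table, column):
    """Return (total_rows, null_count, Counter of non-null values) for one column."""
    # Identifiers are quoted, but this sketch assumes trusted table/column names.
    rows = conn.execute(f'SELECT "{column}" FROM "{table}"').fetchall()
    values = [r[0] for r in rows]
    non_null = [v for v in values if v is not None]
    return len(values), len(values) - len(non_null), Counter(non_null)


def map_values(value_counts, mapping):
    """Assign a concept ID to every scanned value and tally what is lost."""
    mapped, lost = {}, {}
    for value, count in value_counts.items():
        if value in mapping:
            mapped[value] = mapping[value]
        else:
            mapped[value] = UNMAPPED_CONCEPT_ID
            lost[value] = count  # step 6d: record what the ETL will not capture
    return mapped, lost


# Hypothetical raw data, standing in for a sample of the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_patient (patient_id INTEGER, sex TEXT)")
conn.executemany(
    "INSERT INTO raw_patient VALUES (?, ?)",
    [(1, "M"), (2, "F"), (3, "F"), (4, "U"), (5, None)],
)

total, nulls, value_counts = scan_column(conn, "raw_patient", "sex")
mapped, lost = map_values(value_counts, SEX_TO_CONCEPT)

print(f"rows={total} nulls={nulls} values={dict(value_counts)}")
print(f"mapped={mapped} lost={lost}")
```

Scanning before mapping matters because the data dictionary may not list every value that occurs in practice; here the undocumented value "U" only surfaces because the column was profiled, and it ends up in the `lost` tally rather than disappearing silently.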
Helpful Hints

Having gone through the ETL process with several databases over the past few years, we know that there will be obstacles to overcome and challenges to solve. Here are some helpful hints and lessons learned from the OHDSI collaborative:

  • A successful ETL requires a village; don’t make one person try to be the hero and do it all themselves
    o  Team design
    o  Team implementation
    o  Team testing
  • Document early and often; the more detail, the better
  • Data quality checking is required at every step of the process
  • Don’t make assumptions about source data based on documentation; verify by looking at the data
  • Good design and comprehensive specifications should prevent unnecessary iterations and thrash during implementation
  • ETL design/documentation/implementation is a living process. It will never be done, and it can always be better, but don’t let the perfect be the enemy of the good
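As one way to act on the hints about data quality checking and verifying against the data, the sketch below runs a few checks that compare a raw table with its converted counterpart. It is a minimal illustration under stated assumptions, not an OHDSI tool: `raw_patient` and its columns are hypothetical, `run_checks` is a made-up helper, and the `person` table is reduced to three of its CDM fields. Each check is a SQL query that returns the number of violating rows, so zero means the check passed.

```python
import sqlite3


def run_checks(conn, checks):
    """Run each (name, sql) pair; the SQL must return one count of violating rows."""
    results = {}
    for name, sql in checks:
        (violations,) = conn.execute(sql).fetchone()
        results[name] = violations
    return results


# Hypothetical source table and a simplified CDM person table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_patient (patient_id INTEGER, sex TEXT, birth_year INTEGER);
INSERT INTO raw_patient VALUES (1, 'M', 1980), (2, 'F', 1975), (3, 'F', NULL);
CREATE TABLE person (person_id INTEGER, gender_concept_id INTEGER, year_of_birth INTEGER);
INSERT INTO person VALUES (1, 8507, 1980), (2, 8532, 1975);
""")

checks = [
    # Did every source patient arrive in the target table?
    ("source patients missing from person",
     "SELECT COUNT(*) FROM raw_patient r "
     "LEFT JOIN person p ON p.person_id = r.patient_id "
     "WHERE p.person_id IS NULL"),
    # Are required fields populated?
    ("person rows without year_of_birth",
     "SELECT COUNT(*) FROM person WHERE year_of_birth IS NULL"),
    # Is the key unique?
    ("duplicate person_id",
     "SELECT COUNT(*) FROM (SELECT person_id FROM person "
     "GROUP BY person_id HAVING COUNT(*) > 1)"),
]

check_results = run_checks(conn, checks)
for name, violations in check_results.items():
    status = "OK" if violations == 0 else f"{violations} violation(s)"
    print(f"{name}: {status}")
```

Keeping the checks as plain data makes it easy to rerun the same list after every ETL iteration and to add a new check whenever a problem is found. In this example the first check flags the patient that was dropped during conversion, which is the kind of loss that should either be fixed or written into the ETL documentation.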

For more information, check out the documentation on our wiki page:

And remember, the OHDSI community is here to help!  Contact us at