Can you discuss your career path around informatics and technology, and what your role is at Johns Hopkins University?
Looking back, I did not imagine how far my career path would go. I started as a Registered Nurse in critical care before making a transition into Health IT. My first position in information systems was working on a large-scale Department of Defense deployment that sought clinical professionals to support end-users. I discovered an affinity for the work, so I quickly moved to a development team where I learned a little MUMPS. A short time after, I was recruited to work as a Product Manager for the Cerner Corporation, where I was the company’s second Nurse hire.
Over the course of my career, I’ve had the great fortune to participate in many firsts in informatics: the first conversion of VT terminal functionality over to a GUI for DoD physicians, the first implementation of the first commercial integrated data repository, and the first deployment of SNOMED CT by a major EMR vendor. I also developed the first terminology model for patient assessment scales using SNOMED CT. I led the first data standards group supporting the first CTSAs, participated in the first federated implementation of i2b2, and helped build the first system linking all 5 University of California health systems’ EMR data for translational research. More recently, I helped develop the first data transformation pipeline harmonizing observational data from 4+ common data models onto OMOP. Other notable contributions include foundations for much derivative work: co-author of the first HL7 Continuity of Care Document specification; co-author of the first Common Terminology Services 2 FSM; contributor to chapters in Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials; and participant in the Data Ingestion and Harmonization team for the largest HIPAA-compliant research repository in history: The National COVID Cohort Collaborative (N3C). I’ve consistently been able to find teams that were achieving cutting-edge advances in informatics. I’ve also been fortunate in that these talented people brought our timely visions to fruition.
Today my role at Johns Hopkins University is focused on bringing clinical terminology management, standards development and implementation expertise where it is most needed by our research group. I provide support to data harmonization and interoperability activities of the Biomedical Informatics and Data Science section of the Department of Internal Medicine under the direction of Dr. Christopher Chute, Chief Research Information Officer, Johns Hopkins Schools of Medicine, Public Health, and Nursing. We support local, regional, national and international human and public health research, primarily through data alignment/harmonization services and infrastructure development.
Thinking about my future career path, the arrival of AI supporting clinical research is the most promising development since the dawn of the digital age in health care. The possibilities from harnessing these new technologies are super exciting! I remember in the early 2000s, I presented at the AHIMA national meeting with a message about how transformational the coming digital revolution would be to the Medical Records profession. We received mixed reviews. This feels like a similar inflection point. If you fail to explore utilization of AI in your research practice, you may fall behind.
OHDSI and HL7 agreed on a formal partnership two years ago to align their standards to capture data from a single CDM. As an inaugural co-chair of the HL7 Terminology Services Management group, can you discuss why the OMOP CDM was the right partner to connect with HL7, and what impact this collaboration can have?
The OHDSI community has experienced rapid adoption of its work worldwide. Among the nascent assertions first envisioned for Learning Health Systems is a virtuous cycle where data generated as a by-product of care delivery supports analyses producing real-world evidence. OHDSI delivers on this “great promise” of Learning Health Systems through its research findings that inform improvements in public health and clinical practices via a highly collaborative community using open-source systems. HL7 has been the world’s preeminent health data standards developer for decades. The FHIR product family’s 12+ year history of rapid adoption clearly indicates it’s the world’s most important interoperability platform for health data in our time.
The agreement between the OHDSI and FHIR communities to work together at this point in the evolution of informatics makes complete sense. As a clinical terminologist and HL7 TSMG member, I saw enormous opportunity in an OMOP + FHIR partnership to accelerate adoption of research methods generated by the OHDSI community. The OMOP + FHIR partnership can couple FHIR’s contemporary, API-based interoperability approaches with OHDSI’s advanced analytic methods. The complementary strengths of each community will advance interoperable utilization of clinical research data science methods using real world data. This, in turn, sets the stage for appropriate and ethical use of AI applied to improvements in human health at our doorstep. Without the OHDSI + HL7 partnership, widespread capabilities to use AI for greater good might emerge much more slowly.
As the COVID pandemic still looms large in our rear-view, a sense of urgency to generate and apply advanced scientific methods to public health issues remains palpable to me. I believe the OMOP + FHIR partnership provides a focused, high-ROI opportunity to create relevant informatics advances with broad impact that will allow us to better address immediate and future health crises, both local and global.
Davera Gabriel joins Dr. Dan Cooper and Dr. Michael Kahn at the Pediatric Exercise Data Working Group in July 2012.
You co-lead the FHIR and OMOP workgroup. For those who don’t know as much about FHIR, can you explain how it relates to OMOP, and what are some of the critical objectives for your team?
FHIR is a health data interoperability standard that leverages an Application Programming Interface (API). FHIR APIs permit organizations to exchange data without the programming overhead of sharing everything relevant about the data as it resides in other systems, such as what format it’s in or details about the database where it resides. Resources on a FHIR API are configured to contain all the requisite information to act on information in other systems as components of the transaction itself.
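To illustrate the point about resources carrying their own context, here is a minimal sketch in Python. The payload is a hand-written example Observation, not output from any particular server, and `summarize_observation` is a hypothetical helper name. It shows that the code system, the unit system, and the subject reference all travel inside the resource itself, so the receiver needs no knowledge of the sender's database.

```python
import json

# Hand-written example payload (an assumption for illustration); a FHIR server
# would typically return something of this shape from GET [base]/Observation/[id].
observation_json = """
{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [
      {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
    ]
  },
  "subject": {"reference": "Patient/example"},
  "effectiveDateTime": "2023-10-20T09:30:00Z",
  "valueQuantity": {
    "value": 72,
    "unit": "beats/minute",
    "system": "http://unitsofmeasure.org",
    "code": "/min"
  }
}
"""


def summarize_observation(resource: dict) -> dict:
    """Pull out the self-describing parts of a quantity-valued Observation."""
    if resource.get("resourceType") != "Observation":
        raise ValueError("expected an Observation resource")
    coding = resource["code"]["coding"][0]   # first coding only, for brevity
    quantity = resource["valueQuantity"]     # assumes a quantity-valued result
    return {
        "code_system": coding["system"],     # which terminology the code is from
        "code": coding["code"],
        "value": quantity["value"],
        "unit_system": quantity["system"],   # which unit vocabulary is used
        "unit_code": quantity["code"],
        "subject": resource["subject"]["reference"],
    }


summary = summarize_observation(json.loads(observation_json))
print(summary)
```

Real resources can carry several codings and other value types, so production code would need to handle more cases than this sketch does.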
The FHIR core specification has evolved to become a “universal translator” for health data that can be constrained or extended to meet the business requirements of any health data exchange. Profiles specifying FHIR resource configuration details supporting specific use case scenarios are developed in HL7 working groups and FHIR accelerators. These are balloted under rigorous American National Standards Institute (ANSI) rules and published by HL7 in the FHIR specification and implementation guides. Aligned with open science principles, HL7 offers the FHIR specification and related publications free of charge worldwide.
FHIR continues to be enhanced to support data exchange requirements for real world data and observational research. However, FHIR developers do not aim to develop analytic methods to generate evidence. Rather, FHIR’s role is to provide a platform capable of supporting lossless exchange of knowledge artifacts and data via its resources and APIs. In contrast, the OMOP Common Data Model was designed to store observational data optimized for research analytics and utilization by OHDSI and OMOP-compliant tools generating analyses and research methods.
Research organizations invested in local implementations of OMOP have typically developed ETL and data exchange processes configured to immediate project needs. Data transformations between FHIR and OMOP permit broader utilization of data by systems, enabling FHIR API access to OMOP databases and data on FHIR servers to populate OMOP instances. However, prior OMOP + FHIR transformation projects have aimed to address the needs of a small number of organizations, typically also focused in a single clinical domain. Custom transformations between OMOP + FHIR are numerous and variable, are collectively quite expensive to create anew each time they are needed, and their variability can result in poor data quality.
Of critical importance to our team is the publication of a stable, reliable set of OMOP + FHIR transformations that would reduce implementation costs and increase the speed of ETL in projects. This would support generation of transformed data that are comparable and consistent, supporting the reliability of data, knowledge artifacts, and reproducible research results. Standardization of high-quality OMOP + FHIR transformations also creates the foundation to automate OMOP + FHIR ETL processes and to create alignment between the FHIR and OMOP data models in the future.
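As a rough illustration of what a single FHIR-to-OMOP transformation involves, the sketch below maps a FHIR Patient resource onto a row shaped like the OMOP `person` table. It is an assumption-laden example, not one of the workgroup's published transformations: the function name and the handling of unknown values are hypothetical, and race and ethnicity are left unmapped (concept 0) because they would require FHIR extensions and vocabulary lookups that are out of scope here.

```python
# OMOP standard gender concepts; any other FHIR gender value maps to 0
# ("no matching concept") in this sketch.
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


def fhir_patient_to_omop_person(patient: dict, person_id: int) -> dict:
    """Map a FHIR Patient resource to a dict shaped like an OMOP person row."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")

    gender = patient.get("gender")
    # FHIR birthDate may be YYYY, YYYY-MM, or YYYY-MM-DD.
    birth_date = patient.get("birthDate")
    parts = [int(p) for p in birth_date.split("-")] if birth_date else []

    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        # OMOP requires year_of_birth; None here flags a record needing review.
        "year_of_birth": parts[0] if len(parts) > 0 else None,
        "month_of_birth": parts[1] if len(parts) > 1 else None,
        "day_of_birth": parts[2] if len(parts) > 2 else None,
        "race_concept_id": 0,        # not mapped in this sketch
        "ethnicity_concept_id": 0,   # not mapped in this sketch
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,  # preserve the source value
    }


person = fhir_patient_to_omop_person(
    {"resourceType": "Patient", "id": "example",
     "gender": "female", "birthDate": "1974-12-25"},
    person_id=1,
)
print(person)
```

Even this small mapping involves choices, such as what to do with partial dates or unmapped genders, which is where independently written custom ETLs tend to diverge.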
Through your role at Johns Hopkins, you helped develop the National COVID Cohort Collaborative (N3C) data harmonization infrastructure. What have you learned about OMOP through this process, and how critical has it been for N3C research?
Working on N3C has been illuminating in so many ways. First, we were all feeling the pressure of the pandemic. This was the first time I’d worked on a project whose impact was truly personal. The long hours at the project’s inception were not a burden in this context. We worked with subject matter experts from all the common data models involved in the data ingestion and harmonization pipeline. On the OMOP side of things, it was our great pleasure to work closely with Kristin Kostka and Clair Blacketer who were shining examples of the competent, collaborative approach we’ve learned is a hallmark of the OHDSI community.
Davera Gabriel led the UCReX data harmonization team, which presented a poster at the 2015 AMIA Joint Summits.
A key implementation feature of N3C that made it possible to get up and running quickly was to leverage established research network sites as data partners for source data. This meant we had to develop custom maps for the ingestion pipeline from 4 CDMs to OMOP 5.3.1. This approach dramatically sped up the time to launch a viable research data enclave as it reduced the burden to create custom ETLs from EMRs with each data partner. Small compromises had to be made mapping all 4 of the other models to OMOP for the pipeline, and it was only when the researchers started using the N3C Enclave to study COVID in the context of many clinical specialties that I saw the utility of the OMOP CDM “flat” structure optimized for analytics.
There were a few data harmonization lessons resulting from my N3C experience. Due to the data represented in each source CDM and variability in the way each data partner had populated these, we learned there were key data important to researching COVID, such as ventilator settings, death records and advance directives, that were sometimes simply not available. Harmonizing units of measure for the measurement domain was necessary. We needed to develop a reusable process to uniformly identify visits, as this was more complicated than usual due to the wide array of care venues present in the source data, especially during COVID surges. However, the robust OMOP data quality procedures and dashboard greatly assisted us and continue to be critical components of the ingestion pipeline that sustain growth of high-quality data into the N3C Enclave.
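The unit harmonization step mentioned above can be pictured with a small sketch. This is not the N3C pipeline's actual logic; the conversion table, the choice of canonical units, and the function name are assumptions made for illustration, with units written as UCUM codes.

```python
# Hypothetical conversion table: source UCUM unit -> (canonical unit, multiplier).
# The canonical units chosen here (kg, cm) are an assumption for illustration.
TO_CANONICAL = {
    "kg": ("kg", 1.0),
    "g": ("kg", 0.001),
    "[lb_av]": ("kg", 0.45359237),  # avoirdupois pound
    "cm": ("cm", 1.0),
    "m": ("cm", 100.0),
    "[in_i]": ("cm", 2.54),         # international inch
}


def harmonize_measurement(value: float, unit: str):
    """Convert a value to its canonical unit.

    Returns (converted_value, canonical_unit), or None when the unit is not
    in the table so the record can be flagged for review instead of guessed.
    """
    entry = TO_CANONICAL.get(unit)
    if entry is None:
        return None
    canonical_unit, factor = entry
    return value * factor, canonical_unit


# The same body weight reported by two sites in different units.
site_a = harmonize_measurement(70.0, "kg")
site_b = harmonize_measurement(154.0, "[lb_av]")
unknown = harmonize_measurement(11.0, "stone")
print(site_a, site_b, unknown)
```

Returning `None` for unrecognized units, rather than passing the value through, keeps unconverted measurements from silently mixing with harmonized ones.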
As the project matured, additional data from over 60 public and government sources have been added, as well as the capability to provide linkage to imaging systems and EMR source data from networks such as All of Us. This dramatically increased the scope of supported research use cases. In instances where additional data work was needed, we found the OMOP CDM implementation conventions of preserving the source data and extension management allowed us to meet most requirements in new data sources. The OHDSI community and subject matter experts continue to be excellent collaborators: available and responsive to our evolving circumstances.
On Day 3 of the OHDSI Global Symposium, there will be an all-day HL7 FHIR-OMOP Connectathon. Can you discuss this event a bit, including what outputs you hope to achieve and who might benefit most from partaking in this Connectathon?
On Sunday at the 2023 Global Symposium, we are inviting anyone with prior data exchange projects to engage in test scenarios transforming data between FHIR and OMOP for common core EMR data. We are seeking participation from organizations that have software that can transform OMOP + FHIR data, and that can make their tools available for workshop participants for the day. We would like to get enough hands on keyboards in the day not only to validate the proposed transformations for core EMR data, but also to provide functional feedback to the toolsmiths who brought systems to share. Aside from our tool developers, those who would benefit most from participating in our Connectathon are FHIR IG implementers and developers, researchers requiring data from both OMOP and FHIR sources, and individuals who are required to perform ETL between OMOP + FHIR systems.
Davera and her daughter Davienne at a concert
What are some of your hobbies, and what is one interesting thing that most community members might not know about you?
Hobbies I enjoy include many hand crafts: beadwork, fiber & dye arts, silk painting, sewing and decoupage. In the past couple of years, I’ve become an avid fan of Blues music and its rich history. Having been a musician and loving music my entire life, I’m not sure how I missed resonating so deeply with this art form in the past. And, since everyone gets them every now and then, it offers universal human expressions. Ain’t nothing quite like The Blues.
Something about me that only a few people may know is that I’ve participated in 3 large group Guinness World Records: the most beer sold at one venue (The Renaissance Pleasure Faire, Agoura CA); the largest gathering of pirates (Northern California Pirate Festival) and the world’s longest bicycle parade (Davis, California). The first two have since been surpassed, but I believe that bicycle parade is still #1!