Collaborator Spotlight: Yong Chen

Dr. Yong Chen, Professor of Biostatistics, founded and directs the Computing, Inference, and Learning Lab (PENNCIL) at the University of Pennsylvania. The mission of the PENNCIL lab is to develop computational methods and software to transform real-world data into insights, to disseminate the methods and knowledge to research communities, and to bridge the gap from data to actionable health care.

Yong, who has been leading methodological work within the OHDSI community for several years, is an Elected Fellow of both the American Statistical Association (2020) and the American College of Medical Informatics (ACMI) (2023), and he earned the 2021 OHDSI Titan Award for Methodological Research. His research areas include real-world data, clinical evidence generation, learning health systems, and healthcare delivery.

In the latest edition of the Collaborator Spotlight, Yong discusses his career journey, recent advances in methods research, how his students use OHDSI in their research, and more.

Can you discuss your background and career journey?

I earned my PhD in Biostatistics at Johns Hopkins University, where I had the privilege of working with public health icons Dr. Kung-Yee Liang and Dr. Charles Rohde. Under their mentorship, I delved into the foundations of statistics, statistical evidence, and inference, which profoundly shaped the way I think about data and evidence. In 2005, I was honored to be part of the inaugural cohort of the Hopkins Sommer Scholars, a leadership development program created by Dean Alfred Sommer at the Bloomberg School of Public Health. This program provides emerging public health leaders with opportunities to address global health challenges through interdisciplinary collaboration and innovation.

Over the past 14 years, my career has been enriched by collaborations with remarkable scientists who have significantly shaped my research trajectory. One pivotal moment came in 2013 when I shifted my focus to real-world data (RWD)—a field that was still relatively new and often met with skepticism, due to the perceived noisiness and complexity of such data. While others viewed RWD with caution, I saw immense potential. The challenge of extracting reliable evidence from these complex data sets opened the door to a wealth of untapped research opportunities, and I have been deeply involved in this dynamic and evolving field ever since.

Another key turning point in my career occurred during the pandemic, when I became involved—quite unexpectedly—with a group of health economists and researchers at UnitedHealth Group. My colleague at Penn, Dr. David Asch, led meetings twice a week, where we addressed some of the most urgent healthcare policy questions arising from the pandemic. Rather than rushing to solutions, we invested significant time in carefully framing the right problems to tackle. This collaborative effort resulted in two impactful publications in JAMA, which examined site-of-care related mortality rates for COVID-19 patients at the onset of the pandemic, and healthcare disparities across ethnic groups.

A key lesson I took from this experience echoes the words of computer scientist Donald Knuth: “Premature optimization is the root of all evil.” This principle—focusing on clearly defining the problem before diving into solutions—has had a lasting influence on how I approach research in the years that followed.

How did you first connect with the OHDSI community, and why have you been inspired to become a leading collaborator within the community?

I was introduced to OHDSI about a decade ago by two key figures: Dr. Jesse Berlin from Johnson & Johnson and Dr. Hua Xu from the University of Texas (now at Yale). Through them, I connected with Dr. Patrick Ryan and Dr. Martijn Schuemie, who have since become two of my closest collaborators.

What drew me to OHDSI was the community’s pursuit of evidence in a manner that is rigorous, practical, scalable, and democratic. This aligns deeply with my own interests. Much of my early work—and my reading as a young biostatistician—centered around the philosophical underpinnings of evidence, particularly the question, “What is the evidence in the data?” This ties back to the likelihood principle and the work of pioneers like R. A. Fisher, Ian Hacking, Allan Birnbaum, and, more personally, Richard Royall, a biostatistics professor at Johns Hopkins, whose publications have profoundly shaped my thinking.

But the true power of OHDSI comes from its people. Throughout history, many disciplines have thrived due to the openness of the communities driving them. OHDSI embodies this spirit. The open science model and the community’s commitment to training the next generation of data scientists in critical thinking are key ingredients to its success. This thriving, collaborative environment makes working in OHDSI both fulfilling and fun.

You have been one of the leaders in methods research in the community. Why is it crucial to bring together statistics, informatics, and epidemiology to develop research methods that produce reliable evidence?

Bridging the disciplines of statistics, informatics, and epidemiology is not just beneficial—it’s essential. Each discipline brings its own perspective, language, and set of challenges. My role often involves acting as a connector between these worlds. For example, statisticians and informaticians may approach the same problem in very different ways. By fostering conversations across these disciplines, we uncover methodological gaps and identify new problems that require attention.

Our ability to translate innovations from one community to another helps push the boundaries of clinical evidence generation. In turn, the support of the medical informatics community for our work has been invaluable, and we remain committed to advancing methodological research that ensures reliable evidence and robust evidence synthesis for clinical decision-making.

What are some recent methodological developments in the community that have excited or inspired you?

To me, one of the most exciting developments is the concept of negative control outcomes (NCOs) within the OHDSI community. This idea is both elegant and pragmatic, offering a way to strengthen causal inference and mitigate bias in observational studies. While there is still much work to be done, I am thrilled by the potential of NCOs and look forward to contributing to the next generations of these methods as they evolve.

You are a Professor of Biostatistics at the University of Pennsylvania, and we have seen multiple students become involved with the community. Are you finding that students are excited about OHDSI in their research?

Absolutely! The OHDSI Symposium is a highlight of the year for our lab. It offers three key benefits to trainees. First, for students with a more theoretical background, it’s a revelation to see how cutting-edge methodologies translate into real-world applications. It challenges them to appreciate the practical side of statistical research.

Second, for those already engaged in OHDSI studies, the symposium is a vibrant gathering where they can meet international collaborators, share ideas, and get inspired by the global impact of their work. Lastly, attending the symposium helps us all reflect on the broader impact of our research, giving us insight into the current landscape of methodological innovation and helping us shape future research agendas.

You were a 2021 recipient of a Titan Award for Methodological Research. What did that honor mean to you?

The Titan Award was a tremendous honor, but I’ve left it behind me. My focus now is on advancing large-scale, reliable clinical evidence generation and mentoring the next generation of researchers and scientists, all while working collaboratively with the extraordinary OHDSI community.

What are some of your hobbies, and what is one interesting thing that most community members might not know about you?

I enjoy skiing, swimming, and watching movies—especially thought-provoking sci-fi films that explore concepts like parallel universes and time travel. I used to play intramural soccer regularly and still hope to get back on the field someday. But most importantly, I love spending time with my daughters, who are 7 and 9—they keep me active and bring a lot of joy into my life.

One interesting thing about me is that over 20 years ago, I did a summer internship at the Space Telescope Science Institute (STScI), the operations center for both the Hubble Space Telescope (HST) and the James Webb Space Telescope (JWST). At the time, I was a PhD student in pure math with zero exposure to data, models, or real-world problems. That internship was a turning point for my career and helped shape my interest in applying mathematical and statistical principles to real-world challenges.
