Yong Chen, PhD
Associate Professor of Biostatistics
University of Pennsylvania
Yong Chen is an Associate Professor of Biostatistics at the University of Pennsylvania. He directs the Computing, Inference and Learning Lab at the University of Pennsylvania (https://penncil.med.upenn.edu/about-pi/), which focuses on integrating the fundamental principles and wisdom of statistics into quantitative methods for tackling key challenges in modern biomedical data. Dr. Chen is an expert in the synthesis of evidence from multiple data sources, including systematic review and meta-analysis, distributed algorithms, and data integration, with applications to comparative effectiveness studies, health policy, and precision medicine. He is also developing methods for handling suboptimal data quality in health system data, dynamic risk prediction, pharmacovigilance, and personalized health management. He has over 100 publications across a wide spectrum of methodological and clinical areas.
Dr. Chen has been principal investigator on a number of grants, including R01s from the National Library of Medicine and the National Institute of Allergy and Infectious Diseases, and an Improving Methods for Conducting Patient-Centered Outcomes Research grant from the Patient-Centered Outcomes Research Institute. Dr. Chen received his bachelor’s degree in Mathematics from the University of Science and Technology of China, and his master’s degree in Pure Mathematics and Ph.D. in Biostatistics from Johns Hopkins University. He is an elected fellow of the Society for Research Synthesis Methodology and the International Statistical Institute. He is a recipient of the Best Paper Award from the International Medical Informatics Association (IMIA) Yearbook Section on Clinical Research Informatics, the Institute of Mathematical Statistics Travel Award, the Margaret Merrell Award for excellence in research at Johns Hopkins University, and the Distinguished Faculty Award at the University of Pennsylvania.
Tong, J, Huang, J, Wang, X, Moore, J, Hubbard, R and Chen, Y. (Sep. 2019) An Augmented Estimation Procedure for EHR-based Association Studies Accounting for Differential Misclassification. Journal of the American Medical Informatics Association (in press).
Li, R, Duan, R, Kember, R, Regeneron Genetic Center, Rader, D, Damrauer, S, Moore, J and Chen, Y. (July, 2019) A regression framework to uncover pleiotropy in large-scale electronic health record data. Journal of the American Medical Informatics Association. 26, 1083-1090.
Li, R, Chen, Y and Moore, J (April, 2019). Integration of genetic and clinical information to improve imputation of data missing from electronic health records. Journal of the American Medical Informatics Association, 26, 1056-1063.
Liu, YL, Huang, J, Urbanowicz, R, Chen, K, Manduchi, E, Greene, C, Scheet, P, Moore, JH., and Chen, Y. (August 2019) A distributed analysis method for detecting genetic interactions for complex disease in large research consortia, Genetic Epidemiology (in press).
Maltenfort, M, Chen, Y and Forrest, C. (Aug. 2019) Prediction of 30-day pediatric unplanned hospitalizations using the Johns Hopkins Adjusted Clinical Groups Risk Adjustment, PLOS ONE. 14(8):e0221233.
Huang, J, Chen, Y, Landis, R and Mahoney, K. (July, 2019) Patient portal: how does it change our health behaviors and outcomes? – a retrospective, observational cohort study at Penn Medicine, Journal of Medical Internet Research (in press).
Huang, J, Zhang, X, Tong, J, Du, J, Duan, R, Liu, Y, Moore, J, Tao, C and Chen, Y. (July, 2019) Comparing drug safety of Hepatitis C therapies using post-market data. BMC Medical Informatics and Decision Making. 19(Suppl 4): 147.
Du, J, Cunningham, RM, Xiang, Y, Li, F, Jia, Y, Boom, JA, Myneni, S, Bian, J, Luo, C, Chen, Y and Tao, C. (April 2019) Leveraging deep learning to understand health beliefs about the Human Papillomavirus Vaccine from social media. Nature Partner Journal (NPJ) Digital Medicine. 2, 27.
Duan, R, Boland, M, Moore, J and Chen, Y (2019). ODAL: A one-shot distributed algorithm to perform logistic regressions on electronic health records data from multiple clinical sites. Pacific Symposium on Biocomputing, 30-41. [c6].
Huang, J, Zhang, X, Du, J, Duan, R, Yang, L, Moore, JH, Chen, Y, Tao, C (Feb, 2019), Comparing adverse effects of Hepatitis C drugs using FAERS data, IEEE BIBM 2018 Proceedings.
Chen, Y, Wang, J, Chubak, J, Hubbard, R (Feb. 2019) Inflation of type I error rates due to differential misclassification in EHR-derived outcomes: Empirical illustration using breast cancer recurrence, Pharmacoepidemiology & Drug Safety. 28(2):264–268.
Tong, J, Huang, J, Du, J, Cai, Y, Tao, C and Chen, Y. (Dec. 2018) Identification of Rare Adverse Events with Year-varying Reporting Rates for FLU4 Vaccine in VAERS, AMIA. 1544-1551.
Hubbard, R, Huang, J, Harton, J, Oganisian, A, Choi, G, Utidjian, L, Eneli, I, Bailey, L, Chen, Y (September 2018) A Bayesian latent class approach for EHR-based phenotyping, Statistics in Medicine, 38:74-87.
Zhang, X., Duan, R., Du, J., Huang, J., Chen, Y, & Tao, C. (June, 2018). Comparing Pharmacovigilance Outcomes Between FAERS and EMR Data for Acute Mania Patients. 2018 IEEE International Conference on Healthcare Informatics Workshop (ICHI-W) (pp. 57-59).
Duan, R., Zhang, X., Du, J., Huang, J., Tao, C and Chen, Y. (March, 2019). On the evidence consistency of pharmacovigilance outcomes between Food and Drug Administration Adverse Event Reporting System and electronic medical record data for acute mania patients. Health Informatics Journal.
Huang, J., Du, J., Duan, R., Zhang, X., Tao, C., Chen, Y. (May, 2018) Characterization of the differential adverse event rates by race/ethnicity groups for HPV vaccine by integrating data from different sources. Frontiers in Pharmacology. 9:539.
Huang, J, Duan, R, Hubbard, R, Wu, Y, Moore, JH, Xu, H, and Chen, Y (March 2018), PIE: A prior knowledge guided integrated likelihood estimation method (PIE) for bias reduction in association studies using electronic health records data, Journal of the American Medical Informatics Association. Volume 25, Issue 3, Pages 345–352.
(Selected as one of the top 5 best papers by the IMIA Yearbook Section on Clinical Research Informatics from 741 papers published in 2017.)
Duan, R, Zhang, X, Du, J, Huang, J, Tao, C, and Chen, Y (2017). Post-marketing Drug Safety Evaluation using Data Mining Based on FAERS. International Conference on Data Mining and Big Data (pp. 379-389). Springer, Cham.
Huang, J, Zhang, X, Du, J, Duan, R, Yang, L, Moore, JH, Chen, Y, Tao, C (2017), Comparing Different Adverse Effects Among Multiple Drugs Using FAERS Data, Stud Health Technol Inform. 245:1268.
Du, J, Huang, J, Duan, R, Chen, Y, Tao, C (2017), Comparing the Human Papillomavirus Vaccination Opinions Trends from Different Twitter User Groups with a Machine Learning Based System and Semiparametric Nonlinear Regression, Stud Health Technol Inform 245:1218.
Cai, Y, Du, J, Huang, J, Ellenberg, S, Hennessy, S, Tao, C, and Chen, Y. (July, 2017) A Signal Detection Method for Temporal Variation of Adverse Effect with Vaccine Adverse Event Reporting System Data. BMC Medical Informatics and Decision Making, 17(Suppl 2): 76.
Duan, R, Cao, M, Wu, Y, Huang, J, Denny, J, Xu, H and Chen, Y. (Feb. 2017) An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies, AMIA annual symposium proceedings, 2016:1764-1773.
(This paper won first prize in the “Best of Student Papers in Knowledge Discovery and Data Mining (KDDM)” Awards.)
Hong, C, Salanti, G, Morton, S, Riley, R, Chu, H, Kimmel, S, and Chen, Y (September 2019) Testing small study effects in multivariate meta-analysis, Biometrics (in press; discussion paper).
Wang, L, Chai, X, Chen, Y, and Chen, J (August, 2019) Novel Two-Phase Sampling Designs for Studying Binary Outcomes, Biometrics (in press).
Duan, R, Cao, M, Ning, Y, Zhu, M, Zhang, B, McDermott, A, Chu, H, Zhou, X, Moore, J, Ibrahim, J, Scharfstein, D and Chen, Y (July, 2019), Global identifiability of latent class models with applications to diagnostic test accuracy studies: a Gröbner basis approach, Biometrics (in press).
Shen, W, Liu, S, Chen, Y and Ning, J. (Dec. 2018) Regression analysis of longitudinal data with outcome-dependent sampling and informative censoring. Scandinavian Journal of Statistics (in press).
Cai, Y, Huang, J, Ning, J, Lee, ML, Rosner, B and Chen, Y (September, 2019), Two-sample test for correlated data under outcome-dependent sampling with an application to self-reported weight loss data, Statistics in Medicine.
Chen, Y, Huang, J, Ning, Y, Liang, K-Y and Lindsay, B. (March, 2018) A conditional composite likelihood ratio test with boundary constraints. Biometrika 105 (1), 225-232.
Huang, J, Ning, Y, Liang, K-Y and Chen, Y (2018), Composite likelihood inference under boundary conditions, Statistica Sinica, (in press).
Hong, C, Riley, R and Chen, Y (March, 2018) An improved method for bivariate meta-analysis when within-study correlations are unknown, Research Synthesis Methods, 9(1):73–88.
Zhang, J, Ko, CW, Nie, L, Chen, Y and Tiwari, R. (Feb, 2018) Bayesian hierarchical methods for meta-analysis combining randomized-controlled and single-arm studies, Statistical Methods in Medical Research, 28(5): 1293–1310.
Ma, X, Lian, X, Chu, H, Ibrahim, J, and Chen, Y (Jan., 2018) A Bayesian hierarchical model for network meta-analysis of multiple diagnostic tests, Biostatistics, 19(1): 87–102
Hong, C, Ning, Y, Wei, P, Cao, Y and Chen, Y. (2017) A semiparametric model for vQTL mapping, Biometrics, 73(2): 571–581.
Hong, C, Ning, Y, Wang, S, Wu, H, Carroll, RJ and Chen, Y. (2017) PLEMT: A novel pseudolikelihood based EM test for homogeneity in generalized exponential tilt mixture models, Journal of the American Statistical Association, 112 (520): 1393–1404.
Chen, Y, Ning, J, Ning, Y, Liang, K-Y and Bandeen-Roche, K. (2017) On pseudolikelihood inference for semiparametric models with boundary problems. Biometrika, 104 (1): 165–179.
Piao, J, Liu, YL, Chen, Y and Ning, J. (July, 2017) Copas-like selection model to correct publication bias in systematic review of diagnostic test studies, Statistical Methods in Medical Research. 18(3):495–504.
Ning, J, Chen, Y and Piao, J (July, 2017) Maximum likelihood estimation and EM algorithm of Copas-like selection model for publication bias correction. Biostatistics, 18(3): 495–504.
Liu, Y, DeSantis, S and Chen, Y (2017) Bayesian mixed treatment comparisons meta-analysis for correlated outcomes subject to reporting bias, Journal of the Royal Statistical Society: Series C, 67(1): 127–144.
Li, X, Chen, Y, and Li, R (Feb., 2017) A frailty model for recurrent events during alternating restraint and non-restraint time periods, Statistics in Medicine, 36(4):643–654.
Chen, Y, Liu, Y, Chu, H, Lee, M and Schmid, C (2017) A simple and robust method for multivariate meta-analysis of diagnostic test accuracy, Statistics in Medicine, 36(1):105-121.
Chen, Y, Hong, C, Ning, Y and Su, X. (Jan., 2016) Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach, Statistics in Medicine, 35(1):21–40.
Chen, Y, Cai, Y, Hong, C, and Jackson, D. (2016) Inference for correlated effect sizes using multiple univariate meta-analyses, Statistics in Medicine, 35(9): 1405-1422.
Liu, Y, Chen, Y and Chu H. (2015) A unification of models for meta-analysis of diagnostic accuracy studies without a gold standard, Biometrics, 71(2):538–47.
Ning, J, Chen, Y, Cai, C, Huang, X and Wang, MC. (2015) On the Dependence Structure of Bivariate Recurrent Event Processes: Inference and Estimation, Biometrika, 102(2): 345-358.
Chen, Y, Ning, J and Cai, C. (2015) Regression analysis of longitudinal data with irregular and informative observation times, Biostatistics, 16(4): 727-739.
Ning, Y and Chen, Y. (2015) A class of pseudolikelihood ratio tests for homogeneity in exponential tilt mixture models, Scandinavian Journal of Statistics 42 (2), 504–517.
Chen, Y, Hong, C and Riley, R. (2015) An alternative pseudolikelihood method for multivariate random-effects meta-analysis, Statistics in Medicine, 34 (3): 361-380.
Chen, Y, Liu, Y, Ning, J, Cormier J and Chu H. (2015) A hybrid model for combining case-control and cohort studies in systematic reviews of diagnostic tests, Journal of the Royal Statistical Society: Series C, 64(3): 469-489.
Chen, Y, Liu, Y, Ning, J, Nie, L, Zhu, H and Chu H. (Dec., 2014) A composite likelihood method for bivariate analysis of sensitivity and specificity in diagnostic reviews, Statistical Methods in Medical Research. 26(2).
Luo, S, Chen, Y, Su, X and Chu, H. (2014) mmeta: An R package for multivariate meta-analysis. Journal of Statistical Software, 56 (11).
Chen, Y and Liang, KY. (2010) On the asymptotic behaviour of the pseudolikelihood ratio test statistic with boundary problems, Biometrika, 97 (3), 603–620.