Transparency: others should be able to reproduce your study in every detail using the information you provide.
Prespecify what you're going to estimate and how: this will avoid hidden multiple testing (fishing expeditions, p-value hacking). Run your analysis only once.
Validation of your analysis: you should have evidence that your analysis does what you claim it does, for example by showing that the statistics it produces have nominal operating characteristics (e.g. p-value calibration), by showing that important assumptions are met (e.g. covariate balance), and by using unit tests to validate pieces of code (see the sketch after this list).
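As an illustration of checking nominal operating characteristics, one can simulate data under the null hypothesis (no effect) and verify that the rejection rate at alpha = 0.05 is close to 5%. The sketch below does this for a simple two-group comparison; the test, sample sizes, and number of simulations are purely illustrative and not part of any OHDSI package.

<code python>
# Minimal sketch: verify that a p-value has nominal operating characteristics
# by simulating under the null (no effect) and checking the type I error rate.
import math
import random
import statistics

def two_sample_p_value(x, y):
    """Crude two-sided two-sample test p-value via a normal approximation."""
    nx, ny = len(x), len(y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # Two-sided p-value from the standard normal survival function
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

random.seed(42)
alpha = 0.05
n_sims = 2000
rejections = 0
for _ in range(n_sims):
    # Both groups drawn from the same distribution: the null is true
    x = [random.gauss(0.0, 1.0) for _ in range(100)]
    y = [random.gauss(0.0, 1.0) for _ in range(100)]
    if two_sample_p_value(x, y) < alpha:
        rejections += 1

print(f"Empirical type I error: {rejections / n_sims:.3f} (nominal: {alpha})")
</code>

If the empirical type I error deviates clearly from the nominal alpha, the statistic does not have the operating characteristics claimed for it.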
Best practices (generic)
Write a full protocol, and make it public prior to running the study. This should include:
Research question + hypotheses to be tested
The method(s), data source(s), and cohort definitions to be used
Which analysis is the primary analysis, and which are sensitivity analyses
Quality control
Amendments and updates
Validate all code used to produce estimates. The purpose of validation is to ensure the code does what we require it to do. Possible options (see the sketch after this list) are:
Unit testing
Simulation
Double coding
Code review
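As an illustration of unit testing, the sketch below pins down the required behavior of a small analysis function using Python's built-in unittest framework. The function incidence_rate() is a hypothetical helper written for this example, not part of any existing package.

<code python>
# Minimal sketch of unit testing a piece of analysis code.
import unittest

def incidence_rate(events, person_years):
    """Incidence rate expressed as events per 1,000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * events / person_years

class TestIncidenceRate(unittest.TestCase):
    def test_known_value(self):
        # 5 events over 2,000 person-years = 2.5 per 1,000 person-years
        self.assertAlmostEqual(incidence_rate(5, 2000), 2.5)

    def test_zero_events(self):
        self.assertEqual(incidence_rate(0, 500), 0.0)

    def test_invalid_person_time_raises(self):
        with self.assertRaises(ValueError):
            incidence_rate(3, 0)

if __name__ == "__main__":
    unittest.main()
</code>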
Include negative controls (exposure-outcome pairs where we believe there is no effect)
Produce calibrated p-values, using the negative control estimates to fit an empirical null distribution (see the sketch below)
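A minimal sketch of the idea behind calibrated p-values: the negative-control estimates define an empirical null distribution, and the estimate of interest is evaluated against that null rather than the theoretical one. The OHDSI EmpiricalCalibration R package fits this null by maximum likelihood, accounting for the sampling error of each negative-control estimate; the version below is a deliberately naive Python approximation, and all numbers are made up for illustration.

<code python>
# Naive sketch of p-value calibration against negative controls.
import math
import statistics

def calibrated_p(log_rr, se_log_rr, null_mean, null_sd):
    """Two-sided p-value of an estimate under the empirical null distribution."""
    z = (log_rr - null_mean) / math.sqrt(null_sd ** 2 + se_log_rr ** 2)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

# Hypothetical log hazard ratios estimated for negative control outcomes
negative_control_log_rrs = [0.10, -0.05, 0.20, 0.15, 0.02, 0.25, -0.10, 0.18]
null_mean = statistics.fmean(negative_control_log_rrs)
null_sd = statistics.stdev(negative_control_log_rrs)

# Hypothetical estimate for the outcome of interest: HR = 1.50, SE(log HR) = 0.10
log_rr, se_log_rr = math.log(1.50), 0.10
print(f"Calibrated p-value: {calibrated_p(log_rr, se_log_rr, null_mean, null_sd):.3f}")
</code>

If the negative controls show systematic error (their estimates are not centered on no effect), the calibrated p-value will be appropriately less significant than the conventional one.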
Make all analysis code available as open source so others can easily replicate your study
Best practices (new-user cohort design)
Use propensity scores (PS)
Build PS model using regularized regression and a large set of candidate covariates (as implemented in the CohortMethod package)
Use either variable-ratio matching or stratification on the PS
Compute covariate balance after matching (or stratification) for all covariates, and terminate the study if any covariate has a standardized difference > 0.1 (see the sketch after this list)
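A minimal sketch of the balance check for binary covariates, assuming post-matching prevalences in the target and comparator cohorts are already available. The covariate names and numbers are made up; in an actual study this check is run across all candidate covariates, e.g. with CohortMethod's computeCovariateBalance function.

<code python>
# Minimal sketch of a covariate balance check after PS matching.
import math

def binary_std_diff(p_target, p_comparator):
    """Standardized difference of prevalence for a binary covariate."""
    pooled_var = (p_target * (1 - p_target) + p_comparator * (1 - p_comparator)) / 2.0
    if pooled_var == 0.0:
        return 0.0
    return (p_target - p_comparator) / math.sqrt(pooled_var)

# Hypothetical post-matching prevalences: covariate -> (target, comparator)
balance = {
    "prior myocardial infarction": (0.120, 0.118),
    "diabetes mellitus":           (0.310, 0.305),
    "age 65-74":                   (0.270, 0.262),
}

std_diffs = {name: binary_std_diff(p_t, p_c) for name, (p_t, p_c) in balance.items()}
unbalanced = {name: d for name, d in std_diffs.items() if abs(d) > 0.1}

if unbalanced:
    raise RuntimeError(f"Study terminated, unbalanced covariates: {unbalanced}")
print("All covariates balanced (|standardized difference| <= 0.1)")
</code>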
Best practices (self-controlled case series)
Include a risk window just prior to the start of exposure to detect time-varying confounding (e.g. contra-indications, protopathic bias); an elevated event rate in this pre-exposure window signals such bias (see the sketch below)
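The sketch below illustrates the idea with a crude rate comparison: within unexposed time, compare the event rate in the 30 days just before exposure start with the rate in the rest of the unexposed observation time. The data and the 30-day window length are purely illustrative; in practice the pre-exposure window is included as a separate risk window in the SCCS model itself (e.g. in the OHDSI SelfControlledCaseSeries package).

<code python>
# Crude sketch of a pre-exposure window check in a self-controlled design.
from dataclasses import dataclass
from typing import List

PRE_EXPOSURE_DAYS = 30  # length of the pre-exposure window (illustrative)

@dataclass
class Case:
    exposure_start: int    # first day of exposure within the observation period
    event_days: List[int]  # days on which the outcome occurred (unexposed time only)

# Purely illustrative data
cases = [
    Case(exposure_start=200, event_days=[150]),
    Case(exposure_start=100, event_days=[95]),   # event just before exposure starts
    Case(exposure_start=300, event_days=[40, 280]),
]

pre_events = pre_time = base_events = base_time = 0
for c in cases:
    window = min(PRE_EXPOSURE_DAYS, c.exposure_start)
    pre_time += window
    base_time += c.exposure_start - window
    for day in c.event_days:
        if c.exposure_start - window <= day < c.exposure_start:
            pre_events += 1
        else:
            base_events += 1

pre_rate = pre_events / pre_time
base_rate = base_events / base_time
print(f"Pre-exposure vs. baseline incidence rate ratio: {pre_rate / base_rate:.2f}")
</code>

A rate ratio well above 1 suggests the outcome (or its precursor) influences the decision to start treatment, and the pre-exposure window should be kept out of both the baseline and the exposed risk windows.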
Best practices ((nested) case-control)
Don't do a case-control study