====== OHDSI Best Practices for Patient Level Prediction ======
:!: //This document is under development. Changes can be proposed and discussed via the [[projects:workgroups:patient-level_prediction|Patient-Level Prediction Workgroup]] meetings.//
  * **Transparency**: others should be able to reproduce your study in every detail using the information you provide. Make sure all analysis code is available as open source.
  * **Prespecify** what you're going to predict and how. This will avoid fishing expeditions and p-value hacking.
  * **Code validation**: it is important to add unit tests, code review, or double-coding steps to validate the developed code base. We recommend testing the code on benchmark datasets; a minimal test sketch is given after this list.
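As an illustration of testing against a benchmark, below is a minimal sketch in Python with scikit-learn. The toolchain, the dataset, and the AUC threshold are all placeholder assumptions, not choices prescribed by the workgroup.

<code python>
# Minimal unit-test sketch: check that the modelling code reproduces a
# known performance level on a fixed benchmark dataset. The dataset and
# the 0.90 AUC threshold are illustrative stand-ins only.
import unittest

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


class TestBenchmarkPerformance(unittest.TestCase):
    def test_auc_on_benchmark(self):
        X, y = load_breast_cancer(return_X_y=True)
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42, stratify=y
        )
        model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
        auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
        # Fail loudly if a code change degrades benchmark performance.
        self.assertGreater(auc, 0.90)


if __name__ == "__main__":
    unittest.main()
</code>

Running such a test after every code change gives a cheap regression check that complements code review and double coding.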
===== Best practices =====
**Data characterisation and cleaning**: Before modelling it is important to characterise the cohorts, for example by looking at the prevalence of certain covariates. Tools are being developed in the community to facilitate this. A data cleaning step is recommended, e.g. removing outliers in lab values; a minimal sketch is given below.
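As a simple illustration of such a cleaning step, a sketch in Python with pandas follows. The column name and the plausibility limits are hypothetical; in practice the limits should come from clinical knowledge.

<code python>
# Sketch of a data cleaning step: drop lab values outside a plausible
# range. Column name and cut-offs below are hypothetical examples.
import pandas as pd


def remove_lab_outliers(df: pd.DataFrame, column: str,
                        lower: float, upper: float) -> pd.DataFrame:
    """Keep only rows whose lab value lies within [lower, upper]."""
    return df[df[column].between(lower, upper)].copy()


labs = pd.DataFrame({"creatinine_mg_dl": [0.9, 1.1, 87.0, 1.3]})
clean = remove_lab_outliers(labs, "creatinine_mg_dl", lower=0.2, upper=15.0)
print(clean)  # the implausible value 87.0 has been removed
</code>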
**Dealing with missing values**: A best practice still needs to be established.

**Feature construction and selection**: Both feature construction and selection should be completely transparent, using a standardised approach, so that the modelling can be repeated and the model can be applied to unseen data; a sketch of such a pipeline is given below.
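One way to make this concrete is to declare every construction and selection step in a single fitted pipeline, so the identical transformation can be re-applied to unseen data. The sketch below uses Python/scikit-learn; the scaling and univariate selection steps are illustrative choices, not a recommendation.

<code python>
# Sketch of a transparent, reusable feature pipeline: all steps are
# declared up front and fitted on training data only, so the exact same
# transformation is applied when the model is used on unseen data.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

pipeline = Pipeline([
    ("scale", StandardScaler()),               # feature construction
    ("select", SelectKBest(f_classif, k=10)),  # feature selection
    ("model", LogisticRegression(max_iter=5000)),
])
pipeline.fit(X_train, y_train)          # fitted on training data only
probs = pipeline.predict_proba(X_test)  # identical transform on new data
</code>

Because the fitted pipeline bundles the transformation with the model, sharing it (or the code that builds it) is enough for others to repeat the modelling or apply the model elsewhere.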
**Inclusion and exclusion criteria** should be made explicit. It is recommended to do sensitivity analyses on the choices made. Visualisation tools could help, and this will be further explored in the WG.

**Model development** is done using a split-sample approach. The percentage used for training could depend on the number of cases, but as a rule of thumb an 80/20 split is recommended. Hyper-parameter tuning should only be done on the training set; a sketch of this procedure follows below.
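A minimal sketch of this split-and-tune procedure, again in Python/scikit-learn with an illustrative model and parameter grid:

<code python>
# Sketch of the split-sample approach: an 80/20 train/holdout split,
# with hyper-parameters tuned by cross-validation inside the training
# set only. The model and grid below are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y  # 80/20 split
)

search = GridSearchCV(
    LogisticRegression(max_iter=5000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=5,  # cross-validation folds drawn from the training set only
)
search.fit(X_train, y_train)
print(search.best_params_)
# The holdout set is left untouched until internal validation.
</code>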
**Internal validation** is done only once on the holdout set. The following performance measures should be calculated:
  * Overall performance: Brier score (unscaled/scaled)
  * Discrimination: area under the ROC curve (AUC)
  * Calibration: intercept and gradient of the line fitted to the observed vs. predicted probabilities
We recommend box plots of the predicted probabilities for the outcome vs. non-outcome people, the ROC plot, and a scatter plot of the observed vs. predicted probabilities with the line fitted to that data and the line x=y added. A sketch computing these measures is given below.
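To make the measures concrete, below is a sketch in Python/scikit-learn computing them on a holdout set. The decile grouping used for the calibration line and the scaled-Brier formula (1 minus the Brier score divided by that of a prevalence-only model) are common choices, not workgroup prescriptions; the plots themselves are omitted here.

<code python>
# Sketch of the internal validation measures on the holdout set:
# Brier score (unscaled/scaled), AUC, and the intercept/gradient of a
# line fitted to observed vs. predicted probabilities (decile groups).
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
p = model.predict_proba(X_holdout)[:, 1]

brier = brier_score_loss(y_holdout, p)          # overall performance
prev = y_holdout.mean()
brier_scaled = 1 - brier / (prev * (1 - prev))  # scaled vs. a
                                                # prevalence-only model
auc = roc_auc_score(y_holdout, p)               # discrimination

obs, pred = calibration_curve(y_holdout, p, n_bins=10)
gradient, intercept = np.polyfit(pred, obs, deg=1)  # calibration line

print(f"Brier={brier:.3f} (scaled {brier_scaled:.3f})  AUC={auc:.3f}  "
      f"calibration intercept={intercept:.3f} gradient={gradient:.3f}")
</code>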