OHDSI Best Practices for Patient Level Prediction

This document is under development. Changes can be proposed and discussed via the Patient-Level Prediction Workgroup meetings.

General principles

Transparency: others should be able to reproduce your study in every detail using the information you provide. Make sure all analysis code is available as open source.
Prespecification: define in advance what you are going to predict and how. This avoids fishing expeditions and p-hacking.
Code validation: it is important to add unit tests, code review, or double-coding steps to validate the developed code base. We recommend testing the code on benchmark datasets.

Best practices

Data characterisation and cleaning: Before modelling, it is important to characterise the cohorts, for example by looking at the prevalence of certain covariates. Tools are being developed in the community to facilitate this.
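As an illustration of the prevalence check mentioned above, here is a minimal sketch in plain Python. The data layout (a dictionary from patient id to a set of covariate names), the function name `covariate_prevalence`, and the toy cohort are all assumptions made for this example; they do not come from any OHDSI tool.

```python
from collections import Counter


def covariate_prevalence(cohort):
    """Return the fraction of patients in a cohort who have each covariate.

    `cohort` maps a patient id to the set of covariate names observed for
    that patient (a hypothetical, simplified layout for illustration).
    """
    if not cohort:
        return {}
    counts = Counter()
    for covariates in cohort.values():
        # set() guarantees each covariate is counted once per patient.
        counts.update(set(covariates))
    n_patients = len(cohort)
    return {name: count / n_patients for name, count in counts.items()}


# Toy cohort, invented for illustration only.
target_cohort = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"hypertension"},
    "p3": set(),
    "p4": {"diabetes"},
}

prevalence = covariate_prevalence(target_cohort)
for name in sorted(prevalence):
    print(f"{name}: {prevalence[name]:.2f}")
```

Comparing such a table between the target cohort and the patients who experience the outcome is one simple way to spot implausible or unexpectedly rare covariates before modelling.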

Dealing with missing values: A best practice still needs to be established.

Feature construction and selection: Both feature construction and feature selection should be completely transparent and follow a standardised approach, so that the modelling can be repeated and the model can be applied to unseen data.
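One way to read this requirement is that the feature definition is fixed once on the training data and then reused unchanged on any new data. The sketch below illustrates that idea under simplifying assumptions: the function names `fit_feature_spec` and `build_features`, the `min_count` threshold, and the toy patients are hypothetical and chosen only for this example.

```python
def fit_feature_spec(train_patients, min_count=2):
    """Select covariates seen in at least `min_count` training patients.

    Returns a sorted list of covariate names. The list fixes both which
    features are used and their column order.
    """
    counts = {}
    for covariates in train_patients.values():
        for name in set(covariates):
            counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, count in counts.items() if count >= min_count)


def build_features(patients, feature_spec):
    """Encode each patient as a 0/1 vector using a previously fitted spec."""
    return {
        pid: [int(name in covariates) for name in feature_spec]
        for pid, covariates in patients.items()
    }


# Toy data, invented for illustration only.
train = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"hypertension", "asthma"},
    "p3": {"diabetes"},
}
feature_spec = fit_feature_spec(train)

# Unseen patients are encoded with the same spec; covariates that were not
# selected on the training data (here "gout") are ignored.
unseen = {"q1": {"hypertension", "gout"}}
unseen_features = build_features(unseen, feature_spec)
```

Storing `feature_spec` alongside the model is what makes the feature step repeatable: anyone applying the model later builds exactly the same columns in the same order.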

Inclusion and exclusion criteria: All inclusion and exclusion criteria should be made explicit. It is recommended to do sensitivity analyses. Visualisation tools could help here; this will be explored further.

Model development is done using a split-sample approach. The percentage of the data used for training could depend on the number of cases, but as a rule of thumb an 80/20 split is recommended. Hyper-parameter tuning should only be done on the training set.
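The split-sample approach can be sketched as follows. This is a minimal illustration in plain Python, not the workgroup's implementation: the function name `stratified_split`, the fixed seed, the choice to stratify on the outcome, and the toy data are all assumptions made for the example.

```python
import random


def stratified_split(outcomes, train_fraction=0.8, seed=42):
    """Split patient ids into training and holdout lists.

    `outcomes` maps a patient id to 1 (case) or 0 (non-case). Cases and
    non-cases are split separately so both sets keep a similar case rate.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, holdout = [], []
    for label in (0, 1):
        ids = sorted(pid for pid, y in outcomes.items() if y == label)
        rng.shuffle(ids)
        cut = round(len(ids) * train_fraction)
        train.extend(ids[:cut])
        holdout.extend(ids[cut:])
    return train, holdout


# Toy data, invented for illustration: 100 patients, 20 of them cases.
outcomes = {f"p{i}": int(i < 20) for i in range(100)}
train_ids, holdout_ids = stratified_split(outcomes)

print(len(train_ids), len(holdout_ids))
```

Any hyper-parameter tuning, for example by cross-validation, would then use only `train_ids`, while `holdout_ids` is set aside untouched until the final validation.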

Model validation is done only once on the holdout set. The performance measures to report are still to be added.

Observational Health Data Sciences and Informatics