User Tools

Site Tools


development:database_migration_management

Background

The OHDSI WebAPI services executable creates and maintains some read/write tables in the OHDSI database schema for services that need to persist data. These tables are not part of the Common Data Model. They are used by the OHDSI WebAPI services to save data such as analytical results, cohort definitions and operational data such as batch job execution statuses.

The OHDSI WebAPI executable may migrate some of these tables (and other database objects like sequences) in the OHDSI schema to later versions when new versions of the WebAPI services are released that require database table changes.

The OHDSI WebAPI executable has a dependency on a java framework called “Flyway” to manage the execution of the database schema migration scripts that perform the database migrations.

The advantages of this approach are:

  • It provides a timestamped, checksum verified audit trail of all applied database schema changes
  • All database schema changes are performed via SQL DDL scripts that are version controlled
  • Initial creation of tables and all subsequent database migrations are automatically executed dynamically in new and existing databases

The basics of how Flyway works

The Flyway data migration framework will execute database migration scripts that are provided in a known directory (or sub-directory) in the source code tree and each script name must begin with a database migration version number.

Each database migration script is a SQL source code file containing only the set of Database Definition Language (DDL) changes that need to be applied to the existing database schema in order to migrate it to the latest version.

The scripts are executed in the order of the database migration version number in the script names.

e.g. If we have 3 scripts: V1.0.0.0_circe_schema_migration.sql, V1.0.0.2_heracles_schema_migration.sql and V1.0.0.1_hermes_schema_migration.sql then the order of execution for an initial flyway database migration would be V1.0.0.0_circe_schema_migration.sql, V1.0.0.1_hermes_schema_migration.sql and V1.0.0.2_heracles_schema_migration.sql

Flyway allows scripts with the same name to exist in different sub-directories of the known migration scripts directory. This is very helpful to enable support for multiple DBMS systems. With the way that the WebAPI build is setup, the specific DBMS database migration script that runs will be dependent on which maven database-profile is active (or by default the SQL Server version).

Flyway keeps an audit trail of database migration scripts that have previously been run in it's own database table called “SCHEMA_VERSION”. That table includes the database migration version number (so Flyway knows which scripts have previously been run) and also a script checksum value so Flyway knows if a database script has been changed since it was last run. Note. The checksum value is sensitive to file end of line differences across platforms. e.g. A script with carriage return + line feed chars is not the same as a script with just carriage returns.

Using the SCHEMA_VERSION table Flyway only runs database migration scripts with version numbers greater than the latest previously run script version number. It also will throw an error if the contents of the database migration script has been changed since the last time it was run.

How the OHDSI WebAPI services use Flyway

The WebAPI database migration scripts are created in the sub-folders of the https://github.com/OHDSI/WebAPI/tree/master/src/main/resources/db/migration folder. A version of the database migration script must be placed in each subfolder. There is one subfolder for each DBMS. Each DBMS script must have valid SQL in the SQL dialect of that DBMS. e.g. Postgresql SQL in the migration script under the Postgresql sub-directory.

The database migration scripts file names are prefixed with “Vn.n.n.n_” where n.n.n.n is the numeric version number. e.g. V1.0.0.0. There is a separate version number assigned to each set of WebAPI services. Use the next available higher unused version number if you have a new set of WebAPI services. Additional database migration changes for the same set of WebAPI services would use a sub version (with an additional '.n' on the version suffix.

e.g. The next heracles related database migration script after V1.0.0.1_heracles_schema_migation.sql would be called V1.0.0.1.1_heracles_schema_migration.sql.

Note. The OHDSI WebAPI build has the Flyway “out of order execution” option enabled so that the rules around database script execution order are relaxed. This provides needed flexibility when we have typically one developer working on each subset of WebAPI services. Since we want to enable each developer to be able to work independently in parallel, this approach avoids the more complex alternative of requiring separate forks for WebAPI database migrations.

When the Flyway “out of order execution” option is enabled, it means that newly created scripts will also be run (the first time flyway encounters them) even if their version numbers are lower than the highest previously executed version number.

For the OHDSI WebAPI having this option enabled gives us the flexibility for one developer to e.g. add a new database migration script V1.0.0.1.1_heracles_schema_migration.sql and it will be executed one time, even if the last previous version change was V1.0.0.4_circe_schema_migration.sql. i.e. It allows each developer to flexibly & independently migrate their database tables for the set of WebAPI services they are responsible for, without schema change conflicts, as long as those sets of tables are distinct from the tables used by other WebAPI services.

Testing Flyway migration scripts prior to a pull request

Ideally you will need access to a local test environment with all three DBMS systems - SQL Server, Postgresql and Oracle. Build a copy of the WebAPI locally with whichever database profile you are most familiar with using. Flyway will then automatically execute the database migration script according to your maven build settings and you will see the tables are automatically created in your database. You can then manually test the database scripts for the other DBMSs by manually running them in a matching local DBMS to ensure they execute as expected.

Create idempotent database migration scripts

If a script fails half way through then you don't want to have to manually drop the objects that were created. This can be avoided by including DDL statements to drop objects if they already exist. That way a script will clean-up previously created objects automatically if it is re-run.

Manually rolling back a Flyway migration

It is not encouraged but it is possible to edit a database migration script and re-run it. You would need to manually delete the row containing the record of the previous execution of the script from the SCHEMA_VERSION table (and if necessary also drop any database objects that had been created). Care must be taken not to delete any other rows from the SCHEMA_VERSION table.

development/database_migration_management.txt · Last modified: 2015/04/24 13:18 by lee