User Tools

Site Tools


documentation:software:whiterabbit:test_framework

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
documentation:software:whiterabbit:test_framework [2016/06/03 06:22]
schuemie
documentation:software:whiterabbit:test_framework [2020/02/18 14:13] (current)
maximmoinat
Line 1: Line 1:
 ====== Rabbit-In-a-Hat testing framework ====== ====== Rabbit-In-a-Hat testing framework ======
 +
 +**NOTICE FEB 2020:** the current documentation can be found on Github
 +http://​ohdsi.github.io/​WhiteRabbit/​riah_test_framework.html
 +
  
 Rabbit-In-a-Hat can generate a framework for creating a set of [[https://​en.wikipedia.org/​wiki/​Unit_testing|unit tests]]. The framework consists of a set of R functions tailored to the source and target schema in your ETL. These functions can then be used to define the unit tests. ​ Rabbit-In-a-Hat can generate a framework for creating a set of [[https://​en.wikipedia.org/​wiki/​Unit_testing|unit tests]]. The framework consists of a set of R functions tailored to the source and target schema in your ETL. These functions can then be used to define the unit tests. ​
  
-Unit testing assumes that you have your data in source format somewhere in a database. You should already have created an ETL process that will extract from the source database, transform it into CDM format, and load it into a CDM schema. The unit test framework can be used to make sure that your ETL process is doing what it is supposed to do. For this you will need to create a new, empty database with exactly the same structure as your source database, and a new empty database where a test CDM database will live. The framework can be used to insert test data into the empty source schema, and once you'​ve ​run your ETL process on the test data it can be used to verify that the output of the ETL (in the test CDM schema) ​is what you'd expect given the test source data.+Unit testing assumes that you have your data in source format somewhere in a database. You should already have created an ETL process that will extract from the source database, transform it into CDM format, and load it into a CDM schema. The unit test framework can be used to make sure that your ETL process is doing what it is supposed to do. For this you will need to create a new, empty database with exactly the same structure as your source database, and a new empty database where a test CDM database will live. The framework can be used to insert test data into the empty source schema. You can then run your ETL process on the test data to populate the test CDM database. you can then use the framework ​to verify that the output of the ETL in the test CDM database ​is what you'd expect given the test source data.
  
 ===== Overview ===== ===== Overview =====
Line 73: Line 77:
 ==== Generate test data in the source data schema ==== ==== Generate test data in the source data schema ====
  
-If we run our full R script two variables are created in R. The first variable is ''​%%insertSql%%'',​ and it contains ​the SQL for creating ​the test data. The SQL assumes that the data schema already exists, and will first remove any records that might be in the tables. We can execute the SQL in any SQL client, or we can use OHDSI'​s [[https://​github.com/​OHDSI/​DatabaseConnector|DatabaseConnector package]]. For example:+After we have defined all our tests we need to run  
 +<​code>​ 
 +insertSql ​<- generateInsertSql(databaseSchema = "​nativeTestSchema"​) 
 +testSql <- generateTestSql(databaseSchema = "​cdmTestSchema"​) 
 +</​code>​ 
 +to generate ​the SQL for inserting ​the test data in the database (insertSql),​ and for running the tests on the ETL-ed data (testSql). The insertion ​SQL assumes that the data schema already exists ​in ''​nativeTestSchema''​, and will first remove any records that might be in the tables. We can execute the SQL in any SQL client, or we can use OHDSI'​s [[https://​github.com/​OHDSI/​DatabaseConnector|DatabaseConnector package]]. For example:
  
 <​code>​ <​code>​
Line 83: Line 92:
 connection <- connect(connectionDetails) connection <- connect(connectionDetails)
  
-executeSql(connection,​ "USE source_schema"​) 
 executeSql(connection,​ paste(insertSql,​ collapse = "​\n"​)) executeSql(connection,​ paste(insertSql,​ collapse = "​\n"​))
 </​code>​ </​code>​
Line 89: Line 97:
 ==== Run your ETL on the test data ==== ==== Run your ETL on the test data ====
  
-Now that the test source data is populated. You can run the ETL process you would like to test.+Now that the test source data is populated. You can run the ETL process you would like to test. The ETL should transform the data in ''​nativeTestSchema''​ to CDM data in ''​cdmTestSchema''​.
  
 ==== Test whether the CDM data meets expectations ==== ==== Test whether the CDM data meets expectations ====
  
-The second variable created by our R script is called ''​%%testSql%%''​. It contains ​SQL that will create a table called ''​%%test_results%%''​ in your CDM schema, and populate it with the results of the tests. (If the table already exists it will first be dropped). Again, we could use any SQL client to run this SQL, or we could use DatabaseConnector:​+The test SQL will create a table called ''​%%test_results%%''​ in ''​cdmTestSchema''​, and populate it with the results of the tests. (If the table already exists it will first be dropped). Again, we could use any SQL client to run this SQL, or we could use DatabaseConnector:​
  
 <​code>​ <​code>​
-executeSql(connection,​ "USE cdm_schema"​) 
 executeSql(connection,​ paste(testSql,​ collapse = "​\n"​)) executeSql(connection,​ paste(testSql,​ collapse = "​\n"​))
 </​code>​ </​code>​
documentation/software/whiterabbit/test_framework.1464934949.txt.gz · Last modified: 2016/06/03 06:22 by schuemie