This shows you the differences between two versions of the page.
documentation:laertes_etl [2015/05/26 18:34] lee |
documentation:laertes_etl [2015/06/23 10:58] (current) lee |
||
---|---|---|---|
Line 33: | Line 33: | ||
This model decouples the data sources from the various copies of the | This model decouples the data sources from the various copies of the | ||
sources that might have been processed in many different ways. It also | sources that might have been processed in many different ways. It also | ||
- | decouples what can be said about and evidence item (i.e., the semantic | + | decouples what can be said about an evidence item (i.e., the semantic |
tags) from the information artifact. All of this allows for greater | tags) from the information artifact. All of this allows for greater | ||
flexibility with respect to inclusion of sources and | flexibility with respect to inclusion of sources and | ||
Line 173: | Line 173: | ||
==== “Terminology-mappings” directory ==== | ==== “Terminology-mappings” directory ==== | ||
- | This directory contains a number of data sets used to ultimately map terminologies from the source terms to the OMOP CDM standard terms. | + | This directory contains a number of data sets ultimately used to map terminologies from the source terms to the OMOP CDM standard terms. |
==== ETL Process Overview ==== | ==== ETL Process Overview ==== | ||
Line 195: | Line 195: | ||
* Load the RDF ntriple graph data into the Virtuoso database | * Load the RDF ntriple graph data into the Virtuoso database | ||
* Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | * Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | ||
+ | * Manually load the annotation URIs into the URL Shortener MySQL database using the MySQL command line client | ||
* Run Python script to load the export file into the PostgreSQL public schema database | * Run Python script to load the export file into the PostgreSQL public schema database | ||
Line 202: | Line 203: | ||
https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/EuSPC | https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/EuSPC | ||
- | ==== PUBMED/MEDLINE data feed ==== | + | ==== PUBMED / MEDLINE data feed ==== |
MEDLINE records for indexed literature reporting adverse drug events | MEDLINE records for indexed literature reporting adverse drug events | ||
Line 210: | Line 211: | ||
* Run python scripts to convert the data into RDF ntriple graph data | * Run python scripts to convert the data into RDF ntriple graph data | ||
* Load the RDF ntriple graph data into the Virtuoso database | * Load the RDF ntriple graph data into the Virtuoso database | ||
+ | * Manually load the annotation URIs into the URL Shortener MySQL database using the MySQL command line client | ||
* Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | * Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | ||
* Run Python script to load the export file into the PostgreSQL public schema database | * Run Python script to load the export file into the PostgreSQL public schema database | ||
Line 217: | Line 219: | ||
The details for this data feed are documented and maintained here: | The details for this data feed are documented and maintained here: | ||
https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/PubMed | https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/PubMed | ||
- | |||
==== SPLICER SPL data feed ==== | ==== SPLICER SPL data feed ==== | ||
SPLICER Natural Language Processing extracted Adverse Drug Events from FDA Structured Product Labels (SPLs) | SPLICER Natural Language Processing extracted Adverse Drug Events from FDA Structured Product Labels (SPLs) | ||
Line 226: | Line 227: | ||
* Run python scripts to convert the data into RDF ntriple graph data | * Run python scripts to convert the data into RDF ntriple graph data | ||
* Load the RDF ntriple graph data into the Virtuoso database | * Load the RDF ntriple graph data into the Virtuoso database | ||
+ | * Manually load the annotation URIs into the URL Shortener MySQL database using the MySQL command line client | ||
* Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | * Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | ||
* Run Python script to load the export file into the PostgreSQL public schema database | * Run Python script to load the export file into the PostgreSQL public schema database | ||
Line 233: | Line 235: | ||
The details for this data feed are documented and maintained here: | The details for this data feed are documented and maintained here: | ||
https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/SPLICER | https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/SPLICER | ||
- | |||
==== SemMED data feed ==== | ==== SemMED data feed ==== | ||
The Semantic MEDLINE Database is a repository of semantic predications (subject-predicate-object triples) extracted by SemRep, a semantic interpreter of biomedical text. | The Semantic MEDLINE Database is a repository of semantic predications (subject-predicate-object triples) extracted by SemRep, a semantic interpreter of biomedical text. | ||
Line 243: | Line 244: | ||
* Run python scripts to convert the data into RDF ntriple graph data | * Run python scripts to convert the data into RDF ntriple graph data | ||
* Load the RDF ntriple graph data into the Virtuoso database | * Load the RDF ntriple graph data into the Virtuoso database | ||
+ | * Manually load the annotation URIs into the URL Shortener MySQL database using the MySQL command line client | ||
* Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | * Manually run Virtuoso SPARQL query to export the drug/hoi combinations along with the adverse event counts into an export file | ||
* Run Python script to load the export file into the PostgreSQL public schema database | * Run Python script to load the export file into the PostgreSQL public schema database |
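
The final step shared by the feeds above (loading the SPARQL export file into PostgreSQL) could be prepared along the following lines. This is only a minimal sketch, not the project's actual loader: it assumes the export is tab-separated with three columns (drug, HOI, adverse event count), and the names `DrugHoiCount`, `parse_export`, `to_insert_params`, the table `drug_hoi_evidence`, and its columns are all hypothetical. The real scripts and file layouts are documented in the GitHub directories linked for each feed.

```python
import csv
import io
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class DrugHoiCount(NamedTuple):
    """One row of the export file (column layout is an assumption)."""
    drug: str   # drug identifier
    hoi: str    # health outcome of interest identifier
    count: int  # adverse event count for the drug/HOI pair


def parse_export(lines: Iterable[str]) -> Iterator[DrugHoiCount]:
    """Yield one record per non-blank row of a tab-separated export."""
    reader = csv.reader(lines, delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        # Skip blank lines, which exports often contain at the end.
        if not any(cell.strip() for cell in row):
            continue
        if len(row) < 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(row)}"
            )
        drug, hoi, count = (cell.strip() for cell in row[:3])
        yield DrugHoiCount(drug, hoi, int(count))


def to_insert_params(records: Iterable[DrugHoiCount]) -> List[Tuple[str, str, int]]:
    """Convert records into parameter tuples for a parameterized INSERT."""
    return [(r.drug, r.hoi, r.count) for r in records]


# Hypothetical target table and columns; a PostgreSQL driver would run this
# statement once per parameter tuple (for example via cursor.executemany).
INSERT_SQL = (
    "INSERT INTO drug_hoi_evidence (drug, hoi, evidence_count) "
    "VALUES (%s, %s, %s)"
)

# Small in-memory stand-in for an export file, including a blank line.
sample = io.StringIO("drug:1\thoi:9\t3\n\ndrug:2\thoi:9\t1\n")
params = to_insert_params(parse_export(sample))
```

Keeping the parsing separate from the database call means the export file can be validated before any connection to the PostgreSQL public schema is opened.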