documentation:athena:import_snomed

Differences

This shows you the differences between two versions of the page.

documentation:athena:import_snomed [2015/04/01 08:45]
gleb_malikov created
documentation:athena:import_snomed [2015/04/02 04:30] (current)
cgreich
Line 1: Line 1:
 +I would build the logic slightly differently:​
 +
 +1. Concepts.
 +- We have one authoritative source: SNOMED International in combination with SNOMED UK. Other components might follow later (DM+D, other country-specific versions).
 +- We get a stream of concepts from them:
 +  - Attributes of existing concepts are overwritten by the new concepts
 +  - New concepts are added
 +  - Missing concepts are deprecated
 +  - Explicitly deprecated (inactivated) concepts are deprecated
 +  - We do domain assignments for all of them. This is done by building the entire hierarchical tree and defining "peaks", whose children all inherit their domain.
 +  - We define standard_concepts depending on their deprecation status and domain
 +- We get a stream of concept-to-concept relationships
 +  - New ones get added
 +  - Missing ones: if the concepts are deprecated, we leave them alone; if the concepts are active, we deprecate them
 +  - Explicitly deprecated ones are deprecated
 +- We get a stream of update (inactive to active) relationships (only one per deprecated concept must exist)
 +  - New ones get added
 +  - Existing identical ones get left alone
 +  - An existing update relationship to a different concept gets deprecated and the new one is added
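The concept-stream rules above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the Athena implementation: the `Concept` record, the functions `apply_concept_stream` and `assign_domains`, and the in-memory dictionaries are hypothetical names invented for this example. It also assumes each concept has a single parent, which simplifies the real SNOMED hierarchy.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Concept:
    # Hypothetical record; field names are assumptions, not the OMOP schema.
    code: str
    name: str
    active: bool = True          # False when the source explicitly inactivates it
    deprecated: bool = False
    domain: Optional[str] = None


def apply_concept_stream(existing: Dict[str, Concept],
                         incoming: Iterable[Concept]) -> Dict[str, Concept]:
    """Merge a new source release into the existing concepts."""
    merged: Dict[str, Concept] = {}
    for concept in incoming:
        # New concepts are added; attributes of existing ones are overwritten.
        # Explicitly inactivated concepts are marked deprecated.
        concept.deprecated = not concept.active
        merged[concept.code] = concept
    for code, old in existing.items():
        if code not in merged:
            # Concepts missing from the new release are deprecated, not deleted.
            old.deprecated = True
            merged[code] = old
    return merged


def assign_domains(concepts: Dict[str, Concept],
                   parent_of: Dict[str, str],
                   peaks: Dict[str, str]) -> None:
    """Give each concept the domain of its nearest ancestor that is a peak."""
    for code, concept in concepts.items():
        current: Optional[str] = code
        seen = set()  # guards against accidental cycles in parent_of
        while current is not None and current not in seen:
            if current in peaks:
                concept.domain = peaks[current]
                break
            seen.add(current)
            current = parent_of.get(current)
```

Here `peaks` maps a peak concept code to its domain and `parent_of` maps a child code to its parent code; a concept with no peak among its ancestors keeps `domain = None` and would need manual review.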
 +
 +Makes sense?
 +
 +I am not sure we need UMLS for them. UMLS is really only a reformatting of SNOMED; there isn't much going on, unless you found something in http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/SNOMEDCT_US. Take a look.
 ====== Import data from SNOMED vocabulary. ======
  
Line 21: Line 44:
   - Search OMOP by the "CONCEPT_CODE"
   - Query the UMLS by web API.
 +After these checks we will have the following data:
 +
 +|Records processed| **X** |
 +|Records recognized only by OMOP| **Y** |
 +|Records recognized only by UMLS| **Z** |
 +|Records recognized by OMOP and UMLS| **N**|
 +|Records not recognized| **M**|
 +
 +From this table we can say that:
 +
 +**N** - stable records recognized by both systems; most likely they are valid.
 +
 +**Z** - missing records that should be added to OMOP. We can use UMLS data for validation purposes.
 +
 +**Y** - these records should be inspected. They might be invalid, or we might be importing a newer version of SNOMED than the one included in UMLS.
 +
 +**M** - new records that were just added in the new version of SNOMED. We need to validate them using the source description.
 +
 +Also, we must be able to view each of these subsets as a table, or export it to a file at the user's request.
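As a rough sketch of how the four subsets could be computed, the Python below splits source codes into N, Y, Z and M. The function name `classify_records` is hypothetical, and the example assumes the OMOP "CONCEPT_CODE" search and the UMLS web API lookups have already been resolved into two plain sets of recognized codes; it does not call either system.

```python
from typing import Dict, Iterable, List, Set


def classify_records(codes: Iterable[str],
                     omop_codes: Set[str],
                     umls_codes: Set[str]) -> Dict[str, List[str]]:
    """Split source concept codes into the N, Y, Z and M subsets."""
    subsets: Dict[str, List[str]] = {"N": [], "Y": [], "Z": [], "M": []}
    for code in codes:
        in_omop = code in omop_codes   # assumed result of the OMOP lookup
        in_umls = code in umls_codes   # assumed result of the UMLS lookup
        if in_omop and in_umls:
            subsets["N"].append(code)  # recognized by both
        elif in_omop:
            subsets["Y"].append(code)  # recognized only by OMOP
        elif in_umls:
            subsets["Z"].append(code)  # recognized only by UMLS
        else:
            subsets["M"].append(code)  # not recognized
    return subsets
```

The total X is then the sum of the four subset sizes, and each list can be shown as a table or written to a file on request.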
 +==== Validation ====
 +
 +This process lets us ensure that OMOP describes the Concept exactly as the Source vocabulary does. We can also use the UMLS API for additional checks.
 +First we should determine the Concept's type. It can be:
 +  * Domain
 +  * Relationship
 +  * Standard Concept
 +  * Classification Concept
 +  * Vocabulary
 +Once the type of the Concept has been determined, we can perform additional checks that are specific to each type.
 +
 +=== Domain ===
 +
 +If the current concept is a Domain, we can verify that:
 +  * There is a Domain entity connected with this Concept.
 +  * The string description of the Source Concept is equal to CONCEPT_NAME and DOMAIN_NAME. If they are not equal, there must be at least one connected Concept Synonym with an equal CONCEPT_SYNONYM_NAME.
 +
 +=== Relationship ===
 +
 +If the current Source Concept is a Relationship, we should compare it with the Relationship Mapping.
 +
 +=== Classification Concept ===
 +  * Must belong to the same Domain as the Source Concept.
 +  * Must be part of the same Vocabulary as the one being imported.
 +  * Must have the same number of siblings as the Source Concept.
 +  * Must have an equal CONCEPT_NAME or CONCEPT_SYNONYM.
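The Classification Concept checks could be expressed along these lines. This is only a sketch: `ConceptView` is a hypothetical flattened view of a concept (not an OMOP table definition), `check_classification_concept` is an invented helper, and the sibling counts and name sets are assumed to have been collected beforehand.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ConceptView:
    # Hypothetical flattened view of a concept; not an OMOP table definition.
    domain: str
    vocabulary: str
    sibling_count: int
    names: Set[str] = field(default_factory=set)  # name plus any synonyms


def check_classification_concept(source: ConceptView,
                                 omop: ConceptView,
                                 importing_vocabulary: str) -> List[str]:
    """Return the list of failed checks; an empty list means all passed."""
    failures: List[str] = []
    if omop.domain != source.domain:
        failures.append("domain mismatch")
    if omop.vocabulary != importing_vocabulary:
        failures.append("vocabulary mismatch")
    if omop.sibling_count != source.sibling_count:
        failures.append("sibling count mismatch")
    if not (omop.names & source.names):
        # Neither the concept name nor any synonym matches the source.
        failures.append("no matching name or synonym")
    return failures
```

Returning every failed check, rather than stopping at the first one, makes it easier to report all discrepancies for a concept in a single pass.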
  
 +=== Standard Concept ===
 +  * Must belong to the same Domain.
 +  * Must be part of the same vocabulary.
 +  * Must have equal Relations within the vocabulary being imported.
  
documentation/athena/import_snomed.1427877957.txt.gz · Last modified: 2015/04/01 08:45 by gleb_malikov