documentation:athena:import_snomed

Differences

documentation:athena:import_snomed [2015/04/01 10:04]
gleb_malikov
documentation:athena:import_snomed [2015/04/02 04:30] (current)
cgreich
Line 1: Line 1:
 +I would build the logic slightly differently:
 +
 +1. Concepts.
 +- We have one authoritative source: SNOMED International in combination with SNOMED UK. Other components might follow later (DM+D, other country-specific versions).
 +- We get a stream of concepts from them:
 +  - Attributes of existing concepts are overwritten by the new concepts
 +  - New concepts are added
 +  - Missing concepts are deprecated
 +  - Explicitly deprecated (inactivated) concepts are deprecated
 +  - We do domain assignments for all of them. This is done by building the entire hierarchical tree and defining "peaks" whose descendants all inherit the peak's domain.
 +  - We define standard_concepts depending on their deprecation status and domain
 +- We get a stream of concept-to-concept relationships
 +  - New ones get added
 +  - Missing ones: if the concepts are deprecated, we leave them alone; if the concepts are active, we deprecate them
 +  - Explicitly deprecated ones are deprecated
 +- We get a stream of update (inactive to active) relationships (only one per deprecated concept must exist)
 +  - New ones get added
 +  - Existing identical ones get left alone
 +  - Existing update relationships to a different concept get deprecated and the new one is added
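
The reconciliation rules in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Concept` record, the dictionary-based storage and both function names are hypothetical and not part of Athena. The list does not say what to do with a missing relationship when only one of its concepts is deprecated; the sketch assumes it is left alone.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Concept:
    # Hypothetical minimal record; real concepts carry more attributes.
    code: str
    name: str
    active: bool = True  # False means deprecated


def reconcile_concepts(existing, incoming):
    """Apply one source release to the stored concepts (dicts keyed by code)."""
    # New concepts are added and attributes of existing ones are overwritten.
    # Explicitly inactivated concepts arrive with active=False.
    result = dict(incoming)
    for code, old in existing.items():
        if code not in incoming:
            # Missing from the release: keep the row, mark it deprecated.
            result[code] = replace(old, active=False)
    return result


def reconcile_relationships(existing, incoming, concepts):
    """Relationships map (code_1, relationship_id, code_2) to an active flag."""
    result = dict(existing)
    result.update(incoming)  # new ones are added, explicit deprecations applied
    for key in existing:
        if key in incoming:
            continue
        code_1, _, code_2 = key
        ends_active = all(
            code in concepts and concepts[code].active
            for code in (code_1, code_2)
        )
        if ends_active:
            # Missing between two active concepts: deprecate the relationship.
            result[key] = False
        # Assumption: if either concept is deprecated, leave it alone.
    return result
```

Calling `reconcile_concepts` first and passing its result as `concepts` to `reconcile_relationships` keeps the relationship step consistent with the deprecation status decided for the concepts.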
 +
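The "peaks" idea for domain assignment could look roughly like the sketch below. It assumes the hierarchy is available as a parent-to-children mapping and that peaks are listed from most general to most specific, so that a more specific peak overrides the domain inherited from a broader one. The function name, the input shapes and the domain labels in the usage example are illustrative assumptions.

```python
from collections import deque


def assign_domains(children, peaks):
    """Propagate each peak's domain to all of its descendants.

    children: dict mapping a parent code to a list of child codes.
    peaks:    list of (peak_code, domain) pairs, most general first
              (assumption: later peaks override earlier ones).
    Returns a dict mapping concept code to domain.
    """
    domain_of = {}
    for peak, domain in peaks:
        queue = deque([peak])
        seen = {peak}  # guards against revisiting concepts with several parents
        while queue:
            code = queue.popleft()
            domain_of[code] = domain
            for child in children.get(code, []):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
    return domain_of


# Usage with a made-up hierarchy and made-up domain labels.
hierarchy = {
    "root": ["finding", "procedure"],
    "finding": ["disease"],
    "disease": ["flu"],
    "procedure": ["biopsy"],
}
domains = assign_domains(
    hierarchy,
    [("root", "Observation"), ("finding", "Condition"), ("procedure", "Procedure")],
)
```

Concepts that are not reachable from any peak get no entry in the result, which makes them easy to list for manual review.
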
 +Makes sense?
 +
 +I am not sure we need UMLS for them. UMLS is really only a reformatting of SNOMED. There isn't much going on, unless you found something in http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/SNOMEDCT_US. Take a look.
 ====== Import data from SNOMED vocabulary. ======
  
Line 23: Line 46:
 After these checks we will receive the data:
  
-|Records processed| X |
-|Records recognized only by OMOP| Y |
-|Records recognized only by UMLS| Z |
-|Records recognized by OMOP and UMLS| N|
-|Records not recognized| M|
+|Records processed| **X** |
+|Records recognized only by OMOP| **Y** |
+|Records recognized only by UMLS| **Z** |
+|Records recognized by OMOP and UMLS| **N** |
+|Records not recognized| **M**|
  
 From this table we can say that:
  
-N - stable records recognized by both systems; most likely they are valid.
+**N** - stable records recognized by both systems; most likely they are valid.
  
-Z - missing records that should be added to OMOP. We can use UMLS data for validation purposes.
+**Z** - missing records that should be added to OMOP. We can use UMLS data for validation purposes.
  
-Y - this data should be inspected. There might be invalid records, or we may be importing a newer version of SNOMED than the one included in UMLS.
+**Y** - this data should be inspected. There might be invalid records, or we may be importing a newer version of SNOMED than the one included in UMLS.
  
-M - new records that were just added in the new version of SNOMED. We need to validate them using the source description.
+**M** - new records that were just added in the new version of SNOMED. We need to validate them using the source description.
  
 +Also, we must be able to view each of these subsets as a table or export it to a file at the user's request.
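
One way to compute the five subsets is plain set arithmetic over concept codes, as in the sketch below. It assumes a record counts as "recognized" by a system when its code is present in that system's code list; the function names and the CSV layout are hypothetical.

```python
import csv
import io


def partition_records(source_codes, omop_codes, umls_codes):
    """Split the imported codes into the subsets X, Y, Z, N and M."""
    source = set(source_codes)
    omop = set(omop_codes)
    umls = set(umls_codes)
    return {
        "X": source,                    # records processed
        "Y": (source & omop) - umls,    # recognized only by OMOP
        "Z": (source & umls) - omop,    # recognized only by UMLS
        "N": source & omop & umls,      # recognized by OMOP and UMLS
        "M": source - omop - umls,      # not recognized
    }


def subset_as_csv(codes):
    """Render one subset as CSV text, ready to be written to a file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code"])
    for code in sorted(codes):
        writer.writerow([code])
    return buffer.getvalue()
```

Because Y, Z, N and M are disjoint and together cover X, checking that their sizes add up to the size of X is a cheap sanity test for the import.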
 ==== Validation ==== ==== Validation ====
  
Line 50: Line 74:
 After the type of the Concept has been defined, we can perform additional checks that are specific to each type.
  
 +=== Domain ===
 +
 +If the current Concept is a Domain, we can verify that:
 +  * There is a Domain entity connected with this Concept.
 +  * The string description of the Source Concept is equal to both CONCEPT_NAME and DOMAIN_NAME. If they are not equal, there must be at least one connected Concept Synonym with an equal CONCEPT_SYNONYM_NAME.
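
The two Domain checks can be expressed as a small function that returns a list of problems, empty when the Concept passes. The function name and argument shapes are assumptions made for this sketch; `domain_name` is `None` when no Domain entity is connected.

```python
def validate_domain_concept(source_name, concept_name, domain_name, synonym_names):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if domain_name is None:
        problems.append("no Domain entity is connected with this Concept")
    names_match = source_name == concept_name and source_name == domain_name
    if not names_match and source_name not in synonym_names:
        # Names differ, so a matching CONCEPT_SYNONYM_NAME is required.
        problems.append("source description matches neither the names nor any synonym")
    return problems
```

Returning all problems at once, rather than stopping at the first, lets a reviewer see every failed check for a Concept in a single pass.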
 +
 +=== Relationship ===
 +
 +If the current Source Concept is a Relationship, we should compare it with the Relationship Mapping.
 +
 +=== Classification Concept ===
 +  * Must belong to the same Domain as the Source Concept.
 +  * Must be a part of the same Vocabulary as the one being imported.
 +  * Must have the same count of siblings as the Source Concept.
 +  * Must have an equal CONCEPT_NAME or CONCEPT_SYNONYM.
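
As a sketch, the four Classification Concept rules map onto four comparisons between the stored Concept and the Source Concept. The `ConceptView` record and its field names are hypothetical stand-ins for whatever the importer actually loads.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptView:
    # Hypothetical flattened view of a concept and its surroundings.
    name: str
    domain: str
    vocabulary: str
    sibling_count: int
    synonyms: set = field(default_factory=set)


def check_classification_concept(candidate, source, importing_vocabulary):
    """Return the names of the rules that `candidate` fails against `source`."""
    failed = []
    if candidate.domain != source.domain:
        failed.append("domain")
    if candidate.vocabulary != importing_vocabulary:
        failed.append("vocabulary")
    if candidate.sibling_count != source.sibling_count:
        failed.append("sibling count")
    if source.name != candidate.name and source.name not in candidate.synonyms:
        failed.append("name")
    return failed
```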
 +
 +=== Standard Concept ===
 +  * Must belong to the same Domain.
 +  * Must be a part of the same Vocabulary.
 +  * Must have equal Relations within the Vocabulary being imported.
  
documentation/athena/import_snomed.1427882685.txt.gz · Last modified: 2015/04/01 10:04 by gleb_malikov