This shows you the differences between two versions of the page.
| Both sides previous revision Previous revision Next revision | Previous revision | ||
|
documentation:athena:import_snomed [2015/04/01 10:04] gleb_malikov |
documentation:athena:import_snomed [2015/04/02 04:30] (current) cgreich |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | I would build the logic slightly differently: | ||
| + | |||
| + | 1. Concepts. | ||
| + | - We have one authoritative source: SNOMED International in combination with SNOMED UK. Other components might follow later (DM+D, other country-specific versions). | ||
| + | - We get a stream of concepts from them: | ||
| + | - Attributes of existing concepts are overwritten by the new concepts | ||
| + | - New concepts are added | ||
| + | - Missing concepts are deprecated | ||
| + | - Explicitly deprecated (inactivated) concepts are deprecated | ||
| + | - We do domain assignments for all of them. This is done by building the entire hierarchical tree and defining "peaks", of which all children inherit their domain. | ||
| + | - We define standard_concepts depending on their deprecation status and domain | ||
| + | - We get a stream of concept-to-concept relationships | ||
| + | - New ones get added | ||
| + | - Missing ones: if the concepts are deprecated, we leave them alone; if the concepts are active, we deprecate them | ||
| + | - Explicitly deprecated ones are deprecated | ||
| + | - We get a stream of update (inactive to active) relationships (only one per deprecated concept must exist) | ||
| + | - New ones get added | ||
| + | - Existing identical ones get left alone | ||
| + | - An existing update relationship to a different concept gets deprecated and the new one is added | ||
| + | |||
| + | Makes sense? | ||
| + | |||
| + | I am not sure we need UMLS for them. UMLS is really only a reformatting of SNOMED. There isn't much going on, unless you found something in http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/SNOMEDCT_US. Take a look. | ||
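The concept rules in step 1 above (overwrite, add, deprecate missing, deprecate explicitly inactivated) can be illustrated with a minimal Python sketch. The `Concept` class, its fields, and `merge_concepts` are hypothetical names chosen for illustration; they are not part of the Athena code base or the OMOP schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Concept:
    # Hypothetical, simplified stand-in for a concept row.
    code: str
    name: str
    active: bool = True  # False means deprecated


def merge_concepts(existing, incoming):
    """Apply one source release to the existing concepts.

    Both arguments map concept code -> Concept; a new dict is returned.
    """
    merged = {}
    for code, new in incoming.items():
        # New concepts are added and attributes of existing ones are
        # overwritten. Explicitly inactivated concepts arrive with
        # active=False, so they end up deprecated.
        merged[code] = new
    for code, old in existing.items():
        if code not in incoming:
            # Missing from the new release: keep the row, but deprecate it.
            merged[code] = replace(old, active=False)
    return merged
```

Deprecated concepts are kept rather than deleted, so relationships that still point at them remain resolvable.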
| ====== Import data from SNOMED vocabulary. ====== | ====== Import data from SNOMED vocabulary. ====== | ||
| Line 23: | Line 46: | ||
| After these checks we will receive the following data: | After these checks we will receive the following data: | ||
| - | |Records processed| X | | + | |Records processed| **X** | |
| - | |Records recognized only by OMOP| Y | | + | |Records recognized only by OMOP| **Y** | |
| - | |Records recognized only by UMLS| Z | | + | |Records recognized only by UMLS| **Z** | |
| - | |Records recognized by OMOP and UMLS| N| | + | |Records recognized by OMOP and UMLS| **N**| |
| - | |Records not recognized| M| | + | |Records not recognized| **M**| |
| From this table we can say that: | From this table we can say that: | ||
| - | N - stable records, recognized by both systems, most likely they are valid. | + | **N** - stable records, recognized by both systems, most likely they are valid. |
| - | Z - missing records, that should be added to OMOP. We can use UMLS data for validation purposes. | + | **Z** - missing records, that should be added to OMOP. We can use UMLS data for validation purposes. |
| - | Y - this data should be inspected. There might be invalid records, or we may be importing a newer version of SNOMED than the one included in UMLS. | + | **Y** - this data should be inspected. There might be invalid records, or we may be importing a newer version of SNOMED than the one included in UMLS. |
| - | M - new records that were just added in the new version of SNOMED. We need to validate them using the source description. | + | **M** - new records that were just added in the new version of SNOMED. We need to validate them using the source description. |
| + | Also, we must be able to view each of these subsets as a table or export it to a file at the user's request. | ||
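As a rough illustration of how the X, Y, Z, N and M groups relate to each other, here is a small Python sketch using set operations. The function name `partition_records` and the idea of passing plain collections of record identifiers are assumptions made for this example, not a description of the actual import code.

```python
def partition_records(source_ids, omop_ids, umls_ids):
    """Split processed source records by which system recognizes them."""
    source = set(source_ids)
    in_omop = source & set(omop_ids)
    in_umls = source & set(umls_ids)
    return {
        "X": source,                       # records processed
        "Y": in_omop - in_umls,            # recognized only by OMOP
        "Z": in_umls - in_omop,            # recognized only by UMLS
        "N": in_omop & in_umls,            # recognized by OMOP and UMLS
        "M": source - in_omop - in_umls,   # not recognized
    }
```

Returning the subsets themselves rather than only their counts keeps each group available for display as a table or for export; the counts in the table above are then just `len()` of each set.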
| ==== Validation ==== | ==== Validation ==== | ||
| Line 50: | Line 74: | ||
| After the type of the Concept has been defined, we can perform additional checks that are specific to each type. | After the type of the Concept has been defined, we can perform additional checks that are specific to each type. | ||
| + | === Domain === | ||
| + | |||
| + | If the current Concept is a Domain, we can verify that: | ||
| + | * There is a Domain entity connected with this Concept. | ||
| + | * The string description of the Source Concept is equal to both CONCEPT_NAME and DOMAIN_NAME. If they are not equal, there must be at least one connected Concept Synonym with an equal CONCEPT_SYNONYM_NAME. | ||
| + | |||
| + | === Relationship === | ||
| + | |||
| + | If the current Source Concept is a Relationship, we should compare it with the Relationship Mapping. | ||
| + | |||
| + | === Classification Concept === | ||
| + | * Must belong to the same Domain as the Source Concept. | ||
| + | * Must be part of the same Vocabulary as the one being imported. | ||
| + | * Must have the same number of siblings as the Source Concept. | ||
| + | * Must have an equal CONCEPT_NAME or CONCEPT_SYNONYM. | ||
| + | |||
| + | === Standard Concept === | ||
| + | * Must belong to the same Domain. | ||
| + | * Must be part of the same vocabulary. | ||
| + | * Must have equal Relations within the vocabulary being imported. | ||
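The Standard Concept checks listed above could be sketched in Python as follows. `ConceptRecord`, its fields, and `validate_standard_concept` are hypothetical names for illustration only; relationships are modeled here as a set of `(relationship, target code)` pairs, which is an assumption rather than the actual storage format.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptRecord:
    # Hypothetical, simplified view of a concept for validation purposes.
    domain: str
    vocabulary: str
    # Assumed shape: a set of (relationship, target concept code) pairs.
    relationships: frozenset = field(default_factory=frozenset)


def validate_standard_concept(source, stored, importing_vocabulary):
    """Return a list of failed checks; an empty list means the concept passed."""
    problems = []
    if stored.domain != source.domain:
        problems.append("domain mismatch")
    if stored.vocabulary != importing_vocabulary:
        problems.append("vocabulary mismatch")
    if stored.relationships != source.relationships:
        problems.append("relationship mismatch")
    return problems
```

Collecting every failed check instead of stopping at the first one makes it easier to report all discrepancies for a concept in a single pass.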