Import data from SNOMED vocabulary.

The source SNOMED vocabulary can be obtained directly from SNOMED, and it is also included in UMLS. We suggest using both of these resources in the import process.

A local copy of the vocabulary will be used to extract the concepts, and the UMLS web API will be used for additional concept analysis. The advantages of this approach are:

  • If the SNOMED vocabulary has been updated, we can process it right away, without waiting for the next UMLS update.
  • Less load on the web API.
  • We can still use UMLS knowledge about the SNOMED vocabulary, especially for cross-vocabulary relations.

We'll start with the basic import process, which will also give us a better understanding of how the import works.

The import process

Each concept in the source dictionary can be:

  • Identified
  • Validated

In the current scope, identification means that OMOP and UMLS already have information about the current Concept. Once a Concept is identified, it can be validated. Each Concept is described by its type, a set of attributes, and its relations with other Concepts. During validation, we must compare the Source and UMLS descriptions of the Concept with the OMOP description. If the translation can be performed in both directions without violating data integrity or validity, we can say that the Concept is valid.
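The two-direction check described above can be sketched as follows. This is only a minimal illustration: the record layouts, the field names on the source side, and the `to_omop` / `to_source` translators are hypothetical placeholders, and real validation would also have to compare types and relations.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def is_valid(source: Record,
             omop: Record,
             to_omop: Callable[[Record], Record],
             to_source: Callable[[Record], Record]) -> bool:
    """Valid only if translation works in both directions without loss."""
    return to_omop(source) == omop and to_source(omop) == source


# Hypothetical, heavily simplified translators between the two layouts.
def to_omop(src: Record) -> Record:
    return {"concept_code": src["id"], "concept_name": src["term"]}


def to_source(rec: Record) -> Record:
    return {"id": rec["concept_code"], "term": rec["concept_name"]}


# Made-up records, not real SNOMED content.
source = {"id": "C-001", "term": "Example finding"}
good = {"concept_code": "C-001", "concept_name": "Example finding"}
bad = {"concept_code": "C-001", "concept_name": "Renamed finding"}

valid_good = is_valid(source, good, to_omop, to_source)
valid_bad = is_valid(source, bad, to_omop, to_source)
```

Here `valid_bad` fails because the OMOP name no longer translates back to the source term, which is the kind of mismatch validation is meant to catch.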

Identification

To identify the Concept we must:

  1. Search OMOP by “CONCEPT_CODE”.
  2. Query UMLS through the web API.
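As a rough sketch of these two steps, the function below classifies a single concept code. It assumes the OMOP codes have been loaded into a set and that the UMLS web API call is wrapped in a caller-supplied function; both `omop_codes` and `umls_has_code` are hypothetical stand-ins, and no real UMLS endpoint is shown.

```python
from typing import Callable, Set


def identify(concept_code: str,
             omop_codes: Set[str],
             umls_has_code: Callable[[str], bool]) -> str:
    """Return which systems recognize the given concept code."""
    in_omop = concept_code in omop_codes   # step 1: search OMOP by CONCEPT_CODE
    in_umls = umls_has_code(concept_code)  # step 2: ask UMLS (stubbed by the caller)
    if in_omop and in_umls:
        return "both"
    if in_omop:
        return "omop_only"
    if in_umls:
        return "umls_only"
    return "unrecognized"


# Toy stand-ins; the codes are made-up placeholders, not real SNOMED identifiers.
omop_codes = {"A1", "A2"}
umls_codes = {"A2", "A3"}
result = {code: identify(code, omop_codes, umls_codes.__contains__)
          for code in ["A1", "A2", "A3", "A4"]}
```

Passing the UMLS lookup in as a function keeps the sketch runnable offline and makes it easy to cache or throttle the real web API calls later.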

After these checks we will have the following counts:

Records processed                      X
Records recognized only by OMOP        Y
Records recognized only by UMLS        Z
Records recognized by OMOP and UMLS    N
Records not recognized                 M

From this table we can say that:

N - stable records, recognized by both systems; they are most likely valid.

Z - records missing from OMOP that should be added. We can use UMLS data to validate them.

Y - records that should be inspected. They might be invalid, or we might be importing a newer version of SNOMED than the one included in UMLS.

M - new records that have just been added in the new version of SNOMED. We need to validate them using the source description.
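The counts X, Y, Z, N and M can be produced with a simple tally once each record carries two flags: recognized by OMOP, and recognized by UMLS. The sketch below assumes those flags have already been computed; the `BUCKETS` mapping and the `tally` helper are illustrative names, not part of any existing tool.

```python
from collections import Counter

# Map (recognized by OMOP, recognized by UMLS) to the letters used above.
BUCKETS = {
    (True, True): "N",    # recognized by both systems
    (False, True): "Z",   # recognized only by UMLS
    (True, False): "Y",   # recognized only by OMOP
    (False, False): "M",  # not recognized
}


def tally(flags):
    """Count records per bucket; `flags` yields (in_omop, in_umls) pairs."""
    counts = Counter(BUCKETS[(bool(o), bool(u))] for o, u in flags)
    counts["X"] = sum(counts.values())  # total records processed
    return counts


# Made-up flags for five records.
flags = [(True, True), (True, True), (False, True), (True, False), (False, False)]
counts = tally(flags)
```

Since every record falls into exactly one bucket, X should always equal N + Z + Y + M, which gives a cheap sanity check on the import run.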

Validation

This process lets us ensure that OMOP describes the Concept exactly as the Source vocabulary does. We can also use the UMLS API for additional checks. First, we should determine the Concept's type. It can be:

  • Domain
  • Relationship
  • Standard Concept
  • Classification Concept
  • Vocabulary

After the type of the Concept has been determined, we can perform the additional checks that are specific to each type.

documentation/athena/import_snomed.1427882685.txt.gz · Last modified: 2015/04/01 10:04 by gleb_malikov