Documentation
Common Data Model (CDM)
Convert Database to CDM (ETL)
Tool Specific Documentation
Usagi is a software tool created by the Observational Health Data Sciences and Informatics (OHDSI) team to help map codes from a source system into the terminologies, preferably standard ones, stored in the Observational Medical Outcomes Partnership (OMOP) Vocabulary (http://www.ohdsi.org/data-standardization/vocabulary-resources/). Usagi is the Japanese word for rabbit; the tool was named after the first mapping exercise it was used for, which mapped source codes from a Japanese dataset into OMOP Vocabulary concepts.
Mapping source codes into the OMOP Vocabulary is valuable for two main reasons:
A file of source codes that need mapping is loaded into Usagi (if the codes are not in English, additional translation columns are needed). A term similarity approach is used to connect source codes to OMOP Vocabulary concepts (currently only OMOP Vocabulary V5). At a high level, this approach works by 1) leveraging the Unified Medical Language System (UMLS) to find synonyms for concepts in the OMOP Vocabulary (e.g. if the concept in the OMOP Vocabulary is “Myocardial infarction”, a synonym for that concept is “heart attack”) and 2) mapping source code descriptions (in English) to OMOP Vocabulary concepts using a term similarity score. However, these suggested connections need to be manually reviewed, and Usagi provides an interface to facilitate that.
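The two-step idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scoring function is a plain cosine similarity over word counts, chosen for simplicity and not taken from Usagi's source code, and the concept identifiers (`C1`, `C2`), the function names, and the synonym list are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the token-count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def best_match(source_description, concepts):
    """Return (concept_id, score) for the best-scoring concept.

    `concepts` maps a concept id to a non-empty list of names (the concept
    name plus its synonyms); a concept's score is its best score over names.
    """
    scored = (
        (concept_id, max(cosine_similarity(source_description, n) for n in names))
        for concept_id, names in concepts.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Hypothetical concept ids and synonyms, for illustration only.
concepts = {
    "C1": ["Myocardial infarction", "heart attack"],
    "C2": ["Heart failure"],
}
match = best_match("Acute heart attack", concepts)
```

Here the source description shares no words with “Myocardial infarction” but matches the synonym “heart attack”, so `C1` is the top suggestion. A suggestion like this is a candidate only and still needs manual review.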
Usagi does not currently translate non-English codes to English. We suggest using Google Translate (https://translate.google.com/). You can paste an entire column of non-English terms into Google Translate, and it will return that same column translated to English.
The typical sequence for using this software is:
All source code and installation instructions are available on Usagi’s GitHub site: https://github.com/OHDSI/Usagi
Any bugs/issues/enhancements should be posted to the GitHub repository: https://github.com/OHDSI/Usagi/issues
Any questions/comments/feedback/discussion can be posted on the OHDSI Developer Forum: http://forums.ohdsi.org/c/developers
Export source codes from the source system into a CSV or Excel (.xlsx) file. The file should have at least the columns SOURCE_CODE and SOURCE_CODE_DESCRIPTION; additional information about the codes can be brought over as well (e.g. DOSE_UNIT). The frequency of each code should also be brought over as FREQUENCY, since this helps prioritize which codes should receive the most mapping effort (e.g. you may have 1,000 source codes of which only 100 are truly used within the system). If any source code information needs translating to English, use Google Translate and add the English translations to your file.
Note: source code extracts should be broken out by domain (e.g. drugs, procedures, conditions, observations) rather than lumped into one large file.
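As a rough sketch of the export step, the Python below writes one CSV file per domain with the SOURCE_CODE, SOURCE_CODE_DESCRIPTION and FREQUENCY columns, listing the most frequent codes first. The `DOMAIN` key on each input row, the function name, the output file naming, and the example rows are all assumptions made for illustration; how the rows are pulled from your source system is not shown.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Columns named in the export instructions above.
COLUMNS = ["SOURCE_CODE", "SOURCE_CODE_DESCRIPTION", "FREQUENCY"]


def write_domain_files(rows, out_dir):
    """Write one CSV per domain, most frequent codes first; return the paths."""
    by_domain = defaultdict(list)
    for row in rows:
        by_domain[row["DOMAIN"]].append(row)  # DOMAIN is a hypothetical key

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for domain, domain_rows in sorted(by_domain.items()):
        domain_rows.sort(key=lambda r: int(r["FREQUENCY"]), reverse=True)
        path = out_dir / f"{domain}_source_codes.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            # extrasaction="ignore" drops the DOMAIN key from the output.
            writer = csv.DictWriter(handle, fieldnames=COLUMNS, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(domain_rows)
        paths.append(path)
    return paths


# Made-up example rows, for illustration only.
example_rows = [
    {"DOMAIN": "drugs", "SOURCE_CODE": "D001",
     "SOURCE_CODE_DESCRIPTION": "Aspirin 100 mg tablet", "FREQUENCY": 25},
    {"DOMAIN": "drugs", "SOURCE_CODE": "D002",
     "SOURCE_CODE_DESCRIPTION": "Paracetamol 500 mg tablet", "FREQUENCY": 310},
    {"DOMAIN": "conditions", "SOURCE_CODE": "C001",
     "SOURCE_CODE_DESCRIPTION": "Heart attack", "FREQUENCY": 12},
]

# Example call (writes files into a folder named "usagi_input"):
# write_domain_files(example_rows, "usagi_input")
```

Sorting by FREQUENCY is optional, but it puts the codes that deserve the most mapping effort at the top of each file.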
Source codes are loaded into Usagi from the File → Import codes menu. An “Import codes …” screen is then displayed, as seen in Figure 1.
Figure 1: Usagi Source Code Input Screen
TBD