Differences

This shows you the differences between two versions of the page.

--- documentation:usagi [2014/12/17 16:14]
ericavoss [Importing Source Codes into Usagi]
+++ — (current)
@@ Line 1: / Line 1: @@
-====== Usagi ======
-{{ :documentation:usagi_logo.png |}}
-===== Introduction =====
-Usagi is a software tool created by the Observational Health Data Sciences and Informatics (OHDSI) team.  It helps map codes from a source system into terminologies, preferably standard ones, stored in the Observational Medical Outcomes Partnership (OMOP) Vocabulary ([[http://www.ohdsi.org/data-standardization/vocabulary-resources/]]).  Usagi is Japanese for rabbit; the tool was named after the first mapping exercise it was used for, which mapped source codes from a Japanese dataset into OMOP Vocabulary concepts.
-Mapping source codes into the OMOP Vocabulary is valuable for two main reasons:
-  - When converting a raw dataset into the OMOP Common Data Model (CDM) ([[http://www.ohdsi.org/data-standardization/the-common-data-model/]]), translating source-specific codes into standard concepts (e.g. RxNorm or SNOMED) puts the source data into a “common language” that other CDMs also follow.
-  - Having source codes tied to OMOP Vocabulary concepts allows a researcher to find relevant source codes through the classification terminologies in the OMOP Vocabulary (e.g. find all antipsychotic medications or find all condition codes related to heart failure).
-==== Scope and purpose ====
-A file of source codes that need mapping is loaded into Usagi (if the codes are not in English, additional translation columns are needed).  A term similarity approach is used to connect source codes to OMOP Vocabulary concepts (currently only OMOP Vocabulary V5).  At a high level, this approach works by 1) leveraging the Unified Medical Language System (UMLS) to find synonyms for concepts in the OMOP Vocabulary (e.g. if the concept in the OMOP Vocabulary is “Myocardial infarction”, a synonym for that concept is “heart attack”) and 2) mapping source code descriptions (in English) to OMOP Vocabulary concepts using a term similarity score.  However, these code connections need to be manually reviewed, and Usagi provides an interface to facilitate that.
-Usagi does not currently translate non-English codes to English.  We suggest using Google Translate ([[https://translate.google.com/]]).  You can paste an entire column of non-English terms into Google Translate, and it will return that same column translated to English.
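The term similarity step described in this section can be illustrated with a small sketch. This page does not describe how Usagi computes its score, so the example swaps in a simple token-overlap (Jaccard) score purely for illustration. The concept IDs, the synonym lists, and the function names are all hypothetical.

```python
import re

# Hypothetical concept table: concept ID -> concept name plus synonyms.
# IDs and synonyms are illustrative only, not taken from the OMOP Vocabulary.
CONCEPTS = {
    1: ["Myocardial infarction", "heart attack"],
    2: ["Heart failure", "cardiac failure"],
}


def tokens(text):
    """Lowercase a term and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap score between two terms, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(description, concepts=CONCEPTS):
    """Return (concept_id, score) for the best-scoring name or synonym.

    Returns (None, 0.0) when nothing overlaps with the description.
    """
    best_id, best_score = None, 0.0
    for concept_id, names in concepts.items():
        score = max(jaccard(description, name) for name in names)
        if score > best_score:
            best_id, best_score = concept_id, score
    return best_id, best_score
```

Calling ''best_match("Acute heart attack")'' picks concept 1 through its synonym “heart attack”, even though the concept name itself shares no words with the source description. A suggestion like this would still need the manual review described above.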
-==== Process Overview ====
-The typical sequence for using this software is:
-  - Load the codes from your source system (“source codes”) that you would like to map to OMOP Vocabulary concepts.
-  - Usagi runs a term similarity approach to map source codes to OMOP Vocabulary concepts.
-  - Use the Usagi interface to check suggested mappings or create new ones.  Preferably, this review should be done by an individual who has experience with the coding system and medical terminology.
-  - Export the final map generated by Usagi into the OMOP Vocabulary’s SOURCE_TO_CONCEPT_MAP.
-===== Installation and support =====
-All source code and installation instructions are available on Usagi’s GitHub site:
-[[https://github.com/OHDSI/Usagi]]
-Any bugs/issues/enhancements should be posted to the GitHub repository:
-[[https://github.com/OHDSI/Usagi/issues]]
-Any questions/comments/feedback/discussion can be posted on the OHDSI Developer Forum: [[http://forums.ohdsi.org/c/developers]]
-===== Using the Application Functions =====
-==== Importing Source Codes into Usagi ====
-Export source codes from the source system into a CSV or Excel (.xlsx) file.  The file should have at least the columns SOURCE_CODE and SOURCE_CODE_DESCRIPTION; additional information about the codes can be brought over as well (e.g. DOSE_UNIT).  The frequency of each code should also be included as FREQUENCY, since this helps prioritize which codes should receive the most mapping effort (e.g. you may have 1,000 source codes of which only 100 are actually used within the system).  If any source code information needs translating to English, use Google Translate and add the English translations to your file.
-Note: source code extracts should be broken out by domain (i.e. drugs, procedures, conditions, observations) and not lumped into one large file.
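As a rough sketch of preparing such extracts, the snippet below groups source codes by domain and produces one CSV per domain, with the most frequent codes first. The SOURCE_CODE, SOURCE_CODE_DESCRIPTION and FREQUENCY column names come from the text above. The "DOMAIN" key and the function name are hypothetical and serve only to split the rows.

```python
import csv
import io
from collections import defaultdict

# Column names taken from the import instructions above.
COLUMNS = ["SOURCE_CODE", "SOURCE_CODE_DESCRIPTION", "FREQUENCY"]


def domain_csvs(rows):
    """Build one CSV text per domain, most frequent codes first.

    `rows` is a list of dicts holding the COLUMNS above plus a "DOMAIN"
    key (hypothetical; used only to decide which file a code belongs in).
    Returns a dict mapping each domain to its CSV content as a string.
    """
    by_domain = defaultdict(list)
    for row in rows:
        by_domain[row["DOMAIN"]].append(row)

    result = {}
    for domain, domain_rows in by_domain.items():
        ordered = sorted(
            domain_rows, key=lambda r: int(r["FREQUENCY"]), reverse=True
        )
        buffer = io.StringIO()
        # extrasaction="ignore" drops the helper "DOMAIN" key from the output.
        writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(ordered)
        result[domain] = buffer.getvalue()
    return result
```

Each returned string can then be written to its own file (for example, one for drugs and one for conditions) and imported into Usagi separately.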
-Source codes are loaded into Usagi from the File --> Import codes menu.  This opens the “Import codes ...” screen shown in Figure 1.
-{{ :documentation:loadingscreen1.png?direct |}}
-**Figure 1:  Usagi Source Code Input Screen**
-In Figure 1, the source code terms were in Dutch and were also translated into English.  Usagi will leverage the English translations to map to the standard vocabulary (SNOMED in this case).
-{{ :documentation:loadingscreen2.png?direct |}}
-**Figure 2:  Telling Usagi how to Read Input File**
-As seen in Figure 2, the //Column mapping// section is where you tell Usagi how to use the imported CSV.  If you hover the mouse over the drop-downs, a pop-up appears defining each column.  Usagi does not use the //Additional info column(s)// to associate source codes with Vocabulary concept codes; however, this additional information may help the individual reviewing the source code mapping and should be included.
-Finally, you can tell Usagi which OMOP Vocabulary terminologies you plan to map into.  For example, in Figure 3, the user is mapping the source codes to the SNOMED standard terminology in the OMOP Vocabulary.  Hover your mouse over each filter for additional information about it.
-One special filter is //Filter by automatically selected concepts//.  If there is information that you can use to restrict the search, you can do so by providing a list of CONCEPT_IDs in the column indicated in the //Auto concept ID column// (semicolon-delimited).  For example, in the case of drugs there might be a mapping available to ATC codes.  Even though an ATC code does not uniquely identify a single RxNorm drug code, it does help limit the search space to only those concepts that fall under the ATC code in the Vocabulary.  By providing this list of CONCEPT_IDs in the //Auto concept ID column//, and turning on //Filter by automatically selected concepts//, Usagi will make use of this information.  In the example above, we used a partial mapping derived from UMLS to restrict Usagi to this mapping when available.
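To make the semicolon-delimited //Auto concept ID column// concrete, here is a minimal sketch of how such a cell could be parsed and used to narrow a list of candidate concepts. How Usagi applies the filter internally is not described on this page, so this is an assumption. In particular, the sketch checks direct membership only and does not expand an ID to the concepts that fall under it. The function names and the (concept_id, score) candidate format are hypothetical.

```python
def parse_auto_concept_ids(cell):
    """Split a semicolon-delimited cell such as "123;456" into a set of ints.

    Empty pieces (e.g. from a trailing semicolon) are skipped.
    """
    return {int(part) for part in cell.split(";") if part.strip()}


def restrict_candidates(candidates, auto_concept_ids):
    """Keep only candidates whose concept ID is in the allowed set.

    `candidates` is a list of (concept_id, score) pairs (hypothetical format).
    An empty allowed set means no restriction is available for this source
    code, so all candidates are returned unchanged.
    """
    if not auto_concept_ids:
        return list(candidates)
    return [(cid, score) for cid, score in candidates if cid in auto_concept_ids]
```

For example, a cell containing ''101; 202;'' parses to the set {101, 202}, and only candidates with one of those concept IDs remain in the search results for that source code.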
-==== Reviewing Source Code to OMOP Vocabulary Concept Maps ====
-TBD
-=== Approving a Suggested Mapping ===
-TBD
-=== Searching for a New Mapping ===
-TBD
-=== Auto Mapped ===
-TBD
-==== Export the Usagi Map Created ====
-TBD
-==== Menu Options ====
-TBD
