Hua Xu, Jon Duke, Noemie Elhadad, Anupama Gururaj, Alexandre Yahi, Thomas Ginter, Olga Patterson, George Hripcsak, Vojtech Huser
Agenda
IRB for use of clinical text
Clinical text data storage and representation schema
Presentation by Dr. Noemie Elhadad, Title: NLP schemas and clinical NLP tools in ShARe
Discussion – Next steps
NLP tools/pipelines for ETL
Use cases, e.g., phenotyping for cohort selection using NLP outputs
Presentation by Dr. Jon Duke, Title: Regenstrief NLP platform and approach to validation of phenotypes
Discussion – Next steps
Discussion
Minutes
IRB for use of clinical text
Clinical text data storage and representation schema
Presentation by Dr. Noemie Elhadad
Title: NLP schemas and clinical NLP tools in ShARe
The output of converting unstructured text can take the form of structured data, a bag of words, or word embeddings. Structured data and bag of words are the most useful in the current context.
The ShARe schema for structured output combines schemas from several initiatives, such as SHARP and THYME.
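For illustration, a minimal Python sketch contrasting two of the output forms mentioned above: a bag of words and a single structured record. The note text and the record's field names are hypothetical and are not taken from the ShARe schema.

```python
import re
from collections import Counter

def bag_of_words(note_text):
    """Lowercase the note, keep runs of letters as tokens, and count them."""
    tokens = re.findall(r"[a-z]+", note_text.lower())
    return Counter(tokens)

# Hypothetical note text, invented for illustration only.
note = "Patient denies chest pain. Chest x-ray is clear."

# Bag-of-words form: token counts, with word order and context discarded.
bow = bag_of_words(note)

# Structured form: one record per extracted mention. The field names below
# are assumptions for this sketch, not the ShARe attribute list.
start = note.find("chest pain")
structured_row = {
    "note_id": 1,
    "text": "chest pain",
    "start": start,
    "end": start + len("chest pain"),
    "negated": True,
}
```

The structured record keeps the character offsets and the negation attribute, which the bag of words loses.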
Discussion – Next steps
The table structure for storing concept-level NLP outputs is to be determined
It is sufficient to start with structured output
A concept table should be generated, with a concept ID in each row along with the note IDs
The OMOP vocabulary is to be used to aggregate concepts to a higher level, condensing the number of concepts to a manageable size
The next step is to go through all the columns exhaustively for all attributes, merge them, and then decide which attributes should be used in the table
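The concept table and the roll-up to higher-level concepts could be sketched as follows. This is only an assumption about one possible shape: the row layout, all IDs, and the `ancestor_of` map are hypothetical, with the map standing in for a lookup against the OMOP vocabulary's concept_ancestor table.

```python
from collections import defaultdict

# One row per concept mention: (note_id, concept_id). All IDs are made up.
concept_rows = [
    (101, 9001),
    (101, 9002),
    (102, 9002),
    (103, 9003),
]

# Hypothetical descendant -> ancestor map, standing in for a lookup against
# the OMOP vocabulary's concept_ancestor table.
ancestor_of = {9001: 9000, 9002: 9000, 9003: 9100}

def roll_up(rows, ancestors):
    """Group note IDs under the higher-level concept of each mention."""
    notes_by_concept = defaultdict(set)
    for note_id, concept_id in rows:
        # Concepts without a mapped ancestor are kept as themselves.
        parent = ancestors.get(concept_id, concept_id)
        notes_by_concept[parent].add(note_id)
    return dict(notes_by_concept)

rolled = roll_up(concept_rows, ancestor_of)
```

In this toy data, three fine-grained concepts condense to two higher-level ones, each still linked to its note IDs.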
NLP tools/pipelines for ETL
Use cases, e.g., phenotyping for cohort selection using NLP outputs
Presentation by Dr. Jon Duke, Title: Regenstrief NLP platform and approach to validation of phenotypes
The NLP platform is composed of a state machine with a regex-based system.
The NLP data analysis tool is currently being used for data analytics at Regenstrief.
The tool has text search capabilities and was demonstrated at the meeting.
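The minutes do not describe the platform's internals. As a generic illustration only, a regex-driven state machine might look like the sketch below; the cue words, target terms, and state names are assumptions and do not reflect the Regenstrief implementation.

```python
import re

# Hypothetical patterns, chosen only to make the example small.
NEGATION_CUE = re.compile(r"^(no|denies|without)$", re.IGNORECASE)
TARGET = re.compile(r"^(fever|cough|pain)$", re.IGNORECASE)
SENTENCE_END = re.compile(r"[.!?]$")

def tag_mentions(text):
    """Walk the tokens, switching state on cues and tagging target terms."""
    state = "AFFIRMED"
    results = []
    for raw in text.split():
        word = raw.strip(".,;!?")
        if NEGATION_CUE.match(word):
            # A negation cue moves the machine into the NEGATED state.
            state = "NEGATED"
        elif TARGET.match(word):
            # A target term is recorded with whatever state is active.
            results.append((word.lower(), state))
        if SENTENCE_END.search(raw):
            # End of sentence resets the machine.
            state = "AFFIRMED"
    return results

mentions = tag_mentions("Patient denies fever. Reports cough and pain.")
```

Here the negation scope simply runs to the end of the sentence; a production system would need more careful scope rules.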
Discussion – Next steps
Need to determine whether the API for keyword search, based on Solr or ElasticSearch, can be shared
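For reference, a keyword search against either engine might be expressed roughly as below. This sketch only builds the request payloads and makes no network call; the field name `note_text` and the helper function names are hypothetical, and the API discussed at the meeting may look quite different.

```python
import json

def elasticsearch_body(keyword, field="note_text", size=10):
    """Build a JSON body for an Elasticsearch _search request (match query)."""
    return {"query": {"match": {field: keyword}}, "size": size}

def solr_params(keyword, field="note_text", rows=10):
    """Build query parameters for a Solr select request."""
    return {"q": f'{field}:"{keyword}"', "rows": rows}

# Hypothetical keyword, used only to show the payload shapes.
es_body = elasticsearch_body("pneumonia")
es_json = json.dumps(es_body)
solr_query = solr_params("pneumonia")
```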