Introduction to Text Mining and NLP for Health Data - 2021-02-12


This workshop introduces natural language processing (NLP) and the caveats of applying it to health data. Using the R programming language, we will cover the basics of text processing and demonstrate common techniques, including word frequencies, term frequency-inverse document frequency (TF-IDF), and principal component analysis (PCA), for exploring important words and grouping similar documents. As time permits, we will also introduce more advanced NLP topics (sentiment analysis, topic modeling, etc.) and discuss classical versus deep learning approaches. Learners with proficient R skills are encouraged to code along.
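
For readers who want a feel for the techniques named above, here is a minimal sketch in base R. It is not the workshop's own code: the three short notes are made-up text (not real health data), all variable names are illustrative, and the TF-IDF weighting shown (term count divided by document length, times log of the number of documents over the document frequency) is one common variant among several.

```r
# Toy corpus: three made-up notes (hypothetical text, not real health data).
docs <- c(
  note1 = "patient reports chest pain and shortness of breath",
  note2 = "patient denies chest pain reports mild headache",
  note3 = "follow up visit for headache and nausea"
)

# Tokenize: lowercase, then split on anything that is not a letter.
tokens <- lapply(tolower(docs), function(d) {
  words <- strsplit(d, "[^a-z]+")[[1]]
  words[nzchar(words)]
})

# Document-term matrix of raw counts (rows = documents, columns = terms).
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(w) tabulate(match(w, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dtm) <- vocab

# Word frequencies across the whole corpus, most frequent first.
word_freq <- sort(colSums(dtm), decreasing = TRUE)

# TF-IDF (one common variant, assumed here):
# tf = count / document length, idf = log(N / number of documents with the term).
tf <- dtm / rowSums(dtm)
idf <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# PCA on the TF-IDF matrix; the first two components give each document
# coordinates that can be plotted to look for similar documents.
pca <- prcomp(tfidf, center = TRUE, scale. = FALSE)
doc_coords <- pca$x[, 1:2]
```

Printing `word_freq`, `round(tfidf, 3)`, and `doc_coords` shows the three outputs in turn. With a real corpus, a dedicated text-mining package would normally handle tokenization and weighting; base R is used here only to keep the sketch self-contained.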

Learner Objectives:
* explain natural language processing in lay terms
* give examples of text mining and NLP applications for research
* evaluate particular challenges posed by working with health data
* describe key text mining and NLP metrics for assessing word importance and document similarity
* identify where to go to learn more!

Prerequisites:
No prior NLP or text mining knowledge is necessary. Learners interested in coding along are expected to have prior experience using R, be comfortable with basic R syntax, and have R installed and running on their laptops. Preparatory reading may be provided before the workshop.

The copyright on this video is owned by the Regents of the University of California and is licensed for reuse under the Creative Commons Attribution 4.0 International (CC BY 4.0) License.
