Introduction to Text-mining for Digital Health

Please make sure you register with your university email address.

What is text-mining? How can I use this method to extract important information from a large number of documents? What are its applications in the field of health?

If these questions intrigue you, come along to this free two-part workshop, designed to familiarise you with the main concepts. Text-mining is a data-mining process that transforms unstructured text into a structured format, making it possible to identify meaningful patterns and new insights. Because it involves working with qualitative data, the method may seem better suited to the social sciences than to digital health. Whilst there is a lot of truth to this, it is also exciting to see text-mining applied to extract symptom-related information from online forums, or to anonymise personal medical data. This workshop will explore these applications and then showcase the process with a code demonstration in Python and R.
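
As a small taste of what "transforming unstructured text into a structured format" can mean, here is a minimal sketch in Python. It is not taken from the workshop materials: the two example posts are invented for illustration, the names (posts, tokenize, vocabulary, matrix) are hypothetical, and a simple word-count table is only one of many possible structured representations.

```python
import re
from collections import Counter

# Hypothetical forum-style posts, invented purely for illustration.
posts = [
    "Had a headache and mild fever since Monday.",
    "Fever is gone but the headache is still there!",
]

def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# One Counter per post: free text becomes a mapping of word -> count.
counts = [Counter(tokenize(post)) for post in posts]

# A shared, sorted vocabulary gives every post the same set of columns.
vocabulary = sorted(set().union(*counts))

# Document-term matrix: one row per post, one column per vocabulary word.
matrix = [[count[word] for word in vocabulary] for count in counts]

# Print each word alongside its count in each post.
for word, column in zip(vocabulary, zip(*matrix)):
    print(f"{word:10s} {column}")
```

Once text is in a table like this, words that recur across posts (here "headache" and "fever") can be counted, compared and filtered like any other structured data.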

Facilitators

Louise Capener and Nadia Kennar

Dates and time

Two-day event:

Wednesday 9th November, 1pm - 3pm: 1-hour presentation (with interactive Mentimeter questions)
Friday 11th November, 1pm - 3pm: 2-hour code demonstration (one hour in Python and one hour in R)

Session details

Our first session introduces the main concepts behind fully structured and semi-unstructured data, the theory behind capturing and amplifying existing structure, and the four basic steps involved in any text-mining project. We will then cover some of the most common text-mining analyses and discuss their implications.
Session two presents some sample code in both Python and R that demonstrates how to apply these concepts.

Prerequisites

This workshop is suitable for intermediate users of R and/or Python but there is no need to have experience with machine learning packages. Users should know how to set the working directory in R and/or Python, how to read in data and how to save scripts and output files.

Level: Beginner
Experience of R and/or Python: Yes
Prerequisites: Depending on your language of choice, Python with Jupyter Notebook and/or R with RStudio must already be installed and working.
Knowledge of Text Mining: None
Target audience: Researchers/data scientists/anyone interested in text-mining