The importance and scale of data in the health sciences mean that researchers increasingly need the data skills to design reproducible workflows for the collection, organisation, processing, analysis and presentation of data. Developing these skills requires at least some coding, also known as scripting. Scripting means that everything you do with your raw data is explicitly described, transparent and reproducible. However, learning to code can be a daunting prospect for many health scientists! That's where an Introduction to reproducible analyses in R comes in!
R is a free, open-source language that is especially well suited to data analysis and visualisation, and it has a relatively inclusive and newbie-friendly community. R caters to users who do not see themselves as programmers, and then allows them to slide gradually into programming.
To complete this course you will need to install:
- R version 3.6 or higher
- RStudio 1.2 or higher
- The tidyverse package
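Once R and RStudio are installed, the remaining requirements can be checked from the R console. The following is a minimal sketch, not an official setup script for the course: the variable names are illustrative, and the `interactive()` guard is an assumption added here so that nothing is downloaded when the code runs as part of a non-interactive script.

```r
# Compare the running R version with the course minimum of 3.6.
r_ok <- getRversion() >= "3.6.0"
message(
  "R version ", as.character(getRversion()),
  if (r_ok) " meets" else " does not meet",
  " the 3.6 minimum"
)

# Check whether the tidyverse package is already installed.
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)

# Install it from CRAN only if it is missing and we are in an
# interactive session (for example, typing at the RStudio console).
if (!tidyverse_installed && interactive()) {
  install.packages("tidyverse")
}
```

The RStudio version is shown under Help > About RStudio in the RStudio menus.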
After this workshop the successful learner will be able to:
- Find their way around the RStudio windows
- Create and plot data using base R and ggplot2
- Explain the rationale for scripting analysis
- Use the help pages
- Make additional packages available in an R session
- Reproducibly import data in a variety of formats
- Explain what is meant by the working directory and by absolute and relative paths, and apply these concepts to data import
- Summarise data in a single group or in multiple groups
- Recognise tidy data format and carry out some typical data tidying tasks
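To give a flavour of what several of these outcomes look like in practice, here is a minimal sketch using only base R. It is not taken from the workshop materials: the data, the folder name `data` and the file name `weights.csv` are all made up for illustration, and the workshop itself may well use tidyverse functions for the same tasks.

```r
# Hypothetical example data: one row per observation, one column per
# variable (a tidy layout). The values are invented for this sketch.
example_data <- data.frame(
  group  = rep(c("control", "treated"), each = 3),
  weight = c(10, 12, 14, 20, 22, 24)
)

# Create a throwaway project folder with a data/ subfolder and save
# the example data there as a CSV file.
project_dir <- tempfile("project_")
dir.create(file.path(project_dir, "data"), recursive = TRUE)
write.csv(example_data,
          file.path(project_dir, "data", "weights.csv"),
          row.names = FALSE)

# Make the project folder the working directory. Relative paths are
# resolved from the working directory, so the import line below works
# on any computer that has the same project folder layout.
old_wd <- setwd(project_dir)
weights <- read.csv("data/weights.csv")  # relative path
setwd(old_wd)                            # restore the previous directory

# Summarise the data as a single group ...
overall_mean <- mean(weights$weight)

# ... and separately for each group.
group_means <- aggregate(weight ~ group, data = weights, FUN = mean)
print(group_means)
```

An absolute path to the same file would start from the root of the file system and so would differ from one computer to the next, which is why relative paths are the more reproducible choice for data import.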
The slides from this workshop, which include hyperlinks to the sample data used in the workshop, are available on GitHub: https://github.com/3mmaRand/N8-CIR-intro-repro
You will need to set aside a total of around 2 hours to watch all of the videos. However, they have been split into smaller tutorials that you can view and revisit at your own pace. The videos are all hosted on YouTube.