Data Science in Infectious Disease Modeling using R
Online Lab Meeting Times:
- Monday, July 7, 1:00 - 2:30 PM ET and 3:00 - 4:30 PM ET
- Tuesday, July 8, 1:00 - 2:30 PM ET and 3:00 - 4:30 PM ET
Classroom: Virtual
Module Summary:
This course will foster a problem-solving mindset while exploring more advanced concepts in R. There are multiple approaches to coding an analysis; the content covered in this course will serve as a guide to help you decide which approach is best for your particular situation.
We will draw on concepts from the books R for Data Science, Advanced R, and R for Epidemiology, along with the instructors’ experience with infectious disease modeling. We will work in both base R and the tidyverse to wrangle messy data and build an analytic workflow. The course consists of three themes: foundations in data science not often covered in introductory R courses, advanced data wrangling tailored to infectious disease research, and special considerations for SISMID and public health data.
Prerequisites:
Familiarity with base R and the tidyverse.
Module Content:
Foundation Theme
- Structuring (file organization and object naming)
- Building “good” functions and environment scoping
- Programming for speed (writing efficient functions, parallelizing processes, and knowing when to use another language)
Problem solving using advanced data wrangling
- Pivot tables, joins, and lookup tables
- Regular Expressions
- Advanced troubleshooting using conditions (tryCatch, logging, etc.)
- Implementing parallelized processes
Special considerations
- Working with encrypted files
- Web scraping?
- Common implementation and coding pain points in the most popular SISMID courses
Instructors

Sarah Bowden, PhD
Data Scientist, Division of Global Migration Health at CDC
Dr. Sarah Bowden is a Data Scientist in the Division of Global Migration Health at CDC. She has been coding in R since 2007 and has enjoyed seeing the Tidyverse develop and grow over time. Dr. Bowden uses Tidyverse tools and best practices in her day-to-day coding activities and has trained and mentored 20+ undergraduate, graduate, and postdoctoral fellows in data science and public health analytics over the past 7 years.

Raj Reni Kaul, PhD
Professor, Department of Biostatistics & Bioinformatics, Emory University
Dr. Reni Kaul is a Prevention Effectiveness Fellow on the Analytics and Modeling Track in the Immunization Services Division at the CDC. She is a certified Carpentries Instructor and is committed to creating an inclusive learning environment. She has previously designed and taught coding courses in R for undergraduate and graduate students.
Required Software:
Recommended Reading: