Data Science in Infectious Disease Modeling using R

Online Lab Meeting Times:

  • Monday, July 7, 1:00 - 2:30 PM ET and 3:00 - 4:30 PM ET
  • Tuesday, July 8, 1:00 - 2:30 PM ET and 3:00 - 4:30 PM ET

Classroom: Virtual

Module Summary:

This course will foster a problem-solving mindset while exploring more advanced concepts in R. There are often multiple ways to code an analysis; the material covered in this course will serve as a guide to help you decide which approach is best for your particular situation.

We will draw on concepts from the books R for Data Science, Advanced R, and R for Epidemiology, along with the instructors’ experience with infectious disease modeling. We will work in both base R and the tidyverse to wrangle messy data and build an analytic workflow. The course consists of three themes: foundations in data science not often covered in introductory R courses, advanced data wrangling tailored to infectious disease research, and special considerations for SISMID and public health data.

Prerequisites:

Familiarity with base R and the tidyverse.

Module Content:

  • Foundation Theme

    • Structuring (file organization and object naming)
    • Building “good” functions and environment scoping
    • Programming for speed (writing efficient functions, parallelizing processes, and when to use another language)

  • Problem solving using advanced data wrangling

    • Pivot tables, joins, and lookup tables
    • Regular expressions
    • Advanced troubleshooting using conditions (tryCatch, logging, etc.)
    • Implementing parallelized processes

  • Special considerations

    • Working with encrypted files
    • Web scraping?
    • Common implementation/coding pain points in the most popular SISMID courses

Instructors

Sarah Bowden, PhD

Data Scientist, Division of Global Migration Health at CDC

Dr. Sarah Bowden is a Data Scientist in the Division of Global Migration Health at CDC. She has been coding in R since 2007 and has enjoyed seeing the tidyverse develop and grow over time. Dr. Bowden uses tidyverse tools and best practices in her day-to-day coding and has trained and mentored more than 20 undergraduate students, graduate students, and postdoctoral fellows in data science and public health analytics over the past 7 years.

Raj Reni Kaul, PhD

Professor, Department of Biostatistics & Bioinformatics, Emory University

Dr. Reni Kaul is a Prevention Effectiveness Fellow on the Analytics and Modeling Track in the Immunization Services Division at the CDC. She is a certified Carpentries Instructor and is committed to creating an inclusive learning environment. She has previously designed and taught coding courses in R for undergraduate and graduate students.

Required Software:

  •  

Recommended Reading:

  •