Data Science with R

Data Science
R
Author

John Robin Inston

Published

January 21, 2025

Modified

May 30, 2025

✏️ Information

Welcome to Data Science with R, a course introducing the R programming language for data science and statistical modelling.

The aim of this course is to provide a thorough introduction to programming in R for individuals who may never have written or used a programming language before. First, we outline how to install R on our operating system, how to download and use an Integrated Development Environment (IDE) such as RStudio, Positron or VSCode, and how to install packages on our system. We will then look at various ways to explore, manage and analyze data using both the built-in functionality of R and available libraries such as tidyverse.

✍️ Topics

This course is based on the PSTAT10 Data Science Principles class I have previously taught at UCSB. The course is split into the following topics:

  1. Installing and using R and RStudio.

  2. R Basics I - Operators, Logic & Data Types.

  3. R Basics II - Atomic Data Structures.

  4. R Basics III - Dataframes and Lists.

  5. R Basics IV - Functions.

  6. R Basics V - Looping and Branching.

  7. Fundamentals of Probability Theory.

  8. Basic Simulation with R.

  9. Data Handling with the tidyverse package.

  10. Plotting with ggplot2.

  11. SQL Basics.

  12. SQL Aggregation and Joins.

📚 Materials

Each topic links to a website post with the relevant material. A PDF copy of the combined course notes can be downloaded here. Each website post also links to the corresponding YouTube video that goes through the material.

For this course you will need to download the language R, your chosen IDE and Quarto using the following links:

Some helpful resources and additional guides are linked below:

If you found any of this material helpful, consider buying me a coffee, but only if you can afford to! Thank you for visiting this course page. 😊
