PSTAT100 Data Science Concepts and Analysis
Spring 2026
My name is John Inston and I will be the instructor for this course. I am a 4th-year Ph.D. candidate in the Department of Statistics and Applied Probability here at UC Santa Barbara. Thank you all for taking this course. I hope that you find it both interesting and informative.
This course provides an overview of key concepts in data science and the use of tools for data retrieval, analysis, visualization, and reproducible research, in preparation for advanced data science courses. Topics include an introduction to inference and prediction, principles of measurement, missing data, notions of causality, statistical traps, and concepts in data ethics and privacy.
🔍 Reference Material
The content of this course was prepared using past teaching material provided by Ethan P. Marzpan as well as historical course material made available online by the UCSB Department of Statistics and Applied Probability.
📚 Material
Lecture Notes:
Course Readings:
Helpful Resources:
✏️ Information
Teaching Staff
| Name | Role | Email | Office Hours |
|---|---|---|---|
| John Inston | Instructor | johninston@ucsb.edu | SH 5431T R 1:00PM - 3:00PM |
| Lauren Hughes | TA | laurenhughes@ucsb.edu | SH 5421 T 3:30PM - 5:30PM |
| Yuting Ma | TA | yutingma@ucsb.edu | Zoom Meeting M 9:00AM - 11:00AM |
| Zhuojun Lyu | TA | zhuojun@ucsb.edu | Zoom Meeting W 1:00PM - 3:00PM |
Instruction
Course instruction consists of 20 lectures (held twice per week) and 10 programming labs (held weekly).
- Lecture:
  - TR 11:00AM - 12:15PM ILP 1101
- Labs:
  - M 1:00PM - 1:50PM ILP 4107 - Zhuojun Lyu
  - M 2:00PM - 2:50PM ILP 3209 - Zhuojun Lyu
  - M 3:00PM - 3:50PM ILP 3209 - Yuting Ma
  - M 4:00PM - 4:50PM Girvetz Hall 2129 - Yuting Ma
  - M 5:00PM - 5:50PM ILP 3209 - Lauren Hughes
  - M 6:00PM - 6:50PM ILP 3205 - Lauren Hughes

Assessments
You will be required to complete 10 lab worksheets, each of which will be due the following Friday.
You will be required to complete 4 assignments, which will be due approximately every 2 weeks.
❗ The due dates for the assignments have changed and will be posted on Canvas. Your first assignment will be released on Wednesday, April 8th, and is due on Friday, April 17th, at the end of Week 3.
Your final assessment will be a group project (groups of up to 3), which will consist of cleaning and analyzing a data set of your choice. You will be required to submit a project proposal in Week 5 specifying your data set and providing a rough project outline.
Course Schedule
Please check this schedule regularly throughout the term as it is updated with the latest material and reading suggestions.
| Week | Date | Topics | Reading | Materials |
|---|---|---|---|---|
| 1 | Tue, Mar 31 | Course Information. Introduction to Data Science. Data Science Lifecycle. | | Lec 1 Slides, Lec 2 Slides, Lab 1 |
| 2 | Tue, Apr 7 | Sampling and Bias. Data Preparation. | | Lec 3 Slides |
| 3 | Tue, Apr 14 | Visualizations. Exploratory Data Analysis. | | |
| 4 | Tue, Apr 21 | Regression Analysis. Generalized Linear Regression. Regression. | | |
| 5 | Tue, Apr 28 | Classification Methods. Support Vector Machines. | | |
| 6 | Tue, May 5 | Decision Trees. Random Forests. | | |
| 7 | Tue, May 12 | Clustering. Principal Components Analysis. | | |
| 8 | Tue, May 19 | Feature Engineering. | | |
| 9 | Tue, May 26 | Introduction to Deep Learning. | | |
| 10 | Tue, Jun 2 | Data Science Ethics and Privacy. | | |