Welcome to CS201R, Winter 2021!

The goal of this course is to give students a broad, introductory look at the field of data science. It will develop technical skills (including some Python programming, statistics, machine learning, data cleaning, and visualization) as well as broad data literacy (mental frameworks for decomposing data science problems, critical thinking about the potential conclusions of an analysis, and awareness of the pitfalls of relying on unreliable data).

Learning outcomes:

Course structure:

The course will revolve around five main technical areas. Students will be taught enough Python programming to complete the labs in each area, and they will also work with existing GUI-based visualization and data pipeline tools (such as Tableau or bamboolib).

The course will also emphasize data literacy, including how to think critically about the use of data in making arguments and about the reliability of conclusions drawn from data.

Programming Labs

Programming Lab 1 - Measures of Centrality

Programming Lab 2 - First Data Visualizations

Programming Lab 3 - Intro to Pandas

Programming Lab 4 - Intro to Statistics

All Labs Can Be Accessed Here

Data Literacy Assignments

Literacy Assignment 1 - Nutrition Analysis

Literacy Assignment 2 - Tweet Sentiments

Literacy Assignment 3 - Greenhouse Gases

Literacy Assignment 4 - Market Share Trends

Literacy Assignment 5 - Star Brightness

Literacy Assignment 6 - Movie Rating and Profits

Literacy Assignment 7 - Engagement Rings

Literacy Assignment 8 - Ride Hailing

Literacy Assignment 9 - Social Connectedness in America

Final Project

Final Project