The goal of this course is to give students a broad, introductory look at the field of data science. It will develop technical skills (including some Python programming, statistics, machine learning, data cleaning, and visualization) as well as broad data literacy (mental frameworks for decomposing data science problems, critical thinking about the conclusions an analysis can support, and awareness of the pitfalls of relying on unreliable data).
Learning outcomes:
Course structure:
The course will revolve around five main technical areas. Students will be taught enough Python programming to complete the labs in each area, but we will also teach with existing GUI-based visualization and data pipeline tools (such as Tableau or bamboolib).
The course will also emphasize data literacy, including how to think critically about the use of data in making arguments and about the reliability of conclusions drawn from data.
Programming Lab 1 - Measures of Centrality
Programming Lab 2 - First Data Visualizations
Programming Lab 3 - Intro to Pandas
Literacy Assignment 1 - Nutrition Analysis
Literacy Assignment 2 - Tweet Sentiments
Literacy Assignment 3 - Greenhouse Gases
Literacy Assignment 4 - Market Share Trends
Literacy Assignment 5 - Star Brightness
Literacy Assignment 6 - Movie Rating and Profits
Literacy Assignment 7 - Engagement Rings