cs180_w2021

The goal of this course is to give students a broad, introductory look at the field of data science. It will develop technical skills (including some Python programming, statistics, machine learning, data cleaning, and visualization) as well as broad data literacy (mental frameworks for decomposing data science problems, critical thinking about the conclusions of an analysis, and awareness of the pitfalls of relying on unreliable data).

Learning outcomes:

- Students will be able to use state-of-the-art data-science-oriented languages and toolkits to derive insight from data
- Students will be able to think critically about conclusions drawn from data and its analysis
- Students will be able to apply general data science principles and methodologies to novel data science problems
- Students will be able to combine ideas from mathematics, statistics, machine learning and computer science to solve data science problems

Course structure:

The course will revolve around five main technical areas. Students will be taught enough Python programming to complete labs in each area, but we will also teach with existing GUI-based visualization and data pipeline tools (such as Tableau or bamboolib).

- Programming: this is how we wrangle the data
- Data set cleaning and preparation
- Statistics: this is the mathematical foundation that explicitly lays out assumptions and allows us to derive sound conclusions from data
- Machine learning: this is how we build predictive models
- Visualization: the right picture can tell you what you need to know

The course will also emphasize data literacy, including how to think critically about the use of data in making arguments, and the reliability of conclusions drawn from data.

Programming Lab 1 - Measures of Centrality

Programming Lab 2 - First Data Visualizations

Programming Lab 3 - Intro to Pandas

cs180_w2021.txt · Last modified: 2021/02/09 13:19 by pkseeg