To creatively apply knowledge gained through the course of the semester to a substantial data science problem.


You must turn in a PDF writeup of your project. The writeup must be at least 6 pages long (including figures).

Grading standards:

Your final project counts as about 15% of your overall grade (see Learning Suite for a precise breakdown of the value of different assignments).

I will evaluate your writeup primarily based on the quality of your writing. Grades will be derived approximately as follows:

  • 10% a clean introduction and summary of findings
  • 80% the main technical sections
  • 10% conclusion - lessons learned, etc.

Note that no late submissions are possible for this project, because it is done in lieu of the final exam.


For your final project, you must analyze the ANES dataset used in class. As a reminder, this dataset captures the political landscape of 2016, and includes a wide variety of demographic and political variables.

For your project, I expect you to produce a significant report that leverages the skills and concepts we have learned in class. Your job is to analyze the ANES dataset and find interesting patterns. The goal of your writeup should be to convey insight. This is typically done with careful analysis, statistical rigor, and appropriate visualizations. Different people will find different things!

Your final report must be structured as follows:

  • Introduction - summarize interesting insights you uncovered
  • At least six technical sections - one for each substantial analytic effort
  • Conclusion - what did you learn as you analyzed this data set?

You may include more technical sections if you would like, and your report may be longer than 6 pages (but not shorter).

Each technical section must contain

  • A sentence or two describing what you set out to do
  • Some technical detail on your approach
  • Some sort of visualization of the result (a figure, a table, etc.)

I expect each technical section to be about 3/4 to 1 page long, although it could be longer. You may, of course, include multiple visualizations for each section – whatever conveys insight!

Possible ideas for elements of your project:

The ANES dataset is large and complex. Many different kinds of analysis and visualization are possible. A few examples include:

  • Looking at correlations between different variables
  • Comparing marginal and conditional probabilities
  • Visualizing histograms (or KDE plots) of different factors
  • Clustering ANES individuals based on different factors (and/or distance measures)
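
As one illustration of the ideas listed above, here is a minimal sketch of comparing a marginal probability with conditional probabilities. It uses a tiny hand-made list of records rather than the real ANES data, and the field names ("party", "voted") and helper functions (marginal, conditional) are hypothetical; the actual variable names should be taken from the codebook.

```python
# Hypothetical toy records standing in for ANES respondents. The field names
# ("party", "voted") are illustrative and are NOT taken from the codebook.
respondents = [
    {"party": "D", "voted": True},
    {"party": "D", "voted": True},
    {"party": "D", "voted": False},
    {"party": "R", "voted": True},
    {"party": "R", "voted": True},
    {"party": "R", "voted": True},
    {"party": "I", "voted": False},
    {"party": "I", "voted": True},
]


def marginal(records, field, value):
    """Estimate P(field == value) as a fraction of all records."""
    return sum(r[field] == value for r in records) / len(records)


def conditional(records, field, value, given_field, given_value):
    """Estimate P(field == value | given_field == given_value)."""
    subset = [r for r in records if r[given_field] == given_value]
    if not subset:
        raise ValueError("conditioning event has no records")
    return marginal(subset, field, value)


# Marginal probability of voting across everyone in the toy sample.
p_voted = marginal(respondents, "voted", True)

# Conditional probability of voting within each party group.
p_voted_given_party = {
    party: conditional(respondents, "voted", True, "party", party)
    for party in sorted({r["party"] for r in respondents})
}

print(f"P(voted) = {p_voted:.3f}")
for party, p in p_voted_given_party.items():
    print(f"P(voted | party={party}) = {p:.3f}")
```

A large gap between the marginal and a conditional value suggests that the conditioning variable carries information about the outcome. With real survey data, check group sizes before drawing conclusions, since small groups give noisy estimates.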


The dataset is available at

and the codebook is available at:

For more information about the ANES, you may also visit their official website:


You are welcome to use any publicly available code on the internet to help you.

cs180_final.txt · Last modified: 2021/04/01 16:16 by wingated