cs180_final

This is an old revision of the document!


Objective:

To creatively apply knowledge gained through the course of the semester to a substantial data science problem.


Deliverable:

You must turn in a PDF writeup of your project. The writeup must be at least 6 pages long (including figures).


Grading standards:

Your final project counts as about 15% of your overall grade (see Learning Suite for a precise breakdown of the value of different assignments).

I will evaluate your writeup primarily based on the quality of your writing. Grades will be derived approximately as follows:

  • 10% a clean introduction and summary of findings
  • 80% the main technical sections
  • 10% conclusion - lessons learned, etc.

Note that no late submissions are possible for this project, because it is done in lieu of the final exam.


Dataset:

For your final project, you must analyze the ANES dataset used in class. As a reminder, this dataset captures the political landscape of 2016, and includes a wide variety of demographic and political variables.

The dataset is available at:

http://liftothers.org/byu/anes2016.csv

and the codebook is available at:

http://liftothers.org/byu/anes_codebook.pdf

For more information about the ANES, you may also visit their official website:

https://electionstudies.org/data-center/2016-time-series-study/
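As a starting point, a downloaded copy of the CSV could be read with Python's standard csv module. This is only a minimal sketch: it assumes the file has a header row, the helper name load_rows is hypothetical, and the column names and values in the stand-in data are invented for illustration (check the codebook for the real variable names).

```python
import csv
import io


def load_rows(handle):
    """Read a CSV with a header row into a list of dicts, one per respondent."""
    return list(csv.DictReader(handle))


# Stand-in for a downloaded copy of anes2016.csv. The columns and values
# here are made up for illustration; they are NOT real ANES fields.
sample = io.StringIO("respondent_id,age\n1,34\n2,58\n")

rows = load_rows(sample)

# csv.DictReader returns every value as a string, so convert as needed.
ages = [int(r["age"]) for r in rows]
```

With a real file on disk, the same helper could be called as `with open("anes2016.csv", newline="") as f: rows = load_rows(f)`.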


Description:

For your project, I expect you to produce a significant report that leverages the skills and concepts we have learned in class. Your final report must be structured as follows:

  • Introduction - summarize interesting insights you uncovered
  • At least six technical sections - one for each substantial analytic effort
  • Conclusion - what did you learn as you analyzed this dataset?

You may include more technical sections if you would like, and your report may be longer than 6 pages (but not shorter).

Each technical section must contain

  • A sentence or two describing what you set out to do
  • Some technical detail on your approach
  • Some sort of visualization of the result (a figure, a table, etc.)

I expect each technical section to be about 3/4 to 1 page long, although it could be longer. You may, of course, include multiple visualizations for each section – whatever conveys insight!


Possible ideas for elements of your project:

The ANES dataset is large and complex. Many different kinds of analysis and visualization are possible. A few examples include:

  • Looking at correlations between different variables
  • Comparing marginal and conditional probabilities
  • Visualizing histograms (or KDE plots) of different factors
  • Clustering ANES individuals based on different factors (and/or distance measures)
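To make the first two ideas concrete, here is a small sketch of a Pearson correlation and a marginal-versus-conditional probability comparison in plain Python. The records are toy data invented for illustration, not ANES values, and the column names (age, party, voted) and helper functions are hypothetical; in practice you would substitute variables from the codebook.

```python
from math import sqrt

# Toy records, made up for illustration only. These are NOT ANES values,
# and the column names "age", "party", and "voted" are hypothetical.
rows = [
    {"age": 22, "party": "D", "voted": 0},
    {"age": 35, "party": "R", "voted": 1},
    {"age": 47, "party": "D", "voted": 1},
    {"age": 51, "party": "R", "voted": 1},
    {"age": 63, "party": "D", "voted": 1},
    {"age": 29, "party": "R", "voted": 1},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def marginal(records, col, value):
    """P(col == value) over all records."""
    return sum(r[col] == value for r in records) / len(records)


def conditional(records, col, value, given_col, given_value):
    """P(col == value | given_col == given_value)."""
    subset = [r for r in records if r[given_col] == given_value]
    return sum(r[col] == value for r in subset) / len(subset)


r_age_voted = pearson([r["age"] for r in rows], [r["voted"] for r in rows])
p_voted = marginal(rows, "voted", 1)
p_voted_given_d = conditional(rows, "voted", 1, "party", "D")
```

A gap between the marginal and the conditional probability suggests the conditioning variable carries information about the outcome, which is the kind of observation a technical section could then visualize.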

Notes:

You are welcome to use any publicly available code on the internet to help you.

cs180_final.1617318830.txt.gz · Last modified: 2021/06/30 23:40 (external edit)