The objective of this lab is to creatively apply the knowledge gained over the course of the semester to a substantial data analysis problem of your own choosing.
For this lab, you will apply your data analysis skills to a new problem. You will turn in a report discussing your efforts.
Your entry will be graded on the following elements:
The final project is designed to give you a chance to explore a data science project end-to-end, with minimal restrictions.
For this project, you must:
You are welcome to use any publicly available code on the internet to help you. For example, you may wish to use the Stan language to help you construct an HMC sampler. Other possibilities include PyMC, the Venture probabilistic programming language, BayesDB, etc.
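For orientation only, here is a minimal sketch of what an HMC sampler does internally, written in plain NumPy rather than Stan or PyMC. It is not the required approach for the project. The function name `hmc_sample`, its parameters, and the standard-normal target are all assumptions made for illustration; a real project would more likely rely on one of the libraries above, which tune the step size and trajectory length automatically.

```python
import numpy as np


def hmc_sample(log_prob, grad_log_prob, x0, n_samples=2000,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples with basic Hamiltonian Monte Carlo (identity mass matrix).

    log_prob and grad_log_prob are user-supplied functions of a 1-D array.
    All names and defaults here are illustrative choices, not a standard API.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))

    for i in range(n_samples):
        # Resample the momentum from a standard normal.
        p0 = rng.standard_normal(x.size)
        x_new, p = x.copy(), p0.copy()

        # Leapfrog integration: half step in momentum, alternating full
        # steps in position and momentum, then a final half step in momentum.
        p = p + 0.5 * step_size * grad_log_prob(x_new)
        for step in range(n_leapfrog):
            x_new = x_new + step_size * p
            if step < n_leapfrog - 1:
                p = p + step_size * grad_log_prob(x_new)
        p = p + 0.5 * step_size * grad_log_prob(x_new)

        # Metropolis correction using the Hamiltonian
        # H(x, p) = -log_prob(x) + 0.5 * p.p
        h_old = -log_prob(x) + 0.5 * (p0 @ p0)
        h_new = -log_prob(x_new) + 0.5 * (p @ p)
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new

        samples[i] = x

    return samples


# Example target (an assumption for this sketch): a 2-D standard normal.
samples = hmc_sample(
    log_prob=lambda x: -0.5 * (x @ x),
    grad_log_prob=lambda x: -x,
    x0=np.zeros(2),
)
```

The sample mean and variance of `samples` along each dimension should land near 0 and 1 for this target. If you use a library instead, your report should still explain which sampler it runs and why that choice suits your model.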
Your writeup should be a serious report on the dataset you chose, the problem you set out to solve, the technical approach you took (and your rationale for it), the results of any exploratory data analysis, and the results of your final model / inference / optimization algorithm.