cs501r_f2016:fp [2018/09/21 20:29] wingated |
cs501r_f2016:fp [2021/06/30 23:42] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====Objective:==== | ||
- | To creatively apply knowledge gained through the course of the semester to a substantial learning problem of your own choosing. | ||
- | |||
- | ---- | ||
- | ====Deliverable:==== | ||
- | |||
- | There are two deliverables for the final: | ||
- | |||
- | * An Excel spreadsheet (or CSV file) that shows the total amount of time you spent on your final project, broken down by day | |
- | * A PDF writeup of your project (one page) | ||
- | |||
- | ---- | ||
- | ====Grading standards:==== | ||
- | |||
- | Your final project counts as 20% of your overall grade. | ||
- | |||
- | Grading is divided into two parts: 80% of your final project grade is based on the number of hours you spent, and 20% is based on your writeup. | ||
- | |||
- | For the number of hours, I will take the total number of hours and divide by 40, then multiply by 100. This will be your percentage. (So, 40 hours == 100%, 30 hours == 75%, etc.) | ||
- | |||
- | I will evaluate your writeup based on the quality of your writing. | ||
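The grading arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an official grading script. It makes two assumptions that are not stated on this page: the hours score is capped at 100%, and the writeup is scored on a 0-100 scale. The function names are hypothetical.

```python
def hours_percentage(total_hours: float) -> float:
    """Hours score as described above: total hours / 40 * 100.

    Assumption: the score is capped at 100. The page does not say
    whether hours beyond 40 earn anything extra.
    """
    return min(total_hours / 40 * 100, 100.0)


def project_grade(total_hours: float, writeup_score: float) -> float:
    """Combine the hours score (80%) and the writeup score (20%).

    Assumption: writeup_score is on a 0-100 scale.
    """
    return 0.8 * hours_percentage(total_hours) + 0.2 * writeup_score


print(hours_percentage(40))  # 100.0
print(hours_percentage(30))  # 75.0
```

For example, under these assumptions `project_grade(30, 90)` weights a 75% hours score at 0.8 and a writeup score of 90 at 0.2.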
- | |||
- | ---- | ||
- | ====Description:==== | ||
- | |||
- | For your final project, you should execute a substantial project of your own choosing. You will turn in a single writeup (in PDF format only, please!). Your writeup can be structured in whatever way makes sense for your project, but see below for some possible outlines. | ||
- | |||
- | **Your project will be graded more on effort than results.** As I have stated in class, I would rather have you swing for the fences and miss than take on a simple, safe project. **It is therefore very important that your final time log clearly convey the scope of your efforts.** | |
- | |||
- | I expect serious effort on this project, and your writeup, even if it's short, should reflect that effort. | |
- | |||
- | ---- | ||
- | ====Requirements for the time log:==== | ||
- | |||
- | For the time log, you must document the time you spent (on a daily basis). **If you do not document your time, it will not count.** In other words, it is not acceptable to claim that you spent 40 hours on your project, without a time log to back it up. I will not accept any excuses about this requirement. | ||
- | |||
- | Additional requirements: | ||
- | |||
- | * You may not count any more than 5 hours of research and reading | ||
- | * You may not count any more than 15 hours of "prep work". This could include dataset preparation, collection and cleaning; or wrestling with getting a simulator / model working for a deep RL project; etc. | ||
- | * At least 20 hours must involve designing, testing, and iterating deep learning-based models, analyzing results, experimenting, etc. | ||
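One way to check a time log against these limits is sketched below. It assumes each log entry is a `(day, category, hours)` tuple, that the category labels are `"research"`, `"prep"`, and `"modeling"`, and that hours above a cap are simply not counted. None of these conventions come from this page, and the dates and hours in the example log are made up.

```python
from collections import defaultdict

# Limits taken from the requirements above; the category names are assumed.
CAPS = {"research": 5.0, "prep": 15.0}
MODELING_MINIMUM = 20.0


def summarize(entries):
    """Return (countable_hours, meets_modeling_minimum) for a daily log."""
    totals = defaultdict(float)
    for _day, category, hours in entries:
        totals[category] += hours
    # Hours above a category's cap are dropped (assumed interpretation).
    counted = sum(
        min(hours, CAPS.get(category, float("inf")))
        for category, hours in totals.items()
    )
    return counted, totals["modeling"] >= MODELING_MINIMUM


# Hypothetical log: 7 h research (only 5 count), 10 h prep, 22 h modeling.
log = [
    ("day 1", "research", 4.0),
    ("day 2", "research", 3.0),
    ("day 3", "prep", 10.0),
    ("day 4", "modeling", 12.0),
    ("day 5", "modeling", 10.0),
]
print(summarize(log))  # (37.0, True)
```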
- | |||
- | ---- | ||
- | ====Requirements for the writeup:==== | ||
- | |||
- | Your writeup simply needs to tell me what you did for your project. You should describe: | |
- | |||
- | * The problem you set out to solve | ||
- | * The exploratory data analysis you did | ||
- | * Your technical approach | ||
- | * Your results | ||
- | |||
- | It should be about 1-2 pages. | ||
- | |||
- | ---- | ||
- | ====Possible project ideas:==== | ||
- | |||
- | Many different kinds of final projects are possible. A few examples include: | ||
- | |||
- | * Learning how to render a scene based on examples of position and lighting | ||
- | * Learning which way is "up" in a photo (useful for drone odometry) | ||
- | * Training an HTTP server to predict which web pages a user will likely visit next | ||
- | * Training an earthquake predictor | ||
- | * Using GANs to turn rendered faces into something more realistic (avoiding the "uncanny valley") | ||
- | * Transforming Minecraft into a more realistic looking game with DNN post-processing | ||
- | * Using style transfer on a network trained for facial recognition (to identify and accentuate facial characteristics) | ||
- | * Using RGB+Depth datasets to improve geometric plausibility of GANs | ||
- | |||
- | The project can involve any application area, but the core challenge must be tackled using some sort of deep learning. | ||
- | |||
- | The best projects involve a new, substantive idea and a novel dataset. It may also be acceptable to use vanilla DNN techniques on a novel dataset, as long as you demonstrate significant effort in the "science" of the project -- evaluating results, exploring topologies, thinking hard about how to train, and carefully evaluating on a test/training split. It may also be acceptable to simply implement a state-of-the-art method from the literature, but clear such projects with me first. | |
- | |||
- | ---- | ||
- | ====Notes:==== | ||
- | |||
- | You are welcome to use any publicly available code on the internet to help you. | ||
- | |||
- | Here are some possible questions that you might consider answering as part of your report: | ||
- | |||
- | - **A discussion of the dataset** | ||
- | - Where did it come from? Who published it? | ||
- | - Who cares about this data? | ||
- | - **A discussion of the problem to be solved** | ||
- | - Is this a classification problem? A regression problem? | ||
- | - Is it supervised? Unsupervised? | ||
- | - What sort of background knowledge do you have that you could bring to bear on this problem? | ||
- | - What other approaches have been tried? How did they fare? | ||
- | - **A discussion of your exploration of the dataset** | |
- | - Before you start coding, you should look at the data. What does it include? What patterns do you see? | ||
- | - Any visualizations about the data you deem relevant | ||
- | - **A clear, technical description of your approach.** | ||
- | - Background on the approach | ||
- | - Description of the model you use | ||
- | - Description of the inference / training algorithm you use | ||
- | - Description of how you partitioned your data into a test/training split | ||
- | - How many parameters does your model have? What optimizer did you use? | ||
- | - What topology did you choose, and why? | ||
- | - Did you use any pre-trained weights? Where did they come from? | ||
- | - **An analysis of how your approach worked on the dataset** | ||
- | - What was your final RMSE on your private test/training split? | ||
- | - Did you overfit? How do you know? | ||
- | - Was your first algorithm the one you ultimately used for your submission? Why did you (or why didn't you) iterate on your design? | |
- | - Did you solve (or make any progress on) the problem you set out to solve? | ||
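The questions above about the test/training split, RMSE, and overfitting can be made concrete with a small sketch. It uses only the Python standard library. The helper names, the 20% test fraction, and the fixed seed are illustrative assumptions, not requirements of the assignment.

```python
import math
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


train, test = train_test_split(range(10))
print(len(train), len(test))         # 8 2
print(rmse([2.0, 2.0], [0.0, 0.0]))  # 2.0
```

Computing `rmse` separately on the training and held-out examples gives a simple overfitting check: a training error far below the test error suggests the model has memorized the training data rather than generalized.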
- | |||
- | ---- | ||
- | ====Possible sources of interesting datasets==== | ||
- | |||
CrowdFlower
- | |||
KDD Cup
- | |||
- | UCI repository | ||
- | |||
- | Kaggle (current and past) | ||
- | |||
- | Data.gov | ||
- | |||
- | AWS | ||
- | |||
World Bank
- | |||
- | BYU CS478 datasets | ||
- | |||
- | data.utah.gov | ||
- | |||
Google Research
- | |||
- | BYU DSC competition |