Differences

This shows you the differences between two versions of the page.

--- cs401r_w2016:lab12 [2018/03/16 20:27] wingated
+++ cs401r_w2016:lab12 [2021/06/30 23:42] (current)
@@ Line 10: / Line 10: @@
   - A notebook containing your code, but we will not run it.
   - A set of predictions for a specific list of <user,movie> pairs, in a CSV file.
-  - A report discussing your approach, how well it worked (in terms of RMSE), and any visualizations or patterns you found in the data.  PDF format, please!
+  - A report discussing your approach, how well it worked (in terms of RMSE), and any visualizations or patterns you found in the data.  Markdown format, please!!
 We will run a small "competition" on your predictions: the three students with the best predictions will get 10% extra credit on this lab.
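Since the report asks for performance in terms of RMSE, here is a minimal sketch of how that number could be computed on ratings you hold out yourself. The function name `rmse` and the tiny example arrays are illustrative assumptions, not part of the lab's provided code.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Square the per-rating errors, average them, then take the square root
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Hypothetical held-out ratings and a constant "always predict 3" baseline
held_out_ratings = [4.0, 3.5, 2.0, 5.0]
constant_baseline = [3.0, 3.0, 3.0, 3.0]
baseline_error = rmse(constant_baseline, held_out_ratings)
print(baseline_error)
```

Comparing your method against a constant baseline like this one gives the RMSE in the report some context.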
@@ Line 21: / Line 21: @@
 Your entry will be graded on the following elements:
-  * 100% Project writeup
+  * 85% Project writeup
-    * 35% Exploratory data analysis
+    * 30% Exploratory data analysis
-    * 35% Description of technical approach
+    * 30% Description of technical approach
-    * 30% Analysis of performance of method
+    * 25% Analysis of performance of method
+  * 15% Submission of predictions csv file
   * 10% extra credit for the three top predictions
@@ Line 48: / Line 49: @@
 As part of this lab, you must submit a set of predictions.  You must provide predictions as a simple CSV file with two columns and 85,000 rows.  Each row has the form
-''testID,predicted rating''
+''testID,predicted_rating''
 The ''testID'' field uniquely identifies each ''user,movie'' prediction pair in the predictions set.
@@ Line 88: / Line 89: @@
 import seaborn
 import pandas
+import numpy as np
 ur = pandas.read_csv('user_ratedmovies_train.dat', sep='\t')
@@ Line 104: / Line 106: @@
 </code>
+And here is some code that writes out the prediction file you will submit:
+<code python>
+import numpy as np
+import pandas as pd
+pred_array = pd.read_table('predictions.dat')
+test_ids = pred_array["testID"]  # Series holding one test ID per row
+pred_array.head()
+N = pred_array.shape[0]
+my_preds = np.zeros(N)
+for i in range(N): ### Prediction loop
+    predicted_rating = 3
+    my_preds[i] = predicted_rating ### This predicts everything as 3
+# The with-block closes the file even if a write fails
+with open('predictions.csv', 'w') as sfile:
+    sfile.write('"testID","predicted_rating"\n')
+    for i in range(N):
+        sfile.write('%d,%.2f\n' % (test_ids.iloc[i], my_preds[i]))
+</code>
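As an alternative to the explicit write loop, the same two-column file could be produced with pandas directly. This is only a sketch: the helper name `build_submission` and the three example IDs are assumptions made for illustration, and the output goes to an in-memory buffer so that the snippet runs without the lab's data files.

```python
import io

import numpy as np
import pandas as pd

def build_submission(test_ids, predicted_ratings):
    """Pair each testID with its predicted rating in a two-column DataFrame."""
    test_ids = np.asarray(test_ids)
    predicted_ratings = np.asarray(predicted_ratings, dtype=float)
    if test_ids.shape != predicted_ratings.shape:
        raise ValueError("need exactly one prediction per testID")
    return pd.DataFrame({"testID": test_ids,
                         "predicted_rating": predicted_ratings})

# Hypothetical IDs; in the lab these would come from predictions.dat
example_ids = [0, 1, 2]
submission = build_submission(example_ids, np.full(3, 3.0))

# Write to a buffer here; pass 'predictions.csv' instead to write a real file
buffer = io.StringIO()
submission.to_csv(buffer, index=False, float_format="%.2f")
csv_text = buffer.getvalue()
print(csv_text)
```

Note that `to_csv` writes the header names without quotes, whereas the loop above writes them quoted. Before submitting, it may also be worth checking that the DataFrame has the 85,000 rows the lab asks for.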
