cs401r_w2016:lab14 [2018/02/12 21:59] sadler
===Part 4: Implementation of Subset of Regressors===
Please follow this description of the subset of regressors approach. In particular, on Monday we discussed partitioning your dataset into $m$ landmarks and the remaining $n$ data points. Don't do that. Instead, think of the $m$ landmarks as re-using points in your dataset -- so $m+n>n$. Your dataset contains $n$ training points, with $n$ x-values and $n$ y-values. Depending on your landmark selection algorithm, the $m$ landmarks could be the same as some of the training points. So, for example: if you have $n=1000$ training points and you randomly pick $m=5$ landmark points, you will effectively have $n+m=1005$ points, but $5$ of those are re-used.
So: in all of the math below, the number $n$ refers to **all** of your training data.
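To make the landmark convention concrete, here is a minimal sketch of one possible landmark-selection step. The variable names and the random-sampling strategy are illustrative assumptions, not part of the lab spec; the point is only that the landmarks are drawn from (and may coincide with) the full training set, which keeps all $n$ points.

```python
import numpy as np

# Illustrative sketch (names and data are placeholders, not from the lab):
# draw m landmarks from the n training points. The landmarks re-use
# training points -- the training set itself is NOT reduced to n - m points.
rng = np.random.default_rng(0)

n = 1000                        # all training points
X = rng.normal(size=(n, 1))     # x-values (placeholder data)
y = rng.normal(size=n)          # y-values (placeholder data)

m = 5                           # number of landmarks
landmark_idx = rng.choice(n, size=m, replace=False)
X_landmarks = X[landmark_idx]   # m landmarks, drawn from the same dataset

# All n points remain in the training set; the m landmarks overlap with it.
assert X.shape[0] == n
assert X_landmarks.shape[0] == m
```

In other words, the "effective" $n+m=1005$ points above are just the $1000$ training points plus $5$ re-used copies; no point is ever removed from the training set.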
sfile = open( 'mean_sub.csv', 'w' )
sfile.write( '"Id","Sales"\n' )
for id in range( 0, N ):