====Objective:====

To gain experience with python, numpy, and linear classification.

Oh, and to remember all of that linear algebra stuff.  ;)

----
====Deliverable:====

You should turn in an iPython notebook that implements the perceptron algorithm on two different datasets: the Iris dataset, and the CIFAR-10 dataset.  Because the perceptron is a binary classifier, we will preprocess the data to create two classes.

Your notebook should also generate a visualization that shows classification accuracy at each iteration, along with the log of the l2 norm of the weight vector.  Examples of both are shown at the right.  **Please note that you should cleanly label your axes!**
{{ :cs501r_f2016:lab2_cacc.png?direct&200|}}

The Iris dataset can be downloaded at the UCI ML repository, or you can download a slightly simpler version here:
[[http://liftothers.org/Fisher.csv|http://liftothers.org/Fisher.csv]]

The CIFAR-10 dataset can be downloaded at
[[https://www.cs.toronto.edu/~kriz/cifar.html|https://www.cs.toronto.edu/~kriz/cifar.html]]

**Note: make sure to download the python version of the data - it will simplify your life!**

----
====Grading standards:====

Your notebook will be graded on the following:

  * 70% Correct implementation of perceptron algorithm
  * 20% Tidy and legible visualization of loss function
  * 10% Tidy and legible plot of classification accuracy over time

----
====Description:====

The purpose of this lab is to help you become familiar with ''numpy'', to remember the basics of classification, and to implement the perceptron algorithm.  The perceptron algorithm is a simple method of learning a separating hyperplane.  It is guaranteed to converge if the dataset is linearly separable - otherwise, there is no guarantee!

You should implement the perceptron algorithm according to the description in Wikipedia:

[[https://en.wikipedia.org/wiki/Perceptron|Perceptron]]

As you implement this lab, you will (hopefully!) learn the difference between numpy's matrices, numpy's vectors, and lists.  In particular, note that a list is not the same as a vector, and an ''n x 1'' matrix is not the same as a vector of length ''n''.

You may find the functions ''np.asmatrix'', ''np.atleast_2d'', and ''np.reshape'' helpful to convert between them.

Also, you may find the function ''np.dot'' helpful to compute matrix-vector products or vector-vector products.  You can transpose a matrix or a vector with the ''.T'' attribute.
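
For example, here is a minimal sketch (with made-up variable names) of how these conversions and products behave; the shapes in the comments are the important part:

<code python>
import numpy as np

x_list = [1.0, 2.0, 3.0]              # a plain python list
x_vec  = np.reshape( x_list, (3,) )   # a numpy vector of length 3
x_col  = np.atleast_2d( x_list ).T    # a 3 x 1 matrix (a column)

W = np.random.randn( 3, 3 )           # a 3 x 3 matrix

print( np.dot( W, x_vec ).shape )     # (3,)   - matrix-vector product
print( np.dot( W, x_col ).shape )     # (3, 1) - matrix times 3 x 1 matrix
print( np.dot( x_vec, x_vec ) )       # a scalar - vector-vector (dot) product
</code>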

**Preparing the data:**

We need to convert both datasets to binary classification problems.  To show you how we're going to do this, and to give you a bit of code to get started, here is how I loaded and converted the Iris dataset:

<code python>
import pandas
import numpy as np

data = pandas.read_csv( 'Fisher.csv' )
m = data.as_matrix()                   # raw numpy array (use data.values on newer pandas)
labels = m[:,0]
labels[ labels==2 ] = 1                # merge class 2 into class 1 to get a binary problem
labels = np.atleast_2d( labels ).T     # make labels an n x 1 column
features = m[:,1:5]                    # columns 1-4 hold the four flower measurements
</code>

and the CIFAR-10 dataset:

<code python>
import numpy as np

def unpickle( file ):
    # load one pickled CIFAR-10 batch and return it as a dictionary
    import cPickle                # python 2; on python 3, use pickle with an encoding argument
    fo = open( file, 'rb' )
    batch = cPickle.load( fo )
    fo.close()
    return batch

data = unpickle( 'cifar-10-batches-py/data_batch_1' )

features = data['data']           # one row of 3072 pixel values per image
labels = data['labels']
labels = np.atleast_2d( labels ).T

# squash classes 0-4 into class 0, and squash classes 5-9 into class 1
labels[ labels < 5 ] = 0
labels[ labels >= 5 ] = 1

</code>

**Running the perceptron algorithm**

Remember that if a data instance is classified correctly, there is no change in the weight vector.

In the Wikipedia description of the perceptron algorithm, notice the function ''f''.  That's the Heaviside step function.  What does it do?
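
To make the update rule concrete, here is a rough, hedged sketch of a single perceptron update on one example, assuming 0/1 labels; the names ''w'', ''x'', ''y'', and ''lr'' are just placeholders, and you still need to write the outer loops over the data and over iterations yourself:

<code python>
import numpy as np

def heaviside( a ):
    # the Heaviside step function: 1 if the activation is positive, 0 otherwise
    return 1.0 if a > 0 else 0.0

def perceptron_step( w, x, y, lr=1.0 ):
    # predict with the current weights, then nudge them toward the true label
    y_hat = heaviside( np.dot( w, x ) )
    # if the prediction is correct, (y - y_hat) is zero and w does not change
    return w + lr * ( y - y_hat ) * x
</code>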

**Computing the l2 norm of the weight vector**

This should only take a single line of code.  Hint: can you rewrite the l2 norm in terms of dot products?
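
For example, one possible one-liner, assuming ''w'' is a numpy vector (the tiny random ''w'' below is only there to make the snippet runnable):

<code python>
import numpy as np

w = np.random.randn( 5 )              # stand-in for your weight vector
l2_norm = np.sqrt( np.dot( w, w ) )   # same value as np.linalg.norm( w )
log_l2  = np.log( l2_norm )           # the quantity to plot each iteration
</code>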

----
====Hints:====

An easy way to load a CSV datafile is with the ''pandas'' package.

Here are some functions that may be helpful to you (a small plotting sketch using them follows the list):

<code python>

np.random.randn

import matplotlib.pyplot as plt

plt.figure
plt.xlabel
plt.ylabel
plt.legend
plt.show

</code>
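
As a hedged sketch of how those plotting calls fit together (the arrays here are placeholders - substitute the accuracy and log-norm values you record at each iteration):

<code python>
import numpy as np
import matplotlib.pyplot as plt

# placeholder data - replace with the values you record during training
iterations = np.arange( 100 )
accuracy   = np.random.rand( 100 )            # classification accuracy per iteration
log_norm   = np.log( np.arange( 1, 101.0 ) )  # log of the l2 norm of the weights

plt.figure()
plt.plot( iterations, accuracy, label='classification accuracy' )
plt.plot( iterations, log_norm, label='log of l2 norm of weights' )
plt.xlabel( 'iteration' )
plt.ylabel( 'value' )
plt.legend()
plt.show()
</code>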