cs501r_f2016:lab2 [2016/09/02 04:28] wingated
====Objective:====
To gain experience with python, numpy, and linear classification.

Oh, and to remember all of that linear algebra stuff. ;)

----
====Deliverable:====

You should turn in an IPython notebook that implements the perceptron algorithm on two different datasets: the Iris dataset, and the CIFAR-10 dataset. Because the perceptron is a binary classifier, we will preprocess the data to create two classes.

Your notebook should also generate a visualization that shows classification accuracy at each iteration, along with the log of the l2 norm of the weight vector. Examples of both are shown at the right. **Please note that you should cleanly label your axes!**
{{ :cs501r_f2016:lab2_cacc.png?direct&200|}}

The Iris dataset can be downloaded at the UCI ML repository, or you can download a slightly simpler version here:
[[http://liftothers.org/Fisher.csv|http://liftothers.org/Fisher.csv]]

The CIFAR-10 dataset can be downloaded at
[[https://www.cs.toronto.edu/~kriz/cifar.html|https://www.cs.toronto.edu/~kriz/cifar.html]]

**Note: make sure to download the python version of the data - it will simplify your life!**

----
====Grading standards:====

Your notebook will be graded on the following:

  * 70% Correct implementation of the perceptron algorithm
  * 20% Tidy and legible visualization of the loss function
  * 10% Tidy and legible plot of classification accuracy over time

----
====Description:====

The purpose of this lab is to help you become familiar with ''numpy'', to remember the basics of classification, and to implement the perceptron algorithm. The perceptron algorithm is a simple method of learning a separating hyperplane. It is guaranteed to converge if and only if the dataset is linearly separable - otherwise, there is no guarantee!

You should implement the perceptron algorithm according to the description in Wikipedia:

[[https://en.wikipedia.org/wiki/Perceptron|Perceptron]]

As you implement this lab, you will (hopefully!) learn the difference between numpy's matrices, numpy's vectors, and lists. In particular, note that a list is not the same as a vector, and an ''n x 1'' matrix is not the same as a vector of length ''n''.

You may find the functions ''np.asmatrix'', ''np.atleast_2d'', and ''np.reshape'' helpful to convert between them.

Also, you may find the function ''np.dot'' helpful to compute matrix-vector products or vector-vector products. You can transpose a matrix or a vector with the ''.T'' attribute.
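To see the distinction concretely, here is a small sketch (the variable names are just for illustration):

<code python>
import numpy as np

a = [1.0, 2.0, 3.0]            # a plain Python list -- has no shape at all
v = np.array(a)                # a 1-d numpy vector, shape (3,)
m = np.atleast_2d(v).T         # a 2-d n x 1 matrix, shape (3, 1)

print(v.shape)                 # (3,)
print(m.shape)                 # (3, 1)

# np.dot of two 1-d vectors gives a scalar...
print(np.dot(v, v))            # 14.0

# ...but the same product with n x 1 matrices gives a 1 x 1 matrix
print(np.dot(m.T, m))          # [[14.]]
</code>

Off-by-one-dimension bugs like these are the most common source of silent errors in this lab, so check your shapes early and often.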

**Preparing the data:**

We need to convert both datasets to binary classification problems. To show you how we're going to do this, and to give you a bit of code to get started, here is how I loaded and converted the Iris dataset:

<code python>
import numpy as np
import pandas

data = pandas.read_csv( 'Fisher.csv' )
m = data.values
labels = m[:,0]
labels[ labels==2 ] = 1
labels = np.atleast_2d( labels ).T
features = m[:,1:5]
</code>

and the CIFAR-10 dataset:

<code python>
import pickle
import numpy as np

def unpickle( file ):
    # the CIFAR-10 batches were pickled under python 2, so under
    # python 3 we must pass an encoding to load them
    with open( file, 'rb' ) as fo:
        return pickle.load( fo, encoding='latin1' )

data = unpickle( 'cifar-10-batches-py/data_batch_1' )

features = data['data']
labels = data['labels']
labels = np.atleast_2d( labels ).T

# squash classes 0-4 into class 0, and squash classes 5-9 into class 1
labels[ labels < 5 ] = 0
labels[ labels >= 5 ] = 1
</code>

**Running the perceptron algorithm**

Remember that if a data instance is classified correctly, there is no change in the weight vector.

In the wikipedia description of the perceptron algorithm, notice the function ''f''. That's the Heaviside step function. What does it do?
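If you want to check your structure against something concrete, the training loop can be sketched as below. This is a minimal sketch, not a reference solution: the learning rate, iteration count, and function names are illustrative assumptions, and ''features''/''labels'' are assumed to be the arrays prepared above (with labels 0 and 1).

<code python>
import numpy as np

def heaviside(a):
    # the Heaviside step function: 1 if the activation is >= 0, else 0
    return 1.0 if a >= 0.0 else 0.0

def train_perceptron(features, labels, n_iters=100, lr=1.0):
    n, d = features.shape
    w = np.zeros(d + 1)                       # weights, plus a bias term at index 0
    accuracy = []
    for _ in range(n_iters):
        correct = 0
        for i in range(n):
            x = np.hstack([1.0, features[i]])  # prepend a constant 1 for the bias
            y_hat = heaviside(np.dot(w, x))
            if y_hat == labels[i]:
                correct += 1                   # classified correctly: no change to w
            else:
                # misclassified: move w toward (or away from) this instance
                w += lr * (labels[i] - y_hat) * x
        accuracy.append(correct / float(n))
    return w, accuracy
</code>

Note that the weight update only fires on misclassified instances, which is exactly the "no change if correct" rule above.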

**Computing the l2 norm of the weight vector**

This should only take a single line of code. Hint: can you rewrite the l2 norm in terms of dot products?
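For example, on a made-up weight vector (''w'' here is purely illustrative):

<code python>
import numpy as np

w = np.array([3.0, 4.0])           # example weight vector, for illustration only

# the l2 norm is the square root of w dotted with itself
norm = np.sqrt(np.dot(w, w))
print(norm)                        # 5.0
</code>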

----
====Hints:====

An easy way to load a CSV datafile is with the ''pandas'' package.

Here are some functions that may be helpful to you:

<code python>
np.random.randn

import matplotlib.pyplot as plt

plt.figure
plt.xlabel
plt.ylabel
plt.legend
plt.show
</code>
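Putting those plotting hints together, a cleanly labeled accuracy plot might look like this (a sketch: the ''accuracy'' list here is placeholder data, standing in for the per-iteration accuracies from your training loop):

<code python>
import matplotlib.pyplot as plt

accuracy = [0.5, 0.7, 0.8, 0.9, 0.95, 1.0]    # placeholder data for illustration

plt.figure()
plt.plot(range(len(accuracy)), accuracy, label='training accuracy')
plt.xlabel('iteration')                        # remember: cleanly label your axes!
plt.ylabel('classification accuracy')
plt.legend()
plt.show()
</code>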