cs501r_f2016:lab2 [2016/08/31 18:23] wingated
====Objective:====

To gain experience with python, numpy, and linear classification. Oh, and to remember all of that linear algebra stuff. ;)

----
====Deliverable:====

You should turn in an IPython notebook that implements the perceptron algorithm on the Iris dataset.

Your notebook should also generate a visualization that shows the loss at each iteration. This can be a single plot, shown in the notebook.

The dataset can be downloaded from
[[https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data|the UCI ML repository]]

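As a rough sketch of the expected structure (not the required notebook), the perceptron loop with a per-epoch loss trace might look like the following. The toy data here is a stand-in for the Iris features and labels, which you would load from ''iris.data'' yourself:

```python
import numpy as np

# Toy two-class data standing in for Iris (loading iris.data is up to you).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])  # labels in {-1, +1}

w, b = np.zeros(2), 0.0
losses = []  # misclassification count per epoch -- this is what you plot
for epoch in range(20):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:  # misclassified: apply the perceptron update
            w += yi * xi
            b += yi
            errors += 1
    losses.append(errors)

accuracy = np.mean(np.sign(X @ w + b) == y)
```

The ''losses'' list is the quantity to visualize with matplotlib for the deliverable above.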
----
====Grading standards:====

Your notebook will be graded on the following:

  * 70% Correct implementation of perceptron algorithm
  * 20% Tidy and legible visualization of loss function
  * 10% Tidy and legible final classification rate

----
====Description:====

For this lab, you will be experimenting with Kernel Density Estimators (see MLAPP 14.7.2). These are a simple, nonparametric alternative to Gaussian mixture models, and they form an important part of the machine learning toolkit.

At several points during this lab, you will need to construct density estimates that are "class-conditional". For example, in order to classify a test point $x_j$, you need to compute

$$p( \mathrm{class}=k | x_j, \mathrm{data} ) \propto p( x_j | \mathrm{class}=k, \mathrm{data} ) p(\mathrm{class}=k | \mathrm{data} ) $$

where

$$p( x_j | \mathrm{class}=k, \mathrm{data} )$$

is given by a kernel density estimator derived from all data of class $k$.

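One way to organize this is a function returning the log of the class-conditional KDE, combined with log-priors via the rule above. This is only a sketch, not the required implementation: the Gaussian kernel and the bandwidth ''sigma'' are choices left to you, and the helper names here are made up for illustration.

```python
import numpy as np

def kde_log_density(x, class_data, sigma):
    """Log of a Gaussian KDE at point x, built from all points of one class.

    class_data: (N, D) array of training points for class k; x: (D,) array.
    The bandwidth sigma is a free parameter you will need to choose.
    """
    N, D = class_data.shape
    sq_dists = np.sum((class_data - x) ** 2, axis=1)
    log_kernels = -sq_dists / (2 * sigma**2) - 0.5 * D * np.log(2 * np.pi * sigma**2)
    # log-sum-exp for numerical stability, then average over the N kernels
    m = log_kernels.max()
    return m + np.log(np.exp(log_kernels - m).sum()) - np.log(N)

def classify(x, class_datasets, log_priors, sigma):
    """argmax over k of log p(x | class=k, data) + log p(class=k | data)."""
    scores = [kde_log_density(x, data_k, lp_k) if False else
              kde_log_density(x, data_k, sigma) + lp_k
              for data_k, lp_k in zip(class_datasets, log_priors)]
    return int(np.argmax(scores))
```

Working in log space matters here: products of many small kernel values underflow quickly in 784 dimensions.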
The data that you will be analyzing is the famous [[http://yann.lecun.com/exdb/mnist/|MNIST handwritten digits dataset]]. You can download some pre-processed MATLAB data files below:

[[http://hatch.cs.byu.edu/courses/stat_ml/mnist_train.mat|MNIST training data vectors and labels]]

[[http://hatch.cs.byu.edu/courses/stat_ml/mnist_test.mat|MNIST test data vectors and labels]]

These can be loaded using the ''scipy.io.loadmat'' function, as follows:

<code python>
import scipy.io

# Load the training vectors and labels from the .mat file
train_mat = scipy.io.loadmat('mnist_train.mat')
train_data = train_mat['images']
train_labels = train_mat['labels']

# Load the test vectors and labels
test_mat = scipy.io.loadmat('mnist_test.mat')
test_data = test_mat['t10k_images']
test_labels = test_mat['t10k_labels']
</code>

The training data vectors are now in ''train_data'', a numpy array of size 784x60000, with corresponding labels in ''train_labels'', a numpy array of size 60000x1.

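The class-conditional estimators need the data split up by digit, and the priors $p(\mathrm{class}=k | \mathrm{data})$ can come from label counts. Here is a sketch of that bookkeeping, using small random stand-in arrays with the same layout as the matrices above (the real ''train_data'' / ''train_labels'' are not loaded here):

```python
import numpy as np

# Random stand-ins shaped like the loaded MNIST arrays (784 x N data,
# N x 1 labels); in the lab you would use train_data / train_labels.
rng = np.random.default_rng(1)
data = rng.random((784, 100))
labels = rng.integers(0, 10, size=(100, 1)).ravel()  # flatten N x 1 -> N

# Empirical class priors p(class = k | data):
priors = np.bincount(labels, minlength=10) / labels.size

# All columns belonging to class k, e.g. k = 3, for the class-k estimator:
class_3 = data[:, labels == 3]
```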
----
====Hints:====

Here is a simple way to visualize a digit. Suppose our digit is in variable ''X'', which has dimensions 784x1:

<code python>
import matplotlib.pyplot as plt

# Reshape the flat 784-vector into a 28x28 image (transposed for orientation)
plt.imshow( X.reshape(28,28).T, interpolation='nearest', cmap=plt.cm.gray )
plt.show()
</code>
- | |||
- | Here are some functions that may be helpful to you: | ||
- | |||
- | <code python> | ||
- | |||
- | import matplotlib.pyplot as plt | ||
- | plt.subplot | ||
- | |||
- | numpy.argmax | ||
- | |||
- | numpy.exp | ||
- | |||
- | numpy.mean | ||
- | |||
- | numpy.bincount | ||
- | |||
- | </code> |
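For the final classification rate, a typical pattern (with made-up scores here, purely for illustration) combines ''numpy.argmax'' over per-class scores with ''numpy.mean'' over correctness:

```python
import numpy as np

# Made-up log-posterior scores: 2 classes (rows) x 3 test points (columns).
log_post = np.array([[0.2, -1.0, 0.5],
                     [-0.3, 0.1, 0.4]])
true_labels = np.array([0, 1, 0])

preds = np.argmax(log_post, axis=0)       # predicted class per column
accuracy = np.mean(preds == true_labels)  # the final classification rate
```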