cs501r_f2016:lab2 [2016/08/31 18:23] wingated
====Objective:====

To gain experience with python, numpy, and linear classification.  Oh, and to remember all of that linear algebra stuff.  ;)

----
====Deliverable:====

You should turn in an IPython notebook that implements the perceptron algorithm on the Iris dataset.

Your notebook should also generate a visualization that shows the loss function at each iteration.  This can be generated as a single plot, and shown in the notebook.

The dataset can be downloaded from
[[https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data|the UCI ML repository]].
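The perceptron update itself is only a few lines.  Here is a rough sketch of one way to structure it (the function name, the synthetic stand-in data, and the choice of misclassification rate as the plotted loss are mine, not part of the lab spec -- you will substitute the Iris features and labels):

<code python>
import numpy as np

def perceptron(X, y, iters=100, lr=1.0):
    """Classic perceptron.  X is (n, d); y has entries in {-1, +1}.
    Returns the learned weights (bias folded in) and the per-iteration
    misclassification rate, which can be plotted as the loss."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])   # append a constant feature for the bias
    w = np.zeros(d + 1)
    losses = []
    for _ in range(iters):
        preds = np.where(Xb @ w >= 0, 1, -1)
        mistakes = np.flatnonzero(preds != y)
        losses.append(len(mistakes) / n)
        for i in mistakes:
            w += lr * y[i] * Xb[i]         # update only on misclassified points
    return w, losses

# Tiny linearly separable stand-in for the Iris features
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, losses = perceptron(X, y, iters=10)
</code>

On separable data like this, the loss drops to zero once the algorithm converges; on Iris you would plot ''losses'' to produce the required visualization.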
----

====Grading standards:====

Your notebook will be graded on the following:

  * 70% Correct implementation of perceptron algorithm
  * 20% Tidy and legible visualization of loss function
  * 10% Tidy and legible final classification rate

----
====Description:====

For this lab, you will be experimenting with Kernel Density Estimators (see MLAPP 14.7.2).  These are a simple, nonparametric alternative to Gaussian mixture models, and they form an important part of the machine learning toolkit.

At several points during this lab, you will need to construct density estimates that are "class-conditional".  For example, in order to classify a test point $x_j$, you need to compute

$$p( \mathrm{class}=k | x_j, \mathrm{data} ) \propto p( x_j | \mathrm{class}=k, \mathrm{data} ) p( \mathrm{class}=k | \mathrm{data} )$$

where

$$p( x_j | \mathrm{class}=k, \mathrm{data} )$$

is given by a kernel density estimator derived from all data of class $k$.
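Concretely, the class-conditional term can be computed with a Gaussian KDE over the training points of class $k$, and combined with the class prior via the proportionality above.  A minimal sketch (the function names, the bandwidth ''sigma'', and the toy data are my own choices, not part of the lab spec):

<code python>
import numpy as np

def log_kde(x, data, sigma=0.5):
    """Log of a Gaussian kernel density estimate at point x.
    data has shape (d, n): one column per training example of this class.
    Class-independent normalizing constants are dropped, since they
    cancel in the argmax over classes."""
    d, n = data.shape
    sq_dists = np.sum((data - x[:, None]) ** 2, axis=0)
    log_kernels = -sq_dists / (2.0 * sigma ** 2)
    m = log_kernels.max()                  # log-sum-exp for numerical stability
    return m + np.log(np.exp(log_kernels - m).sum()) - np.log(n)

def classify(x, class_data, log_priors):
    """argmax_k of  log p(x | class=k, data) + log p(class=k | data)."""
    scores = [log_kde(x, d_k) + lp for d_k, lp in zip(class_data, log_priors)]
    return int(np.argmax(scores))

# Two toy 2-D classes: one cluster near the origin, one near (5, 5)
class0 = np.array([[0.0, 0.1, -0.1], [0.0, 0.1, -0.1]])
class1 = class0 + 5.0
label = classify(np.array([4.9, 5.1]), [class0, class1], [np.log(0.5)] * 2)
</code>

Working in log space matters here: with hundreds of 784-dimensional points per class, the raw kernel values underflow to zero.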
The data that you will be analyzing is the famous [[http://yann.lecun.com/exdb/mnist/|MNIST handwritten digits dataset]].  You can download some pre-processed MATLAB data files below:

[[http://hatch.cs.byu.edu/courses/stat_ml/mnist_train.mat|MNIST training data vectors and labels]]

[[http://hatch.cs.byu.edu/courses/stat_ml/mnist_test.mat|MNIST test data vectors and labels]]

These can be loaded using the ''scipy.io.loadmat'' function, as follows:
- 
<code python>
import scipy.io

train_mat = scipy.io.loadmat('mnist_train.mat')
train_data = train_mat['images']
train_labels = train_mat['labels']

test_mat = scipy.io.loadmat('mnist_test.mat')
test_data = test_mat['t10k_images']
test_labels = test_mat['t10k_labels']
</code>

The training data vectors are now in ''train_data'', a numpy array of size 784x60000, with corresponding labels in ''train_labels'', a numpy array of size 60000x1.
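Since the density estimates are class-conditional, a useful first step is to group the columns of ''train_data'' by label and estimate the class priors from the label counts.  A sketch of that step, using small random stand-in arrays with the same layout as the real 784x60000 matrices (so it runs without the downloads):

<code python>
import numpy as np

# Random stand-ins with the same layout as the real files:
# one image per column, one label per row
rng = np.random.default_rng(0)
train_data = rng.random((784, 600))
train_labels = rng.integers(0, 10, size=(600, 1))

flat = train_labels.ravel()
# columns of each class, for building the class-conditional KDEs
by_class = {k: train_data[:, flat == k] for k in range(10)}
# p(class=k | data) estimated from label frequencies
priors = {k: by_class[k].shape[1] / train_data.shape[1] for k in range(10)}
</code>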
----

====Hints:====

Here is a simple way to visualize a digit.  Suppose our digit is in variable ''X'', which has dimensions 784x1:

<code python>
import matplotlib
import matplotlib.pyplot as plt

plt.imshow( X.reshape(28,28).T, interpolation='nearest', cmap=matplotlib.cm.gray )
plt.show()
</code>

Here are some functions that may be helpful to you:

<code python>
import matplotlib.pyplot as plt
plt.subplot

numpy.argmax
numpy.exp
numpy.mean
numpy.bincount
</code>
cs501r_f2016/lab2.txt · Last modified: 2021/06/30 23:42 (external edit)