====Deliverable:====

For this lab, you will turn in an ipython notebook that implements the "Bayesian Concept Learning" model from Chapter 3 of MLAPP.

[[http://liftothers.org/courses/stat_ml/mlapp_ch3.pdf|Here is a PDF of the relevant chapter.]]
  
Your notebook should perform the following functions:
  * Construct the set of number-game concepts and a prior over them
  * Compute the likelihood of the observed data under each concept
  * Compute the posterior distribution over concepts
  * Compute and plot the posterior predictive distribution

When you display your prior, likelihood, and posterior, your figure should look something like the ones in the book; my version is shown here:
  
{{:cs401r_w2016:lab2_plp.JPG?direct&800|}}
  
Similarly, when you display the posterior predictive, your figure should look something like this:
  
{{:cs401r_w2016:lab2_predpost.JPG?direct&800|}}

----
====Grading standards:====

Your notebook will be graded on the following:

  * 10% Correctly formed & normalized prior
  * 20% Correctly formed likelihood
  * 30% Correctly formed & normalized posterior
  * 30% Correctly formed & normalized posterior predictive
  * 10% Tidy and legible figures, including labeled axes

//Remember: correct normalization may mean different things for different distributions!//
  
----
====Description:====
  
Following the Bayesian Concept Learning example in Chapter 3 of MLAPP, we're interested in reasoning about the origin of a set of numbers.  We'll do this by placing a prior over a set of possible //concepts// (or "candidate origins"), and then use Bayes' law to construct a posterior distribution over concepts given some data.
  
For this lab, we will only consider numbers between 0 and 100, inclusive.
  
To make grading easier on our incredible TA, your notebook should construct a set of possible number-game concepts that are the same as the concepts in the book (see Fig. 3.2).  You must assign a prior probability to each concept; your prior should be:

<code python>
import numpy

# start with uniform weight on every concept
prior = numpy.ones(len(concepts))
# give the first two concepts extra weight
prior[0] = 5
prior[1] = 5
# give concepts 30 and 31 very little weight
prior[30] = .01
prior[31] = .01
# normalize so the prior sums to 1
prior = prior / numpy.sum(prior)
</code>

This prior distribution is
  
$$p(h)$$

Your notebook must also compute the likelihood of the observed data under each concept:

$$p(\mathrm{data} | h )$$
  
**Important:** you can assume that each number in the data was sampled **independently**, and that each number was sampled **uniformly** from the set of all possible numbers //in that concept//.
  
//Hint: what does that imply about the probability of sampling a given number from a concept with lots of possibilities, such as the ''all'' concept, vs. a concept with few possibilities, such as ''multiples of 10''?//
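
Here is a minimal sketch of one possible likelihood computation. It assumes (this representation is an assumption, not a requirement) that each entry of ''concepts'' is a Python set of the integers in that concept, and that ''data'' is a list of observed numbers:

<code python>
import numpy

def likelihood(data, concepts):
    lik = numpy.zeros(len(concepts))
    for i, c in enumerate(concepts):
        # a concept can only generate data that lies entirely inside it
        if all(x in c for x in data):
            # sampling uniformly and independently from the concept means
            # each of the N data points has probability 1/|c|
            lik[i] = (1.0 / len(c)) ** len(data)
    return lik
</code>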
Using Bayes' law, you can then combine the prior and likelihood to construct the posterior distribution over concepts,

$$p( h | \mathrm{data} )$$

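Continuing the sketch above (same assumed ''concepts'', ''prior'', and ''data''), this is just the normalized product of the likelihood and the prior:

<code python>
# Bayes' law: the posterior is proportional to likelihood times prior
lik = likelihood(data, concepts)
posterior = lik * prior
# normalize so the posterior sums to 1 over concepts
posterior = posterior / numpy.sum(posterior)
</code>
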
Prepare a figure, as described in the Deliverable, that illustrates your prior, the likelihood of the data for each concept, and the posterior.  **Note:** distributions should be properly normalized.
  
You must also prepare a figure showing the //posterior predictive distribution//.  This distribution describes the probability that a number $\tilde{x}$ is in the target concept (which we'll call $C$), given the data.  (Note that we're drawing a subtle distinction between the true //concept// and a //hypothesis//.)  The book is somewhat unclear on this, but to do this, we marginalize out the specific hypothesis:

$$p(\tilde{x} \in C | \mathrm{data} ) = \sum_h p(\tilde{x} \in C , h | \mathrm{data} )$$

$$p(\tilde{x} \in C | \mathrm{data} ) = \sum_h p(\tilde{x} \in C | h) p( h | \mathrm{data} )$$
We've already computed the posterior $p( h | \mathrm{data} )$, so we're only left with the term $p(\tilde{x} \in C | h)$.  For this, just use an //indicator// function that returns 1 if $\tilde{x}$ is in $h$, and 0 otherwise.
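
In the sketch representation used above, that indicator is just a set-membership test, so the posterior predictive might be computed like this:

<code python>
# posterior predictive over the numbers 0..100: for each number, add up
# the posterior mass of every concept that contains it
pred = numpy.zeros(101)
for x in range(101):
    pred[x] = numpy.sum([posterior[i] for i, c in enumerate(concepts) if x in c])
</code>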
----
====Hints:====
  
You may find the following functions useful:

<code python>
# changes the title of a figure
plt.title
# changes the x-axis label of an axis
plt.xlabel

# changes the xlimits of an axis
plt.xlim
# changes the ylimits of an axis
plt.ylim
  
</code>
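
For example, a quick way to produce a tidy, labeled figure (using the hypothetical ''posterior'' array from the sketches above):

<code python>
import matplotlib.pyplot as plt

# simple bar plot of the posterior with a title and labeled axes
plt.bar(range(len(posterior)), posterior)
plt.title("Posterior over concepts")
plt.xlabel("concept index")
plt.ylabel("probability")
plt.show()
</code>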