This shows you the differences between two versions of the page.
cs401r_w2016:lab3 [2015/12/23 21:08] admin |
cs401r_w2016:lab3 [2021/06/30 23:42] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====Objective:==== | ||
- | |||
- | To understand how to sample from different distributions, and to | ||
- | understand the link between samples and a PDF/PMF. To explore | ||
- | different parameter settings of common distributions, and to implement | ||
- | a small library of random variable types. | ||
- | |||
- | ====Deliverable:==== | ||
- | |||
- | You should turn in an IPython notebook that implements and tests a | ||
- | library of random variable types. | ||
- | |||
- | When run, this notebook should sample multiple times from each type of | ||
- | random variable; these samples should be aggregated and visualized, | ||
- | and compared to the corresponding PDF/PMF. The result should look | ||
- | something like this: | ||
- | |||
- | {{:cs401r_w2016:lab3.png?nolink|}} | ||
- | |||
- | For multidimensional variables, your visualization should convey information in a natural way; you can either use 3d surfaces, or 2d contour plots: | ||
- | |||
- | {{:cs401r_w2016:lab3_2d.png?nolink|}} | ||
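For the 2d contour option mentioned above, one possible approach is to evaluate the density on a regular grid and hand the grid to a contour routine. The following is only a sketch under stated assumptions: the helper name ''gaussian2d_log_pdf'', the mean, and the covariance are hypothetical choices for illustration and are not part of the lab specification.

```python
import numpy as np

def gaussian2d_log_pdf(points, mean, cov):
    """Log-density of a 2-D Gaussian at each row of `points` (shape (n, 2))."""
    diff = points - mean
    cov_inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance for every row.
    maha = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)
    # For d = 2: (d/2)*log(2*pi) + 0.5*log|cov|
    log_norm = np.log(2.0 * np.pi) + 0.5 * np.log(np.linalg.det(cov))
    return -log_norm - 0.5 * maha

# Hypothetical parameters, chosen only for illustration.
mean = np.array([0.0, 1.0])
cov = np.array([[1.0, 0.5], [0.5, 2.0]])

# Evaluate the density on a regular grid.
xs = np.linspace(-4.0, 4.0, 81)
ys = np.linspace(-4.0, 6.0, 101)
X, Y = np.meshgrid(xs, ys)
grid = np.column_stack([X.ravel(), Y.ravel()])
Z = np.exp(gaussian2d_log_pdf(grid, mean, cov)).reshape(X.shape)
```

The arrays ''X'', ''Y'', and ''Z'' have matching shapes, so they can be passed directly to ''matplotlib.pyplot.contour(X, Y, Z)'' and overlaid on a scatter plot of the samples.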
- | |||
- | ====Description:==== | ||
- | |||
- | You must implement seven random variable objects. For each type, you should be able to sample from that distribution, and compute the log-likelihood of a particular value. All of your classes should inherit from a base random variable object that supports the following methods: | ||
- | |||
- | <code python> | ||
- | |||
- | class RandomVariable: | ||
- |     def __init__( self ): | ||
- |         self.state = None | ||
- |         pass | ||
- | |||
- |     def get( self ): | ||
- |         return self.state | ||
- | |||
- |     def sample( self ): | ||
- |         pass | ||
- | |||
- |     def log_likelihood( self ): | ||
- |         pass | ||
- | |||
- |     def propose( self ): | ||
- |         pass | ||
- | | ||
- | </code> | ||
- | |||
- | You don't need to implement the ''get'' or ''propose'' methods yet. For example, your univariate Gaussian class might look like this: | ||
- | |||
- | <code python> | ||
- | |||
- | import numpy | ||
- | |||
- | class Gaussian( RandomVariable ): | ||
- |     def __init__( self, mu, sigma ): | ||
- |         self.mu = mu | ||
- |         self.sigma = sigma | ||
- |         self.state = 0 | ||
- | |||
- |     def sample( self ): | ||
- |         return self.mu + self.sigma * numpy.random.randn() | ||
- | |||
- |     def log_likelihood( self, X, mu, sigma ): | ||
- |         # log N(X; mu, sigma^2) = -log(sigma*sqrt(2*pi)) - (X-mu)^2 / (2*sigma^2) | ||
- |         return -numpy.log( sigma*numpy.sqrt(2*numpy.pi) ) - (X-mu)**2/(2*sigma**2) | ||
- | | ||
- | </code> | ||
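One way to check the link between samples and the PDF numerically, before plotting anything, is to compare a normalised histogram of the samples with the density evaluated at the bin centres. The sketch below makes several assumptions of its own: the class name ''GaussianSketch'', the seeded generator, the sample count, and the bin count are illustrative choices, and the parameters are stored on the instance rather than passed to ''log_likelihood''.

```python
import numpy as np

class GaussianSketch:
    """Illustrative univariate Gaussian; parameters live on the instance."""
    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def sample(self, rng):
        # Shift and scale a standard normal draw.
        return self.mu + self.sigma * rng.standard_normal()

    def log_likelihood(self, x):
        # -log(sigma*sqrt(2*pi)) - (x - mu)^2 / (2*sigma^2)
        return (-np.log(self.sigma * np.sqrt(2.0 * np.pi))
                - (x - self.mu) ** 2 / (2.0 * self.sigma ** 2))

rng = np.random.default_rng(0)                   # seeded so the run is repeatable
rv = GaussianSketch(mu=2.0, sigma=np.sqrt(3.0))  # mean=2, variance=3
samples = np.array([rv.sample(rng) for _ in range(20000)])

# Normalised histogram: bar heights integrate to 1, like a PDF.
heights, edges = np.histogram(samples, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
pdf = np.exp(rv.log_likelihood(centers))

# Largest disagreement between the histogram and the true density.
max_gap = np.max(np.abs(heights - pdf))
```

If ''max_gap'' is large relative to the peak of the density, either the sampler or the log-likelihood is probably wrong; plotting ''heights'' and ''pdf'' against ''centers'' shows where they diverge.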
- | |||
- | **Given that framework, you should implement:** | ||
- | |||
- | * The following one-dimensional distributions. For | ||
- | these, you should also plot the true PDF (or PMF, for the discrete Poisson) of the random variable on the | ||
- | same plot as the samples; the curves should match. //Note: it is **not** sufficient to let seaborn estimate the PDF using its built-in KDE estimator; you need to plot the true PDF. In other words, you can't just use seaborn.kdeplot!// | ||
- | |||
- | * ''Beta (alpha=1, beta=3)'' | ||
- | * ''Poisson (lambda=7)'' | ||
- | * ''Univariate Gaussian (mean=2, variance=3)'' | ||
- | |||
- | * The following discrete distributions. For these, plot predicted and | ||
- | empirical histograms side-by-side: | ||
- | * ''Bernoulli (p=0.7)'' | ||
- | * ''Multinomial (theta=[0.1, 0.2, 0.7])'' | ||
- | |||
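- | * The following multidimensional distributions. For these, visualize the density with a 3d surface or a 2d contour plot, as described in the Deliverable section: | ||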
- | * Two-dimensional Gaussian | ||
- | * 3-dimensional Dirichlet | ||
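For the discrete distributions in the list above, the predicted and empirical histograms can be computed as two arrays of the same length before plotting them side by side. This is a minimal sketch, assuming a seeded ''numpy.random.default_rng'' generator and a hypothetical sample count ''n_draws''; only the parameter values come from the list.

```python
import numpy as np

rng = np.random.default_rng(0)     # seeded so the run is repeatable
theta = np.array([0.1, 0.2, 0.7])  # Multinomial parameters from the list above
n_draws = 10000                    # illustrative sample count

# One categorical outcome per draw (a multinomial with a single trial).
draws = rng.choice(len(theta), size=n_draws, p=theta)
empirical = np.bincount(draws, minlength=len(theta)) / n_draws

# Bernoulli(p=0.7) as a thresholded uniform draw.
p = 0.7
flips = (rng.random(n_draws) < p).astype(int)
bernoulli_empirical = np.bincount(flips, minlength=2) / n_draws
bernoulli_predicted = np.array([1.0 - p, p])
```

Here ''theta'' and ''bernoulli_predicted'' are the predicted bar heights, and ''empirical'' and ''bernoulli_empirical'' are the observed frequencies; the two sets should agree more closely as ''n_draws'' grows.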
- | |||
- | **Important notes:** | ||
- | |||
- | **You //may// use [[http://docs.scipy.org/doc/numpy-1.10.0/reference/routines.random.html|numpy.random]] to sample from the appropriate distributions.** | ||
- | |||
- | **You may //not// use any existing code to calculate the log-likelihoods.** But you can, of course, use any online resources or the book to find the appropriate definition of each PDF. | ||
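As an illustration of working directly from a definition, the Poisson PMF is P(K=k) = lambda^k * exp(-lambda) / k!, so its logarithm is k*log(lambda) - lambda - log(k!). The sketch below uses only the standard library; the function name ''poisson_log_pmf'' and the truncation at k=60 are hypothetical choices, not part of the lab.

```python
import math

def poisson_log_pmf(k, lam):
    """log P(K = k) for a Poisson with rate lam: k*log(lam) - lam - log(k!)."""
    if k < 0:
        return float("-inf")  # negative counts have zero probability
    # math.lgamma(k + 1) equals log(k!) without overflowing for large k.
    return k * math.log(lam) - lam - math.lgamma(k + 1)

lam = 7.0
# Sanity checks: the probabilities should sum to 1 and have mean lam.
probs = [math.exp(poisson_log_pmf(k, lam)) for k in range(60)]
total = sum(probs)
mean = sum(k * pk for k, pk in enumerate(probs))
```

Checking that ''total'' is close to 1 and ''mean'' is close to ''lam'' is a cheap way to catch a sign error or a missing normalising term in any hand-written log-likelihood.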
- | |||
- | ====Hints:==== | ||
- | |||
- | The following functions may be useful to you: | ||
- | |||
- | <code python> | ||
- | |||
- | numpy.random | ||
- | |||
- | matplotlib.pyplot.contour | ||
- | |||
- | seaborn.kdeplot | ||
- | |||
- | seaborn.jointplot | ||
- | |||
- | hist( data, bins=50, density=True ) | ||
- | |||
- | numpy.linspace | ||
- | |||
- | legend | ||
- | |||
- | title | ||
- | |||
- | </code> | ||