This shows you the differences between two versions of the page.

cs401r_w2016:lab9 [2018/03/16 20:24] wingated
cs401r_w2016:lab9 [2021/06/30 23:42] (current)

Line 29:
  {{ :cs401r_w2016:lab8_pdtm.png?direct&500 |}}
+
+ {{ :cs401r_w2016:gibbs_sampler_results.png?direct&500|}}
  Here, you can see how documents that are strongly correlated with Topic #3 appear every six months; these are the sustainings of church officers and statistical reports.
  Your notebook must also produce a plot of the log posterior of the data over time, as your sampler progresses. You should produce a single plot comparing the regular Gibbs sampler and the collapsed Gibbs sampler.
+
+ To the right is an example of my log pdfs.
----
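The single comparison plot described above could be produced with a minimal matplotlib sketch like the following. Everything here is a placeholder: the trace values, the variable names (`gibbs_lps`, `collapsed_lps`), and the output file name would come from your own sampler runs.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Placeholder traces: one log-posterior value per sampler iteration.
gibbs_lps = [-5000.0, -4200.0, -3950.0, -3900.0, -3880.0]
collapsed_lps = [-5000.0, -4000.0, -3750.0, -3680.0, -3660.0]

plt.plot(gibbs_lps, label="Gibbs sampler")
plt.plot(collapsed_lps, label="Collapsed Gibbs sampler")
plt.xlabel("Iteration")
plt.ylabel("Log posterior")
plt.legend()
plt.savefig("log_posterior_comparison.png")
```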
Line 41 / Line 45:
  * 40% Correct implementation of Gibbs sampler
  * 40% Correct implementation of collapsed Gibbs sampler
- * 20% Final plots are tidy and legible
+ * 20% Final plots are tidy and legible (at least 2 plots: posterior over time for both samplers, and heat-map of distribution of topics over documents)
----
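The topics-over-documents heat-map mentioned in the grading criteria can be rendered with `plt.imshow`. This is only a sketch: the sizes `K` and `D` are made up, and random Dirichlet draws stand in for the per-document topic mixtures your sampler would actually produce.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

K, D = 10, 50  # assumed: K topics, D documents
rng = np.random.default_rng(0)
# Stand-in for the sampler's per-document topic mixtures; shape (K, D),
# each column sums to 1.
pis = rng.dirichlet(np.ones(K), size=D).T

plt.imshow(pis, aspect="auto", interpolation="nearest")
plt.xlabel("Document")
plt.ylabel("Topic")
plt.colorbar(label="Topic proportion")
plt.savefig("topic_heatmap.png")
```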
Line 120 / Line 124:
  # topic distributions
- topics = np.zeros((V,K))
+ bs = np.zeros((V,K)) + (1/V)
  # how should this be initialized?
  # per-document-topic distributions
- pdtm = np.zeros((K,D))
+ pis = np.zeros((K,D)) + (1/K)
  # how should this be initialized?
  for iters in range(0,100):
-     p = compute_data_likelihood( docs_i, qs, topics, pdtm )
+     p = compute_data_likelihood( docs_i, qs, bs, pis )
-     print "Iter %d, p=%.2f" % (iters,p)
+     print("Iter %d, p=%.2f" % (iters,p))
      # resample per-word topic assignments qs
-     # resample per-document topic mixtures pdtm
+     # resample per-document topic mixtures pis
      # resample topics
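The skeleton above leaves the initialization and resampling steps open. One common answer to "how should this be initialized?" (a sketch only, under assumed symmetric Dirichlet priors; the sizes, hyperparameters `alpha` and `gamma`, and the toy corpus `docs` are all made up here) is to draw `bs` and `pis` from their priors and then resample the per-word assignments `qs` and the mixtures `pis` from their conditionals:

```python
import numpy as np

rng = np.random.default_rng(0)
V, K, D = 100, 5, 20   # assumed vocabulary / topic / document counts
N = 50                 # assumed words per document
docs = rng.integers(0, V, size=(D, N))  # toy corpus: word ids per document

alpha, gamma = 1.0, 1.0  # assumed symmetric Dirichlet hyperparameters

# Initialize by drawing each distribution from its prior
bs = rng.dirichlet(gamma * np.ones(V), size=K).T   # (V, K) topic-word distributions
pis = rng.dirichlet(alpha * np.ones(K), size=D).T  # (K, D) per-document topic mixtures
qs = rng.integers(0, K, size=(D, N))               # random per-word topic assignments

# One Gibbs sweep over the per-word assignments:
# p(q_{d,n} = k | rest) is proportional to pis[k, d] * bs[w, k]
for d in range(D):
    for n in range(N):
        w = docs[d, n]
        probs = pis[:, d] * bs[w, :]
        probs /= probs.sum()
        qs[d, n] = rng.choice(K, p=probs)

# Resample each document's topic mixture from its Dirichlet conditional
for d in range(D):
    counts = np.bincount(qs[d], minlength=K)
    pis[:, d] = rng.dirichlet(alpha + counts)
```

Resampling the topic-word distributions `bs` follows the same pattern: count, per topic, how often each word is assigned to it, and draw from `dirichlet(gamma + counts)` column by column.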