cs401r_w2016:lab9 — last modified 2018/03/21 16:51 by sadler
For this lab, you will code two different inference algorithms on the Latent Dirichlet Allocation (LDA) model.
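As a reminder of what the sampler is inverting, here is a minimal sketch of the LDA generative process, written under the skeleton's naming (''bs'': topic-word distributions, ''pis'': per-document topic mixtures). All sizes and the symmetric Dirichlet priors are illustrative, not part of the lab spec:

```python
import numpy as np

rng = np.random.default_rng(0)
V, K, D = 6, 2, 3  # vocab size, number of topics, number of documents (illustrative)

# Topics: each column of bs is a distribution over the V words.
bs = rng.dirichlet(np.ones(V), size=K).T    # shape (V, K)
# Mixtures: each column of pis is a distribution over the K topics.
pis = rng.dirichlet(np.ones(K), size=D).T   # shape (K, D)

docs = []
for d in range(D):
    words = []
    for _ in range(5):                          # 5 words per document (illustrative)
        k = rng.choice(K, p=pis[:, d])          # draw a per-word topic assignment q_dn
        words.append(rng.choice(V, p=bs[:, k])) # draw the word from that topic
    docs.append(words)
```

The inference problem is to recover ''bs'', ''pis'', and the per-word assignments from ''docs'' alone.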
You will use [[https://www.dropbox.com/s/yr3n9w61ifon04h/files.tar.gz?dl=0|a dataset of general conference talks]]. Download and untar these files; there is helper code in the ''Hints'' section to help you process them.
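If you want a quick way to get the untarred talks into memory, one possible loader is sketched below. It assumes the archive expands into a directory of plain-text files; the function name and the whitespace tokenization are illustrative, and the helper code in the ''Hints'' section may do this differently:

```python
import glob
import os
from collections import Counter

def load_corpus(dirname):
    """Read every .txt file under dirname into per-document word counts.

    Returns (docs, vocab): docs is a list of Counters, one per talk;
    vocab is the sorted list of all words seen. 'dirname' should be
    wherever you untarred files.tar.gz.
    """
    docs = []
    for fname in sorted(glob.glob(os.path.join(dirname, "*.txt"))):
        with open(fname, encoding="utf-8", errors="ignore") as f:
            words = f.read().lower().split()
        docs.append(Counter(words))
    vocab = sorted(set().union(*docs)) if docs else []
    return docs, vocab
```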
**Part 1: Basic Gibbs Sampler**
# topic distributions
bs = np.zeros((V,K))
# how should this be initialized?

# per-document-topic distributions
pis = np.zeros((K,D))
# how should this be initialized?

for iters in range(0,100):
    p = compute_data_likelihood( docs_i, qs, bs, pis )
    print("Iter %d, p=%.2f" % (iters,p))
    # resample per-word topic assignments qs
    # resample per-document topic mixtures pis
    # resample topics bs
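The resampling comments in the skeleton hide most of the work. As a hedged sketch of the first step only: conditioned on everything else, a word's topic assignment is drawn from a distribution proportional to the topic's probability of that word times the document's topic mixture. The names ''bs'', ''pis'', ''qs'' follow the skeleton; ''resample_assignments'' itself is illustrative, not the required interface:

```python
import numpy as np

def resample_assignments(docs, bs, pis, rng):
    """Resample each word's topic assignment from its full conditional:
    p(q_dn = k | w_dn, bs, pis)  proportional to  bs[w_dn, k] * pis[k, d].

    docs: list of integer word-id arrays, one per document.
    bs:   (V, K) topic-word distributions; pis: (K, D) topic mixtures.
    """
    qs = []
    for d, doc in enumerate(docs):
        q_d = np.empty(len(doc), dtype=int)
        for n, w in enumerate(doc):
            probs = bs[w, :] * pis[:, d]   # unnormalized conditional over K topics
            probs /= probs.sum()
            q_d[n] = rng.choice(len(probs), p=probs)
        qs.append(q_d)
    return qs
```

The per-document mixtures and the topics themselves are resampled analogously from their own full conditionals (Dirichlet posteriors, given the counts implied by ''qs'').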