This shows you the differences between two versions of the page.
| | cs401r_w2016:lab1 [2015/12/23 20:47] admin | | cs401r_w2016:lab1 [2021/06/30 23:42] (current) |
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | ====Objective:==== | ||
| - | ===Objective:=== | + | Get started with anaconda, python, ipython notebooks, and pandas. Begin producing simple visualizations of data and images. |
| - | Get started with python, ipython notebooks and anaconda. | + | ---- |
| + | ====Deliverable:==== | ||
| - | ===Deliverable:=== | + | For this lab, you will submit an ipython notebook. This notebook will have two parts: |
| - | An ipython notebook that generates a random image. We will run this | + | **Part 1:** Your notebook should generate a random image. We will run this |
| notebook 5 times; it should generate 5 different, moderately complex | notebook 5 times; it should generate 5 different, moderately complex | ||
| images. Each image should be 512 x 288. Have fun with it! | images. Each image should be 512 x 288. Have fun with it! | ||
| Line 14: | Line 16: | ||
| {{:cs401r_w2016:lab1.png?nolink|}} | {{:cs401r_w2016:lab1.png?nolink|}} | ||
| - | ===Description:=== | + | **Part 2:** Your notebook should use the pandas library to read in the Rossmann store sales data (a CSV dataset) and plot the sales of store #1. Your plot should look something like this: |
| + | |||
| + | {{:cs401r_w2016:lab1_storesales.png?direct&700|}} | ||
| + | |||
| + | Done correctly, this should only take a few lines of code. | ||
| + | |||
| + | ---- | ||
| + | ====Grading standards:==== | ||
| + | |||
| + | Your notebook will be graded on the following: | ||
| + | |||
| + | * 20% Successfully turned in a notebook with working code | ||
| + | * 20% Random image with 50 random elements | ||
| + | * 20% Correctly used pandas to load store sales data | ||
| + | * 30% Some sort of plot of sales data (only for store #1!) | ||
| + | * 10% Tidy and legible figures, including labeled axes where appropriate | ||
| + | |||
| + | ---- | ||
| + | ====Description:==== | ||
| Throughout this class, we will be using a combination of ipython | Throughout this class, we will be using a combination of ipython | ||
| notebooks and the anaconda python distribution. For this lab, you | notebooks and the anaconda python distribution. For this lab, you | ||
| must install anaconda, and write a simple python program (using | must install anaconda, and write a simple python program (using | ||
| - | ipython notebooks) and use it to generate simple random images. | + | ipython notebooks). As described above, the notebook should do two things: |
| + | 1) generate simple random images, and 2) plot some data using pandas. | ||
| - | You can generate any sort of random image that you want -- consider | + | For part 1, you can generate any sort of random image that you want -- consider |
| random lines, random curves, random text, etc. Each time the program | random lines, random curves, random text, etc. Each time the program | ||
| is run, it should generate a different random image. Your image | is run, it should generate a different random image. Your image | ||
| Line 32: | Line 53: | ||
| [[http://cairographics.org/|cairo]] package as part of your image generator. | [[http://cairographics.org/|cairo]] package as part of your image generator. | ||
| - | ===Installing anaconda:=== | + | For part 2, the data you should use is downloadable here: |
| + | |||
| + | [[http://liftothers.org/courses/stat_ml/store_train.csv|Rossmann store sales data]] | ||
| + | |||
| + | ---- | ||
| + | ====Installing anaconda:==== | ||
| http://docs.continuum.io/anaconda/install | http://docs.continuum.io/anaconda/install | ||
| Line 52: | Line 78: | ||
| notebook. | notebook. | ||
| - | Here's some starter code to help you generate an image: | + | **Note:** When you turn in your notebook, you should turn in the ''.ipynb'' file. Do not submit a screenshot or an HTML page. |
| + | |||
| + | Here's some starter code to help you generate an image. The ''nbimage'' function will display the image inline in the notebook: | ||
| <code python> | <code python> | ||
| Line 88: | Line 116: | ||
| nbimage( data ) | nbimage( data ) | ||
| </code> | </code> | ||
| + | |||
| + | ---- | ||
| + | ====Using Pandas:==== | ||
| + | |||
| + | For the second part of this lab, you will need a basic understanding of the ''pandas'' python package. In particular, you only need to know how to select some data from a CSV file. | ||
| + | |||
| + | You should read through this tutorial and play with it. | ||
| + | |||
| + | [[http://synesthesiam.com/posts/an-introduction-to-pandas.html|Tutorial on using Pandas]] | ||
| + | |||
| + | For this lab, you need to select the data for store #1 and plot it. | ||
| + | |||
| + | An important part of generating visualizations is conveying information cleanly and accurately. You should therefore label all axes, and in particular, the x-axis should be labeled using dates (see the example image). This involves a bit of python trickery, but check out some helpful functions in the hints below. | ||
| + | |||
| + | ---- | ||
| + | ====Hints:==== | ||
| + | |||
| + | When using an ipython notebook, it's nice to make your plots show up inline. To do this, add the following lines to the first cell of your notebook: | ||
| + | |||
| + | <code python> | ||
| + | |||
| + | # this tells seaborn and matplotlib to generate plots inline in the notebook | ||
| + | %matplotlib inline | ||
| + | |||
| + | # these two lines allow you to control the figure size | ||
| + | %pylab inline | ||
| + | pylab.rcParams['figure.figsize'] = (16.0, 8.0) | ||
| + | |||
| + | </code> | ||
| + | |||
| + | The following python functions might be helpful: | ||
| + | |||
| + | <code python> | ||
| + | |||
| + | import matplotlib.pyplot as plt | ||
| + | plt.plot_date | ||
| + | |||
| + | pandas.to_datetime | ||
| + | |||
| + | plt.legend | ||
| + | plt.xlabel | ||
| + | plt.ylabel | ||
| + | |||
| + | plt.tight_layout | ||
| + | |||
| + | </code> | ||
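As a rough illustration of how the hinted functions fit together for Part 2, here is a minimal sketch. It runs on a tiny inline stand-in for the CSV so it is self-contained. The column names ''Store'', ''Date'' and ''Sales'' are assumptions about the real file and should be checked against its header row. It uses ''plot'' rather than ''plt.plot_date'', since ''plot'' accepts datetime values directly and recent Matplotlib releases mark ''plot_date'' as deprecated.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; in a notebook use %matplotlib inline instead
import matplotlib.pyplot as plt
import pandas as pd

# Tiny stand-in for store_train.csv. The column names are an assumption.
csv_text = """Store,Date,Sales
1,2015-07-31,5263
2,2015-07-31,6064
1,2015-07-30,5020
2,2015-07-30,5567
1,2015-07-29,4782
"""

# For the real data, pass the path of the downloaded file to pd.read_csv instead.
df = pd.read_csv(io.StringIO(csv_text))
df["Date"] = pd.to_datetime(df["Date"])  # parse date strings into timestamps

# Keep only the rows for store #1, in chronological order.
store1 = df[df["Store"] == 1].sort_values("Date")

fig, ax = plt.subplots(figsize=(16, 8))
ax.plot(store1["Date"], store1["Sales"], marker=".", label="Store 1")
ax.set_xlabel("Date")
ax.set_ylabel("Sales")
ax.legend()
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
fig.tight_layout()
```

The boolean mask ''df["Store"] == 1'' is what restricts the plot to a single store. Everything after that is ordinary matplotlib labeling.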
| + | |||
| + | |||
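For Part 1, one possible approach (a sketch, not the course's starter code) is to fill a numpy array with 50 randomly placed, randomly colored rectangles. The array layout used here (rows x columns x RGB, floats in [0, 1]) is an assumption. The starter code's ''nbimage'' function may expect a different layout, so adapt it as needed, or display the array with ''plt.imshow(data)''.

```python
import numpy as np

WIDTH, HEIGHT = 512, 288  # image size required by the lab
N_ELEMENTS = 50           # number of random elements to draw

rng = np.random.default_rng()  # unseeded, so each run produces a different image

# Image array: rows x columns x RGB channels, values in [0, 1].
data = np.zeros((HEIGHT, WIDTH, 3), dtype=np.float32)

for _ in range(N_ELEMENTS):
    # Random top-left corner and random size for one rectangle.
    x0 = rng.integers(0, WIDTH)
    y0 = rng.integers(0, HEIGHT)
    w = rng.integers(8, 128)
    h = rng.integers(8, 128)
    color = rng.random(3)  # random RGB color
    # Slices that run past the image border are clipped automatically.
    data[y0:y0 + h, x0:x0 + w] = color
```

Rectangles are only the simplest case. The same loop could draw lines, curves, or text to make the images more interesting.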