cs401r_w2016:lab1

Differences

This shows you the differences between two versions of the page.

cs401r_w2016:lab1 [2016/01/02 23:54]
admin
cs401r_w2016:lab1 [2021/06/30 23:42] (current)
Line 16: Line 16:
 {{:cs401r_w2016:lab1.png?nolink|}}
  
-**Part 2:** Your notebook should use the pandas library to read in some CSV data and plot some of it.
-Your plot should look something like this:
+**Part 2:** Your notebook should use the pandas library to read in the Rossmann store sales data (a CSV dataset) and plot the sales of store #1. Your plot should look something like this:
  
-{{:cs401r_w2016:lab1_storesales.png?direct&300|}}
+{{:cs401r_w2016:lab1_storesales.png?direct&700|}}
 + 
 +Done correctly, this should only take a few lines of code. 
 + 
 +---- 
 +====Grading standards:​==== 
 + 
 +Your notebook will be graded on the following:​ 
 + 
 +  * 20% Successfully turned in a notebook with working code 
 +  * 20% Random image with 50 random elements 
 +  * 20% Correctly used pandas to load store sales data 
 +  * 30% Some sort of plot of sales data (only for store #1!) 
 +  * 10% Tidy and legible figures, including labeled axes where appropriate
  
 ----
Line 27: Line 39:
 notebooks and the anaconda python distribution.  For this lab, you
 must install anaconda, and write a simple python program (using
-ipython notebooks) and use it to generate simple random images.
+ipython notebooks).  As described above, the notebook should do two things:
 +1) generate simple random images, and 2) plot some data using pandas.
  
-You can generate any sort of random image that you want -- consider
+For part 1, you can generate any sort of random image that you want -- consider
 random lines, random curves, random text, etc.  Each time the program
 is run, it should generate a different random image.  Your image
Line 39: Line 52:
 In preparation for future labs, we strongly encourage you to use the
 [[http://cairographics.org/|cairo]] package as part of your image generator.
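
If you want to see the overall shape of a part 1 generator before learning cairo, here is a minimal sketch that uses plain NumPy instead of cairo (a substitution, not the recommended package). It draws 50 filled rectangles with random positions and colors. The function name ''random_image'', its parameters, and the Python 3 / recent NumPy random API are assumptions made for illustration; whether the starter ''nbimage'' function accepts this array layout is also an assumption.

```python
import numpy as np


def random_image(height=256, width=256, n_elements=50, rng=None):
    """Return an RGB uint8 array containing n_elements random filled rectangles."""
    # With no generator supplied, a fresh unseeded one gives a new image each run.
    rng = np.random.default_rng() if rng is None else rng

    # Start from a white canvas.
    image = np.full((height, width, 3), 255, dtype=np.uint8)

    for _ in range(n_elements):
        # Pick two random rows and two random columns, sorted so that
        # top <= bottom and left <= right.
        top, bottom = np.sort(rng.integers(0, height, size=2))
        left, right = np.sort(rng.integers(0, width, size=2))
        color = rng.integers(0, 256, size=3, dtype=np.uint8)
        image[top:bottom + 1, left:right + 1] = color

    return image


data = random_image()
```

Passing a seeded generator, for example ''random_image(rng=np.random.default_rng(0))'', makes the output repeatable while you debug; leave it out for the version you turn in so that every run differs.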
 +
 +For part 2, download the data here:
 +
 +[[http://liftothers.org/courses/stat_ml/store_train.csv|Rossmann store sales data]]
  
 ----
Line 60: Line 77:
 should see "Notebook" and "Python 2".  This will create a new
 notebook.
 +
 +**Note:** When you turn in your notebook, turn in the ''.ipynb'' file itself.  Do not turn in a screenshot or an HTML export.
  
 Here's some starter code to help you generate an image.  The ''nbimage'' function will display the image inline in the notebook:
Line 97: Line 116:
 nbimage( data )
 </code>
 +
 +----
 +====Using Pandas:====
 +
 +For the second part of this lab, you will need a basic understanding of the ''pandas'' python package.  For this lab, you only need to know how to select some data from a CSV file.
 +
 +You should read through this tutorial and play with it.
 +
 +[[http://​synesthesiam.com/​posts/​an-introduction-to-pandas.html|Tutorial on using Pandas]]
 +
 +For this lab, you need to select the data for store #1 and plot it.
 +
 +An important part of generating visualizations is conveying information cleanly and accurately.  You should therefore label all axes, and in particular, the x-axis should be labeled using dates (see the example image).  This involves a bit of python trickery, but check out some helpful functions in the hints below.
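
To make the "select the data for store #1" step concrete, here is a minimal sketch of the selection. It assumes the CSV has columns named ''Store'', ''Date'', and ''Sales''; check the header of the real file, since those names are assumptions. So that the snippet runs without downloading anything, it reads a tiny in-memory stand-in whose values are made up.

```python
import io

import pandas as pd

# Tiny stand-in for store_train.csv. The column names are assumed and the
# values are made up for illustration.
sample_csv = io.StringIO(
    "Store,Date,Sales\n"
    "1,2015-07-31,100\n"
    "2,2015-07-31,200\n"
    "1,2015-07-30,110\n"
    "2,2015-07-30,210\n"
)

# For the lab you would pass the path of the downloaded file instead.
sales = pd.read_csv(sample_csv)

# Boolean indexing keeps only the rows belonging to store #1.
store1 = sales[sales["Store"] == 1].copy()

# Convert the date strings to datetime values, then sort oldest first
# so that a line plot runs left to right in time.
store1["Date"] = pd.to_datetime(store1["Date"])
store1 = store1.sort_values("Date")
```

The ''.copy()'' call is there so that assigning the converted ''Date'' column modifies an independent frame rather than a view of ''sales''.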
 +
 +----
 +====Hints:​====
 +
 +When using an ipython notebook, it's nice to make your plots show up inline. ​ To do this, add the following lines to the first cell of your notebook:
 +
 +<code python>
 +
 +# this tells seaborn and matplotlib to generate plots inline in the notebook
 +%matplotlib inline  ​
 +
 +# these two lines allow you to control the figure size
 +%pylab inline
 +pylab.rcParams['figure.figsize'] = (16.0, 8.0)
 +
 +</​code>​
 +
 +The following python functions might be helpful:
 +
 +<code python>
 +
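 +import matplotlib.pyplot as plt
 +import pandas
 +
 +plt.plot_date        # plot values against dates (deprecated since Matplotlib 3.9; plt.plot also accepts dates)
 +
 +pandas.to_datetime   # convert date strings to datetime values
 +
 +plt.legend           # add a legend
 +plt.xlabel           # label the x-axis
 +plt.ylabel           # label the y-axis
 +
 +plt.tight_layout     # adjust spacing so labels fit inside the figure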
 +
 +</​code>​
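
As a rough illustration of how these functions fit together, the sketch below plots a few made-up sales values against dates and labels both axes. The data is hypothetical and stands in for the store #1 rows. It uses ''plt.plot'' rather than ''plt.plot_date'', which works because Matplotlib gives datetime x-values a date axis automatically; the ''autofmt_xdate'' call is an extra not listed in the hints.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values standing in for the store #1 rows of the sales data.
dates = pd.to_datetime(["2015-07-28", "2015-07-29", "2015-07-30", "2015-07-31"])
sales = [120, 90, 110, 100]

plt.figure()
plt.plot(dates, sales, label="Store 1")  # datetime x-values produce a date axis
plt.xlabel("Date")
plt.ylabel("Sales")
plt.legend()

# Rotate the date tick labels so they do not overlap, then tidy the spacing.
plt.gcf().autofmt_xdate()
plt.tight_layout()
```

In a notebook with inline plotting enabled, the figure appears when the cell finishes; in a plain script you would add ''plt.show()'' or ''plt.savefig(...)'' at the end.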
 +
 +
cs401r_w2016/lab1.1451778892.txt.gz · Last modified: 2021/06/30 23:40 (external edit)