cs401r_w2016:lab1

Differences

This shows you the differences between two versions of the page.

cs401r_w2016:lab1 [2016/01/02 23:55]
admin
cs401r_w2016:lab1 [2017/01/11 22:32]
wingated
Line 16: Line 16:
 {{:cs401r_w2016:lab1.png?nolink|}}
  
-**Part 2:** Your notebook should use the pandas library to read in some CSV data and plot some of it.
-Your plot should look something like this:
 +**Part 2:** Your notebook should use the pandas library to read in the Rossmann store sales data (a CSV dataset) and plot the sales of store #1.  Your plot should look something like this:
  
-{{:cs401r_w2016:lab1_storesales.png?direct&500|}}
 +{{:cs401r_w2016:lab1_storesales.png?direct&700|}}
 + 
 +Done correctly, this should only take a few lines of code. 
 + 
 +---- 
 +====Grading standards:====
 +
 +Your notebook will be graded on the following:
 + 
 +  * 20% Successfully turned in a notebook with working code 
 +  * 20% Random image with 50 random elements 
 +  * 20% Correctly used pandas to load store sales data 
 +  * 30% Some sort of plot of sales data (only for store #1!) 
 +  * 10% Tidy and legible figures, including labeled axes where appropriate
  
 ----
Line 27: Line 39:
 notebooks and the anaconda python distribution.  For this lab, you
 must install anaconda, and write a simple python program (using
-ipython notebooks) and use it to generate simple random images.
 +ipython notebooks).  As described above, the notebook should do two things:
 +1) generate simple random images, and 2) plot some data using pandas.
 
-You can generate any sort of random image that you want -- consider
 +For part 1, you can generate any sort of random image that you want -- consider
 random lines, random curves, random text, etc.  Each time the program
 is run, it should generate a different random image.  Your image
Line 39: Line 52:
 In preparation for future labs, we strongly encourage you to use the
 [[http://cairographics.org/|cairo]] package as part of your image generator.
 +
 +For part 2, download the data here:
 +
 +[[http://liftothers.org/courses/stat_ml/store_train.csv|Rossmann store sales data]]
  
 ----
Line 60: Line 77:
 should see "Notebook" and "Python 2".  This will create a new
 notebook.
 +
 +**Note:** When you turn in your notebook, turn in the ''.ipynb'' file.  Do not turn in a screenshot or an HTML page.
  
 Here's some starter code to help you generate an image.  The ''nbimage'' function will display the image inline in the notebook:
Line 97: Line 116:
 nbimage( data )
 </code>
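As a rough illustration of part 1 (not the course's reference solution), the sketch below fills a NumPy array with 50 randomly placed, randomly colored rectangles. The function name ''random_rectangles'', the 288 x 512 image size, and the use of NumPy's ''default_rng'' (which assumes a recent NumPy on Python 3 rather than the Python 2 kernel mentioned above) are all assumptions made for this example.

```python
import numpy as np


def random_rectangles(height=288, width=512, count=50, rng=None):
    """Return a (height, width, 3) uint8 image holding `count` random rectangles.

    The name, default size, and element type are illustrative assumptions.
    """
    # An unseeded generator gives a different image on every run.
    rng = np.random.default_rng() if rng is None else rng
    img = np.zeros((height, width, 3), dtype=np.uint8)
    for _ in range(count):
        # Pick a top-left corner, then a bottom-right corner beyond it.
        y0 = rng.integers(0, height - 1)
        x0 = rng.integers(0, width - 1)
        y1 = rng.integers(y0 + 1, height + 1)
        x1 = rng.integers(x0 + 1, width + 1)
        # Paint the rectangle with one random RGB color.
        img[y0:y1, x0:x1] = rng.integers(0, 256, size=3, dtype=np.uint8)
    return img


image = random_rectangles()
print(image.shape, image.dtype)
```

The returned array is height x width x 3 with ''uint8'' values; whether it can be passed directly to ''nbimage'' depends on the format the starter code expects.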
 +
 +----
 +====Using Pandas:====
 +
 +For the second part of this lab, you will need a basic understanding of the ''pandas'' python package.  You only need to know how to select some data from a CSV file.
 +
 +You should read through this tutorial and play with it.
 +
 +[[http://synesthesiam.com/posts/an-introduction-to-pandas.html|Tutorial on using Pandas]]
 +
 +For this lab, you need to select the data for store #1 and plot it.
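Selecting one store's rows might look like the sketch below. It reads a tiny in-memory stand-in instead of the real file so that it runs anywhere; the column names ''Store'', ''Date'' and ''Sales'', the helper name ''load_store_sales'', and the sample values are assumptions for illustration and should be checked against the header of the downloaded CSV.

```python
import io

import pandas as pd

# Made-up stand-in for the downloaded CSV; column names are an assumption.
SAMPLE_CSV = """Store,Date,Sales
1,2015-01-03,5200
2,2015-01-03,6100
1,2015-01-02,4800
2,2015-01-02,5900
1,2015-01-01,5000
"""


def load_store_sales(csv_source, store_id=1):
    """Read the CSV and return one store's rows, sorted by date."""
    df = pd.read_csv(csv_source)
    # Boolean mask: keep only the rows whose Store column matches.
    store = df[df["Store"] == store_id].copy()
    # Parse the date strings so they sort and plot as real dates.
    store["Date"] = pd.to_datetime(store["Date"])
    return store.sort_values("Date").reset_index(drop=True)


store1 = load_store_sales(io.StringIO(SAMPLE_CSV))
print(store1)
```

''pd.read_csv'' also accepts a file path, so the same helper could be pointed at the downloaded file in place of the ''StringIO'' object.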
 +
 +An important part of generating visualizations is conveying information cleanly and accurately.  You should therefore label all axes; in particular, the x-axis should be labeled using dates (see the example image).  This involves a bit of python trickery, but check out some helpful functions in the hints below.
 +
 +----
 +====Hints:====
 +
 +The following python functions might be helpful:
 +
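 +<code python>
 +
 +import matplotlib.pyplot as plt
 +import pandas
 +
 +plt.plot_date        # plot values against dates
 +pandas.to_datetime   # convert date strings to datetime values
 +
 +plt.legend           # add a legend
 +plt.xlabel           # label the x-axis
 +plt.ylabel           # label the y-axis
 +
 +plt.tight_layout     # adjust spacing so labels are not clipped
 +
 +</code>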
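One possible way to combine these into a labeled, date-indexed plot is sketched below. It is an illustration under assumptions rather than the expected solution: the helper name ''plot_sales'' and the sample values are made up, and it uses ''Axes.plot'' with datetime values (which Matplotlib places on a date axis) in place of ''plt.plot_date''.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_sales(dates, sales, label="Store 1"):
    """Plot sales against dates with labeled axes and a legend."""
    fig, ax = plt.subplots(figsize=(8, 4))
    # Datetime x-values make Matplotlib use date ticks automatically.
    ax.plot(dates, sales, label=label)
    ax.set_xlabel("Date")
    ax.set_ylabel("Sales")
    ax.legend()
    # Slant the date tick labels so they do not overlap.
    fig.autofmt_xdate()
    fig.tight_layout()
    return fig, ax


# Made-up values, only to show the call.
dates = pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"])
sales = [5000, 4800, 5200]
fig, ax = plot_sales(dates, sales)
```

In a notebook the figure is typically shown inline once the cell finishes; outside a notebook, ''plt.show()'' or ''fig.savefig(...)'' would be needed.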
 +
 +
cs401r_w2016/lab1.txt · Last modified: 2021/06/30 23:42 (external edit)