cs401r_w2016:lab1

Differences

This shows you the differences between two versions of the page.

cs401r_w2016:lab1 [2015/12/23 20:47]
admin
cs401r_w2016:lab1 [2021/06/30 23:42] (current)
Line 1: Line 1:
 +====Objective:​====
  
-===Objective:===
 +Get started with anaconda, python, ipython notebooks, and pandas.  Begin producing simple visualizations of data and images.
  
-Get started with python, ipython notebooks and anaconda.
 +----
 +====Deliverable:​====
  
-===Deliverable:===
 +For this lab, you will submit an ipython notebook.  This notebook will have two parts:
  
-An ipython notebook that generates a random image.  We will run this
 +**Part 1:**  Your notebook should generate a random image.  We will run this
 notebook 5 times; it should generate 5 different, moderately complex
 images.  Each image should be 512 x 288.  Have fun with it!
Line 14: Line 16:
 {{:cs401r_w2016:lab1.png?nolink|}}
  
-===Description:===
 +**Part 2:** Your notebook should use the pandas library to read in the Rossmann store sales data (a CSV dataset) and plot the sales of store #1.  Your plot should look something like this:
 + 
 +{{:​cs401r_w2016:​lab1_storesales.png?​direct&​700|}} 
 + 
 +Done correctly, this should only take a few lines of code. 
 + 
 +---- 
 +====Grading standards:​==== 
 + 
 +Your notebook will be graded on the following:​ 
 + 
 +  * 20% Successfully turned in a notebook with working code 
 +  * 20% Random image with 50 random elements 
 +  * 20% Correctly used pandas to load store sales data 
 +  * 30% Some sort of plot of sales data (only for store #1!) 
 +  * 10% Tidy and legible figures, including labeled axes where appropriate 
 + 
 +---- 
 +====Description:​====
  
 Throughout this class, we will be using a combination of ipython
 notebooks and the anaconda python distribution.  For this lab, you
 must install anaconda, and write a simple python program (using
-ipython notebooks) and use it to generate simple random images.
 +ipython notebooks).  As described above, the notebook should do two things:
 +1) generate simple random images, and 2) plot some data using pandas.
  
-You can generate any sort of random image that you want -- consider
 +For part 1, you can generate any sort of random image that you want -- consider
 random lines, random curves, random text, etc.  Each time the program
 is run, it should generate a different random image.  Your image
Line 32: Line 53:
 [[http://cairographics.org/|cairo]] package as part of your image generator.
  
-===Installing anaconda:===
 +For part 2, the data you should use is downloadable here:
 + 
 +[[http://liftothers.org/courses/stat_ml/store_train.csv|Rossmann store sales data]]
 + 
 +---- 
 +====Installing anaconda:====
  
 http://docs.continuum.io/anaconda/install
Line 52: Line 78:
 notebook.
  
-Here's some starter code to help you generate an image:
 +**Note:** When you turn in your notebook, you should turn in the ''.ipynb'' file.  Do not take a screenshot or turn in an HTML page.
 + 
 +Here's some starter code to help you generate an image.  The ''​nbimage''​ function will display the image inline in the notebook:
  
 <code python>
Line 88: Line 116:
 nbimage( data )
 </code>
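As a rough illustration of part 1 (this is not the course's starter code, most of which is not shown in this diff), the sketch below fills a 288 x 512 RGB array with 50 randomly placed, randomly colored rectangles using numpy. The function name ''random_rectangles'' and its parameters are hypothetical, and whether ''nbimage'' accepts an array in this layout directly is an assumption.

```python
import numpy as np

WIDTH, HEIGHT = 512, 288  # image size required by the lab


def random_rectangles(n_elements=50, rng=None):
    """Return a HEIGHT x WIDTH x 3 uint8 image with n_elements random rectangles."""
    # An unseeded generator gives a different image on every run
    rng = np.random.default_rng() if rng is None else rng
    image = np.zeros((HEIGHT, WIDTH, 3), dtype=np.uint8)
    for _ in range(n_elements):
        # Pick two corners, then sort so that x0 <= x1 and y0 <= y1
        x0, x1 = np.sort(rng.integers(0, WIDTH, size=2))
        y0, y1 = np.sort(rng.integers(0, HEIGHT, size=2))
        color = rng.integers(0, 256, size=3, dtype=np.uint8)
        # +1 so that every rectangle covers at least one pixel
        image[y0:y1 + 1, x0:x1 + 1] = color
    return image


data = random_rectangles()
```

Rectangles are only one option; lines, curves, or text drawn with cairo would satisfy the same requirement.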
 +
 +----
 +====Using Pandas:====
 +
 +For the second part of this lab, you will need to understand the ''​pandas''​ python package, just a little bit.  For this lab, you only need to know how to select some data from a CSV file.
 +
 +You should read through this tutorial and play with it.
 +
 +[[http://​synesthesiam.com/​posts/​an-introduction-to-pandas.html|Tutorial on using Pandas]]
 +
 +For this lab, you need to select the data for store #1 and plot it.
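A minimal sketch of that selection step, under stated assumptions: the column names ''Store'', ''Date'', and ''Sales'' are guesses about the CSV header, and the three inline rows are made-up stand-ins for the downloaded file so that the example runs on its own.

```python
import io

import pandas as pd

# Made-up stand-in for store_train.csv; the real column names may differ.
toy_csv = io.StringIO(
    "Store,Date,Sales\n"
    "1,2015-07-31,100\n"
    "2,2015-07-31,200\n"
    "1,2015-07-30,150\n"
)

df = pd.read_csv(toy_csv)  # with the real file: pd.read_csv("store_train.csv")

# Boolean mask: keep only the rows belonging to store #1
store1 = df[df["Store"] == 1].copy()

# Turn the date strings into real timestamps and order them chronologically
store1["Date"] = pd.to_datetime(store1["Date"])
store1 = store1.sort_values("Date")
```

The ''.copy()'' call avoids modifying a view of the original frame when the ''Date'' column is overwritten.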
 +
 +An important part of generating visualizations is conveying information cleanly and accurately.  You should therefore label all axes; in particular, the x-axis should be labeled using dates (see the example image).  This involves a bit of python trickery, but check out some helpful functions in the hints below.
 +
 +----
 +====Hints:​====
 +
 +When using an ipython notebook, it's nice to make your plots show up inline. ​ To do this, add the following lines to the first cell of your notebook:
 +
 +<code python>
 +
 +# this tells seaborn and matplotlib to generate plots inline in the notebook
 +%matplotlib inline  ​
 +
 +# these two lines allow you to control the figure size
 +%pylab inline
 +pylab.rcParams['figure.figsize'] = (16.0, 8.0)
 +
 +</​code>​
 +
 +The following python functions might be helpful:
 +
 +<code python>
 +
 +import matplotlib.pyplot as plt
 +plt.plot_date
 +
 +pandas.to_datetime
 +
 +plt.legend
 +plt.xlabel
 +plt.ylabel
 +
 +plt.tight_layout
 +
 +</​code>​
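One way these functions might fit together is sketched below. The dates and sales values are made up for illustration, and the sketch uses the ordinary ''plot'' call with datetime x-values rather than ''plot_date'', which is an assumption about what works best in a recent matplotlib; either should produce a date-labeled axis.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values standing in for the store #1 rows of the real data
dates = pd.to_datetime(["2015-07-29", "2015-07-30", "2015-07-31"])
sales = [120, 150, 100]

fig, ax = plt.subplots()
ax.plot(dates, sales, label="Store 1")  # datetime x-values give a date axis
ax.set_xlabel("Date")
ax.set_ylabel("Sales")
ax.legend()
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
fig.tight_layout()
```

In a notebook with inline plotting enabled, the figure should appear below the cell without an explicit ''plt.show()''.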
 +
 +
cs401r_w2016/lab1.1450903665.txt.gz · Last modified: 2021/06/30 23:40 (external edit)