cs501r_f2018:lab4

Differences

This shows you the differences between two versions of the page.

cs501r_f2018:lab4 [2018/09/24 07:47]
shreeya
cs501r_f2018:lab4 [2021/06/30 23:42] (current)
Line 28: Line 28:
  
   * 40% Proper design, creation and debugging of a dense prediction network
-  * 40% Proper design of a loss function and test set accuracy measure
+  * 40% Proper implementation of a loss function and train/test set accuracy measure
-  * 10% Tidy visualizations of loss and accuracy of your dense predictor training
+  * 10% Tidy visualizations of loss of your dense predictor during training
   * 10% Test image output
  
Line 56: Line 56:
 ----
 ====Description:====
 +
 +For a video including some tips and tricks that can help with this lab: [[https://youtu.be/Ms19kgK_D8w|https://youtu.be/Ms19kgK_D8w]]
  
 For this lab, you will implement a virtual radiologist.  You are given
Line 75: Line 77:
 {{ :cs501r_f2016:screen_shot_2017-10-10_at_10.11.55_am.png?direct&200|}}
  
-Use the "Deep Convolution U-Net" from this paper: [[https://arxiv.org/pdf/1505.04597.pdf|U-Net: Convolutional Networks for Biomedical Image Segmentation]] (See figure 1, replicated at the right).  This should be fairly easy to implement given the
-''conv'' helper functions that you implemented previously; you
-may also need the pytorch function ''torch.cat'' and ''nn.ConvTranspose2d''
+Use the "Deep Convolution U-Net" from this paper: [[https://arxiv.org/pdf/1505.04597.pdf|U-Net: Convolutional Networks for Biomedical Image Segmentation]] (see figure 1, replicated at the right).  You should use existing PyTorch functions (not your own Conv2D module), such as ''nn.Conv2d''; you will also need the PyTorch functions ''torch.cat'' and ''nn.ConvTranspose2d''.
  
 ''torch.cat'' allows you to concatenate tensors. ''nn.ConvTranspose2d'' is roughly the opposite of ''nn.Conv2d'': it is used to bring an image from low resolution to higher resolution. [[https://towardsdatascience.com/up-sampling-with-transposed-convolution-9ae4f2df52d0|This blog]] should help you understand this function in detail.
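
To concatenate a skip connection with ''torch.cat'', the upsampled tensor and the skip tensor must have the same height and width. The sketch below only does the size arithmetic, using the output-size formulas from the PyTorch documentation for ''nn.Conv2d'' and ''nn.ConvTranspose2d''. The helper names and the layer settings (3x3 convolutions with padding 1, 2x2 pooling, a 2x2 transposed convolution with stride 2) are assumptions chosen for illustration, not requirements of the lab.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output size along one dimension for nn.Conv2d (also nn.MaxPool2d)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose2d_out(size, kernel_size, stride=1, padding=0,
                         output_padding=0, dilation=1):
    """Output size along one dimension for nn.ConvTranspose2d."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# One down/up step of a U-Net on a 512-pixel-wide input (assumed settings).
skip = conv2d_out(512, kernel_size=3, padding=1)      # padded 3x3 conv keeps the size
pooled = conv2d_out(skip, kernel_size=2, stride=2)    # 2x2 pooling halves it
upsampled = conv_transpose2d_out(pooled, kernel_size=2, stride=2)  # doubles it back

print(skip, pooled, upsampled)
```

With these settings the upsampled size equals the skip size, so the two tensors can be concatenated along the channel dimension without cropping. With unpadded convolutions, as in the original paper, each 3x3 convolution shrinks the map by 2 pixels and the skip tensor would need cropping first.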
Line 90: Line 90:
 can be viewed as a two-class classification problem.
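
One common loss for a per-pixel two-class problem is softmax cross-entropy averaged over all pixels. In PyTorch, ''nn.CrossEntropyLoss'' accepts scores of shape (N, C, H, W) and integer labels of shape (N, H, W). The sketch below spells out the same arithmetic in plain Python on a tiny made-up example so the computation is visible; the function names and the data are hypothetical, and this is not necessarily the loss the lab expects.

```python
import math


def pixel_cross_entropy(scores, target):
    """Softmax cross-entropy for one pixel.

    scores: raw network outputs, one per class, e.g. [not_cancer, cancer]
    target: index of the correct class (0 or 1)
    """
    peak = max(scores)  # subtract the max before exponentiating, for stability
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum_exp - scores[target]


def mean_loss(score_map, label_map):
    """Average the per-pixel loss over an image stored as nested lists."""
    losses = [pixel_cross_entropy(scores, target)
              for score_row, label_row in zip(score_map, label_map)
              for scores, target in zip(score_row, label_row)]
    return sum(losses) / len(losses)


# A made-up 1x2 "image": two pixels, each with two class scores.
score_map = [[[2.0, 0.0], [0.0, 0.0]]]
label_map = [[0, 1]]
print(mean_loss(score_map, label_map))
```

The first pixel is confidently correct and contributes a small loss; the second has equal scores for both classes and contributes log(2).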
  
 +**Part 2: Plot performance over time**
 +
 +Please generate a plot that shows loss on the training set as a function of training time.  Make sure your axes are labeled!
 +
 +**Part 3: Generate a prediction on the ''pos_test_000072.png'' image**
 +
 +Calculate the output of your trained network on the ''pos_test_000072.png'' image, then make a hard decision (cancerous/not-cancerous) for each pixel.  The resulting image should be black-and-white, where white pixels represent things you think are probably cancerous.
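
A hard decision means picking, for every pixel, the class with the larger score. The sketch below shows the idea on nested lists. It assumes each pixel carries two scores ordered [not_cancer, cancer], which is an assumption about the output layout, not something the lab prescribes. On a PyTorch tensor of shape (2, H, W), ''output.argmax(dim=0)'' would give the same 0/1 map.

```python
def hard_decision(score_map):
    """Turn per-pixel class scores into a black-and-white image.

    score_map[row][col] is assumed to hold [not_cancer_score, cancer_score].
    Returns 255 (white) where the cancer score is strictly larger, else 0 (black).
    """
    return [[255 if cancer > not_cancer else 0
             for not_cancer, cancer in row]
            for row in score_map]


# Made-up scores for a 2x2 image.
scores = [[[1.5, -0.3], [0.2, 0.9]],
          [[-1.0, 2.0], [0.4, 0.4]]]
mask = hard_decision(scores)
print(mask)
```

Ties are resolved toward not-cancerous here, matching the behavior of an argmax that returns the first maximal index.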
  
 ----
 ====Hints:====
  
-Importing data is a little tricky for this lab. There are a few ways to handle data. I downloaded the dataset and divided it into train_input, train_output, test_input and test_output. I uploaded this to my Google Drive and wrote the following in Colab to use the dataset:
+The intention of this lab is to learn how to build deep neural nets and implement a loss function, so we'll help you with the implementation of the Dataset. This code will download the dataset for you so that you are ready to use it and can focus on the network implementation, losses and accuracies.
  
-    # Load the Drive helper and mount
-    from google.colab import drive
-    # This will prompt for authorization.
-    drive.mount('/content/drive')
-    print('done')
-    
-This will ask you for an authentication so please copy paste the code that it gives you. Next make your dataset class and you will be using ''ImageFolder'' to build your dataset in init:
+<code python>
+import torch
+import torchvision
+import os
+import gzip
+import tarfile
+import gc
+from torch.utils.data import Dataset
+from torchvision import datasets, transforms
+from IPython.core.ultratb import AutoFormattedTB
+__ITB__ = AutoFormattedTB(mode = 'Verbose',color_scheme='LightBg', tb_offset = 1)
 
-    class CancerDataset(Dataset):
-      def __init__(self, dataset_folder):
-        self.dataset_folder = torchvision.datasets.ImageFolder(dataset_folder + 'train_input' ,transform = transforms.ToTensor())
-        self.label_folder = torchvision.datasets.ImageFolder(dataset_folder + 'train_output' ,transform = transforms.ToTensor())
-  
-      def __getitem__(self,index):
-        img = self.dataset_folder[index]
-        label = self.label_folder[index]
-        return img[0],label[0]
+class CancerDataset(Dataset):
+  def __init__(self, root, download=True, size=512, train=True):
+    # Download and unpack the archive only if it is not already present
+    if download and not os.path.exists(os.path.join(root, 'cancer_data')):
+      datasets.utils.download_url('http://liftothers.org/cancer_data.tar.gz', root, 'cancer_data.tar.gz', None)
+      self.extract_gzip(os.path.join(root, 'cancer_data.tar.gz'))
+      self.extract_tar(os.path.join(root, 'cancer_data.tar'))
+    
+    postfix = 'train' if train else 'test'
+    root = os.path.join(root, 'cancer_data', 'cancer_data')
+    self.dataset_folder = torchvision.datasets.ImageFolder(os.path.join(root, 'inputs_' + postfix) ,transform = transforms.Compose([transforms.Resize(size),transforms.ToTensor()]))
+    self.label_folder = torchvision.datasets.ImageFolder(os.path.join(root, 'outputs_' + postfix) ,transform = transforms.Compose([transforms.Resize(size),transforms.ToTensor()]))
 
-      def __len__(self):
-        return len(self.dataset_folder)
+  @staticmethod
+  def extract_gzip(gzip_path, remove_finished=False):
+    print('Extracting {}'.format(gzip_path))
+    with open(gzip_path.replace('.gz', ''), 'wb') as out_f, gzip.GzipFile(gzip_path) as zip_f:
+      out_f.write(zip_f.read())
+    if remove_finished:
+      os.unlink(gzip_path)
 
-Finally you can call your dataset like this:
+  @staticmethod
+  def extract_tar(tar_path):
+    print('Untarring {}'.format(tar_path))
+    z = tarfile.TarFile(tar_path)
+    z.extractall(tar_path.replace('.tar', ''))
 
-    trainset = CancerDataset(dataset_folder = '/content/drive/My Drive/cancer_data/')
+  
+  def __getitem__(self,index):
+    img = self.dataset_folder[index]
+    label = self.label_folder[index]
+    # ImageFolder returns (tensor, class index); keep the image tensor and the first channel of the label image
+    return img[0],label[0][0]
 
-The tricky part here is that ImageFolder takes a folder of images. When you have both the inputs and outputs folders inside cancer_data, it does not know that the outputs are labels. To get around this problem, I created a layer of folders that looks like this: cancer_data -> train_input -> train_input -> all image files, and similarly for all other folders. (We are looking at other options for making the dataset...)
+  def __len__(self):
+    return len(self.dataset_folder)
+</code>
  
 You are welcome to resize your input images, although don't make them
Line 127: Line 152:
 down to 512x512.
  
-You are welcome (and encouraged) to use the built-in
-dropout layer.
+You will need to add some lines of code for memory management:
+
+<code python>
+def scope():
+  try:
+    #your code for calling dataset and dataloader
+    
+    gc.collect()
+    # GPU memory currently allocated on device 0, in GB
+    print(torch.cuda.memory_allocated(0) / 1e9)
+    
+    #for epochs:
+    #  Call your model,loss and accuracy
+    
+  except:
+    __ITB__()
+
+scope()
+</code>
 +   
 +Since you will be using the output of one layer in two places (convolution and max pooling), you can't use ''nn.Sequential'' for the whole network. Instead, write the network with ordinary variable assignments, as in the example shown below:
 + 
 +<code python>
 +class CancerDetection(nn.Module):
 +  def __init__(self):
 +    super(CancerDetection, self).__init__()
 +    
 +    self.conv1 = nn.Conv2d(3,64,kernel_size = 3, stride = 1, padding = 1)
 +    self.relu2 = nn.ReLU()
 +    self.conv3 = nn.Conv2d(64,128,kernel_size = 3, stride = 1, padding = 1)
 +    self.relu4 = nn.ReLU()
 +
 +  def forward(self, input):
 +    conv1_out = self.conv1(input)
 +    relu2_out = self.relu2(conv1_out)
 +    conv3_out = self.conv3(relu2_out)
 +    relu4_out = self.relu4(conv3_out)
 +    return relu4_out
 +</code>
 + 
 +You are welcome (and encouraged) to use the built-in batch normalization and dropout layers.
 + 
 +Guessing that the pixel is not cancerous every single time will give you an accuracy of ~85%. Your trained network should be able to do better than that (but you will not be graded on accuracy). This is the result I got after 1 hour of training.
 + 
 +{{:cs501r_f2016:training_accuracy.png?400|}}
 +{{:cs501r_f2016:training_loss.png?400|}}
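
The high accuracy of the always-not-cancerous guess follows directly from class imbalance: pixel accuracy is the fraction of matching pixels, so a constant prediction scores exactly the fraction of pixels in that class. The sketch below shows this on a small invented label mask. The function name and the mask are hypothetical, and the 13/16 figure belongs to this toy mask, not to the lab's dataset.

```python
def pixel_accuracy(predicted, labels):
    """Fraction of pixels where the predicted class equals the label."""
    correct = sum(p == t
                  for pred_row, label_row in zip(predicted, labels)
                  for p, t in zip(pred_row, label_row))
    total = sum(len(row) for row in labels)
    return correct / total


# Invented 4x4 label mask: 1 = cancerous, 0 = not cancerous (3 of 16 positive).
labels = [[0, 0, 0, 0],
          [0, 0, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]

# Baseline that predicts "not cancerous" for every pixel.
all_negative = [[0] * len(row) for row in labels]

print(pixel_accuracy(all_negative, labels))
```

A trained network is only useful if it beats this number on the real data, which is why comparing against the constant baseline is a more informative check than the raw accuracy value.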
cs501r_f2018/lab4.1537775258.txt.gz · Last modified: 2021/06/30 23:40 (external edit)