supercomputer

Differences

This shows you the differences between two versions of the page.

supercomputer [2017/08/23 20:22]
sean [Deep Learning on the Supercomputer]
supercomputer [2017/10/10 17:09]
wingated
Line 19: Line 19:
 module add tensorflow/0.9.0_python-2.7.11+cuda
 </code>
 +
 +**UPDATE: apparently, the following module file works better:**
 +
 +<code>
 +#%Module
 +
 +module load defaultenv
 +module load cuda/8.0
 +module load cudnn/5.1_cuda-8.0
 +module load python/2/7
 +
 +</code>
 +
  
 The computer lab allots the most storage to the **compute** directory, so from now on we will put all data and code there.
Line 42: Line 55:
 #!/bin/bash
 
-#SBATCH --time=01:00:00   # walltime
+#SBATCH --time=01:00:00   # walltime - this is one hour
 #SBATCH --ntasks=1   # number of processor cores (i.e. tasks)
 #SBATCH --nodes=1   # number of nodes
Line 54: Line 67:
 Simple enough, right? It is also important to tell Slurm how much memory and time we expect the job to need. If we request a lot, the job gets lower priority and we have to wait longer for it to start.
  
-Now we just do execute ./slurm.sh to run it.
+To submit your job, use the ''sbatch'' command, as in ''sbatch ./slurm.sh''.
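
As a minimal sketch of the submission step, the snippet below pulls the job ID out of the message that ''sbatch'' prints on success. It assumes the usual ''Submitted batch job <id>'' wording; the helper name ''parse_job_id'' is hypothetical and not part of Slurm.

```shell
#!/bin/bash
# Hypothetical helper: extract the numeric job ID from sbatch's success message.
# Assumes the message looks like "Submitted batch job 123456".
parse_job_id() {
    local msg="$1"
    # Strip everything up to the last space; the job ID is the final word.
    local id="${msg##* }"
    # Accept only purely numeric IDs; anything else counts as a failure.
    case "$id" in
        ''|*[!0-9]*) return 1 ;;
        *) printf '%s\n' "$id" ;;
    esac
}

# Usage on the cluster (not run here): submit the script and remember the ID.
# job_id=$(parse_job_id "$(sbatch ./slurm.sh)") && echo "queued as $job_id"
```

Options given on the command line take precedence over the matching ''#SBATCH'' lines, so something like ''sbatch --time=00:30:00 ./slurm.sh'' requests a shorter walltime without editing the script. If your Slurm version supports it, ''sbatch --parsable'' prints the job ID directly and makes the parsing step unnecessary.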
  
 ==== Pro Tips ====
   * Make sure your TensorFlow code uses the GPU.
   * To see the status of all your jobs, it's helpful to make an alias for the command ''watch squeue -u <username> --Format=jobid,numcpus,state,timeused,timeleft''.
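
The alias from the last tip might be set up along these lines. This is only a sketch: it assumes a bash shell, substitutes ''$USER'' for ''<username>'', and the names ''squeue_cmd'' and ''myjobs'' are illustrative choices rather than anything the cluster provides.

```shell
#!/bin/bash
# Hypothetical helper: build the squeue command from the tip for a given user.
# Falls back to $USER when no username is passed.
squeue_cmd() {
    local user="${1:-$USER}"
    printf 'squeue -u %s --Format=jobid,numcpus,state,timeused,timeleft\n' "$user"
}

# A line like this in ~/.bashrc gives a live view of your jobs (assumed alias name):
# alias myjobs="watch $(squeue_cmd)"
```

Building the command in a function keeps the field list in one place, so adding or dropping a column later only requires one edit.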
supercomputer.txt · Last modified: 2021/06/30 23:42 (external edit)