supercomputer [2017/08/23 20:22] sean [Deep Learning on the Supercomputer] → supercomputer [2021/06/30 23:42] (current)
module add tensorflow/0.9.0_python-2.7.11+cuda
</code>
| + | |||
| + | **UPDATE: apparently, the following module file works better:** | ||
| + | |||
| + | <code> | ||
| + | #%Module | ||
| + | |||
| + | module load defaultenv | ||
| + | module load cuda/8.0 | ||
| + | module load cudnn/5.1_cuda-8.0 | ||
| + | module load python/2/7 | ||
| + | |||
| + | </code> | ||
| + | |||
The computer lab grants the most disk space to the **compute** directory, so from now on we will make sure to put all data and code in there.
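For example (the project name here is made up):

<code bash>
# Work out of compute so data and checkpoints don't hit home-directory quotas
mkdir -p ~/compute/my_project
cd ~/compute/my_project
</code>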
#!/bin/bash
#SBATCH --time=01:00:00   # walltime - this is one hour
#SBATCH --ntasks=1        # number of processor cores (i.e. tasks)
#SBATCH --nodes=1         # number of nodes
Simple enough, right? It is also important to tell SLURM how much memory and time we actually expect to use: the more we request, the lower our priority and the longer we will wait for the job to start.
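Putting it together, a complete batch script might look roughly like this; the GPU option, memory size, job name, and script name are all examples, so check our cluster's documentation for the right values:

<code bash>
#!/bin/bash
#SBATCH --time=01:00:00          # walltime - this is one hour
#SBATCH --ntasks=1               # number of processor cores (i.e. tasks)
#SBATCH --nodes=1                # number of nodes
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --mem-per-cpu=4096M      # memory per CPU core
#SBATCH -J "tf_train"            # job name shown in squeue

module load defaultenv cuda/8.0 cudnn/5.1_cuda-8.0
python train.py                  # replace with your own script
</code>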
To submit your job, use the ''sbatch'' command, as in ''sbatch ./slurm.sh''.
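A typical submit-and-monitor session looks something like this (the job id here is made up):

<code bash>
$ sbatch ./slurm.sh
Submitted batch job 123456
$ squeue -u <username>      # check its state: PD = pending, R = running
$ scancel 123456            # cancel the job by id if something is wrong
</code>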
==== Pro Tips ====
  * Make sure your TensorFlow code actually uses the GPU
  * To see the status of all your jobs, it is helpful to make an alias for the command ''watch squeue -u <username> --Format=jobid,numcpus,state,timeused,timeleft''
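For instance, you could add something like this to ''~/.bashrc'' (the alias name is arbitrary):

<code bash>
# watch refreshes every 2 seconds by default; replace <username> with your own
alias myjobs='watch squeue -u <username> --Format=jobid,numcpus,state,timeused,timeleft'
</code>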