CoEPP RC
 

<code>
/home/<username> - your home directory
/data/cephfs/punim0011 - our project directory
/scratch - scratch directory for jobs
</code>
===== TensorFlow =====

  * To install TensorFlow:

<code>
module load CUDA/8.0.44
wget https://bootstrap.pypa.io/ez_setup.py -O ez_setup.py
python ez_setup.py --user
easy_install --user pip
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH
pip install --user tensorflow-gpu
</code>
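The export line above makes the dynamic linker search the project's private CUDA directory first; presumably it holds libraries (such as cuDNN) that tensorflow-gpu needs but the CUDA/8.0.44 module does not ship. A quick way to confirm the directory ends up first on the search path:

<code>
# Prepend the project CUDA libraries and confirm they come first
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH" | cut -d: -f1
# prints /data/projects/punim0011/cuda/lib64
</code>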
  
To run TensorFlow programs, first load the CUDA module:

<code>
module load CUDA/8.0.44
</code>
e.g.

<code>
[scrosby@spartan ~]$ module load CUDA/8.0.44
[scrosby@spartan ~]$ python
Python 2.7.5 (default, Aug  2 2016, 04:20:16)
...
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
</code>
  
You'll also need to add those lines to the bash script you submit to the queue.
  
For example, in file submitSlurm.sh:

<code>
#!/bin/bash
module load CUDA/8.0.44

python deepLearningTrain.py
</code>
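The same resource requests can also live inside the script as #SBATCH directives instead of sbatch command-line flags; a sketch using the values from this page (flags given on the command line still override in-script directives):

<code>
#!/bin/bash
# Resource requests embedded as #SBATCH directives
#SBATCH -p physics-gpu
#SBATCH --gres=gpu:2
#SBATCH --mem-per-cpu=20G
#SBATCH --cpus-per-task=6
#SBATCH --time=48:00:00

module load CUDA/8.0.44
python deepLearningTrain.py
</code>

With the directives in the script, the job can be submitted with a plain ''sbatch submitSlurm.sh''.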
  
Where deepLearningTrain.py is the TensorFlow script. No changes are needed from TensorFlow-CPU; operations will be allocated to the GPUs automatically.
Run with:

<code>
sbatch -p physics-gpu --gres=gpu:2 --mem-per-cpu=20G --cpus-per-task=6 --time=48:00:00 submitSlurm.sh
</code>
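Slurm grants ''--mem-per-cpu'' multiplied by the allocated CPU count, so this request comes to 120G of host memory in total; the arithmetic:

<code>
# Total memory = --mem-per-cpu (20G) x --cpus-per-task (6)
mem_per_cpu_gb=20
cpus_per_task=6
echo "$(( mem_per_cpu_gb * cpus_per_task ))G"   # prints 120G
</code>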
cloud/spartangpu.txt · Last modified: 2018/09/05 15:22 by scrosby
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International