CoEPP RC
 

cloud:spartangpu [2017/01/20 12:59] scrosby
===== TensorFlow =====
  
  * To install TensorFlow:
<code>
module load CUDA/8.0.44
wget https://bootstrap.pypa.io/ez_setup.py -O ez_setup.py
python ez_setup.py --user
easy_install --user pip
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH
pip install --user tensorflow-gpu
</code>
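Because everything above is installed with ''--user'', the executables land in the per-user scheme rather than system-wide. A small sketch (assuming the standard Linux per-user layout, where ''pip install --user'' puts scripts in ~/.local/bin) to make sure the user-installed pip is found first:

<code>
# A --user install puts executables in ~/.local/bin on Linux; prepend it
# to PATH so the user-installed pip takes precedence over any system one.
export PATH="$HOME/.local/bin:$PATH"
echo "$PATH"
</code>

Adding the export line to ~/.bashrc makes it persist across logins.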
  
To run TensorFlow programs, do:

<code>
module load CUDA/8.0.44
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH
</code>
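One subtlety with the export above: if LD_LIBRARY_PATH was previously unset, the expansion leaves a trailing '':'', which the dynamic loader treats as the current directory. A slightly more defensive variant (same Spartan path, standard shell parameter expansion) only appends the old value when it is non-empty:

<code>
# Same export as above, but only append the previous value when it is
# non-empty, avoiding a trailing ":" (which ld.so reads as ".").
CUDA_LIB=/data/projects/punim0011/cuda/lib64
export LD_LIBRARY_PATH="$CUDA_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
</code>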
For example:

<code>
[scrosby@spartan ~]$ module load CUDA/8.0.44
[scrosby@spartan ~]$ export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH
...
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
</code>
  
You'll also have to add those lines to the bash script you submit to the queue.

For example, in the file submitSlurm.sh:
<code>
#!/bin/bash
module load CUDA/8.0.44
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH

python deepLearningTrain.py
</code>
  
Where deepLearningTrain.py is your TensorFlow script. No changes are needed relative to TensorFlow-CPU; operations are automatically allocated to the GPUs.

Run with:
<code>
sbatch -p physics-gpu --gres=gpu:2 --mem-per-cpu=20G --cpus-per-task=6 --time=48:00:00 submitSlurm.sh
</code>
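Equivalently, the resource requests can live inside the script itself as standard Slurm ''#SBATCH'' directives, so the job can be submitted with a plain ''sbatch submitSlurm.sh''. A sketch using the same options as the command line above:

<code>
#!/bin/bash
# Sketch: the same resource requests as the sbatch command line above,
# embedded as #SBATCH directives. Slurm reads these header comments at
# submission time, so no options are needed on the command line.
#SBATCH -p physics-gpu
#SBATCH --gres=gpu:2
#SBATCH --mem-per-cpu=20G
#SBATCH --cpus-per-task=6
#SBATCH --time=48:00:00

module load CUDA/8.0.44
export LD_LIBRARY_PATH=/data/projects/punim0011/cuda/lib64:$LD_LIBRARY_PATH

python deepLearningTrain.py
</code>

Command-line options passed to sbatch still override the in-script directives, which is handy for one-off changes.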
cloud/spartangpu.txt · Last modified: 2018/09/05 15:22 by scrosby
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International