|Usable Cores||20 (24 Total)|
|Storage||4TB NFS /home|
|Usable Cores||28 (32 Total)|
|Storage||60TB NFS /home|
In order to access any of the CoEPP nodes' Tier 3 resources you will need to be registered in the CoEPP central authentication service.
You were most likely registered when you first joined CoEPP and got your email@example.com email address; however, if you do not remember your CoEPP username or password, please contact the Research Computing team at firstname.lastname@example.org or any of the individuals listed below:
Since Adelaide University does not assign Unix-style usernames, you may choose your own. We request that it be longer than three characters and consist only of alphabetic characters.
Please email email@example.com and either Sean, Lucien or Goncalo will get in touch.
We endeavour to keep your CoEPP username the same as your central unimelb username (which is also the same as your unimelb Physics username). Only in the rare case of a username clash with an existing user from another node will this not be possible.
|Contact Name||Lucien Boland|
|Email||firstname.lastname@example.org|
|Phone||03 8344 7994|
|Contact Name||Sean Crosby|
|Email||email@example.com|
|Phone||03 8344 8093|
Your CoEPP username and UID (the unique number associated with your username) will be the same as your Physics IT username and UID. We do this to allow you continued NFS access to your home directories from Linux machines controlled by the USyd Physics IT department.
|Contact Name||Goncalo Borges|
|Email||firstname.lastname@example.org|
|Phone||02 9351 1937|
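If you want to check which username and UID you ended up with on a particular machine, the standard Unix id command shows both (the values below are purely illustrative):

<code>
$ id
uid=20345(jbloggs) gid=1000(people) groups=1000(people)
</code>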
Once you have your CoEPP central authentication account you will immediately be able to access the UI of your home node.
^Node ^Login address ^Actual server name^
Access to other nodes' UIs can also be requested from the Research Computing team, and additional resources are available to all CoEPP researchers on the CoEPP Tier 3 cloud facility.
Terminal access can be gained using these recommended emulators:
You must use the Secure SHell (SSH) protocol, as demonstrated below:
<code>
user@local$ ssh -X -l lucien mui.coepp.org.au
Last login: Tue Feb 12 06:37:15 2013 from locahost.unimelb.edu.au

 +---+   +---+
 |   |-------|   |      Research Computing
 +---+   |   +---+
   |   +---+   |        ARC Centre of Excellence for Particle Physics at the Terascale
   |---|   |---|
   |   +---+   |        *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
 +---+   |   +---+      * This is a restricted server.                      *
 |   |-------|   |      * Only authorised users are permitted               *
 +---+   +---+          *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

To initialise ATLAS software, type
    . setupATLAS
To setup ROOT, then type
    localSetupROOT
To setup DQ2, type
    localSetupDQ2Client
To setup PANDA, type
    localSetupPandaClient

[agu1:~]$
</code>
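For example, following the banner's own instructions, you can initialise the ATLAS software and then set up ROOT in the new session:

<code>
[agu1:~]$ . setupATLAS
[agu1:~]$ localSetupROOT
</code>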
<code>
[tjdyce@ui ~]$ qsub test.sh
782.ui.atlas.unimelb.edu.au
</code>
Note: the job number is important. Here it is 782.
Note: This submits to the long queue by default; check the queues and priorities section below for other possible queues.
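The contents of test.sh are not shown at this point; as a rough sketch only (not the actual script used here), a minimal PBS batch script along these lines would work for a first submission test:

<code>
#!/bin/bash
# minimal illustrative batch script -- the real test.sh may differ
#PBS -l walltime=00:10:00

# print the execution host and date as a simple sanity check
hostname
date
</code>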
<code>
[tjdyce@ui ~]$ qstat

ACTIVE JOBS--------------------
JOBNAME    USERNAME   STATE     PROC   REMAINING            STARTTIME

782        tjdyce     Running   1      2:16:46:30   Wed May 27 22:42:02
783        fifieldt   Running   1      2:19:47:56   Thu May 28 01:43:28
784        ulif       Running   1      2:19:50:28   Thu May 28 01:46:00
785        tshao      Running   1      2:19:54:10   Thu May 28 01:49:42

     4 Active Jobs   4 of   8 Processors Active (50.00%)
                     1 of   1 Nodes Active      (100.00%)

IDLE JOBS----------------------
JOBNAME    USERNAME   STATE     PROC   WCLIMIT              QUEUETIME

0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME    USERNAME   STATE     PROC   WCLIMIT              QUEUETIME

Total Jobs: 4   Active Jobs: 4   Idle Jobs: 0   Blocked Jobs: 0
</code>
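If other users also have jobs in the system, it can be handy to restrict the listing to your own jobs. The standard Torque/PBS user filter does this:

<code>
[tjdyce@ui ~]$ qstat -u tjdyce
</code>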
<code>
[tjdyce@ui ~]$ tracejob 782

Job: 782.ui.atlas.unimelb.edu.au

02/25/2009 06:52:41  S  enqueuing into default, state 1 hop 1
02/25/2009 06:52:41  S  Job Queued at request of email@example.com, owner = firstname.lastname@example.org,
                        job name = test.sh, queue = default
02/25/2009 06:52:42  S  Job Modified at request of email@example.com
02/25/2009 06:52:42  S  Job Run at request of firstname.lastname@example.org
02/25/2009 06:52:42  S  Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb
                        resources_used.vmem=0kb resources_used.walltime=00:00:00
02/25/2009 06:52:42  S  dequeuing from default, state COMPLETE
</code>
Once the job is complete, the output is stored as shown below; note again that the job number appears in the file names. The .o file contains the job's standard output and the .e file its standard error.
<code>
[tjdyce@ui ~]$ ls -lah *782
-rw------- 1 tjdyce epp    0 Feb 25 06:52 test.sh.e782
-rw------- 1 tjdyce epp 2.8K Feb 25 06:52 test.sh.o782
</code>
^Queue ^Priority ^Maximum Simultaneous Jobs ^Maximum Walltime ^Usage ^Submit Command^
|mel_long |Low, jobs will run behind short jobs | No limit |72 hours |For standard jobs, which you have tested and are ready to set running |qsub -q mel_long|
|mel_short |High, jobs take precedence \\ over long jobs| No limit |1 hour |For Prototyping jobs |qsub -q mel_short|
//Note: The mel_long queue is the default; if you do not specify a queue, this is where jobs will go.//
<code>
qmgr -c "print server" | grep default_queue
set server default_queue = mel_long
</code>

=== Adelaide Queues ===

^Queue ^Priority ^Maximum Simultaneous Jobs ^Maximum Walltime ^Usage ^Submit Command^
|adl_long |Low, jobs will run behind short jobs | No limit |72 hours |For standard jobs, which you have tested and are ready to set running |qsub -q adl_long|
|adl_short |High, jobs take precedence \\ over long jobs| No limit |1 hour |For Prototyping jobs |qsub -q adl_short|

//Note: The adl_long queue is the default; if you do not specify a queue, this is where jobs will go.//
<code>
qmgr -c "print server" | grep default_queue
set server default_queue = adl_long
</code>

=== Sydney Queues ===

^Queue ^Priority ^Maximum Simultaneous Jobs ^Maximum Walltime ^Usage ^Submit Command^
|syd_long |Low, jobs will run behind short jobs | No limit |72 hours |For standard jobs, which you have tested and are ready to set running |qsub -q syd_long|
|syd_medium |Medium | No limit |10 hours | |qsub -q syd_medium|
|syd_short |High, jobs take precedence \\ over long jobs| No limit |1 hour |For Prototyping jobs |qsub -q syd_short|

//Note: The syd_medium queue is the default; if you do not specify a queue, this is where jobs will go.//
<code>
qmgr -c "print server" | grep default_queue
set server default_queue = syd_medium
</code>

==== Fairshare ====

The T3 queues are set up with user-based fairshare. This means that your jobs will be given lower priority than other users' jobs if you have been running more jobs than they have. This fairshare is calculated over a 5 day period, with daily windows and a 50% decay.

===== Adelaide Users =====

==== Home Directories ====

There are two sets of execution queues shown above, the Melbourne and Adelaide queues. When your jobs run on a Melbourne queue they have access to the home directory that you had on the //old// Adelaide cloud. When you run on an Adelaide queue, you have a different (possibly empty) home directory. If you're not sure what that means, run a job like this to show what is in your home directory:

<code>
#!/bin/bash
#PBS -S /bin/bash
#PBS -j oe
#PBS -l nodes=1
#PBS -l mem=512MB,vmem=512MB
#PBS -l walltime=00:05:00
#PBS -N test

# show the queue we are running on and home directory contents
echo "PBS_O_QUEUE=$PBS_O_QUEUE"   # show the queue we are running on
echo "$ pwd"
pwd
echo "$ ls -l"
ls -l
</code>
Submit the job to both a Melbourne queue and an Adelaide queue:
<code>
$ qsub test.sh                   # submitted to the Melbourne long queue
270981.t3torque.atlas.unimelb.edu.au
$ qsub -q adl_short test.sh      # submitted to the Adelaide short queue
270982.t3torque.atlas.unimelb.edu.au
$ ls -l
total 12
-rw------- 1 rwilson people 1102 Mar  6 03:32 test.o270981
-rw------- 1 rwilson people  115 Mar  6 03:32 test.o270982
-rw-r--r-- 1 rwilson people  278 Mar  6 03:32 test.sh
</code>
<code>
$ more test.o270981
PBS_O_QUEUE=long
$ pwd
/imports/home/rwilson
$ ls -l
total 40
drwxr-xr-x 5 rwilson people  155 Mar  6 03:09 checkjobs
-rw-r--r-- 1 rwilson people  342 Dec  3 22:11 cloud_users
drwxrwxr-x 4 rwilson people 4096 Feb 19 04:33 combo
drwxr-xr-x 2 rwilson people 4096 Feb 14 00:43 example
drwxrwxrwx 2 rwilson people 4096 Jan  6 21:54 fib
drwxr-xr-x 2 rwilson people   84 Feb 25 23:40 ioctl
drwx------ 5 rwilson people  105 Jan 14 00:10 jdash
drwxr-xr-x 2 rwilson people   67 Feb 18 04:07 job_test
drwxr-xr-x 2 rwilson people 4096 Jan  9 21:52 jobdash
drwxr-xr-x 2 rwilson people   75 Jan  9 22:59 makejob
-rw-r--r-- 1 rwilson people  788 Sep 18 02:53 martin_cpu.job
drwxr-xr-x 2 rwilson people   76 Feb 25 04:14 noclean
-rw-r--r-- 1 rwilson people  315 Sep 18 01:38 one_hour_cpu.py
-rw-r--r-- 1 rwilson people 1645 Feb 20 23:56 results
drwxr-xr-x 4 rwilson people 4096 Mar  1 01:51 test
drwxr-xr-x 2 rwilson people   94 Feb 14 01:15 test2
drwxr-xr-x 2 rwilson people   50 Feb 20 03:28 test_cloud
drwxr-xr-x 2 rwilson people 4096 Feb 21 04:14 test_stage
drwxr-xr-x 3 rwilson people   26 Feb  3 22:58 workarea
</code>
<code>
$ more test.o270982
PBS_O_QUEUE=adl_short
$ pwd
/imports/home/rwilson
$ ls -l
total 0
drwxr-xr-x 2 rwilson people 39 Mar  6 03:32 test
</code>
Note that your home directory is in the same place in the filesystem in both runs (/imports/home/<username>) but the directory contents are different.
As stated above, the default queue used if you don't specify a queue in the job script (or on the command line) is the long queue, which is the Melbourne long queue.
If you want your job to run on an Adelaide queue, specify either the adl_short or adl_long queue in the job script:
<code>
#PBS -q adl_short
</code>
or on the command line:
<code>
$ qsub -q adl_short test.sh
</code>
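To see which queues are actually configured on your node's Torque server (and confirm the queue names listed above), the standard Torque/PBS queue summary option can be used:

<code>
$ qstat -Q
</code>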