CoEPP RC
 

CREAM Torque Filter

Aim

We want to add a custom property to every job submitted through the CREAM-CE interface: 'score' for jobs that request a single core, and 'mcore' for jobs that request multiple cores.

Assumptions

Multicore

A JDL that requests more than one core contains a line like this:

...
cream_attributes  =  CpuNumber=8;WholeNodes=false;SMPGranularity=8;
...

CREAM converts those JDL attributes into the following line in a normal Torque submit script:

#PBS -l nodes=1:ppn=8

My goal is to change that line to:

#PBS -l nodes=1:ppn=8:mcore
#PBS -l cput=416:00:00

The cput request is included because the job now needs 8 cores with 52 hours of CPU time each, which is 8 x 52 = 416 hours in total.
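The cput arithmetic can be sketched in shell. This is only an illustration: the function name cput_for is hypothetical, and 52 hours per core is the figure used on this page, not a Torque default.

```shell
#!/bin/bash
# cput_for PPN [HOURS_PER_CORE]: print a cput value (HH:MM:SS) for a job
# that uses PPN cores. The name is hypothetical; 52 hours per core is the
# allowance assumed on this page.
cput_for() {
    local ppn=$1
    local hours_per_core=${2:-52}
    echo "$(( ppn * hours_per_core )):00:00"
}

cput_for 8    # prints 416:00:00
```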

Single core jobs

For single-core jobs, the nodes line looks like this:

#PBS -l nodes=1

My goal is to change that line to:

#PBS -l nodes=1:score

Procedure

The mechanism is simple. To submit a job to the Torque server, CREAM runs the regular qsub command, and qsub runs the submit filter (if one is configured) before it submits the job to the batch server.

To get qsub to run a custom submit filter, add a file called torque.cfg to the Torque home directory on your CE (which is a submit host). This is the directory you passed as --with-server-home when you built the Torque RPMs. My torque-client RPM was built with /var/lib/torque as the --with-server-home value.

# cat /var/lib/torque/torque.cfg
SUBMITFILTER /var/lib/torque/submit_filter

With this in place, qsub runs the script /var/lib/torque/submit_filter every time it submits a job.
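Before relying on the filter, it can help to confirm that torque.cfg points at a script that exists and is executable. The sketch below assumes the single-line SUBMITFILTER format shown above. The function names submit_filter_path and filter_ready are hypothetical helpers, not part of Torque.

```shell
#!/bin/bash
# submit_filter_path CFG: print the path given on the SUBMITFILTER line of CFG.
# (Hypothetical helper; assumes the "SUBMITFILTER <path>" format shown above.)
submit_filter_path() {
    awk '$1 == "SUBMITFILTER" { print $2; exit }' "$1"
}

# filter_ready CFG: succeed only if CFG names a filter that exists and
# has its execute bit set.
filter_ready() {
    local path
    path=$(submit_filter_path "$1")
    [ -n "$path" ] && [ -x "$path" ]
}
```

On the CE this would be called as: filter_ready /var/lib/torque/torque.cfg && echo "submit filter is in place"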

The contents of my /var/lib/torque/submit_filter are:

#!/usr/bin/perl

# This script reads a submission script on standard input, modifies
# it, and writes the modified script to standard output.  It makes
# two modifications:
#
#   * adds the property mcore (and a cput request) to all jobs which
#     request 1 node and more than 1 CPU
#   * adds the property score to all jobs which request 1 node and 1 CPU
#
use strict;
use warnings;

while (my $line = <STDIN>) {

    # Only the nodes/ppn line is modified; every other line is
    # copied through unchanged.
    if ($line =~ m/#PBS\s+-l\s+nodes=\d+/) {
        $line = process_nodes($line);
    }

    print $line;
}

# This takes the existing nodes line and returns the altered one
sub process_nodes {
    my $oldline = shift;
    # Start from the original so the nodes line is not lost if
    # neither pattern below matches
    my $line = $oldline;

    # nodes=N:ppn=M (mcore, or score when ppn is 1)
    if ($oldline =~ m/#PBS\s+-l\s+nodes=(\d+):ppn=(\d+)\s*$/) {
        my ($nodes, $ppn) = ($1, $2);
        if ($nodes == 1) {
            if ($ppn > 1) {
                # 52 hours of CPU time per core
                my $hours = 52 * $ppn;
                $line = "#PBS -l nodes=$nodes:ppn=$ppn:mcore\n#PBS -l cput=$hours:00:00\n";
            }
            elsif ($ppn == 1) {
                $line = "#PBS -l nodes=$nodes:ppn=$ppn:score\n";
            }
        }
    }
    # nodes=N with no ppn (score)
    elsif ($oldline =~ m/#PBS\s+-l\s+nodes=(\d+)\s*$/) {
        my $nodes = $1;
        if ($nodes == 1) {
            $line = "#PBS -l nodes=$nodes:score\n";
        }
    }

    return $line;
}
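The filter can be exercised by hand before any real job reaches it, because it only reads standard input and writes standard output. The following is a minimal sketch of such a check. The function name check_filter and the sample cases are hypothetical, and the expected outputs are the rewrites described on this page.

```shell
#!/bin/bash
# check_filter CMD [ARGS...]: pipe sample directives through the filter
# command and compare the result with the rewrites described on this page.
# Returns non-zero if any case differs. (Hypothetical helper.)
check_filter() {
    local failures=0 c input expected actual
    # Each case is "input|expected"; \n in the expected part is a line break.
    local cases=(
        '#PBS -l nodes=1|#PBS -l nodes=1:score'
        '#PBS -l nodes=1:ppn=1|#PBS -l nodes=1:ppn=1:score'
        '#PBS -l nodes=1:ppn=8|#PBS -l nodes=1:ppn=8:mcore\n#PBS -l cput=416:00:00'
        '#PBS -l nodes=2:ppn=8|#PBS -l nodes=2:ppn=8'
        '#PBS -q atlas|#PBS -q atlas'
    )
    for c in "${cases[@]}"; do
        input=${c%%|*}
        expected=$(printf '%b' "${c#*|}")
        actual=$(printf '%s\n' "$input" | "$@")
        if [ "$actual" != "$expected" ]; then
            printf 'MISMATCH for: %s\n  got: %s\n' "$input" "$actual"
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]
}
```

On the CE this would be run as: check_filter /var/lib/torque/submit_filter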

You don't need to restart CREAM for the change to take effect.

Submitted job examples

Single core ATLAS job

#!/bin/bash
# PBS job wrapper generated by pbs_submit.sh
# on Fri Apr  1 15:38:17 UTC 2016
#
# stgcmd = yes
# proxy_string = /var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/proxy/1459258372_302678_11377319443391
# proxy_local_file = /var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/proxy/1459258372_302678_11377319443391
#
# PBS directives:
#PBS -S /bin/bash
#PBS -o /dev/null
#PBS -e /dev/null
#PBS -q atlas
#PBS -l nodes=1
#PBS -W stagein=\'CREAM507328982_jobWrapper.sh.55207.3797.1459525097@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/50/CREAM507328982/CREAM507328982_jobWrapper.sh,cream_507328982.proxy@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/proxy/1459258372_302678_11377319443391\'
#PBS -W stageout=\'out_cream_507328982_StandardOutput@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/50/CREAM507328982/StandardOutput,err_cream_507328982_StandardError@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/atlaspil/CN_Asoka_De_Silva_GC1_OU_triumf_ca_O_Grid_C_CA_atlas_Role_pilot_Capability_NULL_pilatl07/50/CREAM507328982/StandardError\'
#PBS -m n

8 core ATLAS job

#!/bin/bash
# PBS job wrapper generated by pbs_submit.sh
# on Sun Apr  3 10:55:27 UTC 2016
#
# stgcmd = yes
# proxy_string = /var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/proxy/1451973343_181204_11377319443391
# proxy_local_file = /var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/proxy/1451973343_181204_11377319443391
#
# PBS directives:
#PBS -S /bin/bash
#PBS -o /dev/null
#PBS -e /dev/null
#PBS -q atlas
#PBS -l nodes=1:ppn=8:mcore
#PBS -l cput=416:00:00
#PBS -W stagein=\'CREAM915732117_jobWrapper.sh.40809.5385.1459680927@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/91/CREAM915732117/CREAM915732117_jobWrapper.sh,cream_915732117.proxy@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/proxy/1451973343_181204_11377319443391\'
#PBS -W stageout=\'out_cream_915732117_StandardOutput@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/91/CREAM915732117/StandardOutput,err_cream_915732117_StandardError@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/prdatlas/CN_Robot__ATLAS_Pilot1_CN_614260_CN_atlpilo1_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_production_Capability_NULL_patl009/91/CREAM915732117/StandardError\'
#PBS -m n

Single core Ops job

#!/bin/bash
# PBS job wrapper generated by pbs_submit.sh
# on Tue Apr  5 05:06:39 UTC 2016
#
# stgcmd = yes
# proxy_string = /var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/proxy/14391939652E829472rocwms022Egrid2Esinica2Eedu2Etw_11377319443391
# proxy_local_file = /var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/proxy/14391939652E829472rocwms022Egrid2Esinica2Eedu2Etw_11377319443391
#
# PBS directives:
#PBS -S /bin/bash
#PBS -o /dev/null
#PBS -e /dev/null
#PBS -q ops
#PBS -l nodes=1
#PBS -W stagein=\'CREAM486087923_jobWrapper.sh.50909.12661.1459832799@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/48/CREAM486087923/CREAM486087923_jobWrapper.sh,cream_486087923.proxy@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/proxy/14391939652E829472rocwms022Egrid2Esinica2Eedu2Etw_11377319443391\'
#PBS -W stageout=\'out_cream_486087923_StandardOutput@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/48/CREAM486087923/StandardOutput,err_cream_486087923_StandardError@agcream1.atlas.unimelb.edu.au:/var/cream_sandbox/sgmops/CN_Liaw_SyueYi_182693_OU_GRID_O_AS_C_TW_ops_Role_lcgadmin_Capability_NULL_sops009/48/CREAM486087923/StandardError\'
#PBS -m n

Last modified: 2016/04/05 17:24 by scrosby
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International