Job Description Extensions Support (4.0.5+, update pkg available)

Important

This feature has been added as of GT 4.0.5. For versions older than 4.0.5, an update package is available to upgrade your installation. See the GT Development Downloads page for the latest links.

The WS-GRAM job description schema includes a section for extending the job description with custom elements. To make sense of this in the resource manager adapter Perl scripts, a Perl module named Globus::GRAM::ExtensionsHandler is provided to turn these custom elements into parameters that the adapter scripts can understand.

Note

Although only non-GRAM XML elements are allowed in the <extensions> element of the job description, the extensions handler makes no distinction based on namespace. Thus, <foo:myparam> and <bar:myparam> will both be treated as just <myparam>.
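The effect of ignoring namespaces can be illustrated with a short Python sketch. This is not the Perl handler itself: the namespace URIs and the helper name unqualified() are assumptions made up for the example, and Python's ElementTree merely stands in for XML::Parser.

```python
import xml.etree.ElementTree as ET

def unqualified(tag):
    """Drop ElementTree's '{namespace-uri}' prefix, keeping only the local name."""
    return tag.rsplit("}", 1)[-1]

# Hypothetical namespaces; any two distinct URIs would behave the same way.
doc = """<extensions xmlns:foo="urn:example:foo" xmlns:bar="urn:example:bar">
  <foo:myparam>one</foo:myparam>
  <bar:myparam>two</bar:myparam>
</extensions>"""

keys = [unqualified(child.tag) for child in ET.fromstring(doc)]
print(keys)  # ['myparam', 'myparam']
```

Because both elements reduce to the same key, their values would end up in the same entry of the job description hash, so distinct tag names are the only reliable way to keep custom parameters apart.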

Familiarity with the adapter scripts is assumed in the following sub-sections.

1. Requirements for Extensions Support

  • XML::Parser Perl module

2. Supported Extension Constructs

2.1. Simple String Parameters

Simple string extension elements are converted into single-element arrays. The unqualified tag name of the extension element becomes the array's key in the Perl job description hash. Simple string extension elements can be considered a special case of the string array construct described in the next section.

For example, adding the following element to the <extensions> element of the job description:

    <extensions>
        <myparam>yahoo!</myparam>
    </extensions>

will cause $description->myparam() to return the following value:

    'yahoo!'

2.2. String Array Parameters

String arrays are a straightforward repetition of the simple string element construct. If you specify more than one simple string element with the same tag name in the job description, they are assembled into a multi-element array, keyed in the Perl job description hash by the unqualified tag name of the extension elements.

For example:

<extensions>
  <myparams>Hello</myparams>
  <myparams>World!</myparams>
</extensions>

will cause $description->myparams() to return the following value:

[ 'Hello', 'World!' ]

2.3. Name/Value Parameters

Name/value extension elements can be thought of as string array elements that carry an XML attribute named 'name'. They produce a two-dimensional array of name/value pairs, with the unqualified extension element tag name as the name of the array in the Perl job description hash.

For example:

<extensions>
  <myvars name="pi">3.14159</myvars>
  <myvars name="mole">6.022 x 10^23</myvars>
</extensions>

will cause $description->myvars() to return the following value:

        [ [ 'pi', '3.14159'], ['mole', '6.022 x 10^23'] ]
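The three constructs can be summarized in one small model. The following Python sketch is an illustration only, not the code of Globus::GRAM::ExtensionsHandler: the function name extensions_to_description() is hypothetical, tags are assumed to be unqualified, and the result shapes simply mirror the return values shown in the examples above (a lone plain string comes back as a string, repeated elements as a list, and elements with a name attribute as a list of pairs).

```python
import xml.etree.ElementTree as ET

def extensions_to_description(xml_text):
    """Model the documented mapping from <extensions> children to hash entries."""
    collected = {}
    for child in ET.fromstring(xml_text):
        value = (child.text or "").strip()
        if "name" in child.attrib:
            # Name/value construct: store a [name, value] pair.
            value = [child.attrib["name"], value]
        # Elements sharing a tag name accumulate under one key.
        collected.setdefault(child.tag, []).append(value)

    # Mirror the documented return values: a single plain string is
    # returned as a string, everything else as an array.
    return {
        key: values[0] if len(values) == 1 and isinstance(values[0], str) else values
        for key, values in collected.items()
    }

example = """<extensions>
  <myparam>yahoo!</myparam>
  <myparams>Hello</myparams>
  <myparams>World!</myparams>
  <myvars name="pi">3.14159</myvars>
  <myvars name="mole">6.022 x 10^23</myvars>
</extensions>"""

description = extensions_to_description(example)
print(description["myparam"])   # yahoo!
print(description["myparams"])  # ['Hello', 'World!']
print(description["myvars"])    # [['pi', '3.14159'], ['mole', '6.022 x 10^23']]
```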

2.4. Condor-Specific Parameters

If a user submits a job to Gram4 and specifies Condor as the local resource manager, a Condor-specific job description is created from the XML job description and used when the job is submitted to Condor. A user can influence the creation of the Condor-specific job description by adding condorsubmit elements to the extensions element:

<job>
    ...
    <extensions>
        <condorsubmit name="nameOfAnElement">valueOfTheElement</condorsubmit>
    </extensions>
</job>

More than one condorsubmit element can be placed in the extensions element.

The following example shows how to set a Requirements element different from the one added by default. By default, Gram4 adds a Requirements element and sets the parameters OpSys and Arch to values that match the head node where Gram4 is running. For example, if the operating system on the head node is Linux and the architecture is X86_64, the Requirements element in a Condor job description will look like this:

Requirements=OpSys == "LINUX" && Arch == "X86_64"

If this is not what is needed, the requirements can be specified explicitly as follows:

<job>
  <executable>/bin/date</executable>
  <extensions>
    <condorsubmit name="Requirements">OpSys == "LINUX" &amp;&amp; (Arch == "X86_64" || Arch == "INTEL")</condorsubmit>
  </extensions>
</job>

Note that the special character & must be encoded as &amp; in the XML job description.
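When job descriptions are generated by a program, the escaping is easy to get wrong by hand. The following Python sketch shows one way to build a condorsubmit element with the standard library doing the escaping. The helper name condorsubmit_element() is hypothetical and is not part of any Globus tool.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def condorsubmit_element(name, value):
    """Build a <condorsubmit> element, escaping &, < and > in the value."""
    return f"<condorsubmit name={quoteattr(name)}>{escape(value)}</condorsubmit>"

requirements = 'OpSys == "LINUX" && (Arch == "X86_64" || Arch == "INTEL")'
element = condorsubmit_element("Requirements", requirements)
print(element)

# Parsing the element back recovers the original, unescaped expression.
round_trip = ET.fromstring(element).text
```

The printed element matches the one in the example above, with each & written as &amp;.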

2.5. PBS Node Selection Parameters

Note

If you are using an update package with a version of GT prior to 4.0.5: in addition to the globus_gram_job_manager update package, the globus_gram_job_manager_setup_pbs update package is required to take advantage of the PBS node selection extensions.

Node selection constraints in PBS can be specified in one of the following ways:

  • generally, using a construct intended to eventually apply to all resource managers which support node selection

  • explicitly, by specifying a simple string element.

The former will be more portable, but the latter will appeal to those familiar with specifying node constraints for PBS jobs.

2.5.1. Using the nodes extensions element

To specify PBS node selection constraints explicitly, construct a single simple string extension element named nodes whose value conforms to the #PBS -l nodes=... PBS job description directive. The Globus::GRAM::ExtensionsHandler module makes this value available to the PBS adapter script as $description->{nodes}. The updated PBS adapter package checks for this value and creates a directive in the PBS job description from it.

For example, the following nodes extension element

...
<extensions>
  <nodes>activemural:ppn=10+5:ia64-compute:ppn=2</nodes>
</extensions>
...

will result in the following directive in the PBS job description:

#PBS -l nodes=activemural:ppn=10+5:ia64-compute:ppn=2
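The check performed by the adapter can be pictured as follows. This Python sketch only models the behavior described above; the function name pbs_nodes_directive() and the use of a plain dictionary in place of the Perl job description hash are assumptions for illustration.

```python
def pbs_nodes_directive(description):
    """Return the '#PBS -l nodes=...' line if a nodes entry exists, else None."""
    nodes = description.get("nodes")
    if not nodes:
        # No nodes extension element was given: emit no directive.
        return None
    return f"#PBS -l nodes={nodes}"

directive = pbs_nodes_directive({"nodes": "activemural:ppn=10+5:ia64-compute:ppn=2"})
print(directive)
```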

2.5.2. Using the resourceAllocationGroup extensions element

To use the generic construct for specifying node selection constraints, use the resourceAllocationGroup element:

<extensions>
    <resourceAllocationGroup>
        <!-- Optionally select hosts by type and number... -->
        <hostType>...</hostType>
        <hostCount>...</hostCount>

        <!-- *OR* by host names -->
        <hostName>...</hostName>
        <hostName>...</hostName>
        . . .

        <!-- With a total CPU count for this group... -->
        <cpuCount>...</cpuCount>

        <!-- *OR* an explicit number of CPUs per node... -->
        <cpusPerHost>...</cpusPerHost>
        . . .

        <!-- And a total process count for this group... -->
        <processCount>...</processCount>

        <!-- *OR* an explicit number of processes per node... -->
        <processesPerHost>...</processesPerHost>
    </resourceAllocationGroup>
</extensions>

Extension elements specified according to the above pseudo-schema will be converted to an appropriate nodes parameter which will be treated as if an explicit nodes extension element were specified.

Multiple resourceAllocationGroup elements may be specified. Each additional group's constraints are appended to the nodes parameter with a '+' separator.

Note

You cannot specify both hostType/hostCount and hostName elements in the same group. Similarly, you cannot specify both processCount and processesPerHost elements.

Here are some examples of using resourceAllocationGroup:

<!-- #PBS -l nodes=1:ppn=10 -->
<!-- 10 processes -->
<extensions>
    <resourceAllocationGroup>
        <cpuCount>10</cpuCount>
        <processCount>10</processCount>
    </resourceAllocationGroup>
</extensions>

<!-- #PBS -l nodes=activemural:ppn=10+5:ia64-compute:ppn=2 -->
<!-- 1 process (process default) -->
<extensions>
    <resourceAllocationGroup>
        <hostType>activemural</hostType>
        <cpuCount>10</cpuCount>
    </resourceAllocationGroup>
    <resourceAllocationGroup>
        <hostType>ia64-compute</hostType>
        <hostCount>5</hostCount>
        <cpusPerHost>2</cpusPerHost>
    </resourceAllocationGroup>
</extensions>

<!-- #PBS -l nodes=vis001:ppn=5+vis002:ppn=5+comp014:ppn=2+comp015:ppn=2 -->
<!-- 15 total processes -->
<extensions>
    <resourceAllocationGroup>
        <hostName>vis001</hostName>
        <hostName>vis002</hostName>
        <cpuCount>10</cpuCount>
        <processesPerHost>5</processesPerHost>
    </resourceAllocationGroup>
    <resourceAllocationGroup>
        <hostName>comp014</hostName>
        <hostName>comp015</hostName>
        <cpusPerHost>2</cpusPerHost>
        <processCount>5</processCount>
    </resourceAllocationGroup>
</extensions>
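The conversion from resourceAllocationGroup elements to a nodes value can be modeled with a short Python sketch. The rules below are inferred solely from the three examples above and are not taken from the handler's source: an unspecified host count is assumed to be 1, a total cpuCount is assumed to divide evenly across the hosts of its group, and the process-related elements are assumed not to affect the nodes value. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def group_to_nodes(group):
    """Convert one <resourceAllocationGroup> element into a nodes fragment."""
    def text(tag):
        element = group.find(tag)
        return element.text.strip() if element is not None and element.text else None

    host_names = [h.text.strip() for h in group.findall("hostName")]
    host_type, host_count = text("hostType"), text("hostCount")
    if host_names and (host_type or host_count):
        raise ValueError("hostType/hostCount and hostName are mutually exclusive")

    # Assumption: a group without an explicit host count covers one host.
    n_hosts = len(host_names) if host_names else int(host_count or 1)

    if text("cpusPerHost") is not None:
        ppn = int(text("cpusPerHost"))
    elif text("cpuCount") is not None:
        # Assumption: the total CPU count is spread evenly over the hosts.
        ppn = int(text("cpuCount")) // n_hosts
    else:
        ppn = None
    suffix = f":ppn={ppn}" if ppn is not None else ""

    if host_names:
        specs = host_names
    elif host_type:
        specs = [f"{host_count}:{host_type}" if host_count else host_type]
    else:
        specs = [str(n_hosts)]
    return "+".join(spec + suffix for spec in specs)

def extensions_to_nodes(xml_text):
    """Join the fragments of all groups with '+', as described in the text."""
    root = ET.fromstring(xml_text)
    return "+".join(group_to_nodes(g) for g in root.findall("resourceAllocationGroup"))

second_example = """<extensions>
  <resourceAllocationGroup>
    <hostType>activemural</hostType><cpuCount>10</cpuCount>
  </resourceAllocationGroup>
  <resourceAllocationGroup>
    <hostType>ia64-compute</hostType><hostCount>5</hostCount><cpusPerHost>2</cpusPerHost>
  </resourceAllocationGroup>
</extensions>"""
print(extensions_to_nodes(second_example))  # activemural:ppn=10+5:ia64-compute:ppn=2
```

Under these assumptions the sketch reproduces the nodes values given in the comments of all three examples; inputs outside those patterns may be handled differently by the real handler.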

3. Customizing Extensions Support

Two Perl modules must be edited to customize extensions support.

  • The first is ExtensionsHandler.pm. This is where the WS-GRAM job description XML of the extensions element is parsed and entries are added or appended to the Perl job description hash.

  • The second module that needs to be edited is the particular resource manager adapter module that will use any new hash entries to either alter its behavior or create additional parameters in the resource manager job description.

3.1. Customizing ExtensionsHandler.pm

This module logs its activity to the log file specified in the logfile extension element. If you place this element before the extension elements for which you are adding support, you can inspect the specified log file to see what the handler is doing. You can add new logging lines with the $self->log() function, which takes a string and appends it to the log file with a prefix of "<date string> EXTENSIONS HANDLER:".

There are three main subroutines that are used to handle parsing events and process them accordingly:

  • Char()

  • StartTag()

  • EndTag()

More handlers can be specified for other specific events when creating the XML::Parser instance in new() (see the XML::Parser documentation for details).

The following list describes what the three main subroutines currently do. Modify the subroutines as necessary to achieve your specific goal.

  • Char() doesn't do anything but collect CDATA found between the current element's start and end tags. You can access the CDATA for the current element by using $self->{CDATA}.

  • StartTag() is responsible for collecting the attributes associated with the element. It also increments the counter which keeps track of the number of child elements to the current extension element, and pushes the current element name onto the @scope queue for later use.

  • EndTag() takes the CDATA collected by Char() and creates new Perl job description hash entries. This is most likely where you will need to do most of your work when adding support for new extension elements. Two useful variables are $currentScope and $parentScope. These indicate the element currently being parsed and its parent, respectively, which is useful for establishing a context from which to work. The @scope queue is popped at the end of this subroutine.
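The interplay of the three subroutines can be sketched with Python's xml.parsers.expat module, which exposes the same style of start, end, and character-data callbacks. This is a simplified model, not a port of ExtensionsHandler.pm: the class name, the trace list, and the rule that only non-empty direct children of extensions become entries are assumptions made for illustration.

```python
import xml.parsers.expat

class ExtensionsSketch:
    """Event-driven model of the Char()/StartTag()/EndTag() division of labor."""

    def __init__(self):
        self.description = {}  # stands in for the Perl job description hash
        self.scope = []        # element names from the root to the current element
        self.attrs = []        # attributes of each element currently open
        self.cdata = ""        # character data of the current element
        self.trace = []        # (current scope, parent scope) seen at each end tag

    def start_tag(self, name, attrs):
        # Keep only the unqualified name, then remember name and attributes.
        self.scope.append(name.split(":")[-1])
        self.attrs.append(attrs)
        self.cdata = ""

    def char(self, data):
        # Only collects character data; all decisions happen in end_tag().
        self.cdata += data

    def end_tag(self, name):
        current_scope = self.scope[-1]
        parent_scope = self.scope[-2] if len(self.scope) > 1 else None
        self.trace.append((current_scope, parent_scope))
        value = self.cdata.strip()
        if parent_scope == "extensions" and value:
            attrs = self.attrs[-1]
            entry = [attrs["name"], value] if "name" in attrs else value
            self.description.setdefault(current_scope, []).append(entry)
        self.cdata = ""
        self.scope.pop()
        self.attrs.pop()

    def parse(self, xml_text):
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = self.start_tag
        parser.EndElementHandler = self.end_tag
        parser.CharacterDataHandler = self.char
        parser.Parse(xml_text, True)
        return self.description

handler = ExtensionsSketch()
result = handler.parse(
    "<extensions><myparams>Hello</myparams><myparams>World!</myparams>"
    "<group><item>x</item></group></extensions>"
)
print(result)         # {'myparams': ['Hello', 'World!']}
print(handler.trace)
```

The trace shows how the current and parent scope change as elements close, which is the context an EndTag() customization would typically branch on when adding support for a new nested element such as the hypothetical group/item pair used here.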

3.2. Customizing the Adapter Module

There is not much to say here. Each adapter is different. Spend some time trying to understand what the adapter does and then make and test your changes. Any new hash entries you created in ExtensionsHandler.pm can be accessed by calling $description->entryname(), where 'entryname' is the name of the entry that was added. See the construct documentation above for more details.