Table of Contents
- 1. Introduction
- 2. New Functionality
- 3. Common Usage scenarios
- 4. Usage Scenarios for Jobs Specified in JDD
- 4.1. Submitting a simple job
- 4.2. Submitting a job with the contact string
- 4.3. Submitting a job with the job description
- 4.4. Specifying file staging in the job description
- 4.5. Specifying and handling custom job description extensions.
- 4.6. Per-job customization with job ID substitution variable.
- 4.7. Specifying and submitting a multijob
- 5. Usage Scenarios for jobs specified in JSDL
- 5.1. Submitting a simple job
- 5.2. Submitting a job with the contact string
- 5.3. Submitting a job with the job description
- 5.4. Specifying file staging in the job description
- 5.5. Specifying and handling custom job description extensions.
- 5.6. Per-job customization with job ID substitution variable.
- 5.7. Specifying and submitting a multijob
- 6. Job Description Extensions
- 7. Command-line tools
- 8. Graphical user interfaces
- 9. Troubleshooting
- 10. Known problems
- 11. Usage statistics collection by the Globus Alliance
GRAM services provide secure job submission to many types of job schedulers for users who have the right to access a job hosting resource in a Grid environment. A valid proxy is required for job submission. All GRAM job submission options are supported transparently through the embedded request document input: a job is started by submitting a client-side provided job description to the GRAM services. End users can make this submission with the GRAM command-line tools.
Jobs can be described in two different job description languages: the original language defined by WS-GRAM (JDD) or JSDL. The usage scenarios for each language are therefore covered in separate sections.
Jobs submitted to WS-GRAM can now be described in JSDL (in addition to the original JDD schema). A separate job factory service is deployed by the grid resource provider to allow job submissions for each job description type. Thus, a client must target the appropriate factory endpoint for their job description type.
Currently the following two flavors can be used inside the JSDL
<Application>-element:
- HPCProfileApplication
- POSIXApplication
Have a look at the Usage scenarios for jobs specified in JSDL for examples.
<FileSystem>-element
- DiskSpace
- FileSystemType
<PosixApplication>-element
- CoreDumpLimit
- CPUTimeLimit
- DataSegmentLimit
- FileSizeLimit
- GroupName
- LockedMemoryLimit
- MemoryLimit
- OpenDescriptorsLimit
- PipeSizeLimit
- ProcessCountLimit
- StackSizeLimit
- ThreadCountLimit
- VirtualMemoryLimit
- WallTimeLimit
<Resources>-element
- CandidateHosts
- CPUArchitecture
- ExclusiveExecution
- IndividualCPUSpeed
- IndividualCPUTime
- IndividualCPUCount
- IndividualDiskSpace
- IndividualNetworkBandwidth
- IndividualPhysicalMemory
- IndividualVirtualMemory
- OperatingSystem
- TotalCPUTime
- TotalCPUCount
- TotalPhysicalMemory
- TotalVirtualMemory
- TotalDiskSpace
<Resources>-element
- IndividualCPUSpeed
- IndividualDiskSpace
- IndividualNetworkBandwidth
- OperatingSystem
- TotalDiskSpace
Job description variables are special strings in a job description that are
replaced by the GRAM service with values that the client-side does not
a priori know. Job description variables can be used
in any path-like string or URL specified in the job description.
An example of a variable is
${GLOBUS_USER_HOME}, which represents the
path to the HOME directory on the file system where the job is executed.
The set of variables is fixed in the GRAM service implementation. This is
different from previous implementations of RSL substitutions in GT2 and GT3,
where a user could define a new variable for use inside a job description
document. This was done to preserve the simplicity of the job description
XML schema (relative to the GT3.2 RSL schema), which does not require a
specialized XML parser to serialize a job description document.
Details of the RSL variables are in the job description documentation.
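To make the substitution rule concrete, here is a minimal Python sketch of how a fixed set of variables could be expanded in a path-like string. It is not the GRAM service implementation: the `substitute` helper and the value assigned to the variable are hypothetical, and only the name `${GLOBUS_USER_HOME}` is taken from the text above.

```python
import re

# Hypothetical value; in GRAM the real value is resolved on the service side.
FIXED_VARIABLES = {
    "GLOBUS_USER_HOME": "/home/jdoe",
}

_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(text, variables=FIXED_VARIABLES):
    """Replace each ${NAME} in text; reject names outside the fixed set."""
    def expand(match):
        name = match.group(1)
        if name not in variables:
            # The set is fixed, so an unknown name is an error rather than
            # something the user can define in the job description.
            raise KeyError(f"unsupported job description variable: {name}")
        return variables[name]
    return _VARIABLE.sub(expand, text)


print(substitute("${GLOBUS_USER_HOME}/stdout"))
```

Raising an error for unknown names mirrors the point made above that users cannot define their own variables.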
A submission ID may be used in the GRAM protocol for reliability in the face of message faults or other transient errors in order to ensure that at most one instance of a job is executed, i.e. to prevent accidental duplication of jobs under rare circumstances with client retry on failure. By default, the globusrun-ws program will generate a submission ID (uuid). One can override this behavior by supplying a submission ID as a command line argument.
If a user is unsure whether a job was submitted successfully, they should resubmit using the same ID as was used for the previous attempt.
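The effect of reusing a submission ID can be illustrated with a small Python sketch. Everything here is hypothetical (the `FlakyJobService` class, its `submit` method, and the retry helper are stand-ins, not GRAM APIs); the sketch only shows why retrying with the same ID yields at most one job even when a reply is lost.

```python
import uuid


class FlakyJobService:
    """Hypothetical in-memory stand-in for a job factory; not a GRAM API."""

    def __init__(self, lost_replies=1):
        self.jobs = {}
        self.lost_replies = lost_replies

    def submit(self, submission_id, description):
        # A repeated submission ID maps to the existing job, never a new one.
        self.jobs.setdefault(submission_id, description)
        if self.lost_replies > 0:
            # Simulate a reply lost after the job was already accepted.
            self.lost_replies -= 1
            raise ConnectionError("reply lost")
        return submission_id


def submit_with_retry(service, description, attempts=3):
    """Generate one submission ID and reuse it for every retry."""
    submission_id = str(uuid.uuid4())
    for _ in range(attempts):
        try:
            return service.submit(submission_id, description)
        except ConnectionError:
            continue
    raise RuntimeError("submission failed after retries")


service = FlakyJobService()
job_id = submit_with_retry(service, "<job><executable>/bin/date</executable></job>")
print(len(service.jobs))
```

Had the client generated a fresh ID for the second attempt, the simulated service would have recorded two jobs.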
It is possible to specify in a job description that the job be put on hold when it reaches a chosen state (see GRAM Approach documentation for more information about the executable job state machine, and see the job description XML schema documentation for information about how to specify a held state). This is useful for instance when a GRAM client wishes to directly access output files written by the job (as opposed to waiting for the stage-out step to transfer files from the job host). The client would request that the file cleanup process be held until released, giving the client an opportunity to fetch all remaining/buffered data after the job completes but before the output files are deleted.
This is used by globusrun-ws in order to ensure client-side
streaming of remote files in batch mode.
GRAM4 services implement a WS Rendezvous mechanism to perform synchronization between job processes in a multiprocess job and between subjobs in a multijob. The job application can in fact register binary information, for instance process information or subjob information, and get notified when all the other processes or subjobs have registered their own information. This is for instance useful for parallel jobs which need to rendezvous at a "barrier" before proceeding with computations, in the case when no native application API is available to help do the rendezvous.
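As a rough illustration of the register-then-wait pattern described above, the following Python sketch models a rendezvous point inside a single process using threads. It does not use or imitate the WS Rendezvous interface; the `Rendezvous` class and its `register` method are hypothetical names chosen for this example.

```python
import threading


class Rendezvous:
    """Hypothetical in-process stand-in for a rendezvous point."""

    def __init__(self, expected):
        self.expected = expected
        self._registered = {}
        self._cond = threading.Condition()

    def register(self, name, data, timeout=5.0):
        """Register binary data, then block until everyone has registered."""
        with self._cond:
            self._registered[name] = data
            self._cond.notify_all()
            if not self._cond.wait_for(
                lambda: len(self._registered) >= self.expected, timeout
            ):
                raise TimeoutError("not all participants registered in time")
            # Every participant receives the complete set of registrations.
            return dict(self._registered)


def run_processes(count):
    """Simulate `count` job processes meeting at the barrier."""
    point = Rendezvous(expected=count)
    results = {}

    def worker(rank):
        results[rank] = point.register(f"process-{rank}", f"pid-{rank}".encode())

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_processes(3)` returns, for each simulated process, the data registered by all three, which is the information a parallel job would need before proceeding.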
In order to generate a valid proxy file, use the
grid-proxy-init
tool available under $GLOBUS_LOCATION/bin:
% bin/grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA.mymachine/OU=mymachine/CN=John Doe
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Tue Oct 26 01:33:42 2004
There are three different uses of delegated credentials:
- for use by the MEJS to create a remote user proxy
- for use by the MEJS to contact RFT
- for use by RFT to contact the GridFTP servers.
The EPRs to each of these are specified in three job description elements: jobCredentialEndpoint, stagingCredentialEndpoint, and transferCredentialEndpoint, respectively. Please see the "Job Description Schema Reference" section (under Semantics and syntax of domain-specific interface data) and the RFT transfer request schema documentation for more details about these elements.
The globusrun-ws client can either delegate
these credentials automatically for a particular job, or it can reuse
pre-delegated credentials (see next paragraph) through the use of command-line
arguments for specifying the credentials' EPR files. Please see the
GT 4.1.1 GRAM4 Command-line Reference for details on these command-line arguments.
It is possible to use the delegation command-line tools (see the Delegation Service Command Reference) to obtain and refresh delegated credentials in order to use them when submitting jobs to GRAM4. This, for instance, enables the submission of many jobs using a shared set of delegated credentials. This can significantly decrease the number of remote calls for a set of jobs, thus improving performance.
Unfortunately there is no option yet to print the list of local resource managers supported by a given GRAM service installation. Such information must currently be provided out of band to the user. The GRAM name of local resource managers for which GRAM support has been installed can be obtained by looking at the GRAM configuration on the GRAM server-side machine, as explained in "Local resource manager configuration" under Configuring GRAM4.
The GRAM name of the local resource manager can be used with the factory type option of the job submission command-line tool to specify which factory resource to use when submitting a job.
Use the globusrun-ws program to submit a
simple job without writing a job description document. With the -c argument,
a job description is generated on the assumption that the first argument is the executable
and the remaining ones are its arguments. For example:
% globusrun-ws -submit -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:4a92c06c-b371-11d9-9601-0002a5ad41e5
Termination time: 04/23/2005 20:58 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Confirm that the job worked by verifying the file was touched:
% ls -l ~/touched_it
-rw-r--r-- 1 smartin globdev 0 Apr 22 15:59 /home/smartin/touched_it
% date
Fri Apr 22 15:59:20 CDT 2005
Note: you did not tell globusrun-ws where to run your job, so the default of localhost was used.
Use globusrun-ws to submit the same touch job, but this time specify the contact string.
% globusrun-ws -submit -F https://lucky0.mcs.anl.gov:8443/wsrf/services/ManagedJobFactoryService -c /bin/touch touched_it
Submitting job...Done.
Job ID: uuid:3050ad64-b375-11d9-be11-0002a5ad41e5
Termination time: 04/23/2005 21:26 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Try the same job to a remote host. Type globusrun-ws -help to learn the details about the contact string.
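For scripted submissions, the two command lines above can be composed programmatically. The following Python sketch only builds the argument list and does not run anything; the helper names are hypothetical, and the URL layout and port 8443 are copied from the example above rather than from any authoritative rule.

```python
import shlex


def default_factory_url(host, port=8443):
    """Build a contact string shaped like the one in the example above."""
    return f"https://{host}:{port}/wsrf/services/ManagedJobFactoryService"


def globusrun_ws_argv(command, factory_url=None):
    """Compose a globusrun-ws command line for a simple -c submission."""
    argv = ["globusrun-ws", "-submit"]
    if factory_url is not None:
        # Without -F the text above says localhost is used by default.
        argv += ["-F", factory_url]
    # -c takes the executable followed by its arguments.
    argv += ["-c"] + list(command)
    return argv


argv = globusrun_ws_argv(
    ["/bin/touch", "touched_it"],
    factory_url=default_factory_url("lucky0.mcs.anl.gov"),
)
print(shlex.join(argv))
```

The resulting list could be handed to `subprocess.run` on a machine where globusrun-ws is installed and a valid proxy exists.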
The specification of a job to submit is to be written by the user in a job description XML file.
Here is an example of a simple job description:
<job>
<executable>/bin/echo</executable>
<argument>this is an example_string </argument>
<argument>Globus was here</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
Tell globusrun-ws to read the job description from a file, using the -f argument:
% bin/globusrun-ws -submit -f test_super_simple.xml
Submitting job...Done.
Job ID: uuid:c51fe35a-4fa3-11d9-9cfc-000874404099
Termination time: 12/17/2004 20:47 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Note the usage of the substitution variable ${GLOBUS_USER_HOME}
which resolves to the user home directory.
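A job description like the one above can also be generated rather than written by hand. This is a minimal Python sketch using the standard library; the `simple_job` helper is a hypothetical name, and it covers only the four element types used in the example.

```python
import xml.etree.ElementTree as ET


def simple_job(executable, arguments=(), stdout=None, stderr=None):
    """Build a <job> element with the same element order as the example."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    if stdout is not None:
        ET.SubElement(job, "stdout").text = stdout
    if stderr is not None:
        ET.SubElement(job, "stderr").text = stderr
    return job


job = simple_job(
    "/bin/echo",
    ["this is an example_string ", "Globus was here"],
    stdout="${GLOBUS_USER_HOME}/stdout",
    stderr="${GLOBUS_USER_HOME}/stderr",
)
# The variable is left untouched here; the GRAM service resolves it.
print(ET.tostring(job, encoding="unicode"))
```

Writing the serialized string to a file gives a document that could be passed to globusrun-ws with the -f argument.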
Here is an example with more job description parameters:
<?xml version="1.0" encoding="UTF-8"?>
<job>
<executable>/bin/echo</executable>
<directory>/tmp</directory>
<argument>12</argument>
<argument>abc</argument>
<argument>34</argument>
<argument>this is an example_string </argument>
<argument>Globus was here</argument>
<environment>
<name>PI</name>
<value>3.141</value>
</environment>
<stdin>/dev/null</stdin>
<stdout>stdout</stdout>
<stderr>stderr</stderr>
<count>2</count>
</job>
Note that in this example, a <directory> element specifies the current directory for the execution
of the command on the execution machine to be /tmp, and the standard output is
specified as the relative path stdout. The output is therefore written to /tmp/stdout:
% cat /tmp/stdout
12 abc 34 this is an example_string Globus was here
In order to do file staging one must add specific elements to the job description and delegate credentials appropriately (see Section 3.2, “Delegating credentials”). The file transfer directives follow the RFT syntax, which allows only for third-party transfers. Each file transfer must therefore specify a source URL and a destination URL. URLs are specified as GridFTP URLs (for remote files) or as file URLs (for files local to the service--these are converted internally to full GridFTP URLs by the service).
For instance, in the case of staging a file in, the source
URL would be a GridFTP URL (for instance
gsiftp://job.submitting.host:2811/tmp/mySourceFile
) resolving to a source document accessible on the file system
of the job submission machine (for instance /tmp/mySourceFile
). At run-time the Reliable File Transfer service used by the
MEJS on the remote machine would reliably fetch the remote file using the
GridFTP protocol and write it to the specified local file (for instance
file:///${GLOBUS_USER_HOME}/my_transfered_file,
which resolves to ~/my_transfered_file). Here
is what the stage-in directive would look like:
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://job.submitting.host:2811/tmp/mySourceFile</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/my_transfered_file</destinationUrl>
</transfer>
</fileStageIn>
Note: additional RFT-defined quality of service requirements can be specified for each transfer. See the RFT documentation for more information.
Here is an example job description with file stage-in and stage-out:
<job>
<executable>my_echo</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>Hello</argument>
<argument>World!</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
<destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
</transfer>
</fileStageOut>
<fileCleanUp>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/my_echo</file>
</deletion>
</fileCleanUp>
</job>
Note that the job description XML does not need to include a reference to the schema that describes its syntax. As a matter of fact it is possible to omit the namespace in the GRAM job description XML elements as well. The submission of this job to the GRAM services causes the following sequence of actions:
- The /bin/echo executable is transferred from the submission machine to the GRAM host file system. The destination location is the HOME directory of the user on behalf of whom the job is executed by the GRAM services (see <fileStageIn>).
- The transferred executable is used to print a test string (see <executable>, <directory> and the <argument> elements) on the standard output, which is redirected to a local file (see <stdout>).
- The standard output file is transferred to the submission machine (see <fileStageOut>).
- The file that was initially transferred during the stage-in phase is removed from the file system of the GRAM installation (see <fileCleanUp>).
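The sequence above can be traced by walking a job description and listing its phases in order. The following Python sketch is only an illustration of that ordering: the `plan` function is hypothetical, it assumes a description without XML namespaces, and it does not perform any transfers.

```python
import xml.etree.ElementTree as ET


def plan(job_xml):
    """List the stage-in, execute, stage-out and clean-up steps in order."""
    job = ET.fromstring(job_xml)
    steps = []
    for transfer in job.findall("fileStageIn/transfer"):
        steps.append(("stage-in", transfer.findtext("sourceUrl"),
                      transfer.findtext("destinationUrl")))
    steps.append(("execute", job.findtext("executable"), job.findtext("stdout")))
    for transfer in job.findall("fileStageOut/transfer"):
        steps.append(("stage-out", transfer.findtext("sourceUrl"),
                      transfer.findtext("destinationUrl")))
    for deletion in job.findall("fileCleanUp/deletion"):
        steps.append(("clean-up", deletion.findtext("file"), None))
    return steps


EXAMPLE = """
<job>
  <executable>my_echo</executable>
  <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
  <fileStageIn><transfer>
    <sourceUrl>gsiftp://job.submitting.host:2811/bin/echo</sourceUrl>
    <destinationUrl>file:///${GLOBUS_USER_HOME}/my_echo</destinationUrl>
  </transfer></fileStageIn>
  <fileStageOut><transfer>
    <sourceUrl>file:///${GLOBUS_USER_HOME}/stdout</sourceUrl>
    <destinationUrl>gsiftp://job.submitting.host:2811/tmp/stdout</destinationUrl>
  </transfer></fileStageOut>
  <fileCleanUp><deletion>
    <file>file:///${GLOBUS_USER_HOME}/my_echo</file>
  </deletion></fileCleanUp>
</job>
"""

for step in plan(EXAMPLE):
    print(step[0])
```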
Basic support is provided for specifying custom extensions to the job description. There are plans to improve the usability of this feature, but at this time it involves a bit of work.
Specifying the actual custom elements in the job description is trivial. Simply add any elements that you need between the beginning and ending
extensions tags at the bottom of the job
description as in the following basic example:
<job>
<executable>/home/user1/myapp</executable>
<extensions>
<mySillyData>
<florgsplat>on</florgsplat>
<tumblebuffel>off</tumblebuffel>
<headontight>no</headontight>
</mySillyData>
</extensions>
</job>
To handle this data, you will have to alter the appropriate Perl scheduler
script (e.g. fork.pm for the Fork scheduler) to parse the data returned
from the $description->extensions() sub.
To allow for customization of values, such as paths, on a per-job basis, a job description substitution variable named "GLOBUS_JOB_ID" can be used.
For example:
<job>
<executable>/bin/date</executable>
<stdout>/tmp/stdout.${GLOBUS_JOB_ID}</stdout>
<stderr>/tmp/stderr.${GLOBUS_JOB_ID}</stderr>
<fileStageOut>
<transfer>
<sourceUrl>file:///tmp/stdout.${GLOBUS_JOB_ID}</sourceUrl>
<destinationUrl>gsiftp://mymachine.mydomain.com/out.${GLOBUS_JOB_ID}</destinationUrl>
</transfer>
</fileStageOut>
</job>
The job description XML schema allows for specification of a multijob i.e. a job that is itself composed of several executable jobs, which we will refer to as subjobs (note: subjobs cannot be multijobs, so the structure is not recursive). This is useful for instance in order to bundle a group of jobs together and submit them as a whole to a remote GRAM installation.
Note that no relationship can be specified between the subjobs of a multijob. The subjobs are submitted to job factory services in their order of appearance in the multijob description.
Within a multijob description, each subjob description must come along with an endpoint for the factory to submit the subjob to. This enables the at-once submission of several jobs to different hosts. The factory to which the multijob is submitted acts as an intermediary tier between the client and the eventual executable job factories.
Here is an example of a multijob description:
<?xml version="1.0" encoding="UTF-8"?>
<multiJob xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
<factoryEndpoint>
<wsa:Address>
https://localhost:8443/wsrf/services/ManagedJobFactoryService
</wsa:Address>
<wsa:ReferenceProperties>
<gram:ResourceID>Multi</gram:ResourceID>
</wsa:ReferenceProperties>
</factoryEndpoint>
<directory>${GLOBUS_LOCATION}</directory>
<count>1</count>
<job>
<factoryEndpoint>
<wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
<wsa:ReferenceProperties>
<gram:ResourceID>Fork</gram:ResourceID>
</wsa:ReferenceProperties>
</factoryEndpoint>
<executable>/bin/date</executable>
<stdout>${GLOBUS_USER_HOME}/stdout.p1</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr.p1</stderr>
<count>2</count>
</job>
<job>
<factoryEndpoint>
<wsa:Address>https://localhost:8443/wsrf/services/ManagedJobFactoryService</wsa:Address>
<wsa:ReferenceProperties>
<gram:ResourceID>Fork</gram:ResourceID>
</wsa:ReferenceProperties>
</factoryEndpoint>
<executable>/bin/echo</executable>
<argument>Hello World!</argument>
<stdout>${GLOBUS_USER_HOME}/stdout.p2</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr.p2</stderr>
<count>1</count>
</job>
</multiJob>
Notes:
- The <ResourceID> element within the <factoryEndpoint> WS-Addressing endpoint structures must be qualified with the appropriate GRAM namespace.
- Apart from the factoryEndpoint element, all elements at the enclosing multijob level act as defaults for the subjob parameters, in this example <directory> and <count>.
- The default <count> value is overridden in the subjob descriptions.
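The defaulting rule in these notes can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the multijob service logic: the `effective_subjobs` function is hypothetical, and the input is a reduced multijob description without namespaces or factory endpoints.

```python
import xml.etree.ElementTree as ET


def _parameters(element):
    """Collect child element texts by tag, skipping structural elements."""
    collected = {}
    for child in element:
        if child.tag not in ("job", "factoryEndpoint"):
            collected.setdefault(child.tag, []).append(child.text)
    return collected


def effective_subjobs(multijob_xml):
    """Apply multijob-level values as defaults to each subjob."""
    root = ET.fromstring(multijob_xml)
    defaults = _parameters(root)
    # A value given in the subjob replaces the multijob-level default.
    return [{**defaults, **_parameters(job)} for job in root.findall("job")]


EXAMPLE = """
<multiJob>
  <directory>/tmp</directory>
  <count>1</count>
  <job><executable>/bin/date</executable><count>2</count></job>
  <job><executable>/bin/echo</executable><argument>Hello World!</argument></job>
</multiJob>
"""

for subjob in effective_subjobs(EXAMPLE):
    print(subjob["executable"], subjob["directory"], subjob["count"])
```

Values are kept as lists so that repeatable elements such as `<argument>` are not lost.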
In order to submit a multijob description, use a job submission command-line tool (see the GT 4.1.1 GRAM4 Command-line Reference)
and specify the Managed Job Factory resource to be Multi.
For instance, submitting the multijob description above using globusrun-ws, we obtain:
% bin/globusrun-ws -submit -f test_multi.xml
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:bd9cd634-4fc0-11d9-9ee1-000874404099
Termination time: 12/18/2004 00:15 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
A multijob resource is created by the factory and exposes a set of WSRF resource properties different than the resource properties of an executable job. The state machine of a multijob is also different since the multijob represents the overall execution of all the executable jobs it is composed of.
The default client for job submissions to WS-GRAM, globusrun-ws, does not yet support JSDL. Until it does, use the unofficial Java client GlobusRun for job submission as described below. To be able to use GlobusRun, the environment variable CLASSPATH must contain the path to some of the Java archives provided by the GT.
To make sure your CLASSPATH includes the necessary paths, execute the following command (in bash):
source $GLOBUS_LOCATION/etc/globus-devel-env.sh
or in case you're using csh or tcsh:
source $GLOBUS_LOCATION/etc/globus-devel-env.csh
The specification of a job to submit is to be written by the user in a JSDL compliant job description file. Here is an example of a simple job description:
<?xml version="1.0" encoding="UTF-8"?>
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
<jsdl:JobDescription>
<jsdl:Application>
<JobName>Test Job</JobName>
<Description>Simple Job without staging</Description>
<JobAnnotation>With Posix application element</JobAnnotation>
<JobProject>Test project</JobProject>
<jsdl-posix:POSIXApplication >
<jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
<jsdl-posix:Argument>hello, world!</jsdl-posix:Argument>
<jsdl-posix:Output>${GLOBUS_USER_HOME}/stdout</jsdl-posix:Output>
<jsdl-posix:Error>${GLOBUS_USER_HOME}/stderr</jsdl-posix:Error>
</jsdl-posix:POSIXApplication>
</jsdl:Application>
</jsdl:JobDescription>
</jsdl:JobDefinition>
The above job description could also use HPCProfileApplication instead of POSIXApplication. To do so, replace the POSIXApplication element in the above job description with the following:
<jsdl-hpcp:HPCProfileApplication>
<jsdl-hpcp:Executable>/bin/echo</jsdl-hpcp:Executable>
<jsdl-hpcp:Argument>hello, world!</jsdl-hpcp:Argument>
<jsdl-hpcp:Output>${GLOBUS_USER_HOME}/stdout</jsdl-hpcp:Output>
<jsdl-hpcp:Error>${GLOBUS_USER_HOME}/stderr</jsdl-hpcp:Error>
</jsdl-hpcp:HPCProfileApplication>
Note the usage of the substitution variable
${GLOBUS_USER_HOME} which resolves
to the user home directory.
Submit the job with the following command, after filling in values for host and port:
% java -DGLOBUS_LOCATION=${GLOBUS_LOCATION} \
org.globus.exec.client.GlobusRun \
-factory https://<host>:<port>/wsrf/services/v4_2/ManagedJobFactoryService \
-file <jsdl-file>
In order to do file staging one must add specific elements to the job description and delegate credentials appropriately. The file transfer directives follow the JSDL syntax. Each file transfer must therefore specify either a source URL or a target URL, depending on whether the file should be staged in before job execution or staged out after job execution. URLs are specified as GridFTP URLs.
For instance, in the case of staging a file in, the source URL would be a GridFTP URL (for instance gsiftp://job.submitting.host:2811/tmp/mySourceFile ) resolving to a source document accessible on the file system of the job submission machine (for instance /tmp/mySourceFile ). At run-time the Reliable File Transfer service used by the MEJS on the remote machine would reliably fetch the remote file using the GridFTP protocol and write it to the specified local file (for instance /tmp/mySourceFile).
Here is what the stage-in directive would look like:
<jsdl:DataStaging>
<jsdl:FileName>/tmp/my_SourceFile</jsdl:FileName>
<jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
<jsdl:Source>
<jsdl:URI>gsiftp://job.submitting.host:2811/tmp/mySourceFile</jsdl:URI>
</jsdl:Source>
</jsdl:DataStaging>
Here is an example job description with file stage-in and stage-out:
<?xml version="1.0" encoding="UTF-8"?>
<!-- simple job, file staging in and out -->
<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
xmlns:jsdl-rft="http://www.globus.org/gram/2006/12/jsdl-rft"
xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
<jsdl:JobDescription>
<jsdl:Application>
<JobName>Test Job</JobName>
<Description>Job with staging in and out</Description>
<JobAnnotation>With Posix application element</JobAnnotation>
<JobProject>Test project</JobProject>
<jsdl-posix:POSIXApplication >
<jsdl-posix:Executable>/tmp/my_echo</jsdl-posix:Executable>
<jsdl-posix:Argument>hello, world!</jsdl-posix:Argument>
<jsdl-posix:Output>/tmp/stdout</jsdl-posix:Output>
<jsdl-posix:Error>/tmp/stderr</jsdl-posix:Error>
</jsdl-posix:POSIXApplication>
</jsdl:Application>
<jsdl:DataStaging>
<jsdl:FileName>/tmp/my_echo</jsdl:FileName>
<jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
<jsdl:Source>
<jsdl:URI>gsiftp://127.0.0.1:2811/bin/echo</jsdl:URI>
</jsdl:Source>
</jsdl:DataStaging>
<jsdl:DataStaging>
<jsdl:FileName>/tmp/stdout</jsdl:FileName>
<jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
<jsdl:Target>
<jsdl:URI>gsiftp://127.0.0.1:2811/tmp/stdout_staged</jsdl:URI>
</jsdl:Target>
</jsdl:DataStaging>
<jsdl:DataStaging>
<jsdl:FileName>/tmp/stderr</jsdl:FileName>
<jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
<jsdl:Target>
<jsdl:URI>gsiftp://127.0.0.1:2811/tmp/stderr_staged</jsdl:URI>
</jsdl:Target>
</jsdl:DataStaging>
</jsdl:JobDescription>
</jsdl:JobDefinition>
The above job description could also use HPCProfileApplication instead of POSIXApplication. To do so, replace the POSIXApplication element in the above job description with the following:
<jsdl-hpcp:HPCProfileApplication>
<jsdl-hpcp:Executable>/tmp/my_echo</jsdl-hpcp:Executable>
<jsdl-hpcp:Argument>hello, world!</jsdl-hpcp:Argument>
<jsdl-hpcp:Output>/tmp/stdout</jsdl-hpcp:Output>
<jsdl-hpcp:Error>/tmp/stderr</jsdl-hpcp:Error>
</jsdl-hpcp:HPCProfileApplication>
Submit the job with the following command:
% java -DGLOBUS_LOCATION=${GLOBUS_LOCATION} \
org.globus.exec.client.GlobusRun \
-factory https://<host>:<port>/wsrf/services/v4_2/ManagedJobFactoryService \
-deleg limited \
-file <jsdl-file>
Note: Job Description Extensions are currently not supported in JSDL job descriptions.
To allow adding features to WS-GRAM while avoiding breaking compatibility
between versions, an extensibility point was included in the job description
schema. This appears as the <extensions>
element at the bottom of a job description document. Starting with version 4.2.0
of the Globus Toolkit, WS-GRAM will support both a number of specific extensions
as well as generic constructs that can be used for passing custom values to the
resource manager/scheduler adapter Perl modules.
The following are specific supported extensions to the WS-GRAM job description schema. They do not require any modification of the resource manager/scheduler adapter Perl modules.
The multiAuthzSubject extension is used
to specify the credential subject/DN to be used by the multijob being
created for authorizing the subjob factory service. If specified, all
subjob factory services must be using the same credential. This is
meant to address the case where a set of test containers is deployed
which are all running under a single user's proxy credentials as opposed
to individual host credentials.
For example, if the subjob factory services are using a credential with the subject "/DC=org/DC=doegrids/OU=People/CN=John Doh 123456", the subjob should be submitted as follows:
<multiJob>
...
<job>
...
</job>
<job>
...
</job>
<extensions>
<multiAuthzSubject>/DC=org/DC=doegrids/OU=People/CN=John Doh 123456</multiAuthzSubject>
</extensions>
</multiJob>
Node selection constraints in PBS can be specified in two ways: generally, using a construct intended to eventually apply to all resource managers that support node selection, or explicitly, by specifying a simple string element. The former will be more portable, but the latter will appeal to those familiar with specifying node constraints for PBS jobs.
To specify PBS node selection constraints explicitly, one can simply
construct a single, simple string extension element named
nodes with a value that conforms to the
#PBS -l nodes=... PBS job description
directive. The Globus::GRAM::ExtensionsHandler module will make this
available to the PBS adapter script by invoking
$description->{nodes}. The updated PBS
adapter package checks for this value and will create a directive in the
PBS job description using this value.
To use the generic construct for specifying node selection constraints,
use the
resourceAllocationGroup element:
<extensions>
<resourceAllocationGroup>
<!-- Optionally select hosts by type and number... -->
<hostType>...</hostType>
<hostCount>...</hostCount>
<!-- *OR* by host names -->
<hostName>...</hostName>
<hostName>...</hostName>
. . .
<!-- With a total CPU count for this group... -->
<cpuCount>...</cpuCount>
<!-- *OR* an explicit number of CPUs per node... -->
<cpusPerNode>...</cpusPerNode>
. . .
<!-- And a total process count for this group... -->
<processCount>...</processCount>
<!-- *OR* an explicit number of processes per node... -->
<processesPerNode>...</processesPerNode>
</resourceAllocationGroup>
</extensions>
Extension elements specified according to the above pseudo-schema will
be converted to an appropriate nodes
parameter which will be treated as if an explicit
nodes extension element were specified.
Multiple resourceAllocationGroup
elements may be specified. This will simply append the constraints to
the nodes parameter with a '+'
separator. Note that one cannot specify both hostType/hostCount and
hostName elements. Similarly, one cannot specify both processCount and
processesPerNode elements.
Here are some examples of using
resourceAllocationGroup:
<!-- #PBS -l nodes=1:ppn=10 -->
<!-- 10 processes -->
<extensions>
<resourceAllocationGroup>
<cpuCount>10</cpuCount>
<processCount>10</processCount>
</resourceAllocationGroup>
</extensions>
<!-- #PBS -l nodes=activemural:ppn=10+5:ia64-compute:ppn=2 -->
<!-- 1 process (process default) -->
<extensions>
<resourceAllocationGroup>
<hostType>activemural</hostType>
<cpuCount>10</cpuCount>
</resourceAllocationGroup>
<resourceAllocationGroup>
<hostType>ia64-compute</hostType>
<hostCount>5</hostCount>
<cpusPerHost>2</cpusPerHost>
</resourceAllocationGroup>
</extensions>
<!-- #PBS -l nodes=vis001:ppn=5+vis002:ppn=5+comp014:ppn=2+comp015:ppn=2 -->
<!-- 15 total processes -->
<extensions>
<resourceAllocationGroup>
<hostName>vis001</hostName>
<hostName>vis002</hostName>
<cpuCount>10</cpuCount>
<processesPerHost>5</processesPerHost>
</resourceAllocationGroup>
<resourceAllocationGroup>
<hostName>comp014</hostName>
<hostName>comp015</hostName>
<cpusPerHost>2</cpusPerHost>
<processCount>5</processCount>
</resourceAllocationGroup>
</extensions>
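The mapping from resourceAllocationGroup elements to a nodes value is not spelled out in full here, but the three examples above suggest a pattern. The following Python sketch reproduces the `#PBS -l nodes=...` values shown in those examples. Its rules are inferred from the examples only and are an assumption, not the actual ExtensionsHandler logic: it uses the cpusPerHost element name as it appears in the examples, divides cpuCount evenly across hosts, and ignores the process-related elements.

```python
import xml.etree.ElementTree as ET


def _int(group, tag):
    text = group.findtext(tag)
    return int(text) if text is not None else None


def group_to_nodes(group):
    """Convert one resourceAllocationGroup element (inferred rules)."""
    host_names = [host.text for host in group.findall("hostName")]
    host_type = group.findtext("hostType")
    host_count = _int(group, "hostCount")
    cpu_count = _int(group, "cpuCount")
    ppn = _int(group, "cpusPerHost")
    if host_names and (host_type or host_count):
        raise ValueError("hostType/hostCount and hostName are mutually exclusive")
    n_hosts = len(host_names) or host_count or 1
    if ppn is None and cpu_count is not None:
        # Assumption: a total CPU count is split evenly across the hosts.
        ppn = cpu_count // n_hosts
    suffix = f":ppn={ppn}" if ppn is not None else ""
    if host_names:
        return "+".join(name + suffix for name in host_names)
    parts = []
    if host_count is not None or host_type is None:
        parts.append(str(n_hosts))
    if host_type:
        parts.append(host_type)
    return ":".join(parts) + suffix


def extensions_to_nodes(extensions_xml):
    """Join the per-group values with '+', as described in the text."""
    root = ET.fromstring(extensions_xml)
    return "+".join(group_to_nodes(g) for g in root.findall("resourceAllocationGroup"))
```

For instance, an `<extensions>` element containing a single group with `<cpuCount>10</cpuCount>` yields `1:ppn=10`, matching the first example.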
The following are general constructs that are supported by the ExtensionsHandler.pm Perl module. Although no modifications to ExtensionsHandler.pm are required, you will need to edit the appropriate resource manager/scheduler adapter Perl module as necessary to affect the submission of jobs to the local resource manager/batch scheduler.
The WS-GRAM job description schema includes a section for extending the job description with custom elements. To make sense of this in the resource manager adapter Perl scripts, a Perl module named Globus::GRAM::ExtensionsHandler is provided to turn these custom elements into parameters that the adapter scripts can understand.
It should be noted that although only non-GRAM XML elements are allowed
in the <extensions> element of the
job description, the extensions handler makes no distinction based on
namespace. Thus, <foo:myparam> and
<bar:myparam> will both be treated as
just <myparam>.
Familiarity with the adapter scripts is assumed in the following subsections.
Simple string extension elements are converted into single-element arrays with the name of the unqualified tag name of the extension element as the array's key name in the Perl job description hash. Simple string extension elements can be considered a special case of the string array construct in the next section.
For example, adding the following element to the
<extensions> element of the job
description:
<extensions>
<myparam>yahoo!</myparam>
</extensions>
will cause $description->myparam()
to return the following value:
'yahoo!'
String arrays are a simple iteration of the simple string element construct. If you specify more than one simple string element in the job description, these will be assembled into a multi-element array with the unqualified tag name of the extension elements as the array's key name in the Perl job description hash.
For example:
<extensions>
<myparams>Hello</myparams>
<myparams>World!</myparams>
</extensions>
will cause $description->myparams() to return the following value:
[ 'Hello', 'World!' ]
Name/value extension elements can be thought of as string arrays with an XML attribute 'name'. This will cause the creation of a two-dimensional array with the unqualified extension element tag name as the name of the array in the Perl job description hash.
For example:
<extensions>
<myvars name="pi">3.14159</myvars>
<myvars name="mole">6.022 x 10^23</myvars>
</extensions>
will cause $description->myvars() to return the following value:
[ [ 'pi', '3.14159'], ['mole', '6.022 x 10^23'] ]
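The three constructs above can be combined in a single job description. The following is a minimal sketch, not a definitive recipe: the helper name write_extensions_job and the /bin/true executable are illustrative choices, and the submit command in the trailing comment assumes the -submit and -f options of globusrun-ws as described in the Command-line Reference. The adapter script still has to be edited to act on the resulting values.

```shell
#!/bin/sh
# Hypothetical helper: write a job description that uses all three
# extension constructs (simple string, string array, name/value).
# $1 is the output file name.
write_extensions_job() {
    cat > "$1" <<'EOF'
<job>
    <executable>/bin/true</executable>
    <extensions>
        <myparam>yahoo!</myparam>
        <myparams>Hello</myparams>
        <myparams>World!</myparams>
        <myvars name="pi">3.14159</myvars>
        <myvars name="mole">6.022 x 10^23</myvars>
    </extensions>
</job>
EOF
}

# Example usage (not executed here):
#   write_extensions_job job.xml
#   globusrun-ws -submit -f job.xml
```

With this description, the adapter script would see the values shown in the three examples above through $description->myparam(), $description->myparams(), and $description->myvars().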
See the System Administrator's Guide section on Configuring GRAM4 for information on how to customize the resource manager/scheduler adapter Perl modules.
Note: The default client globusrun-ws cannot yet be used for jobs described in JSDL. For JSDL jobs, please use the unofficial Java client GlobusRun. See "Usage Scenarios for jobs specified in JSDL" for information about how to submit jobs using the Java client.
Please see the GT 4.1.1 GRAM4 Command-line Reference.
The job manager detected an invalid script response
- Check for a restrictive umask. When the service writes the native scheduler job description to a file, an overly restrictive umask will cause the permissions on the file to be such that the submission script run through sudo as the user cannot read the file (bug #2655).
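One way to check whether a given umask is too restrictive is to create a probe file under it and inspect the resulting permissions. The following is a rough sketch: the function names are hypothetical, and it assumes the job user needs the "other" read bit on the file. At some sites the group bits may be what matters instead.

```shell
#!/bin/sh
# Hypothetical helper: print the permission string (e.g. -rw-r--r--)
# of a file created under the umask given as $1.
mode_under_umask() {
    (
        umask "$1"
        tmp=$(mktemp -d) || exit 1
        : > "$tmp/probe"                  # new file gets mode 666 masked by the umask
        ls -l "$tmp/probe" | cut -c1-10   # keep only the type and mode characters
        rm -rf "$tmp"
    )
}

# Succeeds if a file created under umask $1 is readable by "other".
other_can_read() {
    case "$(mode_under_umask "$1")" in
        ???????r??) return 0 ;;   # 8th character is the other-read bit
        *) return 1 ;;
    esac
}

# Example usage: check the umask of the current shell.
#   other_can_read "$(umask)" || echo "umask may be too restrictive"
```

Run the check as the account that runs the container, since that is the account whose umask applies when the service writes the file.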
Fork jobs work fine, but submitting PBS jobs with globusrun-ws hangs at "Current job state: Unsubmitted"
- Make sure the log_path in $GLOBUS_LOCATION/etc/globus-pbs.conf points to locally accessible scheduler logs that are readable by the user running the container. The Scheduler Event Generator (SEG) will not work without local scheduler logs to monitor. This can also apply to other resource managers, but is most commonly seen with PBS.
- If the SEG configuration looks sane, try running the SEG tests. They are located in $GLOBUS_LOCATION/test/globus_scheduler_event_generator_*_test/. If Fork jobs work, you only need to run the PBS test. Run each test by going to the associated directory and running ./TESTS.pl. If any tests fail, report this to the gram-dev@globus.org mailing list.
- If the SEG tests succeed, the next step is to figure out the ID assigned by PBS to the queued job. Enable GRAM debug logging by uncommenting the appropriate line in the $GLOBUS_LOCATION/container-log4j.properties configuration file. Restart the container, run a PBS job, and search the container log for a line that contains "Received local job ID" to obtain the local job ID.
- Once you have the local job ID you can check the latest PBS logs pointed to by the value of "log_path" in $GLOBUS_LOCATION/etc/globus-pbs.conf to make sure the job's status is being logged. If the status is not being logged, check the documentation for your flavor of PBS to see if there's any further configuration that needs to be done to enable job status logging. For example, PBS Pro requires a sufficient -e <bitmask> option added to the pbs_server command line to enable enough logging to satisfy the SEG.
- If the correct status is being logged, try running the SEG manually to see if it is reading the log file properly. The general form of the SEG command line is as follows:
$GLOBUS_LOCATION/libexec/globus-scheduler-event-generator -s pbs -t <timestamp>
The timestamp is in seconds since the epoch and dictates how far back in the log history the SEG should scan for job status events. The command should hang after dumping some status data to stdout. If no data appears, change the timestamp to an earlier time. If nothing ever appears, report this to the gram-user@globus.org mailing list.
- If running the SEG manually succeeds, try running another job and make sure the job process actually finishes and PBS has logged the correct status before giving up and cancelling globusrun-ws. If things are still not working, report your problem and exactly what you have tried to remedy the situation to the gram-user@globus.org mailing list.
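Two of the steps above lend themselves to small shell helpers: searching the container log for the local job ID, and computing a timestamp for the SEG -t option. The following is a sketch under stated assumptions. Both function names are hypothetical, the location of the container log is site-specific and must be supplied by you, and date +%s is assumed to be available (it is on GNU and BSD systems, although POSIX does not require it).

```shell
#!/bin/sh
# Hypothetical helper: print the container log lines that record the
# local job ID. $1 is the path to the container log (site-specific).
find_local_job_id_lines() {
    grep "Received local job ID" "$1"
}

# Hypothetical helper: print a seconds-since-epoch timestamp that lies
# $1 minutes in the past, suitable as the value of the SEG -t option.
seg_timestamp_minutes_ago() {
    now=$(date +%s)
    echo $(( now - $1 * 60 ))
}

# Example usage (not executed here):
#   find_local_job_id_lines /path/to/container.log
#   "$GLOBUS_LOCATION/libexec/globus-scheduler-event-generator" \
#       -s pbs -t "$(seg_timestamp_minutes_ago 60)"
```

If the SEG prints nothing, calling seg_timestamp_minutes_ago with a larger argument moves the scan window further back in the log history.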
When restarting the container, I get the following error: Error getting delegation resource
- Most likely this is simply a case of the delegated credential expiring. Either refresh it for the affected job or destroy the job resource.
The following usage statistics are sent by default in a UDP packet (in addition to the GRAM component code, packet version, timestamp, and source IP address) at the end of each job (i.e. when Done or Failed state is entered).
- job creation timestamp (helps determine the rate at which jobs are submitted)
- scheduler type (Fork, PBS, LSF, Condor, etc...)
- jobCredentialEndpoint present in RSL flag (to determine if server-side user proxies are being used)
- fileStageIn present in RSL flag (to determine if the staging in of files is used)
- fileStageOut present in RSL flag (to determine if the staging out of files is used)
- fileCleanUp present in RSL flag (to determine if the cleaning up of files is used)
- CleanUp-Hold requested flag (to determine if streaming is being used)
- job type (Single, Multiple, MPI, or Condor)
- GT2 error code if job failed (to determine common scheduler script errors users experience)
- fault class name if job failed (to determine general classes of common faults users experience)
If you wish to disable this feature, please see the "Usage Statistics Configuration" section of Configuring Java WS Core for instructions.
Also, please see our policy statement on the collection of usage statistics.
Note: