Introduction
This guide contains configuration information for system administrators working with GRAM5. It describes procedures typically performed by system administrators, including GRAM5 software installation, configuration, testing, and debugging. Readers should be familiar with the GRAM5 Key Concepts to understand the motivation for and interaction between the various deployed components.
Table of Contents
- 1. GRAM5 Installation
- 2. Common Administrative Tasks
- 3. Configuring GRAM5
- 4. Audit Logging
- 5. Security Considerations
- 6. Troubleshooting
- 7. Admin Tools
- globus-gatekeeper - Authorize and execute a grid service on behalf of a user
- globus-gatekeeper-admin - Manage globus-gatekeeper services
- globus-gram-audit - Load GRAM4 and GRAM5 audit records into a database
- globus-job-manager - Execute and monitor jobs
- globus-scheduler-event-generator - Process LRM events into a common format for use with GRAM
- globus-scheduler-event-generator-admin - Manage SEG modules
- 8. Usage statistics collection by the Globus Alliance
- Glossary
- Index
The Globus Toolkit provides GRAM5: a service to submit, monitor, and cancel jobs on Grid computing resources. In GRAM5, a job consists of a computation and, optionally, file transfer and management operations related to the computation. Some users, particularly interactive ones, benefit from accessing output data files as the job is running. Monitoring consists of querying for and/or subscribing to status information, such as job state changes.
GRAM5 relies on GSI C mechanisms for security, and interacts with GridFTP services to stage files to compute resources. Please see their respective Administrator's guides for information about installing, configuring, and managing those systems. In particular, you must understand the tasks in Installing GT and install the basic GRAM5 packages, and complete the tasks in Basic Security Configuration.
Before installing GRAM5 on a server, you'll first need to plan what Local Resource Managers (LRMs) you want GRAM5 to interface with, what LRM you want to have as your default GRAM5 service, and whether you'll be using the globus-scheduler-event-generator to process LRM events.
GRAM5 requires a few services to be running to function: the Gatekeeper and the Scheduler Event Generator (SEG). The supported way to run these services is via the System-V style init scripts provided with the GRAM5-related packages. The gatekeeper daemon can also be configured to start via an internet superserver such as inetd or xinetd, though that is beyond the scope of this document. The globus-scheduler-event-generator cannot be run in that way.
GRAM5 in GT 5.2.1 supports the following LRM adapters: Condor, PBS, GridEngine, and Fork. These LRM adapters translate GRAM5 job specifications into LRM-specific job descriptions and scripts to run them, as well as interfaces to the LRM to determine job termination status.
If you're not familiar with the supported LRMs, you might want to start with the Fork one to get familiar with how GRAM5 works. This adapter simply forks the job and runs it on the GRAM5 node. You can then install one of the other LRMs and its adapter to provide batch or high-throughput job scheduling.
GRAM5 can be configured to support multiple LRMs on the same service machine. In that case, one LRM is typically configured as the default LRM, which is used when a client uses a shortened version of a GRAM5 resource name. A common configuration is to configure a batch system interface as the default, and provide the jobmanager-fork service as well for simple jobs, such as creating directories or staging data.
GRAM5 has two ways of determining job state transitions: polling the LRM and using the Scheduler Event Generator (SEG) service. When polling, each user's globus-job-manager will periodically execute an LRM-specific command to determine the state of each job. On systems with many users, or with users submitting a large number of jobs, this can cause significant resource use on the GRAM5 service machine. Instead, the GRAM5 service can be configured (on a per-LRM basis) to use the globus-scheduler-event-generator service to more efficiently process LRM state changes.
Note: Not all LRM adapters provide an interface to the globus-scheduler-event-generator, and some require LRM-specific configuration to work properly. This is described in more detail later in this guide.
There are several LRM adapters included in GT 5.2.1. For some, there are -setup-poll and -setup-seg packages, which install the adapter and the configuration file needed to obtain job status via polling or via the globus-scheduler-event-generator program.
There are three ways to get LRM adapters: as RPM packages, as Debian packages, and from the source installer. These installation methods are described in Installing GT 5.2.1.
LRM adapter packages included in the GT 5.2.1 release are:
Table 1.1. GRAM5 LRM Adapters
| LRM Adapter | Poll Package | SEG Package | Installer Target |
|---|---|---|---|
| fork | globus-gram-job-manager-fork-setup-poll | globus-gram-job-manager-fork-setup-seg [a] | globus_gram_job_manager_fork |
| pbs | globus-gram-job-manager-pbs-setup-poll [b] | globus-gram-job-manager-pbs-setup-seg | globus_gram_job_manager_pbs |
| Condor | N/A | globus-gram-job-manager-condor [c] | globus_gram_job_manager_condor |
| SGE | globus-gram-job-manager-sge-setup-poll | globus-gram-job-manager-sge-setup-seg | globus_gram_job_manager_sge |

[a] Not recommended for production use.

[b] This module does not work with torque 3.0.1-5 in Fedora 15 because of a bug causing qstat to hang. This bug is mentioned on the TORQUE user list and is fixed in newer versions.

[c] This LRM uses a SEG-like mechanism included in the globus-job-manager program, but not the globus-scheduler-event-generator service.
There are several tools provided with GT 5.2.1 to manage GRAM5, as well as OS-specific tools to start and stop some of the services. There are tools to manage user authorization, which services are enabled, which scheduler event generator modules are enabled, and to test the globus-gatekeeper service.
Before a user may interact with the GRAM5 service to submit jobs, he or she must be authorized to use the service. To authorize a user, a GRAM5 administrator must add the user's credential name and local account mapping to the grid-mapfile (by default /etc/grid-security/grid-mapfile). Entries can be added and removed using the grid-mapfile-add-entry and grid-mapfile-delete-entry tools. For more information, see the GSI C manual.
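As a rough illustration of the mapping, the sketch below looks up the local account for a subject name in a grid-mapfile. It assumes the usual entry layout of a double-quoted subject name followed by a space and the local account; the function name `gridmap_lookup`, the subject name, and the account `jdoe` are made up for the example. It is a read-only helper, not a replacement for the grid-mapfile-add-entry and grid-mapfile-delete-entry tools.

```shell
# Print the local account mapped to a subject name in a grid-mapfile.
# Assumed entry layout: "SUBJECT NAME" local_account
gridmap_lookup() {
    mapfile_path=$1
    subject=$2
    prefix="\"$subject\" "
    while IFS= read -r line; do
        case $line in
            "$prefix"*)
                # Strip the quoted subject, leaving the account field.
                printf '%s\n' "${line#"$prefix"}"
                return 0
                ;;
        esac
    done < "$mapfile_path"
    return 1
}

# Try it against a throwaway file with a hypothetical entry.
demo_map=$(mktemp)
printf '%s\n' '"/O=Grid/OU=Example/CN=Jane Doe" jdoe' > "$demo_map"
gridmap_lookup "$demo_map" "/O=Grid/OU=Example/CN=Jane Doe"
rm -f "$demo_map"
```

The function returns a non-zero status when no entry matches, which is the situation in which a job request would be refused at the authorization step.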
To run the service, the globus-gatekeeper and, if applicable to your configuration, the globus-scheduler-event-generator services must be running on your system. The packages for these services include init scripts and configuration files which can be used to configure, start, and stop the services.
The globus-gatekeeper and globus-scheduler-event-generator init scripts handle the following actions: start, stop, status, restart, condrestart, try-restart, reload, and force-reload. The globus-scheduler-event-generator script also accepts another optional parameter to start or stop a particular globus-scheduler-event-generator module. If the second parameter is not present, then all services will be acted on.
If you installed using Debian packaging tools, then the services will automatically be started upon installation. To start or stop the service, use the command invoke-rc.d with the service name and action.
If you installed using the RPM packaging tools, then the services will be installed but not enabled by default. To enable the services to start at boot time, use the commands:

    # chkconfig globus-gatekeeper on
    # chkconfig globus-scheduler-event-generator on

To start or stop the services, use the service command to run the init scripts with the service name and action and optional globus-scheduler-event-generator module.
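The command shapes described above can be sketched with a small helper that only prints the command line for review and does not run it. The function name `gram_service_command` and its `rpm`/`deb` argument are inventions for this example; the tools (`service`, `invoke-rc.d`), service names, and actions are the ones named in the text.

```shell
# Print (do not run) the command that would act on a GRAM5 service.
# $1: packaging type (rpm or deb)
# $2: service name
# $3: init script action (start, stop, status, ...)
# $4: optional globus-scheduler-event-generator module
gram_service_command() {
    case $1 in
        rpm) tool=service ;;
        deb) tool=invoke-rc.d ;;
        *) echo "unknown packaging type: $1" >&2; return 1 ;;
    esac
    printf '%s %s %s%s\n' "$tool" "$2" "$3" "${4:+ $4}"
}

gram_service_command rpm globus-gatekeeper start
gram_service_command rpm globus-scheduler-event-generator stop pbs
gram_service_command deb globus-gatekeeper restart
```

Run the printed commands as root once they look right for your installation.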
The GRAM5 packages described in Section 3, “Installing LRM Adapter Packages” will automatically register themselves with the globus-gatekeeper and globus-scheduler-event-generator services. The first LRM adapter installed will be configured as the default Job Manager service. To list the installed services, change the default, or disable a service, use the globus-gatekeeper-admin(8) tool.
Example 2.1. Using globus-gatekeeper-admin to set the default service
This example shows how to use the globus-gatekeeper-admin tool to list the available services and then choose one as the default:

    # globus-gatekeeper-admin -l
    jobmanager-condor [ENABLED]
    jobmanager-fork-poll [ENABLED]
    jobmanager-fork [ALIAS to jobmanager-fork-poll]
    # globus-gatekeeper-admin -e jobmanager-condor -n jobmanager
    # globus-gatekeeper-admin -l
    jobmanager-condor [ENABLED]
    jobmanager-fork-poll [ENABLED]
    jobmanager [ALIAS to jobmanager-condor]
    jobmanager-fork [ALIAS to jobmanager-fork-poll]

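For scripted checks, the listing format shown in Example 2.1 can be filtered to report which service the `jobmanager` alias points to. This is a minimal sketch that assumes the output looks exactly like the example (one service per line, aliases written as `[ALIAS to NAME]`); the function name `default_jobmanager` is made up here.

```shell
# Read "globus-gatekeeper-admin -l" style output on standard input and
# print the service that the "jobmanager" alias points to, if any.
default_jobmanager() {
    awk '$1 == "jobmanager" && $2 == "[ALIAS" && $3 == "to" {
        sub(/\]$/, "", $4)   # drop the closing bracket
        print $4
    }'
}

# Sample listing copied from Example 2.1.
printf '%s\n' \
    'jobmanager-condor [ENABLED]' \
    'jobmanager-fork-poll [ENABLED]' \
    'jobmanager [ALIAS to jobmanager-condor]' \
    'jobmanager-fork [ALIAS to jobmanager-fork-poll]' | default_jobmanager
```

On a live system the same filter would be fed with `globus-gatekeeper-admin -l | default_jobmanager`; empty output means no default alias is listed.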
The -setup-seg packages described in Section 3, “Installing LRM Adapter Packages” will automatically register themselves
with the globus-scheduler-event-generator service. To disable a module from running when the globus-scheduler-event-generator service is started,
use the globus-scheduler-event-generator-admin(8) tool.
Example 2.2. Using globus-scheduler-event-generator-admin to disable a SEG module
This example shows how to stop the pbs globus-scheduler-event-generator module and disable it so it will not restart when the system is rebooted:

    # /etc/init.d/globus-scheduler-event-generator stop pbs
    Stopped globus-scheduler-event-generator [ OK ]
    # globus-scheduler-event-generator-admin -d pbs
    # globus-scheduler-event-generator-admin -l
    pbs [DISABLED]

GRAM5 is designed to be usable by default without any manual configuration. However, there are many ways to customize a GRAM5 installation to better interact with site policies, filesystem layouts, LRM interactions, logging, and auditing. In addition to GRAM5-specific configuration, see Configuring GSI for information about configuring GSI security.
The globus-gatekeeper has many configuration options related to network configuration, security, logging, service path, and nice level. This configuration is located in:
Table 3.1. Gatekeeper Configuration Path
| Installation Type | Configuration Path |
|---|---|
| RPM | /etc/sysconfig/globus-gatekeeper |
| Debian Package | /etc/default/globus-gatekeeper |
| Source Installer | |
The following configuration variables are available in the globus-gatekeeper configuration file:
- GLOBUS_GATEKEEPER_PORT: Gatekeeper Service Port. If not set, the globus-gatekeeper uses the default of 2119.
- GLOBUS_LOCATION: Globus Installation Path. If not set, the globus-gatekeeper uses the paths defined at package compilation time.
- GLOBUS_GATEKEEPER_LOG: Gatekeeper Log Filename. If not set, the globus-gatekeeper logs to syslog using the GRAM-gatekeeper log identification prefix. The default configuration value is /var/log/globus-gatekeeper.log.
- GLOBUS_GATEKEEPER_GRID_SERVICES: Path to grid service definitions. If not set, the globus-gatekeeper uses the default of /etc/grid-services.
- GLOBUS_GATEKEEPER_GRIDMAP: Path to grid-mapfile for authorization. If not set, the globus-gatekeeper uses the default of /etc/grid-security/grid-mapfile.
- GLOBUS_GATEKEEPER_CERT_DIR: Path to a trusted certificate root directory. If not set, the globus-gatekeeper uses the default of /etc/grid-security/certificates.
- GLOBUS_GATEKEEPER_CERT_FILE: Path to the gatekeeper's certificate. If not set, the globus-gatekeeper uses the default of /etc/grid-security/hostcert.pem.
- GLOBUS_GATEKEEPER_KEY_FILE: Path to the gatekeeper's private key. If not set, the globus-gatekeeper uses the default of /etc/grid-security/hostkey.pem.
- GLOBUS_GATEKEEPER_KERBEROS_ENABLED: Flag indicating whether or not the globus-gatekeeper will use a kerberos GSSAPI implementation instead of the GSI GSSAPI implementation (untested).
- GLOBUS_GATEKEEPER_KMAP: Path to the KMAP authentication module (untested).
- GLOBUS_GATEKEEPER_PIDFILE: Path to a file where the globus-gatekeeper's process ID is written. If not set, globus-gatekeeper uses /var/run/globus-gatekeeper.pid.
- GLOBUS_GATEKEEPER_NICE_LEVEL: Process nice level for globus-gatekeeper and globus-job-manager processes. If not set, the default system process nice level is used.
After modifying the configuration file, restart the globus-gatekeeper using the methods described in Section 2, “Starting and Stopping GRAM5 services”.
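A gatekeeper configuration might look like the fragment below. It is only a sketch: it assumes the file uses shell-style `NAME=value` assignments, which is the usual convention for files under /etc/sysconfig and /etc/default but is not stated above. Every value shown is simply the default named in the variable list, so the fragment changes nothing by itself.

```shell
# Sketch of a gatekeeper configuration file (assumed NAME=value syntax).
# All values are the documented defaults, shown for illustration only.
GLOBUS_GATEKEEPER_PORT=2119
GLOBUS_GATEKEEPER_LOG=/var/log/globus-gatekeeper.log
GLOBUS_GATEKEEPER_GRID_SERVICES=/etc/grid-services
GLOBUS_GATEKEEPER_GRIDMAP=/etc/grid-security/grid-mapfile
GLOBUS_GATEKEEPER_CERT_DIR=/etc/grid-security/certificates
GLOBUS_GATEKEEPER_CERT_FILE=/etc/grid-security/hostcert.pem
GLOBUS_GATEKEEPER_KEY_FILE=/etc/grid-security/hostkey.pem
GLOBUS_GATEKEEPER_PIDFILE=/var/run/globus-gatekeeper.pid
```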
The globus-scheduler-event-generator has several configuration options related to filesystem paths. This configuration is located in:
Table 3.2. Scheduler Event Generator Configuration Path
| Installation Type | Configuration Path |
|---|---|
| RPM | /etc/sysconfig/globus-scheduler-event-generator |
| Debian Package | /etc/default/globus-scheduler-event-generator |
| Source Installer | |
The following configuration variables are available in the globus-scheduler-event-generator configuration file:
- GLOBUS_SEG_PIDFMT: Scheduler Event Generator PID file path format. Modify this to be the location where the globus-scheduler-event-generator writes its process IDs (one per configured LRM). The format is a printf format string with one %s to be replaced by the LRM name. By default, globus-scheduler-event-generator uses /var/run/globus-scheduler-event-generator-%s.pid.
- GLOBUS_SEG_LOGFMT: Scheduler Event Generator Log path format. Modify this to be the location where globus-scheduler-event-generator writes its event logs. The format is a printf format string with one %s to be replaced by the LRM name. By default, globus-scheduler-event-generator uses /var/lib/globus/globus-seg-%s. If you modify this value, you'll need to also update the LRM configuration file to look for the log file in the new location.
- GLOBUS_SEG_NICE_LEVEL: Process nice level for globus-scheduler-event-generator processes. If not set, the default system process nice level is used.
After modifying the configuration file, restart the globus-scheduler-event-generator using the methods described in Section 2, “Starting and Stopping GRAM5 services”.
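To see how the `%s` placeholder in the two path formats is filled in, the sketch below expands the default formats for one LRM using the shell's own `printf`. The helper name `seg_path` is made up for the example, and `pbs` is just a sample LRM name.

```shell
# Default path formats from the variable list above.
GLOBUS_SEG_PIDFMT=/var/run/globus-scheduler-event-generator-%s.pid
GLOBUS_SEG_LOGFMT=/var/lib/globus/globus-seg-%s

# Expand a printf-style path format for one LRM.
# $1: format string containing a single %s
# $2: LRM name
seg_path() {
    # The format string is deliberately used as the printf format.
    printf "$1\n" "$2"
}

seg_path "$GLOBUS_SEG_PIDFMT" pbs
seg_path "$GLOBUS_SEG_LOGFMT" pbs
```

With the default formats, the two calls print /var/run/globus-scheduler-event-generator-pbs.pid and /var/lib/globus/globus-seg-pbs.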
The globus-job-manager process is started by the globus-gatekeeper and uses the configuration defined in the service entry for the resource name. By default, these service entries use a common configuration file for most job manager features. This configuration is located in:
Table 3.3. Job Manager Configuration Path
| Installation Type | Configuration Path |
|---|---|
| RPM | /etc/globus/globus-gram-job-manager.conf |
| Debian Package | /etc/globus/globus-gram-job-manager.conf |
| Source Installer | |
This configuration file is used to construct the command-line options for the globus-job-manager program. Thus, all of the options described in globus-job-manager(8) may be used.
From an administrator's perspective, the most important job manager configuration options are likely the ones related to logging and auditing. The default GRAM5 configuration puts logs in /var/log/globus/gram_USERNAME.log, with logging enabled at the FATAL and ERROR levels. To enable more fine-grained logging, add the option -log-levels LEVELS to /etc/globus/globus-gram-job-manager.conf. The value for LEVELS is a set of log levels joined by the | character. The available log levels are:
Table 3.4. GRAM5 Log Levels
| Level | Meaning | Default Behavior |
|---|---|---|
| FATAL | Problems which cause the job manager to terminate prematurely. | Enabled |
| ERROR | Problems which cause a job or operation to fail. | Enabled |
| WARN | Conditions which cause minor problems with job execution or monitoring. | Disabled |
| INFO | Major events in the lifetime of the job manager and its jobs. | Disabled |
| DEBUG | Minor events in the lifetime of jobs. | Disabled |
| TRACE | Job processing details. | Disabled |
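The LEVELS value is easy to mistype, so a small helper that validates level names against Table 3.4 and joins them with `|` can be handy. The function name `gram_log_levels_option` is hypothetical, and the sketch only prints the option text; whether the value needs quoting inside the configuration file is not covered here, so check the existing entry in your file before pasting.

```shell
# Build a "-log-levels LEVELS" option from the level names given as
# arguments, rejecting anything that is not listed in Table 3.4.
gram_log_levels_option() {
    levels=
    for level in "$@"; do
        case $level in
            FATAL|ERROR|WARN|INFO|DEBUG|TRACE) ;;
            *) echo "unknown log level: $level" >&2; return 1 ;;
        esac
        # Join the accepted levels with the | character.
        levels="${levels:+$levels|}$level"
    done
    [ -n "$levels" ] || return 1
    printf '%s %s\n' '-log-levels' "$levels"
}

gram_log_levels_option FATAL ERROR WARN INFO
```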
In RPM or Debian package installs, these logs will be configured to be rotated via
logrotate. See /etc/logrotate.d/globus-job-manager for details on the default log
rotation configuration.
There are also a few configuration options related to the TCP ports that the Job Manager uses. This port configuration is useful when dealing with firewalls that restrict incoming or outgoing ports. To restrict incoming ports (those that the Job Manager listens on), add the command-line option -globus-tcp-port-range to the Job Manager configuration file like this:

    -globus-tcp-port-range MIN-PORT,MAX-PORT

Where MIN-PORT is the minimum TCP port number the Job Manager will listen on and MAX-PORT is the maximum TCP port number the Job Manager will listen on.

Similarly, to restrict the outgoing port numbers that the Job Manager connects from, use the command-line option -globus-tcp-source-range, like this:

    -globus-tcp-source-range MIN-PORT,MAX-PORT

Where MIN-PORT is the minimum outgoing TCP port number the Job Manager will use and MAX-PORT is the maximum outgoing TCP port number the Job Manager will use.
For more information about Globus and firewalls, see Section 4, “Firewall configuration”.
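A quick sanity check on the port numbers before editing the Job Manager configuration can save a restart cycle. The sketch below prints one of the two port-range options after checking that both ports are numeric, lie between 1 and 65535, and are in ascending order. The function name `gram_port_range_option` and the sample range 40000 to 41000 are made up for illustration.

```shell
# Print a port-range option line for the Job Manager configuration.
# $1: -globus-tcp-port-range or -globus-tcp-source-range
# $2: MIN-PORT
# $3: MAX-PORT
gram_port_range_option() {
    case $1 in
        -globus-tcp-port-range|-globus-tcp-source-range) ;;
        *) echo "unexpected option: $1" >&2; return 1 ;;
    esac
    for port in "$2" "$3"; do
        case $port in
            ''|*[!0-9]*) echo "not a port number: $port" >&2; return 1 ;;
        esac
    done
    if [ "$2" -lt 1 ] || [ "$3" -gt 65535 ] || [ "$2" -gt "$3" ]; then
        echo "invalid port range: $2,$3" >&2
        return 1
    fi
    printf '%s %s,%s\n' "$1" "$2" "$3"
}

gram_port_range_option -globus-tcp-port-range 40000 41000
gram_port_range_option -globus-tcp-source-range 40000 41000
```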
Each LRM adapter has its own configuration file which can help customize the adapter
to the site configuration. Some LRMs use non-standard programs to launch parallel or MPI
jobs, and some might want to provide queue or project validation to make it easier to
translate job failures into problems that can be described by GRAM5. All of the LRM
adapter configuration files consist of simple variable="value" pairs,
with a leading # starting a comment until end-of-line.
Generally, the GRAM5 LRM configuration files are located in the globus configuration
directory, with each configuration file named by the LRM name (fork,
condor, pbs, sge). The
following table contains the paths to these configurations:
Table 3.5. LRM Adapter Configuration Path
| Installation Type | Configuration Path |
|---|---|
| RPM | /etc/globus/globus- |
| Debian Package | /etc/globus/globus- |
| Source Installer | |
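Because the adapter configuration files are plain `variable="value"` pairs, a setting can be read back with a short filter when checking a deployment. The following is a minimal sketch: the function name `lrm_conf_get` is made up, the sample file is written to a temporary location, and the `log_path` value in it is a hypothetical path. Only lines that begin with `#` are treated as comments.

```shell
# Print the value of one variable from an LRM adapter configuration file.
# $1: configuration file
# $2: variable name
lrm_conf_get() {
    conf_file=$1
    name=$2
    sed -n \
        -e '/^[[:space:]]*#/d' \
        -e 's/^[[:space:]]*'"$name"'="\(.*\)"[[:space:]]*$/\1/p' \
        "$conf_file"
}

# Throwaway sample in the documented variable="value" layout.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
# PBS adapter settings (illustrative values only)
log_path="/var/spool/torque/server_logs"
cluster="yes"
cpu_per_node="1"
EOF

lrm_conf_get "$demo_conf" cluster
lrm_conf_get "$demo_conf" log_path
rm -f "$demo_conf"
```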
The globus-fork.conf configuration file can define the
following configuration parameters:
- log_path: Path to the globus-fork.log file used by the globus-fork-starter and fork SEG module.
- mpiexec, mpirun: Path to mpiexec and mpirun for parallel jobs which use MPI. By default, these are not configured. The LRM adapter will use mpiexec over mpirun if both are defined.
- softenv_dir: Path to an installation of softenv, which is used on some systems to manage application environment variables.
The globus-condor.conf configuration file can define the
following configuration parameters:
- condor_os: Custom value for the OpSys requirement for condor jobs. If not specified, the system-wide default will be used.
- condor_arch: Custom value for the Arch requirement for condor jobs. If not specified, the system-wide default will be used.
- condor_submit, condor_rm: Path to the condor commands that the LRM adapter uses. These are usually determined when the LRM adapter is compiled if the commands are in the PATH.
- condor_config: Value of the CONDOR_CONFIG environment variable, which might be needed to use condor in some cases.
- check_vanilla_files: Enable checking if executable, standard input, and directory are valid paths for vanilla universe jobs. This can detect some types of errors before submitting jobs to condor, but only if the filesystems between the condor submit host and condor execution hosts are equivalent. In other cases, this may cause unnecessary job failures.
- condor_mpi_script: Path to a script to launch MPI jobs on condor.
The globus-pbs.conf configuration file can define the
following configuration parameters:
- log_path: Path to PBS server_logs directory. The PBS SEG module parses these logs to generate LRM events.
- pbs_default: Name of the PBS server node, if not the same as the GRAM service node.
- mpiexec, mpirun: Path to mpiexec and mpirun for parallel jobs which use MPI. By default these are not configured. The LRM adapter will use mpiexec over mpirun if both are defined.
- qsub, qstat, qdel: Path to the LRM-specific command to submit, check, and delete PBS jobs. These are usually determined when the LRM adapter is compiled if they are in the PATH.
- cluster: If this value is set to yes, then the LRM adapter will attempt to use a remote shell command to launch multiple instances of the executable on different nodes, as defined by the file named by the PBS_NODEFILE environment variable.
- remote_shell: Remote shell command to launch processes on different nodes when cluster is set to yes.
- cpu_per_node: Number of instances of the executable to launch per allocated node.
- softenv_dir: Path to an installation of softenv which is used on some systems to manage application environment variables.
The globus-sge.conf configuration file can define the
following configuration parameters:
- sge_root: Root location of the GridEngine installation. If this is set to undefined, then the LRM adapter will try to determine it from the globus-job-manager environment, or if not there, the contents of the file named by the sge_config configuration parameter.
- sge_cell: Name of the GridEngine cell to interact with. If this is set to undefined, then the LRM adapter will try to determine it from the globus-job-manager environment, or if not there, the contents of the file named by the sge_config configuration parameter.
- sge_config: Path to a file which defines the SGE_ROOT and the SGE_CELL environment variables.
- log_path: Path to GridEngine reporting file. This value is used by the SGE SEG module. If this is used, GridEngine must be configured to write a reporting file and not load reporting data into an ARCo database.
- qsub, qstat, qdel, qconf: Path to the LRM-specific command to submit, check, and delete GridEngine jobs. These are usually determined when the LRM adapter is compiled if they are in the PATH.
- sun_mprun, mpirun: Path to mprun and mpirun for parallel jobs which use MPI. By default these are not configured. The LRM adapter will use mprun over mpirun if both are defined.
- default_pe: Default parallel environment to submit parallel jobs to. If this is not set, then clients must use the parallel_environment RSL attribute to choose one.
- validate_pes: If this value is set to yes, then the LRM adapter will verify that the parallel_environment RSL attribute value matches one of the parallel environments supported by this GridEngine service.
- available_pes: If this value is defined, use it as a list of parallel environments supported by this GridEngine deployment for validation when validate_pes is set to yes. If validation is being done but this value is not set, then the LRM adapter will query the GridEngine service to determine available parallel environments at startup.
- default_queue: Default queue to use if the job description does not name one.
- validate_queues: If this value is set to yes, then the LRM adapter will verify that the queue RSL attribute value matches one of the queues supported by this GridEngine service.
- available_queues: If this value is defined, use it as a list of queues supported by this GridEngine deployment for validation when validate_queues is set to yes. If validation is being done but this value is not set, then the LRM adapter will query the GridEngine service to determine available queues at startup.
The globus-gram-audit configuration defines information about the database to load the GRAM5 audit records into. This configuration is located in:
Table 3.6. GRAM Audit Configuration Path
| Installation Type | Configuration Path |
|---|---|
| RPM | /etc/globus/gram-audit.conf |
| Debian Package | /etc/globus/gram-audit.conf |
| Source Installer | |
This configuration file contains the following attributes. Each attribute is defined by an ATTRIBUTE:VALUE pair.
Table 3.7. Audit Configuration Attributes
| Attribute Name | Values | Default |
|---|---|---|
| DRIVER | The name of the Perl 5 DBI driver for the database to be used. The supported drivers for this program are | SQLite |
| DATABASE | The DBI data source specification to contact the audit database. | dbname=/var/gram_audit_database/gram_audit.db |
| USERNAME | Username to authenticate as to the database | |
| PASSWORD | Password to use to authenticate with the database | |
| AUDITVERSION | Version of the audit database table schemas to use. May be 1 or 1TG for this version of the software. | 1 |
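The ATTRIBUTE:VALUE layout can be checked with a short filter that splits each line at the first colon only, so values that contain further colons stay intact. This is a sketch under the assumption of one pair per line; the function name `audit_conf_get` is made up, and the sample file simply repeats the defaults from Table 3.7.

```shell
# Print the value of one attribute from an ATTRIBUTE:VALUE file.
# $1: configuration file
# $2: attribute name
audit_conf_get() {
    conf_file=$1
    attribute=$2
    # Everything after the first colon is the value.
    sed -n 's/^'"$attribute"':\(.*\)$/\1/p' "$conf_file"
}

# Throwaway sample using the defaults from Table 3.7.
demo_audit=$(mktemp)
cat > "$demo_audit" <<'EOF'
DRIVER:SQLite
DATABASE:dbname=/var/gram_audit_database/gram_audit.db
AUDITVERSION:1
EOF

audit_conf_get "$demo_audit" DRIVER
audit_conf_get "$demo_audit" DATABASE
rm -f "$demo_audit"
```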
GRAM5 includes mechanisms to provide access to audit and accounting information associated with jobs that GRAM5 submits to a local resource manager (LRM) such as Torque, GridEngine, or Condor.
In some scenarios, it is desirable to get general information about the usage of the underlying LRM, such as:
What kinds of jobs were submitted via GRAM?
How long did the processing of a job take?
How many jobs were submitted by user X?
The following three use cases give a better overview of the meaning and purpose of auditing and accounting:
Group Access: A grid resource provider allows a remote service (e.g., a gateway or portal) to submit jobs on behalf of multiple users. The grid resource provider only obtains information about the identity of the remote submitting service and thus does not know the identity of the users for which the grid jobs are submitted. This group access is allowed under the condition that the remote service stores audit information so that, if and when needed, the grid resource provider can request and obtain information to track a specific job back to an individual user.
Query Job Accounting: A client that submits a job needs to be able to obtain, after the job has completed, information about the resources consumed by that job. In portal and gateway environments where many users submit many jobs against a single allocation, this per-job accounting information is needed soon after the job completes so that client-side accounting can be updated. Accounting information is sensitive and thus should only be released to authorized parties.
Auditing: In a distributed, multi-site environment, it can be necessary to investigate various forms of suspected intrusion and abuse. In such cases, we may need to access an audit trail of the actions performed by a service. When accessing this audit trail, it will frequently be important to be able to relate specific actions to the user.
Audit logging in GRAM5 is done when a job completes.
While audit and accounting records may be generated and stored by different entities in different contexts, we make the following assumptions in this chapter:
| | Audit Records | Accounting Records |
|---|---|---|
| Generated by: | GRAM service | LRM to which the GRAM service submits jobs |
| Stored in: | Database, indexed by GJID | LRM, indexed by JID |
| Data that is stored: | See list below. | May include all information about the duration and resource-usage of a job |
The audit record of each job contains the following data:
job_grid_id: String representation of the resource EPR
local_job_id: Job/process id generated by the scheduler
subject_name: Distinguished name (DN) of the user
username: Local username
idempotence_id: Job id generated on the client-side
creation_time: Date when the job resource is created
queued_time: Date when the job is submitted to the scheduler
stage_in_grid_id: String representation of the stageIn-EPR (RFT)
stage_out_grid_id: String representation of the stageOut-EPR (RFT)
clean_up_grid_id: String representation of the cleanUp-EPR (RFT)
globus_toolkit_version: Version of the server-side GT
resource_manager_type: Type of the resource manager (Fork, Condor, ...)
job_description: Complete job description document
success_flag: Flag that shows whether the job failed or finished successfully
finished_flag: Flag that shows whether the job is already fully processed or still in progress
gateway_user: TeraGrid identity of the user who submitted the job.
The rest of this chapter focuses on how to configure GRAM5 to enable audit logging.
Audit logging is turned off by default. To enable GRAM5 audit logging in the job manager, add the command-line option -audit-directory AUDIT-DIRECTORY to the job manager configuration in one of the following locations:
- $GLOBUS_LOCATION/etc/globus-job-manager.conf to enable it for all job manager services
- $GLOBUS_LOCATION/etc/grid-services/LRM_SERVICE_NAME to enable it for a particular job manager service for a particular LRM
The globus-gram-audit program reads GRAM5 audit records and loads those records into a SQL database. This program is available as part of the globus_gram_job_manager_auditing package. It must be configured by installing and running the globus_gram_job_manager_auditing_setup_scripts setup package via gpt-postinstall. This setup script creates the $GLOBUS_LOCATION/etc/globus-job-manager-audit.conf configuration file described below and creates the database tables needed by the audit system.
The globus-gram-audit program supports three database systems: MySQL, PostgreSQL, and SQLite.
GRAM5 runs different parts of itself under different privilege levels. The globus-gatekeeper runs as root, and uses its root privilege to access the host's private key. It uses the grid map file to map Grid Certificates to local user ids and then uses the setuid() function to change to that user and execute the globus-job-manager program.
The globus-job-manager program runs as a local non-root account. It receives a delegated limited proxy certificate from the GRAM5 client which it uses to access Grid storage resources via GridFTP and to authenticate job signals (such as client cancel requests), and send job state callbacks to registered clients. This proxy is generally short-lived, and is automatically removed by the job manager when the job completes.
The globus-job-manager program uses a publicly-writable directory for job state files. This directory has the sticky bit set, so users may not remove other users' files. Each file is named by a UUID, so it should be unique.
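The sticky-bit arrangement can be verified with the shell's `-k` test. The sketch below demonstrates it on a scratch directory rather than on the real job state directory, whose location is not given here; the function name `has_sticky_bit` is made up for the example.

```shell
# Succeed when the given path is a directory with the sticky bit set.
has_sticky_bit() {
    [ -d "$1" ] && [ -k "$1" ]
}

# Demonstrate on a scratch directory: world-writable plus sticky bit.
scratch_dir=$(mktemp -d)
chmod 1777 "$scratch_dir"
if has_sticky_bit "$scratch_dir"; then
    echo "sticky bit is set"
fi
rmdir "$scratch_dir"
```

Pointing the same check at the job state directory of your installation confirms that users cannot remove each other's state files.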
GRAM requires a host certificate and private key in order for the globus-gatekeeper service to run. These are typically located in /etc/grid-security/hostcert.pem and /etc/grid-security/hostkey.pem, but the path is configurable in the gatekeeper configuration file. The key must be protected by file permissions allowing only the root user to read it.
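One way to check the permission requirement is to confirm that the key file grants nothing to group or others. The sketch below does this by inspecting the mode string printed by `ls -ld`; the function name `key_is_owner_only` is made up, the demonstration uses a scratch file, and ownership by root still has to be confirmed separately (for example by reading the owner column of the same `ls -ld` output).

```shell
# Succeed when the file grants no permissions to group or others.
key_is_owner_only() {
    # Characters 5-10 of the ls mode string are the group/other bits.
    mode=$(ls -ld -- "$1" | cut -c5-10)
    [ "$mode" = "------" ]
}

# Demonstrate on a scratch file instead of the real host key.
scratch_key=$(mktemp)
chmod 400 "$scratch_key"
if key_is_owner_only "$scratch_key"; then
    echo "permissions are restricted to the owner"
fi
rm -f "$scratch_key"
```

On a GRAM5 host the function would be called with the path configured as the gatekeeper's private key, typically /etc/grid-security/hostkey.pem.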
GRAM also (by default) uses a grid-mapfile to authorize Grid
users as local users. This file is typically located in /etc/grid-security/grid-mapfile, but is configurable in the gatekeeper
configuration file.
Problems in either of these configurations will show up in the gatekeeper log
described below. See the GSI documentation for more
detailed information about obtaining and installing host certificates and maintaining a
grid-mapfile.
GRAM relies on the globus-gatekeeper program and (in some cases) the globus-scheduler-event-generator program to process jobs. If the former is not running, job requests will fail with a "connection refused" error. If the latter is not running, GRAM jobs will appear to "hang" in the PENDING state.
The globus-gatekeeper is typically started via an init script
installed in /etc/init.d/globus-gatekeeper. The command /etc/init.d/globus-gatekeeper status will indicate whether the service is
running. See Section 2, “Starting and Stopping GRAM5 services” for more information about starting and stopping the globus-gatekeeper program.
If the globus-gatekeeper service fails to start, the command globus-gatekeeper -test will print information describing some types of configuration problems.
The globus-scheduler-event-generator is typically started via an
init script installed in /etc/init.d/globus-scheduler-event-generator. It is only needed when the
LRM-specific "setup-seg" package is installed. The command /etc/init.d/globus-scheduler-event-generator status will indicate whether
the service is running. See Section 2, “Starting and Stopping GRAM5 services” for more information about starting
and stopping the globus-scheduler-event-generator program.
The globus-gatekeeper program starts the globus-job-manager service with different command-line parameters depending on the LRM being used. Use the command globus-gatekeeper-admin -l to list which LRMs the gatekeeper is configured to use.
The globus-job-manager-script.pl is the interface between the GRAM job manager process and the LRM adapter. The command /usr/share/globus/globus-job-manager-script.pl -h will print the list of available adapters.
% /usr/share/globus/globus-job-manager-script.pl -h
USAGE: /usr/share/globus/globus-job-manager-script.pl -m MANAGER -f FILE -c COMMAND
Installed managers: condor fork
The globus-scheduler-event-generator also uses an LRM-specific module to generate scheduler events for GRAM to reduce the amount of resources GRAM uses on the machine where it runs. To determine which LRMs are installed and configured, use the command globus-scheduler-event-generator-admin -l.
% globus-scheduler-event-generator-admin -l
fork [DISABLED]
If any of these do not show the LRM you are trying to use, install the relevant packages related to that LRM and restart the GRAM services. See the GRAM Administrator's Guide for more information about starting and stopping the GRAM services.
All GRAM5 LRM adapters have a configuration file for site customizations, such as queue names, paths to executables needed to interface with the LRM, etc. Check that the values in these files are correct. These files are described in Section 4, “LRM Adapter Configuration”.
The /var/log/globus-gatekeeper.log file contains information
about service requests from clients, and will be useful when diagnosing service startup
failures, authentication failures, and authorization failures.
GRAM uses GSI to authenticate client job requests. If there is a problem with the GSI configuration for your host, or a client is trying to connect with a certificate signed by a CA your host does not trust, the job request will fail. This will show up in the log as a "GSS authentication failure". See the GSI Administrator's Guide for information about diagnosing authentication failures.
After authentication is complete, GRAM maps the Grid identity to a local user prior to starting the globus-job-manager process. If this fails, an error will show up in the log as "globus_gss_assist_gridmap() failed authorization". See the GSI Administrator's Guide for information about managing gridmap files.
A per-user job manager log is typically located in
/var/log/globus/gram_$USERNAME.log.
This log contains information from the job manager as it attempts
to execute GRAM jobs via a local resource manager. The logs can be
fairly verbose. Sometimes looking for log entries near those
containing the string level=ERROR will show more information
about what caused a particular failure.
Once you've found an error in the log, it is generally useful to find log entries
related to the job which hit that error. There are two job IDs associated with
each job: a GRAM-specific ID and an LRM-specific ID. To determine the
GRAM ID associated with a job, look for the attribute
gramid in the log message. Then search for all
other log messages which contain that gramid value to
get a better picture of what the job manager is doing. To determine the
LRM-specific ID, look for a message at TRACE level with the
matching GRAM ID found above and a response value matching
GRAM_SCRIPT_JOB_ID:LRM-ID. You
can then follow the state of the LRM-ID as well
as the GRAM ID in the log, and correlate the LRM-ID
information with local resource manager logs and administrative tools.
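The correlation steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: it treats each log entry as one line of key=value text in which gramid=VALUE and level=ERROR appear literally, and it assumes the LRM ID follows GRAM_SCRIPT_JOB_ID: up to the next whitespace, quote, or backslash. The function names are hypothetical; adjust the patterns to what your log files actually contain.

```python
import re

# Assumed shapes of the attributes described in the text above.
GRAMID_RE = re.compile(r'gramid="?([^\s"]+)')
LRM_ID_RE = re.compile(r'GRAM_SCRIPT_JOB_ID:([^\s"\\]+)')


def entries_for_failed_jobs(lines):
    """Group log lines by the gramid of every job that logged an ERROR."""
    failed = []
    for line in lines:
        match = GRAMID_RE.search(line)
        if "level=ERROR" in line and match and match.group(1) not in failed:
            failed.append(match.group(1))
    grouped = {gramid: [] for gramid in failed}
    for line in lines:
        match = GRAMID_RE.search(line)
        if match and match.group(1) in grouped:
            grouped[match.group(1)].append(line)
    return grouped


def lrm_job_id(job_lines):
    """Return the LRM-specific ID found in a job's log lines, or None."""
    for line in job_lines:
        match = LRM_ID_RE.search(line)
        if match:
            return match.group(1)
    return None
```

Reading the per-user log with open(path).read().splitlines() and passing the result to entries_for_failed_jobs gives one list of entries per failed job; lrm_job_id on such a list yields the ID to look up in the LRM's own logs and tools.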
If all else fails, please send information about your problem to
<gram-user@globus.org>. You'll have to subscribe to a list before you
can send an e-mail to it. See here for general e-mail lists and information on how to subscribe to a list
and here for
GRAM-specific lists. Depending on the problem, you may be requested to file a bug report
to the Globus project's Issue Tracker.
Table of Contents
- globus-gatekeeper - Authorize and execute a grid service on behalf of a user
- globus-gatekeeper-admin - Manage globus-gatekeeper services
- globus-gram-audit - Load GRAM4 and GRAM5 audit records into a database
- globus-job-manager - Execute and monitor jobs
- globus-scheduler-event-generator - Process LRM events into a common format for use with GRAM
- globus-scheduler-event-generator-admin - Manage SEG modules
Name
globus-gatekeeper — Authorize and execute a grid service on behalf of a user
Synopsis
globus-gatekeeper [-help]
[-conf PARAMETER_FILE]
[-test] [ -d | -debug ]
{ -inetd | -f }
[ -p PORT | -port PORT ]
[-home PATH] [ -l LOGFILE | -logfile LOGFILE ]
[-acctfile ACCTFILE]
[-e LIBEXECDIR]
[-launch_method { fork_and_exit | fork_and_wait | dont_fork }]
[-grid_services SERVICEDIR]
[-globusid GLOBUSID]
[-gridmap GRIDMAP]
[-x509_cert_dir TRUSTED_CERT_DIR]
[-x509_cert_file TRUSTED_CERT_FILE]
[-x509_user_cert CERT_PATH]
[-x509_user_key KEY_PATH]
[-x509_user_proxy PROXY_PATH]
[-k]
[-globuskmap KMAP]
Description
The globus-gatekeeper program is a meta-server similar to inetd or xinetd that starts other services after authenticating the TCP connection using GSSAPI.
The most common use for the globus-gatekeeper program is to start instances of the globus-job-manager(8) service. A single globus-gatekeeper deployment can handle multiple different service configurations by having entries in the grid-services directory.
Typically, users interact with the globus-gatekeeper program via client applications such as globusrun(1), globus-job-submit, or tools such as CoG jglobus or Condor-G.
The full set of command-line options to globus-gatekeeper consists of:
- -help
  Display a help message to standard error and exit.
- -conf PARAMETER_FILE
  Load configuration parameters from PARAMETER_FILE. The parameters in that file are treated as additional command-line options.
- -test
  Parse the configuration file and print out the POSIX user id of the globus-gatekeeper process, service home directory, service execution directory, and X.509 subject name and then exit.
- -d, -debug
  Run the globus-gatekeeper process in the foreground.
- -inetd
  Flag to indicate that the globus-gatekeeper process was started via inetd or a similar super-server. If this flag is set and the globus-gatekeeper was not started via inetd, a warning will be printed in the gatekeeper log.
- -f
  Flag to indicate that the globus-gatekeeper process should run in the foreground. This flag has no effect when the globus-gatekeeper is started via inetd.
- -p PORT, -port PORT
  Listen for connections on the TCP/IP port PORT. This option has no effect if the globus-gatekeeper is started via inetd or a similar service. If not specified and the gatekeeper is running as root, the default of 754 is used. Otherwise, the gatekeeper defaults to an ephemeral port.
- -home PATH
  Sets the gatekeeper deployment directory to PATH. This is used to interpret relative paths for accounting files, libexecdir, certificate paths, and also to set the GLOBUS_LOCATION environment variable in the service environment. If not specified, the gatekeeper uses its working directory.
- -l LOGFILE, -logfile LOGFILE
  Write status log entries to LOGFILE.
- -acctfile ACCTFILE
  Set the path to write accounting records to ACCTFILE. If not set, no accounting records will be written.
- -e LIBEXECDIR
  Look for service executables in LIBEXECDIR. If not specified, the default of HOME/libexec is used.
- -launch_method fork_and_exit|fork_and_wait|dont_fork
  Determine how to launch services. The method may be one of the following:
  fork_and_exit: The service runs completely independently of the gatekeeper, which exits after creating the new service process.
  fork_and_wait: The service is run in a separate process from the gatekeeper but the gatekeeper does not exit until the service terminates.
  dont_fork: The gatekeeper process becomes the service process via the exec() system call.
- -grid_services SERVICEDIR
  Look for service descriptions in SERVICEDIR. If this is a relative path, it is interpreted relative to the HOME value. If this is not specified, the default of HOME/etc/grid-services is used.
- -globusid GLOBUSID
  Sets the GLOBUSID environment variable to GLOBUSID. This variable is used to construct the gatekeeper contact string if it cannot be parsed from the service credential.
- -gridmap GRIDMAP
  Use the file at GRIDMAP to map GSSAPI names to POSIX user names. If not specified, the default of HOME/etc/grid-mapfile is used.
- -x509_cert_dir TRUSTED_CERT_DIR
  Use the directory TRUSTED_CERT_DIR to locate trusted CA X.509 certificates. The gatekeeper sets the environment variable X509_CERT_DIR to this value.
- -x509_cert_file TRUSTED_CERT_FILE
  OBSOLETE GSI OPTION
- -x509_user_cert CERT_PATH
  Read the service X.509 certificate from CERT_PATH. The gatekeeper sets the X509_USER_CERT environment variable to this value.
- -x509_user_key KEY_PATH
  Read the private key for the service from KEY_PATH. The gatekeeper sets the X509_USER_KEY environment variable to this value.
- -x509_user_proxy PROXY_PATH
  Read the X.509 proxy certificate from PROXY_PATH. The gatekeeper sets the X509_USER_PROXY environment variable to this value.
- -k
  Assume authentication with Kerberos 5 GSSAPI instead of X.509 GSSAPI.
- -globuskmap KMAP
  Assume authentication with Kerberos 5 GSSAPI instead of X.509 GSSAPI and use KMAP as the path to the Kerberos-principal-to-POSIX-user mapping file.
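Several of the options above are interpreted relative to the -home directory and fall back to a default under it. The sketch below illustrates that rule only; it is not the gatekeeper's actual code, the function name is hypothetical, and /opt/globus is a placeholder deployment directory.

```python
import os.path


def resolve_gatekeeper_path(home, value=None, default=None):
    """Resolve a path option against the gatekeeper -home directory.

    An absolute value is used as given, a relative value is taken
    relative to home, and a missing value falls back to default
    (itself relative to home).
    """
    chosen = value if value is not None else default
    if chosen is None:
        raise ValueError("neither a value nor a default was given")
    # os.path.join discards home when chosen is already absolute.
    return os.path.normpath(os.path.join(home, chosen))


# No -grid_services given: the documented default applies.
services_dir = resolve_gatekeeper_path("/opt/globus", None, "etc/grid-services")
# An absolute -gridmap is used unchanged.
gridmap = resolve_gatekeeper_path(
    "/opt/globus", "/etc/grid-security/grid-mapfile", "etc/grid-mapfile"
)
```

Working through the paths this way is a quick check that the files the gatekeeper will look for are the ones you intended when -home or the individual options change.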
ENVIRONMENT
The following variables affect the execution of globus-gatekeeper:
- X509_CERT_DIR
- Directory containing X.509 trust anchors and signing policy files.
- X509_USER_PROXY
- Path to file containing an X.509 proxy.
- X509_USER_CERT
- Path to file containing an X.509 user certificate.
- X509_USER_KEY
- Path to file containing an X.509 user key.
Name
globus-gatekeeper-admin — Manage globus-gatekeeper services
Synopsis
globus-gatekeeper-admin [-h]
globus-gatekeeper-admin [-l] [-n NAME]
globus-gatekeeper-admin [-e SERVICE] [-n NAME]
globus-gatekeeper-admin [-E]
globus-gatekeeper-admin [-d SERVICE]
Description
The globus-gatekeeper-admin program manages service entries which are used by the globus-gatekeeper to execute services. Service entries are located in the
/etc/grid-services directory. The globus-gatekeeper-admin can list, enable, or
disable specific services, or set a service as the default. The -h
command-line option shows a brief usage message.
Listing services
The -l command-line option to globus-gatekeeper-admin will list all of the
services which are available to be run by the globus-gatekeeper.
In the output, the service name will be followed by its status in brackets. Possible
status strings are ENABLED, DISABLED, and
ALIAS to NAME, where
NAME is another service name.
If the -n NAME option is used, then only
information about the service named NAME is printed.
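For scripted health checks it can be handy to turn this listing into a data structure. The sketch below assumes one service per line in the form "name [STATUS]", as described above; the function name and the service names in the sample are hypothetical.

```python
import re

# One service per line: a name, then ENABLED, DISABLED, or "ALIAS to NAME"
# inside square brackets.
STATUS_RE = re.compile(
    r"^(?P<name>\S+)\s+"
    r"\[(?P<status>ENABLED|DISABLED|ALIAS to (?P<target>[^\]\s]+))\]\s*$"
)


def parse_service_list(text):
    """Map each service name to a (status, alias_target) tuple."""
    services = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        if match.group("target"):
            services[match.group("name")] = ("ALIAS", match.group("target"))
        else:
            services[match.group("name")] = (match.group("status"), None)
    return services


# Hypothetical listing used only to exercise the parser.
sample = """jobmanager-fork [ENABLED]
jobmanager-condor [DISABLED]
jobmanager [ALIAS to jobmanager-fork]
"""
parsed = parse_service_list(sample)
```

In practice the text would come from running globus-gatekeeper-admin -l, for example through subprocess.run with capture_output=True.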
Enabling services
The -e SERVICE command-line option to
globus-gatekeeper-admin will enable a service so that it may be run by the globus-gatekeeper.
If the -n NAME option is used as
well, then the service will be enabled with the alias
NAME.
Enabling a default service
The -E command-line option to globus-gatekeeper-admin will cause it to enable a
service alias with the name jobmanager. The globus-gatekeeper-admin program will
choose the first service it finds as the default. To enable a particular service as
the default, use the -e parameter described above with the
-n parameter.
Name
globus-gram-audit — Load GRAM4 and GRAM5 audit records into a database
Synopsis
globus-gram-audit [--conf CONFIG_FILE] [--check] [--delete] [--audit-directory AUDITDIR]
Description
The globus-gram-audit program loads audit records to a SQL-based
database. It reads $GLOBUS_LOCATION/etc/globus-job-manager.conf by default
to determine the audit directory and then uploads all files in that directory that
contain valid audit records to the database configured by the globus_gram_job_manager_auditing_setup_scripts package. If the upload
completes successfully, the audit files will be removed.
The full set of command-line options to globus-gram-audit consists of:
--conf CONFIG_FILE | Use CONFIG_FILE instead of the default audit configuration file.
--check | Check whether the insertion of a record was successful by querying the database after inserting the records. This is used in tests.
--delete | Delete audit records from the database right after inserting them. This is used in tests to avoid filling the database with test records.
--audit-directory DIR | Look for audit records in DIR, instead of looking in the directory specified in the job manager configuration. This is used in tests to control which records are loaded to the database and then deleted.
--query | Perform the given SQL query on the audit database. This uses the database information from the configuration file to determine how to contact the database.
FILES
The globus-gram-audit uses the following files (paths are relative
to $GLOBUS_LOCATION).
etc/globus-gram-job-manager.conf | GRAM5 job manager configuration. It includes the default path to the audit directory.
etc/globus-gram-audit.conf | Audit configuration. It includes the information needed to contact the audit database.
Name
globus-job-manager — Execute and monitor jobs
Synopsis
globus-job-manager {-type LRM} [-conf CONFIG_PATH] [-help] [-globus-host-manufacturer MANUFACTURER] [-globus-host-cputype CPUTYPE] [-globus-host-osname OSNAME] [-globus-host-osversion OSVERSION] [-globus-gatekeeper-host HOST] [-globus-gatekeeper-port PORT] [-globus-gatekeeper-subject SUBJECT] [-home GLOBUS_LOCATION] [-target-globus-location TARGET_GLOBUS_LOCATION] [-condor-arch ARCH] [-condor-os OS] [-history HISTORY_DIRECTORY] [-scratch-dir-base SCRATCH_DIRECTORY] [-enable-syslog] [-stdio-log LOG_DIRECTORY] [-log-pattern PATTERN] [-log-levels LEVELS] [-state-file-dir STATE_DIRECTORY] [-globus-tcp-port-range PORT_RANGE] [-globus-tcp-source-range SOURCE_RANGE] [-x509-cert-dir TRUSTED_CERTIFICATE_DIRECTORY] [-cache-location GASS_CACHE_DIRECTORY] [-k] [-extra-envvars VAR=VAL,...] [-seg-module SEG_MODULE] [-audit-directory AUDIT_DIRECTORY] [-globus-toolkit-version TOOLKIT_VERSION] [-disable-streaming] [-disable-usagestats] [-usagestats-targets TARGET] [-service-tag SERVICE_TAG]
Description
The globus-job-manager program is a service which starts and controls GRAM jobs which are executed by a local resource management (LRM) system, such as LSF or Condor. The globus-job-manager program is typically started by the globus-gatekeeper program and not directly by a user. It runs until all jobs it is managing have terminated or its delegated credentials have expired.
Typically, users interact with the globus-job-manager program via client applications such as globusrun, globus-job-submit, or tools such as CoG jglobus or Condor-G.
The full set of command-line options to globus-job-manager consists of:
- -help
  Display a help message to standard error and exit.
- -type LRM
  Execute jobs using the local resource manager named LRM.
- -conf CONFIG_PATH
  Read additional command-line arguments from the file CONFIG_PATH. If present, this must be the first command-line argument to the globus-job-manager program.
- -globus-host-manufacturer MANUFACTURER
  Indicate the manufacturer of the system on which the jobs will execute. This parameter sets the value of the $(GLOBUS_HOST_MANUFACTURER) RSL substitution to MANUFACTURER.
- -globus-host-cputype CPUTYPE
  Indicate the CPU type of the system on which the jobs will execute. This parameter sets the value of the $(GLOBUS_HOST_CPUTYPE) RSL substitution to CPUTYPE.
- -globus-host-osname OSNAME
  Indicate the operating system type of the system on which the jobs will execute. This parameter sets the value of the $(GLOBUS_HOST_OSNAME) RSL substitution to OSNAME.
- -globus-host-osversion OSVERSION
  Indicate the operating system version of the system on which the jobs will execute. This parameter sets the value of the $(GLOBUS_HOST_OSVERSION) RSL substitution to OSVERSION.
- -globus-gatekeeper-host HOST
  Indicate the host name of the machine to which the job was submitted. This parameter sets the value of the $(GLOBUS_GATEKEEPER_HOST) RSL substitution to HOST.
- -globus-gatekeeper-port PORT
  Indicate the TCP port number of the gatekeeper to which jobs are submitted. This parameter sets the value of the $(GLOBUS_GATEKEEPER_PORT) RSL substitution to PORT.
- -globus-gatekeeper-subject SUBJECT
  Indicate the X.509 identity of the gatekeeper to which jobs are submitted. This parameter sets the value of the $(GLOBUS_GATEKEEPER_SUBJECT) RSL substitution to SUBJECT.
- -home GLOBUS_LOCATION
  Indicate the path where the Globus Toolkit(r) is installed on the service node. This is used by the job manager to locate its support and configuration files.
- -target-globus-location TARGET_GLOBUS_LOCATION
  Indicate the path where the Globus Toolkit(r) is installed on the execution host. If this is omitted, the value specified as a parameter to -home is used. This parameter sets the value of the $(GLOBUS_LOCATION) RSL substitution to TARGET_GLOBUS_LOCATION.
- -history HISTORY_DIRECTORY
  Configure the job manager to write job history files to HISTORY_DIRECTORY. These files are described in the FILES section below.
- -scratch-dir-base SCRATCH_DIRECTORY
  Configure the job manager to use SCRATCH_DIRECTORY as the default scratch directory root if a relative path is specified in the job RSL's scratch_dir attribute.
- -enable-syslog
  Configure the job manager to write log messages via syslog. Logging is further controlled by the argument to the -log-levels parameter described below.
- -log-pattern PATTERN
  Configure the job manager to write log messages to files named by the string PATTERN. The PATTERN string may contain job-independent RSL substitutions such as $(HOME), $(LOGNAME), etc, as well as the special RSL substitution $(DATE) which will be resolved at log time to the date in YYYYMMDD form.
- -stdio-log LOG_DIRECTORY
  Configure the job manager to write log messages to files in the LOG_DIRECTORY directory. This is a backwards-compatible parameter, equivalent to -log-pattern LOG_DIRECTORY/gram_$(DATE).log.
- -log-levels LEVELS
  Configure the job manager to write log messages of certain levels to syslog and/or log files. The available log levels are FATAL, ERROR, WARN, INFO, DEBUG, and TRACE. Multiple values can be combined with the | character. The default value of logging when enabled is FATAL|ERROR.
- -state-file-dir STATE_DIRECTORY
  Configure the job manager to write state files to STATE_DIRECTORY. If not specified, the job manager uses the default of $GLOBUS_LOCATION/tmp/gram_job_state/. This directory must be writable by all users and be on a file system which supports POSIX advisory file locks.
- -globus-tcp-port-range PORT_RANGE
  Configure the job manager to restrict its TCP/IP communication to use ports in the range described by PORT_RANGE. This value is also made available in the job environment via the GLOBUS_TCP_PORT_RANGE environment variable.
- -globus-tcp-source-range SOURCE_RANGE
  Configure the job manager to restrict its TCP/IP communication to use source ports in the range described by SOURCE_RANGE. This value is also made available in the job environment via the GLOBUS_TCP_SOURCE_RANGE environment variable.
- -x509-cert-dir TRUSTED_CERTIFICATE_DIRECTORY
  Configure the job manager to search TRUSTED_CERTIFICATE_DIRECTORY for its list of trusted CA certificates and their signing policies. This value is also made available in the job environment via the X509_CERT_DIR environment variable.
- -cache-location GASS_CACHE_DIRECTORY
  Configure the job manager to use the path GASS_CACHE_DIRECTORY for its temporary GASS-cache files. This value is also made available in the job environment via the GLOBUS_GASS_CACHE_DEFAULT environment variable.
- -k
  Configure the job manager to assume it is using Kerberos for authentication instead of X.509 certificates. This disables some certificate-specific processing in the job manager.
- -extra-envvars VAR=VAL,...
  Configure the job manager to define a set of environment variables in the job environment beyond those defined in the base job environment. The format of the parameter to this argument is a comma-separated sequence of VAR=VAL pairs, where VAR is the variable name and VAL is the variable's value. If the value is not specified, then the value of the variable in the job manager's environment is used. This option may be present multiple times on the command-line or the job manager configuration file to append multiple environment settings.
- -seg-module SEG_MODULE
  Configure the job manager to use the scheduler event generator module named by SEG_MODULE to detect job state change events from the LRM (this replaces the less efficient polling operations used in GT2). To use this, one instance of the globus-job-manager-event-generator must be running to process events for the LRM into a generic format that the job manager can parse.
- -audit-directory AUDIT_DIRECTORY
  Configure the job manager to write audit records to the directory named by AUDIT_DIRECTORY. These records can be loaded into a database using the globus-gram-audit program.
- -globus-toolkit-version TOOLKIT_VERSION
  Configure the job manager to use TOOLKIT_VERSION as the version for audit and usage stats records.
- -service-tag SERVICE_TAG
  Configure the job manager to use SERVICE_TAG as a unique identifier to allow multiple GRAM instances to use the same job state directories without interfering with each other's jobs. If not set, the value untagged will be used.
- -disable-streaming
  Configure the job manager to disable file streaming. This is propagated to the LRM script interface but has no effect in GRAM5.
- -disable-usagestats
  Disable sending any usage stats data, even if -usagestats-targets is present in the configuration.
- -usagestats-targets TARGET
  Send usage packets to a data collection service for analysis. The TARGET string consists of a comma-separated list of HOST:PORT combinations, each containing an optional list of data to send. See Usage Stats Packets for more information about the tags. Special tag strings of all (which enables all tags) and default may be used, or a sequence of characters for the various tags. If this option is not present in the configuration, then the default of usage-stats.globus.org:4810 is used.
- -condor-arch ARCH
  Set the architecture specification for Condor jobs to be ARCH in job classified ads generated by the GRAM5 Condor LRM script. This is required for the Condor LRM but ignored for all others.
- -condor-os OS
  Set the operating system specification for Condor jobs to be OS in job classified ads generated by the GRAM5 Condor LRM script. This is required for the Condor LRM but ignored for all others.
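Two of the option values above have a small syntax of their own: -log-levels combines level names with the | character, and -extra-envvars takes a comma-separated list of VAR=VAL pairs. The sketch below shows one way to validate such values before writing them to a configuration file. It is an illustration, not the job manager's parser: the function names are hypothetical, it assumes that "value not specified" means a bare VAR with no = sign, and it omits a bare VAR that is not set in the environment.

```python
import os

# Level names listed for -log-levels.
LOG_LEVELS = ("FATAL", "ERROR", "WARN", "INFO", "DEBUG", "TRACE")


def parse_log_levels(value="FATAL|ERROR"):
    """Split a -log-levels value on '|' and reject unknown level names."""
    levels = [part.strip() for part in value.split("|")]
    unknown = [level for level in levels if level not in LOG_LEVELS]
    if unknown:
        raise ValueError("unknown log level(s): " + ", ".join(unknown))
    return set(levels)


def parse_extra_envvars(value, environ=None):
    """Turn 'VAR=VAL,VAR2' into a dict of environment settings.

    A bare VAR (no '=') takes its value from environ, which stands in
    for the job manager's own environment.
    """
    environ = os.environ if environ is None else environ
    result = {}
    for item in value.split(","):
        name, sep, val = item.partition("=")
        if not name:
            continue  # ignore empty items such as a trailing comma
        if sep:
            result[name] = val
        elif name in environ:
            result[name] = environ[name]
    return result
```

For example, parse_log_levels("FATAL|ERROR|WARN") returns the three named levels, while a misspelled level raises ValueError instead of being silently accepted.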
Environment
The following variables affect the execution of globus-job-manager:
- HOME
  User's home directory.
- LOGNAME
  User's name.
- JOBMANAGER_SYSLOG_ID
  String to prepend to syslog audit messages.
- JOBMANAGER_SYSLOG_FAC
  Facility to log syslog audit messages as.
- JOBMANAGER_SYSLOG_LVL
  Priority level to use for syslog audit messages.
- GATEKEEPER_JM_ID
  Job manager ID to be used in syslog audit records.
- GATEKEEPER_PEER
  Peer information to be used in syslog audit records.
- GLOBUS_ID
  Credential information to be used in syslog audit records.
- GLOBUS_JOB_MANAGER_SLEEP
  Time (in seconds) to sleep when the job manager is started. (For debugging purposes only.)
- GRID_SECURITY_HTTP_BODY_FD
  File descriptor of an open file which contains the initial job request and to which the initial job reply should be sent. This file descriptor is inherited from the globus-gatekeeper.
- X509_USER_PROXY
  Path to the X.509 user proxy which was delegated by the client to the globus-gatekeeper program to be used by the job manager.
- GRID_SECURITY_CONTEXT_FD
  File descriptor containing an exported security context that the job manager should use to reply to the client which submitted the job.
- GLOBUS_USAGE_TARGETS
  Default list of usagestats services to send usage packets to.
- GLOBUS_TCP_PORT_RANGE
  Default range of allowed TCP ports to listen on. The -globus-tcp-port-range command-line option overrides this.
- GLOBUS_TCP_SOURCE_RANGE
  Default range of allowed TCP ports to bind to. The -globus-tcp-source-range command-line option overrides this.
Files
- $HOME/.globus/job/HOSTNAME/LRM.TAG.red
  Job manager delegated user credential.
- $HOME/.globus/job/HOSTNAME/LRM.TAG.lock
  Job manager state lock file.
- $HOME/.globus/job/HOSTNAME/LRM.TAG.pid
  Job manager pid file.
- $HOME/.globus/job/HOSTNAME/LRM.TAG.sock
  Job manager socket for inter-job manager communications.
- $HOME/.globus/job/HOSTNAME/JOB_ID/
  Job-specific state directory.
- $HOME/.globus/job/HOSTNAME/JOB_ID/stdin
  Standard input which has been staged from a remote URL.
- $HOME/.globus/job/HOSTNAME/JOB_ID/stdout
  Standard output which will be staged from a remote URL.
- $HOME/.globus/job/HOSTNAME/JOB_ID/stderr
  Standard error which will be staged from a remote URL.
- $HOME/.globus/job/HOSTNAME/JOB_ID/x509_user_proxy
  Job-specific delegated credential.
- /var/lib/globus/gram_job_state/job.HOSTNAME.JOB_ID
  Job state file.
- /var/lib/globus/gram_job_state/job.HOSTNAME.JOB_ID.lock
  Job state lock file. In most cases this will be a symlink to the job manager lock file.
- /etc/globus-gram-job-manager.conf
  Default location of the global job manager configuration file.
- /etc/grid-services/jobmanager-LRM
  Default location of the LRM-specific gatekeeper configuration file.
Name
globus-scheduler-event-generator — Process LRM events into a common format for use with GRAM
Synopsis
globus-scheduler-event-generator -s LRM
[-t TIMESTAMP] [-d DIRECTORY]
[-b] [-p PIDFILE]
Description
The globus-scheduler-event-generator program processes information from a local resource manager
to generate LRM-independent events which GRAM can use to track job
state changes. Typically, the globus-scheduler-event-generator is started at system boot time
for all LRM adapters which have been installed. The only required parameter
to globus-scheduler-event-generator is -s LRM, which
indicates what LRM-specific module to load. A list of available modules can
be found by using the
globus-scheduler-event-generator-admin -l
command.
Other options control how the globus-scheduler-event-generator program runs and where its output goes. These options are:
- -t TIMESTAMP
  Start processing events which start at TIMESTAMP in seconds since the UNIX epoch. If not present, the globus-scheduler-event-generator will process events from the time it was started, and not look for historical events.
- -d DIRECTORY
  Write the event log to files in DIRECTORY, instead of printing them to standard output. Within DIRECTORY, logs will be named by the time when they were created in YYYYMMDD format.
- -b
  Run the globus-scheduler-event-generator program in the background.
- -p PIDFILE
  Write the process-id of globus-scheduler-event-generator to PIDFILE.
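The -t option expects seconds since the UNIX epoch, which is awkward to compute by hand when replaying historical events. The sketch below derives that value, and the matching YYYYMMDD log name, from a calendar date. The function names are hypothetical, and the text above does not say whether log names follow local time or UTC, so the example simply formats whatever datetime it is given.

```python
from datetime import datetime, timezone


def seg_timestamp(when):
    """Convert a timezone-aware datetime to whole seconds since the UNIX epoch."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    return int(when.timestamp())


def seg_log_name(when):
    """Format a datetime in the YYYYMMDD form used for event log names."""
    return when.strftime("%Y%m%d")


start = datetime(2011, 1, 1, tzinfo=timezone.utc)
print(seg_timestamp(start))  # 1293840000
print(seg_log_name(start))   # 20110101
```

The first value is what would be passed after -t to reprocess events from midnight UTC on 1 January 2011.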
Name
globus-scheduler-event-generator-admin — Manage SEG modules
Synopsis
globus-scheduler-event-generator-admin [-h]
globus-scheduler-event-generator-admin [-l]
globus-scheduler-event-generator-admin [-e MODULE]
globus-scheduler-event-generator-admin [-d MODULE]
Description
The globus-scheduler-event-generator-admin program manages SEG modules which are used by the
globus-scheduler-event-generator to monitor
a local resource manager or batch system for events.
The globus-scheduler-event-generator-admin can list, enable, or disable specific SEG modules.
The -h command-line option
shows a brief usage message.
Listing SEG Modules
The -l command-line option to globus-scheduler-event-generator-admin will cause it
to list all of the SEG modules which are available to be run by the
globus-scheduler-event-generator. In the output, the service
name will be followed by its status in brackets. Possible status
strings are ENABLED and DISABLED.
The following usage statistics are sent by default in a UDP packet (in addition to the GRAM component code, packet version, timestamp, and source IP address) at the end of each job.
- Job Manager Session ID
- dryrun used
- RSL Host Count
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_UNSUBMITTED
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_FILE_STAGE_IN
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_PENDING
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_ACTIVE
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_FILE_STAGE_OUT
- Timestamp when job hit GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE
- Job Failure Code
- Number of times status is called
- Number of times register is called
- Number of times signal is called
- Number of times refresh is called
- Number of files named in file_clean_up RSL
- Number of files being staged in (including executable, stdin) from http servers
- Number of files being staged in (including executable, stdin) from https servers
- Number of files being staged in (including executable, stdin) from ftp servers
- Number of files being staged in (including executable, stdin) from gsiftp servers
- Number of files being staged into the GASS cache from http servers
- Number of files being staged into the GASS cache from https servers
- Number of files being staged into the GASS cache from ftp servers
- Number of files being staged into the GASS cache from gsiftp servers
- Number of files being staged out (including stdout and stderr) to http servers
- Number of files being staged out (including stdout and stderr) to https servers
- Number of files being staged out (including stdout and stderr) to ftp servers
- Number of files being staged out (including stdout and stderr) to gsiftp servers
- Bitmask of used RSL attributes (values are 2^id from the gram5_rsl_attributes table)
- Number of times unregister is called
- Value of the count RSL attribute
- Comma-separated list of string names of other RSL attributes not in the set defined in globus-gram-job-manager.rvf
- Job type string
- Number of times the job was restarted
- Total number of state callbacks sent to all clients for this job
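The "bitmask of used RSL attributes" item packs one bit per attribute, with each attribute contributing 2^id. A minimal sketch of decoding such a value follows. The mapping from ids to attribute names lives in the gram5_rsl_attributes table, which is not reproduced in this guide, so the example works with numeric ids only; the function names are hypothetical.

```python
def decode_attribute_bitmask(mask):
    """Return the sorted attribute ids whose 2**id bit is set in mask."""
    if mask < 0:
        raise ValueError("bitmask must be non-negative")
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def encode_attribute_bitmask(ids):
    """Build the bitmask for a collection of attribute ids (inverse of decode)."""
    mask = 0
    for attribute_id in ids:
        mask |= 1 << attribute_id
    return mask


print(decode_attribute_bitmask(11))         # [0, 1, 3]
print(encode_attribute_bitmask([0, 1, 3]))  # 11
```

Joining the decoded ids against the attribute table then gives the names of the RSL attributes a job actually used.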
The following information can be sent as well in a job status packet but it is not sent unless explicitly enabled by the system administrator:
- Value of the executable RSL attribute
- Value of the arguments RSL attribute
- IP address and port of the client that submitted the job
- User DN of the client that submitted the job
In addition to job-related status, the job manager periodically sends information about its execution status. The following information is sent by default in a UDP packet (in addition to the GRAM component code, packet version, timestamp, and source IP address) at job manager start and every hour during the job manager lifetime:
- Job Manager Start Time
- Job Manager Session ID
- Job Manager Status Time
- Job Manager Version
- LRM
- Poll used
- Audit used
- Number of restarted jobs
- Total number of jobs
- Total number of failed jobs
- Total number of canceled jobs
- Total number of completed jobs
- Total number of dry-run jobs
- Peak number of concurrently managed jobs
- Number of jobs currently being managed
- Number of jobs currently in the UNSUBMITTED state
- Number of jobs currently in the STAGE_IN state
- Number of jobs currently in the PENDING state
- Number of jobs currently in the ACTIVE state
- Number of jobs currently in the STAGE_OUT state
- Number of jobs currently in the FAILED state
- Number of jobs currently in the DONE state
Also, please see our policy statement on the collection of usage statistics.
C
- certificate
A public key plus information about the certificate owner bound together by the digital signature of a CA. In the case of a CA certificate, the certificate is self signed, i.e. it was signed using its own private key.
- Condor
A Local Resource Manager mechanism supported by GRAM. See the Condor Project Website for more information.
F
- fork
A POSIX-specific way of creating new processes. GRAM implements a basic fork LRM Adapter which runs jobs on the GRAM head node.
G
- Gatekeeper
A part of GRAM that runs as root and authenticates clients prior to starting the Job Manager.
- grid map file
A file containing entries mapping certificate subjects to local user names. This file can also serve as a access control list for GSI enabled services and is typically found in
/etc/grid-security/grid-mapfile. For more information see the Gridmap section here.- Oracle GridEngine
A Local Resource Manager supported by GRAM. See Oracle's Web Site for more information.
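As an illustration of the grid map file entry above, the following sketch reads mappings from certificate subjects to local user names. It assumes the commonly used layout of one entry per line, with a double-quoted subject followed by a comma-separated list of local user names; that layout is not spelled out in this glossary, and the function name `parse_gridmap` and the sample subjects are hypothetical.

```python
import shlex


def parse_gridmap(text):
    """Return {certificate subject: [local user names]} (hypothetical helper).

    Assumes each entry looks like:  "/O=Grid/CN=Some Name" user1,user2
    Blank lines, '#' comment lines, and malformed lines are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            # shlex honors the double quotes around the subject.
            parts = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: treat as malformed
        if len(parts) != 2:
            continue
        subject, users = parts
        mapping[subject] = users.split(",")
    return mapping


# Made-up sample content, for illustration only.
SAMPLE = '''
# subject                               local user(s)
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=Build Service" build,ci
'''

if __name__ == "__main__":
    for subject, users in parse_gridmap(SAMPLE).items():
        print(subject, "->", users)
```

In practice the file would be read from its usual location, /etc/grid-security/grid-mapfile, instead of from an inline string.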
J
- Job Manager
A part of GRAM that runs as a local user and interfaces with a Local Resource Manager for that user.
L
- Local Resource Manager (LRM)
A system which controls access to a compute resource, such as a compute cluster or parallel computer. Such systems provide batch execution interfaces, which GRAM uses to execute jobs. Condor, PBS, and GridEngine are examples of local resource managers.
- LRM Adapter
The interface code between a Local Resource Manager and GRAM. In most cases, this consists of a Perl module that implements the Globus::GRAM::JobManager class and a Scheduler Event Generator module.
P
- Portable Batch System (PBS)
A Local Resource Manager mechanism supported by GRAM. Multiple implementations of PBS exist: GRAM currently supports TORQUE. See also TORQUE.
- proxy certificate
A short-lived certificate issued using an end entity certificate (EEC). A proxy certificate typically has the same effective subject as the EEC that issued it and can thus be used in its place. GSI uses proxy certificates for single sign-on and delegation of rights to other entities.
For more information about types of proxy certificates and their compatibility in different versions of GT, see http://dev.globus.org/wiki/Security/ProxyCertTypes.
S
- Scheduler Event Generator (SEG)
The Scheduler Event Generator (SEG) is a program which uses scheduler-specific monitoring modules to generate job state change events. Depending on scheduler-specific requirements, the SEG may need to run with privileges that enable it to obtain scheduler event notifications. As such, one SEG runs per scheduler resource. For example, on a host which provides access to both PBS and fork jobs, two SEGs will be running, potentially at different privilege levels. One SEG instance exists for any particular scheduled resource instance (one for all homogeneous PBS queues, one for all fork jobs, and so on). The SEG is implemented in an executable called globus-scheduler-event-generator, located in the Globus Toolkit's libexec directory.
- Sun GridEngine (SGE)
The old name for Oracle GridEngine.
A
- audit logging, Audit Logging