GT 4.1.0 WS MDS Aggregator Framework: Developer's Guide

1. Introduction

The Aggregator Framework allows pluggable data sources and sinks to be written and connected together. Generally a source collects data from or about a particular grid resource, and passes it to a sink which does something interesting with it.

The aggregator sinks supplied with the toolkit implement the WS MDS Index Service and WS MDS Trigger Service. The aggregator sources supplied with the toolkit collect information using resource property queries, subscription/notification, and execution of external programs.

This document describes the programmatic interfaces to the Aggregator Framework. See also general Globus Toolkit coding guidelines and GT 4.1.0 best practices.

2. Before you begin

2.1. Feature summary

Features new in release GT 4.1.0

  • The mds-servicegroup-add command no longer requires the -s or -e arguments
  • The mds-set-multiple-termination-time command has been created to aid in lifetime management of service group entry resources created via mds-servicegroup-add

Other Supported Features

  • Collects information from grid resources using pluggable aggregator sources, which gather information by polling, by subscription, and by executing local scripts.
  • Delivers collected information to pluggable information sinks.
  • Management of individual aggregations is now performed over the wire through WS ServiceGroup APIs.

2.2. Tested platforms

Tested Platforms for WS MDS Aggregator Framework

  • Linux on i386
  • Windows XP

2.3. Backward compatibility summary

Protocol changes since GT version 4.0.2

  • The Aggregator Framework is a complete reimplementation of the MDS3 Aggregator Framework using WSRF rather than OGSI protocols.
  • No wireside compatibility with MDS3 Aggregator Framework.
  • Architectural similarity should make porting straightforward.

API changes since GT version 4.0.2

  • APIs entirely rewritten, so no API compatibility.
  • Architectural similarity should make porting straightforward.

Exception changes since GT version 4.0.2

  • See API changes above.

Schema changes since GT version 4.0.2

  • Registration interface uses WSRF rather than OGSI schemas.
  • New per-source and per-sink configuration schemas.

2.4. Technology dependencies

Aggregator Framework depends on the following GT components:

  • Java WS Core

Aggregator Framework depends on the following 3rd party software:

  • None

2.5. Security considerations

By default, the aggregator sources do not use authentication credentials -- they retrieve information using anonymous SSL authentication or no authentication at all, and thus retrieve only publicly available information. If a user or administrator changes that configuration so that a service's aggregator source uses credentials to acquire data that is not publicly available, then that user or administrator must also configure the service's aggregator sink to limit access to authorized users.

3. Architecture and design overview

The WS MDS Aggregator Framework is the software framework on which WS MDS services are built. The Aggregator Framework collects data from an aggregator source and sends that data to an aggregator sink for processing. Aggregator sources distributed with the Globus Toolkit include modules that query resource properties, acquire data through subscription/notification, and execute programs to generate data. Another way of describing the Aggregator Framework is that it is designed to facilitate the collecting of information from or about WS-Resources via plugin aggregator sources and the feeding of that information to plugin aggregator sinks, which can then perform actions such as re-publishing, logging, or archiving the information.

Figure 1. Graphic of Information Services Flow

Aggregators work on a type of service group called an AggregatorServiceGroupRP. Resources may be registered to an AggregatorServiceGroupRP using the service group add operation, which causes an entry to be added to the service group. The entry includes configuration parameters for the aggregator source. When the registration is made, the appropriate aggregator source and sinks are informed: the aggregator source begins collecting data and inserting it into the corresponding service group entry, and the aggregator sink begins processing the information in the service group entries.

The method of collection by source and processing by the sink is dependent on the particular instantiation of the aggregator framework.
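The flow described in this section can be pictured with a small conceptual model. The sketch below is for illustration only: the real framework is written in Java, and every name here (Entry, ServiceGroup, add, collect_once, fake_source, and the example URL) is hypothetical rather than taken from the aggregator API.

```python
"""Conceptual model of the aggregator data flow. All names are hypothetical."""
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Entry:
    """Stands in for a service group entry: registration config plus collected data."""
    registrant: str
    config: Dict[str, str]
    data: List[str] = field(default_factory=list)


class ServiceGroup:
    """Stands in for the aggregating service group that links one source to its sinks."""

    def __init__(self, source: Callable[[Entry], str],
                 sinks: List[Callable[[Entry], None]]):
        self.source = source
        self.sinks = sinks
        self.entries: List[Entry] = []

    def add(self, registrant: str, config: Dict[str, str]) -> Entry:
        # Registration creates an entry that carries the source configuration.
        entry = Entry(registrant, config)
        self.entries.append(entry)
        return entry

    def collect_once(self) -> None:
        # The source inserts data into each entry, then every sink processes the entry.
        for entry in self.entries:
            entry.data.append(self.source(entry))
            for sink in self.sinks:
                sink(entry)


def fake_source(entry: Entry) -> str:
    # A real source would poll, subscribe, or run a program instead.
    return f"<status resource='{entry.registrant}'>ok</status>"


log: List[str] = []
group = ServiceGroup(fake_source, [lambda e: log.append(e.data[-1])])
group.add("https://example.org/wsrf/services/Hypothetical",
          {"PollIntervalMillis": "60000"})
group.collect_once()
```

After one collection pass, the entry holds the data produced by the source and the sink has seen the same data, which mirrors the source-to-entry-to-sink path described in the text.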

3.1. Standard aggregator sinks

The aggregator sinks distributed with the toolkit (org.globus.mds.aggregator.impl.ServiceGroupEntryAggregatorSink and org.globus.mds.trigger.impl.TriggerResource) are described in the following table.

Table 1. Standard aggregator sinks

Aggregator Sink | Service Implemented | Description
ServiceGroupEntryAggregatorSink | Index Service | The servicegroup sink (used by the Index Service) publishes received data as content in the AggregatingServiceGroup entry used to manage the registration. This data can therefore be retrieved by querying the index for its 'entries' resource property.
TriggerResource | Trigger Service | The Trigger Service provides an aggregator sink which receives data, applies tests to that data, and if the tests match, runs a specified executable. See the WS MDS Trigger Service documentation for more information.

3.2. Standard aggregator sources

The aggregator sources supplied with the toolkit collect information using resource property queries (query sources), subscription/notification (subscription sources), and execution of external programs (execution sources).

The aggregator sources supplied with the Globus Toolkit are listed in the following table.

Note: All aggregator sources listed in this table are in the org.globus.mds.aggregator.impl package, so for example the aggregator source listed as QueryAggregatorSource is actually org.globus.mds.aggregator.impl.QueryAggregatorSource.

Table 2. Standard aggregator sources

Aggregator Source | Description
QueryAggregatorSource | The query source collects information from a registered resource by using the WS-Resource Properties polling mechanisms listed after this table. Polls are made periodically, with both the period and the target Resource Properties specified in the registration message.
SubscriptionAggregatorSource | The subscription source collects information from a registered resource using WS-Notification mechanisms. Data is delivered when property values change, rather than periodically.
ExecutionAggregatorSource | The execution source collects information about (not necessarily from) a registered resource by execution of a local executable, which is passed the identity of the registered resource as input. Details of the interface between the execution source and local executables are in Execution Aggregator Sources Reference.

The polling mechanisms used by QueryAggregatorSource are:

  • GetResourcePropertyPollType: requests a single Resource Property from the remote resource.
  • GetMultipleResourcePropertiesPollType: requests multiple Resource Properties from the remote resource.
  • QueryResourcePropertiesPollType: requests that a query be executed against the Resource Property Set of the remote resource.

4. Public interface

4.1. Semantics and syntax of APIs

4.1.1. Programming Model Overview

The Aggregator Framework module consists of an Aggregating ServiceGroup framework which supports plugins as detailed below, as well as a number of standard plugins.

4.1.2. The Aggregating ServiceGroup framework

The aggregating servicegroup framework is designed to facilitate the collecting of information from or about WS-Resources (via plugin aggregator sources) and the feeding of that information to plugin aggregator sinks.

The framework provides for over-the-wire management of the list of registered resources (through a WS-ServiceGroup interface) and a Java API for connecting sources and sinks together.

In general (although this is not a hard requirement), aggregator sinks will be tied into a specific service implementation, while aggregator sources are more independent. (For example, the Trigger and Index services act as sinks.)

4.1.3. The standard plugins

A number of standard aggregator sources are provided, which implement the aggregator source API. These provide for collecting information from/about a WS-Resource by:

  • WS-ResourceProperties poll operations
  • WS-Notification subscription
  • Execution of arbitrary executables

See Aggregator Sources Reference for more information about standard aggregator sources for GT 4.1.0.

4.1.4. Component API

There are two main Java interfaces in the aggregator framework.

  • AggregatorSink - which is implemented by sinks that can receive data from the Aggregator Framework.
  • AggregatorSource - which is implemented by sources that can feed data into the Aggregator Framework.

In addition, the AggregatorContent class is used when configuring an aggregator service programmatically, and to represent the data published in the aggregator's Entry resource property. All aggregator classes and interfaces are documented in the aggregator Java API documentation.

4.2. Semantics and syntax of the WSDL

4.2.1. Protocol overview

The Aggregator Framework builds on the WS-ServiceGroup and WS-ResourceLifetime specifications. Those specifications should be consulted for details on the syntax of each operation.

Each Aggregator Framework is represented as a WS-ServiceGroup (specifically, an AggregatorServiceGroup).

Resources may be registered to an AggregatorServiceGroup using the AggregatorServiceGroup add operation, which causes an entry to be added to the service group. Each registration is represented as a ServiceGroupEntry resource.

The entry will include configuration parameters for the aggregator source; when the registration is made, the following will happen:

  1. The appropriate aggregator source and sinks are informed.
  2. The aggregator source begins collecting data and inserting it into the corresponding service group entry.
  3. The aggregator sink begins processing the information in the service group entries.

The method of collection by source and processing by the sink is dependent on the particular instantiation of the aggregator framework (see per-source documentation for source information and per-service documentation for sink information - for example the Index Service and the Trigger Service.)

4.2.2. Operations

4.2.2.1. AggregatorServiceGroup

  • add: This operation is used to register a specified resource with the Aggregator Framework. In addition to the requirements made by the WS-ServiceGroup specification, the Content element of each registration must be an AggregatorContent type, with the AggregatorConfig element containing configuration information specific to each source and sink (documented in the Aggregator System Administrator's Guide).

4.2.2.2. AggregatorServiceGroupEntry

  • setTerminationTime: This operation can be used to set the termination time of the registration, as detailed in WS-ResourceLifetime.

4.2.3. Resource properties

4.2.3.1. AggregatorServiceGroup Resource Properties

  • Entry: This resource property publishes details of each registered resource, including an EPR to the resource, the Aggregator Framework configuration information, and data from the sink.
  • RegistrationCount: This resource property publishes registration load information (the total number of registrations since service startup and decaying averages).

4.2.4. Faults

The Aggregator Framework throws standard WS-ServiceGroup, WS-ResourceLifetime, and WS-ResourceProperties faults and does not define any new faults of its own.

4.3. Semantics and syntax of non-WSDL protocols

4.4. Command-line tools

Please see the Aggregator Command Reference.

4.5. Overview of Graphical User Interface

There is no GUI specifically for the Aggregator Framework. The release contains WebMDS which can be used to display monitoring information in a web browser. Specifically, it can be directed at services based on the Aggregator Framework to display information about resources registered to the Aggregator Framework.

4.6. Semantics and syntax of domain-specific interface

4.6.1. Writing executable to be called by execution aggregator source

4.6.1.1. Introduction

The execution aggregation source provides a way to aggregate data (arbitrary XML information) about a registered resource using an arbitrary local executable (such as an external script). The executable will be passed registration information as parameters and is expected to output the gathered data, as detailed below.

A basic example of the use of this API is described in the ping test example for the aggregator execution source.

The execution aggregation source will periodically execute an identified executable. The identity of the executable and the frequency with which it is to run are specified in the registration message.

4.6.1.2. Registering

To register resources:

  • Create a configuration file in XML that specifies registrations. See $GLOBUS_LOCATION/etc/globus_wsrf_mds_aggregator/example-aggregator-registration.xml for several specific examples.
  • Run mds-servicegroup-add(1) to perform the registrations specified in that configuration file.

The configuration file consists of an optional defaultServiceGroupEPR, an optional defaultRegistrantEPR, an optional defaultSecurityDescriptorFile, and then one or more ServiceGroupRegistrationParameters blocks, each of which represents one registration.

The general syntax of the configuration file is:


<?xml version="1.0" encoding="UTF-8" ?>
<ServiceGroupRegistrations
  xmlns="http://mds.globus.org/servicegroup/client">

  <!-- An optional default service group EPR. -->
  <defaultServiceGroupEPR>
    <!-- Default service group EPR -->
  </defaultServiceGroupEPR>

  <!-- An optional default registrant EPR. -->
  <defaultRegistrantEPR>
    <!-- Default registrant EPR -->
  </defaultRegistrantEPR>

  <!-- An optional default security descriptor file. -->
  <defaultSecurityDescriptorFile>
    <!-- Path name of default security descriptor file -->
  </defaultSecurityDescriptorFile>

  <!-- One or more service group registration blocks: -->

  <ServiceGroupRegistrationParameters>
    <ServiceGroupEPR>
      <!-- EPR of the service group to register to -->
    </ServiceGroupEPR>
    <RegistrantEPR>
      <!-- EPR of the entity to be monitored. -->
    </RegistrantEPR>
    <InitialTerminationTime>
      <!-- Initial termination time -->
    </InitialTerminationTime>
    <RefreshIntervalSecs>
      <!-- Refresh interval, in seconds -->
    </RefreshIntervalSecs>
    <Content type="agg:AggregatorContent">
      <!-- Aggregator-source-specific configuration parameters -->
    </Content>
  </ServiceGroupRegistrationParameters>

</ServiceGroupRegistrations>

The following table describes the different blocks of the file and any parameters:

Table 3. Aggregator configuration parameters

Parameter | Description
defaultServiceGroupEPR | This block provides a convenient way to register a number of resources to a single service group -- for example, if you wish to register several resources to your default VO index, you can specify that index as the default service group and omit the ServiceGroupEPR blocks from each ServiceGroupRegistrationParameters block.
defaultRegistrantEPR | This block provides a convenient way to register a single resource to several service groups -- for example, if you wish to register your local GRAM server to several index servers, you can specify your GRAM server as the default registrant and omit the RegistrantEPR blocks from each ServiceGroupRegistrationParameters block.
defaultSecurityDescriptorFile | The path to the default security descriptor file.
ServiceGroupRegistrationParameters | Each ServiceGroupRegistrationParameters block specifies the parameters used to register a resource to a service group. The parameters specified in this block are described in the remaining rows.
ServiceGroupEPR | The EPR of the service group to register to. This parameter may be omitted if a defaultServiceGroupEPR block is specified; in this case, the value of defaultServiceGroupEPR will be used instead.
RegistrantEPR | The EPR of the resource to register. This parameter may be omitted if a defaultRegistrantEPR block is specified; in this case, the value of defaultRegistrantEPR will be used instead.
InitialTerminationTime | The initial termination time of this registration (this may be omitted). If the initial termination time is omitted, mds-servicegroup-add sets it to the current wall time plus twice the value of the RefreshIntervalSecs parameter.
RefreshIntervalSecs | The refresh interval of the registration, in seconds. mds-servicegroup-add(1) will attempt to refresh the registration according to this interval, by default extending the termination time of the registration by twice this interval on every successful refresh. If the termination time of the registration passes, the registration becomes subject to removal within at most 5 minutes.
Content | Aggregator-source-specific registration parameters. The content blocks for the various aggregator sources are described in detail in the following sections.
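The defaulting rules in this table (fall back to the default EPRs when a block omits its own, and use the current time plus twice RefreshIntervalSecs when InitialTerminationTime is omitted) can be sketched in a few lines of Python. This is a reading aid, not how mds-servicegroup-add is implemented. The function names, the example URLs, the bare Address elements inside the EPRs, and the decision to treat RefreshIntervalSecs as required are all assumptions made for the example.

```python
"""Sketch: work out the effective parameters of each registration block."""
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Namespace taken from the configuration file syntax in this section.
NS = {"c": "http://mds.globus.org/servicegroup/client"}


def _epr_summary(element):
    """Summarise an EPR block as its text content (a simplification for display)."""
    return "".join(element.itertext()).strip()


def resolve_registrations(xml_text, now):
    """Return the effective settings of every ServiceGroupRegistrationParameters block."""
    root = ET.fromstring(xml_text)
    default_group = root.find("c:defaultServiceGroupEPR", NS)
    default_registrant = root.find("c:defaultRegistrantEPR", NS)
    resolved = []
    for block in root.findall("c:ServiceGroupRegistrationParameters", NS):
        group = block.find("c:ServiceGroupEPR", NS)
        registrant = block.find("c:RegistrantEPR", NS)
        # Fall back to the file-level defaults when a block omits an EPR.
        group = default_group if group is None else group
        registrant = default_registrant if registrant is None else registrant
        if group is None or registrant is None:
            raise ValueError("registration has no EPR and no default to fall back on")
        refresh_text = block.findtext("c:RefreshIntervalSecs", namespaces=NS)
        if refresh_text is None:
            raise ValueError("RefreshIntervalSecs is missing")  # assumed to be required
        refresh = int(refresh_text)
        explicit = block.findtext("c:InitialTerminationTime", namespaces=NS)
        if explicit is not None:
            termination = explicit.strip()  # kept verbatim; its format is not given here
        else:
            # Default: current wall time plus twice the refresh interval.
            termination = (now + timedelta(seconds=2 * refresh)).isoformat()
        resolved.append({
            "service_group_epr": _epr_summary(group),
            "registrant_epr": _epr_summary(registrant),
            "refresh_interval_secs": refresh,
            "termination_time": termination,
        })
    return resolved


# Placeholder EPR content; the URLs and the bare Address elements are hypothetical.
SAMPLE = """<ServiceGroupRegistrations xmlns="http://mds.globus.org/servicegroup/client">
  <defaultServiceGroupEPR><Address>https://index.example.org/hypothetical</Address></defaultServiceGroupEPR>
  <ServiceGroupRegistrationParameters>
    <RegistrantEPR><Address>https://gram.example.org/hypothetical</Address></RegistrantEPR>
    <RefreshIntervalSecs>600</RefreshIntervalSecs>
  </ServiceGroupRegistrationParameters>
</ServiceGroupRegistrations>"""

now = datetime(2006, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
result = resolve_registrations(SAMPLE, now)
print(result[0]["termination_time"])  # 2006-01-01T12:20:00+00:00
```

With a 600 second refresh interval and no InitialTerminationTime, the computed termination time is 1200 seconds after the supplied current time, and the service group EPR comes from the default block.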

4.6.1.3. Configuration file: parameters for the execution aggregator source

The configuration block for ExecutionAggregatorSource (inside the Content block) looks like this:

<Content xsi:type="agg:AggregatorContent"
   xmlns:agg="http://mds.globus.org/aggregator/types">
  <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
    <agg:ExecutionPollType>
      <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
      <agg:ProbeName>dummy_namespace:probe_name</agg:ProbeName>
    </agg:ExecutionPollType>
  </agg:AggregatorConfig>
  <agg:AggregatorData/>
</Content>
    

where:

PollIntervalMillis

This parameter is the poll refresh period in milliseconds.

ProbeName

This parameter specifies the name of the probe to run. The probe is defined in the jndi-config.xml file for the service being configured (for example, the file for the MDS Index Service is $GLOBUS_LOCATION/etc/globus_wsrf_mds_index_jndi-config.xml). An executableMappings parameter should be defined within this file to map probe names to executable names. For example, the following definition maps the probe names aggr-test and pingexec to the executables aggregator-exec-test.sh and example-ping-exec, respectively. All executables are presumed to be in the directory $GLOBUS_LOCATION/libexec/aggrexec.

<resource name="configuration"
          type="org.globus.mds.aggregator.impl.AggregatorConfiguration">
  <resourceParams>
    <!-- ... -->
    <parameter>
      <name>executableMappings</name>
      <value>aggr-test=aggregator-exec-test.sh, pingexec=example-ping-exec</value>
    </parameter>
  </resourceParams>
</resource>
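To make the mapping concrete, the following Python sketch parses an executableMappings value and resolves a probe name to the path of its executable. It is a minimal illustration, not the framework's own lookup code. The function names are hypothetical, the comma-separated name=executable format is inferred from the example value above, and the choice to look up only the part of ProbeName after the namespace prefix is an assumption.

```python
"""Sketch: resolve a ProbeName to an executable path via executableMappings."""
import posixpath


def parse_executable_mappings(value):
    """Parse 'probe=executable, probe=executable' into a dictionary."""
    mappings = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        probe, separator, executable = pair.partition("=")
        if not separator or not probe.strip() or not executable.strip():
            raise ValueError(f"malformed mapping: {pair!r}")
        mappings[probe.strip()] = executable.strip()
    return mappings


def resolve_probe(probe_name, mappings, globus_location):
    """Return the path of the executable that a ProbeName refers to."""
    # Assumption: only the local part after the namespace prefix is looked up.
    local_name = probe_name.rpartition(":")[2]
    if local_name not in mappings:
        raise KeyError(f"no executableMappings entry for probe {local_name!r}")
    # All executables are presumed to live under libexec/aggrexec.
    return posixpath.join(globus_location, "libexec", "aggrexec", mappings[local_name])


mappings = parse_executable_mappings(
    "aggr-test=aggregator-exec-test.sh, pingexec=example-ping-exec")
path = resolve_probe("dummy_namespace:pingexec", mappings, "/opt/globus")
print(path)  # /opt/globus/libexec/aggrexec/example-ping-exec
```

Here /opt/globus stands in for the value of $GLOBUS_LOCATION; a probe name with no mapping raises an error rather than guessing a file name.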

4.6.1.4. Troubleshooting

If you've properly configured and registered your script for execution but are getting errors from the container because it cannot find the specified script, there are two likely causes.

First, make sure that your script or program is executable and is located in the $GLOBUS_LOCATION/libexec/aggrexec directory. In the configuration described above, specify only the name of the script or program, without any path qualification. For example, a ProbeName of test-script refers to the file $GLOBUS_LOCATION/libexec/aggrexec/test-script.

Next, make sure that you have correctly created an executableMappings definition in the appropriate jndi-config.xml file.
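The first of these checks is easy to automate. The sketch below is a hypothetical helper, not part of the toolkit, that reports the file-level problems described above for a given executable name. It does not inspect the jndi-config.xml file, so the executableMappings definition still has to be checked by hand.

```python
"""Sketch: check the file-level causes of a 'script not found' error."""
import os


def diagnose_executable(globus_location, executable_name):
    """Return a list of problems found for an executable under libexec/aggrexec."""
    problems = []
    # The name must be given without any qualification or path.
    if "/" in executable_name or os.sep in executable_name:
        problems.append("executable name must not contain a path")
    path = os.path.join(globus_location, "libexec", "aggrexec",
                        os.path.basename(executable_name))
    if not os.path.isfile(path):
        problems.append(f"{path} does not exist")
    elif not os.access(path, os.X_OK):
        problems.append(f"{path} is not executable")
    return problems


if __name__ == "__main__":
    location = os.environ.get("GLOBUS_LOCATION", "")
    for problem in diagnose_executable(location, "test-script"):
        print(problem)
```

An empty list means the file exists in the expected directory and carries execute permission for the current user, which may differ from the user the container runs as.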

4.6.1.5. Configuring the executable

4.6.1.5.1. Name of executable

The executable to run will be $GLOBUS_LOCATION/libexec/aggrexec/<scriptname> with scriptname supplied by the ProbeName parameter in the configuration file.

4.6.1.5.2. Input to executable

Information about the registration will be supplied as command line parameters and on stdin.

A single command line parameter will be supplied to the executable. This will be the URL from the EPR of the registered service.

Two XML documents will be sent to stdin, in sequence:

  1. The first document will be the full EPR to the registered service.
  2. The second document will be the AggregatorConfig block from the registration message (configuration file).

4.6.1.5.3. Output from executable

The executable must output a well-formed XML document to stdout. This output document will be delivered into the Aggregator Framework.
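As an illustration of this contract, here is a minimal sketch of a probe written in Python. It takes the service URL from the single command line parameter, reads the two XML documents from stdin, and writes one well-formed XML document to stdout. The output element names (ProbeResult, ServiceURL, InputDocuments) are hypothetical, and the way the two input documents are separated is an assumption: this guide does not specify how they are delimited, so the sketch treats them as arriving back to back and splits them by wrapping the input in a temporary root element.

```python
"""Sketch of a probe executable for the execution aggregator source."""
import re
import sys
import xml.etree.ElementTree as ET

# XML declarations are removed so that several documents can share one wrapper.
XML_DECLARATION = re.compile(r"<\?xml[^>]*\?>")


def split_documents(text):
    """Split back-to-back XML documents (assumed layout) into root elements."""
    body = XML_DECLARATION.sub("", text)
    wrapper = ET.fromstring(f"<wrapper>{body}</wrapper>")
    return list(wrapper)


def build_output(service_url, stdin_text):
    """Build the XML document that the probe writes to stdout."""
    documents = split_documents(stdin_text)
    result = ET.Element("ProbeResult")  # hypothetical element names throughout
    ET.SubElement(result, "ServiceURL").text = service_url
    ET.SubElement(result, "InputDocuments").text = str(len(documents))
    return ET.tostring(result, encoding="unicode")


def main(argv, stdin, stdout):
    # argv[1] is the URL from the EPR of the registered service.
    stdout.write(build_output(argv[1], stdin.read()))
    stdout.write("\n")
    return 0


# Run only when invoked with exactly one parameter, as the execution source does.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv, sys.stdin, sys.stdout))
```

A real probe would replace the body of build_output with the measurement it performs, for example a reachability check against the service URL. The only firm requirement from the text is that whatever is written to stdout is a well-formed XML document.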

4.7. Configuration interface

Please see the Configuring section of the System Administrator's Guide.

4.8. Environment variable interface

There are no environment variables specific to the aggregator framework.

5. Usage scenarios

5.1. Creating WS MDS services

The Aggregator Framework is used to create MDS services by linking an aggregator source (a Java class that implements the AggregatorSource interface to collect data) to an aggregator sink (a Java class that implements the AggregatorSink interface to process data, e.g., by providing a service interface for it). The AggregatorSource and AggregatorSink interfaces are documented in the Aggregator Public Interface Guide.

6. Tutorials

Use of the index service (based on the WS MDS Aggregator Framework) is covered in the Build a Grid Service Tutorial (GlobusWORLD 2005).

7. Debugging

See Section 7, “Debugging” for general information on logging, including which files to edit to set logging properties.

To turn on debug logging for the Aggregator framework, add the line:

log4j.category.org.globus.mds.aggregator=DEBUG

to the appropriate properties file.

8. Troubleshooting

General troubleshooting information can be found in the GT 4.1.0 Java WS Core Developer's Guide.

9. Related Documentation

Specifications for resource properties, service groups, and subscription/notification are available at http://www.globus.org/wsrf/.