GT 3.9.5 WS MDS Aggregator: System Administrator's Guide

Introduction

This guide contains advanced configuration information for system administrators working with WS MDS Aggregator. It provides references to information on procedures typically performed by system administrators, including installing, configuring, deploying, and testing the installation.

This information is in addition to the basic installation instructions in the GT 3.9.5 System Administrator's Guide.

Building and Installing

The aggregator framework is built and installed as part of the standard Globus Toolkit installation procedure.

Configuring

Configuration overview

An Aggregating Service Group is configured to perform a data aggregation by specifying an AggregatorContent object as the content parameter of a ServiceGroup add method invocation. An AggregatorContent object is composed of two xsd:any arrays, AggregatorConfig and AggregatorData:

  • AggregatorConfig is used to specify parameters that are to be passed to the underlying AggregatorSource when the ServiceGroup add method is invoked.  These parameters are generally type-specific to the implementation of the AggregatorSource and/or AggregatorSink being used.
  • The AggregatorData xsd:any array is used as the storage location for aggregated data that is the result of message deliveries to the AggregatorSink.  Generally, the AggregatorData parameter of the AggregatorContent is not populated when the ServiceGroup add method is invoked, but rather is populated by message delivery from the AggregatorSource.

Syntax of the interface

aggregator-types.xsd

The basic structure of the AggregatorContent type is defined in the file aggregator-types.xsd, the relevant fragment of which is shown below. In addition, there are per-source and per-sink configuration elements, which should be placed in the AggregatorConfig element of a registration if the appropriate source or sink is being used. These are detailed in a table below.

<xsd:complexType name="AggregatorConfig">
  <annotation><documentation>
    This type encapsulates multiple arbitrary aggregator configuration data
  </documentation></annotation>
  <xsd:sequence>
    <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="AggregatorData">
  <annotation><documentation>
    This type encapsulates multiple arbitrary aggregated content data.
  </documentation></annotation>
  <xsd:sequence>
    <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="AggregatorContent">
  <annotation><documentation>
    This type encapsulates the Aggregator's ServiceGroup content element,
    which is composed of two xsd:any arrays, one storing the aggregator
    configuration, the other storing the aggregated data.
  </documentation></annotation>
  <xsd:sequence>
    <xsd:element name="AggregatorConfig"
                 type="tns:AggregatorConfig"
                 minOccurs="1" maxOccurs="1"/>
    <xsd:element name="AggregatorData"
                 type="tns:AggregatorData"
                 minOccurs="1" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

Specifying the Aggregator Source

The aggregation source used to collect data can be changed from the default by editing the aggregatorSource parameter in the index configuration in $GLOBUS_LOCATION/etc/globus_wsrf_mds_index/jndi-config.xml:

  <resource name="configuration"
            type="org.globus.mds.index.impl.IndexConfiguration">
    <resourceParams>
      <parameter>
        <name>factory</name>
        <value>org.globus.wsrf.jndi.BeanFactory</value>
      </parameter>
      <parameter>
        <name>aggregatorSource</name>
        <value>org.globus.mds.aggregator.impl.QueryAggregatorSource</value>
      </parameter>
    </resourceParams>
  </resource>

This parameter specifies the Java class that will be used to collect data for the index. By default it is set to QueryAggregatorSource. It can be changed to one of the other sources supplied with the toolkit, or to one installed later. Details of the supplied sources are in the Aggregator Framework Developers Guide.
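
As a sketch of such a change, the fragment below shows only the aggregatorSource parameter, switched to the subscription-based source. The class name org.globus.mds.aggregator.impl.SubscriptionAggregatorSource is an assumption inferred from the package of the default source; confirm the exact name in the Aggregator Framework Developers Guide before using it.

```xml
<!-- Sketch only: this parameter replaces the existing aggregatorSource
     parameter inside resourceParams. The class name is assumed, not
     confirmed by this guide. -->
<parameter>
  <name>aggregatorSource</name>
  <value>org.globus.mds.aggregator.impl.SubscriptionAggregatorSource</value>
</parameter>
```

Because the registration content blocks are source-specific, registrations made against the index would then need the content block that matches the chosen source.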

Configuring the Aggregator Source

Configuration options are specified by creating a configuration file and running mds-servicegroup-add to perform the registrations specified in that configuration file. The syntax of that file is:
<?xml version="1.0" encoding="UTF-8" ?>
<ServiceGroupRegistrations
  xmlns="http://mds.globus.org/servicegroup/client" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xmlns:agg="http://mds.globus.org/aggregator/types">

   <defaultServiceGroupEPR>
      Default service group EPR
   </defaultServiceGroupEPR>

   <defaultRegistrantEPR>
      Default registrant EPR
   </defaultRegistrantEPR>

   <defaultSecurityDescriptorFile>
      Path name of security descriptor file
   </defaultSecurityDescriptorFile>

   <!-- One or more of the following: -->
   <ServiceGroupRegistrationParameters>
      <ServiceGroupEPR>
         EPR of the service group to register to
      </ServiceGroupEPR>
      <RegistrantEPR>
         EPR of the entity to be monitored.
      </RegistrantEPR>
      <InitialTerminationTime>
         Initial termination time
      </InitialTerminationTime>
      <RefreshIntervalSecs>
         Refresh interval, in seconds
      </RefreshIntervalSecs>
      <Content>
         Aggregator-source-specific configuration parameters
      </Content>
   </ServiceGroupRegistrationParameters>

</ServiceGroupRegistrations>

The following table describes the different blocks of the file and any parameters:

defaultServiceGroupEPR
This provides a convenient way to register a number of resources to a single service group. For example, if you wish to register several resources to your default VO index, you can specify that index as the default service group and omit the ServiceGroupEPR blocks from each ServiceGroupRegistrationParameters block.
defaultRegistrantEPR
This provides a convenient way to register a single resource to several service groups. For example, if you wish to register your local GRAM server to several index servers, you can specify your GRAM server as the default registrant and omit the RegistrantEPR blocks from each ServiceGroupRegistrationParameters block.
defaultSecurityDescriptorFile
The path to the security descriptor file.
ServiceGroupRegistrationParameters
Each ServiceGroupRegistrationParameters block specifies the parameters used to register a resource to a service group. The parameters specified in this block are:
  • ServiceGroupEPR: The EPR of the service group to register to. This parameter may be omitted if a defaultServiceGroupEPR block is specified; in this case, the value of defaultServiceGroupEPR will be used instead.
  • RegistrantEPR: The EPR of the resource to register. This parameter may be omitted if a defaultRegistrantEPR block is specified; in this case, the value of defaultRegistrantEPR will be used instead.
  • InitialTerminationTime: The initial termination time of this registration (this may be omitted).
  • RefreshIntervalSecs: The refresh interval, in seconds.
  • Content: Aggregator-source-specific registration parameters. The content blocks for the various aggregator sources are described in detail in the following sections.
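
To make the template concrete, here is a minimal sketch of a filled-in registration file that registers two resources to one index by way of defaultServiceGroupEPR. The host names, service paths, resource property QName, interval values, and the WS-Addressing namespace bound to the wsa prefix are all assumptions chosen for illustration; substitute the values that apply to your deployment.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<ServiceGroupRegistrations
  xmlns="http://mds.globus.org/servicegroup/client"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing"
  xmlns:ex="http://example.org/ns"
  xmlns:agg="http://mds.globus.org/aggregator/types">

   <!-- Hypothetical index; used by every block that omits ServiceGroupEPR. -->
   <defaultServiceGroupEPR>
      <wsa:Address>https://index.example.org:8443/wsrf/services/DefaultIndexService</wsa:Address>
   </defaultServiceGroupEPR>

   <!-- First hypothetical resource, polled once a minute. -->
   <ServiceGroupRegistrationParameters>
      <RegistrantEPR>
         <wsa:Address>https://host-a.example.org:8443/wsrf/services/ExampleService</wsa:Address>
      </RegistrantEPR>
      <RefreshIntervalSecs>600</RefreshIntervalSecs>
      <Content xsi:type="agg:AggregatorContent">
         <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
            <agg:GetResourcePropertyPollType>
               <agg:PollIntervalMillis>60000</agg:PollIntervalMillis>
               <agg:ResourcePropertyName>ex:ExampleProperty</agg:ResourcePropertyName>
            </agg:GetResourcePropertyPollType>
         </agg:AggregatorConfig>
         <agg:AggregatorData/>
      </Content>
   </ServiceGroupRegistrationParameters>

   <!-- Second hypothetical resource, same index and same polled property. -->
   <ServiceGroupRegistrationParameters>
      <RegistrantEPR>
         <wsa:Address>https://host-b.example.org:8443/wsrf/services/ExampleService</wsa:Address>
      </RegistrantEPR>
      <RefreshIntervalSecs>600</RefreshIntervalSecs>
      <Content xsi:type="agg:AggregatorContent">
         <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
            <agg:GetResourcePropertyPollType>
               <agg:PollIntervalMillis>60000</agg:PollIntervalMillis>
               <agg:ResourcePropertyName>ex:ExampleProperty</agg:ResourcePropertyName>
            </agg:GetResourcePropertyPollType>
         </agg:AggregatorConfig>
         <agg:AggregatorData/>
      </Content>
   </ServiceGroupRegistrationParameters>

</ServiceGroupRegistrations>
```

Neither registration block carries a ServiceGroupEPR, so both fall back to the default. The optional InitialTerminationTime and defaultSecurityDescriptorFile elements are left out of this sketch.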

ServiceGroupRegistration Content Blocks for QueryAggregatorSource

The QueryAggregatorSource can use one of the following three configuration blocks:

GetResourcePropertyPollType
If a GetResourcePropertyPollType block is used, QueryAggregatorSource will request a single resource property. The block has this form:
   <Content xsi:type="agg:AggregatorContent"
      xmlns:agg="http://mds.globus.org/aggregator/types">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:GetResourcePropertyPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ResourcePropertyName>rp_namespace:rp_localname</agg:ResourcePropertyName>
         </agg:GetResourcePropertyPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds; the ResourcePropertyName parameter is the QName of the resource property to poll for.
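
For illustration, a filled-in block might look like the following sketch, which polls one property every 60 seconds. The ex prefix, its namespace URI, and the property name ExampleProperty are hypothetical; the prefix used in the QName must be declared somewhere in scope.

```xml
<Content xsi:type="agg:AggregatorContent"
   xmlns:agg="http://mds.globus.org/aggregator/types"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:ex="http://example.org/ns">
   <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
      <agg:GetResourcePropertyPollType>
         <!-- 60000 ms = one poll per minute -->
         <agg:PollIntervalMillis>60000</agg:PollIntervalMillis>
         <!-- Hypothetical resource property QName -->
         <agg:ResourcePropertyName>ex:ExampleProperty</agg:ResourcePropertyName>
      </agg:GetResourcePropertyPollType>
   </agg:AggregatorConfig>
   <!-- Left empty at registration time; filled by message delivery -->
   <agg:AggregatorData/>
</Content>
```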
GetMultipleResourcePropertiesPollType
If a GetMultipleResourcePropertiesPollType block is used, QueryAggregatorSource will request one or more resource properties. The block has this form:
   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:GetMultipleResourcePropertiesPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ResourcePropertyNames>rp1_namespace:rp1_localname</agg:ResourcePropertyNames>
            <agg:ResourcePropertyNames>rp2_namespace:rp2_localname</agg:ResourcePropertyNames>
            <agg:ResourcePropertyNames>rp3_namespace:rp3_localname</agg:ResourcePropertyNames>
         </agg:GetMultipleResourcePropertiesPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds; the ResourcePropertyNames parameters are the QNames of the resource properties to poll for. There is no limit on the number of ResourcePropertyNames that may be specified.
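
A sketch of a filled-in block that polls two properties every five minutes is shown below. The two namespaces and property names are hypothetical; note that each QName goes in its own ResourcePropertyNames element and that each prefix must be declared.

```xml
<Content
     xmlns:agg="http://mds.globus.org/aggregator/types"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:exa="http://example.org/ns/a"
     xmlns:exb="http://example.org/ns/b"
     xsi:type="agg:AggregatorContent">
   <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
      <agg:GetMultipleResourcePropertiesPollType>
         <!-- 300000 ms = one poll every five minutes -->
         <agg:PollIntervalMillis>300000</agg:PollIntervalMillis>
         <!-- Hypothetical resource property QNames, one per element -->
         <agg:ResourcePropertyNames>exa:FirstProperty</agg:ResourcePropertyNames>
         <agg:ResourcePropertyNames>exb:SecondProperty</agg:ResourcePropertyNames>
      </agg:GetMultipleResourcePropertiesPollType>
   </agg:AggregatorConfig>
   <agg:AggregatorData/>
</Content>
```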
QueryResourcePropertiesPollType
If a QueryResourcePropertiesPollType block is used, QueryAggregatorSource will request that a query be executed against the Resource Property Set of the remote resource. In the GT 3.9.5 implementation of core, the only query language that is supported is XPath. The block has this form:
   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:QueryResourcePropertiesPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:QueryExpression Dialect="dialect">
               Query Expression
            </agg:QueryExpression>
         </agg:QueryResourcePropertiesPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds. The QueryExpression is an xsd:any element; the Dialect attribute specifies the dialect of the query expression.
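
The sketch below shows what an XPath-based block could look like. The Dialect value is the XPath 1.0 dialect URI defined by the WS-ResourceProperties specification, which is assumed here to be the one the toolkit expects; the query expression and the property name ExampleProperty are hypothetical.

```xml
<Content
     xmlns:agg="http://mds.globus.org/aggregator/types"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:type="agg:AggregatorContent">
   <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
      <agg:QueryResourcePropertiesPollType>
         <!-- 120000 ms = one query every two minutes -->
         <agg:PollIntervalMillis>120000</agg:PollIntervalMillis>
         <!-- Assumed dialect URI (XPath 1.0); hypothetical expression that
              selects every element whose local name is ExampleProperty -->
         <agg:QueryExpression
              Dialect="http://www.w3.org/TR/1999/REC-xpath-19991116">//*[local-name()='ExampleProperty']</agg:QueryExpression>
      </agg:QueryResourcePropertiesPollType>
   </agg:AggregatorConfig>
   <agg:AggregatorData/>
</Content>
```

Matching on local-name() avoids having to bind namespace prefixes inside the expression, at the cost of matching same-named elements from any namespace.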

ServiceGroupRegistration Content Blocks for SubscriptionAggregatorSource

The SubscriptionAggregatorSource gathers resource property values from the registered resource using WS-Notification subscriptions. The configuration block for SubscriptionAggregatorSource looks like this:
   <Content
        xmlns:agg="http://mds.globus.org/aggregator/types"
        xsi:type="agg:AggregatorContent">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:AggregatorSubscriptionType>
             <TopicExpression Dialect="dialect">
                Topic Expression
             </TopicExpression>
             <Precondition Dialect="dialect">
                Precondition
             </Precondition>
             <Selector Dialect="dialect">
                Selector
             </Selector>
             <SubscriptionPolicy>
                Subscription Policy
             </SubscriptionPolicy>
             <InitialTerminationTime>time</InitialTerminationTime>
         </agg:AggregatorSubscriptionType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The only required parameter is the TopicExpression, which specifies the topic expression to use in the subscription request. [TODO: link to generic notification/subscription docs].
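
A minimal sketch that supplies only the required TopicExpression is shown below. The Dialect value is assumed to be the WS-Topics "simple" dialect URI from the 2004/06 WS-Notification drafts, and the topic QName ex:ExampleProperty is hypothetical; verify both against the notification documentation for your toolkit version.

```xml
<Content
     xmlns:agg="http://mds.globus.org/aggregator/types"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:ex="http://example.org/ns"
     xsi:type="agg:AggregatorContent">
   <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
      <agg:AggregatorSubscriptionType>
         <!-- Assumed dialect URI; hypothetical topic name. The optional
              Precondition, Selector, SubscriptionPolicy and
              InitialTerminationTime elements are omitted. -->
         <TopicExpression
              Dialect="http://docs.oasis-open.org/wsn/2004/06/TopicExpression/Simple">ex:ExampleProperty</TopicExpression>
      </agg:AggregatorSubscriptionType>
   </agg:AggregatorConfig>
   <agg:AggregatorData/>
</Content>
```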

ServiceGroupRegistration Content Blocks for ExecutionAggregatorSource

The ExecutionAggregatorSource gathers arbitrary XML information about a registered resource by executing an external script and passing registration information as parameters. The configuration block for ExecutionAggregatorSource looks like this:
   <Content xsi:type="agg:AggregatorContent"
      xmlns:agg="http://mds.globus.org/aggregator/types">
      <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
         <agg:ExecutionPollType>
            <agg:PollIntervalMillis>interval_in_ms</agg:PollIntervalMillis>
            <agg:ProbeName>dummy_namespace:filename</agg:ProbeName>
         </agg:ExecutionPollType>
      </agg:AggregatorConfig>
      <agg:AggregatorData/>
   </Content>
The PollIntervalMillis parameter is the poll refresh period in milliseconds. The ProbeName parameter specifies the path name to the executable file, relative to the $GLOBUS_LOCATION/libexec/aggrexec directory. The path name should be specified as the local name part of this QName; the namespace part is ignored.
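
As an illustration, the following sketch runs a hypothetical script named example-probe.sh, assumed to be installed as $GLOBUS_LOCATION/libexec/aggrexec/example-probe.sh, every ten minutes. The unused prefix and its namespace URI are placeholders: the namespace part is ignored, but the prefix still has to be declared for the value to be a valid QName.

```xml
<Content xsi:type="agg:AggregatorContent"
   xmlns:agg="http://mds.globus.org/aggregator/types"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:unused="http://example.org/unused">
   <agg:AggregatorConfig xsi:type="agg:AggregatorConfig">
      <agg:ExecutionPollType>
         <!-- 600000 ms = one execution every ten minutes -->
         <agg:PollIntervalMillis>600000</agg:PollIntervalMillis>
         <!-- Hypothetical script name; resolved relative to
              $GLOBUS_LOCATION/libexec/aggrexec -->
         <agg:ProbeName>unused:example-probe.sh</agg:ProbeName>
      </agg:ExecutionPollType>
   </agg:AggregatorConfig>
   <agg:AggregatorData/>
</Content>
```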

Configuring the Aggregator Sink

An aggregator sink may require sink-specific configuration (the MDS Trigger service requires sink-specific configuration; the MDS Index service does not). See the documentation for the specific aggregator service being used for details on sink-specific configuration.

The aggregator framework does not have its own service-side configuration, although services that are based on the framework have their own service-side configuration options, documented in the per-service documentation.

Registrations to a working aggregator framework are configured using the mds-servicegroup-add tool, which is documented in the public interface guide. This tool takes an XML configuration file listing registrations and causes those registrations to be made.

The tool can be deployed at the aggregating service, at resource services, or at any other location. This allows registrations to be configured by the administrator of the aggregating service, by the administrator of the resources, by a third party, or by some combination of those.

Three Aggregator Sources are included in the Globus Toolkit distribution:
  • The QueryAggregatorSource gathers resource property values from the registered resource using one of the three WS-Resource Properties poll operations:
    • GetResourcePropertyPollType: requests a single Resource Property from the remote resource.
    • GetMultipleResourcePropertiesPollType: requests multiple Resource Properties from the remote resource.
    • QueryResourcePropertiesPollType: requests that a query be executed against the Resource Property Set of the remote resource.
  • The SubscriptionAggregatorSource gathers resource property values from the registered resource using WS-Notification subscriptions.
  • The ExecutionAggregatorSource gathers arbitrary XML information about a registered resource by executing an external script and passing registration information as parameters. See the developers guide for details of this API.

Deploying

This component is deployed as part of the standard toolkit installation.

Testing

[procedures for how to test the configuration. must include examples of the tests ]

Security Considerations

By default, the aggregator sources do not use authentication credentials -- they retrieve information using anonymous SSL authentication or no authentication at all, and thus retrieve only publicly available information. If a user or administrator changes that configuration so that a service's aggregator source uses credentials to acquire non-public data, then that user or administrator must configure the service's aggregator sink to limit access to authorized users.

Troubleshooting

[help for common problems sysadmins may experience]