GT 3.0: XPath Query support for Service Data Elements (Experimental)

Introduction

The current implementation of XPath query support for Service Data uses Xalan-J XPathAPI to evaluate XPath expressions on serialized DOM representations of Service Data Elements and return the results as an array of XML elements.

Audience

This document is targeted towards OGSA developers interested in evaluating XPath query expressions against OGSA Service Data Elements.

Assumptions

This document assumes that the Globus Toolkit® 3.0 has been installed and configured for your particular computing environment.

Organization

This document contains the following sections:

Related Documents

Here are some links to XPath documents that you may find useful:

Usage

An XPath query can be performed by issuing a findServiceData() method call and passing a ServiceDataXPathQueryExpressionType object as the parameter instead of ServiceDataNameQueryExpressionType.

ServiceDataXPathQueryExpressionType has the following WSDL structure:

<complexType name="ServiceDataXPathQueryExpressionType">
    <sequence>
        <element name="name" type="QName"/>
        <element name="XPath" type="string"/>
        <element name="namespaces" type="string" minOccurs="0"
          maxOccurs="unbounded"/>
    </sequence>
</complexType>

Parameter Explanation

QName: "name"
This is the QName of the SDE to select as the basis of the search.  Wildcarding is supported in either part of the QName.  If the asterisk (*) wildcard is specified, the XPath expression is applied in turn to each Service Data Element matched by the wildcard, and the entire result set is returned as an array of XML elements, each containing one or more child result elements from the XPath query.
String: "XPath"
This is the XPath expression to apply to the SDE specified by the "name" parameter.  The XPath language is sophisticated and the syntax is strict.  Improper syntax in query expressions will generate Invalid Expression exceptions.  Improper semantic usage may yield an undesirable or unexpected result set, or no results at all.  At this time, DOM Element types are the only return type supported by the XPath query evaluator.
String[]: "namespaces"
XPath requires that the client provide a namespace mapping for every node in the query scope that has a corresponding namespace attribute.  In the current implementation, this is provided via an array of strings of the form xmlns:<prefix>=<namespaceURI>.  For example:
xmlns:gsdl=http://www.gridforum.org/namespaces/2002/10/gridServices
xmlns:wsil=http://schemas.xmlsoap.org/ws/2001/10/inspection/
If namespace mappings are not provided, the default behavior is to use the current context node (in our case the SDE root element) to resolve the namespaces.  However, this may not be sufficient when searching for child nodes that contain namespace attributes not present in the root node, so one must be careful to provide all possible namespaces of interest that are likely to be encountered when traversing the SDE.
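
The mapping strings described above can be illustrated with a minimal sketch.  It does not use the toolkit's own classes: it relies on the standard JDK javax.xml.xpath API rather than the Xalan-J XPathAPI, and the class name, helper names, and sample XML are all hypothetical.  It shows strings of the form xmlns:<prefix>=<namespaceURI> being turned into a prefix lookup, and a query reaching a child node whose namespace is declared below the root element under a different prefix than the one the client uses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class NamespaceMappingSketch {

    // Hypothetical SDE: the properties namespace is declared on a child
    // element (not the root) and under the prefix "p", not "prop".
    static final String SAMPLE_SDE =
        "<gsdl:entry xmlns:gsdl='http://www.gridforum.org/namespaces/2003/03/OGSI'>"
        + "<gsdl:content>"
        + "<p:propertiesDetail xmlns:p='http://ogsa.globus.org/types/properties'>"
        + "<p:state>ACTIVE</p:state>"
        + "</p:propertiesDetail>"
        + "</gsdl:content>"
        + "</gsdl:entry>";

    /** Parses strings of the form xmlns:<prefix>=<namespaceURI> into a prefix-to-URI map. */
    static Map<String, String> parseMappings(String... mappings) {
        Map<String, String> byPrefix = new HashMap<>();
        for (String mapping : mappings) {
            int eq = mapping.indexOf('=');
            if (!mapping.startsWith("xmlns:") || eq < 0) {
                throw new IllegalArgumentException(
                    "Expected xmlns:<prefix>=<namespaceURI> but got: " + mapping);
            }
            byPrefix.put(mapping.substring("xmlns:".length(), eq), mapping.substring(eq + 1));
        }
        return byPrefix;
    }

    /** Wraps the map so the XPath engine can resolve prefixes used in the expression. */
    static NamespaceContext contextFor(final Map<String, String> byPrefix) {
        return new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix must not be null");
                }
                return byPrefix.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }
            // Reverse lookups are not needed for this sketch.
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        };
    }

    /** Evaluates the expression against the XML and returns the number of selected nodes. */
    static int countMatches(String xml, String expression, String... mappings) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed XPath steps to match
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(contextFor(parseMappings(mappings)));
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Query failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        int count = countMatches(SAMPLE_SDE,
            "/gsdl:entry/gsdl:content/prop:propertiesDetail/prop:state",
            "xmlns:gsdl=http://www.gridforum.org/namespaces/2003/03/OGSI",
            "xmlns:prop=http://ogsa.globus.org/types/properties");
        System.out.println("matching nodes: " + count);
    }
}
```

In this sketch the query matches because prefixes are compared by the URI they resolve to, not by their spelling, so the client-side "prop" prefix lines up with the document's "p" prefix.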

Note that although XPath allows a wildcard on the localPart of a QName within the XPath expression string, it does not allow a wildcard on the namespace URI or prefix.  For example, //*:* is an invalid query, but //gsdl:* is valid, as long as the "gsdl" prefix can be resolved to a URI.
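
The wildcard rule above can be checked with a small sketch.  It uses the standard JDK javax.xml.xpath API as a stand-in for the toolkit's evaluator, so the exact exception raised by the toolkit may differ; the class name and the compiles() helper are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

final class WildcardSketch {

    static final String GSDL_NS = "http://www.gridforum.org/namespaces/2003/03/OGSI";

    /** Resolves only the "gsdl" prefix; every other prefix is reported as unbound. */
    static final NamespaceContext GSDL_ONLY = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "gsdl".equals(prefix) ? GSDL_NS : XMLConstants.NULL_NS_URI;
        }
        // Reverse lookups are not needed for this sketch.
        @Override public String getPrefix(String uri) { return null; }
        @Override public Iterator<String> getPrefixes(String uri) {
            return Collections.emptyIterator();
        }
    };

    /** Returns true if the XPath engine accepts the expression, false if it rejects it. */
    static boolean compiles(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(GSDL_ONLY);
        try {
            xpath.compile(expression);
            return true;
        } catch (XPathExpressionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Wildcard on the local part with a resolvable prefix.
        System.out.println("//gsdl:*  accepted: " + compiles("//gsdl:*"));
        // Wildcard on the prefix, which XPath 1.0 does not permit.
        System.out.println("//*:*     accepted: " + compiles("//*:*"));
        // Wildcard on the local part, but the prefix cannot be resolved.
        System.out.println("//other:* accepted: " + compiles("//other:*"));
    }
}
```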

Current Client Implementations

Querying with the OGSA Service Browser GUI

The GridServicePortTypePanel.java class of the OGSA Service Browser GUI accepts two extra parameters: the XPath expression string and the namespace mapping string.  Enter the literal XPath expression string in the XPath text box, and enter a whitespace-, semicolon-, or comma-delimited set of namespace strings in the Namespace text box.
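
For a client of your own that accepts the same kind of delimited input, the following is one possible way to split it into individual mapping strings.  It is only a sketch: the browser's actual parsing code is not shown in this document, and the class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NamespaceFieldSketch {

    /** Splits a whitespace-, semicolon-, or comma-delimited string into mapping strings. */
    static List<String> splitMappings(String fieldText) {
        List<String> mappings = new ArrayList<>();
        // One or more whitespace, semicolon, or comma characters act as a single delimiter.
        for (String token : fieldText.split("[\\s;,]+")) {
            if (!token.isEmpty()) { // skip the empty token produced by a leading delimiter
                mappings.add(token);
            }
        }
        return mappings;
    }

    public static void main(String[] args) {
        String fieldText = "xmlns:gsdl=http://www.gridforum.org/namespaces/2003/03/OGSI; "
            + "xmlns:glue=http://glue.base.ogsa.globus.org/ce/1.1";
        for (String mapping : splitMappings(fieldText)) {
            System.out.println(mapping);
        }
    }
}
```

A splitter this simple would break any namespace URI that itself contains a comma or semicolon, so it is suitable only for URIs like the ones used in this document.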

Querying with the ogsi-find-service-data Command

Another way to query is with the ogsi-find-service-data command.  The following example shows how to query the container registry for all active services registered to it:

 ogsi-find-service-data \
     -service http://128.9.72.46:9103/ogsa/services/core/registry/ContainerRegistryService \
     -sde entry -querytype xpath \
     -xpath '/gsdl:entry[gsdl:content/prop:propertiesDetail/prop:state="ACTIVE"]' \
     -ns prop=http://ogsa.globus.org/types/properties \
     -output pretty

This command uses an XPath expression to query the Container Registry Service for the services registered to it.  The XPath expression uses both the default namespace, gsdl, and the prop prefix, which the -ns option maps to the properties namespace.  The XPath expression filters the results so that only active services are returned.  The -output pretty option specifies that the service data be returned in an easy-to-read outline form.

The ogsi-find-service-data command is described in detail in Querying Service Data.

Sample Queries

To help you get started with XPath expressions, here are some examples.  These expressions can be used with the OGSA Service Browser GUI, the ogsi-find-service-data client mentioned above, or any client you might create for this purpose.  Much more information on XPath expressions, including further examples, can be found in the documents listed in the Related Documents section at the beginning of this document.

The following namespace mappings are assumed for these sample queries:

xmlns:gsdl=http://www.gridforum.org/namespaces/2003/03/OGSI
xmlns:glue=http://glue.base.ogsa.globus.org/ce/1.1
prop=http://ogsa.globus.org/types/properties

Note that these sample queries are all single-line commands.

1.  Query against the container registry to get a list of active services:

/gsdl:entry[gsdl:content/prop:propertiesDetail/prop:state="ACTIVE"]

2.  Look in an Index Service for a file system called /scratch, residing on the host host1.isi.edu:

/glue:Cluster/glue:SubCluster/glue:Host[@glue:Name="host1.isi.edu"]/glue:FileSystem[@glue:Name="/scratch"]

3.  Select all clusters that have a host with more than 500 MB of memory available:

/glue:Cluster[glue:SubCluster/glue:Host/glue:MainMemory/@glue:RAMAvailable>500]

4.  Show information (OS, available memory, CPU load, etc.) for a host called host2.isi.edu that is being indexed by an Index Service:

/glue:Cluster/glue:SubCluster/glue:Host[@glue:Name="host2.isi.edu"]

5.  Show just the CPU load for host2.isi.edu in Example 4 above:

/glue:Cluster/glue:SubCluster/glue:Host[@glue:Name="host2.isi.edu"]/glue:ProcessorLoad
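
To experiment with queries like these without a running Index Service, the sketch below evaluates sample queries 2, 3, and 5 against a small in-memory document.  The XML is a hypothetical stand-in whose layout is inferred only from the paths in the queries, and real service data may be structured differently.  The standard JDK javax.xml.xpath API is used in place of the toolkit's evaluator, and the class and method names are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class GlueQuerySketch {

    static final String GLUE_NS = "http://glue.base.ogsa.globus.org/ce/1.1";

    // Hypothetical data: element and attribute names are taken from the sample queries only.
    static final String SAMPLE =
        "<glue:Cluster xmlns:glue='" + GLUE_NS + "'>"
        + "<glue:SubCluster>"
        + "<glue:Host glue:Name='host1.isi.edu'>"
        + "<glue:MainMemory glue:RAMAvailable='256'/>"
        + "<glue:FileSystem glue:Name='/scratch'/>"
        + "<glue:ProcessorLoad/>"
        + "</glue:Host>"
        + "<glue:Host glue:Name='host2.isi.edu'>"
        + "<glue:MainMemory glue:RAMAvailable='1024'/>"
        + "<glue:ProcessorLoad/>"
        + "</glue:Host>"
        + "</glue:SubCluster>"
        + "</glue:Cluster>";

    static final String QUERY_2 = "/glue:Cluster/glue:SubCluster"
        + "/glue:Host[@glue:Name=\"host1.isi.edu\"]/glue:FileSystem[@glue:Name=\"/scratch\"]";
    static final String QUERY_3 =
        "/glue:Cluster[glue:SubCluster/glue:Host/glue:MainMemory/@glue:RAMAvailable>500]";
    static final String QUERY_5 = "/glue:Cluster/glue:SubCluster"
        + "/glue:Host[@glue:Name=\"host2.isi.edu\"]/glue:ProcessorLoad";

    /** Resolves only the "glue" prefix. */
    static final NamespaceContext GLUE_ONLY = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "glue".equals(prefix) ? GLUE_NS : XMLConstants.NULL_NS_URI;
        }
        // Reverse lookups are not needed for this sketch.
        @Override public String getPrefix(String uri) { return null; }
        @Override public Iterator<String> getPrefixes(String uri) {
            return Collections.emptyIterator();
        }
    };

    /** Evaluates an element-selecting expression against SAMPLE and returns the DOM elements. */
    static List<Element> select(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(GLUE_ONLY);
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<Element> elements = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                elements.add((Element) hits.item(i));
            }
            return elements;
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Query failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        for (String query : new String[] {QUERY_2, QUERY_3, QUERY_5}) {
            System.out.println(query);
            for (Element element : select(query)) {
                System.out.println("  selected element: " + element.getLocalName());
            }
        }
    }
}
```

In query 3, the comparison against 500 is true if any RAMAvailable attribute reachable from the cluster exceeds 500, so the cluster is selected as soon as one of its hosts qualifies.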