Introduction
This guide contains information of interest to developers working with the RLS. It provides reference information for application developers, including APIs, architecture, procedures for using the APIs and code samples.
Table of Contents
- 1. Before you begin
- 2. Usage scenarios
- 1. Java Examples
- 1.1. Create a connection source
- 1.2. Create a connection
- 1.3. Creating and adding mappings
- 1.4. Create a simple index query
- 1.5. Query the index service
- 1.6. Create a simple catalog query
- 1.7. Alternate: Create a catalog wildcard query
- 1.8. Query the catalog service
- 1.9. Defining attributes
- 1.10. Undefining attributes
- 1.11. Adding attributes
- 1.12. Searching by attributes
- 2. Example Code
- 3. Tutorials
- 4. Architecture and design overview
- 5. APIs
- 6. Command line tools
- globus-rls-admin - RLS administration tool
- globus-rls-cli - RLS client tool
- globus-rls-server - RLS server tool
- 7. Configuring RLS
- 1. Configuration overview
- 2. Server configuration file (globus-rls-server.conf)
- 3. Basic configuration
- 4. Host key and certificate configuration
- 5. Configuring LRC to RLI updates
- 6. Configuring the RLS Server for the WS MDS Index Service
- 7. Configuring the RLS Server for the MDS2 GRIS
- 8. Complete RLS Server settings (globus-rls-server.conf)
- 8. Debugging
- 9. Troubleshooting
- 10. Related Documentation
- Glossary
- Index
Features New in GT 4.2.1
- None since GT 4.2.0.
Other Supported Features
- Comprehensive C and Java library for replica registration, replica lookup, replica attributes, index queries, and administrative tasks.
- Command line (globus-rls-cli) tool for client operations on catalogs and indexes.
- Command line (globus-rls-admin) tool for administrative tasks.
Deprecated Features
- None
Tested platforms for RLS include most 32-bit flavors of Linux and UNIX, including RedHat, Debian, CentOS, SUSE, Solaris/Sparc, Solaris/x86, and others.
Protocol changes since GT 4.2.0
- None
API changes since GT 4.2.0
- None
Exception changes since GT 4.2.0
- None
Schema changes since GT 4.2.0
- None
RLS depends on the following GT components:
- globus_core
- globus_common
- globus_io
- globus_gssapi_gsi
- globus_usage
RLS depends on the following 3rd party software:
- RDBMS: SQLite*, MySQL, PostgreSQL, or Oracle
- ODBC manager: iODBC, unixODBC
- ODBC driver: SQLite-ODBC*, MyODBC, psqlODBC, or Oracle
* The RLS comes installed with and configured to use these components.
Security recommendations include:
- Dedicated User Account: It is recommended that users create a dedicated user account for installing and running the RLS service (e.g., globus, as recommended in the general GT installation instructions). This account may be used to install and run other services from the Globus Toolkit.
- Key and Certificate: It is recommended that users do not use their hostkey and hostcert for use by the RLS service. Create a containerkey and containercert with permissions 400 and 644 respectively and owned by the globus user. Change the rlskeyfile and rlscertfile settings in the RLS configuration file ($GLOBUS_LOCATION/etc/globus-rls-server.conf) to reflect the appropriate filenames.
- LRC and RLI Databases: Users must ensure security of the RLS data as maintained by their chosen database management system. Appropriate precautions should be made to protect the data and access to the database. Such precautions may include creating a user account specifically for RLS usage, encrypting database users' passwords, etc.
- RLS Configuration: It is recommended that the RLS configuration file ($GLOBUS_LOCATION/etc/globus-rls-server.conf) be owned by and accessible only by the dedicated user account for RLS (e.g., the globus account per the above recommendations). The file contains the database user account and password used to access the LRC and RLI databases along with important settings which, if tampered with, could adversely affect the RLS service.
This section provides examples of a few basic operations using the new Java client API. Here is an outline of the typical steps used to resolve a replica location.
- Establish a connection to the Replica Location Index service
- Construct a list of the logical names to be used in the query
- Query the index service
- Inspect the return list and construct lists of logical names to be used in queries to the Local Replica Catalog services
- Query the catalog services
- Inspect the results returned by the catalog services
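The steps above can be sketched without the RLS client library as a two-level lookup over in-memory maps. The class `ReplicaLookupSketch`, its fields, and the example URLs are hypothetical stand-ins for an index service and its catalog services, not part of the RLS API. The sketch only illustrates how the result of the index query drives the catalog queries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the lookup flow; not the RLS client API.
class ReplicaLookupSketch {
    // Index: logical name -> URLs of catalogs that may hold mappings for it.
    final Map<String, List<String>> index = new LinkedHashMap<>();
    // Catalogs: catalog URL -> (logical name -> target names).
    final Map<String, Map<String, List<String>>> catalogs = new LinkedHashMap<>();

    // Registers a mapping in one catalog and advertises it in the index.
    void register(String catalogUrl, String logical, String target) {
        catalogs.computeIfAbsent(catalogUrl, k -> new LinkedHashMap<>())
                .computeIfAbsent(logical, k -> new ArrayList<>())
                .add(target);
        List<String> urls = index.computeIfAbsent(logical, k -> new ArrayList<>());
        if (!urls.contains(catalogUrl)) {
            urls.add(catalogUrl);
        }
    }

    // Resolves a logical name: ask the index first, then each catalog it names.
    List<String> resolve(String logical) {
        List<String> targets = new ArrayList<>();
        for (String url : index.getOrDefault(logical, Collections.emptyList())) {
            Map<String, List<String>> catalog = catalogs.get(url);
            // The index may be out of date, so a catalog can be missing
            // or can have no entry for the name.
            if (catalog == null) {
                continue;
            }
            targets.addAll(catalog.getOrDefault(logical, Collections.emptyList()));
        }
        return targets;
    }

    public static void main(String[] args) {
        ReplicaLookupSketch rls = new ReplicaLookupSketch();
        rls.register("rls://lrc-a.example.org", "my-logical-name-123", "ftp://foo/");
        rls.register("rls://lrc-b.example.org", "my-logical-name-123", "ftp://bar/");
        System.out.println(rls.resolve("my-logical-name-123")); // [ftp://foo/, ftp://bar/]
    }
}
```

The sections that follow show the same flow with the actual client classes.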
A connection source is needed in order to establish connections to the RLS. The connection source may be shared and is a thread safe object. The SimpleRLSConnectionSource may be directly instantiated by the client or it may be used as a JNDI object and shared by multiple clients (e.g., in a container that supports JNDI). In this example, the client instantiates the connection source.
RLSConnectionSource source = null;
try {
source = new SimpleRLSConnectionSource();
} catch (RLSException e) {
// handle exception
}
Use the connection source to establish a connection. If the source has defaults you may use its parameterless connect() method, otherwise you must supply the connection URL and credentials.
RLSConnection connection = null;
try {
connection = source.connect(url, credential);
} catch (RLSException e) {
// handle exception
}
The RLS catalog supports creating and adding mappings as distinct operations. A mapping is an association between a logical name and a target name. A logical name or target name is implicitly created when the first mapping between the logical name and a target name is created. For the first such mapping, the create operation must be used. Subsequent mappings between the logical name and other target names may be added using the add operation. Likewise, the logical name is implicitly deleted when the last mapping it appears in is removed.
// Creates (List and ArrayList are from java.util)
List creates = new ArrayList();
creates.add(
    new Mapping("my-logical-name-123", "ftp://foo/"));
// Adds
List adds = new ArrayList();
adds.add(
    new Mapping("my-logical-name-123", "ftp://bar/"));
try {
LocalReplicaCatalog catalog = connection.catalog();
List createResults = catalog.createMappings(creates);
List addResults = catalog.addMappings(adds);
// The results lists contain MappingResult objects. Each
// MappingResult object indicates an error with a result
// code. If all mappings succeed, the lists will be empty.
} catch (RLSException e) {
// handle exception
}
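The create, add, and implicit delete rules described above can be modelled in a few lines. This is a minimal sketch under the assumption that a catalog behaves like a map from logical names to sets of target names. `MappingSemanticsSketch` and its methods are hypothetical and are not the RLS API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of create/add/delete semantics; not the RLS API.
class MappingSemanticsSketch {
    private final Map<String, Set<String>> mappings = new LinkedHashMap<>();

    // create: only valid for the first mapping of a logical name.
    boolean create(String logical, String target) {
        if (mappings.containsKey(logical)) {
            return false; // the logical name already exists
        }
        Set<String> targets = new LinkedHashSet<>();
        targets.add(target);
        mappings.put(logical, targets);
        return true;
    }

    // add: only valid once the logical name exists.
    boolean add(String logical, String target) {
        Set<String> targets = mappings.get(logical);
        return targets != null && targets.add(target);
    }

    // delete: removing the last mapping implicitly removes the logical name.
    boolean delete(String logical, String target) {
        Set<String> targets = mappings.get(logical);
        if (targets == null || !targets.remove(target)) {
            return false;
        }
        if (targets.isEmpty()) {
            mappings.remove(logical);
        }
        return true;
    }

    boolean exists(String logical) {
        return mappings.containsKey(logical);
    }

    public static void main(String[] args) {
        MappingSemanticsSketch lrc = new MappingSemanticsSketch();
        System.out.println(lrc.add("my-logical-name-123", "ftp://bar/"));    // false
        System.out.println(lrc.create("my-logical-name-123", "ftp://foo/")); // true
        System.out.println(lrc.add("my-logical-name-123", "ftp://bar/"));    // true
        lrc.delete("my-logical-name-123", "ftp://foo/");
        lrc.delete("my-logical-name-123", "ftp://bar/");
        System.out.println(lrc.exists("my-logical-name-123"));               // false
    }
}
```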
Query objects are used to represent different types of RLS queries. There are simple queries, batch queries, and attribute searches. This example uses a simple query object. We begin by querying the RLS index service, which tells us which catalog services to query for a given logical name.
IndexQuery indexQuery = new SimpleIndexQuery(
SimpleIndexQuery.queryMappingsByLogicalName,
"my-logical-name-123",
null);
RLS index services keep an index of logical names for each catalog service that sends its index to the given index service. By querying an index service, the client can find out which catalog services may have replica locations for a desired logical name.
try {
ReplicaLocationIndex index = connection.index();
Results results = index.query(indexQuery);
if (results.getRC() == RLSStatusCode.RLS_SUCCESS) {
List batch = results.getBatch();
Iterator i = batch.iterator();
while (i.hasNext()) {
IndexMappingResult result = (IndexMappingResult) i.next();
if (result.getRC() != RLSStatusCode.RLS_SUCCESS)
continue;
String logicalName = result.getLogical();
String catalogURL = result.getCatalog();
// At this point, the client will need to create
// a CatalogQuery object for each distinct catalog
// URL returned in the results. These URLs indicate
// the catalog services which have replica locations
// for the given logical name.
}
}
} catch (RLSException e) {
// handle exception
}
Based on the results of the index query, a client will create catalog queries.
CatalogQuery catalogQuery = new SimpleCatalogQuery(
SimpleCatalogQuery.queryMappingsByLogicalName,
"my-logical-name-123",
null);
The catalog also supports wildcard queries. Use the '%' character for the wildcard.
CatalogQuery catalogQuery = new SimpleCatalogQuery(
SimpleCatalogQuery.queryMappingsByLogicalNamePattern,
"my-logical-name-%",
null);
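To make the wildcard semantics concrete, the following sketch translates a wildcard pattern into a `java.util.regex.Pattern` so that names can be matched locally. It covers the SQL style characters (`%` and `_`) and the Unix style characters (`*` and `?`) that the command line client also accepts. `WildcardSketch` is a hypothetical helper and says nothing about how the server evaluates patterns.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the RLS API: translates wildcard
// patterns into java.util.regex patterns for local matching.
class WildcardSketch {
    // many: wildcard for "zero or more characters"; one: "exactly one character".
    static Pattern compile(String pattern, char many, char one) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == many) {
                regex.append(".*");
            } else if (c == one) {
                regex.append('.');
            } else {
                // Quote everything else so characters such as '.' stay literal.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    static Pattern sql(String pattern) {
        return compile(pattern, '%', '_');
    }

    static Pattern unix(String pattern) {
        return compile(pattern, '*', '?');
    }

    public static void main(String[] args) {
        Pattern p = sql("my-logical-name-%");
        System.out.println(p.matcher("my-logical-name-123").matches()); // true
        System.out.println(p.matcher("another-name").matches());        // false
    }
}
```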
RLS catalog services keep a catalog of logical names mapped to target names. The target names are typically used to indicate the URL for a data object (e.g., a gsiftp, http, etc. URL).
try {
LocalReplicaCatalog catalog = connection.catalog();
Results results = catalog.query(catalogQuery);
if (results.getRC() == RLSStatusCode.RLS_SUCCESS) {
List batch = results.getBatch();
Iterator i = batch.iterator();
while (i.hasNext()) {
MappingResult result = (MappingResult) i.next();
if (result.getRC() != RLSStatusCode.RLS_SUCCESS)
continue;
String logicalName = result.getLogical();
String targetName = result.getTarget();
// At this point, the client has resolved the
// target name for the given logical name. Keep in
// mind that in the RLS, a logical name may be
// mapped to multiple target names.
}
}
} catch (RLSException e) {
// handle exception
}
RLS also supports attributes on objects (logical names and target names) in the catalog. Before assigning attributes to objects, attributes must be defined.
List defs = new ArrayList();
// Define a string attribute
defs.add(new RLSAttribute(
    "label",
    RLSAttribute.LRC_PFN,
    RLSAttribute.STR,
    null));
// Define an integer attribute
defs.add(new RLSAttribute(
    "series",
    RLSAttribute.LRC_PFN,
    RLSAttribute.INT,
    0));
try {
catalog.defineAttributes(defs);
} catch (RLSException e) {
// handle exception
}
Defined attributes may be undefined, and (optionally) all assigned attributes of the matching type may be cleared.
try {
catalog.undefineAttributes(defs, true);
} catch (RLSException e) {
// handle exception
}
Attributes may be added to objects (logical names or target names) in the catalog. Before adding an attribute, it must be defined. Once added, it may be modified or removed.
List attributes = new ArrayList();
// Add a string attribute
attributes.add(new RLSAttributeObject(new RLSAttribute(
    "label",
    RLSAttribute.LRC_PFN,
    "my-label-value"),
    "my-target-name-XYZ"));
// Add an integer attribute
attributes.add(new RLSAttributeObject(new RLSAttribute(
    "series",
    RLSAttribute.LRC_PFN,
    12345),
    "my-target-name-XYZ"));
try {
catalog.addAttributes(attributes);
} catch(RLSException e) {
// handle exception
}
In addition to catalog queries by logical name, target name, or wildcards, the catalog may be searched based on attribute values. Attribute searching supports the most common comparison operators (equality, less than, greater than, etc.), a similarity operator (like), and ranges (between).
// Specify search parameters
AttributeSearch asearch = new AttributeSearch(
"series", // the name of the attribute
RLSAttribute.LRC_PFN, // the object type
RLSAttribute.OPBTW, // the comparison operation
new RLSAttribute(
"series",
RLSAttribute.LRC_PFN,
0), // the left value
new RLSAttribute(
"series",
RLSAttribute.LRC_PFN,
99999), // the right value
null); // optional offset/limit
try {
Results results = catalog.query(asearch);
if (results.getRC() == RLSStatusCode.RLS_SUCCESS) {
List batch = results.getBatch();
Iterator i = batch.iterator();
while (i.hasNext()) {
// Returns RLSAttributeObject values
RLSAttributeObject aobj = (RLSAttributeObject) i.next();
if (aobj.rc != RLSStatusCode.RLS_SUCCESS)
continue;
// The search was on LRC_PFN objects therefore the
// key is the target name. If the search was on
// LRC_LFN objects the key would be the logical
// name. The search returns the attributes and the
// object (logical or target) names that the
// attributes are associated with. It does not
// return the mapping.
String targetName = aobj.key;
}
}
} catch (RLSException e) {
// handle exception
}
This section provides examples illustrating the basic usage of the client interfaces supported by the RLS. Using the client API, developers may create client applications that interact with the RLS server to perform replica location operations.
Developing in C
Client applications developed in C must do both of the following:
- Include the client header file at $GLOBUS_LOCATION/include/globus_rls_client.h.
- Link to the client shared library at $GLOBUS_LOCATION/lib/libglobus_rls_client_gcc32dbgpthr.
For C language example code, click here.
Developing in Java
Client applications developed in Java must do all of the following:
- Include the RLS Jar, $GLOBUS_LOCATION/lib/globus_rls_client.jar, in the CLASSPATH.
- Import the RLS Package org.globus.replica.rls.*.
For Java language example code, click here. Note that the examples in this section use the older, deprecated API.
For Java language example code using the new API, click here.
The Replica Location Service design consists of two components. Local Replica Catalogs (LRCs) maintain consistent information about logical-to-physical mappings on a site or storage system. The Replica Location Indexes (RLIs) aggregate state information contained in one or more LRCs and build a global, hierarchical distributed index to support discovery of replicas at multiple sites. LRCs send summaries of their state to RLIs using soft state update protocols. The server consists of a multi-threaded front end server and a back-end relational database, such as MySQL or PostgreSQL. The front end server can be configured to act as an LRC server and/or an RLI server. Clients access the server via a simple string-based RPC protocol. The client APIs support C, Java and Python. The APIs contain operations to create and delete mappings, associate attributes with mappings, and perform queries.
Detailed information on the architecture and design can be found in A Framework for Constructing Scalable Replica Location Services and Performance and Scalability of a Replica Location Service.
The RLS provides a Client API for C and Java based clients. The RLS Client C API is provided in the form of a library (e.g., a .so file). Any installation of RLS includes the header file and the shared library in the $GLOBUS_LOCATION/include and $GLOBUS_LOCATION/lib directories, respectively. The RLS Client Java API depends on the commons-logging and cog-jglobus libraries, which are typically located in the $GLOBUS_LOCATION/lib/common folder. The RLS Java Client jar is named globus_rls_client.jar and is typically installed in the $GLOBUS_LOCATION/lib folder.
Name
globus-rls-admin — RLS administration tool
Synopsis
globus-rls-admin -A|-a|-C option value|-c option|-d|-e|-p|-q|-s|-t timeout|-u|-v [ rli ] [ pattern ] [ server ]
Options
Table 6.1. Options for globus-rls-admin
| -A | Adds rli to the list of RLI servers updated by an LRC server using Bloom filters. Note: Partitions are not supported with Bloom filters. The LRC server maintains one Bloom filter for all LFNs in its database, which is sent to all RLI servers configured to receive Bloom filter updates with this option. |
| -a | Adds rli and optionally pattern to the list of RLI servers that the LRC server sends updates to (using a list of LFNs). If pattern is specified, then only LFNs matching it will be sent to rli. If rli is added with no patterns, then it is sent all updates. Pattern matching is done using standard Unix file globbing. |
| -C option value | Sets server option to value. Important: This does not update the configuration file. The next time the server is restarted, the configuration change will be lost. |
| -c option | Retrieves the configuration value for the specified option from the server. If option is set to all, then all options are retrieved. |
| -d | Removes rli and pattern from the list of RLI servers that the LRC server sends updates to. If pattern is not specified, then all entries for rli are removed. Note: If all patterns are removed separately, then rli is sent all updates. To stop any updates from being sent to rli, do not specify pattern. |
| -e | Clears the LRC database. Removes all lfn, pfn mappings. |
| -p | Verifies that the server is responding. |
| -q | Causes the RLS server to exit. |
| -S | Shows statistics and other information gathered by the RLS server. This is intended to be input into GRIS. |
| -s | Shows the list of RLI servers and patterns being sent updates by the LRC server. If rli or pattern are not specified, they are considered wildcards. |
| -t timeout | Sets timeout (in seconds) for RLS server requests. The default value is 30. |
| -u | Causes the LRC server to immediately start full soft state updates to any RLI servers previously added with the -a option. |
| -v | Shows the version and exits. |
Name
globus-rls-cli — RLS client tool
Synopsis
globus-rls-cli
Tool description
Provides a command line interface to some of the functions supported by RLS. It also supports an interactive interface (if command is not specified). In interactive mode, double quotes may be used to encode an argument that contains white space.
Options
The client command tool uses getopt for command line parsing.
Note: Some versions of getopt will continue scanning for options (words that begin with a hyphen) for the entire command line, which makes it impossible to specify a negative integer or floating point value for an attribute. The workaround for this problem is to tell getopt() that there are no more options by including 2 hyphens. For example, to specify the value -2 you must enter -- -2.
Table 6.2. Options for globus-rls-cli
| -c | Sets "clearvalues" flag when deleting an attribute (will remove any attribute value records when an attribute is deleted). |
| -h | Shows usage. |
| -l reslimit | Sets an incremental limit on the number of results returned by a wildcard query at a time. Note that all results will be returned by the client. This parameter only limits the number of results incrementally retrieved by the client during a single internal communication call. For instance, if the wildcard query produces 1000 results and the reslimit is set to 100, the client will internally make 10 calls to the server. From the user's perspective the client will simply return all 1000 results. Zero means no limit. |
| -s | Uses SQL style wildcards (% and _). |
| -t timeout | Sets timeout (in seconds) for RLS server requests. The default is 30 seconds. |
| -u | Uses Unix style wildcards (* and ?). |
| -v | Shows version. |
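The effect of the -l reslimit option can be illustrated with a small simulation. This is a sketch only: `ResultPagingSketch` is hypothetical, the "server" is a local list, and the loop stops once the known total has been retrieved. How the real client detects the end of a result set is not described in this guide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of incremental result retrieval; not the RLS client.
class ResultPagingSketch {
    int calls = 0; // number of simulated server round trips

    // Simulated server call: returns at most `limit` results starting at
    // `offset`. A limit of zero means "no limit".
    List<Integer> fetch(List<Integer> all, int offset, int limit) {
        calls++;
        int end = (limit == 0) ? all.size() : Math.min(all.size(), offset + limit);
        return new ArrayList<>(all.subList(offset, end));
    }

    // Client side: keeps fetching until every result has been retrieved,
    // then hands the complete list back to the caller.
    List<Integer> queryAll(List<Integer> all, int limit) {
        List<Integer> results = new ArrayList<>();
        while (results.size() < all.size()) {
            results.addAll(fetch(all, results.size(), limit));
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> serverSide = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            serverSide.add(i);
        }
        ResultPagingSketch client = new ResultPagingSketch();
        List<Integer> results = client.queryAll(serverSide, 100);
        System.out.println(results.size() + " results in " + client.calls + " calls");
        // prints "1000 results in 10 calls"
    }
}
```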
Commands
Table 6.3. Commands for globus-rls-cli
| add <lfn> <pfn> | Adds pfn to mappings of lfn in an LRC catalog. |
| attribute add <object> <attr> <obj-type> <attr-type> <value> | Adds an attribute to an object, where object should be the lfn or pfn name. obj-type should be one of lfn or pfn. attr-type should be one of date, float, int, or string. If <value> is of type date then it should be in the form "YYYY-MM-DD HH:MM:SS". |
| attribute bulk add <object> <attr> <obj-type> | Bulk adds attribute values. |
| attribute bulk delete <object> <attr> <obj-type> | Bulk deletes attributes. |
| attribute bulk query <attr> <obj-type> <object> | Bulk queries attributes. |
| attribute define <attr> <obj-type> <attr-type> | Defines a new attribute. |
| attribute delete <object> <attr> <obj-type> | Removes attribute from object. |
| attribute modify <object> <attr> <obj-type> <attr-type> <value> | Modifies the value of an attribute. |
| attribute query <object> <attr> <obj-type> | Retrieves the value of the specified attribute for object. |
| attribute search <attr> <obj-type> <operator> <attr-type> <value> | Searches for objects which have the specified attribute matching operator and value. operator should be one of =, !=, >, >=, <, or <=. |
| attribute show <attr> <obj-type> | Shows an attribute definition. If attr is a hyphen (-) then all attributes are shown. |
| attribute undefine <attr> <obj-type> | Deletes an attribute definition. Will return an error if any objects possess this attribute. |
| bulk add <lfn> <pfn> [<lfn> <pfn>] | Bulk adds lfn, pfn mappings. |
| bulk create <lfn> <pfn> [<lfn> <pfn>] | Bulk creates lfn, pfn mappings. |
| bulk delete <lfn> <pfn> [<lfn> <pfn>] | Bulk deletes lfn, pfn mappings. |
| bulk query lrc lfn [<lfn> ...] | Bulk queries the LRC for lfns. |
| bulk query lrc pfn [<pfn> ...] | Bulk queries the LRC for pfns. |
| bulk query rli lfn [<lfn> ...] | Bulk queries the RLI for lfns. |
| create <lfn> <pfn> | Creates a new lfn, pfn mapping in an LRC catalog. |
| delete <lfn> <pfn> | Deletes a lfn, pfn mapping from an LRC catalog. |
| exit | Exits the interactive session. |
| help | Prints a help message. |
| query lrc lfn <lfn> | Queries an LRC server for mappings of lfn. |
| query lrc pfn <pfn> | Queries an LRC server for mappings to pfn. |
| query rli lfn <lfn> | Queries an RLI server for mappings of lfn. |
| query wildcard lrc lfn <lfn-pattern> | Performs a wildcarded query of an LRC server for mappings of lfn-pattern. Patterns use the standard Unix wildcard characters: an asterisk (*) matches 0 or more characters, and a question mark (?) matches any single character. |
| query wildcard lrc pfn <pfn-pattern> | Queries an LRC server for mappings to pfn-pattern. Patterns use the standard Unix wildcard characters: an asterisk (*) matches 0 or more characters, and a question mark (?) matches any single character. |
| query wildcard rli lfn <lfn-pattern> | Queries an RLI server for mappings of lfn-pattern. Patterns use the standard Unix wildcard characters: an asterisk (*) matches 0 or more characters, and a question mark (?) matches any single character. |
| set reslimit <limit> | Sets an incremental limit on the number of results returned by a wildcard query at a time. Note that all results will be returned by the client. This parameter only limits the number of results incrementally retrieved by the client during a single internal communication call. For instance, if the wildcard query produces 1000 results and the reslimit is set to 100, the client will internally make 10 calls to the server. From the user's perspective the client will simply return all 1000 results. |
| set timeout <timeout> | Sets the timeout (in seconds) on calls to the RLS server. The default value is 30. |
| version | Shows the version and exits. |
Name
globus-rls-server — RLS server tool
Synopsis
globus-rls-server
Tool description
The RLS server (globus-rls-server) can be configured as either one or both of the following:
- Local Replica Catalog (LRC) server, which manages Logical FileName (LFN) to Physical FileName (PFN) mappings in a database. Note: If globus-rls-server is configured as an LRC server, the RLI servers that it sends updates to should be added to the database using globus-rls-admin.
- Replica Location Index (RLI) server, which manages mappings of LFNs to LRC servers.
Clients wishing to locate one or more physical filenames associated with a logical filename should first contact an RLI server, which will return a list of LRCs that may know about the LFN. The LRC servers are then contacted in turn to find the physical filenames.
Note: RLI information may be out of date, so clients should be prepared to get a negative response when contacting an LRC (or no response at all if the LRC server is unavailable).
Synopsis
[ -B lrc_update_bf ] [ -b maxbackoff ] [ -C rlscertfile ] [ -c conffile ] [ -d ] [ -e rli_expire_int ] [ -F lrc_update_factor ] [ -f maxfreethreads ] [ -I true|false ] [ -i idletimeout ] [ -K rlskeyfile ] [ -L loglevel ] [ -l true|false ] [ -M maxconnections ] [ -m maxthreads ] [ -N ] [ -o lrc_buffer_time ] [ -p pidfiledir ] [ -r true|false ] [ -S rli_expire_stale ] [ -s startthreads ] [ -t timeout ] [ -U myurl ] [ -u lrc_update_ll ] [ -v ]
LRC to RLI Updates
Two methods exist for LRC servers to inform RLI servers of their LFNs.
- By default, the LFNs are sent from the LRC to the RLI. This can be time consuming if the number of LFNs is large, but it does give the RLI an exact list of the LFNs known to the LRC, and it allows wildcard searching of the RLI.
- Alternatively, Bloom filters may be sent, which are highly compressed summaries of the LFNs. However, they do not allow wildcard searching and will generate more "false positives" when querying an RLI.
Please see below for more on Bloom filters.
globus-rls-admin can be used to manage the list of RLIs that an LRC server updates. This includes partitioning LFNs among multiple RLI servers.
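Partitioning can be pictured as a routing table from RLI servers to LFN patterns. The sketch below assumes the rules stated for the -a option of globus-rls-admin: an RLI with no patterns receives every LFN, and an RLI with patterns receives only the LFNs matching at least one of them. `PartitionSketch`, its minimal glob matcher (only `*` and `?`), and the example URLs are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical model of LFN partitioning among RLI servers; not RLS code.
class PartitionSketch {
    // RLI URL -> glob patterns; an empty list means "send all updates".
    final Map<String, List<String>> rlis = new LinkedHashMap<>();

    void addRli(String rli, String... patterns) {
        List<String> list = rlis.computeIfAbsent(rli, k -> new ArrayList<>());
        for (String p : patterns) {
            list.add(p);
        }
    }

    // Minimal glob support: '*' matches any run of characters, '?' exactly one.
    static boolean globMatches(String glob, String name) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), name);
    }

    // Returns the RLI servers that should be told about the given LFN.
    List<String> recipients(String lfn) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : rlis.entrySet()) {
            List<String> patterns = e.getValue();
            boolean send = patterns.isEmpty();
            for (String p : patterns) {
                send |= globMatches(p, lfn);
            }
            if (send) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PartitionSketch lrc = new PartitionSketch();
        lrc.addRli("rls://rli-all.example.org");
        lrc.addRli("rls://rli-run1.example.org", "run1/*");
        System.out.println(lrc.recipients("run1/file-a")); // both servers
        System.out.println(lrc.recipients("run2/file-b")); // only rli-all
    }
}
```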
A soft state algorithm is used in both update modes: periodically the LRC server sends its state (LFN information) to the RLI servers it updates. The RLI servers add these LFNs to their indexes or update timestamps if the LFNs were already known. RLI servers expire information about LFN, LRC mappings if they haven't been updated for a period longer than the soft state update interval.
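The soft state idea can be sketched as a table of timestamps that are refreshed by each update and swept periodically. This is an illustration only: `SoftStateIndexSketch` is hypothetical, it keeps its state in memory rather than in a database, and the clock is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical in-memory model of a soft state index; not the RLI server.
class SoftStateIndexSketch {
    // "lfn|lrc" -> time (in seconds) of the last update that mentioned the pair.
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final long maxAgeSeconds;

    SoftStateIndexSketch(long maxAgeSeconds) {
        this.maxAgeSeconds = maxAgeSeconds;
    }

    // A soft state update: new pairs are added, known pairs get a fresh timestamp.
    void update(String lrc, Iterable<String> lfns, long now) {
        for (String lfn : lfns) {
            lastSeen.put(lfn + "|" + lrc, now);
        }
    }

    // Drops every pair that has not been refreshed within maxAgeSeconds.
    int expire(long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() > maxAgeSeconds) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean knows(String lrc, String lfn) {
        return lastSeen.containsKey(lfn + "|" + lrc);
    }

    public static void main(String[] args) {
        SoftStateIndexSketch rli = new SoftStateIndexSketch(60);
        rli.update("rls://lrc.example.org", java.util.Arrays.asList("a", "b"), 0);
        rli.update("rls://lrc.example.org", java.util.Arrays.asList("a"), 50);
        System.out.println(rli.expire(100));                     // 1 ("b" was not refreshed)
        System.out.println(rli.knows("rls://lrc.example.org", "a")); // true
    }
}
```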
The following options in the configuration file control the soft state algorithm when an LRC updates an RLI by sending LFNs:
- rli_expire_int (seconds)
- rli_expire_stale (seconds)
- lrc_update_ll (seconds)
- lrc_update_bf (seconds)
Updates to an LRC (new LFNs or deleted LFNs) normally don't propagate to RLI servers until the next soft state update (controlled by options lrc_update_ll and lrc_update_bf).
However, by enabling "immediate update" mode (set lrc_update_immediate to true), an LRC will send updates to an RLI within lrc_buffer_time seconds.
If updates are done with LFN lists, then only the LFNs that have been added to or deleted from the LRC are sent. If Bloom filters are used, then the entire Bloom filter is sent.
When immediate updates are enabled, the interval between soft state updates is multiplied by lrc_update_factor as long as no updates have failed (LRC and RLI are considered to be in sync). This can greatly reduce the number of soft state updates an LRC needs to send to an RLI.
Incremental updates are buffered by the LRC server until either 200 updates have accumulated (when LFN lists are used), or lrc_buffer_time seconds have passed since the last update.
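The buffering rule above (flush at 200 buffered changes or after lrc_buffer_time seconds, whichever comes first) can be sketched as follows. `UpdateBufferSketch` is hypothetical and uses an explicit clock argument. It models only the flush decision, not the network transfer or failure handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of incremental update buffering; not the LRC server.
class UpdateBufferSketch {
    static final int MAX_BUFFERED = 200;   // threshold quoted in the text
    private final long bufferTimeSeconds;  // plays the role of lrc_buffer_time
    private final List<String> pending = new ArrayList<>();
    private long lastFlush;

    UpdateBufferSketch(long bufferTimeSeconds, long now) {
        this.bufferTimeSeconds = bufferTimeSeconds;
        this.lastFlush = now;
    }

    // Records one added or deleted LFN; returns the batch to send, or null.
    List<String> record(String change, long now) {
        pending.add(change);
        return maybeFlush(now);
    }

    // Can also be called periodically when no new change arrives.
    List<String> maybeFlush(long now) {
        boolean full = pending.size() >= MAX_BUFFERED;
        boolean timedOut = !pending.isEmpty() && now - lastFlush >= bufferTimeSeconds;
        if (!full && !timedOut) {
            return null;
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        lastFlush = now;
        return batch;
    }

    public static void main(String[] args) {
        UpdateBufferSketch buffer = new UpdateBufferSketch(30, 0);
        System.out.println(buffer.record("+lfn-1", 5));  // null: still buffering
        System.out.println(buffer.maybeFlush(30));       // [+lfn-1]: time limit reached
    }
}
```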
Bloom filter updates
A Bloom filter is an array of bits. Each LFN is hashed multiple times and the corresponding bits in the Bloom filter are set.
Querying an RLI to verify if an LFN exists is done by performing the same hashes and checking if the bits in the filter are on. If not, then the LFN is known not to exist. If they're all on, then all that's known is that the LFN probably exists.
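A minimal Bloom filter along these lines might look as follows. This is a sketch under stated assumptions: `BloomFilterSketch` is hypothetical, and the hash scheme (an FNV-1a hash combined with `String.hashCode()` by double hashing) was chosen for brevity. The guide does not say which hash functions the server uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical Bloom filter; the hash functions are an assumption.
class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int numHashes;

    BloomFilterSketch(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int position(String lfn, int i) {
        int h1 = 0x811c9dc5; // 32-bit FNV-1a offset basis
        for (byte b : lfn.getBytes(StandardCharsets.UTF_8)) {
            h1 ^= (b & 0xff);
            h1 *= 16777619;  // 32-bit FNV prime
        }
        int h2 = lfn.hashCode() | 1; // forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    // Sets one bit per hash function.
    void add(String lfn) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(lfn, i));
        }
    }

    // false: the LFN was definitely never added.
    // true:  the LFN was probably added (false positives are possible).
    boolean mightContain(String lfn) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(lfn, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(10 * 1000, 3);
        filter.add("my-logical-name-123");
        System.out.println(filter.mightContain("my-logical-name-123")); // true
    }
}
```

An added name always tests positive. A name that was never added usually tests negative, but can occasionally test positive.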
The size of the Bloom filter (as a multiple of the number of LFNs) and the number of hash functions control the false positive rate. The default values of 10 and 3 give a false positive rate of approximately 1%.
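The relationship between filter size, hash count, and false positive rate can be estimated with the textbook approximation p = (1 - e^(-k/r))^k, where r is the number of filter bits per LFN and k is the number of hash functions. This approximation assumes independent, uniformly distributed hash functions, which is an assumption here rather than a documented property of the server. For a ratio of 10 and 3 hash functions it yields roughly 1.7%, the same order of magnitude as the figure quoted above. `BloomRateSketch` is a hypothetical helper.

```java
// Hypothetical helper that evaluates the standard Bloom filter estimate.
class BloomRateSketch {
    // p = (1 - e^(-k/r))^k, with r bits per stored element and k hash functions.
    static double falsePositiveRate(double bitsPerElement, int numHashes) {
        return Math.pow(1.0 - Math.exp(-numHashes / bitsPerElement), numHashes);
    }

    public static void main(String[] args) {
        System.out.println("ratio 10, 3 hashes: " + falsePositiveRate(10, 3));
        System.out.println("ratio 20, 3 hashes: " + falsePositiveRate(20, 3));
    }
}
```

Raising the ratio lowers the estimated rate at the cost of a larger filter.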
The advantage of Bloom filters is their efficiency. For example, if the LRC has 1,000,000 LFNs in its database, with an average length of 20 bytes, then 20,000,000 bytes must be sent to an RLI during a soft state update (assuming no partitioning). The RLI server must perform 1,000,000 updates to its database to create new LFN, LRC mappings or update timestamps on existing entries. With Bloom filters only 1,250,000 bytes are sent (10 x 1,000,000 bits / 8), and there are no database operations on the RLI (Bloom filters are maintained entirely in memory). In one comparison, a 1,000,000 LFN update took 20 minutes when all the LFNs were sent and less than 1 second when a Bloom filter was used. However, as noted before, Bloom filters do not support wildcard searches of an RLI.
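The byte counts in this example follow from simple arithmetic, reproduced below so that other database sizes can be tried. `UpdateSizeSketch` is a hypothetical helper. It counts payload bytes only and ignores any protocol overhead.

```java
// Hypothetical helper reproducing the update size arithmetic from the text.
class UpdateSizeSketch {
    // Bytes sent when every LFN is transmitted in full.
    static long lfnListBytes(long numLfns, long avgLfnBytes) {
        return numLfns * avgLfnBytes;
    }

    // Bytes sent for a Bloom filter of bitsPerLfn bits per LFN, rounded up.
    static long bloomFilterBytes(long numLfns, long bitsPerLfn) {
        return (numLfns * bitsPerLfn + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(lfnListBytes(1_000_000, 20));     // 20000000
        System.out.println(bloomFilterBytes(1_000_000, 10)); // 1250000
    }
}
```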
Note: An LRC server can update some RLIs with Bloom filters and others with LFNs. However, an RLI server can only be updated using one method.
The following options in the configuration file control Bloom filter updates:
- rli_bloomfilter true|false
- rli_bloomfilter_dir none|default|pathname
- lrc_bloomfilter_numhash N
- lrc_bloomfilter_ratio N
- lrc_update_bf seconds
Log Messages
globus-rls-server uses syslog to log errors and other information (facility LOG_DAEMON) when it's running in normal (daemon) mode.
If the -d option (debug) is specified, then log messages are written to stdout.
Signals
The server will reread its configuration file if it receives a HUP signal. It will wait for all current requests to complete and shut down cleanly if sent any of the following signals: INT, QUIT or TERM.
Options (globus-rls-server)
The following table describes the command line options available for globus-rls-server:
Table 6.4. Options for globus-rls-server
| -b maxbackoff | Maximum time (in seconds) that globus-rls-server will attempt to reopen the socket it listens on after an I/O error. |
| -C rlscertfile | Name of the X.509 certificate file that identifies the server; sets environment variable X509_USER_CERT. |
| -c conffile | Name of the configuration file for the server. The default is $GLOBUS_LOCATION/etc/globus-rls-server.conf if the environment variable GLOBUS_LOCATION is set; else, /usr/local/etc/globus-rls-server.conf. |
| -d | Enables debugging. The server will not detach from the controlling terminal, and log messages will be written to stdout rather than syslog. For additional logging verbosity set the loglevel (see the -L option) to higher values. |
| -e rli_expire_int | Interval (seconds) at which an RLI server should expire stale entries. |
| -F lrc_update_factor | If lrc_update_immediate mode is on, and the LRC server is in sync with an RLI server (an LRC and RLI are synced if there have been no failed updates since the last full soft state update), then the interval between RLI updates for this server (lrc_update_ll) is multiplied by lrc_update_factor. |
| -f maxfreethreads | Maximum number of idle threads the server will leave running. Excess threads are terminated. |
| -I true|false | Turns LRC to RLI immediate update mode on (true) or off (false). The default value is false. |
| -i idletimeout | Seconds after which idle client connections are timed out. |
| -K rlskeyfile | Name of the X.509 key file. Sets environment variable X509_USER_KEY. |
| -L loglevel | Sets the log level. By default this is 0, which means only errors will be logged. Higher values mean more verbose logging. |
| -l true|false | Configures whether the server is an LRC server. The default is false. |
| -M maxconnections | Maximum number of active connections. It should be small enough to prevent the server from running out of open file descriptors. The default value is 100. |
| -m maxthreads | Maximum number of threads server will start up to support simultaneous requests. |
| -N | Disables authentication checking. This option is intended for debugging. Clients should use the URL RLSN://host to disable authentication on the client side. |
| -o lrc_buffer_time | LRC to RLI updates are buffered until either the buffer is full or this much time (in seconds) has elapsed since the last update. The default value is 30. |
| -p pidfiledir | Directory where PID files should be written. |
| -r true|false | Configures whether the server is an RLI server. The default value is false. |
| -S rli_expire_stale | Interval (in seconds) after which entries in the RLI database are considered stale (presumably because they were deleted in the LRC). Stale entries are not returned in queries. |
| -s startthreads | Number of threads to start up initially. |
| -t timeout | Timeout (in seconds) for calls to other RLS servers (in other words, for LRC calls to send an update to an RLI). A value of 0 disables timeouts. The default value is 30. |
| -U myurl | URL for this server. |
| -u lrc_update_ll | Interval (in seconds) between lfn-list LRC to RLI updates. |
| -v | Shows version and exits. |
RLS configuration involves statically defined system settings specified in the RLS configuration file (see $GLOBUS_LOCATION/etc/globus-rls-server.conf), settings changed temporarily at run time using the RLS Admin tool (see the globus-rls-admin(1) -C option value command), and LRC-to-RLI and RLI-to-RLI updates configured using the RLS Admin tool (see the globus-rls-admin(1) -a, -A, and -d commands).
Configuration settings for the RLS are specified in the globus-rls-server.conf file. If the configuration file is not specified on the command line (see the -c option), the server looks for it in one of the following locations:
- $GLOBUS_LOCATION/etc/globus-rls-server.conf
- /usr/local/etc/globus-rls-server.conf if GLOBUS_LOCATION is not set
Note: Command line options always override items found in the configuration file.
The configuration file is a sequence of lines consisting of a keyword, whitespace, and a value. Comments begin with # and end with a newline.
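The file format described above is simple enough to read with a few lines of code. The following Python sketch is illustrative only and is not part of RLS: the function names `default_config_path` and `parse_rls_config` are hypothetical, and it assumes that a `#` never appears inside a value. It resolves the default file location and splits each line into a keyword and a value.

```python
import os
from typing import Dict, List, Mapping, Optional


def default_config_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the default config location: under GLOBUS_LOCATION if set, else /usr/local."""
    env = os.environ if environ is None else environ
    base = env.get("GLOBUS_LOCATION")
    if base:
        return base.rstrip("/") + "/etc/globus-rls-server.conf"
    return "/usr/local/etc/globus-rls-server.conf"


def parse_rls_config(text: str) -> Dict[str, List[str]]:
    """Parse 'keyword <whitespace> value' lines; '#' starts a comment."""
    settings: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        # Assumption: '#' always starts a comment, even in the middle of a line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)  # keyword first, the rest is the value
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # Values are kept in a list in case a keyword appears more than once.
        settings.setdefault(keyword, []).append(value)
    return settings


SAMPLE = """\
# Configure the database connection info
db_user dbuser
db_pwd dbpassword
lrc_server true
rli_dbname rli1000 # Not needed if updated by Bloom filters
acl .*: all
"""

if __name__ == "__main__":
    print(default_config_path())
    for key, values in parse_rls_config(SAMPLE).items():
        print(key, values)
```

Because command line options override the file, a caller would typically apply any command line values on top of the dictionary returned here.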
Review the server configuration file $GLOBUS_LOCATION/etc/globus-rls-server.conf and change any options you want. The server man page globus-rls-server(8) has complete details on all options. The complete details are also provided later in this section.
A minimal configuration file for both an LRC and RLI server would be:
# Configure the database connection info
db_user dbuser
db_pwd dbpassword
# If the server is an LRC server
lrc_server true
lrc_dbname lrc1000
# If the server is an RLI server
rli_server true
rli_dbname rli1000 # Not needed if updated by Bloom filters
# Configure who can make requests of the server
acl .*: all
# RE matching grid-mapfile users or DNs from x509 certs
...
The server uses a host certificate to identify itself to clients. By default this certificate is located in the files /etc/grid-security/hostcert.pem and /etc/grid-security/hostkey.pem. Host certificates have a distinguished name of the form /CN=host/FQDN. If the host you plan to run the RLS server on does not have a host certificate, you must obtain one from your Certificate Authority. The RLS server must be run as the same user who owns the host certificate files (typically root). The location of the host certificate files may be specified in $GLOBUS_LOCATION/etc/globus-rls-server.conf:
rlscertfile path-to-cert-file # default /etc/grid-security/hostcert.pem
rlskeyfile path-to-key-file # default /etc/grid-security/hostkey.pem
It is possible to run the RLS server without authentication by starting it with the -N option and using URLs of the form rlsn://server to connect to it. Notice that the URL scheme is rlsn as opposed to rls.
It is generally recommended to run the server under a user account other than root for added security. To do so, you will need to create a complementary set of key and certificate files owned by a designated user account, globus for instance.
- Begin by copying /etc/grid-security/hostcert.pem and /etc/grid-security/hostkey.pem to /etc/grid-security/containercert.pem and /etc/grid-security/containerkey.pem. Note that we use the prefix "container" to conform with the recommended naming scheme for other services distributed with the Globus Toolkit.
  % cp /etc/grid-security/hostcert.pem /etc/grid-security/containercert.pem
  % cp /etc/grid-security/hostkey.pem /etc/grid-security/containerkey.pem
- Then change ownership of the files to the designated user account, globus in our example.
  % chown globus /etc/grid-security/containercert.pem
  % chown globus /etc/grid-security/containerkey.pem
- Change the rlskeyfile and rlscertfile settings in the RLS configuration file ($GLOBUS_LOCATION/etc/globus-rls-server.conf) to reflect the appropriate filenames.
  rlscertfile /etc/grid-security/containercert.pem
  rlskeyfile /etc/grid-security/containerkey.pem
- Finally, bear in mind that your certificate and key files must always have file permissions 644 and 400 respectively.
  % ls -l /etc/grid-security/*.pem
  -rw-r--r-- 1 globus gridstaff 818 Dec 8 2005 /etc/grid-security/containercert.pem
  -r-------- 1 globus gridstaff 887 Dec 8 2005 /etc/grid-security/containerkey.pem
  -rw-r--r-- 1 root root 818 Dec 8 2005 /etc/grid-security/hostcert.pem
  -r-------- 1 root root 887 Dec 8 2005 /etc/grid-security/hostkey.pem
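The permission requirement in the last step can be checked with a script before starting the server. The following Python sketch is an illustration, not an RLS tool: the helper names `check_mode` and `check_credentials` are hypothetical, and it only compares the permission bits against 644 for the certificate and 400 for the key.

```python
import os
import stat
from typing import List


def check_mode(path: str, expected: int) -> bool:
    """Return True if the permission bits of path equal expected (e.g. 0o644)."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual == expected


def check_credentials(cert_path: str, key_path: str) -> List[str]:
    """Return a list of problems; an empty list means both files have the required modes."""
    problems = []
    for path, expected in ((cert_path, 0o644), (key_path, 0o400)):
        if not check_mode(path, expected):
            actual = stat.S_IMODE(os.stat(path).st_mode)
            problems.append(f"{path}: mode {actual:o}, expected {expected:o}")
    return problems


if __name__ == "__main__":
    cert = "/etc/grid-security/containercert.pem"
    key = "/etc/grid-security/containerkey.pem"
    if os.path.exists(cert) and os.path.exists(key):
        for problem in check_credentials(cert, key):
            print(problem)
```

The sketch does not check file ownership; that would need a comparison of `os.stat(path).st_uid` against the account that runs the server.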
If authentication is enabled, RLI servers must include acl configuration options that match the identities of the LRC servers that update them and that grant the rli_update permission to those LRCs.
One of the key benefits of using the RLS for managing replica location information is its distributed architecture. In a distributed deployment, one or more Local Replica Catalog (LRC) services send updates of their contents to one or more Replica Location Index (RLI) services.
By default the installed LRC is not configured to send updates to any RLI, even the local RLI co-located with the local LRC. Use the globus-rls-admin(1) tool to configure the LRC to send updates to one or more RLI services.
To configure the LRC to send uncompressed lists of its logical names to an RLI, use the following command:
% $GLOBUS_LOCATION/bin/globus-rls-admin -a rls://rli_host rls://lrc_host
To configure the LRC to send compressed bitmaps (using Bloom filters) of its logical names to an RLI, use the following command:
% $GLOBUS_LOCATION/bin/globus-rls-admin -A rls://rli_host rls://lrc_host
To configure the LRC to stop sending updates to an RLI, use the following command:
% $GLOBUS_LOCATION/bin/globus-rls-admin -d rls://rli_host rls://lrc_host
Note: While any given LRC is capable of sending uncompressed or compressed updates to any RLI, the RLI service must be configured to accept either uncompressed or compressed updates, but not both. See the
There are tradeoffs between using uncompressed and compressed updates in your configuration. The advantage of using compressed updates, not surprisingly, is a significant reduction in network overhead and memory usage. As replica location mappings grow into the tens of millions or more, the savings from compressed updates become important. On the other hand, because the Bloom filter bitmap represents the logical names in the LRC only in compressed form, wildcard queries at the RLI cannot be supported when update compression is used.
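To make the tradeoff concrete, the following Python sketch shows how a Bloom filter summarizes a set of logical names. It is a generic illustration and not the RLS implementation: the class name, the use of salted SHA-256 as the hash functions, and the sizing of 10 bits per name with 3 hash functions are all assumptions chosen for the example.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """A fixed-size bit map that answers 'possibly present' or 'definitely absent'."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name: str) -> Iterator[int]:
        # Assumption: derive each hash function by salting SHA-256 with its index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name: str) -> None:
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, name: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(name))


if __name__ == "__main__":
    names = [f"lfn-{i}" for i in range(1000)]  # hypothetical logical names
    bf = BloomFilter(num_bits=10 * len(names), num_hashes=3)
    for name in names:
        bf.add(name)
    print(len(bf.bits), "bytes summarize", len(names), "names")
    print(bf.might_contain("lfn-42"))
```

The filter stores only bits, never the names themselves, so the receiver can test an exact name but cannot enumerate names to match them against a pattern. That is why a wildcard query has nothing to search when updates arrive in this form.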
The server package includes a script, $GLOBUS_LOCATION/libexec/aggrexec/globus-rls-aggregatorsource.pl, that may be used as an Execution Aggregator Source by WS MDS. See GT 4.2.1 Index Services for more information on setting up and using the Execution Aggregator Source scripts in WS MDS. The script may be invoked as follows and generates output in the format shown.
% $GLOBUS_LOCATION/libexec/aggrexec/globus-rls-aggregatorsource.pl rls://mysite
<?xml version="1.0" encoding="UTF-8"?>
<rlsStats>
<site>rls://mysite</site>
<version>4.0</version>
<uptime>03:08:15</uptime>
<serviceList>
<service>lrc</service>
<service>rli</service>
</serviceList>
<lrc>
<updateMethodList>
<updateMethod>lfnlist</updateMethod>
<updateMethod>bloomfilter</updateMethod>
</updateMethodList>
<updatesList>
<updates>
<site>rls://myothersite:39281</site>
<method>bloomfilter</method>
<date>08/01/05</date>
<time>16:16:38</time>
</updates>
</updatesList>
<numlfn>283902</numlfn>
<numpfn>593022</numpfn>
<nummap>593022</nummap>
</lrc>
<rli>
<updatedViaList>
<updatedVia>bloomfilters</updatedVia>
</updatedViaList>
<updatedByList>
<updatedBy>
<site>rls://myothersite:39281</site>
<date>08/01/05</date>
<time>10:03:21</time>
</updatedBy>
</updatedByList>
</rli>
</rlsStats>
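Output in this format can be consumed by any XML parser. The following Python sketch is an illustration and not part of RLS: the function name `summarize_rls_stats` is hypothetical, and the element names are taken only from the sample output above, so a real deployment may emit fields this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict


def summarize_rls_stats(xml_text: str) -> Dict[str, Any]:
    """Extract a few fields from an <rlsStats> document like the sample above."""
    root = ET.fromstring(xml_text.strip())
    summary: Dict[str, Any] = {
        "site": root.findtext("site"),
        "version": root.findtext("version"),
        "services": [s.text for s in root.findall("serviceList/service")],
    }
    lrc = root.find("lrc")
    if lrc is not None:
        # Counts are reported as text; convert them to integers when present.
        for tag in ("numlfn", "numpfn", "nummap"):
            value = lrc.findtext(tag)
            summary[tag] = int(value) if value is not None else None
        summary["lrc_updates_to"] = [
            u.findtext("site") for u in lrc.findall("updatesList/updates")
        ]
    return summary


SAMPLE = """
<rlsStats>
  <site>rls://mysite</site>
  <version>4.0</version>
  <serviceList><service>lrc</service><service>rli</service></serviceList>
  <lrc>
    <updatesList>
      <updates><site>rls://myothersite:39281</site><method>bloomfilter</method></updates>
    </updatesList>
    <numlfn>283902</numlfn>
    <numpfn>593022</numpfn>
    <nummap>593022</nummap>
  </lrc>
</rlsStats>
"""

if __name__ == "__main__":
    print(summarize_rls_stats(SAMPLE))
```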
Important: Be sure to configure the security context of the container running the MDS, and be sure that the security configuration on the RLS host recognizes the MDS security context.
When following the instructions provided by the GT 4.2.1 Index Services, you will need to consider the security context used by the MDS to invoke the Execution Aggregator Source script provided by RLS. Most deployments of RLS run the service with security enabled. Therefore any client connections, including administrative status operations, require authentication and authorization. In order for MDS to use the provided script to check RLS status, it must invoke the script with a valid user proxy or user certificate and key. The RLS must recognize the DN from the user certificate (i.e., the DN should be in the gridmap file).
One way to configure the MDS security context for use with RLS monitoring is to set the environment variables X509_USER_CERT and X509_USER_KEY to point to the container certificate and key. Run the MDS with these environment settings. Also, add the DN from the container certificate to the gridmap file on the host running the RLS.
Alternatively, you could modify the provided script so that it sets the environment variables to another user certificate and key (or proxy) as desired before calling the RLS.
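Before wiring the script into MDS, it can help to run it by hand under the same credentials. The following Python sketch is one way to do that and is only an illustration: the function name `aggregator_invocation` is hypothetical, the container certificate paths are the ones used as examples earlier in this chapter, and the fallback value for GLOBUS_LOCATION is an assumption.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Tuple


def aggregator_invocation(
    globus_location: str,
    rls_url: str,
    cert_path: str,
    key_path: str,
    base_env: Mapping[str, str],
) -> Tuple[List[str], Dict[str, str]]:
    """Build the command and environment for running the aggregator source script."""
    script = globus_location.rstrip("/") + "/libexec/aggrexec/globus-rls-aggregatorsource.pl"
    env = dict(base_env)  # copy so the caller's environment is not modified
    env["X509_USER_CERT"] = cert_path
    env["X509_USER_KEY"] = key_path
    return [script, rls_url], env


if __name__ == "__main__":
    cmd, env = aggregator_invocation(
        os.environ.get("GLOBUS_LOCATION", "/usr/local"),  # assumed fallback
        "rls://mysite",
        "/etc/grid-security/containercert.pem",
        "/etc/grid-security/containerkey.pem",
        os.environ,
    )
    if os.path.exists(cmd[0]):
        result = subprocess.run(cmd, env=env, capture_output=True, text=True)
        print(result.stdout or result.stderr)
    else:
        print("script not found:", cmd[0])
```

If the script returns an authorization error under these settings, the DN of the certificate is probably missing from the gridmap file on the RLS host.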
The server package includes a program called globus-rls-reporter that will report information about an RLS server to the MDS2 GRIS. Use this procedure to enable this program:
- To enable Index Service reporting, add the contents of the file $GLOBUS_LOCATION/setup/globus/rls-ldif.conf to the MDS2 GRIS configuration file $GLOBUS_LOCATION/etc/grid-info-resource-ldif.conf.
- If necessary, set your virtual organization (VO) name in $GLOBUS_LOCATION/setup/globus/rls-ldif.conf. The default value is local. The VO name is referenced twice, on the lines beginning dn: and args:.
- You must restart your MDS (GRIS) server after modifying $GLOBUS_LOCATION/etc/grid-info-resource-ldif.conf. You can use the following commands to do so:
$GLOBUS_LOCATION/sbin/SXXgris stop
$GLOBUS_LOCATION/sbin/SXXgris start
This section describes the complete details of the RLS Server configuration settings.
Table 7.1. Complete RLS Server settings (globus-rls-server.conf)
| Setting | Description |
|---|---|
| acl user: permission [permission] | A gridmap file may also be used to map DNs to local usernames, which in turn are matched against the regular expressions in the There may be multiple |
| authentication true|false | Enable or disable GSI authentication. The default value is If authentication is enabled ( If authentication is not enabled ( |
| db_pwd password | Password to use to connect to the database server. The default value is |
| db_user databaseuser | Username to use to connect to database server. The default value is |
| idletimeout seconds | Seconds after which idle connections close. The default value is |
| loglevel N | Sets loglevel to N (default is 0). Higher levels mean more verbosity. |
| lrc_bloomfilter_numhash N | Number of hash functions to use in Bloom filters. The default value is Possible values are 1 through 8. This value, in conjunction with Note: The default values of |
| lrc_bloomfilter_ratio N | Sets ratio of Bloom filter size (in bits) to number of LFNs in the LRC catalog (in other words, size of the Bloom filter as a multiple of the number of LFNs in the LRC database). This is only meaningful if Bloom filters are used to update an RLI. Too small a value will generate too many false positives, while too large a value wastes memory and network bandwidth. The default value is Note: The default values of |
| lrc_buffer_time N | LRC to RLI updates are buffered until either the buffer is full or this much time in seconds has elapsed since the last update. The default value is |
| lrc_dbname | Name of LRC database. The default value is |
| lrc_server true|false | If LRC server, the value should be The default value is |
| lrc_update_bf seconds | Interval in seconds between LRC to RLI updates when the RLI is updated by Bloom filters. In other words, how often an LRC server does a Bloom filter soft state update. This can be much smaller than the interval between updates without using Bloom filters ( The default value is |
| lrc_update_factor N | If lrc_update_immediate mode is on, and the LRC server is in sync with an RLI server (an LRC and RLI are synced if there have been no failed updates since the last full soft state update), then the interval between RLI updates for this server (lrc_update_ll) is multiplied by the value of this option. |
| lrc_update_immediate true|false | Turns LRC to RLI immediate mode updates on ( The default value is |
| lrc_update_ll seconds | Number of seconds before an LRC server does an LFN list soft state update. The default value is |
| lrc_update_retry seconds | Seconds to wait before an LRC server will retry to connect to an RLI server that it needs to update. The default value is |
| maxbackoff seconds | Maximum seconds to wait before re-trying listen in the event of an I/O error. The default value is |
| maxfreethreads N | Maximum number of idle threads. Excess threads are killed. The default value is |
| maxconnections N | Maximum number of simultaneous connections. The default value is |
| maxthreads N | Maximum number of threads running at one time. The default value is |
| myurl URL | URL of server. The default value is |
| odbcini filename | Sets environment variable If not specified, and |
| pidfile filename | Filename where pid file should be written. The default value is |
| port N | Port the server listens on. The default value is |
| result_limit limit | Sets the maximum number of results returned by a query. The default value is If a query request includes a limit greater than this value, an error ( If the query request has no limit specified, then at most |
| rli_bloomfilter true|false | RLI servers must have this set to accept Bloom filter updates. If If Note: If Bloom filters are enabled, then the RLI does not support wildcarded queries. |
| rli_bloomfilter_dir none|default|pathname | If an RLI is configured to accept Bloom filters ( This directory is scanned when an RLI server starts up and is used to initialize Bloom filters for each LRC that updated the RLI. This option is useful when you want the RLI to recover its data immediately after a restart rather than wait for LRCs to send another update. If the LRCs are updating frequently, this option is unnecessary and may be wasteful in that each Bloom filter is written to disk after each update. |
| rli_dbname database | Name of the RLI database. The default value is |
| rli_expire_int seconds | Interval (in seconds) between RLI expirations of stale entries. In other words, how often an RLI server will check for stale entries in its database. The default value is |
| rli_expire_stale seconds | Interval (in seconds) after which entries in the RLI database are considered stale (presumably because they were deleted in the LRC). The default value is This value should be no smaller than Stale RLI entries are not returned in queries. Note: If the LRC server is responding, this value is not used. Instead the value of |
| rli_server true|false | If an RLI server, the value should be The default value is |
| rlscertfile filename | Name of the X.509 certificate file identifying the server. This value is set by setting environment variable |
| rlskeyfile filename | Name of the X.509 key file for the server. This value is set by setting environment variable |
| startthreads N | Number of threads to start initially. The default value is |
| timeout seconds | Timeout (in seconds) for calls to other RLS servers (e.g., for LRC calls to send an update to an RLI). |
To run the RLS server in debug mode, use the -d option
along with the -L num option (e.g.,
$GLOBUS_LOCATION/bin/globus-rls-server -d -L 3). The
-d option instructs the RLS server to direct log output to
stdout, while the -L num
option sets the log level where a higher num results in
more detailed output.
Information on troubleshooting can be found in the FAQ. For a list of common errors in GT, see Error Codes.
Table 9.1. Replica Locator Service (RLS) Errors
| Error Code | Definition | Possible Solutions |
|---|---|---|
| Error with credential: The proxy credential: <credential> with subject: <subject> expired <minutes> minutes ago | Expired proxy credential | Create a new proxy with grid-proxy-init. |
| Unable to connect to localhost:xxxx | Unable to connect to the local host. This can be due to a variety of reasons, including a wrong address or port number in the RLS connection URL or an issue with a firewall configuration. | |
| "connection timeout" | At times, a client may experience a connection timeout when interacting with the RLS server for a variety of reasons. | If timeouts are experienced with increasing frequency, increase the RLS server's timeout configuration parameter found in the $GLOBUS_LOCATION/etc/globus-rls-server.conf file. You may also use the -t timeout option of the globus-rls-cli tool. |
For additional details, see the RPC Protocol Description.
B
- Bloom filter
Compression scheme used by the Replica Location Service (RLS) that is intended to reduce the size of soft state updates between Local Replica Catalogs (LRCs) and Replica Location Index (RLI) servers. A Bloom filter is a bit map that summarizes the contents of a Local Replica Catalog (LRC). An LRC constructs the bit map by applying a series of hash functions to each logical name registered in the LRC and setting the corresponding bits.
L
- Local Replica Catalog (LRC)
Stores mappings between logical names for data items and the target names (often the physical locations) of replicas of those items. Clients query the LRC to discover replicas associated with a logical name. An LRC may also associate attributes with logical or target names. Each LRC periodically sends information about its logical name mappings to one or more RLIs.
See also RLI.
- logical file name
A unique identifier for the contents of a file.
R
- Replica Location Index (RLI)
Collects information about the logical name mappings stored in one or more Local Replica Catalogs (LRCs) and answers queries about those mappings. Each RLI periodically receives updates from one or more LRCs that summarize their contents.
- RLS attribute
Descriptive information that may be associated with a logical or target name mapping registered in a Local Replica Catalog (LRC). Clients can query the LRC to discover logical names or target names that have specified RLS attributes.
A
- architecture
- for admin, Architecture and design overview
D
- debugging, Debugging
E
- errors, Troubleshooting, Errors