Table of Contents
- 1. Semantics and syntax of APIs
- 2. Semantics and syntax of the WSDL
- 3. Semantics and syntax of non-WSDL protocols
- 4. Command-line tools
- 5. Overview of Graphical User Interface
- 6. Semantics and syntax of domain-specific interface
- 7. Configuration interface
- 7.1. GridFTP server configuration overview
- 7.2. GridFTP server configuration options
- 7.3. Configuring the GridFTP server to run under xinetd/inetd
- 7.4. Configuring GridFTP to run with the Community Authorization Service (CAS)
- 7.5. Configuring GridFTP to use UDT instead of TCP (for third party transfers)
- 7.6. Configuring GridFTP-over-SSH
- 8. Environment variable interface
The Globus FTP Client library provides a convenient way of accessing files on remote FTP servers. In addition to supporting the basic FTP protocol, the FTP Client library supports several security and performance extensions to make FTP more suitable for Grid applications. These extensions are described in the Grid FTP Protocol document.
In addition to protocol support for grid applications, the FTP Client library provides a plugin architecture for installing application or grid-specific fault recovery and performance tuning algorithms within the library. Application writers may then target their code toward the FTP Client library, and by simply enabling the appropriate plugins, easily tune their application to run it on a different grid.
All applications which use the Globus FTP Client API must include the header file "globus_ftp_client.h" and activate the GLOBUS_FTP_CLIENT_MODULE.
To use the Globus FTP Client API, one must create an FTP Client handle. This structure contains context information about FTP operations which are being executed, a cache of FTP control and data connections, and information about plugins which are being used. The specifics of the connection caching and plugins are found in the "Handle Attributes" section of the API documentation.
Once the handle is created, one may begin transferring files or doing other FTP operations by calling the functions in the "FTP Operations" section of the API documentation. In addition to whole-file transfers, the API supports partial file transfers, restarting transfers from a known point, and various FTP directory management commands. Every FTP operation may have a set of attributes associated with it to tune various FTP parameters; these are defined in the operationattr section. The data structures and functions needed to restart a file transfer are described in the "Restart Markers" section of the API documentation. For operations that send data to or receive data from an FTP server, the user must call the functions described in the "globus_ftp_client_data" section of the manual.
The globus_ftp_control library provides low-level services needed to implement FTP clients and servers. The API provided is protocol specific. The data transfer portion of this API provides support for the standard data methods described in the FTP Specification as well as extensions for parallel, striped, and partial data transfer.
For information on the internationalization API, see Section 1, “Semantics and syntax of APIs”.
Please see the GridFTP Command Reference.
Globus does not provide any interactive client for GridFTP, either GUI or text-based. However, NCSA, as part of their TeraGrid activity, produces a text-based interactive client called UberFTP, which you may want to check out. See the "Interactive Clients" section of the GridFTP Command Reference for more information.
The Globus implementation of the GridFTP server draws on:
- three IETF RFCs:
  - RFC 959
  - RFC 2228
  - RFC 2389
- an IETF Draft: MLST-16
- the GridFTP protocol specification, which is Global Grid Forum (GGF) Standard GFD.020.
The command line tools and the client library completely hide the details of the protocol from the user and the developer. Unless you choose to use the control library, it is not necessary to have a detailed knowledge of the protocol.
Note: Command line options and configuration file options may both be used, but the command line overrides the config file.
The configuration file for the GridFTP server is read from the following locations, in the given order. Only the first found will be loaded.
- Path specified with the -c <configfile> command line option.
- $GLOBUS_LOCATION/etc/gridftp.conf
- /etc/grid-security/gridftp.conf
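The lookup order above can be sketched in a few lines of Python. This is only an illustration of the "first found wins" rule, not the server's actual code: the function name find_config and its parameters are hypothetical, and it assumes that a -c path which does not exist simply falls through to the next candidate, which the text above does not confirm.

```python
import os
import posixpath


def find_config(cli_path=None, globus_location=None, exists=os.path.isfile):
    """Return the first existing configuration file path, or None.

    cli_path        -- value given with -c <configfile>, if any
    globus_location -- value of $GLOBUS_LOCATION, if set
    exists          -- predicate used to test a path (replaceable for testing)
    """
    candidates = []
    if cli_path:
        candidates.append(cli_path)
    if globus_location:
        candidates.append(posixpath.join(globus_location, "etc", "gridftp.conf"))
    candidates.append("/etc/grid-security/gridftp.conf")

    # Only the first file found is used; later candidates are never merged in.
    for path in candidates:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_config(globus_location=os.environ.get("GLOBUS_LOCATION")))
```

Because only one file is loaded, settings in /etc/grid-security/gridftp.conf are not consulted at all once a file is found under $GLOBUS_LOCATION.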
Options are one per line, with the format:
<option> <value>
If the value contains spaces, it should be enclosed in double-quotes ("). Flags or boolean options should only have a value of 0 or 1. Blank lines and lines beginning with # are ignored.
For example:
port 5000
allow_anonymous 1
anonymous_user bob
banner "Welcome!"
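For readers who generate or check these files with scripts, the format rules above can be expressed as a small Python parser. This is a minimal sketch based only on the rules stated here (one option per line, optional double quotes around the value, blank lines and # lines ignored). The function name parse_gridftp_conf is hypothetical, and the server's real parser may treat edge cases, such as embedded quotes, differently.

```python
def parse_gridftp_conf(text):
    """Parse '<option> <value>' lines into a dict of strings."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines beginning with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        # The option name ends at the first run of whitespace.
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # Values containing spaces are wrapped in double quotes; strip them.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        options[name] = value
    return options


sample = """\
# sample gridftp.conf
port 5000
allow_anonymous 1
anonymous_user bob

banner "Welcome!"
"""

if __name__ == "__main__":
    for name, value in parse_gridftp_conf(sample).items():
        print(name, "=", value)
```

Values are kept as strings on purpose: whether an option is a number, a 0/1 flag, or free text depends on the option tables below, not on the file syntax.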
The tables below list config file options, associated command line options (if available) and descriptions. Note that any boolean option can be negated on the command line by preceding the specified option with '-no-' or '-n'. For example: -no-cas or -nf.
Table 1. Informational Options
help <0|1> -h -help | Show usage information and exit. Default value: FALSE |
longhelp <0|1> -hh -longhelp | Show more usage information and exit. Default value: FALSE |
version <0|1> -v -version | Show version information for the server and exit. Default value: FALSE |
versions <0|1> -V -versions | Show version information for all loaded globus libraries and exit. Default value: FALSE |
Table 2. Modes of Operation
inetd <0|1> -i -inetd | Run under an inetd service. Default value: FALSE |
daemon <0|1> -s -daemon | Run as a daemon. All connections will fork off a new process and setuid if allowed. Default value: TRUE |
detach <0|1> -S -detach | Run as a background daemon detached from any controlling terminals. Default value: FALSE |
exec <string> -exec <string> | For statically compiled or non-GLOBUS_LOCATION standard binary locations, specify the full path of the server binary here. Only needed when run in daemon mode. Default value: not set |
chdir <0|1> -chdir | Change directory when the server starts. This will change directory to the dir specified by the chdir_to option. Default value: TRUE |
chdir_to <string> -chdir-to <string> | Directory to chdir to after starting. Will use / if not set. Default value: not set |
fork <0|1> -f -fork | Server will fork for each new connection. Disabling this option is only recommended when debugging. Note that non-forked servers running as 'root' will only accept a single connection and then exit. Default value: TRUE |
single <0|1> -1 -single | Exit after a single connection. Default value: FALSE |
Table 3. Authentication, Authorization, and Security Options
auth_level <number> -auth-level <number> | 0 = Disables all authorization checks. 1 = Authorize identity only. 2 = Authorize all file/resource accesses. If not set, it uses level 2 for front ends and level 1 for data nodes. Default value: not set |
allow_from <string> -allow-from <string> | Only allow connections from these source ip addresses. Specify a comma separated list of ip address fragments. A match is any ip address that starts with the specified fragment. Example: '192.168.1.' will match and allow a connection from 192.168.1.45. Note that if this option is used any address not specifically allowed will be denied. Default value: not set |
deny_from <string> -deny-from <string> | Deny connections from these source ip addresses. Specify a comma separated list of ip address fragments. A match is any ip address that starts with the specified fragment. Example: '192.168.2.' will match and deny a connection from 192.168.2.45. Default value: not set |
cas <0|1> -cas | Enable CAS authorization. Default value: TRUE |
secure_ipc <0|1> -si -secure-ipc | Use GSI security on the ipc channel. Default value: TRUE |
ipc_auth_mode <string> -ia <string> -ipc-auth-mode <string> | Set GSI authorization mode for the ipc connection. Options are: none, host, self or subject:[subject]. Default value: host |
allow_anonymous <0|1> -aa -allow-anonymous | Allow cleartext anonymous access. If server is running as root, anonymous_user must also be set. Disables ipc security. Default value: FALSE |
anonymous_names_allowed <string> -anonymous-names-allowed <string> | Comma separated list of names to treat as anonymous users when allowing anonymous access. If not set the default names of 'anonymous' and 'ftp' will be allowed. Use '*' to allow any username. Default value: not set |
anonymous_user <string> -anonymous-user <string> | User to setuid to for an anonymous connection. Only applies when running as root. Default value: not set |
anonymous_group <string> -anonymous-group <string> | Group to setgid to for an anonymous connection. If not set the default group of anonymous_user will be used. Default value: not set |
pw_file <string> -password-file <string> | Enable cleartext access and authenticate users against this /etc/passwd formatted file. Default value: not set |
connections_max <number> -connections-max <number> | Maximum concurrent connections allowed. Only applies when running in daemon mode. Unlimited if not set. Default value: not set |
connections_disabled <0|1> -connections-disabled | Disable all new connections. Does not affect ongoing connections. To take effect on a running server, this must be set in the configuration file and the server must then be sent a SIGHUP so that it reloads the config. Default value: FALSE |
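The fragment matching used by allow_from and deny_from is plain prefix matching on the textual IP address, which the following Python sketch illustrates. The function names are hypothetical. The sketch also assumes that deny_from is checked before allow_from; the descriptions above do not say which takes precedence when both are set, so treat that ordering as an assumption.

```python
def matches_any(ip, fragments):
    """True if ip starts with any fragment in a comma separated list."""
    return any(
        ip.startswith(fragment.strip())
        for fragment in fragments.split(",")
        if fragment.strip()
    )


def connection_allowed(ip, allow_from=None, deny_from=None):
    """Decide whether a source address may connect."""
    # Assumption: a deny_from match always rejects (precedence is not documented).
    if deny_from and matches_any(ip, deny_from):
        return False
    # If allow_from is used, any address not specifically allowed is denied.
    if allow_from:
        return matches_any(ip, allow_from)
    return True


if __name__ == "__main__":
    print(connection_allowed("192.168.1.45", allow_from="192.168.1."))
    print(connection_allowed("192.168.2.45", deny_from="192.168.2."))
```

Since the match is a string prefix, the trailing dot in a fragment such as '192.168.1.' matters: without it, '192.168.1' would also match 192.168.10.7 and 192.168.100.3.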
Table 4. Logging Options
log_level <string> -d <string> -log-level <string> | Log level. A comma separated list of levels from: 'ERROR, WARN, INFO, DUMP, ALL'. Example: error,warn,info. You may also specify a numeric level of 1-255. Default value: ERROR |
log_module <string> -log-module <string> | globus_logging module that will be loaded. If not set the default 'stdio' module will be used, and the logfile options apply. Built-in modules are 'stdio' and 'syslog'. Log module options may be set by specifying module:opt1=val1:opt2=val2. Available options for the built-in modules are 'interval' and 'buffer', for buffer flush interval and buffer size, respectively. The default options are a 64k buffer size and a 5 second flush interval. A 0 second flush interval will disable periodic flushing, and the buffer will only flush when it is full. A value of 0 for buffer will disable buffering and all messages will be written immediately. Example: -log-module stdio:buffer=4096:interval=10 Default value: not set |
log_single <string> -l <string> -logfile <string> | Path of a single file to log all activity to. If neither this option nor log_unique is set, logs will be written to stderr unless the execution mode is detached or inetd, in which case logging will be disabled. Default value: not set |
log_unique <string> -L <string> -logdir <string> | Partial path to which 'gridftp.(pid).log' will be appended to construct the log filename. Example: -L /var/log/gridftp/ will create a separate log (/var/log/gridftp/gridftp.xxxx.log) for each process (which is normally each new client session). If neither this option nor log_single is set, logs will be written to stderr unless the execution mode is detached or inetd, in which case logging will be disabled. Default value: not set |
log_transfer <string> -Z <string> -log-transfer <string> | Log netlogger style info for each transfer into this file. Default value: not set. Example: DATE=20050520163008.306532 HOST=localhost PROG=globus-gridftp-server NL.EVNT=FTP_INFO START=20050520163008.305913 USER=ftp FILE=/etc/group BUFFER=0 BLOCK=262144 NBYTES=542 VOLUME=/ STREAMS=1 STRIPES=1 DEST=[127.0.0.1] TYPE=RETR CODE=226. Time format is YYYYMMDDHHMMSS.UUUUUU (microseconds). DATE: time the transfer completed. START: time the transfer started. HOST: hostname of the server. USER: username on the host that transferred the file. BUFFER: TCP buffer size (if 0, system defaults were used). BLOCK: the size of the data block read from the disk and posted to the network. NBYTES: the total number of bytes transferred. VOLUME: the disk partition where the transfer file is stored. STREAMS: the number of parallel TCP streams used in the transfer. STRIPES: the number of stripes used on this end of the transfer. DEST: the destination host. TYPE: the transfer type; RETR is a send and STOR is a receive (RFC 959 FTP commands). CODE: the RFC 959 FTP completion code of the transfer. 226 indicates success; 4xx and 5xx are failure codes. |
log_filemode <string> -log-filemode <string> | File access permissions of log files. Should be an octal number such as 0644 (the leading 0 is required). Default value: not set |
disable_usage_stats <0|1> -disable-usage-stats | Disable transmission of per-transfer usage statistics. See the Usage Statistics section in the online documentation for more information. Default value: FALSE |
usage_stats_target <string> -usage-stats-target <string> | Comma separated list of contact strings for usage statistics listeners. Default value: not set |
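The log_transfer records described in the table above are simple KEY=VALUE pairs, so they are easy to post-process. The Python sketch below parses the sample record and derives the transfer duration from START and DATE. The function names are hypothetical, and the sketch assumes that no value contains whitespace, which may not hold for file names with spaces.

```python
from datetime import datetime

TIME_FORMAT = "%Y%m%d%H%M%S.%f"  # YYYYMMDDHHMMSS.UUUUUU


def parse_transfer_record(line):
    """Split one log_transfer line into a dict of KEY -> value strings."""
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip anything that is not KEY=VALUE
            record[key] = value
    return record


def summarize(record):
    """Return (duration in seconds, bytes transferred, success flag)."""
    started = datetime.strptime(record["START"], TIME_FORMAT)
    finished = datetime.strptime(record["DATE"], TIME_FORMAT)
    duration = (finished - started).total_seconds()
    # 226 indicates success; 4xx and 5xx codes indicate failure.
    return duration, int(record["NBYTES"]), record["CODE"] == "226"


sample_line = (
    "DATE=20050520163008.306532 HOST=localhost PROG=globus-gridftp-server "
    "NL.EVNT=FTP_INFO START=20050520163008.305913 USER=ftp FILE=/etc/group "
    "BUFFER=0 BLOCK=262144 NBYTES=542 VOLUME=/ STREAMS=1 STRIPES=1 "
    "DEST=[127.0.0.1] TYPE=RETR CODE=226"
)

if __name__ == "__main__":
    duration, nbytes, ok = summarize(parse_transfer_record(sample_line))
    print(f"{nbytes} bytes in {duration:.6f} s, success={ok}")
```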
Table 5. Single and Striped Remote Data Node Options
remote_nodes <string> -r <string> -remote-nodes <string> | Comma separated list of remote node contact strings. Default value: not set |
data_node <0|1> -dn -data-node | This server is a back end data node. Default value: FALSE |
stripe_blocksize <number> -sbs <number> -stripe-blocksize <number> | Size in bytes of sequential data that each stripe will transfer. Default value: 1048576 |
stripe_layout <number> -sl <number> -stripe-layout <number> | Stripe layout. 1 = Partitioned, 2 = Blocked. Default value: 2 |
stripe_blocksize_locked <0|1> -stripe-blocksize-locked | Do not allow client to override stripe blocksize with the OPTS RETR command. Default value: FALSE |
stripe_layout_locked <0|1> -stripe-layout-locked | Do not allow client to override stripe layout with the OPTS RETR command. Default value: FALSE |
Table 6. Disk Options
blocksize <number> -bs <number> -blocksize <number> | Size in bytes of data blocks to read from disk before posting to the network. Default value: 262144 |
sync_writes <0|1> -sync-writes | Flush disk writes before sending a restart marker. This attempts to ensure that the range specified in the restart marker has actually been committed to disk. This option will probably impact performance and may result in different behavior on different storage systems. See the man page for sync() for more information. Default value: FALSE |
Table 7. Network Options
port <number> -p <number> -port <number> | Port on which a front end will listen for client control channel connections or on which a data node will listen for connections from a front end. If not set a random port will be chosen and printed via the logging mechanism. Default value: not set |
control_interface <string> -control-interface <string> | Hostname or IP address of the interface to listen for control connections on. If not set will listen on all interfaces. Default value: not set |
data_interface <string> -data-interface <string> | Hostname or IP address of the interface to use for data connections. If not set will use the current control interface. Default value: not set |
ipc_interface <string> -ipc-interface <string> | Hostname or IP address of the interface to use for ipc connections. If not set will listen on all interfaces. Default value: not set |
hostname <string> -hostname <string> | Effectively sets the above control_interface, data_interface and ipc_interface options. Default value: not set |
ipc_port <number> -ipc-port <number> | Port on which the front end will listen for data node connections. Default value: not set |
Table 8. Timeouts
control_preauth_timeout <number> -control-preauth-timeout <number> | Time in seconds to allow a client to remain connected to the control channel without activity before authenticating. Default value: 30 |
control_idle_timeout <number> -control-idle-timeout <number> | Time in seconds to allow a client to remain connected to the control channel without activity. Default value: 600 |
ipc_idle_timeout <number> -ipc-idle-timeout <number> | Idle time in seconds before an unused ipc connection will close. Default value: 600 |
ipc_connect_timeout <number> -ipc-connect-timeout <number> | Time in seconds before cancelling an attempted ipc connection. Default value: 60 |
Table 9. User Messages
banner <string> -banner <string> | Message to display to the client before authentication. Default value: not set |
banner_file <string> -banner-file <string> | File to read banner message from. Default value: not set |
banner_terse <0|1> -banner-terse | When this is set, the minimum allowed banner message will be displayed to unauthenticated clients. Default value: FALSE |
login_msg <string> -login-msg <string> | Message to display to the client after authentication. Default value: not set |
login_msg_file <string> -login-msg-file <string> | File to read login message from. Default value: not set |
Table 10. Module Options
load_dsi_module <string> -dsi <string> | Data Storage Interface module to load. File and remote modules are defined by the server. If not set the file module is loaded, unless the 'remote' option is specified, in which case the remote module is loaded. An additional configuration string can be passed to the DSI using the format [module name]:[configuration string]. The format of the configuration string is defined by the DSI being loaded. Default value: not set |
allowed_modules <string> -allowed-modules <string> | Comma separated list of ERET/ESTO modules to allow and, optionally, specify an alias for. Example: module1,alias2:module2,module3 (module2 will be loaded when a client asks for alias2). Default value: not set |
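The alias syntax accepted by allowed_modules can be illustrated with a short Python sketch that turns the option value into a mapping from the name a client asks for to the module that would be loaded. The function name is hypothetical, and this only mirrors the documented syntax rather than the server's own parsing.

```python
def parse_allowed_modules(value):
    """Map each client-visible name to the ERET/ESTO module it refers to."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # 'alias:module' defines an alias; a bare 'module' maps to itself.
        alias, sep, module = entry.partition(":")
        mapping[alias] = module if sep else alias
    return mapping


if __name__ == "__main__":
    for name, module in parse_allowed_modules("module1,alias2:module2,module3").items():
        print(f"client asks for {name!r} -> loads {module!r}")
```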
Table 11. Other
configfile <string> -c <string> | Path to configuration file that should be loaded. Otherwise will attempt to load $GLOBUS_LOCATION/etc/gridftp.conf and /etc/grid-security/gridftp.conf. Default value: not set |
use_home_dirs <0|1> -use-home-dirs | Set the startup directory to the authenticated user's home dir. Default value: TRUE |
debug <0|1> -debug | Set options that make the server easier to debug. Forces no-fork, no-chdir, and allows core dumps on bad signals instead of exiting cleanly. Not recommended for production servers. Note that non-forked servers running as root will only accept a single connection and then exit. Default value: FALSE |
Note: The service name used (gsiftp in this case) should be defined in /etc/services with the desired port.
Here is a sample gridftp server xinetd config entry:
service gsiftp
{
instances = 100
socket_type = stream
wait = no
user = root
env += GLOBUS_LOCATION=(globus_location)
env += LD_LIBRARY_PATH=(globus_location)/lib
server = (globus_location)/sbin/globus-gridftp-server
server_args = -i
log_on_success += DURATION
nice = 10
disable = no
}
Here is a sample gridftp server inetd config entry (read as a single line):
gsiftp stream tcp nowait root /usr/bin/env env \
GLOBUS_LOCATION=(globus_location) \
LD_LIBRARY_PATH=(globus_location)/lib \
(globus_location)/sbin/globus-gridftp-server -i
Note: On Mac OS X, you must set DYLD_LIBRARY_PATH instead of LD_LIBRARY_PATH in the above examples. On IRIX, you may need to set either LD_LIBRARYN32_PATH or LD_LIBRARY64_PATH.
The Community Authorization Service (CAS) is used to administer access rights to files and directories and the GridFTP server can be configured to enforce those rights.
For more information, see How to Set Up CAS to Use with GridFTP.
GridFTP can use the alternative UDT protocol between servers, but not yet between client and server.
Prerequisites:
Begin with a threaded build of the Globus GridFTP server. Refer to section 2.7 for information on how to build a threaded flavor of the GridFTP server.
Note: This requires that GridFTP be built from the source installer.
Currently GridFTP requires the reference implementation of UDT. This is available at http://sourceforge.net/projects/udt, and must be downloaded and installed prior to continuing.
Steps:
- Starting with the un-tar'd source bundle, go to gt4.1.1-all-source-installer/source-trees/xio/drivers/udt-ref/ and execute the following two commands (substitute your own threaded flavor for gcc32dbgpthr):
globus$ ./configure --with-flavor=gcc32dbgpthr --with-udt-path=<path to your installed udt location>
globus$ make install
- Start the globus-gridftp-server with the --protocol_stack udt_ref,gsi option. For separate control and backend data nodes, this option is required on the backend data nodes only.
Note: Currently UDT can only be used for third-party transfers.
Make sure both GridFTP (source & destination) servers have been configured for using the UDT protocol by following the instructions above.
GridFTP traditionally uses GSI to establish secure connections, as it is a very strong, robust, and flexible means of securing messages. In some situations, however, it is preferable to use the existing SSH security mechanism.
GridFTP-over-SSH leverages the fact that an SSH client can remotely execute programs by forming a secure connection with SSHD. In this case:
- All of the standard IO from the remote program is routed back to the client.
- The client (globus-url-copy) acts as an SSH client and remotely executes a Globus GridFTP server.
- All IO is sent to and from the client by way of UNIX pipes.
Client support scripts for GridFTP-over-SSH are automatically created when you run make install as part of step 2 above.
On the server side (where you intend the client to remotely execute a server), the following command must be run as root to enable this machine to accept SSHFTP connections. It creates the file /etc/grid-security/sshftp:
$GLOBUS_LOCATION/setup/globus/setup-globus-gridftp-sshftp -server
If root access is not available, the option -nonroot may be added to enable connections as the current user only. This alternative command creates the file $HOME/.globus/sshftp:
$GLOBUS_LOCATION/setup/globus/setup-globus-gridftp-sshftp -server -nonroot
In order to use GridFTP-over-SSH, the user must provide urls that begin with sshftp:// as arguments. For example:
globus-url-copy sshftp://<host>:<port>/<path_to_file> file:/<path_to_file>
where <port> is the port on which sshd listens on the host referred to by <host> (the default value is 22).
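Scripts that build or validate sshftp:// URLs before handing them to globus-url-copy can apply the same default port rule. The Python sketch below uses the standard library URL parser; the function name parse_sshftp_url and the example host name are hypothetical, and the sketch does not reflect how globus-url-copy itself parses URLs.

```python
from urllib.parse import urlsplit

DEFAULT_SSH_PORT = 22  # used when the URL does not name a port


def parse_sshftp_url(url):
    """Return (host, port, path) for an sshftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "sshftp":
        raise ValueError(f"not an sshftp URL: {url!r}")
    if not parts.hostname:
        raise ValueError(f"missing host in URL: {url!r}")
    # parts.port is None when no port is given, so fall back to 22.
    return parts.hostname, parts.port or DEFAULT_SSH_PORT, parts.path


if __name__ == "__main__":
    print(parse_sshftp_url("sshftp://gridftp.example.org/tmp/data.bin"))
    print(parse_sshftp_url("sshftp://gridftp.example.org:2222/tmp/data.bin"))
```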
Note: GridFTP-over-SSH does not support either third party transfers or data channel security.