Some important considerations before transferring data between two systems

  • Time

Some data transfers might take days to complete, depending on the amount of data, the number of files, the load on the transfer node, and the destination storage service. Please make sure that you have a Kerberos ticket with an acceptable lifetime, as well as AFS tokens if you are transferring data to or from AFS.

Please use the iput, iget and irsync commands with the -K option (-K : verify checksum). Checksums provide a simple way to verify the integrity of data files before and after a file transfer, and to detect possible data corruption or network problems.
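The principle behind checksum verification can be illustrated with standard tools. The sketch below is not how iRODS implements -K. It copies a small temporary file and compares a SHA-256 hash computed on each side. The file names are hypothetical, sha256sum is assumed to be available, and the hash algorithm used by an iRODS server depends on its configuration.

```shell
#!/bin/sh
# Illustration only: iput/iget/irsync -K perform this comparison for you.
# The file names below are hypothetical temporary files.
workdir=$(mktemp -d)
src="$workdir/source.dat"
dst="$workdir/copy.dat"

# Create a small sample file and "transfer" it with a plain copy.
printf 'sample payload\n' > "$src"
cp "$src" "$dst"

# Compute a SHA-256 checksum on each side; the first field is the hash.
sum_src=$(sha256sum "$src" | cut -d ' ' -f 1)
sum_dst=$(sha256sum "$dst" | cut -d ' ' -f 1)

# Identical hashes mean the copy has the same content as the source.
if [ "$sum_src" = "$sum_dst" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "checksums $result"

# Remove the temporary files again.
rm -rf "$workdir"
```

If a single byte changed during the copy, the two hashes would differ, which is the kind of corruption the -K option is meant to catch.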

  • Symlinks

Symlinks are filesystem dependent, so please remember that they will be affected when you move files between systems.
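Before a transfer it can help to know which symlinks a directory tree contains. The following is a minimal sketch using the standard find utility. The helper name list_symlinks and the temporary demo directory are assumptions made for illustration and are not part of iRODS.

```shell
#!/bin/sh
# List symbolic links under a directory so they can be reviewed before a transfer.
# "list_symlinks" is a hypothetical helper name, not an iRODS command.
list_symlinks() {
    # -type l matches the links themselves; find does not follow them by default.
    find "$1" -type l
}

# Small self-contained demonstration in a temporary directory.
demo=$(mktemp -d)
printf 'data\n' > "$demo/real_file.txt"
ln -s real_file.txt "$demo/link_to_file"

link_count=$(list_symlinks "$demo" | wc -l | tr -d ' ')
echo "found $link_count symlink(s) under $demo"

# Remove the demonstration directory again.
rm -rf "$demo"
```

In practice you would call list_symlinks on the directory you intend to transfer and decide how each link should be handled.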

Swestore (iRODS) client on PDC transfer node

Please use one of the transfer nodes for file operations. These nodes are dedicated to large file transfers and to extensive file operations involving large amounts of data or many files. It is important that you use these nodes for extensive file operations so as not to overload the login node.

1. Login to one of the transfer nodes

# get a 7-day forwardable ticket
$ kinit --forwardable -l 7d <user>@NADA.KTH.SE

Note: Please consider the time required for the files to transfer and the default Kerberos ticket lifetime. You can request a forwardable Kerberos ticket valid for up to 30 days. Since file transfers might take a long time, we suggest running them from screen or tmux. To keep your access to the AFS filesystem in screen after logging out from your original ssh session, you need to start the original screen session as follows:

# $ pagsh bash -c "export KRB5CCNAME=$KRB5CCNAME; afslog; screen -S transfer"
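To judge whether a ticket lifetime is long enough, a rough estimate of the transfer duration can help. The sketch below uses plain shell integer arithmetic. The function name estimate_days, the data volume, and the throughput are all hypothetical. Measure the throughput you actually get on a small sample before relying on such an estimate.

```shell
#!/bin/sh
# Rough estimate of transfer duration, to compare with the ticket lifetime.
# All numbers are hypothetical placeholders.
estimate_days() {
    size_gb=$1        # total data volume in GiB
    rate_mb_s=$2      # sustained throughput in MiB/s
    seconds=$(( size_gb * 1024 / rate_mb_s ))
    # Round up to whole days (86400 seconds per day).
    echo $(( (seconds + 86399) / 86400 ))
}

ticket_days=7                             # lifetime requested with kinit -l 7d
needed_days=$(estimate_days 50000 50)     # 50000 GiB at 50 MiB/s

if [ "$needed_days" -gt "$ticket_days" ]; then
    echo "estimated ${needed_days} days: request a longer ticket (up to 30 days)"
else
    echo "estimated ${needed_days} days: a ${ticket_days}-day ticket should suffice"
fi
```

The estimate ignores per-file overhead, so transfers with many small files will usually take longer than the figure suggests.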

2. Load the irods module

$ module load irods

3. Set Swestore (iRODS) environment

$ irods-setenv SNIC

4. Now you can use iCommands to transfer and manage files on the system. You can test this with the ‘ils’ command, which lists the files in your current virtual directory:

$ ils

After these steps, the iRODS environment is configured with the settings for the current iRODS version. The iRODS command-line utilities, referred to as “iCommands”, store their initialization information in the JSON configuration file “~/.irods/irods_environment.json” under the user’s home directory.

(NOTE: If you will be accessing iRODS only through WebDAV or other web-based mechanisms, you do not need to create this configuration file.)

Basic iCommands

Now you can use iCommands to access and manipulate data in the system. The iCommands nomenclature mimics that of UNIX, but with an “i” prepended to the command name, e.g. ils, imkdir, icd. Generally, iCommands are functionally equivalent to their UNIX counterparts. Complete iCommands documentation can be found on the iRODS documentation site.

On the transfer nodes, you can get a list of iCommands and a brief description of each by typing:

$ ihelp

or for more information on a particular iCommand:

$ ihelp <iCommand>

The following table summarizes some of the most common iCommands:

iCommand | Syntax                    | Description
ils      | ils file or directory     | Like UNIX ls, lists files or directories
icd      | icd directory             | Like UNIX cd, changes iRODS directory
iput     | iput -vK file destination | Stores file(s) into the system. (-K : verify checksum - calculate and verify the checksum on the data, both client-side and server-side, and store it in the catalog; -v : verbose)
iget     | iget -vK file destination | Retrieves file(s) from the system. (-K : verify the checksum; -v : verbose)
imkdir   | imkdir new_directory      | Like UNIX mkdir, creates a new directory
icp      | icp source destination    | Copies a file or directory
irsync   | irsync -vK source target  | Synchronizes data from the client’s local file system to iRODS, from iRODS to the local file system, or from one iRODS path to another iRODS path. (-K : verify the checksum; -v : verbose)
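The commands in the table can be combined into a simple upload and download sequence. The following sketch only prints the commands by default, so the sequence can be reviewed first. The DRY_RUN variable, the run and upload_and_fetch helpers, and the names my_collection and results.tar are hypothetical placeholders. Only the iCommands and options themselves come from the table above.

```shell
#!/bin/sh
# Preview a typical upload/download sequence before running it for real.
# DRY_RUN, run() and upload_and_fetch() are hypothetical helpers;
# the collection and file names are placeholders.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # only print the command
    else
        "$@"                 # execute the command as given
    fi
}

upload_and_fetch() {
    run imkdir my_collection                                      # create a collection
    run iput -vK results.tar my_collection                        # upload with checksum
    run ils my_collection                                         # confirm it arrived
    run iget -vK my_collection/results.tar restored_results.tar   # download again
}

upload_and_fetch
```

Setting DRY_RUN=0 in the environment before starting the script makes it execute the commands instead of printing them, which requires a configured iRODS environment.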

Synchronize from/to iRODS

Use the irsync command to synchronize a local directory with iRODS, similar to the Unix rsync command. It can be used to make an exact copy of a directory hierarchy on a local disk within iRODS, or retrieve an exact copy of a directory hierarchy already stored in iRODS. It may also be used to create an exact copy of a file or directory within iRODS.

NOTE: iRODS paths are identified with an i: prefix in the irsync command.

For example, if you have created a directory within iRODS called “/snic.se/projects/myproject/data”, and you wish to retrieve an exact copy of that “data” directory on Klemming, run the command:

$ irsync -vrK i:/snic.se/projects/myproject/data /cfs/klemming/projects/snic/workdir/data
Note: We encourage you to use the -K option (-K : verify checksum - calculate and verify the checksum on the data).

Once finished, you can then synchronize the data back into iRODS using the command:

$ irsync -vrK /cfs/klemming/projects/snic/workdir/data i:/snic.se/projects/myproject/new_data

Note 2: We also encourage you to re-run the irsync command one additional time after the data transfer. Files will not be transferred again, but checksums will be recomputed and compared. Be aware that this operation is CPU intensive and will take time.
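The transfer followed by a second verification pass can be wrapped in a small helper. This is a minimal sketch. The function name sync_twice and the IRSYNC variable are assumptions introduced here, with IRSYNC allowing the command to be replaced (for example by echo) when testing the script without iRODS.

```shell
#!/bin/sh
# Run irsync, then run the same command again as a verification pass.
# sync_twice and IRSYNC are hypothetical names; IRSYNC can be overridden for a dry run.
IRSYNC=${IRSYNC:-irsync}

sync_twice() {
    src=$1
    dst=$2
    # First pass: transfer the data with checksum verification.
    $IRSYNC -vrK "$src" "$dst" || return 1
    # Second pass: nothing should be transferred; checksums are compared again.
    $IRSYNC -vrK "$src" "$dst" || return 2
}

# Example call, using the paths from the text above:
# sync_twice /cfs/klemming/projects/snic/workdir/data i:/snic.se/projects/myproject/new_data
```

A return value of 1 indicates that the transfer itself failed, and 2 that the verification pass failed, so the two cases can be told apart in a calling script.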