Transfer Data Inside ZIH Systems with Datamover

The Datamover provides dedicated data transfer machines for moving data between the ZIH filesystems at the best available transfer speed. These machines are reserved for data transfers and cannot be reached via SSH. To move or copy files from one filesystem to another, log in to any of the ZIH HPC systems and use the following commands:

  • dtcp, dtls, dtmv, dtrm, dtrsync, dttar, and dtwget

These commands submit a batch job to the data transfer machines, which then run the selected command. Their syntax and behavior are the same as those of the well-known shell commands without the dt prefix (cp, ls, mv, rm, rsync, tar, and wget), except for the following additional options.

| Additional Option | Description |
|:------------------|:------------|
| --account=ACCOUNT | Assign data transfer job to specified account. |
| --blocking | Do not return until the data transfer job is complete. (default for dtls) |
| --time=TIME | Job time limit (default: 18 h). |
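
As an illustration of how the additional options combine with the usual cp flags, the following sketch assembles a dtcp call and prints it without submitting anything. The helper name dt_copy_cmd is hypothetical, the project name and paths are borrowed from the examples on this page, and both the HH:MM:SS time format (a Slurm convention) and the placement of the options directly after the command name are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a dtcp command that uses the Datamover-specific options.
# Nothing is submitted; inspect the printed line and run it yourself if it looks right.
dt_copy_cmd() {
    local account="$1" time_limit="$2" src="$3" dst="$4"
    # --account, --time, and --blocking are the additional options listed above;
    # -r is the ordinary recursive flag of cp.
    printf 'dtcp --account=%s --time=%s --blocking -r %s %s\n' \
        "$account" "$time_limit" "$src" "$dst"
}

# Assumed time format: HH:MM:SS, i.e. a two-hour limit instead of the 18 h default.
dt_copy_cmd p_number_crunch 02:00:00 \
    /data/horse/ws/marie-workdata/results /projects/p_number_crunch/.
```

With --blocking, the command would return only after the transfer job has finished, which is convenient when a later step in a script depends on the copied data.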

Managing Transfer Jobs

Use the commands dtinfo, dtqueue, dtq, and dtcancel to manage your transfer jobs.

  • dtinfo shows information about the nodes of the data transfer machine (like sinfo).
  • dtqueue and dtq show all your data transfer jobs (like squeue --me).
  • dtcancel signals data transfer jobs (like scancel).
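
A typical inspection cycle might look like the sketch below. The wrapper function dt_overview and the job ID in the comment are hypothetical; the function only calls dtinfo and dtq, and it returns early on hosts where the dt* commands are not installed.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: show the state of the transfer nodes and your own transfer jobs.
dt_overview() {
    if ! command -v dtq >/dev/null 2>&1; then
        echo "dt* commands not available on this host" >&2
        return 1
    fi
    dtinfo   # node information of the data transfer machine (like sinfo)
    dtq      # your data transfer jobs (like squeue --me)
}

dt_overview || true

# To stop a job listed by dtq, pass its job ID to dtcancel, as you would with scancel:
# dtcancel 1234567   # hypothetical job ID
```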

To identify the mount points of the different filesystems on the data transfer machine, use dtinfo. It shows an output like this:

| ZIH system | Local directory | Directory on data transfer machine |
|:-----------|:----------------|:-----------------------------------|
| Barnard | /data/horse | /data/horse |
| | /data/walrus | /data/walrus |
| outdated: Taurus | /home | /data/old/home |
| | /scratch/ws | /data/old/lustre/scratch2/ws |
| | /ssd/ws | /data/old/lustre/ssd/ws |
| | /beegfs/global0/ws | /data/old/beegfs/global0/ws |
| | /warm_archive/ws | /data/old/warm_archive/ws |
| | /projects | /projects |
| Archive | | /data/archiv |
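
Because several of the outdated filesystems are mounted under a different path on the data transfer machine, it can help to translate paths mechanically before building a dt* command. The function below is a minimal sketch that encodes the table literally; its name to_datamover_path is hypothetical, and the /home rule refers to the outdated Taurus home directory only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate a local path into the path seen on the data
# transfer machine, using only the mappings listed in the table above.
to_datamover_path() {
    local p="$1"
    case "$p" in
        /data/horse|/data/horse/*|/data/walrus|/data/walrus/*|/projects|/projects/*)
            printf '%s\n' "$p" ;;                       # same path on both sides
        /home|/home/*)
            printf '/data/old%s\n' "$p" ;;              # outdated Taurus home
        /scratch/ws|/scratch/ws/*)
            printf '/data/old/lustre/scratch2/ws%s\n' "${p#/scratch/ws}" ;;
        /ssd/ws|/ssd/ws/*)
            printf '/data/old/lustre%s\n' "$p" ;;
        /beegfs/global0/ws|/beegfs/global0/ws/*|/warm_archive/ws|/warm_archive/ws/*)
            printf '/data/old%s\n' "$p" ;;
        *)
            echo "no known mapping for $p" >&2
            return 1 ;;
    esac
}

to_datamover_path /scratch/ws/marie-workdata/results
# prints /data/old/lustre/scratch2/ws/marie-workdata/results
```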

Usage of Datamover

Data on outdated filesystems

Copy data from /beegfs/global0 to the /projects filesystem.

marie@login$ dtcp -r /data/old/beegfs/global0/ws/marie-workdata/results /projects/p_number_crunch/.

Archive data from /beegfs/global0 to the /archiv filesystem.

marie@login$ dttar -czf /data/archiv/p_number_crunch/results.tgz /data/old/beegfs/global0/ws/marie-workdata/results

Copy data from /data/horse to the /projects filesystem.

``` console
marie@login$ dtcp -r /data/horse/ws/marie-workdata/results /projects/p_number_crunch/.
```

Move data from /data/horse to the /data/walrus filesystem.

marie@login$ dtmv /data/horse/ws/marie-workdata/results /data/walrus/ws/marie-archive/.

Archive data from /data/walrus to the /archiv filesystem.

marie@login$ dttar -czf /data/archiv/p_number_crunch/results.tgz /data/walrus/ws/marie-workdata/results

Warning

Do not generate files in the /archiv filesystem that are much larger than 500 GB!
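
One way to respect this limit is to check the size of the source directory before archiving it. The sketch below is an assumption-laden example, not an official tool: the function name fits_archive_limit is hypothetical, 500 GB is interpreted as 500 * 10^9 bytes (488281250 KiB), and the check uses the uncompressed size reported by du, so it is conservative for compressed archives.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: succeed only if a directory's disk usage is at most
# a limit given in KiB (default: 500 * 10^9 bytes = 488281250 KiB).
fits_archive_limit() {
    local dir="$1" limit_kib="${2:-488281250}"
    local size_kib
    # du -sk prints "<size in KiB><TAB><path>"; keep only the number.
    size_kib="$(du -sk -- "$dir")" || return 2
    size_kib="${size_kib%%[[:space:]]*}"
    [ "$size_kib" -le "$limit_kib" ]
}

src=/data/walrus/ws/marie-workdata/results
if [ -d "$src" ] && fits_archive_limit "$src"; then
    # Print the archive command instead of running it, so it can be reviewed first.
    echo "dttar -czf /data/archiv/p_number_crunch/results.tgz $src"
else
    echo "not archiving: $src is missing or larger than the limit"
fi
```

If the directory is too large, splitting it into several smaller archives keeps each file below the recommended size.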

Note

The warm archive and the projects filesystem are not writable from within batch jobs. However, you can store the data in the walrus filesystem using the Datamover nodes via dt* commands.

Transferring Files Between ZIH Systems and Group Drive

To give the Datamover access to your group drive, first copy your public SSH key from the ZIH system to login1.zih.tu-dresden.de.

marie@login$ ssh-copy-id -i ~/.ssh/id_rsa.pub login1.zih.tu-dresden.de
# Export the name of your group drive for reuse of example commands
marie@login$ export GROUP_DRIVE_NAME=<my-drive-name>

Copy data from your group drive to the /data/horse filesystem.

marie@login$ dtrsync -av dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/inputfile /data/horse/ws/marie-workdata/.

Copy data from the /data/horse filesystem to your group drive.

marie@login$ dtrsync -av /data/horse/ws/marie-workdata/resultfile dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/.