Taurus

Taurus is the next-generation HRSK installation (HRSK-II), built by the vendor Bull and deployed in two phases. It is a set of three tightly coupled clusters and additionally offers two large shared-memory nodes with 1 TByte of main memory each. More information on the hardware can be found here.

Applying for Access to the System

Project and login application forms for Taurus are available here.

Login to the System

Login to the system is available via SSH at taurus.hrsk.tu-dresden.de. There are three login nodes (internally called tauruslogin3 to tauruslogin5). Currently, if you use taurus.hrsk.tu-dresden.de, you will be placed on tauruslogin4. It might be a good idea to give the other two login nodes a try if the load on tauruslogin4 is rather high (there will once again be a load balancer soon, but at the moment there is none).
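
A login from the command line could look like this (replace <username> with your ZIH login; the second line is only an option, not a requirement):

ssh <username>@taurus.hrsk.tu-dresden.de
ssh -X <username>@taurus.hrsk.tu-dresden.de   # with X11 forwarding for graphical applications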

Please note that if you store data on the local disk (e.g. under /tmp), it will reside on only one of the three nodes. If you log in again and the data is not there, you are probably on another node.

The RSA fingerprints of the Phase 2 Login nodes are:

MD5:cf:c8:72:9c:7b:ca:88:ec:1f:52:e2:f4:c0:ba:9e:b0

and
SHA256:7hpn/HpOCYJ1xeQX5nWGcvspzB3MNO42c4L1PrbgXH0

You can find a list of fingerprints here.

Transferring Data from/to Taurus

Taurus has two specialized data transfer nodes. Both are accessible via taurusexport.hrsk.tu-dresden.de. Currently, only rsync, scp and sftp to these nodes will work. A login via SSH is not possible, as these nodes are dedicated to data transfers.
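
For example, a transfer could look like this (the target directory below /scratch is only an assumption; use the path of your own project or user directory):

scp results.tar <username>@taurusexport.hrsk.tu-dresden.de:/scratch/<username>/
rsync -avP ./results/ <username>@taurusexport.hrsk.tu-dresden.de:/scratch/<username>/results/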

These nodes are located behind a firewall. By default, they are only accessible from IP addresses within the campus of the TU Dresden. External IP addresses can be enabled upon request. These requests should be sent via e-mail to servicedesk@tu-dresden.de and mention the IP address range (or node names), the desired protocol, and the time frame for which the firewall should be opened.

We are open to discussing options to export the data in the scratch file system via CIFS or other protocols. If you have a need for this, please contact the Service Desk as well.

Phase 2: The nodes taurusexport[3,4] provide access to the /scratch file system of the second phase.

The RSA fingerprints of the Phase 2 Export nodes are:
MD5:d6:b7:29:88:51:96:b9:cb:d5:81:ef:75:46:67:22:f8

and
SHA256:Nf/x8pD7c4GC0zV8ThfPHiqieKsuF/qHctVzsU36Lic

You can find a list of fingerprints here.

Compiling Parallel Applications

You have to explicitly load a compiler module and an MPI module on Taurus, e.g. with module load intel bullxmpi. (Read more about Modules, read more about Compilers.)

Use the wrapper commands mpicc, mpiCC, mpif77, or mpif90 to compile MPI source code. They use the currently loaded compiler. To reveal the command lines behind the wrappers, use the option -show.
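
A typical compile step could look like this (the source file name hello_mpi.c is just an example):

module load intel bullxmpi           # compiler and MPI module as described above
mpicc -show                          # reveal the wrapper's underlying command line (see above)
mpicc -O2 -o hello_mpi hello_mpi.c   # compile an MPI C program with the loaded compiler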

For running your code, you have to load the same compiler and MPI modules as for compiling the program. Please follow the guidelines below to run your parallel program using the batch system.

Batch System

Applications on an HPC system cannot be run on the login nodes. They have to be submitted to compute nodes with dedicated resources for the user's job. Normally a job is submitted with the following data (see the example job script after this list):
  • number of CPU cores,
  • whether the requested CPU cores have to be on one node (OpenMP programs) or can be distributed across nodes (MPI),
  • memory per process,
  • maximum wall clock time (after reaching this limit the process is killed automatically),
  • files for redirection of output and error messages,
  • executable and command line parameters.
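
A minimal sbatch job script covering these items could look like this (the script name myjob.sh, the program name my_program, and the resource values are only placeholders):

#!/bin/bash
#SBATCH --ntasks=16              # number of CPU cores (MPI tasks)
#SBATCH --nodes=1                # keep all cores on one node (omit to allow distribution for MPI)
#SBATCH --mem-per-cpu=2000       # memory per process in MB
#SBATCH --time=08:00:00          # maximum wall clock time
#SBATCH --output=myjob.out       # file for standard output
#SBATCH --error=myjob.err        # file for error messages

srun ./my_program --some-option  # executable and command line parameters

Submit the script with sbatch myjob.sh.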

The batch system on Taurus is Slurm. If you are migrating from LSF (deimos, mars, atlas), the biggest difference is that Slurm has no notion of batch queues any more.

Partitions

Please note that the islands are also present as partitions for the batch system. They are called
  • sandy (Island 1 - Sandy Bridge CPUs)
  • west (Island 3 - Westmere CPUs)
  • haswell (Islands 4 to 6 - Haswell CPUs)
  • gpu (Island 2 - GPUs)
    • gpu1 (K20X)
    • gpu2 (K80)
  • smp1, smp2 (SMP nodes)
Note: usually you don't have to specify a partition explicitly with the parameter -p, because Slurm will automatically select a suitable partition depending on your memory and GRES requirements.
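
If you do need a specific partition, pass it explicitly, for example (myjob.sh is a placeholder):

sbatch -p haswell myjob.sh       # force the job onto the Haswell islands
srun -p west -n 4 --pty bash     # interactive shell on a Westmere node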

Run-time Limits

Run-time limits are enforced. This means a job will be canceled as soon as it exceeds its requested limit. On Taurus, the maximum run time is 7 days.

Shorter jobs come with multiple advantages:
  • lower risk of loss of computing time,
  • shorter waiting time for reservations,
  • higher job fluctuation; thus, jobs with high priorities may start faster.
To bring down the percentage of long-running jobs, we restrict the number of cores used by jobs longer than 2 days to approximately 50% of the total number of cores, and by jobs longer than 24 hours to 75%. (These numbers are subject to change.) As best practice we advise a run time of about 8 hours.

Please always try to make a realistic estimate of the time limit you need. You can use a command line like the following to compare the requested time limit with the elapsed time of your completed jobs that started after a given date:

sacct -X -S 2017-04-10 --format=start,JobID,elapsed,timelimit -s COMPLETED

Memory Limits

Memory limits are enforced. This means that jobs which exceed their per-node memory limit will be killed automatically by the batch system. Memory requirements for your job can be specified via the sbatch/srun parameters --mem-per-cpu=<MB> or --mem=<MB> (the latter is memory per node). The default limit is 300 MB per CPU.

Taurus has sets of nodes with a different amount of installed memory which affect where your job may be run. To achieve the shortest possible waiting time for your jobs, you should be aware of the limits shown in the following table.
Partition | Nodes | # Nodes | Cores per Node | Avail. Memory per Core | Avail. Memory per Node
haswell | taurusi[4001-4104] | 104 | 24 | 2583 MB | 62000 MB
haswell | taurusi[4105-4188] | 84 | 24 | 5250 MB | 126000 MB
haswell | taurusi[4189-4232] | 44 | 24 | 10583 MB | 254000 MB
haswell | taurusi[5001-5612] | 612 | 24 | 2583 MB | 62000 MB
haswell | taurusi[6001-6612] | 612 | 24 | 2583 MB | 62000 MB
sandy | taurusi[1001-1228] | 228 | 16 | 1875 MB | 30000 MB
sandy | taurusi[1229-1256] | 28 | 16 | 3875 MB | 62000 MB
sandy | taurusi[1257-1270] | 14 | 16 | 7875 MB | 126000 MB
triton | taurusi[3001-3036] | 36 | 12 | 3875 MB | 46500 MB
west | taurusi[3037-3180] | 144 | 12 | 3875 MB | 46500 MB
gpu1 | taurusi[2001-2042] | 42 | 16 | 3000 MB | 48000 MB
gpu2 | taurusi[2045-2106] | 62 | 24 | 2583 MB | 62000 MB
gpu1-interactive | taurusi[2001-2044] | 2 | 16 | 3000 MB | 48000 MB
gpu2-interactive | taurusi[2045-2108] | 2 | 24 | 2583 MB | 62000 MB
smp1 | taurussmp[1-2] | 2 | 32 | 31875 MB | 1020000 MB
smp2 | taurussmp[3-7] | 5 | 56 | 36500 MB | 2044000 MB
knl | taurusknl[1-32] | 32 | 64 | 1468 MB | 94000 MB
broadwell | taurusi[4233-4264] | 32 | 28 | 2214 MB | 62000 MB
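
As an example of how to use the table: to remain eligible for the standard haswell nodes (62000 MB per node, 24 cores), request at most 2583 MB per core; a larger value will typically restrict your job to the fewer 126000 MB or 254000 MB nodes and thus increase the waiting time. A corresponding request (values taken from the table above) could look like this:

#SBATCH -p haswell
#SBATCH --ntasks=24
#SBATCH --mem-per-cpu=2583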

Submission of Parallel Jobs

To run MPI jobs, ensure that the same MPI module is loaded as at compile time. If in doubt, check your loaded modules with module list. If your code has been compiled with the standard bullxmpi installation, you can load the module via module load bullxmpi.

Please pay attention to the messages you get loading the module. They are more up-to-date than this manual.
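
A simple parallel run could then look like this (my_mpi_program is a placeholder; add further resource options as needed):

module load bullxmpi                          # same MPI module as at compile time
srun -n 16 --time=01:00:00 ./my_mpi_program   # launch 16 MPI tasks via the batch system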

GPUs

Island 2 of Taurus contains a total of 88 NVIDIA Tesla K20X GPUs in 44 nodes (Phase 1) and 256 NVIDIA Tesla K80 GPUs in 64 nodes (Phase 2).

More information on how to program applications for GPUs can be found at GPUProgramming.

The following software modules on Taurus offer GPU support (see the usage example after this list):
  • cuda : The NVIDIA CUDA compilers
  • pgi : The PGI compilers with OpenACC support
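
A GPU job could be requested like this; note that the GRES resource name gpu and the count of 1 are assumptions, not confirmed by this page:

srun -p gpu2 --gres=gpu:1 -n 1 --time=01:00:00 --pty bash   # interactive shell on a K80 node, assuming GRES name "gpu"
module load cuda                                            # then load the CUDA toolkit listed above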

Intel Xeon Phi (Knights Landing - KNL)

32 nodes with Intel's many-core processor (64 cores each) can be used via the Slurm partition knl. These nodes can only be used in exclusive mode. A simple interactive bash shell can be started via Slurm like this:
srun -p knl -n 1 -c 64 --mem=90000 --pty bash

Please also see the information at KnlNodes.

Energy Measurement

Taurus contains sophisticated energy measurement instrumentation. In particular, HDEEM is available on the Haswell nodes of Phase 2. More detailed information can be found at EnergyMeasurement.

Low-Level Optimizations

x86 processors provide registers that can be used for optimizations and performance monitoring. Taurus provides access to such features via the x86_adapt software infrastructure.