Neural Networks with TensorFlow¶

TensorFlow is a free end-to-end open-source software library for data flow and differentiable programming across many tasks. It is a symbolic math library, used primarily for machine learning applications. It has a comprehensive, flexible ecosystem of tools, libraries and community resources.

Please check the software modules list via

marie@compute$ module spider TensorFlow
[...]

to find out which TensorFlow modules are available on your cluster.

On ZIH systems, TensorFlow 2 is the default module version. For compatibility hints between TensorFlow 2 and TensorFlow 1, see the corresponding section below.

We recommend using the clusters alpha and/or power for machine learning workflows with the TensorFlow library. You can find detailed hardware specifications in our Hardware documentation.

TensorFlow Console¶

On the cluster alpha, load the module environment:

marie@alpha$ module load release/23.04

Alternatively, you can use the release/23.10 module environment, where the newest versions are available:

marie@alpha$ module load release/23.10 GCC/11.3.0 OpenMPI/4.1.4
Module GCC/11.3.0, OpenMPI/4.1.4 and 14 dependencies loaded.

marie@alpha$ module load TensorFlow/2.9.1
Module TensorFlow/2.9.1 and 35 dependencies loaded.
marie@alpha$ module avail TensorFlow

-------- /software/modules/rapids/r23.10/all/MPI/GCC/11.3.0/OpenMPI/4.1.4 --------
   TensorFlow/2.9.1 (L)

  Where:
   L:  Module is loaded
   *Module:  Some Toolchain, load to access other modules that depend on it
   >Module:  Recommended toolchain version, load to access other modules that depend on it

The following example shows how to load TensorFlow on the cluster power and start working with it using the module system:

marie@power$ module load TensorFlow
Module TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4 and 47 dependencies loaded.

Now we can use TensorFlow. Nevertheless, when working with Python in an interactive job, we recommend using a virtual environment. In the following example, we create a Python virtual environment and import TensorFlow:

Example

marie@power$ ws_allocate -F horse python_virtual_environment 1
Info: creating workspace.
/data/horse/ws/marie-python_virtual_environment
[...]
marie@power$ which python    # check which Python you are using
/sw/installed/Python/3.7.2-GCCcore-8.2.0
marie@power$ virtualenv --system-site-packages /data/horse/ws/marie-python_virtual_environment/env
[...]
marie@power$ source /data/horse/ws/marie-python_virtual_environment/env/bin/activate
marie@power$ python -c "import tensorflow as tf; print(tf.__version__)"
[...]
2.3.1
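Beyond printing the version, it can help to confirm which GPUs TensorFlow actually sees from inside the job. The following is a minimal sketch, not part of the official setup: the helper name describe_tensorflow is made up for illustration, and it assumes TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available.

```python
import importlib.util


def describe_tensorflow() -> dict:
    """Report whether TensorFlow is importable and, if so, its version and visible GPUs."""
    # find_spec avoids an ImportError when no TensorFlow module is loaded.
    if importlib.util.find_spec("tensorflow") is None:
        return {"installed": False, "version": None, "gpus": []}

    import tensorflow as tf

    # An empty list means TensorFlow would fall back to the CPU.
    gpus = [device.name for device in tf.config.list_physical_devices("GPU")]
    return {"installed": True, "version": tf.__version__, "gpus": gpus}


print(describe_tensorflow())
```

If the GPU list is empty although the job was allocated with GPUs, the loaded module or the virtual environment is a likely place to look first.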

TensorFlow in JupyterHub¶

In addition to interactive and batch jobs, it is possible to work with TensorFlow using JupyterHub, which contains a kernel named Python 3 ... TensorFlow that comes with TensorFlow support.

Hint

You can also define your own Jupyter kernel for more specific tasks. Please read about Jupyter kernels and virtual environments in our JupyterHub documentation.

TensorFlow in Containers¶

Another option for using TensorFlow is containers. In the HPC domain, the Singularity container system is a widely used tool. In the following example, we run tensorflow-test in a Singularity container:

marie@power$ singularity shell --nv /data/horse/singularity/powerai-1.5.3-all-ubuntu16.04-py3.img
Singularity>$ export PATH=/opt/anaconda3/bin:$PATH
Singularity>$ source activate /opt/anaconda3    # activate conda environment
(base) Singularity>$ . /opt/DL/tensorflow/bin/tensorflow-activate
(base) Singularity>$ tensorflow-test
Basic test of tensorflow - A Hello World!!!...
[...]

Hint

In the above example, we activate a conda environment. To use conda, it is necessary to configure your shell as described in Python virtual environments.

TensorFlow with Python or R¶

For further information on TensorFlow in combination with Python see data analytics with Python, for R see data analytics with R.

Distributed TensorFlow¶

For details on how to run TensorFlow with multiple GPUs and/or multiple nodes, see distributed training.

Compatibility TF2 and TF1¶

TensorFlow 2.0 includes many API changes, such as reordered arguments, renamed symbols, and changed default values for parameters. As a result, code written for TensorFlow 1.X is in some cases not compatible with TensorFlow 2.X. However, if you are using the high-level APIs (tf.keras), there may be little or no action you need to take to make your code fully TensorFlow 2.0 compatible. It is still possible to run 1.X code, unmodified (except for contrib), in TensorFlow 2.0:

import tensorflow.compat.v1 as tf    # instead of "import tensorflow as tf"
tf.disable_v2_behavior()
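To illustrate what running 1.X code in this way can look like, here is a small sketch of a TF1-style graph with a placeholder and a session, executed through the compatibility module. The function name run_tf1_style_graph and the guarded import are assumptions made for this illustration and do not come from an existing script.

```python
try:
    import tensorflow.compat.v1 as tf  # instead of "import tensorflow as tf"
except ImportError:  # no TensorFlow module loaded in this environment
    tf = None


def run_tf1_style_graph():
    """Build and run a TF1-style graph (placeholder + session) under TensorFlow 2."""
    tf.disable_v2_behavior()  # switch off eager execution and other TF2 defaults

    # Use a private graph so the example does not touch the global default graph.
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=[None], name="x")
        y = 2.0 * x
        with tf.Session() as sess:
            return sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}).tolist()


if tf is not None:
    print(run_tf1_style_graph())
```

Code that relies on tf.contrib is not covered by this approach and has to be ported separately.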

To make the transition to TensorFlow 2.0 as seamless as possible, the TensorFlow team has created the tf_upgrade_v2 utility to help transition legacy code to the new API.
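The utility is a command-line tool that is installed together with TensorFlow 2. As a hedged sketch, the following Python wrapper assembles a call for converting a single file and runs it only if tf_upgrade_v2 is found on the PATH. The function names and the file names legacy_model.py, model_v2.py, and report.txt are placeholders chosen for illustration.

```python
import shutil
import subprocess
from typing import List, Optional


def build_upgrade_command(infile: str, outfile: str,
                          reportfile: str = "report.txt") -> List[str]:
    """Assemble the tf_upgrade_v2 call that converts a single Python file."""
    return ["tf_upgrade_v2",
            "--infile", infile,
            "--outfile", outfile,
            "--reportfile", reportfile]


def upgrade_script(infile: str, outfile: str) -> Optional[int]:
    """Run the conversion and return the exit code, or None if the tool is missing."""
    if shutil.which("tf_upgrade_v2") is None:
        return None  # tool not on PATH, e.g. no TensorFlow 2 module loaded
    result = subprocess.run(build_upgrade_command(infile, outfile), check=False)
    return result.returncode


# Show the command that would be executed for a hypothetical legacy script.
print(" ".join(build_upgrade_command("legacy_model.py", "model_v2.py")))
```

The report file lists the changes the tool made and the places it could not convert automatically, so it is worth reading before trusting the converted script.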

Keras¶

Keras is a high-level neural network API, written in Python and capable of running on top of TensorFlow. Please check the software modules list via

marie@compute$ module spider Keras
[...]

to find out which Keras modules are available on your cluster. TensorFlow should be automatically loaded as a dependency. After loading the module, you can use Keras as usual.
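As a rough illustration of using Keras once the module is loaded, the sketch below builds and briefly trains a small classifier through tf.keras on random data. The layer sizes, the function name build_classifier, and the synthetic data are arbitrary choices for demonstration and not a recommended model.

```python
try:
    import tensorflow as tf
except ImportError:  # no TensorFlow/Keras module loaded in this environment
    tf = None


def build_classifier(num_features: int, num_classes: int):
    """Create and compile a small fully connected classifier with tf.keras."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


if tf is not None:
    model = build_classifier(num_features=4, num_classes=3)
    # Random inputs and integer labels stand in for a real dataset.
    features = tf.random.uniform((32, 4))
    labels = tf.random.uniform((32,), maxval=3, dtype=tf.int32)
    model.fit(features, labels, epochs=1, verbose=0)
    print(model.predict(features[:2], verbose=0).shape)
```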