Inspect Model Training with TensorBoard¶
TensorBoard is a visualization toolkit for TensorFlow. It can plot metrics such as loss and accuracy, visualize the model graph, and profile the application.
The easiest way to use TensorBoard is via JupyterHub. By default, TensorBoard is configured to read log data from /tmp/<username>/tf-logs on the compute node on which the Jupyter session is running.
To show your own log data from a different directory, create a soft link to that directory in /tmp/<username>/tf-logs so that TensorBoard can read it.
Note that the directory /tmp/<username>/tf-logs might not exist yet, in which case you have to create it first. To do so, open a "New Launcher" (Ctrl+Shift+L) and select a "Terminal" session.
This starts a new terminal on the respective compute node. There you can create the directory /tmp/<username>/tf-logs and link it with the directory where your own log data is located.
Assuming you use a line like the following in your code:
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="/home/marie/logs")
You can then make these logs available to TensorBoard from the Jupyter terminal with:
mkdir -p /tmp/$USER/tf-logs
ln -s /home/marie/logs /tmp/$USER/tf-logs
Refresh the TensorBoard tab if needed.
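If you would rather set up the link from a notebook cell than from a terminal, the same two steps can be sketched in Python. This is only an illustration under assumptions: the helper name `link_log_dir` is hypothetical, the default location mirrors /tmp/<username>/tf-logs from the text above, and a Linux compute node is assumed.

```python
import getpass
from pathlib import Path


def link_log_dir(source, tf_logs_root=None):
    """Link `source` into the directory that TensorBoard reads.

    Hypothetical helper mirroring `mkdir -p` followed by `ln -s`.
    """
    source = Path(source)
    if tf_logs_root is None:
        # Assumed default, taken from the path named in the text above.
        tf_logs_root = Path("/tmp") / getpass.getuser() / "tf-logs"
    root = Path(tf_logs_root)

    # Equivalent of `mkdir -p`: no error if the directory already exists.
    root.mkdir(parents=True, exist_ok=True)

    # `ln -s SRC DIR` places a link named like SRC inside the existing DIR.
    link = root / source.name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier session
    link.symlink_to(source, target_is_directory=True)
    return link
```

Calling `link_log_dir("/home/marie/logs")` would create /tmp/<username>/tf-logs/logs pointing at the log directory. If a regular file or directory with that name already exists, `symlink_to` raises `FileExistsError` rather than overwriting it.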
Using TensorBoard from Module Environment¶
On ZIH systems, TensorBoard is also available as an extension of the TensorFlow module. To check whether a specific TensorFlow module provides TensorBoard, use the following command:
marie@compute$ module spider TensorFlow/2.3.1
[...]
Included extensions
===================
absl-py-0.10.0, astor-0.8.0, astunparse-1.6.3, cachetools-4.1.1, gast-0.3.3,
google-auth-1.21.3, google-auth-oauthlib-0.4.1, google-pasta-0.2.0,
grpcio-1.32.0, Keras-Preprocessing-1.1.2, Markdown-3.2.2, oauthlib-3.1.0,
opt-einsum-3.3.0, pyasn1-modules-0.2.8, requests-oauthlib-1.3.0, rsa-4.6,
tensorboard-2.3.0, tensorboard-plugin-wit-1.7.0, TensorFlow-2.3.1,
tensorflow-estimator-2.3.0, termcolor-1.1.0, Werkzeug-1.0.1, wrapt-1.12.1
If tensorboard is listed in the Included extensions section of the output, TensorBoard is available with that module.
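The check can also be scripted. The following is a minimal sketch, assuming the text printed by module spider has already been captured into a string and that wrapped lines may break after a hyphen. The helper name `included_extensions` is hypothetical and not part of the module system, and the sample string is a shortened stand-in for real output.

```python
import re


def included_extensions(spider_output):
    """Return {extension name: version} parsed from `module spider` text."""
    _, found, tail = spider_output.partition("Included extensions")
    if not found:
        return {}

    # Drop the "=====" underline below the section title.
    tail = re.sub(r"=+", " ", tail)
    # Rejoin names that were wrapped after a hyphen, e.g. "opt-\n einsum".
    tail = re.sub(r"-\s+", "-", tail)

    extensions = {}
    for item in tail.split(","):
        # The version is everything after the last hyphen.
        name, sep, version = item.strip().rpartition("-")
        if sep and name:
            extensions[name] = version
    return extensions


# Shortened sample for illustration only.
sample = (
    "Included extensions\n"
    "===================\n"
    "absl-py-0.10.0, opt-\n"
    "einsum-3.3.0, tensorboard-2.3.0,\n"
    "tensorboard-plugin-wit-1.7.0\n"
)
print("tensorboard" in included_extensions(sample))  # prints: True
```

Splitting at the last hyphen keeps `tensorboard` and `tensorboard-plugin-wit` apart, so the plugin alone would not be mistaken for TensorBoard itself.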
To use TensorBoard, you have to connect via ssh to the ZIH system as usual, schedule an interactive job and load a TensorFlow module:
marie@compute$ module load TensorFlow/2.3.1
Module TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4 and 47 dependencies loaded.
Then, create a workspace for the event data that should be visualized in TensorBoard. If you already have an event data directory, you can skip this step.
marie@compute$ ws_allocate -F scratch tensorboard_logdata 1
Info: creating workspace.
/scratch/ws/1/marie-tensorboard_logdata
[...]
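With the workspace in place, each training run can write its event files into its own subdirectory, which TensorBoard then shows as a separate run. The following sketch illustrates one common convention, a timestamped directory per run. The helper name `run_log_dir` is hypothetical, and the timestamp naming is an assumption rather than a requirement of TensorBoard or the ZIH systems.

```python
from datetime import datetime
from pathlib import Path


def run_log_dir(base_dir, now=None):
    """Create and return a per-run log directory below `base_dir`."""
    now = now or datetime.now()
    # One subdirectory per run, named like 20210102-030405.
    log_dir = Path(base_dir) / now.strftime("%Y%m%d-%H%M%S")
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir


# Hypothetical usage with the Keras callback (requires TensorFlow,
# therefore shown as a comment only):
#
#   workspace = "/scratch/ws/1/marie-tensorboard_logdata"
#   tensorboard_callback = tf.keras.callbacks.TensorBoard(
#       log_dir=str(run_log_dir(workspace)))
```

Passing the workspace itself to `tensorboard --logdir` would then list every run stored below it.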
Now, you can run your TensorFlow application. Note that you might have to adapt your code so that it writes event data that TensorBoard can read. Please find further information on the official TensorBoard website. Then, you can start TensorBoard and pass the directory of the event data:
marie@compute$ tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
[...]
TensorBoard 2.3.0 at http://taurusi8034.taurus.hrsk.tu-dresden.de:6006/
[...]
TensorBoard then prints a server address on Taurus, e.g. taurusi8034.taurus.hrsk.tu-dresden.de:6006.
To access TensorBoard, you have to set up port forwarding via SSH to your local machine:
marie@local$ ssh -N -f -L 6006:taurusi8034:6006 taurus
The previous SSH command requires that you have already set up your SSH configuration.
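Since the compute node changes from job to job, the forwarding command has to be adapted each time. As a rough sketch, it can be derived from the address that TensorBoard prints. The helper name `forward_command` is hypothetical, and it assumes that the short node name is reachable through the SSH host alias `taurus`, as in the command above.

```python
from urllib.parse import urlsplit


def forward_command(tensorboard_url, ssh_host="taurus", local_port=None):
    """Build the `ssh -L` argument list for a TensorBoard server address."""
    parts = urlsplit(tensorboard_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"no host and port in {tensorboard_url!r}")

    # Use the short node name, e.g. taurusi8034, as in the manual command.
    node = parts.hostname.split(".")[0]
    # Default to the same port locally; pick another one if it is in use.
    if local_port is None:
        local_port = parts.port
    return ["ssh", "-N", "-f", "-L",
            f"{local_port}:{node}:{parts.port}", ssh_host]


url = "http://taurusi8034.taurus.hrsk.tu-dresden.de:6006/"
print(" ".join(forward_command(url)))
# prints: ssh -N -f -L 6006:taurusi8034:6006 taurus
```

If port 6006 is already in use on your local machine, passing for example `local_port=16006` changes only the first number of the `-L` argument, and TensorBoard is then reached on that local port instead.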
Now, you can open TensorBoard in the browser on your local machine via the forwarded local port 6006.
Note that you can also use TensorBoard in an sbatch file.