Guidelines for using/running jobs on larger HPC machines
Hi all,

Generally speaking, what should the PYOP2_CACHE_DIR and FIREDRAKE_TSFC_KERNEL_CACHE_DIR environment variables be set to? Is it sufficient to have something like:

export PYOP2_CACHE_DIR=$HOME/fd-cache
export FIREDRAKE_TSFC_KERNEL_CACHE_DIR=$HOME/fd-cache

This works on my smaller university cluster, but I wonder whether there is a better directory for this on a system like Edison at NERSC (http://www.nersc.gov/users/computational-systems/edison/).

Also, from what I read, Edison's SLURM scheduler loads the executable onto the allocated compute nodes from the current working directory, which can apparently be really slow. They recommend using something like:

srun --bcast=/tmp/$SLURM_JOB_ID --compress=lz4 ...

if 2000 or more nodes are needed. But even on jobs that require no more than a single compute node (24 cores), the firedrake/python modules seem to load very slowly.

By the way, this is the description of the file storage system on Edison (http://www.nersc.gov/users/computational-systems/edison/file-storage-and-i-o...).

Any help or thoughts appreciated.

Justin
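For reference, a minimal sketch of how these pieces might fit together in a SLURM batch script; the $SCRATCH cache paths, the virtualenv location, and the script name are assumptions for illustration, not verified NERSC settings:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:30:00

# Point the compilation caches at a filesystem visible to the compute
# nodes. Using $SCRATCH here is an assumption; $HOME also works but is
# often slower on large systems.
export PYOP2_CACHE_DIR=$SCRATCH/fd-cache/pyop2
export FIREDRAKE_TSFC_KERNEL_CACHE_DIR=$SCRATCH/fd-cache/tsfc

# Activate the Firedrake virtualenv (path is an assumption).
source $HOME/firedrake/bin/activate

# --bcast copies the launched executable to node-local /tmp before it
# starts; --compress=lz4 compresses it in flight (NERSC's recommendation
# for large node counts).
srun --bcast=/tmp/$SLURM_JOB_ID --compress=lz4 -n 24 python myscript.py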
Hi Justin,
We have work in progress, via Nick Johnson (who has a poster next week at the Firedrake meeting), to make this better. The reason it's so slow to load the Python modules is that although you're sending out the executable and the Python script, the entire Firedrake virtualenv is still being loaded from the shared filesystem. Our plan is to make the virtualenv relocatable in its entirety: when you launch your job, you ship it out to the compute nodes, activate it there, and then run everything locally. I think it's not quite ready for production yet, but we should discuss next week to see if you want to try it out.

Lawrence
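A minimal sketch of the kind of workflow described here, assuming a relocatable virtualenv packed as a tarball; the tarball name and all paths are hypothetical:

# On the login node: pack the (relocatable) virtualenv once.
tar czf $SCRATCH/firedrake-venv.tar.gz -C $HOME firedrake

# In the job script: broadcast the tarball to node-local /tmp,
# unpack it once per node, then activate and run everything locally.
sbcast --compress $SCRATCH/firedrake-venv.tar.gz /tmp/firedrake-venv.tar.gz
srun --ntasks-per-node=1 tar xzf /tmp/firedrake-venv.tar.gz -C /tmp
source /tmp/firedrake/bin/activate
srun -n 24 python myscript.py

The point of this design is that after the one-time broadcast, every import resolves against node-local storage rather than the shared filesystem, which is where the slow module loading comes from.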
participants (2)
- Justin Chang
- Lawrence Mitchell