Dear All,

After a bumpy ride and a steep learning curve, I have now managed to submit my first successful jobs on the grid! Let me start by saying that this would not have been possible without the help of Daniela Bauer, Simon Fayer, and later also Dan Whitehouse from Imperial.

The purpose of this email is to summarise the experience and offer some tips on how to get from just having a certificate and maybe a simple script submitted to the grid (my situation 10 days ago) to having a working large-scale submission up and running.

1) Get your certificate and read the guidance.
2) I found it useful to install a local copy of diracos / Dirac API following the guidance on the wiki, in particular the Python 3 installation: https://www.gridpp.ac.uk/wiki/Quick_Guide_to_Dirac#Dirac_client_installation Note: I was told that Python 3 will be the default from the end of July. You may want to use CVMFS instead; that is entirely up to you.
3) Once diracos is installed and you have a live proxy, you are ready to go. I source diracos specifically for my submission script, which I adapted from other runs on clusters managed with slurm.
4) Now is the time to think about how to manage your code. In my case my Python package was already released as a source tarball, but I had built it the wrong way. I used python-build, but it is actually much easier to run python setup.py sdist from within the local environment. Otherwise you will likely hit a dependency mismatch somewhere down the line.
5) You will also need an environment to ship with your code.
In this case we create a conda environment from an environment.yml, then use conda constructor to build an installer file (~100MB). Test your environment and code locally: check the versions and check that everything works as expected. In my case I had to tweak my requirements file slightly to pin the versions of the Python dependencies, and I also had to change some versions (pyfftw and numpy).
6) Write a wrapper executable. This is what the worker will call each time. It will:
6.1) Set the script to stop execution whenever an error is hit [set -e]
6.2) Install the environment [bash my-env-installer.sh]
6.3) Source the environment [eval "$(my-env/bin/python my-env/bin/conda shell.bash hook)"] (the leading python is necessary to ensure the correct conda is picked up; thanks to Dan for the help)
6.4) Install your package [pip install my-package.tar.gz]
6.5) Run your script [python my_script.py "${@}"] (you can pass any parameters directly to the wrapper script)
7) Upload the installer, package, wrapper, script, and any other input files to storage. Replicate them across a few more sites. Following a recommendation, I used Imperial to make debugging easier.
8) Initialise and submit your job via the Dirac API [job.setExecutable(f'bash wrapper.sh {args}'); job.setInputSandbox(my_file_list)] (here 'args' could be named keywords; also ensure you prepend 'LFN:' to your gridpp path if you want to run on all sites, otherwise just use job.setInputData).

Thanks again to Simon and Daniela for their help. Please feel free to chime in. If you have any questions about the specifics, please get back to us. I hope I have not forgotten anything critical, and that this will help others get started on the grid in no time.

Best wishes
Giuseppe

--
Dr Giuseppe Congedo (Senior Researcher)
Institute for Astronomy, University of Edinburgh
Royal Observatory, Blackford Hill
Edinburgh, EH9 3HJ

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
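
P.S. As an illustration of step 8, here is a minimal sketch of how the submission could be put together in Python. It is not the exact script used for the runs described in this email. The helper names (as_lfn, wrapper_command, submit_all) and the "--key value" argument style are assumptions made for illustration, and the DIRAC initialisation call may differ depending on your client version, so please check it against the wiki.

```python
import shlex


def as_lfn(path):
    """Prefix a grid storage path with 'LFN:' unless it already has it."""
    return path if path.startswith("LFN:") else f"LFN:{path}"


def wrapper_command(wrapper, params):
    """Build 'bash <wrapper> --key value ...' with shell-safe quoting.

    Assumption: my_script.py accepts named '--key value' options; adapt
    this to whatever your own script expects.
    """
    parts = ["bash", wrapper]
    for key, value in params.items():
        parts += [f"--{key}", str(value)]
    return " ".join(shlex.quote(part) for part in parts)


def submit_all(commands, sandbox):
    """Submit one job per command; needs a DIRAC client and a live proxy.

    DIRAC is imported here, not at module level, so that the helpers
    above can be tried on a machine without the client installed.
    """
    from DIRAC.Core.Base.Script import Script

    # Initialise DIRAC once per process (assumption: this form suits your
    # client version; newer or older clients may need a different call).
    Script.parseCommandLine(ignoreErrors=True)

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    dirac = Dirac()
    results = []
    for command in commands:
        job = Job()
        job.setExecutable(command)
        job.setInputSandbox(sandbox)
        results.append(dirac.submitJob(job))
    return results
```

For example, wrapper_command("wrapper.sh", {"seed": 42}) gives 'bash wrapper.sh --seed 42', and the sandbox list can be built by applying as_lfn to each of your uploaded gridpp paths (installer, package, wrapper, script) before passing it to submit_all.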