Hi Simon,

Thanks so much. That helped indeed: the job was correctly dispatched to a site and the data were correctly downloaded, which I think is already good progress!

But it then ran into problems and crashed a few minutes later. The stdout and stderr don't say much, so I'm not sure what happened. Also, peek in Ganga is not very informative: it shows only a warning about a non-existent scratch directory. Any clues?

Thanks again for your help.

Giuseppe



==================================================================================
Last 8 lines of application output from JobWrapper on 2017-03-18 14:54:47.985686 :
CPU Total: 00:00:02 (h:m:s) Normalized CPU Total 24.7 s @ HEP'06
==================================================================================
2017-03-18 14:54:39 UTC dirac-jobexec   INFO: JobID: 2335477 
2017-03-18 14:54:39 UTC dirac-jobexec   INFO: DIRAC JobID 2335477 is running at site VAC.UKI-NORTHGRID-MAN-HEP.uk 
Executing StepInstance RunScriptStep1 of type ScriptStep1 ['ScriptStep1']
2017-03-18 14:54:40 UTC dirac-jobexec/Script   INFO: Command is: /scratch/plt00/2335477/exe-script.py 
2017-03-18 14:54:40 UTC dirac-jobexec/Script  ERROR: Non-zero status while executing 2: /scratch/plt00/2335477/exe-script.py
2017-03-18 14:54:40 UTC dirac-jobexec/Script   INFO: Output written to Ganga_Executable.log, execution complete. 
2017-03-18 14:54:40 UTC dirac-jobexec/Script  ERROR: 'exe-script.py' Exited With Status 2 
2017-03-18 14:54:40 UTC dirac-jobexec/Script   INFO: ===== Terminating  =====  


Source | Status | MinorStatus | ApplicationStatus | DateTime
JobManager | Received | Job accepted | Unknown | 2017-03-18 14:46
JobPath | Checking | JobSanity | Unknown | 2017-03-18 14:46
JobSanity | Checking | JobScheduling | Unknown | 2017-03-18 14:46
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 14:46
Matcher | Matched | Assigned | Unknown | 2017-03-18 14:54
- | Matched | Job Received by Agent | Unknown | 2017-03-18 14:54
- | Matched | Submitted To CE | Unknown | 2017-03-18 14:54
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 14:54
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 14:54
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 14:54
JobWrapper | Running | Application | Unknown | 2017-03-18 14:54
Job_2335477 | Running | Application | Executing RunScriptStep1 | 2017-03-18 14:54
Job_2335477 | Running | Application | exe-script.py Exited With Status 2 | 2017-03-18 14:54
Job_2335477 | Running | Application | exe-script.py Exited With Status 2 | 2017-03-18 14:54
JobWrapper | Completed | Application Finished With Errors | exe-script.py Exited With Status 2 | 2017-03-18 14:54
JobWrapper | Failed | Application Finished With Errors | exe-script.py Exited With Status 2 | 2017-03-18 14:54



On 18/03/2017 14:37, Will Furnell wrote:

Hi,

I've had the same problem with Ganga. Adding the following to your job submission seems to make it add only InputSandbox to the JDL:

j.backend.settings['InputData'] = ''
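
One way to confirm that the setting took effect is to look at the generated JDL and check that no LFN is listed in both InputSandbox and InputData. Below is a minimal sketch in plain Python (it does not use Ganga or DIRAC). The JDL fragments are simplified, hypothetical examples rather than the real output, the helper names are made up for illustration, and it assumes the workaround leaves an empty InputData entry in the JDL:

```python
import re


def jdl_list(jdl, key):
    """Return the quoted values assigned to `key` in a JDL string."""
    pattern = r'\b' + re.escape(key) + r'\s*=\s*(\{[^}]*\}|"[^"]*")\s*;'
    match = re.search(pattern, jdl)
    if match is None:
        return []
    # Drop empty strings so that InputData = ""; counts as "no input data".
    return [item for item in re.findall(r'"([^"]*)"', match.group(1)) if item]


def strip_lfn_prefix(name):
    """Remove a leading 'LFN:' so both sections can be compared."""
    return name[4:] if name.startswith("LFN:") else name


def duplicated_lfns(jdl):
    """List LFNs that appear in both InputSandbox and InputData."""
    sandbox = {strip_lfn_prefix(s) for s in jdl_list(jdl, "InputSandbox")
               if s.startswith("LFN:")}
    data = {strip_lfn_prefix(d) for d in jdl_list(jdl, "InputData")}
    return sorted(sandbox & data)


# Hypothetical, simplified JDL before the workaround.
jdl_before = '''
InputSandbox = {"exe-script.py", "LFN:/gridpp/user/giuseppe.congedo/sim/.lensmc_cache"};
InputData = {"LFN:/gridpp/user/giuseppe.congedo/sim/.lensmc_cache"};
'''

# Hypothetical JDL after the workaround (assumed to leave InputData empty).
jdl_after = '''
InputSandbox = {"exe-script.py", "LFN:/gridpp/user/giuseppe.congedo/sim/.lensmc_cache"};
InputData = "";
'''

print("before:", duplicated_lfns(jdl_before))
print("after: ", duplicated_lfns(jdl_after))
```

If the second list comes back empty, the files are only being requested through the sandbox, which is what you want here.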

Will.


On 18/03/2017 14:28, Giuseppe Congedo wrote:
Hi Simon,

Thanks so much for looking into this. I've fixed up the paths - thanks for spotting this. It now looks like the job got closer to starting, but again it's been put on hold because it doesn't see the files - please find the log below.

Also, I've been looking at the JDL and I do indeed have both an InputSandbox section (with LFN paths to files) and an InputData section (with the same files). I suspect it's exactly the problem you mentioned. In fact, two identical jobs were both submitted to LCG.UKI-SCOTGRID-ECDF.uk where my data live. Please look at jobs no. 2335476 and 2335466. I'm puzzled by the fact that I only set this:

j.inputfiles = [ DiracFile(...

in my Ganga job submission file. Is this correct? If so, why is Dirac trying to use InputData?

Thanks again for all your useful tips.

Giuseppe

---
Source | Status | MinorStatus | ApplicationStatus | DateTime
JobManager | Received | Job accepted | Unknown | 2017-03-18 13:12
JobPath | Checking | JobSanity | Unknown | 2017-03-18 13:12
JobSanity | Checking | InputData | Unknown | 2017-03-18 13:12
InputData | Checking | JobScheduling | Unknown | 2017-03-18 13:12
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 13:12
Matcher | Matched | Assigned | Unknown | 2017-03-18 13:16
- | Matched | Job Received by Agent | Unknown | 2017-03-18 13:16
- | Matched | Submitted To CE | Unknown | 2017-03-18 13:16
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 13:16
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 13:16
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 13:16
JobWrapper | Running | Application | Failed Input Sandbox Download | 2017-03-18 13:16
JobWrapper | Rescheduled | Input Sandbox Download | Failed Input Sandbox Download | 2017-03-18 13:16
JobManager | Received | Job Rescheduled | Unknown | 2017-03-18 13:16
JobPath | Checking | JobSanity | Unknown | 2017-03-18 13:16
JobSanity | Checking | InputData | Unknown | 2017-03-18 13:16
InputData | Checking | JobScheduling | Unknown | 2017-03-18 13:16
JobScheduling | Checking | JobScheduling | On Hold: after rescheduling 1 | 2017-03-18 13:16
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 13:20
Matcher | Matched | Assigned | Unknown | 2017-03-18 13:24
- | Matched | Job Received by Agent | Unknown | 2017-03-18 13:24
- | Matched | Submitted To CE | Unknown | 2017-03-18 13:24
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 13:24
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 13:24
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 13:24
JobWrapper | Running | Application | Failed Input Sandbox Download | 2017-03-18 13:24
JobWrapper | Rescheduled | Input Sandbox Download | Failed Input Sandbox Download | 2017-03-18 13:24
JobManager | Received | Job Rescheduled | Unknown | 2017-03-18 13:24
JobPath | Checking | JobSanity | Unknown | 2017-03-18 13:24
JobSanity | Checking | InputData | Unknown | 2017-03-18 13:24
InputData | Checking | JobScheduling | Unknown | 2017-03-18 13:24
JobScheduling | Checking | JobScheduling | On Hold: after rescheduling 2 | 2017-03-18 13:24
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 13:30
Matcher | Matched | Assigned | Unknown | 2017-03-18 13:34
- | Matched | Job Received by Agent | Unknown | 2017-03-18 13:34
- | Matched | Submitted To CE | Unknown | 2017-03-18 13:34
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 13:34
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 13:34
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 13:34
JobWrapper | Running | Application | Failed Input Sandbox Download | 2017-03-18 13:34
JobWrapper | Rescheduled | Input Sandbox Download | Failed Input Sandbox Download | 2017-03-18 13:34
JobManager | Received | Job Rescheduled | Unknown | 2017-03-18 13:34
JobPath | Checking | JobSanity | Unknown | 2017-03-18 13:34
JobSanity | Checking | InputData | Unknown | 2017-03-18 13:34
InputData | Checking | JobScheduling | Unknown | 2017-03-18 13:34
JobScheduling | Checking | JobScheduling | On Hold: after rescheduling 3 | 2017-03-18 13:34
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 13:45
Matcher | Matched | Assigned | Unknown | 2017-03-18 13:48
- | Matched | Job Received by Agent | Unknown | 2017-03-18 13:48
- | Matched | Submitted To CE | Unknown | 2017-03-18 13:48
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 13:48
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 13:48
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 13:48
JobWrapper | Running | Application | Failed Input Sandbox Download | 2017-03-18 13:48
JobWrapper | Rescheduled | Input Sandbox Download | Failed Input Sandbox Download | 2017-03-18 13:48
JobManager | Received | Job Rescheduled | Unknown | 2017-03-18 13:48
JobPath | Checking | JobSanity | Unknown | 2017-03-18 13:48
JobSanity | Checking | InputData | Unknown | 2017-03-18 13:48
InputData | Checking | JobScheduling | Unknown | 2017-03-18 13:48
JobScheduling | Checking | JobScheduling | On Hold: after rescheduling 4 | 2017-03-18 13:48
JobScheduling | Waiting | Pilot Agent Submission | Unknown | 2017-03-18 13:59
Matcher | Matched | Assigned | Unknown | 2017-03-18 14:02
- | Matched | Job Received by Agent | Unknown | 2017-03-18 14:02
- | Matched | Submitted To CE | Unknown | 2017-03-18 14:02
JobWrapper | Running | Job Initialization | Unknown | 2017-03-18 14:02
JobWrapper | Running | Downloading InputSandbox | Unknown | 2017-03-18 14:02
JobWrapper | Running | Downloading InputSandbox LFN(s) | Unknown | 2017-03-18 14:02
JobWrapper | Running | Application | Failed Input Sandbox Download | 2017-03-18 14:02
JobWrapper | Rescheduled | Input Sandbox Download | Failed Input Sandbox Download | 2017-03-18 14:02
JobManager | Received | Job Rescheduled | Unknown | 2017-03-18 14:02
JobPath | Checking | JobSanity | Unknown | 2017-03-18 14:02
JobSanity | Checking | InputData | Unknown | 2017-03-18 14:02
InputData | Checking | JobScheduling | Unknown | 2017-03-18 14:02
JobScheduling | Checking | JobScheduling | On Hold: after rescheduling 5 | 2017-03-18 14:02
---




On 18/03/2017 12:33, Simon Fayer wrote:
Hi Giuseppe,

You can generally get more detailed information about why a job failed from
the web interface at: https://dirac.gridpp.ac.uk
(Although you have to have your grid certificate installed in your browser
to access this).

In this case your jobs have the Application Status of "Input data not
available", which as you correctly surmised indicates a problem with the
inputfiles.

I've had a look at the JDL which Ganga generated and you have three LFN
based files in the sandbox:

"LFN:/gridpp/user/giuseppe.congedo/sim/image_0_details.fits",
"LFN:/gridpp/user/giuseppe.congedo/PSF/simssbc_cb2004a_001.fits_0.000_0.804_1.00.fits",
"LFN:/gridpp/user/giuseppe.congedo/sim/.lensmc_cache",

If I look in the catalogue (using the dirac-dms-filecatalog-cli tool), it
seems like you used a slightly different path for these files when you
uploaded them:

/gridpp/user/giuseppe.congedo/sim/sim/image_0_details.fits
/gridpp/user/giuseppe.congedo/sim/PSF/simssbc_cb2004a_001.fits_0.000_0.804_1.00.fits
/gridpp/user/giuseppe.congedo/sim/.lensmc_cache

(Note the extra "sim/" directory in the first two paths). It should simply
be a case of adjusting the paths in your ganga script to match the file
catalogue.
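
If it helps, this kind of mismatch can be spotted with a few lines of plain Python before submitting. This is only a sketch: it works on two hand-written lists (the LFNs from the JDL and the paths listed in the catalogue above) rather than querying the catalogue itself, and the function name is made up for illustration:

```python
import posixpath


def find_lfn_mismatches(requested, catalogue):
    """Map each requested LFN missing from the catalogue to catalogue
    entries that share its file name (likely intended paths)."""
    catalogue_set = set(catalogue)
    by_name = {}
    for path in catalogue:
        by_name.setdefault(posixpath.basename(path), []).append(path)

    mismatches = {}
    for lfn in requested:
        # The JDL prefixes each entry with "LFN:"; the catalogue does not.
        path = lfn[4:] if lfn.startswith("LFN:") else lfn
        if path not in catalogue_set:
            mismatches[lfn] = by_name.get(posixpath.basename(path), [])
    return mismatches


# LFNs as they appear in the JDL.
requested = [
    "LFN:/gridpp/user/giuseppe.congedo/sim/image_0_details.fits",
    "LFN:/gridpp/user/giuseppe.congedo/PSF/simssbc_cb2004a_001.fits_0.000_0.804_1.00.fits",
    "LFN:/gridpp/user/giuseppe.congedo/sim/.lensmc_cache",
]

# Paths as they appear in the file catalogue.
catalogue = [
    "/gridpp/user/giuseppe.congedo/sim/sim/image_0_details.fits",
    "/gridpp/user/giuseppe.congedo/sim/PSF/simssbc_cb2004a_001.fits_0.000_0.804_1.00.fits",
    "/gridpp/user/giuseppe.congedo/sim/.lensmc_cache",
]

for lfn, candidates in find_lfn_mismatches(requested, catalogue).items():
    print(lfn, "->", candidates)
```

Only the first two LFNs are reported, each with the catalogue path that has the same file name, so those are the two entries to change in the script.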

It also looks like you may have put these as inputdata as well... There is
no need to do this, just specifying them in the inputfiles is
sufficient (if you set inputdata, the jobs have to run at a site with those
files, whereas inputfiles can be downloaded at any site).

Please let us know if that doesn't work or if you run into any further
problems.

Regards,
Simon


On Sat, Mar 18, 2017 at 11:21:56AM +0000, Giuseppe Congedo wrote:
Hello all,

I am trying to move my workflow to gridpp, but the problem is, I am a
newbie! Andrew Lahiff has been very helpful in giving me good advice along
the way - thanks. I hope you too could help me with this.

I set up code, software environment, and data, and it works fine locally on
Ganga. To move things to the grid, I put code & environment on cvmfs (thanks
Catalin Condurache), and uploaded data to the DFC. After submission, I got the
error attached below. It looks like the job could not see the input data,
and also some output sandbox was not properly initialised.

My submission script looks like the example in the userguide. I suspect I
might be doing something wrong perhaps here?

j.inputfiles = [
DiracFile('LFN:/gridpp/user/giuseppe.congedo/my_folder/my_file'),

Please note that the job fails whether or not I set DiracLFNBase in
.gangarc.

I look forward to hearing from you, and thanks so much for your help.

Giuseppe

---
ERROR    Unable to finalise job after 36 retries due to error:
GangaDiracError: No Output sandbox registered for job 2335379
INFO     job 36 status changed to "failed"
WARNING  An error occured finalising job: 36
WARNING  Attemting again (2 of 5) after 2.5-sec delay
INFO     job 36 status changed to "submitted"
INFO     job 36 status changed to "running"
INFO     job 36 status changed to "failed"
WARNING  An error occured finalising job: 36
WARNING  Attemting again (3 of 5) after 2.5-sec delay
INFO     job 36 status changed to "submitted"
INFO     job 36 status changed to "running"
INFO     job 36 status changed to "failed"
WARNING  An error occured finalising job: 36
WARNING  Attemting again (4 of 5) after 2.5-sec delay
INFO     job 36 status changed to "submitted"
INFO     job 36 status changed to "running"
INFO     job 36 status changed to "failed"
WARNING  An error occured finalising job: 36
WARNING  Attemting again (5 of 5) after 2.5-sec delay
ERROR    GangaDiracError: No Output sandbox registered for job 2335379
---



