Hi Simon,
Thanks. And now it is happening at Imperial too.
Regards, Raja.
On 01/08/19 18:14, Simon Fayer wrote:
Hi Raja,
Hmm, it's failing to add the VOMS extension onto the proxy for some reason (full error below). I can't see anything wrong with this other than the slightly dubious -vomses option on the command (it points to /opt/dirac rather than the job work dir).
We'll look at running some test jobs in the next couple of days to try and work out what's going on there.
Regards, Simon
2019-08-01 13:13:32 UTC WorkloadManagement/JobAgent ERROR: Could not retrieve payload proxy Cannot append voms extension: VOMS Error ( 1121 : Failed to set VOMS attributes.
Command: voms-proxy-init2 -cert "/tmp/tmppr6baQ" -key "/tmp/tmppr6baQ" -out "/tmp/tmp0KDf2R" -voms "dune:/dune, /dune/Role=Analysis" -valid "5901:38" -vomses "/opt/dirac/etc/grid-security/vomses" -r -timeout 12;
StdOut: Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=nraja/CN=471708/CN=Raja Nandakumar/CN=4095007602
Creating temporary proxy Done
Contacting voms1.fnal.gov:15042 [/DC=org/DC=incommon/C=US/ST=IL/L=Batavia/O=Fermi Research Alliance/OU=Fermilab/CN=voms1.fnal.gov] "dune" Done
Creating proxy Done
Your proxy is valid until Fri Apr 3 11:51:32 2020 ;
StdErr: ..........................................................................
Warning: voms1.fnal.gov:15042: The validity of this VOMS AC in your proxy is shortened to 86400 seconds!
...................................................Error: verification failed. Cannot verify AC signature! )
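To spot this failure across many job logs, a small script along the following lines might help. It is only a sketch and not part of DIRAC: the function name `scan_log` is hypothetical, and the error strings are copied from the log excerpt above, so other versions may word them differently.

```python
import re

# Error signatures copied from the log excerpt above (assumption: other
# DIRAC/VOMS versions may phrase these messages differently).
SIGNATURES = (
    "Could not retrieve payload proxy",
    "Cannot verify AC signature",
)

# Matches the -vomses "<path>" argument on the voms-proxy-init command line.
VOMSES_RE = re.compile(r'-vomses\s+"([^"]+)"')


def scan_log(text):
    """Return (matched signatures, vomses paths) found in a block of log text."""
    found = [s for s in SIGNATURES if s in text]
    paths = VOMSES_RE.findall(text)
    return found, paths


if __name__ == "__main__":
    sample = (
        "ERROR: Could not retrieve payload proxy Cannot append voms extension: "
        'Command: voms-proxy-init2 -vomses "/opt/dirac/etc/grid-security/vomses" -r '
        "Error: verification failed. Cannot verify AC signature!"
    )
    print(scan_log(sample))
```

Listing the `-vomses` path next to each hit makes it easy to check whether failing jobs all point at `/opt/dirac` rather than the job work dir.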
On Thu, Aug 01, 2019 at 02:55:30PM +0100, Raja Nandakumar wrote:
Hi Simon,
It looks like LCG.SARA-MATRIX.nl also has this problem, though after enough reschedulings the jobs eventually run there.
Regards, Raja.
On 29/07/19 21:30, Simon Fayer wrote:
Hi Raja,
On Mon, Jul 29, 2019 at 10:50:41AM +0100, Raja Nandakumar wrote:
So far in my various tests, it has been BHAM-HEP which has had this issue. Could you let me know which sites you would like me to target and I will try.
It's increasingly sounding like this is a site-specific problem, so it's probably not worth putting too much effort in to test. I'd just keep an eye out for any similar "Unable to retrieve proxy" errors for now (let us know if you spot any).
If you really want to run a test, I'd suggest targeting the sites that have run the largest amounts of DUNE work in the last month, according to the EGI accounting portal: If the normal DUNE glideinWMS jobs are running there, then DIRAC jobs should be successful there too.
Regards, Simon