"Hashes don't match!" error when writing output data
Hi,

I'm one of the developers of Ganga and I've just started looking into using the GridPP DIRAC server as an endpoint for testing the Ganga DIRAC interface. We have some old tests that have not run in a while and used to run against the LHCb server.

It generates a local script at /tmp/tmpITNsT1 which looks like:

#!/bin/bash
echo '1463671701.0 7678619867.41' > sandboxFile.txt
echo '1463671701.0 1646249395.76' > getFile.dst
echo '1463671702.0 9279948101.21' > removeFile.dst

and then uses the DIRAC Python API to do:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

j = Job()
j.setName('InitTestJob')
j.setExecutable('tmpITNsT1', '', 'Ganga_Executable.log')
j.setInputSandbox(['/tmp/tmpITNsT1'])
j.setOutputSandbox(['std.out', 'std.err', 'sandboxFile.txt'])
j.setOutputData(['getFile.dst', 'removeFile.dst'])
j.setBannedSites(['LCG.CERN.ch', 'LCG.CNAF.it', 'LCG.GRIDKA.de', 'LCG.IN2P3.fr',
                  'LCG.NIKHEF.nl', 'LCG.PIC.es', 'LCG.RAL.uk', 'LCG.SARA.nl'])

# submit the job to DIRAC
dirac = Dirac()
result = dirac.submit(j)
output(result)

The job submits and runs fine (for example job ID 491036), but when it comes to writing the output data I get the attached error, which contains the line:

dm.putAndRegister failed with message Failed to put file to Storage Element. Hashes don't match!

Now, I assume that we're simply using the API wrong and that the test is out of date with the current state of things, but if anyone could give me a pointer as to where I should start looking, that would be appreciated. I've tried googling for the error message but only found matches against the source code.

Cheers,
Matt
Hi Matt,

I *think* this is because the output site hasn't been set and therefore the GridPP Dirac instance attempts to upload to the Sandbox SE, which fails. This would work in LHCb because they have default SEs associated with sites in the Dirac configuration (additionally, LHCb has historically only used storage at Tier 1s, so it's very easy to determine where output data should go). I can't remember the Issue/PR number, but this was addressed by Rob's recent DiracFile changes that should allow the SE to be set.

Hope this helps!

Thanks,
Mark
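For reference, a minimal sketch of what pinning the destination explicitly looks like with the plain DIRAC API; the SE name used below is only a placeholder, so substitute one your VO can actually write to:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

j = Job()
# ... executable and sandbox setup as in the original script ...

# Name the destination SE explicitly so the upload does not fall back
# to the Sandbox SE. 'UKI-LT2-IC-HEP-disk' is a placeholder here, not a
# recommendation; use an SE your VO has write access to.
j.setOutputData(['getFile.dst', 'removeFile.dst'],
                outputSE='UKI-LT2-IC-HEP-disk')

result = Dirac().submit(j)
print(result)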
Ok, so is something like:

from DIRAC.ResourceStatusSystem.Utilities.CSHelpers import getStorageElements

all_ses = getStorageElements()['Value']
uk_ses = [se for se in all_ses if se.startswith('UKI')]

j.setOutputData(['getFile.dst', 'removeFile.dst'], outputSE=uk_ses)

a sensible way of getting the UK SEs? Is there a way to prioritise writing to the SE attached to the CE where the job ran (if it has one)?

Cheers,
Matt
Hi Matt et al,

We had this discussion at Imperial. The concept you are working on here (automatically assigning an SE to store the job output) is unique to LHCb. We need our users to define the SE where they want their output stored (e.g. in the way CMS requires their users to specify an output SE), as we, as the administrators, cannot know which users are supposed to write where, while our users usually have a very good idea where they want their data to go. So ideally, if a user has a job that produces output, there should be an error message to indicate that they should specify an output SE. I don't know if internally you can deal with this with an "if not LHCb" flag.

Regards,
Daniela
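As a rough illustration of the kind of guard being described (the submit_requiring_se helper below is purely hypothetical and not part of DIRAC or Ganga), something like this would refuse to submit output-producing jobs that have no SE chosen:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

def submit_requiring_se(job, output_files, output_se=None):
    """Hypothetical guard: reject jobs that produce output data
    but have no explicitly chosen output SE."""
    if output_files and not output_se:
        raise ValueError("This job produces output data: please specify "
                         "an output SE via output_se=...")
    job.setOutputData(output_files, outputSE=output_se)
    return Dirac().submit(job)

j = Job()
j.setExecutable('tmpITNsT1', '', 'Ganga_Executable.log')
# Raises ValueError because no output SE was given:
submit_requiring_se(j, ['getFile.dst', 'removeFile.dst'])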
Hi Matt,

Err... probably..? I'll defer to someone with more Dirac API knowledge, but it makes sense (though I'm not sure if outputSE takes a list or a single value). As regards linking an SE to a CE, I don't believe there's an easy way of doing that short of going over the Dirac configuration and doing it manually.

As Daniela says though, it's a policy decision that users have to specify the Dirac SE, so I agree with her that we should add a config option to Ganga to warn/raise an exception if no Dirac SE is specified.

Thanks,
Mark
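For what it's worth, a minimal sketch of that manual configuration lookup, assuming the SiteSEMapping utility shipped with current DIRAC releases is available and using 'LCG.UKI-LT2-IC-HEP.uk' purely as an example site name:

from DIRAC.Core.Utilities.SiteSEMapping import getSEsForSite

# Ask the configuration which SEs (if any) are associated with a site.
# An empty list means the site has no local SE, so the output would
# have to go to an explicitly chosen SE elsewhere.
result = getSEsForSite('LCG.UKI-LT2-IC-HEP.uk')
if result['OK']:
    print('SEs attached to this site:', result['Value'])
else:
    print('Lookup failed:', result['Message'])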
Participants (3): Daniela Bauer, Mark Slater, Matt Williams