"Hashes don't match!" error when writing output data
Hi,

I'm one of the developers of Ganga and I've just started looking into using the GridPP DIRAC server as an endpoint for testing the Ganga DIRAC interface. We have some old tests that have not run in a while and used to run against the LHCb server.

It generates a local script at /tmp/tmpITNsT1 which looks like:

#!/bin/bash
echo '1463671701.0 7678619867.41' > sandboxFile.txt
echo '1463671701.0 1646249395.76' > getFile.dst
echo '1463671702.0 9279948101.21' > removeFile.dst

and then uses the DIRAC Python API to do:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

j = Job()
j.setName('InitTestJob')
j.setExecutable('tmpITNsT1', '', 'Ganga_Executable.log')
j.setInputSandbox(['/tmp/tmpITNsT1'])
j.setOutputSandbox(['std.out', 'std.err', 'sandboxFile.txt'])
j.setOutputData(['getFile.dst', 'removeFile.dst'])
j.setBannedSites(['LCG.CERN.ch', 'LCG.CNAF.it', 'LCG.GRIDKA.de', 'LCG.IN2P3.fr',
                  'LCG.NIKHEF.nl', 'LCG.PIC.es', 'LCG.RAL.uk', 'LCG.SARA.nl'])

# submit the job to DIRAC
dirac = Dirac()
result = dirac.submit(j)
output(result)

The job submits and runs fine (for example job ID 491036), but when it comes to writing the output data I get the attached error, which contains the line:

dm.putAndRegister failed with message Failed to put file to Storage Element. Hashes don't match!

Now, I assume that we're simply using the API wrong and that the test is out of date with the current state of things, but if anyone could give me a pointer as to where I should start looking, that would be appreciated. I've tried googling for the error message but only found matches against the source code.

Cheers,
Matt
Hi Matt,

I *think* this is because the output site hasn't been set and therefore the GridPP Dirac instance attempts to upload to the Sandbox SE, which fails. This would work in LHCb because they have default SEs associated with sites in the Dirac configuration (additionally, LHCb has historically only used storage at Tier 1s, so it's very easy to determine where output data should go). I can't remember the Issue/PR number, but this was addressed by Rob's recent DiracFile changes that should allow the SE to be set.

Hope this helps!

Thanks,
Mark
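For reference, a minimal sketch of what pinning the destination explicitly looks like with the plain DIRAC API; the SE name used below is only a placeholder, so substitute one your VO can actually write to:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

j = Job()
# ... executable and sandbox setup as in the original script ...

# Name the destination SE explicitly so the upload does not fall back
# to the Sandbox SE. 'UKI-LT2-IC-HEP-disk' is a placeholder here, not a
# recommendation; use an SE your VO has write access to.
j.setOutputData(['getFile.dst', 'removeFile.dst'],
                outputSE='UKI-LT2-IC-HEP-disk')

result = Dirac().submit(j)
print(result)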
Ok, so is something like:

from DIRAC.ResourceStatusSystem.Utilities.CSHelpers import getStorageElements

all_ses = getStorageElements()['Value']
uk_ses = [se for se in all_ses if se.startswith('UKI')]

j.setOutputData(['getFile.dst', 'removeFile.dst'], outputSE=uk_ses)

a sensible way of getting the UK SEs? Is there a way to prioritise writing to the SE attached to the CE where the job ran (if it has one)?

Cheers,
Matt
Hi Matt et al,

We had this discussion at Imperial. The concept you are working on here (automatically assigning an SE to store the job output) is unique to LHCb. We need our users to define the SE where they want their output stored (e.g. in the way CMS requires their users to specify an output SE), as we, as the administrators, cannot know which users are supposed to write where, while our users usually have a very good idea where they want their data to go. So ideally, if a user has a job that produces output, there should be an error message to indicate that they should specify an output SE. I don't know if internally you can deal with this with an "if not LHCb" flag.

Regards,
Daniela
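As a rough illustration of the kind of guard being described (the submit_requiring_se helper below is purely hypothetical and not part of DIRAC or Ganga), something like this would refuse to submit output-producing jobs that have no SE chosen:

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

def submit_requiring_se(job, output_files, output_se=None):
    """Hypothetical guard: reject jobs that produce output data
    but have no explicitly chosen output SE."""
    if output_files and not output_se:
        raise ValueError("This job produces output data: please specify "
                         "an output SE via output_se=...")
    job.setOutputData(output_files, outputSE=output_se)
    return Dirac().submit(job)

j = Job()
j.setExecutable('tmpITNsT1', '', 'Ganga_Executable.log')
# Raises ValueError because no output SE was given:
submit_requiring_se(j, ['getFile.dst', 'removeFile.dst'])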
Hi Matt,

Err... probably..? I'll defer to someone with more Dirac API knowledge, but it makes sense (though I'm not sure if outputSE takes a list or a single value). As regards linking an SE to a CE, I don't believe there's an easy way of doing that short of going over the Dirac configuration and doing it manually.

As Daniela says though, it's a policy decision that users have to specify the Dirac SE, so I agree with her that we should add a config option to Ganga to warn/raise an exception if no Dirac SE is specified.

Thanks,
Mark
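For what it's worth, a minimal sketch of that manual configuration lookup, assuming the SiteSEMapping utility shipped with current DIRAC releases is available and using 'LCG.UKI-LT2-IC-HEP.uk' purely as an example site name:

from DIRAC.Core.Utilities.SiteSEMapping import getSEsForSite

# Ask the configuration which SEs (if any) are associated with a site.
# An empty list means the site has no local SE, so the output would
# have to go to an explicitly chosen SE elsewhere.
result = getSEsForSite('LCG.UKI-LT2-IC-HEP.uk')
if result['OK']:
    print('SEs attached to this site:', result['Value'])
else:
    print('Lookup failed:', result['Message'])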
Participants (3): Daniela Bauer, Mark Slater, Matt Williams