Hi,
would it be terribly difficult to download the files to your machine, delete them from the SE and re-upload them using dirac-dms-add-file?
It would certainly be awkward: one of them is 26 GB on its own, so I would run into quota issues on the RAL UI. I can look into it. (It goes without saying it's a non-starter on domestic wifi...)
That should take care of the checksum.
That was my other question: the storage already knows the checksum, so is there actually any value in adding it to the DFC?
NB the back-story is that these files are already on Castor tape, but those replicas cannot be added to the DFC (I'm not re-opening that discussion!). Naively, I'd been expecting that dirac-dms-add-file would have similar functionality to lcg-cr, but it doesn't allow (or at least admit to allowing) a SURL for the source, only a local PFN. Hence I have to first transfer the files from Castor, for Dirac to be able to find them at all :( .
Would having dirac-dms-add-file be able to copy files already on the Grid be a more constructive feature request?
Thanks
Henry
On 15/04/2020, Daniela Bauer <daniela.bauer.grid@googlemail.com> wrote:
Hi Henry,
would it be terribly difficult to download the files to your machine, delete them from the SE and re-upload them using dirac-dms-add-file? That should take care of the checksum.
What we really need is a tool that gets the checksum from the SE and then adds it to the catalogue; that would be much safer than doing it by hand.
Regards, Daniela
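For anyone checking values by hand before or after a re-upload, here is a minimal sketch of computing an Adler32 checksum locally. It reads the file in chunks, so a file of tens of GB does not need to fit in memory. The 8-digit lowercase hex output is an assumption based on the 90d83f3c value and the 'Adler32' ChecksumType quoted in this thread; the function name and chunk size are illustrative only.

```python
import zlib


def adler32_hex(path, chunk_size=1024 * 1024):
    """Return the Adler32 checksum of a file as 8 lowercase hex digits."""
    value = 1  # Adler32 is defined to start from 1, not 0
    with open(path, "rb") as handle:
        # Feed the running checksum one chunk at a time
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    # Mask to 32 bits and zero-pad, so short values keep their leading zeros
    return "%08x" % (value & 0xFFFFFFFF)
```

Usage would be along the lines of adler32_hex("data_miceecserv1_20180819.tar.gz"), with the result compared against the checksum recorded separately or reported by the storage.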
On Wed, 15 Apr 2020 at 16:15, Henry Nebrensky <torty5737@gmail.com> wrote:
Hi Daniela,
Thanks for looking into this... given the present circumstances, I already went ahead at the weekend and registered the files (34 of them) using dirac-dms-filecatalog-cli with
register file <lfn> <pfn> <size> <SE>
as I felt it more important to get *everything* in the catalogue ASAP and avoid dark data issues.
Now the bad news :( dirac-dms-lfn-metadata shows a 'Checksum' key, with a null value. So on one file I tried using the "meta set" command to set a value for Checksum, and sure enough "meta get" now reports:
Checksum : 90d83f3c
BUT this isn't reflected in whatever dirac-dms-lfn-metadata is looking at, which still has:
{'Checksum': '',
I'm obviously not clear on how the metadata is supposed to work: do I now have two separate keys called "Checksum" associated with the same LFN, with differing values? Was that supposed to happen? (History below. No "Checksum" shows up in "meta show".)
Apologies in advance if I've made things horribly worse!
Thanks
Henry
[phsrjjn@mercury008 ~]$ dirac-dms-filecatalog-cli
Starting FileCatalog client
File Catalog Client $Revision: 1.17 $Date:
FC:/> meta set /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz Checksum 90d83f3c
/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz {'Checksum': '90d83f3c'}
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz
Checksum : 90d83f3c
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
No metadata found
FC:/> exit
[phsrjjn@mercury008 ~]$ dirac-dms-lfn-metadata /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
{'Failed': {},
'Successful': {'/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz': {'Checksum': '',
'ChecksumType': 'Adler32',
'CreationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
'FileID': 21843035L,
'GID': 21,
'GUID': '18cb95b3-7845-45f9-b3d3-97249d10e399',
'Mode': 509,
'ModificationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
'Owner': 'henry.nebrensky1',
'OwnerGroup': 'mice_user',
'Size': 854778028L,
'Status': 'AprioriGood',
'UID': 204},
'/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz': {'Checksum': '',
'ChecksumType': 'Adler32',
'CreationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
'FileID': 21843415L,
'GID': 21,
'GUID': '901f0c57-d80f-4b1e-acc3-c3dbd920c2d6',
'Mode': 509,
'ModificationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
'Owner': 'henry.nebrensky1',
'OwnerGroup': 'mice_user',
'Size': 511181742L,
'Status': 'AprioriGood',
'UID': 204}}}
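With 34 files registered this way, it may help to list which entries still have an empty checksum. The sketch below assumes a result dictionary of the shape printed above is already available in Python; how it is obtained is not shown in this thread, and the function name is purely illustrative.

```python
def lfns_missing_checksum(result):
    """Return the sorted LFNs whose 'Checksum' entry is empty or absent."""
    successful = result.get("Successful", {})
    return sorted(
        lfn for lfn, meta in successful.items() if not meta.get("Checksum")
    )


# Trimmed copy of the structure shown in the output above
example = {
    "Failed": {},
    "Successful": {
        "/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz": {
            "Checksum": "",
            "ChecksumType": "Adler32",
            "Size": 854778028,
        },
        "/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz": {
            "Checksum": "",
            "ChecksumType": "Adler32",
            "Size": 511181742,
        },
    },
}

for lfn in lfns_missing_checksum(example):
    print(lfn)
```

For the two files above, both LFNs are reported, since the value set through "meta set" does not appear in this structure.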
On 15/04/2020, Daniela Bauer <daniela.bauer.grid@googlemail.com> wrote:
Hi Henry,
unfortunately this feature is missing from dirac-dms-filecatalog-cli. I have made a feature request for this: https://github.com/DIRACGrid/DIRAC/issues/4548
Currently the only way to add the checksum is to use the Python API. You can find an example here, though it might be overkill for one file: https://github.com/ic-hep/DIRAC-tools/blob/master/solid/registerme.py (let me know if this script no longer works).
Regards, Daniela
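As a rough illustration of the Python API route, the sketch below builds a catalogue entry that carries the checksum and passes it to the file catalogue client. It is not a copy of the linked script. The import paths, the addFile call and the entry keys reflect my understanding of the DIRAC client and should be treated as assumptions to verify against the installed version; the helper names are hypothetical, and the PFN and SE name in the usage note are placeholders.

```python
def build_entry(pfn, size, se, guid, checksum):
    """Build the per-file dictionary for a catalogue registration."""
    return {
        "PFN": pfn,
        "Size": int(size),
        "SE": se,
        "GUID": guid,
        "Checksum": checksum,  # Adler32 as hex, e.g. '90d83f3c'
    }


def register_with_checksum(lfn, entry):
    """Register one file, including its checksum, in the file catalogue.

    Assumption: runs inside a DIRAC client environment with a valid proxy.
    The imports are kept local so the helper above works without DIRAC.
    """
    from DIRAC.Core.Base import Script  # assumed import path for this release
    Script.parseCommandLine()
    from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

    # Assumed call: addFile takes {lfn: entry} and returns a result dict
    return FileCatalog().addFile({lfn: entry})
```

A call might look like register_with_checksum("/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz", build_entry("<pfn>", 854778028, "<SE>", "18cb95b3-7845-45f9-b3d3-97249d10e399", "90d83f3c")), with the real PFN and SE name substituted. Whether this works for an LFN that is already registered is not established in this thread.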
On Wed, 8 Apr 2020 at 17:09, Henry Nebrensky <torty5737@gmail.com> wrote:
Hi all,
I've got some files on the SE at Imperial which I'd like to register in the File Catalogue: I can do this using dirac-dms-filecatalog-cli with
register file <lfn> <pfn> <size> <SE> [<guid>] - register new file record in the catalog
The DFC has a metadata field for the checksum, and I have the original checksums recorded separately. How can I add the checksum to the file entry? Is it worth the effort (e.g. does the DFC checksum entry actually get used to validate transfers?)?
At present I'm using dirac-ui v6r22p6.
Thanks
Henry
-- _______________________________________________ Gridpp-Dirac-Users mailing list Gridpp-Dirac-Users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/gridpp-dirac-users
-- Sent from my guinea pig living room
----------------------------------------------------------- daniela.bauer@imperial.ac.uk HEP Group/Physics Dep Imperial College London, SW7 2BW Tel: +44-(0)20-75947810 http://www.hep.ph.ic.ac.uk/~dbauer/