On Wed, 8 Apr 2020 at 17:09, Henry Nebrensky <torty5737@gmail.com> wrote:

Hi all,

I've got some files on the SE at Imperial which I'd like to register in the File Catalogue. I can do this using dirac-dms-filecatalog-cli with

register file <lfn> <pfn> <size> <SE> [<guid>] - register new file record in the catalog

The DFC has a metadata field for the checksum, and I have the original checksums recorded separately. How can I add the checksum to the file entry? Is it worth the effort (e.g. does the DFC checksum entry actually get used to validate transfers)?

At present I'm using dirac-ui v6r22p6.

Thanks
Henry
On 15/04/2020, Daniela Bauer <daniela.bauer.grid@googlemail.com> wrote:

Hi Henry,

Unfortunately this feature is missing from dirac-dms-filecatalog-cli. I have made a feature request for it:
https://github.com/DIRACGrid/DIRAC/issues/4548

Currently the only way to add the checksum is to use the Python API. You can find an example here, though it might be overkill for one file:
https://github.com/ic-hep/DIRAC-tools/blob/master/solid/registerme.py
(let me know if this script doesn't work any longer).

Regards,
Daniela

--
Sent from my guinea pig living room
-----------------------------------------------------------
daniela.bauer@imperial.ac.uk
HEP Group/Physics Dep
Imperial College London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

--
_______________________________________________
Gridpp-Dirac-Users mailing list
Gridpp-Dirac-Users@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/gridpp-dirac-users
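Before any separately recorded checksum goes into the catalogue, it may be worth recomputing it from a local copy of the file. The following is a minimal sketch using only the Python standard library. It assumes the catalogue's Adler32 value is written as eight lowercase hexadecimal digits, as the example value 90d83f3c later in this thread suggests; the function name adler32_hex is made up for illustration and is not part of DIRAC.

```python
import zlib


def adler32_hex(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits.

    The file is read in chunks so that multi-GB archives do not have to
    fit in memory. The 8-digit lowercase format is an assumption based
    on the values quoted in this thread.
    """
    value = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    # Mask to 32 bits because Python 2 can return a signed integer.
    return "%08x" % (value & 0xFFFFFFFF)


if __name__ == "__main__":
    import sys

    # Print "<checksum>  <path>" for every file given on the command line.
    for name in sys.argv[1:]:
        print("%s  %s" % (adler32_hex(name), name))
```

The result can then be compared by eye, or with a simple string comparison, against the value recorded when the file was originally written.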
On Wed, 15 Apr 2020 at 16:15, Henry Nebrensky <torty5737@gmail.com> wrote:

Hi Daniela,

Thanks for looking into this... given the present circumstances, I already went ahead at the weekend and registered the files (34 of them) with

dirac-dms-filecatalog-cli register file <lfn> <pfn> <size> <SE>

as I felt it more important to get *everything* in the catalogue ASAP and avoid dark data issues.

Now the bad news :(

dirac-dms-lfn-metadata shows a 'Checksum' key, with a null value. So on one file I tried using the "meta set" command to set a value for Checksum, and sure enough "meta get" now reports:

Checksum : 90d83f3c

BUT, this isn't reflected in whatever dirac-dms-lfn-metadata is looking at, which still has:

{'Checksum': '',

I'm obviously not clear on how the metadata is supposed to work: do I now have two separate keys called "Checksum" associated with the same LFN, with differing values? Was that supposed to happen? (History below. No "Checksum" shows up in "meta show".)

Apologies in advance if I've made things horribly worse!

Thanks
Henry

[phsrjjn@mercury008 ~]$ dirac-dms-filecatalog-cli
Starting FileCatalog client
File Catalog Client $Revision: 1.17 $Date:
FC:/> meta set /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz Checksum 90d83f3c
/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz {'Checksum': '90d83f3c'}
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz
Checksum : 90d83f3c
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
No metadata found
FC:/> exit
[phsrjjn@mercury008 ~]$ dirac-dms-lfn-metadata /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
{'Failed': {},
 'Successful': {'/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz': {'Checksum': '',
                                                                                     'ChecksumType': 'Adler32',
                                                                                     'CreationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
                                                                                     'FileID': 21843035L,
                                                                                     'GID': 21,
                                                                                     'GUID': '18cb95b3-7845-45f9-b3d3-97249d10e399',
                                                                                     'Mode': 509,
                                                                                     'ModificationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
                                                                                     'Owner': 'henry.nebrensky1',
                                                                                     'OwnerGroup': 'mice_user',
                                                                                     'Size': 854778028L,
                                                                                     'Status': 'AprioriGood',
                                                                                     'UID': 204},
                '/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz': {'Checksum': '',
                                                                                     'ChecksumType': 'Adler32',
                                                                                     'CreationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
                                                                                     'FileID': 21843415L,
                                                                                     'GID': 21,
                                                                                     'GUID': '901f0c57-d80f-4b1e-acc3-c3dbd920c2d6',
                                                                                     'Mode': 509,
                                                                                     'ModificationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
                                                                                     'Owner': 'henry.nebrensky1',
                                                                                     'OwnerGroup': 'mice_user',
                                                                                     'Size': 511181742L,
                                                                                     'Status': 'AprioriGood',
                                                                                     'UID': 204}}}
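With 34 files registered this way, it can help to pick out mechanically which catalogue entries still have an empty 'Checksum'. The sketch below works on a plain dictionary shaped like the dirac-dms-lfn-metadata output shown in this thread ('Failed' and 'Successful' keys, one metadata dictionary per LFN). How that dictionary is obtained from DIRAC is deliberately left out, and the helper name lfns_missing_checksum is hypothetical.

```python
def lfns_missing_checksum(result):
    """Return the sorted LFNs whose file metadata has no checksum.

    `result` is assumed to look like the dirac-dms-lfn-metadata output:
    {'Failed': {...}, 'Successful': {lfn: {'Checksum': '...', ...}}}.
    An entry counts as missing when 'Checksum' is absent or empty.
    """
    successful = result.get("Successful", {})
    return sorted(
        lfn for lfn, meta in successful.items() if not meta.get("Checksum")
    )


if __name__ == "__main__":
    # Trimmed copy of the output above; only the relevant keys are kept.
    sample = {
        "Failed": {},
        "Successful": {
            "/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz": {
                "Checksum": "",
                "ChecksumType": "Adler32",
            },
            "/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz": {
                "Checksum": "",
                "ChecksumType": "Adler32",
            },
        },
    }
    for lfn in lfns_missing_checksum(sample):
        print(lfn)
```

Note that this only looks at the file metadata returned by the catalogue; a value stored with "meta set" lives elsewhere and would not show up here, which matches what the transcript shows.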
On 15/04/2020, Dr Janusz Martyniak <janusz.martyniak@imperial.ac.uk> wrote:

Hi Henry,

"meta set" sets user metadata, so you can mark your favourite analysis sample. It does not magically set file metadata. The 'Checksum' key is your 'private' key you used to mark your sample and ... any similarities to facts or characters are purely accidental ;-) .

There must be a way to use APIs to set just file metadata programmatically, setFileMetadata or similar on File catalog. I might look into this later this week.

Best,
JM
On Thu, 16 Apr 2020 at 16:41, Henry Nebrensky <torty5737@gmail.com> wrote:

Hi Janusz,

> There must be a way to use APIs to set just file metadata programmatically, setFileMetadata or similar on File catalog. I might look into this later this week.

It would be great if you could have a look as I think this would be the most efficient way of tidying this up.

My other thought was to try replicating the files from IC to the RAL PPD SE and hope Dirac notices the checksum, but when I tried that on a different file some days ago it insisted on using local disk as a buffer rather than doing a proper 3rd-party copy :( .

Thanks
Henry
Hi Henry,

I spoke to the DIRAC developers at length and apparently it's a design choice: only admins are meant to update individual fields in the catalogue. We might still be able to (eventually) write something that adds a checksum that it finds on the SE to the catalogue if none is present.

The lack of a 3rd-party copy might also be a feature, as there are some SEs (mainly ECHO) that have a really hard time talking to each other.

In the short term, once you have a list of files, can you email them to Simon F and me, please (as we are the only ones with admin access to the catalogue), and we should be able to add your checksums.

We have basically no experience with tape handling in DIRAC (though technically code exists), but as MICE is officially winding down, this might not be a good time to start.

Regards,
Daniela
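A list of files and checksums for the catalogue admins is easier to act on if every entry has been sanity-checked first. The sketch below turns a mapping of LFN to Adler32 value into one "LFN checksum" line per file and rejects anything that is not eight hexadecimal digits. The line format and the function name format_checksum_list are assumptions made for illustration; nothing in the thread specifies what format the admins would prefer.

```python
import re

# Eight hex digits, matching values such as 90d83f3c quoted in this thread.
ADLER32_RE = re.compile(r"^[0-9a-f]{8}$")


def format_checksum_list(checksums):
    """Return 'LFN checksum' lines, sorted by LFN, one per line.

    `checksums` maps each LFN to its Adler-32 value as a hex string.
    Values are stripped and lowercased before validation; a value that
    is not exactly eight hex digits raises ValueError.
    """
    lines = []
    for lfn in sorted(checksums):
        value = checksums[lfn].strip().lower()
        if not ADLER32_RE.match(value):
            raise ValueError(
                "not an 8-digit hex Adler-32 value for %s: %r" % (lfn, value)
            )
        lines.append("%s %s" % (lfn, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    recorded = {
        "/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz": "90d83f3c",
    }
    print(format_checksum_list(recorded))
```

Failing loudly on a malformed value is a deliberate choice here: a truncated or mistyped checksum entered by hand into the catalogue would be harder to spot later than a rejected line now.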
On 15/04/2020, Daniela Bauer <daniela.bauer.grid@googlemail.com> wrote:

Hi Henry,

Would it be terribly difficult to download the files to your machine, delete them from the SE and re-upload them using dirac-dms-add-file? That should take care of the checksum.

What we really need is a tool that gets the checksum from the SE and then adds it to the catalogue; that would be much safer than doing it by hand.

Regards,
Daniela
Hi,

> would it be terribly difficult, to download the files to your machine, delete them from the SE and re-upload them using dirac-dms-add-file ?

It would certainly be awkward - one of them is 26GB on its own so I would get into quota issues on the RAL UI. I can look into it. (It goes without saying it's a non-starter on domestic wifi...)

> That should take care of the checksum.

That was my other question: the storage already knows the checksum, so is there actually any value in adding it to the DFC?

NB the back-story is that these files are already on Castor tape, but those replicas cannot be added to the DFC (I'm not re-opening that discussion!). Naively, I'd been expecting that dirac-dms-add-file would have similar functionality to lcg-cr, but it doesn't (admit to) allowing a SURL for the source, only a local PFN. Hence I have to first transfer the files from Castor, for Dirac to be able to find them at all :( .

Would having dirac-dms-add-file be able to copy files already on the Grid be a more constructive feature request?

Thanks
Henry

On 15/04/2020, Daniela Bauer <daniela.bauer.grid@googlemail.com> wrote:
Hi Henry,
would it be terribly difficult, to download the files to your machine, delete them from the SE and re-upload them using dirac-dms-add-file ? That should take care of the checksum. What we really need is a tool that gets the checksum from the SE and then adds it to the catalogue, that would be much safer than doing it by hand.
Regards, Daniela
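For the "safer than doing it by hand" point, one small step that needs no DIRAC tooling is to check a local copy of a file against the separately recorded checksum by machine rather than by eye. The following is a minimal Python sketch under stated assumptions: it uses only the standard library, the checksum type is Adler32 (the ChecksumType the catalogue reports for these files), the 8-hex-digit lowercase format is assumed from the 90d83f3c value quoted in this thread, and adler32_hex and checksums_match are hypothetical helper names, not DIRAC functions.

```python
import zlib


def adler32_hex(path, chunk_size=1024 * 1024):
    """Return the Adler-32 of the file at `path` as 8 lowercase hex digits.

    The file is read in chunks so that large files (tens of GB) do not
    have to fit in memory.
    """
    value = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def checksums_match(path, recorded):
    """Compare a file's Adler-32 with a recorded hex string.

    The recorded value is normalised (whitespace stripped, lowercased,
    zero-padded to 8 digits) before comparison. This normalisation is an
    assumption about how the checksums were written down.
    """
    return adler32_hex(path) == recorded.strip().lower().zfill(8)
```

This only confirms that a local copy agrees with the recorded value; it does not query the SE and does not write anything to the catalogue.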
On Wed, 15 Apr 2020 at 16:15, Henry Nebrensky <torty5737@gmail.com> wrote:
Hi Daniela,
Thanks for looking into this... given the present circumstances, I already went ahead at the weekend and registered the files (34 of them) with dirac-dms-filecatalog-cli register file <lfn> <pfn> <size> <SE> as I felt it more important to get *everything* in the catalogue ASAP and avoid dark data issues.
Now the bad news :( dirac-dms-lfn-metadata shows a 'Checksum' key, with a null value. So on one file I tried using the "meta set" command to set a value for Checksum, and sure enough "meta get" now reports:
Checksum : 90d83f3c
BUT, this isn't reflected in whatever dirac-dms-lfn-metadata is looking at, which still has:
{'Checksum': '',
I'm obviously not clear on how the metadata is supposed to work: do I now have two separate keys called "Checksum" associated with the same LFN, with differing values? Was that supposed to happen? (History below. No "Checksum" shows up in "meta show".)
Apologies in advance if I've made things horribly worse!
Thanks
Henry
[phsrjjn@mercury008 ~]$ dirac-dms-filecatalog-cli
Starting FileCatalog client
File Catalog Client $Revision: 1.17 $Date:
FC:/> meta set /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz Checksum 90d83f3c
/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz {'Checksum': '90d83f3c'}
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz
Checksum : 90d83f3c
FC:/> meta get /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
No metadata found
FC:/> exit
[phsrjjn@mercury008 ~]$ dirac-dms-lfn-metadata /mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz /mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz
{'Failed': {}, 'Successful': {'/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz': {'Checksum': '',
'ChecksumType': 'Adler32',
'CreationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
'FileID': 21843035L,
'GID': 21,
'GUID': '18cb95b3-7845-45f9-b3d3-97249d10e399',
'Mode': 509,
'ModificationDate': datetime.datetime(2020, 4, 7, 22, 23, 36),
'Owner': 'henry.nebrensky1',
'OwnerGroup': 'mice_user',
'Size': 854778028L,
'Status': 'AprioriGood',
'UID': 204},
'/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz': {'Checksum': '',
'ChecksumType': 'Adler32',
'CreationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
'FileID': 21843415L,
'GID': 21,
'GUID': '901f0c57-d80f-4b1e-acc3-c3dbd920c2d6',
'Mode': 509,
'ModificationDate': datetime.datetime(2020, 4, 7, 22, 37, 57),
'Owner': 'henry.nebrensky1',
'OwnerGroup': 'mice_user',
'Size': 511181742L,
'Status': 'AprioriGood',
'UID': 204}}}
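With 34 files registered this way, it may help to list which entries still have an empty 'Checksum' field instead of reading the output by eye. The sketch below is plain Python and assumes only the dictionary shape shown in the output above ('Successful' mapping each LFN to its metadata); how that dictionary is obtained from DIRAC is not shown here, and lfns_missing_checksum is a hypothetical helper name.

```python
def lfns_missing_checksum(result):
    """Return the sorted LFNs whose 'Checksum' entry is empty or absent.

    `result` is expected to look like the output above:
    {'Failed': {...}, 'Successful': {lfn: {'Checksum': '...', ...}}}
    """
    successful = result.get("Successful", {})
    return sorted(
        lfn for lfn, meta in successful.items() if not meta.get("Checksum")
    )


# Trimmed copy of the structure shown in the thread (most keys omitted).
example = {
    "Failed": {},
    "Successful": {
        "/mice/Construction/EPICSarchive/data_miceecserv1_20180819.tar.gz": {
            "Checksum": "",
            "ChecksumType": "Adler32",
            "Size": 854778028,
        },
        "/mice/Construction/EPICSarchive/data_miceecserv2_20180819.tar.gz": {
            "Checksum": "",
            "ChecksumType": "Adler32",
            "Size": 511181742,
        },
    },
}

for lfn in lfns_missing_checksum(example):
    print(lfn)
```

For the two files above, both LFNs are listed, since the value set with "meta set" does not appear in this 'Checksum' field.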
participants (3)
- Daniela Bauer
- Dr Janusz Martyniak
- Henry Nebrensky