Hi; Is there a way to tell ganga/dirac that your jobs are multi-core, so that sites can direct them to appropriate queues, or do you just take pot-luck? Thanks, ivan -- Ivan Reid (ivan.reid@[brunel.ac.uk|cern.ch]) Engineering, Design & Physical Sciences CMS Collaboration, Brunel University London. Room TOWD405 CERN, Room 40-1-B12
Hi Ivan, As regards Ganga, there isn't a specific option for multicore but there is a 'diracOpts' field in the Dirac backend which you could use. However, I don't know what dirac option you'd need to specify MC so I can't help any more than that I'm afraid! Thanks, Mark
On Thu, 6 Jul 2017, Mark Slater wrote:
> As regards Ganga, there isn't a specific option for multicore but there is a 'diracOpts' field in the Dirac backend which you could use. However, I don't know what dirac option you'd need to specify MC so I can't help any more than that I'm afraid!
Cheers, Mark, at least it's a keyword to gwgl on!
ivan
Hi Ivan, we haven't tested multi-core job submission in our dirac instance yet, but if you have a use case (and a site that will support it!) we can look into it. Regards, Daniela
-- _______________________________________________ Gridpp-Dirac-Users mailing list Gridpp-Dirac-Users@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/gridpp-dirac-users
-- Sent from the pit of despair ----------------------------------------------------------- daniela.bauer@imperial.ac.uk HEP Group/Physics Dep Imperial College London, SW7 2BW Tel: +44-(0)20-75947810 http://www.hep.ph.ic.ac.uk/~dbauer/
On Thu, 6 Jul 2017, Daniela Bauer wrote:
> we haven't tested multi-core job submission in our dirac instance yet, but if you have a use case (and a site that will support it!) we can look into it.
OK, Daniela. I've been testing the application locally, and could have a job script ready later today. It should run at Brunel, as that's where the data are. I've attached the plot of speedup vs cores that I've just obtained -- the single-core time is ~125 minutes. You can see that there's definitely no real advantage in going beyond eight cores. A sysadmin might prefer four...
Cheers, ivan
But have you got a queue and for which VO, gridpp ? Cheers, Daniela
On Thu, 6 Jul 2017, Daniela Bauer wrote:
> But have you got a queue and for which VO, gridpp ?
For gridpp, yes, but how do I specify a queue?
Thanks, ivan
-- Ivan Reid (ivan.reid@[brunel.ac.uk|cern.ch]) Engineering, Design & Physical Sciences CMS Collaboration, Brunel University London. Room TOWD405 CERN, Room 40-1-B12 RG250WD "You Porsche. Me pass!" DoD #484 JKLO#003, 005 WP7# 3000 LC Unit #2368 (tinlc) UKMC#00009 BOTAFOT#16 UKRMMA#7 (Hon) KotPT -- "for stupidity above and beyond the call of duty".
You can't -- this is the bit we need to work out within dirac, so that it reads the JDL correctly and then matches the queue. Which queue is it ? Regards, Daniela
Hi Daniela,

I have not discussed this with Ivan yet, but I believe he is considering any of the Brunel ArcCEs. I could get it ready quickly. Maybe defining one as the target (dc2-grid-25) would make it easier for debugging.

Important points:
- the gridpp VO seems to be a grab bag, with many communities using it. They would all be entitled to 8-core jobs...
- I'll inform Peter that I have opened 8 cores to all gridpp. He will not oppose if we can deal with the above.

Thanks, Raul
________________________________
From: Daniela Bauer <daniela.bauer.grid@googlemail.com>
Sent: 06 July 2017 11:32:14
To: Ivan Reid
Cc: Raul Lopes; gridpp-dirac-users@imperial.ac.uk
Subject: Re: [Gridpp-Dirac-Users] Multi-core jobs in ganga/dirac

You can't -- this is the bit we need to work out within dirac, so that it reads the JDL correctly and then matches the queue. Which queue is it ? Regards, Daniela
On Thu, 6 Jul 2017, Raul Lopes wrote:
> I have not discussed this with Ivan yet, but I believe he is considering any of the Brunel ArcCEs. I could get it ready quickly. Maybe defining one as the target (dc2-grid-25) would make it easier for debugging.
We might have some memory requirements, too. I've been debugging with the single-thread option and finally got enough bugs out that the programme actually ran for a while. However, it then gave this message and stopped (running on CLOUD.UKI-GridPP-Cloud-IC.uk):
"The reads contain too many k-mers to fit into available memory. You need approx. 2.07591GB of free RAM to assemble your dataset"
Now I was checking memory usage when I was running my test jobs and pion never got below 126 GB free (normally >127 free), so I suspect that was on the limit. I've dropped down to only processing one k-mer instead of four and it's been running at LCG.UKI-LT2-QMUL.uk for 20 minutes now -- previously it died after 4 mins.
> Important points:
> - the gridpp VO seems to be a grab bag, with many communities using it. They would all be entitled to 8-core jobs...
Arshad agrees with me that 4 cores would actually be the best option. 8 is quicker but uses disproportionately more resources.
> - I'll inform Peter that I have opened 8 cores to all gridpp. He will not oppose if we can deal with the above.
I had a chat to Peter when he dropped by, and in principle he's happy with whatever you suggest.
Cheers, ivan
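The point that 8 cores is quicker but costs disproportionately more can be illustrated with a rough Amdahl's-law estimate. This is only a sketch under stated assumptions: the parallel fraction of 0.85 is a made-up placeholder, not a value read off the attached plot, and only the ~125 minute single-core time comes from the thread. The function names are hypothetical.

```python
# Rough Amdahl's-law estimate of wall time and total core-minutes.
# ASSUMPTION: PARALLEL_FRACTION is an illustrative placeholder, not a
# measurement. Only the ~125 min single-core time is quoted in the thread.

SINGLE_CORE_MINUTES = 125.0
PARALLEL_FRACTION = 0.85  # hypothetical value, replace with a fitted one


def wall_minutes(cores, t1=SINGLE_CORE_MINUTES, p=PARALLEL_FRACTION):
    """Predicted wall-clock minutes on `cores` cores under Amdahl's law."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return t1 * ((1.0 - p) + p / cores)


def core_minutes(cores, t1=SINGLE_CORE_MINUTES, p=PARALLEL_FRACTION):
    """Total resource cost: cores occupied multiplied by wall-clock minutes."""
    return cores * wall_minutes(cores, t1, p)


def summarise(core_counts=(1, 2, 4, 8, 16)):
    """Return (cores, wall minutes, speedup, core-minutes) for each count."""
    rows = []
    for n in core_counts:
        wall = wall_minutes(n)
        rows.append((n, wall, SINGLE_CORE_MINUTES / wall, core_minutes(n)))
    return rows


if __name__ == "__main__":
    for n, wall, speedup, cost in summarise():
        print(f"{n:2d} cores: {wall:6.1f} min wall, "
              f"speedup {speedup:4.2f}, {cost:7.1f} core-min")
```

Under this model the wall time keeps falling as cores are added, while the core-minutes charged to the site keep rising, which is the trade-off behind preferring 4 cores over 8.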
Hi Ivan, For my curiosity, how do you actually submit a job? Do you use Ganga or the DIRAC api directly? Regards, Raja.
On Thu, 6 Jul 2017, Raja Nandakumar wrote:
> For my curiosity, how do you actually submit a job? Do you use Ganga or the DIRAC api directly?
Currently I'm using Ganga, tho' Simon tells me he finds it easier to use direct commands.

What I'm trying to do (and have a small amount of money towards my salary for) is set up a pilot project for "Big Data" for a bioinformatics group here at Brunel. Trying to think back to what I've heard about such things from GRIDPP meetings, I dredged up the keywords "ganga" and "dirac" so that's what I'm going for at present. The aim is to make it as simple as possible for non-Grid scientists (who seem to know python).

As a (totally non-debugged as yet!) example, here's my first pass that I came up with for this multi-core submission just before lunch:

cat submit_spades.py

    # Run inside a Ganga session, where Job, Dirac, Executable, File,
    # LocalFile and DiracFile are already defined.
    import os

    exefile = 'spades.sh'
    os.system('chmod +x %s' % exefile)  # make the wrapper script executable

    j = Job()
    j.backend = Dirac()
    j.application = Executable()
    j.application.exe = File(exefile)
    # The SPAdes binaries travel with the job; the reads are fetched from grid storage.
    j.inputfiles = [
        LocalFile('bin.tgz'),
        DiracFile('LFN:gridpp/user/i/ivan.reid/embl/SRR2099924_1.fastq.gz'),
        DiracFile('LFN:gridpp/user/i/ivan.reid/embl/SRR2099924_2.fastq.gz')
    ]
    # Compressed assembly results to bring back.
    j.outputfiles = [
        LocalFile('contigs.fasta.gz'),
        LocalFile('scaffolds.fasta.gz')
    ]
    j.submit()

cat spades.sh

    #!/bin/sh
    # Unpack SPAdes, assemble the paired-end reads on 8 threads, compress the results.
    tar -xvzf bin.tgz
    time bin/spades.py -k 21,33,55,77 --careful --pe1-1 SRR2099924_1.fastq.gz --pe1-2 SRR2099924_2.fastq.gz -o spades_output -t 8
    gzip spades_output/contigs.fasta
    gzip spades_output/scaffolds.fasta
    mv spades_output/*.gz .

Of course, if anyone spots some obvious bug/miscomprehension in that, it could save me some debug time...

Cheers, ivan
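Since the thread discusses trying different thread counts (4 vs 8) and fewer k-mer sizes, a small helper that builds the SPAdes command line from parameters might save editing spades.sh by hand. This is a sketch only: the function name `build_spades_command`, its parameter names, and the default of 4 threads are made up for illustration, and the flags are just those already used in spades.sh above.

```python
import shlex


def build_spades_command(reads_1, reads_2, threads=4,
                         kmers=(21, 33, 55, 77),
                         out_dir="spades_output",
                         spades="bin/spades.py"):
    """Return the SPAdes invocation as an argument list (hypothetical helper).

    Only flags that already appear in spades.sh are emitted.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not kmers:
        raise ValueError("at least one k-mer size is required")
    return [
        spades,
        "-k", ",".join(str(k) for k in kmers),  # comma-separated k-mer sizes
        "--careful",
        "--pe1-1", reads_1,                     # first paired-end read file
        "--pe1-2", reads_2,                     # second paired-end read file
        "-o", out_dir,
        "-t", str(threads),                     # number of threads to use
    ]


if __name__ == "__main__":
    cmd = build_spades_command("SRR2099924_1.fastq.gz",
                               "SRR2099924_2.fastq.gz",
                               threads=8)
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting problems.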
Hi Ivan,

Okay - this should in principle be straightforward from what I know. A few comments below which may help clarify the situation - most of which you may know already. My apologies if you already know all of this - in that case, feel free to ignore this email.

1. Ganga and Dirac do not actually care about whether your application is single or multi-core. All they do is to send your job to a particular queue in a particular site. On a personal level, I like both of them and would use Ganga if possible as it makes it easier for a new user to get started quickly.

2. On the grid, a given queue at a site should have all the machines giving the same "type" of resources (queue length, OS, ...). So, if you want to submit a multi-core job, this should go to a queue supporting multi-core job slots. Otherwise, the site admins will be a little unhappy.

3. Following point 2 above, the question is now to find out which queue supports multi-core jobs. The site administrator will help here normally. There is a special case of "ARC" CEs which run with Condor as the batch system. In this case, the same queue apparently supports all possible varieties of jobs and it is for us to submit the job with the correct options.

In case of this last issue (which I suspect it is!) the DIRAC administrator of the instance you are using (Daniela / Simon) will be able to tell you exactly what to do. The solution they will probably adopt is to create a new "site" (say for example "LCG.MultiCoreSite.site") for just these multi-core jobs and ask you to submit specifically to this site. This site will automatically have the correct options to request the needed resources (queue length, RAM, # cores, etc) from the site. Then, you will need an additional line in your ganga job along the lines of

    j.backend.settings['Destination'] = 'LCG.MultiCoreSite.site'

Hope this helps!

Cheers, Raja.
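Raja's Destination suggestion can be sketched as a small helper. In a real Ganga session `j` would be a Job with a Dirac backend; the stand-in object below exists only so the snippet runs outside Ganga, and the helper names are hypothetical. 'LCG.MultiCoreSite.site' is the made-up example name from the message, not a real site.

```python
from types import SimpleNamespace


def send_to_site(job, site_name):
    """Pin a job to one DIRAC site via the backend 'Destination' setting."""
    job.backend.settings['Destination'] = site_name
    return job


def make_stand_in_job():
    """Build a minimal stand-in that mimics only job.backend.settings.

    ASSUMPTION: this is not a Ganga object; it only lets the sketch run
    without a Ganga installation.
    """
    return SimpleNamespace(backend=SimpleNamespace(settings={}))


if __name__ == "__main__":
    j = make_stand_in_job()
    # Placeholder site name taken from the example in the thread.
    send_to_site(j, 'LCG.MultiCoreSite.site')
    print(j.backend.settings)
```

Keeping the site name in one place means the job script only needs a one-line change once the DIRAC administrators announce the actual multi-core site name.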
On Thu, 6 Jul 2017, Raja Nandakumar wrote:
8< Snip sensible advice for brevity >8
> In case of this last issue (which I suspect it is!) the DIRAC administrator of the instance you are using (Daniela / Simon) will be able to tell you exactly what to do. The solution they will probably adopt is to create a new "site" (say for example "LCG.MultiCoreSite.site") for just these multi-core jobs and ask you to submit specifically to this site. This site will automatically have the correct options to request the needed resources (queue length, RAM, # cores, etc) from the site. Then, you will need an additional line in your ganga job along the lines of
> j.backend.settings['Destination'] = 'LCG.MultiCoreSite.site'.
> Hope this helps!
Yes it does, clarifies a few things and helps me think I'm not too far off the mark! I'd only discovered the Destination setting this morning, which looks like it might be as useful as I'd hoped it would be.
Thanks very much, ivan
participants (5)
- Daniela Bauer
- Dr Ivan D. Reid
- Mark Slater
- Raja Nandakumar
- Raul Lopes