Dear Firedrakers,
I'm recruiting a MEng4 student who may be interested in adapting COFFEE to GPUs, in particular studying the impact of the available code transformations and exploring new optimisation strategies. Am I right in saying that there is, in principle, no problem running the FFC kernels generated by a Firedrake program on GPUs, provided that:
* there's no call to PETSc
* there's no need to handle boundary conditions (for which I guess one needs PyOP2 subsets? I don't know, this is really not my territory...)
* what else?
In other words, for a trivial program like the following (I'm only assembling, not solving anything):
TrialFunctions
TestFunctions
Coefficients, if any
LHS = ...
RHS = ...
LHS.M
RHS._dat_forceevaluation # don't remember the syntax for this, but I guess you can get what I mean
offloading the generated kernels to a GPU and collecting the results back should not be a big issue, right?
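To make concrete what I mean by "offloading the kernels", here is a rough sketch in plain Python. It is not Firedrake, PyOP2 or COFFEE API: every name is made up for illustration, and the 1D P1 mass matrix is only an assumed stand-in for an FFC-generated kernel. The point is that the per-cell kernel calls are independent of one another (the part I would expect to run on the GPU), while the scatter into the global matrix is a separate step.

```python
# Illustration only: hypothetical names, no Firedrake/PyOP2/COFFEE API involved.

def local_mass_kernel(x0, x1):
    """Stand-in for a generated local kernel: P1 mass matrix on the interval [x0, x1]."""
    h = x1 - x0
    return [[h / 3.0, h / 6.0],
            [h / 6.0, h / 3.0]]


def assemble_mass(coords, cells):
    """Assemble a dense global mass matrix from per-cell kernel results."""
    n = len(coords)

    # Stage 1: one kernel call per cell. The calls share no state, so in
    # principle each could be executed by a separate GPU thread.
    local = [local_mass_kernel(coords[a], coords[b]) for a, b in cells]

    # Stage 2: scatter-add the local matrices into the global one using the
    # cell-to-node map. On a GPU this step needs atomics or colouring.
    A = [[0.0] * n for _ in range(n)]
    for dofs, Ae in zip(cells, local):
        for i in range(2):
            for j in range(2):
                A[dofs[i]][dofs[j]] += Ae[i][j]
    return A


# Two cells of width 0.5 covering the unit interval.
coords = [0.0, 0.5, 1.0]
cells = [(0, 1), (1, 2)]
A = assemble_mass(coords, cells)
```

A quick sanity check is that the entries of the mass matrix sum to the length of the domain, since the basis functions sum to one everywhere.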
Thanks, and have a great weekend
-- Fabio