Dear Firedrakers,

I'm recruiting a MEng4 student who is potentially interested in adapting COFFEE to GPUs, in particular studying the impact of the available code transformations and exploring new optimisation strategies.

Am I right in saying that running FFC kernels generated by a Firedrake program on GPUs is, in theory, unproblematic, provided that:

* there are no calls to PETSc
* there is no need to handle boundary conditions (for which I guess one needs PyOP2 subsets? I don't know, this is really not my territory...)
* what else?

In other words, for a simple, trivial program (i.e. I'm not solving anything here):

    TrialFunctions
    TestFunctions
    Coefficients, if any
    LHS = ...
    RHS = ...
    LHS.M
    RHS._dat_forceevaluation  # don't remember the syntax for this, but I guess you can get what I mean

offloading the generated kernels to a GPU and collecting back the results should not be a big issue, right?

Thanks, and have a great weekend

-- Fabio