Dear all,
In my code, I need to compute the hessian of my solution (for now a
Lagrange P1 field, on an unstructured simplicial mesh). To that aim, I
have been using the classic FE method, which is given in the
Monge-Ampère demonstration in 2D
(http://www.firedrakeproject.org/demos/ma-demo.py.html).
I am now trying to do the same thing in 3D. But it is slow. For now, I
am running in serial, so I expected it to be slow but now that slow.
The code used is the following:
n = 20
mesh = UnitCubeMesh(n, n, n)
W = FunctionSpace(mesh, 'CG', 1)
M = TensorFunctionSpace(mesh, 'CG', 1)
icExpr = Expression("((x[0]-0.35)*(x[0]-0.35) + (x[1]-0.35)*(x[1]-0.35)
+ (x[2]-0.35)*(x[2]-0.35) < 0.15*0.15)")
w = Function(W).interpolate(icExpr)
sigma = TestFunction(M)
H = Function(M)
n = FacetNormal(mesh)
Lh = inner(sigma, H)*dx + inner(div(sigma), grad(w))*dx
Lh -= (sigma[0,0]*w.dx(0)*n[0] + sigma[1,0]*w.dx(1)*n[0] +
sigma[2,0]*w.dx(2)*n[0] \
+ sigma[0,1]*w.dx(0)*n[1] + sigma[1,1]*w.dx(1)*n[1] +
sigma[2,1]*w.dx(2)*n[1] \
+ sigma[0,2]*w.dx(0)*n[2] + sigma[1,2]*w.dx(1)*n[2] +
sigma[2,2]*w.dx(2)*n[2])*ds
H_prob = NonlinearVariationalProblem(Lh, H)
H_solv = NonlinearVariationalSolver(H_prob)
H_solv.solve()
(Warning if you try to run this code, UniCubeMesh can take a long time
for not so great n)
If we consider a unit cube with 50 vertices on each edge (n=50, that
makes 132,651 vertices and 750,000 tetrahedra which is a small 3D case),
the wall clock time for the last instruction H_solv.solve() is 15 min.
For n=65 (287,496 vertices and 1,647,750 tets), the wall clock time goes
up to one hour. For n=85 (531,441 vertices, 3,072,000 tets), it takes
80 min, The processor is an
Intel Xeon E5-2640 v3 @ 2.60GHz, and these timings are obtained with the
default set of parameters for the solver (notably snes_rtol = 1e-8 and
ksp_rtol = 1e-5). For both meshes, the solver ends after 2 SNES
iterations, each iteration requiring 5 KSP iterations, which seem to be
relatively small numbers.
If I understand correctly, these times include the assembly and the
actual solution of the problem. I do not know how to distinguish the
assembly time from the solver time, but, if I can trust Firedrake debug
prints (with monitor_snes: True), the assembly is an order of magnitude
slower than the solver.
Do these CPU times look normal to you ?
I have, changed snes_rtol to 1e-2, as I don't need a great precision for
my hessian (I'm leaving ksp_rtol to 1e-5 to preserve the convergence, am
I right ?). I also changed the preconditioner, and use SOR, which helped
(a lot for tiny meshes, somewhat less for bigger ones). With these
options, my CPU time becomes 5min for n=50 and 9min for n=80.
That is better. However, I am a little concerned by the fact that the
CPU time quickly increases with n, and that this method seems
significantly slower than a gradient/Hessian recovery method. In the
context of mesh adaptation, I have to compute the Hessian frequently,
and it is not supposed to be a costly stage in the process.
So what do you think? Are there other things I can do to speed this up?
Many thanks,
--
Nicolas
--
Nicolas Barral
Dept. of Earth Science and Engineering
Imperial College London
Royal School of Mines - Office 4.88
London SW7 2AZ
_______________________________________________
firedrake mailing list
firedrake@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/firedrake