Lawrence,

1) Okay, that makes sense. Don't know why I didn't see this earlier.

2) Okay, thanks.

3) The only reason I wanted that function was to get &real_time. Unless there's a more efficient way (or "firedrake" way) of getting this metric? In my case, I only want the time from SNESSolve().

Thanks,
Justin

On Thu, Jul 16, 2015 at 8:47 AM, Lawrence Mitchell <lawrence.mitchell@imperial.ac.uk> wrote:
On 16/07/15 14:32, Justin Chang wrote:
Lawrence,
I have attached the code I am working with. It's basically the one you sent me a few weeks ago, but I am only working with selfp. Attached are the log files with 1, 2, and 4 processors on our local HPC machine (Intel Xeon E5-2680v2 2.8 GHz).
1) I wrapped the PyPAPI calls around solver.solve(). I guess this is doing what I want. Right now I am estimating the arithmetic intensity by recording the FLOPS, loads, and stores. When I compare the measured FLOPS with the PETSc manual FLOP count, it seems PAPI overcounts by a factor of 2 (which I suppose is expected on a new Intel machine). Anyway, in terms of computing the FLOPS and AI this is what I want; I just wanted to make sure these don't account for the DMPlex initialization and stuff because:
Note that inside SNESSolve, PETSc attributes zero flops to forming the residual and Jacobian (since that's a user function that it knows nothing about). We could actually do a reasonable job of adding a PetscLogFlops call, since we can inspect the kernel and make a reasonable guess at the number of flops it does, but we don't currently do that.
This may explain the difference in flop counts.
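For reference, the arithmetic intensity estimate described in point 1 can be sketched as below. This is only a minimal illustration, not what PyPAPI or Firedrake provide: the function name and the bytes_per_access parameter are hypothetical, and the default of 8 bytes assumes every counted load or store moves one double-precision value, which need not hold for vectorised code.

```python
def arithmetic_intensity(flops, loads, stores, bytes_per_access=8):
    """Estimate arithmetic intensity (flops per byte) from raw counter values.

    flops, loads and stores are counter readings taken around the region of
    interest. bytes_per_access is an assumption (8 = one double per access).
    """
    bytes_moved = (loads + stores) * bytes_per_access
    if bytes_moved == 0:
        raise ValueError("no loads or stores were counted")
    return flops / bytes_moved
```

If the measured flop count is known to be inflated by some factor, it would have to be corrected before being passed in; the sketch does not attempt that.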
2) According to the attached log_summaries it seems DMPlexDistribute and MeshMigration still consume a significant portion of the time. By significant I mean that the %T doesn't reduce as I increase the number of processors. I remember seeing Michael Lange's presentations (from PETSc-20 and the webinar) that mentioned something about this?
Yes, for more details on what scales and doesn't, see this paper:
http://arxiv.org/abs/1506.06194
3) Bonus question: how do I also use PAPI_flops(&real_time, &proc_time, &flpins, &mflops)? I see there's the flops() function, but in my limited PAPI experience, I seem to have issues whenever I try to put both that and PAPI_start_counters into the same program, but I could be wrong.
I'm by no means a PAPI expert, but can you not just obtain the result of PAPI_flops by measuring the PAPI_FP_INS counter?
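A rough sketch of that idea: read a floating-point counter before and after the region, time the region with a wall-clock timer, and derive the rate from the two. Everything here is an assumption for illustration. timed_region is a hypothetical helper, and read_counter stands in for whatever callable returns the current cumulative counter value in your PAPI wrapper; no PyPAPI calls are shown because their exact signatures are not given in this thread.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_region(read_counter):
    """Measure wall-clock time and a counter delta around a code region.

    read_counter is a hypothetical zero-argument callable returning the
    current cumulative value of a hardware counter (e.g. PAPI_FP_INS).
    The yielded dict is filled in when the region exits.
    """
    result = {}
    start_count = read_counter()
    start_time = time.perf_counter()
    try:
        yield result
    finally:
        elapsed = time.perf_counter() - start_time
        delta = read_counter() - start_count
        result["real_time"] = elapsed  # seconds of wall-clock time
        result["flpins"] = delta       # counter increase inside the region
        # Millions of counted instructions per second of wall-clock time.
        result["mflops"] = delta / elapsed / 1e6 if elapsed > 0 else 0.0
```

Wrapping only solver.solve() in "with timed_region(read_fp_ins) as stats:" (read_fp_ins again being hypothetical) would then leave the wall-clock time for the solve in stats["real_time"], without including mesh setup or distribution.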
Cheers,
Lawrence