Hi everyone,

I am very new to Firedrake; I recently switched over from FEniCS, so pardon me if this is a newbie question.

If I wanted to run hardware performance counters from PAPI to diagnose my Firedrake application, how would I go about doing this? In FEniCS my application was written in C++, and I included the PAPI directives and libraries in the CMake file. With that I was able to specify where in the source code I wanted to start and stop the counters.

Is there a way to somehow replicate this if my code is written using Firedrake?

Thanks,
Justin
Hi Justin,
This is somewhat tricky, since we generate all the code that spins over the mesh for assembly at runtime, and the calls into PETSc happen purely through petsc4py. There are two approaches, which will have slightly different measurement characteristics:

1. Measure from Python

I wrote some bindings for the high-level PAPI API that are callable from Python. They seem to be broadly functional; see

https://github.com/firedrakeproject/PyPAPI

To use these you do:

    import pypapi
    import numpy as np

    events = np.asarray([pypapi.Event.TOT_CYC], dtype=pypapi.EventType)
    counts = np.zeros(1, dtype=pypapi.CountType)

    pypapi.start_counters(events)
    # Do your computation (for example, assembling an operator)
    ...
    pypapi.stop_counters(counts)
    # counts now contains the number of cycles elapsed between start and stop

Note that this approach includes cycle/flop counts and timings for the full stack of computation: that is, it counts work in the Python interpreter as well as in the low-level C that does the assembly loop. In some sense this is the fairest thing to report, since that is the work your code is actually doing.

2. Measure in the generated code

This gives you finer-grained control over where you want to count things, and gets you much closer to the metal. However, it is (much) trickier to do.

Firedrake defers to PyOP2 to do computation on meshes, and it is PyOP2 which generates the code to do this. If you want to insert PAPI calls here, there are two things to consider:

1. Where exactly you want to measure. I guess you'll want to measure around a complete assembly over the mesh, rather than the execution of an individual kernel.

2. How to get the information back out again so you can inspect it.

As to the former, I think it should be possible to extend the code generation to optionally insert the correct PAPI calls. Returning the information is also probably not too difficult: we can stuff the results in a Python dict (or similar) that you can then query from PyOP2.

My recommendation is to try the pure Python approach first. If you want finer-grained control, we can provide guidance and/or help on how best to do that.

Cheers,

Lawrence
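As a concrete illustration of option 1, here is a minimal, self-contained sketch wrapping an operator assembly in Firedrake. The Poisson-style form is only an example, and the FP_OPS event name is assumed to mirror PAPI_FP_OPS; check PyPAPI's Event enum for the exact spelling:

    import numpy as np
    import pypapi
    from firedrake import *

    mesh = UnitSquareMesh(32, 32)
    V = FunctionSpace(mesh, "CG", 1)
    u = TrialFunction(V)
    v = TestFunction(V)
    a = inner(grad(u), grad(v)) * dx

    # Count cycles and floating-point operations for the assembly only;
    # mesh and function space setup happen before the counters start.
    events = np.asarray([pypapi.Event.TOT_CYC, pypapi.Event.FP_OPS],
                        dtype=pypapi.EventType)
    counts = np.zeros(2, dtype=pypapi.CountType)

    pypapi.start_counters(events)
    A = assemble(a)
    pypapi.stop_counters(counts)

    print("cycles: %d, flops: %d" % (counts[0], counts[1]))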
Lawrence, I will give this a try, thanks.
First option works wonderfully for me, but now I am wondering how I would employ the second option. Specifically, I want to profile SNESSolve().

I would prefer to circumvent profiling of the DMPlex distribution, because it seems that is a major bottleneck for multiple processes at the moment. How would I do this?

Thanks,
Justin
OK, so calls out to PETSc are done from Python (via petsc4py). It's just the calls to integral assembly (i.e. evaluation of Jacobians and residuals) that go through a generated code path.

To be more concrete, let's say you have the following code:

    F = some_residual
    problem = NonlinearVariationalProblem(F, u, ...)
    solver = NonlinearVariationalSolver(problem)
    solver.solve()

Then the call chain inside solver.solve is effectively:

    solver.solve ->
        SNESSolve ->                # via petsc4py
            SNESComputeJacobian ->
                assemble(Jacobian)  # Callback to Firedrake
            SNESComputeFunction ->
                assemble(residual)  # Callback to Firedrake
            KSPSolve

So if you wrapped flop counting around the outermost solver.solve() call, you're pretty close to wrapping SNESSolve. Or do you mean something else when profiling SNESSolve?
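A sketch of that wrapping, reusing the PyPAPI bindings from above. The Poisson-style residual and boundary condition here are placeholders for whatever problem is actually being solved; because the counters start only after the mesh has been created and distributed, DMPlexDistribute is not included in the counts:

    import numpy as np
    import pypapi
    from firedrake import *

    mesh = UnitSquareMesh(64, 64)   # mesh creation/distribution happens here
    V = FunctionSpace(mesh, "CG", 1)
    u = Function(V)
    v = TestFunction(V)
    F = inner(grad(u), grad(v))*dx - Constant(1.0)*v*dx
    bc = DirichletBC(V, 0.0, (1, 2, 3, 4))

    problem = NonlinearVariationalProblem(F, u, bcs=bc)
    solver = NonlinearVariationalSolver(problem)

    events = np.asarray([pypapi.Event.TOT_CYC], dtype=pypapi.EventType)
    counts = np.zeros(1, dtype=pypapi.CountType)

    pypapi.start_counters(events)
    solver.solve()                  # effectively wraps SNESSolve
    pypapi.stop_counters(counts)

    print("cycles in solve: %d" % counts[0])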
As for circumventing the DMPlex distribution: can you provide an example mesh/process count that demonstrates this issue, or at least characterize it a little better? Michael Lange and Matt Knepley have done a lot of work over the last nine months or so on making DMPlexDistribute much faster than it was, so if it turns out still to be slow, we'd really like to know about it and try to fix it.

Cheers,

Lawrence
Lawrence,

I have attached the code I am working with. It's basically the one you sent me a few weeks ago, but I am only working with selfp. Attached are the log files with 1, 2, and 4 processors on our local HPC machine (Intel Xeon E5-2680v2, 2.8 GHz).

1) I wrapped the PyPAPI calls around solver.solve(). I guess this is doing what I want. Right now I am estimating the arithmetic intensity by documenting the FLOPs, loads, and stores. When I compare the measured FLOPs with the PETSc manual flop count, it seems PAPI over-counts by a factor of 2 (which I suppose is expected coming from a new Intel machine). Anyway, in terms of computing the FLOPs and AI this is what I want; I just wanted to make sure these don't account for the DMPlex initialization and such, because:

2) According to the attached log summaries, it seems DMPlexDistribute and MeshMigration still consume a significant portion of the time. By significant I mean that the %T doesn't reduce as I increase the number of processors. I remember seeing Michael Lange's presentations (from PETSc-20 and the webinar) that mentioned something about this?

3) Bonus question: how do I also use PAPI_flops(&real_time, &proc_time, &flpins, &mflops)? I see there's the flops() function, but in my limited PAPI experience I seem to have issues whenever I try to put both that and PAPI_start_counters into the same program, but I could be wrong.

Thanks,
Justin
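A rough sketch of the arithmetic-intensity measurement described in 1). The FP_OPS, LD_INS and SR_INS event names are assumed to mirror the PAPI presets PAPI_FP_OPS, PAPI_LD_INS and PAPI_SR_INS, solver is the solver object set up earlier in the script, and the 8 bytes per load/store assumes double-precision accesses, so the resulting intensity is only an estimate:

    import numpy as np
    import pypapi

    # Assumed event names mirroring PAPI_FP_OPS, PAPI_LD_INS, PAPI_SR_INS.
    events = np.asarray([pypapi.Event.FP_OPS,
                         pypapi.Event.LD_INS,
                         pypapi.Event.SR_INS], dtype=pypapi.EventType)
    counts = np.zeros(3, dtype=pypapi.CountType)

    pypapi.start_counters(events)
    solver.solve()                         # the solve we want to characterize
    pypapi.stop_counters(counts)

    flops, loads, stores = counts
    bytes_moved = 8.0 * (loads + stores)   # assume 8-byte (double) accesses
    print("flops: %d, arithmetic intensity: %.3f flops/byte"
          % (flops, flops / bytes_moved))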
1) Note that inside SNESSolve, PETSc attributes zero flops to forming the residual and Jacobian (since those come from a user function that it knows nothing about). We could actually do a reasonable job of adding a PetscLogFlops call, since we can inspect the kernel and make a reasonable guess at the number of flops it performs, but we don't currently do that. This may explain the difference in flop counts.

2) Yes. For more details on what scales and what doesn't, see this paper: http://arxiv.org/abs/1506.06194

3) I'm by no means a PAPI expert, but can you not just obtain the result of PAPI_flops by measuring the PAPI_FP_INS counter?

Cheers,

Lawrence
Lawrence,

1) Okay, that makes sense. Don't know why I didn't see this earlier.

2) Okay, thanks.

3) The only reason I wanted that function was to get &real_time. Unless there's a more efficient way (or "Firedrake" way) of getting this metric? In my case, I only want the time from SNESSolve().

Thanks,
Justin
You could just use time.time() from the Python time library.

I have also pushed an update to PyPAPI to wrap PAPI_get_real_cyc and PAPI_get_real_usec, the latter of which will give you the equivalent of real_time.

Cheers,

Lawrence
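A small sketch of timing just the solve both ways, assuming the new wrapper is exposed as pypapi.get_real_usec and that solver is the solver object from earlier in the script (PAPI reports microseconds, hence the 1.0e-6 conversion):

    import time
    import pypapi

    t0 = time.time()
    usec0 = pypapi.get_real_usec()

    solver.solve()            # the SNESSolve we want to time

    usec1 = pypapi.get_real_usec()
    t1 = time.time()

    print("wall time: %.3f s (time.time), %.3f s (PAPI)"
          % (t1 - t0, (usec1 - usec0) * 1.0e-6))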
I am getting this now:

    Traceback (most recent call last):
      File "mixed-poisson.py", line 3, in <module>
        import pypapi
      File "/home/jchang23/firedrake-deps/PyPAPI/pypapi/__init__.py", line 11, in <module>
        from pypapi.papi import get_cycles_time, get_real_cyc, get_real_usec
    ImportError: cannot import name get_cycles_time
Oh, you'll need to rebuild the extension module (python setup.py build_ext --inplace).

Lawrence
Got it, thanks!
participants (2)

- Justin Chang
- Lawrence Mitchell