Hi Helmut,

Chris ran the 1D solver and profiled it locally. When using Collections with auto-tuning, he got the attached profile. With auto-tuning turned off, we did observe mutex_lock as the second-highest entry, with dgemv at the top of the list. However, in your earlier profiling you had a boost method we do not recognise at the top. Could you perhaps profile your run again, but with auto-tuning turned on, to see whether this boost method is still at the top of your profiling list? If so, the discrepancy is presumably related to this method, and we should try to find out why it is arising.

Cheers,
Spencer

PS We do have a threading library which places a lock on the memory pool, and this is probably why running without auto-tuning leads to a high number of locks. We will look into turning off this feature if threading is not being used.

PPS Douglas is going to look into using boost 1.57 (since we are generally using 1.58 or higher) to see if he observes a similar slowdown to what you see.

Begin forwarded message:

From: Chris Cantwell <c.cantwell@imperial.ac.uk>
Subject: profiling screenshot
Date: 18 May 2016 at 18:04:26 BST
To: Spencer Sherwin <s.sherwin@imperial.ac.uk>


Spencer Sherwin
McLaren Racing/Royal Academy of Engineering Research Chair,
Professor of Computational Fluid Mechanics,
Department of Aeronautics,
Imperial College London
South Kensington Campus
London SW7 2AZ

+44 (0) 20 759 45052