On Mon, 13 Jan 2020, Sven Schreiber wrote:
> Thanks for the thorough testing. Perhaps it would be interesting to also
> restrict the number of OMP threads to 1 (instead of ncores)?
Yes, very interesting! See below.
> (I don't know how to do that on Windows, except by wrapping the whole
> thing in a dummy mpi block with np=1 and omp-threads=1.)
Google "windows set environment variable" (it's not that hard); the
variable to set is OMP_NUM_THREADS.
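For what it's worth, the variable can also be set for a single child process rather than system-wide, and in the same way on Windows and Linux. A minimal Python sketch, where the helper name run_with_omp_threads is made up for illustration and the child command is only a stand-in for whatever program is being timed:

```python
import os
import subprocess
import sys


def run_with_omp_threads(cmd, nthreads=1):
    """Run cmd with OMP_NUM_THREADS set only in the child's environment."""
    env = os.environ.copy()  # copy, so the parent's environment is untouched
    env["OMP_NUM_THREADS"] = str(nthreads)
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in child process: it just echoes the variable it sees.
    child = [sys.executable, "-c",
             "import os; print(os.environ['OMP_NUM_THREADS'])"]
    result = run_with_omp_threads(child, nthreads=1)
    print(result.stdout.strip())
```

Note that this limits every OpenMP region in the child process, so it is a way to run the timing comparison, not a way to single-thread one function.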
Anyway, here's what I found on forcing single-threaded behavior:
* eigensym() is uniformly faster than eigen(), on both Windows and
  Linux.
* Single-threading makes more difference to the eigensym() times, but
  even eigen() is a bit faster when single-threaded.
* This holds for matrices up to order 200 (which is as big as I've
  tried).
Just as one example, here's the comparison for input of order 90, on a
dual-boot haswell laptop with 2 physical cores, max 4 threads:

OMP_NUM_THREADS=1
             Win10     Linux
eigen:     6.2417s   6.2913s
eigensym:  1.7279s   1.5508s

OMP_NUM_THREADS=2
             Win10     Linux
eigen:     6.3512s   6.7276s
eigensym:  5.6093s   2.1323s

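To put those figures as ratios, here is a small Python sketch that divides each two-thread time by the corresponding one-thread time. The timings are copied from the table above; the dictionary layout and the function name slowdown are just for illustration.

```python
# Timings in seconds, copied from the table above, as (Win10, Linux).
one_thread = {"eigen": (6.2417, 6.2913), "eigensym": (1.7279, 1.5508)}
two_threads = {"eigen": (6.3512, 6.7276), "eigensym": (5.6093, 2.1323)}


def slowdown(func):
    """Ratio of 2-thread to 1-thread time per platform; > 1 means threads hurt."""
    return tuple(t2 / t1 for t1, t2 in zip(one_thread[func], two_threads[func]))


for func in ("eigen", "eigensym"):
    win, lin = slowdown(func)
    print(f"{func:9s} Win10 x{win:.2f}  Linux x{lin:.2f}")
```

On these numbers the two-thread eigensym() run takes roughly 3.2 times as long as the single-threaded one on Windows and about 1.4 times as long on Linux, while eigen() changes by well under 10 percent on either platform.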
So rather than divert from eigensym to eigen on Windows, what we
really want to do is run single-threaded eigensym on both platforms,
if we can figure out how to do that. (We don't want everything OMP to
be single-threaded.)
Allin