[Gretl-devel] Re: speed of (m)ols with svd

Saturday, 28 December 2019

On Sat, 28 Dec 2019, Riccardo (Jack) Lucchetti wrote:

 On Fri, 27 Dec 2019, Allin Cottrell wrote:

> It appears to be a free lunch, pretty much. The speed-up is significant but 
> not huge, something like 30 percent. That's now in git. But the primary 
> source of difference between the gretl and numpy times that you quote must 
> be due to the respective Blas/Lapack implementations.
> 
> Here's what I'm seeing on a fairly elderly PC running Fedora with gretl 
> linked against openblas. I reduced the number of replications to 2000 
> (impatience) and created a baseline of accuracy by running mpols with 
> GRETL_MP_BITS=4096.
> 
> # gretl svd on, using DGELSS
> gretl (mols): 8.43043 seconds
> maxerr (gretl)  = 0.000000000000007
> python (linalg.lstsq): 5.63282 seconds
> maxerr (python) = 0.000000000000007
> 
> # gretl svd on, using DGELSD
> gretl (mols): 6.00157 seconds
> maxerr (gretl)  = 0.000000000000007
> python (linalg.lstsq): 5.62002 seconds
> maxerr (python) = 0.000000000000007
> 
> # gretl svd off (Cholesky)
> gretl (mols): 0.396789 seconds
> maxerr (gretl)  = 0.000000000000005
> python (linalg.lstsq): 5.62659 seconds
> maxerr (python) = 0.000000000000007
> 
> From this, three points are apparent: (1) as stated above, DGELSD is 
> close to 30% faster than DGELSS; (2) numpy is a little faster than us on 
> SVD; and (3) you get just as accurate results an order of magnitude faster 
> via Cholesky (the mols default) provided the regressors (as here) are not 
> horribly collinear.
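
For readers following point (3), here is a minimal pure-Python sketch of the normal-equations route: form X'X and X'y, factor X'X by Cholesky, then do two triangular solves. It is not gretl's code (mols goes through LAPACK), and the names `cholesky` and `ols_cholesky` are made up for illustration. It does show why this path is cheap, since only a k x k matrix is factored, and why it is fragile under collinearity: forming X'X squares the condition number of X, and the factorization fails outright when X'X is not positive definite.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L' = a, for symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    # A non-positive pivot means X'X is (numerically) singular.
                    raise ValueError("X'X is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def ols_cholesky(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    XtX = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
           for i in range(k)]
    Xty = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
    L = cholesky(XtX)
    # Forward substitution: solve L z = X'y.
    z = [0.0] * k
    for i in range(k):
        z[i] = (Xty[i] - sum(L[i][j] * z[j] for j in range(i))) / L[i][i]
    # Back substitution: solve L' b = z.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (z[i] - sum(L[j][i] * b[j] for j in range(i + 1, k))) / L[i][i]
    return b


if __name__ == "__main__":
    # Constant plus one regressor, with y = 1 + 2x exactly.
    X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    y = [1.0, 3.0, 5.0, 7.0]
    print(ols_cholesky(X, y))
```

With perfectly collinear columns the pivot check raises an error. A real implementation needs some fallback at that point (such as the QR or SVD routes discussed in this thread) rather than simply giving up.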

 Hm, I'm seeing something weird here: on my home laptop (an 8-core, 4 real 
 cpus machine) I'm getting results that are fairly consistent with yours. On 
 the other hand, trying the same on my work PC (a 12-core box) I'm seeing 
 this:

 ----------------------------------------------
 Old code (with set svd on):
 ----------------------------------------------

 gretl (mols): 7.09208 seconds
 maxerr (gretl) = 0.000000000000010
 python (linalg.lstsq): 3.92083 seconds
 maxerr (numpy) = 0.000000000000008

 ----------------------------------------------
 New code (with set svd on):
 ----------------------------------------------

 gretl (mols): 3.79993 seconds
 maxerr (gretl) = 0.000000000000008
 python (linalg.lstsq): 3.89591 seconds

 ----------------------------------------------
 New code (with set svd on):

Should that be, with set svd off?

 ----------------------------------------------

 gretl (mols): 120.903 seconds
 maxerr (gretl) = 0.000000000000005
 python (linalg.lstsq): 5.63783 seconds

 It looks as if turning svd off makes mols a good deal slower (by the way: all 
 12 cpus go at 100% for the whole time). I have no time now to check why this 
 happens, but I'll throw an idea in: perhaps, for some reason, I'm getting the 
 QR decomposition instead of Cholesky?

If that were the case you should see on stderr,

"gretl_matrix_multi_ols: switching to QR decomp"

Are these 12 "real" cores? The timings I showed were from a quad 
core box with hyperthreads and I set OMP_NUM_THREADS=4 to prevent 
hyperthreading, which slows things down. (If OpenBLAS doesn't use 
OMP, you'd set OPENBLAS_NUM_THREADS if wanted.)
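
For anyone wanting to repeat the timings with a fixed thread count, here is a small Python sketch of one way to cap the threads for a child process. The helper name `capped_thread_env` is hypothetical. It sets both OMP_NUM_THREADS and OPENBLAS_NUM_THREADS, on the assumption that setting the one your OpenBLAS build ignores is harmless. The variables must be in the environment before the BLAS library is loaded, which is why they go to a fresh process here.

```python
import os
import subprocess
import sys


def capped_thread_env(n_threads, base_env=None):
    """Return a copy of the environment with BLAS/OpenMP threads capped."""
    env = dict(os.environ if base_env is None else base_env)
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        env[var] = str(n_threads)
    return env


if __name__ == "__main__":
    # Stand-in child process that just echoes the variable; in practice the
    # command would be whatever runs the benchmark script.
    child = "import os; print(os.environ['OMP_NUM_THREADS'])"
    result = subprocess.run(
        [sys.executable, "-c", child],
        env=capped_thread_env(4),
        capture_output=True,
        text=True,
        check=True,
    )
    print("child sees OMP_NUM_THREADS =", result.stdout.strip())
```

Setting the count to the number of physical cores, as described above, is the usual starting point. Whether it helps on a given box is something to measure rather than assume.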

Allin
