On Sun, 13 Oct 2019, Riccardo (Jack) Lucchetti wrote:
> On Sat, 12 Oct 2019, Allin Cottrell wrote:
>
>> Here are my timings:
>
> [...]
>
> and here are mine, on two different machines:
>
> Laptop at home:
>
> 1 columns per chunk: 4.1346s
> 2 columns per chunk: 2.1247s
> 5 columns per chunk: 0.7616s
> 10 columns per chunk: 0.2831s
> 25 columns per chunk: 0.1298s
> 50 columns per chunk: 0.3060s
> 100 columns per chunk: 0.9900s
> 125 columns per chunk: 1.2825s
> 500 columns per chunk: 4.1578s
>
> Desktop at work:
>
> 1 columns per chunk: 2.4007s
> 2 columns per chunk: 0.6390s
> 5 columns per chunk: 0.1440s
> 10 columns per chunk: 0.0739s
> 25 columns per chunk: 0.0363s
> 50 columns per chunk: 0.0724s
> 100 columns per chunk: 0.1957s
> 125 columns per chunk: 0.3581s
> 500 columns per chunk: 2.8391s
Thanks, Jack. So, given the options I posited, our machines agree on
a best chunk size of 25 * 5000 * 8 bytes (about 1 MB) for use with
memcpy, 5000 being the number of rows in the matrix to be copied and
8 the number of bytes needed to represent a double-precision
floating-point value.
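
To be concrete about what's being timed, here's a rough stand-alone
sketch of the sort of loop I have in mind (not the actual libgretl
code; the matrix dimensions match the test case above, but the
repetition count is an arbitrary choice). Since the matrix data are
stored column-major, K adjacent columns form one contiguous block,
so each chunk is a single memcpy of K * 5000 doubles:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <time.h>

  #define ROWS 5000
  #define COLS 500
  #define REPS 1000   /* arbitrary: just enough to get stable timings */

  int main (void)
  {
      double *src = calloc((size_t) ROWS * COLS, sizeof *src);
      double *dst = malloc((size_t) ROWS * COLS * sizeof *dst);
      int chunks[] = {1, 2, 5, 10, 25, 50, 100, 125, 500};
      int i, k, rep;

      for (i = 0; i < (int) (sizeof chunks / sizeof chunks[0]); i++) {
          int K = chunks[i];
          clock_t t0 = clock();

          for (rep = 0; rep < REPS; rep++) {
              /* copy the matrix K contiguous columns at a time */
              for (k = 0; k < COLS; k += K) {
                  memcpy(dst + (size_t) k * ROWS,
                         src + (size_t) k * ROWS,
                         (size_t) K * ROWS * sizeof *src);
              }
          }
          printf("%d columns per chunk: %.4fs\n", K,
                 (double) (clock() - t0) / CLOCKS_PER_SEC);
      }

      free(src);
      free(dst);
      return 0;
  }
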
Now, to optimize libgretl's copying of contiguous data, we just have
to figure out how that relates to the size of L1 or L2 cache, or
whatever is truly the relevant hardware parameter here!
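
One quick way to check those cache sizes, at least on Linux with
glibc (the _SC_LEVEL* selectors are a glibc extension to sysconf,
not POSIX), is something like:

  #include <stdio.h>
  #include <unistd.h>

  int main (void)
  {
      /* glibc-specific: returns the cache sizes in bytes */
      printf("L1d cache: %ld bytes\n", sysconf(_SC_LEVEL1_DCACHE_SIZE));
      printf("L2 cache:  %ld bytes\n", sysconf(_SC_LEVEL2_CACHE_SIZE));
      printf("L3 cache:  %ld bytes\n", sysconf(_SC_LEVEL3_CACHE_SIZE));
      return 0;
  }

If the ~1 MB sweet spot turns out to track one of those figures, that
would give us a reasonably portable way of picking the chunk size at
run time.
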
Allin