Hi,
I'm sending the results of two tests, run on a Pentium 4 with
HyperThreading (which is not a true multicore CPU; machine: Dell
PowerEdge 400SC) and on a Core2 Duo (machine: Sony Vaio VGN-FW51JF).
Marcin
--------------------------------------------------------
Pentium 4 HyperThreading
With OpenMP:
40000: 0.30 ( 0.133 Mflops)
160000: 0.80 ( 0.199 Mflops)
640000: 2.24 ( 0.286 Mflops)
2560000: 8.03 ( 0.319 Mflops)
10240000: 31.74 ( 0.323 Mflops)
Without OpenMP:
40000: 0.20 ( 0.200 Mflops)
160000: 0.57 ( 0.281 Mflops)
640000: 2.04 ( 0.314 Mflops)
2560000: 7.99 ( 0.320 Mflops)
10240000: 31.78 ( 0.322 Mflops)
--------------------------------------------------------
Core2 Duo
With OpenMP:
40000: 0.22 ( 0.178 Mflops)
160000: 0.66 ( 0.243 Mflops)
640000: 1.84 ( 0.348 Mflops)
2560000: 3.46 ( 0.740 Mflops)
10240000: 12.33 ( 0.830 Mflops)
Without OpenMP:
40000: 0.16 ( 0.250 Mflops)
160000: 0.49 ( 0.327 Mflops)
640000: 1.61 ( 0.398 Mflops)
2560000: 5.83 ( 0.439 Mflops)
10240000: 23.63 ( 0.433 Mflops)
--------------------------------------------------------
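For anyone who wants to turn these timings into speed-up factors, here
is a minimal C sketch. It assumes the second column of each table is
elapsed time in seconds (this fits the Mflops column, which equals the
first column divided by the time and by 10^6). The function and array
names are made up for illustration; they are not part of gretl or of
the test script.

```c
#include <assert.h>
#include <stddef.h>

#define N_SIZES 5

/* Timings copied from the tables in this message (assumed seconds),
   ordered from the smallest problem size to the largest. */
static const double p4_serial[N_SIZES]    = {0.20, 0.57, 2.04, 7.99, 31.78};
static const double p4_openmp[N_SIZES]    = {0.30, 0.80, 2.24, 8.03, 31.74};
static const double core2_serial[N_SIZES] = {0.16, 0.49, 1.61, 5.83, 23.63};
static const double core2_openmp[N_SIZES] = {0.22, 0.66, 1.84, 3.46, 12.33};

/* Speed-up of the OpenMP build over the serial build: serial time
   divided by OpenMP time. Values above 1 mean OpenMP was faster,
   values below 1 mean it was slower. */
double speedup(double t_serial, double t_openmp)
{
    assert(t_serial > 0.0 && t_openmp > 0.0);
    return t_serial / t_openmp;
}

/* Fill out[0..n-1] with the speed-up for each problem size. */
void speedups(const double *serial, const double *openmp,
              double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = speedup(serial[i], openmp[i]);
    }
}
```

Calling speedups(core2_serial, core2_openmp, out, N_SIZES) gives a
factor of roughly 0.7 for the smallest case (a slowdown) and roughly
1.9 for the largest one. The same call with the Pentium 4 arrays stays
at about 1.0 for the largest case.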
On 24.03.2010 04:44, Allin Cottrell wrote:
As some of you know, we're currently experimenting with openmp in
gretl. When building from CVS, use of openmp is the default (if
openmp is supported on the host) unless you pass the option
--disable-openmp to the configure script. In addition the current
snapshots for Windows and OS X are built with openmp support
(using gcc 4.4.3 and gcc 4.2.4 respectively).
This note is just to inform you about the state of play, and to
invite submission of test results if people would like to do that.
Right now, we use openmp only for gretl's native matrix
multiplication. So it'll get used (assuming you have at least two
cores) if you do matrix multiplication in a script, or call a
function that does matrix multiplication (such as qform), or use a
built-in command that happens to call matrix multiplication. If we
decide it's a good idea, we could use openmp directives in other
gretl code (but as long as we rely on lapack for much of our
number-crunching, and as long as lapack is not available in a
parallelized form, the scope for threading will remain somewhat
limited).
In a typical current use situation, with gretl running on a
dual-core machine where there's little other demand being placed
on the processors, the asymptotic speed-up from openmp should be
close to a factor of two. However, it takes a big calculation to
get close to the asymptote, and we've found that with small to
moderate sized matrices the overhead from starting and stopping
threads dominates, producing a slowdown relative to serial code.
This is similar to what we found with regard to the ATLAS
optimized blas; see
http://ricardo.ecn.wfu.edu/~cottrell/tmp/gretl_speed.html
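To make the trade-off concrete, here is a rough C sketch of a matrix
multiplication that only goes parallel when the job is big enough to
pay for the thread overhead. This is not gretl's actual code: the
function name, the column-major storage and the cut-off value
PAR_MIN_WORK are all assumptions chosen for illustration, and a
sensible cut-off would have to be found by timing on the target
machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-off: with fewer multiply-adds than this, the loop
   runs serially so that no thread start-up cost is paid. */
#define PAR_MIN_WORK 1000000LL

/* C = A * B, where A is m x k, B is k x n and C is m x n.
   All three matrices are assumed to be stored column-major. */
void matmul(const double *a, const double *b, double *c,
            int m, int k, int n)
{
    assert(m > 0 && k > 0 && n > 0);

    long long work = (long long) m * k * n;  /* multiply-add count */

    /* The "if" clause makes the region parallel only for large jobs.
       Each thread handles a disjoint set of columns of C. */
#pragma omp parallel for if (work >= PAR_MIN_WORK)
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double s = 0.0;
            for (int l = 0; l < k; l++) {
                s += a[l * m + i] * b[j * k + l];
            }
            c[j * m + i] = s;
        }
    }
}
```

With gcc this needs the -fopenmp flag to enable threading; without it
the pragma is ignored and the same code runs serially, which is one
way to get both timings from a single source file.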
Anyway, in case anyone would like to test I'm attaching a matrix
multiplication script that Jack wrote. Right now this is mostly
useful for people building gretl from source, since you want to
run timings both with and without MP, which requires rebuilding.
But if you're currently using a snapshot from before yesterday
(build date 2010-03-21 or earlier) you could run the script, then
download a current snapshot and run it again.
Allin
_______________________________________________
Gretl-devel mailing list
Gretl-devel(a)lists.wfu.edu
http://lists.wfu.edu/mailman/listinfo/gretl-devel
--
Marcin Błażejowski
http://www.wrzosy.nsb.pl/~marcin/
GG# 203127