I just want to add another result, obtained on Windows 7 (64-bit) but with the 32-bit build of gretl:
<OUTPUT>
dgemm experiment 1, variant 1, speed in Gflops
m n k vanilla openmp netlib
128 128 128 2.1368 3.4852 2.3743
128 128 256 2.5487 4.4178 2.5388
128 128 512 2.7106 5.0125 2.7782
128 128 1024 2.7998 5.4136 2.8940
128 128 2048 2.7670 5.3196 2.9199
result: openmp dominates
dgemm experiment 1, variant 2, speed in Gflops
m n k vanilla openmp netlib
128 128 128 2.3372 3.4591 2.3988
256 256 128 2.3256 4.4618 2.7094
512 512 128 2.1448 4.6779 2.7654
1024 1024 128 2.4723 4.7650 2.7767
2048 2048 128 2.5796 4.6320 2.7367
result: openmp dominates
dgemm experiment 1, variant 3, speed in Gflops
m n k vanilla openmp netlib
128 128 128 2.6201 3.3397 2.3657
256 256 256 2.8571 5.0325 2.9471
512 512 512 3.1156 4.9564 3.1536
1024 1024 1024 2.3233 5.2080 2.3002
2048 2048 2048 2.3192 4.4718 2.3189
result: openmp dominates
dgemm experiment 2, variant 1, speed in Gflops
m n k vanilla openmp netlib
8 8 8 0.44731 0.027292 0.46398
16 8 8 0.56046 0.050733 0.62706
32 8 8 0.66725 0.10275 0.66642
64 8 8 0.73151 0.18160 0.74646
128 8 8 0.78491 0.30364 0.79847
256 8 8 0.80797 0.45798 0.82017
512 8 8 0.79569 0.61589 0.78943
1024 8 8 0.97164 0.75438 0.82914
2048 8 8 0.83483 0.80910 0.71419
4096 8 8 0.84114 0.85660 0.83595
result: openmp dominates for mnk >= 262144
dgemm experiment 2, variant 2, speed in Gflops
m n k vanilla openmp netlib
10 2 1000 1.9133 0.59579 2.2596
20 2 1000 2.4185 0.93620 2.6299
40 2 1000 2.4249 1.5991 2.4208
80 2 1000 2.5991 2.3573 2.7317
160 2 1000 2.8793 3.3007 2.9413
320 2 1000 2.9538 4.5477 2.9906
640 2 1000 2.2554 3.4918 2.2917
1280 2 1000 2.2609 3.7071 2.2745
2560 2 1000 2.2118 3.3263 2.2296
5120 2 1000 2.2168 3.4272 2.2340
result: openmp dominates for mnk >= 320000
dgemm experiment 2, variant 3, speed in Gflops
m n k vanilla openmp netlib
10 10 1000 1.9689 1.9801 2.3104
20 10 1000 2.4662 3.0143 2.7699
40 10 1000 2.4801 3.8146 2.4038
80 10 1000 2.6682 4.2088 2.7531
160 10 1000 2.9370 4.6238 2.9567
320 10 1000 2.9992 4.6294 2.9979
result: openmp dominates for mnk >= 200000
netlib dominates for mnk < 200000
Operating system: Windows (32-bit)
BLAS library: Netlib
Number of processors: 4
OpenMP enabled: yes
Performance summary:
vanilla -
dominates outright in 0 out of 6 tests
openmp -
dominates outright in 3 out of 6 tests
dominates in 3 test(s) for mnk >= (262144, 320000, 200000)
netlib -
dominates outright in 0 out of 6 tests
dominates in 1 test(s) for mnk < 200000
</OUTPUT>
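For anyone reading the "dominates for mnk >= ..." lines, here is a minimal Python sketch of how such a threshold can be recovered from one of the tables above. It is my own reading of the output, not the benchmark script's actual logic: I assume the threshold is the smallest product m*n*k from which one variant is the fastest in every remaining row. The function name `dominance_threshold` and the data layout are hypothetical; the numbers are copied from experiment 2, variant 3.

```python
# Rows of "dgemm experiment 2, variant 3" as (m, n, k, speeds in Gflops),
# listed in order of increasing m*n*k, as in the output above.
EXP2_VAR3 = [
    (10, 10, 1000, {"vanilla": 1.9689, "openmp": 1.9801, "netlib": 2.3104}),
    (20, 10, 1000, {"vanilla": 2.4662, "openmp": 3.0143, "netlib": 2.7699}),
    (40, 10, 1000, {"vanilla": 2.4801, "openmp": 3.8146, "netlib": 2.4038}),
    (80, 10, 1000, {"vanilla": 2.6682, "openmp": 4.2088, "netlib": 2.7531}),
    (160, 10, 1000, {"vanilla": 2.9370, "openmp": 4.6238, "netlib": 2.9567}),
    (320, 10, 1000, {"vanilla": 2.9992, "openmp": 4.6294, "netlib": 2.9979}),
]


def dominance_threshold(rows, winner):
    """Return the smallest m*n*k from which `winner` is fastest in every
    remaining row, or None if it is not fastest in the last row.

    Assumes `rows` is sorted by increasing m*n*k (an assumption about the
    table layout, which holds for the output shown above).
    """
    threshold = None
    # Walk from the largest problem size downwards and stop at the first
    # row where some other variant is faster.
    for m, n, k, speeds in reversed(rows):
        fastest = max(speeds, key=speeds.get)
        if fastest != winner:
            break
        threshold = m * n * k
    return threshold


if __name__ == "__main__":
    print(dominance_threshold(EXP2_VAR3, "openmp"))  # prints 200000
```

With this table, openmp is fastest in every row from 20 x 10 x 1000 upwards, which gives 20 * 10 * 1000 = 200000 and matches the reported "openmp dominates for mnk >= 200000". The only row below that size is won by netlib, consistent with "netlib dominates for mnk < 200000".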