Performance is tracked by timing runs of cubby and bench_fft.
The cubby.data file used for the benchmark:
magnetic Nts 100 out_energy 10
The 2009 trunk is now the yannick_trunk branch.
The 2010 trunk is the current trunk branch, to be compared with the yannick_trunk version or with another version.
The runs were benchmarked at resolutions of 256^3 and 512^3 on different platforms.
For the 2010 trunk branch: set the resolution in the file cubby.cfg:
cube-dim=256 (or 512)
For the yannick_trunk branch: change the resolution in the cubby.hpp file, with NY = NX/(number of processors), then recompile.
// PARAMETERS
#define NPROC 32 // Number of processors
#define NX 256
#define NY 8 // if NPROC > 1, NY*NPROC is the total y size
#define NZ 256
#define NMAX 256 // must be max of NX, NY*NPROC, NZ
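The relation between these parameters can be checked with a small sketch. The helpers below are hypothetical and not part of cubby; they only restate the rules given above (NY = NX/NPROC, and NMAX as the maximum of NX, NY*NPROC and NZ), assuming NX must divide evenly by NPROC.

```cpp
// Hypothetical helpers (not part of cubby): derive the values to put in
// cubby.hpp from the global resolution and the number of processors.
#include <stdexcept>

// Local y size per process: NY = NX / NPROC.
// Assumption: NX must be an exact multiple of NPROC.
constexpr int local_ny(int nx, int nproc) {
    return (nproc > 0 && nx % nproc == 0)
        ? nx / nproc
        : throw std::invalid_argument("NX must be divisible by NPROC");
}

// NMAX must be the maximum of NX, NY*NPROC and NZ.
constexpr int nmax_of(int nx, int total_ny, int nz) {
    return nx > total_ny ? (nx > nz ? nx : nz)
                         : (total_ny > nz ? total_ny : nz);
}

// The values shown above: NPROC 32, NX 256 gives NY 8 and NMAX 256.
static_assert(local_ny(256, 32) == 8, "NY for 256^3 on 32 processors");
static_assert(nmax_of(256, local_ny(256, 32) * 32, 256) == 256, "NMAX for 256^3");
```

Under the same rule, a 512^3 run on 32 processors would use NX 512, NY 16, NZ 512 and NMAX 512.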
Times are given in seconds (CPU time).
Sporadic traces
Some results regarding the clk for future reference.
- On Satch.
- On Jade.
- On Fripp octo 8G.
- On Fripp octo4c_8G.
- On Vargas.
- On Titane.
- On licallo
Benchmark 2011 from version 2109
- On Sting (v1).
- On FrippOcto8G
- On Jade_2011.
- On licallo.
Analysis of the trace: FFT balance with transposition
See the results in the files attached.
Last modified on Oct 28, 2011 5:24:36 PM
Attachments (2)
- benchmark-fft-loop-benchcubby.ods (26.5 KB)
- benchmark-fft-loop-benchcubby.pdf (36.5 KB)