wiki:DevFftwWisdom

FFTW in the Cubby code

The Cubby code needs to calculate a lot of Fourier transforms at each time step. So, the way to compute Fourier transforms plays a very important role in the Cubby code's performances. This short page explains why we have chosen FFTW(Fastest Fourier Transformation in the West) and how is this library used in the Cubby code.

About FFTW

We have chosen FFTW for several reasons, here are the main features of FFTW 3.2.1 :

  • Very fast transforms of purely real input or output data
  • Arbitrary-size transforms
  • Both one-dimensional and multi-dimensional transforms
  • Both C and Fortan interfaces
  • Free software (released under the GNU License)

For more informations about FFTW, see their web page.

How FFTW is used in the Cubby code ?

With FFTW, computing a Fourier transform requires 4 steps:

  1. Allocation of the input and output arrays (fftw-malloc can be used)
    real_type* data_as_real = get_storage_impl().data();
    fftw_complex* data_as_cplx =  reinterpret_cast<fftw_complex*>(get_storage_impl().data());
    
  1. Building of the plan (which depends on the size, the sense of the transform, the planner flag, the input, the output and other parameters)
    fftw_plan p1 = fftw_plan_guru_dft_r2c( 1, &dims_z,1,&howmany_dims_z, data_as_real, data_as_cplx, FFTW_ESTIMATE);
    

  1. Execution of the plan
    fftw_execute(p1);
    

  1. Free up memory
    fftw_destroy_plan(p1);
    

In our code, the class where the Fourier transforms are computed is fftw_engine.cpp, located in the following directory : source:/cubby/branches/work/libcubby/cubby/field/fft/fftw_engine.cpp.

For more informations about the FFTW utilization, see their manual.

Influence of the planner flag

The planner flag control the rigor of the planning process, and can also impose restrictions on the type of transform algorithm that is employed. The different flags are, by increasing order of rigor : FFTW_ESTIMATE, FFTW_MEASURE, FFTW_PATIENT and FFTW_EXHAUSTIVE.

In our case, as the number of Fourier transform is gigantic, we use the flag FFTW_EXHAUSTIVE. Here are some results which shows the influence of the planner flag in the computing time (computing CPU time in seconds, with 10 000 time steps and a cube dimension of 256) :

  • With 4 processors in mpifull4 :

Planner flagVector testScalar test
FFTW_ESTIMATE28 04330 933
FFTW_MEASURE23 65626 406
FFTW_PATIENT21 72524 481
FFTW_EXHAUSTIVE21 53024 135


  • With 16 processors in mpifull4 :

Planner flagVector testScalar test
FFTW_ESTIMATE80588873
FFTW_MEASURE7 5348 420
FFTW_PATIENT7 1457 808
FFTW_EXHAUSTIVE6 5237 139


Note that these tests are located in the following directory: source:/cubby/branches/work/main_bench_fft.cpp. With such a large number of Fourier transforms, we can save a lot of time by using the Exhaustive flag. In this case, the benchmark with the Exhaustive flag can run more than 23% faster than with the Estimate flag.

For more informations, see the documentation about the planner flags.

Speed up the usage with FFTW Wisdom

Thanks to FFTW, the strategy used for the first transform is reused for all the next transforms. Here is the time in seconds spent by FFTW to find the best strategy (planning time) for a cube dimension of 256 with 4 processors in mpifull4 :

Planner flagPlanning time1 FFT computing time (vectorial meth.)1 FFT computing time (scalar meth.)
FFTW_ESTIMATE2.81.551.61
FFTW_MEASURE6.11.291.34
FFTW_PATIENT62.41.191.26
FFTW_EXHAUSTIVE311.81.091.15

To minimize the planning time, FFTW Wisdom can be used. Indeed, this mechanism saves the plans to disk and restores them. In the Cubby code, we don't use this at the moment because all the nodes of the cluster are not necessarily in the same configuration and the plan which is used can be suboptimal.

How to set the planner flag ?

By default, the planner flag used for each Fourier transform is Exhaustive. But it is possible to set this flag manually in adding --fftw-planner [flag name] to the executable files launch command.Note that some abbreviations are allowed (EST, MEA, PAT and EXH) and you can use capital or small letters to set the planner flag.

$ ./cubby --cube-dim 256 --fftw-planner Mea 

For example, this command will execute the Cubby code with a cube dimension of 256 and will use the planner flag FFTW_MEASURE.

Last modified 11 years ago Last modified on Aug 24, 2009 9:29:16 AM