wiki:Vargas

On Vargas (IDRIS) for release 1476, 32 CPUs for a 256^3 grid

****************************************************
max number of allocated scalars: 39
 ---------------------------------------------
   32 Processor run, global<256,256,256>
   for 100 time steps
main timer : (real:0:6:4,user:0:6:2,sys:0:0:0)[clk:36438]
        loop : (real:0:5:29,user:0:5:27,sys:0:0:0)[clk:32930]   (100 calls X 3.293000e+02)
        FFT timer : (real:0:4:23,user:0:4:23,sys:0:0:0)[clk:26308]
                FFT and transposition time : (real:0:4:23,user:0:4:23,sys:0:0:0)[clk:26308]     (512 calls X 5.138281e+01)
                        FFT only : (real:0:0:16,user:0:0:15,sys:0:0:0)[clk:1637]        (2560 calls X 6.394531e-01)
                planification time : (real:0:0:21,user:0:0:21,sys:0:0:0)[clk:2192]
        azur::array timer root : (real:0:0:28,user:0:0:28,sys:0:0:0)[clk:2878]
                view = expr : (real:0:0:28,user:0:0:28,sys:0:0:0)[clk:2878]     (2380 calls X 1.209244e+00)
                        basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:6]  (2 calls X 3.000000e+00)
                        fftw4<cdbl> = cdbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:34]  (106 calls X 3.207547e-01)
                        fftw4<dbl> = dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:1]     (4 calls X 2.500000e-01)
                        fftw3<dbl> = fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]      (9 calls X 0.000000e+00)
                        fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]    (6 calls X 0.000000e+00)
                        fftw3<dbl> = dbl : (real:0:0:1,user:0:0:0,sys:0:0:0)[clk:108]   (516 calls X 2.093023e-01)
                        fftw3<dbl> /= dbl : (real:0:0:2,user:0:0:2,sys:0:0:0)[clk:245]  (618 calls X 3.964401e-01)
                        fftw4<cdbl> = fftw4<cdbl> : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:370]  (406 calls X 9.113300e-01)
                        fftw4<dbl> = fftw4<dbl> : (real:0:0:0,user:0:0:1,sys:0:0:0)[clk:92]     (205 calls X 4.487805e-01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]    (2 calls X 5.000000e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]    (2 calls X 5.000000e+00)
                        fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]      (2 calls X 0.000000e+00)
                        fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]   (2 calls X 0.000000e+00)
                        fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:376]   (100 calls X 3.760000e+00)
                        fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:2,user:0:0:1,sys:0:0:0)[clk:214] (100 calls X 2.140000e+00)
                        fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:4,user:0:0:4,sys:0:0:0)[clk:470]    (100 calls X 4.700000e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:941]      (200 calls X 4.705000e+00)
        cubby::field timer root : (real:0:1:3,user:0:1:2,sys:0:0:0)[clk:6359]
                vector::in_place_curl : (real:0:0:5,user:0:0:5,sys:0:0:0)[clk:518]      (204 calls X 2.539216e+00)
                vector::vec_prod : (real:0:0:43,user:0:0:42,sys:0:0:0)[clk:4353]        (204 calls X 2.133824e+01)
                vector::project : (real:0:0:5,user:0:0:5,sys:0:0:0)[clk:540]    (102 calls X 5.294117e+00)
                scalar::dealias : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:938]    (612 calls X 1.532680e+00)
                scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]        (66 calls X 1.515152e-01)

On Vargas (IDRIS) for release 1481, 32 CPUs for a 256^3 grid

****************************************************
max number of allocated scalars: 39
 ---------------------------------------------
   32 Processor run, global<256,256,256>
   for 100 time steps
main timer : (real:0:5:28,user:0:5:27,sys:0:0:0)[clk:32890]
        loop : (real:0:4:55,user:0:4:53,sys:0:0:0)[clk:29540]   (100 calls X 2.954000e+02)
        FFT timer : (real:0:4:31,user:0:4:29,sys:0:0:0)[clk:27134]
                FFT and transposition time : (real:0:4:31,user:0:4:29,sys:0:0:0)[clk:27134]     (512 calls X 5.299609e+01)
                        FFT only : (real:0:0:15,user:0:0:14,sys:0:0:0)[clk:1530]        (4302 calls X 3.556485e-01)
                planification time : (real:0:0:21,user:0:0:21,sys:0:0:0)[clk:2184]
        azur::array timer root : (real:0:0:31,user:0:0:29,sys:0:0:0)[clk:3115]
                view = expr : (real:0:0:31,user:0:0:29,sys:0:0:0)[clk:3115]     (2380 calls X 1.308824e+00)
                        basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:5]  (2 calls X 2.500000e+00)
                        fftw4<cdbl> id= cdbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:50]        (106 calls X 4.716981e-01)
                        fftw4<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]   (4 calls X 0.000000e+00)
                        fftw3<dbl> id= fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]    (9 calls X 0.000000e+00)
                        fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]    (6 calls X 0.000000e+00)
                        fftw3<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:70]  (516 calls X 1.356589e-01)
                        fftw3<dbl> /= dbl : (real:0:0:2,user:0:0:2,sys:0:0:0)[clk:290]  (618 calls X 4.692557e-01)
                        fftw4<cdbl> id= fftw4<cdbl> : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:570]        (406 calls X 1.403941e+00)
                        fftw4<dbl> id= fftw4<dbl> : (real:0:0:1,user:0:0:1,sys:0:0:0)[clk:120]  (205 calls X 5.853658e-01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]    (2 calls X 5.000000e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]     (2 calls X 0.000000e+00)
                        fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]     (2 calls X 5.000000e+00)
                        fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0]   (2 calls X 0.000000e+00)
                        fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:360]   (100 calls X 3.600000e+00)
                        fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:1,user:0:0:2,sys:0:0:0)[clk:180] (100 calls X 1.800000e+00)
                        fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:520]    (100 calls X 5.200000e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:930]      (200 calls X 4.650000e+00)
        cubby::field timer root : (real:0:4:11,user:0:4:11,sys:0:0:0)[clk:25140]
                scalar::transpose_blocks_when_received : (real:0:3:50,user:0:3:49,sys:0:0:0)[clk:23000] (3072 calls X 7.486979e+00)
                        scalar::copy_transposed : (real:0:0:16,user:0:0:13,sys:0:0:0)[clk:1630] (49152 calls X 3.316243e-02)
                vector::in_place_curl : (real:0:0:4,user:0:0:5,sys:0:0:0)[clk:430]      (204 calls X 2.107843e+00)
                vector::vec_prod : (real:0:0:1,user:0:0:1,sys:0:0:0)[clk:170]   (204 calls X 8.333333e-01)
                vector::project : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:550]    (102 calls X 5.392157e+00)
                scalar::dealias : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:990]    (612 calls X 1.617647e+00)
                scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (66 calls X 0.000000e+00)

On Vargas (IDRIS) for release 1476, 32 CPUs for a 512^3 grid

****************************************************
max number of allocated scalars: 39
 ---------------------------------------------
   32 Processor run, global<512,512,512>
   for 100 time steps
main timer : (real:0:32:52,user:0:32:45,sys:0:0:0)[clk:197267]
        loop : (real:0:28:19,user:0:28:13,sys:0:0:0)[clk:169975]        (100 calls X 1.699750e+03)
        FFT timer : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115003]
                FFT and transposition time : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115003]   (512 calls X 2.246152e+02)
                        FFT only : (real:0:2:54,user:0:2:50,sys:0:0:0)[clk:17400]       (2560 calls X 6.796875e+00)
                planification time : (real:0:3:3,user:0:3:3,sys:0:0:0)[clk:18380]
        azur::array timer root : (real:0:4:0,user:0:3:59,sys:0:0:0)[clk:24060]
                view = expr : (real:0:4:0,user:0:3:59,sys:0:0:0)[clk:24060]     (2380 calls X 1.010924e+01)
                        basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:51] (2 calls X 2.550000e+01)
                        fftw4<cdbl> = cdbl : (real:0:0:4,user:0:0:4,sys:0:0:0)[clk:425] (106 calls X 4.009434e+00)
                        fftw4<dbl> = dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:20]    (4 calls X 5.000000e+00)
                        fftw3<dbl> = fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:30]     (9 calls X 3.333333e+00)
                        fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10]   (6 calls X 1.666667e+00)
                        fftw3<dbl> = dbl : (real:0:0:8,user:0:0:8,sys:0:0:0)[clk:882]   (516 calls X 1.709302e+00)
                        fftw3<dbl> /= dbl : (real:0:0:23,user:0:0:23,sys:0:0:0)[clk:2348]       (618 calls X 3.799353e+00)
                        fftw4<cdbl> = fftw4<cdbl> : (real:0:0:27,user:0:0:26,sys:0:0:0)[clk:2713]       (406 calls X 6.682266e+00)
                        fftw4<dbl> = fftw4<dbl> : (real:0:0:13,user:0:0:13,sys:0:0:0)[clk:1349] (205 calls X 6.580488e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:80]    (2 calls X 4.000000e+01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:60]    (2 calls X 3.000000e+01)
                        fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:40]     (2 calls X 2.000000e+01)
                        fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:20]  (2 calls X 1.000000e+01)
                        fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:31,user:0:0:31,sys:0:0:0)[clk:3176]        (100 calls X 3.176000e+01)
                        fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:15,user:0:0:15,sys:0:0:0)[clk:1512]      (100 calls X 1.512000e+01)
                        fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:37,user:0:0:37,sys:0:0:0)[clk:3794] (100 calls X 3.794000e+01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:1:15,user:0:1:14,sys:0:0:0)[clk:7550]   (200 calls X 3.775000e+01)
        cubby::field timer root : (real:0:9:2,user:0:9:1,sys:0:0:0)[clk:54203]
                vector::in_place_curl : (real:0:0:58,user:0:0:58,sys:0:0:0)[clk:5822]   (204 calls X 2.853922e+01)
                vector::vec_prod : (real:0:6:6,user:0:6:6,sys:0:0:0)[clk:36685] (204 calls X 1.798284e+02)
                vector::project : (real:0:0:41,user:0:0:42,sys:0:0:0)[clk:4181] (102 calls X 4.099020e+01)

On Vargas (IDRIS) for release 1481, 32 CPUs for a 512^3 grid

max number of allocated scalars: 39
 ---------------------------------------------
   32 Processor run, global<512,512,512>
   for 100 time steps
main timer : (real:0:27:31,user:0:27:24,sys:0:0:0)[clk:165144]
        loop : (real:0:23:7,user:0:23:1,sys:0:0:0)[clk:138760]  (100 calls X 1.387600e+03)
        FFT timer : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115098]
                FFT and transposition time : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115098]   (512 calls X 2.248008e+02)
                        FFT only : (real:0:2:42,user:0:2:40,sys:0:0:0)[clk:16246]       (4302 calls X 3.776383e+00)
                planification time : (real:0:2:57,user:0:2:56,sys:0:0:0)[clk:17767]
        azur::array timer root : (real:0:4:5,user:0:4:5,sys:0:0:0)[clk:24570]
                view = expr : (real:0:4:5,user:0:4:5,sys:0:0:0)[clk:24570]      (2380 calls X 1.032353e+01)
                        basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:50] (2 calls X 2.500000e+01)
                        fftw4<cdbl> id= cdbl : (real:0:0:3,user:0:0:4,sys:0:0:0)[clk:396]       (106 calls X 3.735849e+00)
                        fftw4<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:13]  (4 calls X 3.250000e+00)
                        fftw3<dbl> id= fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:17]   (9 calls X 1.888889e+00)
                        fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:8]    (6 calls X 1.333333e+00)
                        fftw3<dbl> id= dbl : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:900] (516 calls X 1.744186e+00)
                        fftw3<dbl> /= dbl : (real:0:0:22,user:0:0:22,sys:0:0:0)[clk:2294]       (618 calls X 3.711974e+00)
                        fftw4<cdbl> id= fftw4<cdbl> : (real:0:0:36,user:0:0:36,sys:0:0:0)[clk:3659]     (406 calls X 9.012315e+00)
                        fftw4<dbl> id= fftw4<dbl> : (real:0:0:12,user:0:0:13,sys:0:0:0)[clk:1275]       (205 calls X 6.219512e+00)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:78]    (2 calls X 3.900000e+01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:56]    (2 calls X 2.800000e+01)
                        fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:30]     (2 calls X 1.500000e+01)
                        fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:33]  (2 calls X 1.650000e+01)
                        fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:29,user:0:0:29,sys:0:0:0)[clk:2995]        (100 calls X 2.995000e+01)
                        fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:16,user:0:0:16,sys:0:0:0)[clk:1601]      (100 calls X 1.601000e+01)
                        fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:37,user:0:0:37,sys:0:0:0)[clk:3751] (100 calls X 3.751000e+01)
                        fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:1:14,user:0:1:14,sys:0:0:0)[clk:7413]   (200 calls X 3.706500e+01)
        cubby::field timer root : (real:0:16:32,user:0:16:29,sys:0:0:0)[clk:99269]
                scalar::transpose_blocks_when_received : (real:0:12:55,user:0:12:51,sys:0:0:0)[clk:77548]       (3072 calls X 2.524349e+01)
                        scalar::copy_transposed : (real:0:2:16,user:0:2:3,sys:0:0:0)[clk:13696] (49152 calls X 2.786458e-01)
                vector::in_place_curl : (real:0:0:57,user:0:0:57,sys:0:0:0)[clk:5761]   (204 calls X 2.824020e+01)
                vector::vec_prod : (real:0:0:42,user:0:0:43,sys:0:0:0)[clk:4273]        (204 calls X 2.094608e+01)
                vector::project : (real:0:0:41,user:0:0:41,sys:0:0:0)[clk:4151] (102 calls X 4.069608e+01)
                scalar::dealias : (real:0:1:14,user:0:1:14,sys:0:0:0)[clk:7468] (612 calls X 1.220261e+01)
                scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:67]        (66 calls X 1.015152e+00)
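
The timer lines in these logs all appear to share one layout: a name, the real/user/sys times as h:m:s, a clk count, and optionally a call count with a per-call average. Judging only from the numbers shown (for instance real 0:6:4 next to clk 36438), clk looks like hundredths of a second, and the per-call figure looks like clk divided by the number of calls. The Python sketch below parses one such line under those assumptions; the regular expression and the function names are illustrative and are not part of the benchmarked code.

```python
import re

# Assumed layout, inferred from the logs on this page:
#   name : (real:H:M:S,user:H:M:S,sys:H:M:S)[clk:N]   (C calls X A)
# The trailing "(C calls X A)" part is optional.
TIMER_RE = re.compile(
    r"^\s*(?P<name>.+?) : "
    r"\(real:(?P<real>\d+:\d+:\d+),"
    r"user:(?P<user>\d+:\d+:\d+),"
    r"sys:(?P<sys>\d+:\d+:\d+)\)"
    r"\[clk:(?P<clk>\d+)\]"
    r"(?:\s*\((?P<calls>\d+) calls X (?P<avg>[0-9.eE+-]+)\))?\s*$"
)


def hms_to_seconds(hms):
    """Convert an 'H:M:S' string such as '0:6:4' to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def parse_timer_line(line):
    """Return a dict for one timer line, or None if the line does not match."""
    match = TIMER_RE.match(line)
    if match is None:
        return None
    calls = match.group("calls")
    avg = match.group("avg")
    return {
        "name": match.group("name"),
        "real_s": hms_to_seconds(match.group("real")),
        "user_s": hms_to_seconds(match.group("user")),
        "sys_s": hms_to_seconds(match.group("sys")),
        "clk": int(match.group("clk")),
        "calls": int(calls) if calls is not None else None,
        "clk_per_call": float(avg) if avg is not None else None,
    }


def relative_saving(clk_before, clk_after):
    """Fraction of clk saved going from one run to another."""
    return (clk_before - clk_after) / clk_before


if __name__ == "__main__":
    sample = ("        loop : (real:0:5:29,user:0:5:27,sys:0:0:0)"
              "[clk:32930]   (100 calls X 3.293000e+02)")
    print(parse_timer_line(sample))
    # Main-timer clk values of the two 256^3 runs (releases 1476 and 1481).
    print(relative_saving(36438, 32890))
```

Parsing both logs of a given grid size this way makes it possible to line up entries by name and compare their clk values between releases 1476 and 1481.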

Last modified on Aug 17, 2010 6:20:29 PM