On Vargas (IDRIS) for release 1476, 32 CPUs for a 256^3 grid
****************************************************
max number of allocated scalars: 39
---------------------------------------------
32 Processor run, global<256,256,256> for 100 time steps
main timer : (real:0:6:4,user:0:6:2,sys:0:0:0)[clk:36438]
loop : (real:0:5:29,user:0:5:27,sys:0:0:0)[clk:32930] (100 calls X 3.293000e+02)
FFT timer : (real:0:4:23,user:0:4:23,sys:0:0:0)[clk:26308]
FFT and transposition time : (real:0:4:23,user:0:4:23,sys:0:0:0)[clk:26308] (512 calls X 5.138281e+01)
FFT only : (real:0:0:16,user:0:0:15,sys:0:0:0)[clk:1637] (2560 calls X 6.394531e-01)
planification time : (real:0:0:21,user:0:0:21,sys:0:0:0)[clk:2192]
azur::array timer root : (real:0:0:28,user:0:0:28,sys:0:0:0)[clk:2878]
view = expr : (real:0:0:28,user:0:0:28,sys:0:0:0)[clk:2878] (2380 calls X 1.209244e+00)
basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:6] (2 calls X 3.000000e+00)
fftw4<cdbl> = cdbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:34] (106 calls X 3.207547e-01)
fftw4<dbl> = dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:1] (4 calls X 2.500000e-01)
fftw3<dbl> = fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (9 calls X 0.000000e+00)
fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (6 calls X 0.000000e+00)
fftw3<dbl> = dbl : (real:0:0:1,user:0:0:0,sys:0:0:0)[clk:108] (516 calls X 2.093023e-01)
fftw3<dbl> /= dbl : (real:0:0:2,user:0:0:2,sys:0:0:0)[clk:245] (618 calls X 3.964401e-01)
fftw4<cdbl> = fftw4<cdbl> : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:370] (406 calls X 9.113300e-01)
fftw4<dbl> = fftw4<dbl> : (real:0:0:0,user:0:0:1,sys:0:0:0)[clk:92] (205 calls X 4.487805e-01)
fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (2 calls X 5.000000e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (2 calls X 5.000000e+00)
fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (2 calls X 0.000000e+00)
fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (2 calls X 0.000000e+00)
fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:376] (100 calls X 3.760000e+00)
fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:2,user:0:0:1,sys:0:0:0)[clk:214] (100 calls X 2.140000e+00)
fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:4,user:0:0:4,sys:0:0:0)[clk:470] (100 calls X 4.700000e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:941] (200 calls X 4.705000e+00)
cubby::field timer root : (real:0:1:3,user:0:1:2,sys:0:0:0)[clk:6359]
vector::in_place_curl : (real:0:0:5,user:0:0:5,sys:0:0:0)[clk:518] (204 calls X 2.539216e+00)
vector::vec_prod : (real:0:0:43,user:0:0:42,sys:0:0:0)[clk:4353] (204 calls X 2.133824e+01)
vector::project : (real:0:0:5,user:0:0:5,sys:0:0:0)[clk:540] (102 calls X 5.294117e+00)
scalar::dealias : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:938] (612 calls X 1.532680e+00)
scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (66 calls X 1.515152e-01)
On Vargas (IDRIS) for release 1481, 32 CPUs for a 256^3 grid
****************************************************
max number of allocated scalars: 39
---------------------------------------------
32 Processor run, global<256,256,256> for 100 time steps
main timer : (real:0:5:28,user:0:5:27,sys:0:0:0)[clk:32890]
loop : (real:0:4:55,user:0:4:53,sys:0:0:0)[clk:29540] (100 calls X 2.954000e+02)
FFT timer : (real:0:4:31,user:0:4:29,sys:0:0:0)[clk:27134]
FFT and transposition time : (real:0:4:31,user:0:4:29,sys:0:0:0)[clk:27134] (512 calls X 5.299609e+01)
FFT only : (real:0:0:15,user:0:0:14,sys:0:0:0)[clk:1530] (4302 calls X 3.556485e-01)
planification time : (real:0:0:21,user:0:0:21,sys:0:0:0)[clk:2184]
azur::array timer root : (real:0:0:31,user:0:0:29,sys:0:0:0)[clk:3115]
view = expr : (real:0:0:31,user:0:0:29,sys:0:0:0)[clk:3115] (2380 calls X 1.308824e+00)
basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:5] (2 calls X 2.500000e+00)
fftw4<cdbl> id= cdbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:50] (106 calls X 4.716981e-01)
fftw4<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (4 calls X 0.000000e+00)
fftw3<dbl> id= fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (9 calls X 0.000000e+00)
fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (6 calls X 0.000000e+00)
fftw3<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:70] (516 calls X 1.356589e-01)
fftw3<dbl> /= dbl : (real:0:0:2,user:0:0:2,sys:0:0:0)[clk:290] (618 calls X 4.692557e-01)
fftw4<cdbl> id= fftw4<cdbl> : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:570] (406 calls X 1.403941e+00)
fftw4<dbl> id= fftw4<dbl> : (real:0:0:1,user:0:0:1,sys:0:0:0)[clk:120] (205 calls X 5.853658e-01)
fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (2 calls X 5.000000e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (2 calls X 0.000000e+00)
fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (2 calls X 5.000000e+00)
fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (2 calls X 0.000000e+00)
fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:3,user:0:0:3,sys:0:0:0)[clk:360] (100 calls X 3.600000e+00)
fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:1,user:0:0:2,sys:0:0:0)[clk:180] (100 calls X 1.800000e+00)
fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:520] (100 calls X 5.200000e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:930] (200 calls X 4.650000e+00)
cubby::field timer root : (real:0:4:11,user:0:4:11,sys:0:0:0)[clk:25140]
scalar::transpose_blocks_when_received : (real:0:3:50,user:0:3:49,sys:0:0:0)[clk:23000] (3072 calls X 7.486979e+00)
scalar::copy_transposed : (real:0:0:16,user:0:0:13,sys:0:0:0)[clk:1630] (49152 calls X 3.316243e-02)
vector::in_place_curl : (real:0:0:4,user:0:0:5,sys:0:0:0)[clk:430] (204 calls X 2.107843e+00)
vector::vec_prod : (real:0:0:1,user:0:0:1,sys:0:0:0)[clk:170] (204 calls X 8.333333e-01)
vector::project : (real:0:0:5,user:0:0:4,sys:0:0:0)[clk:550] (102 calls X 5.392157e+00)
scalar::dealias : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:990] (612 calls X 1.617647e+00)
scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:0] (66 calls X 0.000000e+00)
On Vargas (IDRIS) for release 1476, 32 CPUs for a 512^3 grid
****************************************************
max number of allocated scalars: 39
---------------------------------------------
32 Processor run, global<512,512,512> for 100 time steps
main timer : (real:0:32:52,user:0:32:45,sys:0:0:0)[clk:197267]
loop : (real:0:28:19,user:0:28:13,sys:0:0:0)[clk:169975] (100 calls X 1.699750e+03)
FFT timer : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115003]
FFT and transposition time : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115003] (512 calls X 2.246152e+02)
FFT only : (real:0:2:54,user:0:2:50,sys:0:0:0)[clk:17400] (2560 calls X 6.796875e+00)
planification time : (real:0:3:3,user:0:3:3,sys:0:0:0)[clk:18380]
azur::array timer root : (real:0:4:0,user:0:3:59,sys:0:0:0)[clk:24060]
view = expr : (real:0:4:0,user:0:3:59,sys:0:0:0)[clk:24060] (2380 calls X 1.010924e+01)
basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:51] (2 calls X 2.550000e+01)
fftw4<cdbl> = cdbl : (real:0:0:4,user:0:0:4,sys:0:0:0)[clk:425] (106 calls X 4.009434e+00)
fftw4<dbl> = dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:20] (4 calls X 5.000000e+00)
fftw3<dbl> = fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:30] (9 calls X 3.333333e+00)
fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:10] (6 calls X 1.666667e+00)
fftw3<dbl> = dbl : (real:0:0:8,user:0:0:8,sys:0:0:0)[clk:882] (516 calls X 1.709302e+00)
fftw3<dbl> /= dbl : (real:0:0:23,user:0:0:23,sys:0:0:0)[clk:2348] (618 calls X 3.799353e+00)
fftw4<cdbl> = fftw4<cdbl> : (real:0:0:27,user:0:0:26,sys:0:0:0)[clk:2713] (406 calls X 6.682266e+00)
fftw4<dbl> = fftw4<dbl> : (real:0:0:13,user:0:0:13,sys:0:0:0)[clk:1349] (205 calls X 6.580488e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:80] (2 calls X 4.000000e+01)
fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:60] (2 calls X 3.000000e+01)
fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:40] (2 calls X 2.000000e+01)
fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:20] (2 calls X 1.000000e+01)
fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:31,user:0:0:31,sys:0:0:0)[clk:3176] (100 calls X 3.176000e+01)
fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:15,user:0:0:15,sys:0:0:0)[clk:1512] (100 calls X 1.512000e+01)
fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:37,user:0:0:37,sys:0:0:0)[clk:3794] (100 calls X 3.794000e+01)
fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:1:15,user:0:1:14,sys:0:0:0)[clk:7550] (200 calls X 3.775000e+01)
cubby::field timer root : (real:0:9:2,user:0:9:1,sys:0:0:0)[clk:54203]
vector::in_place_curl : (real:0:0:58,user:0:0:58,sys:0:0:0)[clk:5822] (204 calls X 2.853922e+01)
vector::vec_prod : (real:0:6:6,user:0:6:6,sys:0:0:0)[clk:36685] (204 calls X 1.798284e+02)
vector::project : (real:0:0:41,user:0:0:42,sys:0:0:0)[clk:4181] (102 calls X 4.099020e+01)
On Vargas (IDRIS) for release 1481, 32 CPUs for a 512^3 grid
max number of allocated scalars: 39
---------------------------------------------
32 Processor run, global<512,512,512> for 100 time steps
main timer : (real:0:27:31,user:0:27:24,sys:0:0:0)[clk:165144]
loop : (real:0:23:7,user:0:23:1,sys:0:0:0)[clk:138760] (100 calls X 1.387600e+03)
FFT timer : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115098]
FFT and transposition time : (real:0:19:10,user:0:19:4,sys:0:0:0)[clk:115098] (512 calls X 2.248008e+02)
FFT only : (real:0:2:42,user:0:2:40,sys:0:0:0)[clk:16246] (4302 calls X 3.776383e+00)
planification time : (real:0:2:57,user:0:2:56,sys:0:0:0)[clk:17767]
azur::array timer root : (real:0:4:5,user:0:4:5,sys:0:0:0)[clk:24570]
view = expr : (real:0:4:5,user:0:4:5,sys:0:0:0)[clk:24570] (2380 calls X 1.032353e+01)
basic3<flt> id= (basic3<int> + flt) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:50] (2 calls X 2.500000e+01)
fftw4<cdbl> id= cdbl : (real:0:0:3,user:0:0:4,sys:0:0:0)[clk:396] (106 calls X 3.735849e+00)
fftw4<dbl> id= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:13] (4 calls X 3.250000e+00)
fftw3<dbl> id= fftw3<dbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:17] (9 calls X 1.888889e+00)
fftw3<dbl> *= dbl : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:8] (6 calls X 1.333333e+00)
fftw3<dbl> id= dbl : (real:0:0:9,user:0:0:9,sys:0:0:0)[clk:900] (516 calls X 1.744186e+00)
fftw3<dbl> /= dbl : (real:0:0:22,user:0:0:22,sys:0:0:0)[clk:2294] (618 calls X 3.711974e+00)
fftw4<cdbl> id= fftw4<cdbl> : (real:0:0:36,user:0:0:36,sys:0:0:0)[clk:3659] (406 calls X 9.012315e+00)
fftw4<dbl> id= fftw4<dbl> : (real:0:0:12,user:0:0:13,sys:0:0:0)[clk:1275] (205 calls X 6.219512e+00)
fftw4<cdbl> id= (s2v<basic3<dbl>> * ((fftw4<cdbl> * dbl) + fftw4<cdbl>)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:78] (2 calls X 3.900000e+01)
fftw4<cdbl> id= (s2v<basic3<dbl>> * (fftw4<cdbl> * dbl)) : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:56] (2 calls X 2.800000e+01)
fftw4<cdbl> *= s2v<basic3<dbl>> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:30] (2 calls X 1.500000e+01)
fftw4<cdbl> += fftw4<cdbl> : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:33] (2 calls X 1.650000e+01)
fftw4<cdbl> id= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:29,user:0:0:29,sys:0:0:0)[clk:2995] (100 calls X 2.995000e+01)
fftw4<cdbl> -= fftw4<cdbl> : (real:0:0:16,user:0:0:16,sys:0:0:0)[clk:1601] (100 calls X 1.601000e+01)
fftw4<cdbl> -= ((fftw4<cdbl> * dbl) * s2v<basic3<dbl>>) : (real:0:0:37,user:0:0:37,sys:0:0:0)[clk:3751] (100 calls X 3.751000e+01)
fftw4<cdbl> id= (s2v<basic3<dbl>> swp(*) (fftw4<cdbl> + (fftw4<cdbl> * dbl))) : (real:0:1:14,user:0:1:14,sys:0:0:0)[clk:7413] (200 calls X 3.706500e+01)
cubby::field timer root : (real:0:16:32,user:0:16:29,sys:0:0:0)[clk:99269]
scalar::transpose_blocks_when_received : (real:0:12:55,user:0:12:51,sys:0:0:0)[clk:77548] (3072 calls X 2.524349e+01)
scalar::copy_transposed : (real:0:2:16,user:0:2:3,sys:0:0:0)[clk:13696] (49152 calls X 2.786458e-01)
vector::in_place_curl : (real:0:0:57,user:0:0:57,sys:0:0:0)[clk:5761] (204 calls X 2.824020e+01)
vector::vec_prod : (real:0:0:42,user:0:0:43,sys:0:0:0)[clk:4273] (204 calls X 2.094608e+01)
vector::project : (real:0:0:41,user:0:0:41,sys:0:0:0)[clk:4151] (102 calls X 4.069608e+01)
scalar::dealias : (real:0:1:14,user:0:1:14,sys:0:0:0)[clk:7468] (612 calls X 1.220261e+01)
scalar::local_energy : (real:0:0:0,user:0:0:0,sys:0:0:0)[clk:67] (66 calls X 1.015152e+00)
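To compare two releases timer by timer, a single entry of the form `name : (real:h:m:s,user:h:m:s,sys:h:m:s)[clk:N] (K calls X avg)` can be parsed with a small script. The following is only a sketch: the function names `parse_timer` and `relative_change` are hypothetical helpers, not part of the benchmarked code, and the regular expression is inferred from the dumps on this page rather than from the timer's source. Judging from the dumps (for example `real:0:6:4` next to `clk:36438`), the clk values appear to be hundredths of a second, and the number after `X` appears to be clk divided by the call count.

```python
import re

# Pattern inferred from the dumps on this page (an assumption, not a spec):
#   name : (real:h:m:s,user:h:m:s,sys:h:m:s)[clk:N] (K calls X avg)
# The trailing "(K calls X avg)" part is absent on some entries.
TIMER_RE = re.compile(
    r"^(?P<name>.+?)\s*:\s*"
    r"\(real:(?P<real>\d+:\d+:\d+),"
    r"user:(?P<user>\d+:\d+:\d+),"
    r"sys:(?P<sys>\d+:\d+:\d+)\)"
    r"\[clk:(?P<clk>\d+)\]"
    r"(?:\s*\((?P<calls>\d+) calls X (?P<avg>[0-9.eE+-]+)\))?\s*$"
)


def parse_timer(entry):
    """Parse one timer entry into a dict (hypothetical helper)."""
    m = TIMER_RE.match(entry.strip())
    if m is None:
        raise ValueError(f"not a timer entry: {entry!r}")
    return {
        "name": m.group("name"),
        "real": m.group("real"),
        "clk": int(m.group("clk")),
        # calls/avg are None when the entry has no "(K calls X avg)" suffix
        "calls": int(m.group("calls")) if m.group("calls") else None,
        "avg": float(m.group("avg")) if m.group("avg") else None,
    }


def relative_change(old_clk, new_clk):
    """Fractional change from old to new; negative means the new run is faster."""
    return (new_clk - old_clk) / old_clk


if __name__ == "__main__":
    # The two "loop" entries of the 256^3 runs, copied from this page.
    old = parse_timer(
        "loop : (real:0:5:29,user:0:5:27,sys:0:0:0)[clk:32930] "
        "(100 calls X 3.293000e+02)"
    )
    new = parse_timer(
        "loop : (real:0:4:55,user:0:4:53,sys:0:0:0)[clk:29540] "
        "(100 calls X 2.954000e+02)"
    )
    print(f"{old['name']}: {relative_change(old['clk'], new['clk']):+.1%}")
```

Applied to the loop timers, this gives a reduction of about 10% between releases 1476 and 1481 for the 256^3 run (32930 vs. 29540 clk) and about 18% for the 512^3 run (169975 vs. 138760 clk).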
Last modified on Aug 17, 2010 6:20:29 PM