Trying to pass MPI derived types between processors (and failing)


I am trying to parallelize a customer's Fortran code with MPI. f is an array of 4-byte reals dimensioned f(dimx,dimy,dimz,dimf). I need the various processes to work on different parts of the array's first dimension. (I would rather have started with the last, but it wasn't up to me.) So I define a derived type mpi_x_interface like so:

    call mpi_type_vector(dimy*dimz*dimf, 1, dimx, mpi_real,  &
                         mpi_x_interface, mpi_err)
    call mpi_type_commit(mpi_x_interface, mpi_err)

My intent is that a single mpi_x_interface will contain all of the data in 'f' at a given first index "i". That is, for a given i, it should contain f(i,:,:,:). (Note that at this stage of the game, all procs have a complete copy of f. I intend to eventually split f up between the procs, except that I want proc 0 to keep a full copy for the purpose of gathering.)

ptsinproc is an array containing the number of "i" indices handled by each proc. x_slab_displs is the displacement from the beginning of the array for each proc. For the 2 procs I am testing on, ptsinproc=(/61,60/) and x_slab_displs=(/0,61/). myminpt is a simple integer giving the minimum index handled in each proc.
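For reference, the displacements are just a running sum of the counts. The following is a minimal sketch, not the customer's code: the module and function names are made up for illustration, and it assumes the arrays are indexed by rank starting at 0. The post does not show the declarations, and since myrank is 0-based, ptsinproc(myrank) only picks the right entry if the arrays are declared that way.

```fortran
module slab_layout
  implicit none
contains
  ! Displacement of each rank's first slab, counted in slabs from the
  ! start of the array: a running sum of the preceding ranks' counts.
  ! Hypothetical helper; counts is indexed by rank, starting at 0.
  pure function slab_displs(counts) result(displs)
    integer, intent(in) :: counts(0:)
    integer :: displs(0:size(counts)-1)
    integer :: p
    if (size(counts) == 0) return
    displs(0) = 0
    do p = 1, size(counts) - 1
      displs(p) = displs(p-1) + counts(p-1)
    end do
  end function slab_displs
end module slab_layout
```

For the two-proc test case, slab_displs((/61,60/)) gives (/0,61/), matching the values above. Under the same assumption, myminpt would presumably be x_slab_displs(myrank) + 1.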

So now I want to gather all of f onto proc 0, and I run:

          if (myrank == 0) then
            call mpi_gatherv(mpi_in_place, ptsinproc(myrank),
         +                   mpi_x_interface, f(1,1,1,1), ptsinproc,
         +                   x_slab_displs, mpi_x_interface, 0,
         +                   mpi_comm_world, mpi_err)
          else
            call mpi_gatherv(f(myminpt,1,1,1), ptsinproc(myrank),
         +                   mpi_x_interface, f(1,1,1,1), ptsinproc,
         +                   x_slab_displs, mpi_x_interface, 0,
         +                   mpi_comm_world, mpi_err)
          endif

I can send at most one "slab" like this. If I try to send the entire 60 "slabs" from proc 1 to proc 0, I get a seg fault due to an "invalid memory reference". By the way, even when I send that single slab, the data winds up in the wrong places.

I've checked the obvious stuff, like making sure myrank, ptsinproc, and x_slab_displs are what they should be on all procs. I've looked into the difference between "size" and "extent" and so on, to no avail. I'm at my wit's end. I just don't see what I am doing wrong. And some of you might remember that I asked a similar (but different!) question a few months back. I admit I'm just not getting it. Your patience is appreciated.

First off, I want to say that the reason you're running into so many problems is that you are trying to split up the first (fastest) axis. This is not recommended at all, because as-is, packing your mpi_x_interface requires a lot of non-contiguous memory accesses. We're talking about a huge loss in performance.

Splitting the slowest axis across MPI processes is a much better strategy. I would highly recommend transposing your 4D matrix so that the x axis is last, if you can.
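If a one-off copy is affordable, one way to do that transposition is the reshape intrinsic with an order argument. This is only a sketch under that assumption: the module and function names are made up, and it allocates a second array the size of f, which may not be acceptable for a large field.

```fortran
module axis_reorder
  implicit none
contains
  ! Return g with g(j,k,l,i) = f(i,j,k,l), so that x becomes the last
  ! (slowest-varying) axis and each x-slab is contiguous in memory.
  ! Hypothetical helper; makes a full copy of f.
  function x_axis_last(f) result(g)
    real, intent(in) :: f(:,:,:,:)
    real :: g(size(f,2), size(f,3), size(f,4), size(f,1))
    ! order=[4,1,2,3]: the result's 4th subscript varies fastest while the
    ! source is read in its normal order, where x is the fastest subscript.
    g = reshape(f, shape(g), order=[4, 1, 2, 3])
  end function x_axis_last
end module axis_reorder
```

With that layout, g(:,:,:,i) is one contiguous block of dimy*dimz*dimf reals, so the slabs could be gathered with plain mpi_real counts and displacements (each multiplied by dimy*dimz*dimf) and no derived datatype at all.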

Now to your actual problem(s)...

Derived datatypes

As you have deduced, one problem is that the size and extent of your derived datatype might be incorrect. Let's simplify your problem a bit so I can draw a picture. Say dimy*dimz*dimf=3, and dimx=4. As-is, your datatype mpi_x_interface describes the following data in memory:

| x |   |   |   | x |   |   |   | x |   |   |   | 

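That is, every 4th mpi_real, and 3 of them in total. Seeing as this is what you want, so far so good: the size of your datatype is correct. However, if you try to send "the next" mpi_x_interface, your MPI implementation starts it right where the first one ends, just after the last x. The first element of that second copy lands in the wrong place, and the later ones fall past the end of f (memory which in your case has not been allocated), so MPI throws an "invalid memory access" at you:

                                       tries to access this and bombs
                                                     vvv
| x |   |   |   | x |   |   |   | x | y |   |   |   | y |   |   |   | y | ...
                                                ^ end of f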

What you need to do is tell MPI, as part of your datatype, that "the next" mpi_x_interface starts just 1 real further into the array. This is accomplished by redefining the "extent" of your derived datatype by calling mpi_type_create_resized(). In your case, you need to write:

    integer :: mpi_x_interface, mpi_x_interface_resized
    integer, parameter :: sizeof_real = 4 ! or whatever the element size of f is
    ! the lower bound and extent must be address-sized integers
    integer(kind=mpi_address_kind) :: lb, extent

    call mpi_type_vector(dimy*dimz*dimf, 1, dimx, mpi_real,  &
                         mpi_x_interface, mpi_err)
    lb = 0
    extent = 1*sizeof_real
    call mpi_type_create_resized(mpi_x_interface, lb, extent, &
                                 mpi_x_interface_resized, mpi_err)
    call mpi_type_commit(mpi_x_interface_resized, mpi_err)

Then, asking for 3 consecutive mpi_x_interface_resized will result in:

| x | y | z |   | x | y | z |   | x | y | z |   |

as expected.
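If you want to convince yourself of what the resize does, you can ask MPI for the extents directly. Below is a small stand-alone sketch using the simplified numbers from the picture (dimx=4 and 3 elements per slab). The program name and the nslices constant are mine, and I am assuming the mpi module is available; with include 'mpif.h' the calls are the same.

```fortran
program check_extent
  use mpi
  implicit none
  ! nslices stands in for dimy*dimz*dimf in the simplified picture
  integer, parameter :: dimx = 4, nslices = 3
  integer :: vec_type, resized_type, real_size, mpi_err
  integer(kind=mpi_address_kind) :: lb, extent

  call mpi_init(mpi_err)
  call mpi_type_size(mpi_real, real_size, mpi_err)

  ! The plain vector type: its extent runs from the first to the last element.
  call mpi_type_vector(nslices, 1, dimx, mpi_real, vec_type, mpi_err)
  call mpi_type_get_extent(vec_type, lb, extent, mpi_err)
  print *, 'vector extent in reals: ', extent / real_size

  ! The resized type: same elements, but "the next" one starts 1 real later.
  lb = 0
  extent = real_size
  call mpi_type_create_resized(vec_type, lb, extent, resized_type, mpi_err)
  call mpi_type_commit(resized_type, mpi_err)
  call mpi_type_get_extent(resized_type, lb, extent, mpi_err)
  print *, 'resized extent in reals:', extent / real_size

  call mpi_type_free(resized_type, mpi_err)
  call mpi_type_free(vec_type, mpi_err)
  call mpi_finalize(mpi_err)
end program check_extent
```

Run on a single process, I would expect this to report 9 reals for the plain vector, that is (3-1)*4+1, and 1 real for the resized type.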

MPI_Gatherv

Note that now that you have correctly defined the extent of your datatype, calling mpi_gatherv with offsets in terms of your datatype should work as expected.

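Personally, I wouldn't think there is any need to try fancy logic with mpi_in_place for this collective operation. You can simply set myminpt=1 on myrank==0. One caveat: the send and receive buffers on rank 0 then overlap, which the MPI standard does not allow and which some implementations reject at run time; if yours does, keep the mpi_in_place branch on the root. Otherwise, you can make the same call on every rank: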

          call mpi_gatherv(f(myminpt,1,1,1), ptsinproc(myrank),
         +                 mpi_x_interface_resized, f, ptsinproc,
         +                 x_slab_displs, mpi_x_interface_resized, 0,
         +                 mpi_comm_world, mpi_err)
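To check the whole thing end to end, here is a small self-contained sketch you can run on a few ranks. It is not your code: the dimensions, the even split of slabs, and the separate receive array fall are my assumptions. Receiving into fall rather than into f itself sidesteps the question of overlapping buffers on the root. It assumes the mpi module and no more ranks than dimx.

```fortran
program gather_x_slabs
  use mpi
  implicit none
  integer, parameter :: dimx = 8, dimy = 2, dimz = 2, dimf = 2
  real :: f(dimx,dimy,dimz,dimf), fall(dimx,dimy,dimz,dimf)
  integer, allocatable :: ptsinproc(:), x_slab_displs(:)
  integer :: myrank, nprocs, p, i, myminpt, real_size, mpi_err
  integer :: mpi_x_interface, mpi_x_interface_resized
  integer(kind=mpi_address_kind) :: lb, extent

  call mpi_init(mpi_err)
  call mpi_comm_rank(mpi_comm_world, myrank, mpi_err)
  call mpi_comm_size(mpi_comm_world, nprocs, mpi_err)

  ! Split the dimx slabs as evenly as possible (assumes nprocs <= dimx).
  ! Both arrays are indexed by rank, starting at 0.
  allocate(ptsinproc(0:nprocs-1), x_slab_displs(0:nprocs-1))
  do p = 0, nprocs - 1
    ptsinproc(p) = dimx / nprocs
    if (p < mod(dimx, nprocs)) ptsinproc(p) = ptsinproc(p) + 1
  end do
  x_slab_displs(0) = 0
  do p = 1, nprocs - 1
    x_slab_displs(p) = x_slab_displs(p-1) + ptsinproc(p-1)
  end do
  myminpt = x_slab_displs(myrank) + 1

  ! Each rank fills only the slabs it owns with a recognisable value.
  f = -1.0
  fall = -1.0
  do i = myminpt, myminpt + ptsinproc(myrank) - 1
    f(i,:,:,:) = real(i)
  end do

  ! One element every dimx reals, with the extent shrunk to one real.
  call mpi_type_size(mpi_real, real_size, mpi_err)
  call mpi_type_vector(dimy*dimz*dimf, 1, dimx, mpi_real, &
                       mpi_x_interface, mpi_err)
  lb = 0
  extent = real_size
  call mpi_type_create_resized(mpi_x_interface, lb, extent, &
                               mpi_x_interface_resized, mpi_err)
  call mpi_type_commit(mpi_x_interface_resized, mpi_err)

  call mpi_gatherv(f(myminpt,1,1,1), ptsinproc(myrank), &
                   mpi_x_interface_resized, fall, ptsinproc, &
                   x_slab_displs, mpi_x_interface_resized, 0, &
                   mpi_comm_world, mpi_err)

  ! On the root, slab i should now hold the value i everywhere.
  if (myrank == 0) then
    do i = 1, dimx
      if (any(fall(i,:,:,:) /= real(i))) then
        print *, 'slab', i, 'landed in the wrong place'
        call mpi_abort(mpi_comm_world, 1, mpi_err)
      end if
    end do
    print *, 'all slabs gathered correctly'
  end if

  call mpi_type_free(mpi_x_interface_resized, mpi_err)
  call mpi_type_free(mpi_x_interface, mpi_err)
  call mpi_finalize(mpi_err)
end program gather_x_slabs
```

If you swap mpi_x_interface_resized back to the plain mpi_x_interface in the gather, the same program should reproduce the misplaced data or the crash you are seeing, which makes it a handy test bed before touching the customer's code.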
