[mpich-discuss] MPI_Recv: MPI + Pthreads
Olya Krachina
okrachin at purdue.edu
Tue Apr 22 21:00:37 CDT 2008
Hello list,
I am fairly new to MPI and Pthreads, and I am trying to understand MPI_Recv.
I am trying to write a parallel matrix multiply (A x B = C, square matrices
with power-of-two dimensions). I am using 3 machines with 8 cores each. My
program should create MPI processes (the -np option of mpiexec), and each MPI
process in turn should spawn 8 threads on its "home" machine.
I have a plain MPI version running fine for up to 32 MPI processes, a version
with 1 MPI process + pthreads that runs fine, and a plain 8-thread version that
works fine. But once I increase the number of MPI processes to 2 with 8 threads
each (i.e. matrices are 16x16), I get:
$rank 1 in job 147 .... caused collective abort of all ranks, killed by signal 9
My algorithm is very basic:
the root broadcasts B and sends a strip of rows of A to each non-root process;
the non-roots receive their strip of A and take part in the broadcast of B.
Then all processes perform a threaded computation on their part of A and B,
place the result in C, and send it back to the root.
I do not use any mutexes with pthreads, since the writes to the resulting C are
completely independent.
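
For illustration only, here is a minimal sketch of the threaded step described
above. It is not the code from this post. The names strip_task, strip_worker,
strip_multiply and NUM_THREADS are hypothetical, the matrices are assumed to be
row-major arrays of double, and the MPI send/receive calls are left out. Each
thread is given a disjoint range of rows of the C strip, which is why no mutex
is needed, and every thread is joined before the function returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 8  /* assumption: one thread per core, as in the post */

/* Hypothetical description of the work given to one thread. */
typedef struct {
    const double *a_strip; /* rows x n strip of A held by this process */
    const double *b;       /* full n x n matrix B */
    double *c_strip;       /* rows x n strip of C written by the threads */
    int n;                 /* matrix dimension */
    int row_begin;         /* first strip row for this thread (inclusive) */
    int row_end;           /* last strip row for this thread (exclusive) */
} strip_task;

/* Thread body: computes rows [row_begin, row_end) of the C strip. */
static void *strip_worker(void *arg)
{
    strip_task *t = arg;
    for (int i = t->row_begin; i < t->row_end; ++i) {
        for (int j = 0; j < t->n; ++j) {
            double sum = 0.0;
            for (int k = 0; k < t->n; ++k)
                sum += t->a_strip[(size_t)i * t->n + k]
                     * t->b[(size_t)k * t->n + j];
            t->c_strip[(size_t)i * t->n + j] = sum;
        }
    }
    return NULL;
}

/* Multiplies a rows x n strip of A by B into a rows x n strip of C.
   Returns 0 on success, otherwise the first pthread error code seen. */
int strip_multiply(const double *a_strip, const double *b, double *c_strip,
                   int rows, int n)
{
    pthread_t threads[NUM_THREADS];
    strip_task tasks[NUM_THREADS];
    int created = 0, rc = 0;

    assert(rows > 0 && n > 0);
    for (int t = 0; t < NUM_THREADS; ++t) {
        /* Block distribution of rows; a thread may get an empty range. */
        tasks[t] = (strip_task){ a_strip, b, c_strip, n,
                                 (int)((long)rows * t / NUM_THREADS),
                                 (int)((long)rows * (t + 1) / NUM_THREADS) };
        rc = pthread_create(&threads[t], NULL, strip_worker, &tasks[t]);
        if (rc != 0)
            break;
        ++created;
    }
    /* Join every created thread so the strip is complete on return. */
    for (int t = 0; t < created; ++t) {
        int jrc = pthread_join(threads[t], NULL);
        if (rc == 0)
            rc = jrc;
    }
    return rc;
}
```

In a hybrid program, a function like this would be called after the strip of A
and the matrix B have arrived, and the C strip would be sent back only once it
has returned 0.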
Somehow I get the abort signal on the receive in the root. Another thing is
that the root always finishes first, while the non-root does not get to collect
its threads before the abort.
So, my question is: what timing is involved with MPI_Recv? It seems to fail if
there is no data to receive. I tried using MPI_Irecv and a wait instead, but
there was no change. Also, how can I find out what the error status is? Isn't
that what the last argument of MPI_Irecv is for?
Thank you in advance,
Olya