[mpich-discuss] Question about MPI - Message Truncation Problem in MPI_Recv

Ayaz ul Hassan Khan ahkhan at kfupm.edu.sa
Mon Dec 19 00:49:29 CST 2011


I am having a problem in one of my projects involving MPI development. I am implementing an RNA parsing algorithm with MPI. A master node starts parsing an input string according to a set of parsing rules and a parsing table, which contains the states and their associated actions. The parsing table has multiple actions for each state, and these can be carried out in parallel, so I have to distribute the actions among different processes. To do that, I send the current state and the parsing info (the current parsing stack) to the other nodes. Each node uses a separate thread to receive actions from other nodes while its main thread is busy parsing based on the actions already received. The sender and receiver code snippets follow:

Sender Code:
// Pack the state, current character, and action offset into snd_stack
StackFlush(&snd_stack);
StackPush(&snd_stack, state_index);
StackPush(&snd_stack, current_ch);
StackPush(&snd_stack, actions_to_skip);

// Copy the parsing stack from top to bottom, followed by its element count
elements_in_stack = stack.top + 1;
for(int a = elements_in_stack - 1; a >= 0; a--)
    StackPush(&snd_stack, stack.contents[a]);
StackPush(&snd_stack, elements_in_stack);

// Copy the parse tree the same way, followed by its element count
elements_in_stack = parse_tree.top + 1;
for(int a = elements_in_stack - 1; a >= 0; a--)
    StackPush(&snd_stack, parse_tree.contents[a]);
StackPush(&snd_stack, elements_in_stack);

// Send the total element count first, then the packed contents
elements_in_stack = snd_stack.top + 1;
MPI_Send(&elements_in_stack, 1, MPI_INT, (myrank + actions_to_skip) % mysize, MSG_ACTION_STACK_COUNT, MPI_COMM_WORLD);
MPI_Send(&snd_stack.contents[0], elements_in_stack, MPI_CHAR, (myrank + actions_to_skip) % mysize, MSG_ACTION_STACK, MPI_COMM_WORLD);
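
To make the message layout concrete, here is a minimal sketch of how the sender code above fills snd_stack, with the MPI calls left out. The Stack definition, STACK_MAX, the bodies of StackFlush and StackPush, and the helper names pack_action and packed_length are all assumptions for illustration, since the snippets do not show them. It also assumes that contents is a char array, because the data is sent as MPI_CHAR.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 256 /* hypothetical capacity, not taken from the original code */

/* Hypothetical stack type; the field names follow the snippets above. */
typedef struct {
    char contents[STACK_MAX];
    int top;     /* index of the last element, -1 when empty */
    int maxSize; /* capacity in elements */
} Stack;

/* Assumed behaviour: empty the stack and reset its capacity. */
void StackFlush(Stack *s) {
    assert(s != NULL);
    s->top = -1;
    s->maxSize = STACK_MAX;
}

/* Assumed behaviour: returns 0 on success, -1 if the stack is full. */
int StackPush(Stack *s, char value) {
    assert(s != NULL);
    if (s->top + 1 >= s->maxSize) return -1;
    s->contents[++s->top] = value;
    return 0;
}

/* Bytes in one packed message: 3 header bytes, the entries of both
   stacks, and one count byte after each stack. */
int packed_length(int n_stack, int n_tree) {
    return 3 + n_stack + 1 + n_tree + 1;
}

/* Packs the fields in the same order as the sender code.
   Returns the number of bytes packed, or -1 if they would not fit. */
int pack_action(Stack *out, char state_index, char current_ch,
                char actions_to_skip,
                const Stack *stack, const Stack *parse_tree) {
    assert(out != NULL && stack != NULL && parse_tree != NULL);
    int n_stack = stack->top + 1;
    int n_tree = parse_tree->top + 1;

    StackFlush(out);
    if (packed_length(n_stack, n_tree) > out->maxSize) return -1;

    StackPush(out, state_index);
    StackPush(out, current_ch);
    StackPush(out, actions_to_skip);

    /* Parsing stack, top to bottom, then its count (stored in one char). */
    for (int a = n_stack - 1; a >= 0; a--)
        StackPush(out, stack->contents[a]);
    StackPush(out, (char)n_stack);

    /* Parse tree, top to bottom, then its count. */
    for (int a = n_tree - 1; a >= 0; a--)
        StackPush(out, parse_tree->contents[a]);
    StackPush(out, (char)n_tree);

    return out->top + 1; /* the value sent as elements_in_stack */
}
```

Under these assumptions, a message carrying n entries from stack and m entries from parse_tree occupies n + m + 5 bytes. The value returned by pack_action is what the first MPI_Send announces, and the second MPI_Send carries that many chars.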

Receiver Code:
// Receive the element count of the next message from any sender
MPI_Recv(&e_count, 1, MPI_INT, MPI_ANY_SOURCE, MSG_ACTION_STACK_COUNT, MPI_COMM_WORLD, &status);
if(e_count == 0){
    break;
}
// Wait until bt_stack has room for e_count more elements
while((bt_stack.top + e_count) >= bt_stack.maxSize - 1){usleep(500);}
pthread_mutex_lock(&mutex_bt_stack); //using mutex for accessing shared data among threads
// Receive the contents from the same sender, directly above the current top of bt_stack
MPI_Recv(&bt_stack.contents[bt_stack.top + 1], e_count, MPI_CHAR, status.MPI_SOURCE, MSG_ACTION_STACK, MPI_COMM_WORLD, &status);
bt_stack.top += e_count;
pthread_mutex_unlock(&mutex_bt_stack);

The program runs fine for small inputs, which involve little communication. As the input size grows, the communication grows with it, and the receiver gets many requests while it is still processing earlier ones. It then crashes with the following errors:
Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(186) ..........................................: MPI_Recv(buf=0x5b8d7b1, count=19, MPI_CHAR, src=3, tag=1, MPI_COMM_WORLD, status=0x41732100) failed
MPIDI_CH3U_Request_unpack_uebuf(625): Message truncated; 21 bytes received but buffer size is 19
Rank 0 in job 73 hpc081_56549 caused collective abort of all ranks
  exit status of rank 0: killed by signal 9.

I have also tried non-blocking MPI calls, but I still get similar errors.


Ayaz ul Hassan Khan
Lecturer-B (PhD Student), Information and Computer Sciences
King Fahd University of Petroleum & Minerals
Dhahran 31261, Kingdom of Saudi Arabia

