[mpich-discuss] Unknown error code in 1.1.1p1

Scott Atchley atchley at myri.com
Mon Nov 30 13:23:24 CST 2009


On Nov 30, 2009, at 2:04 PM, Dave Goodell wrote:

> On Nov 30, 2009, at 12:48 PM, Scott Atchley wrote:
>
>> Looking at $MPICH/src/mpi/pt2pt/test.c, line 136 calls  
>> MPIU_ERR_POP() if mpi_errno is not 0. It should then fall through  
>> to fn_exit which will return mpi_errno.
>>
>> Instead, it seems to go to fn_fail which calls  
>> MPIR_Err_create_code() and then MPIR_Err_return_comm() which  
>> eventually calls MPID_Abort(). The latter prints the MXMPI message  
>> and exits.
>>
>> Am I missing something? Has anyone seen a similar failure before?
>
> The key piece you are missing is that MPIU_ERR_POP() does a "goto  
> fn_fail;" under the hood.  The following line:

<snip>

Argh, it was right in front of me in the definition and I did not see  
the goto fn_fail. :-)
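
For anyone reading this later, here is a minimal sketch of the control flow Dave describes. It is not the MPICH source: SKETCH_ERR_POP, sketch_inner(), sketch_test() and fail_path_taken are hypothetical names made up for illustration, and the real MPIU_ERR_POP() does more than this stand-in. The only point shown is that the macro hides a "goto fn_fail;", so an error does not fall through to fn_exit.

```c
#include <stdio.h>

/* Hypothetical stand-in for MPIU_ERR_POP(): simplified to show only the
 * hidden jump to the fn_fail label. */
#define SKETCH_ERR_POP(err_) do { (void)(err_); goto fn_fail; } while (0)

/* Set to 1 whenever the fn_fail block runs (illustration only). */
int fail_path_taken = 0;

/* Hypothetical inner call that reports an error code on request. */
static int sketch_inner(int should_fail)
{
    return should_fail ? 42 : 0;
}

/* Mirrors the fn_exit / fn_fail layout discussed in this thread. */
int sketch_test(int should_fail)
{
    int err = 0;
    fail_path_taken = 0;

    err = sketch_inner(should_fail);
    if (err) SKETCH_ERR_POP(err);   /* jumps to fn_fail, no fall-through */

fn_exit:
    return err;

fn_fail:
    /* In the real library this is where the error code is built and the
     * communicator's error handler is invoked; here we only record that
     * the failure path was reached. */
    fail_path_taken = 1;
    fprintf(stderr, "fn_fail reached with err=%d\n", err);
    goto fn_exit;
}
```

Calling sketch_test(1) runs the fn_fail block before returning, while sketch_test(0) goes straight to fn_exit.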

> An error message like your user is getting usually indicates either  
> a programming error inside the library (our fault or possibly  
> yours), or memory corruption issues (usually the user's fault).   
> Have you tried valgrind on it yet?

Not yet. I am waiting on the dataset so I can try to reproduce here.

Thanks,

Scott

