[mpich-discuss] Unknown error code in 1.1.1p1

Scott Atchley atchley at myri.com
Mon Nov 30 13:23:24 CST 2009

On Nov 30, 2009, at 2:04 PM, Dave Goodell wrote:

> On Nov 30, 2009, at 12:48 PM, Scott Atchley wrote:
>> Looking at $MPICH/src/mpi/pt2pt/test.c, line 136 calls  
>> MPIU_ERR_POP() if mpi_errno is not 0. It should then fall through  
>> to fn_exit which will return mpi_errno.
>> Instead, it seems to go to fn_fail which calls  
>> MPIR_Err_create_code() and then MPIR_Err_return_comm() which  
>> eventually calls MPID_Abort(). The latter prints the MXMPI message  
>> and exits.
>> Am I missing something? Has anyone seen a similar failure before?
> The key piece you are missing is that MPIU_ERR_POP() does a "goto  
> fn_fail;" under the hood.  The following line:


Argh, it was right in front of me in the definition and I did not see  
the goto fn_fail. :-)
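
For anyone else who trips over this, here is roughly what the control flow looks like. This is a simplified sketch, not the actual MPICH source: SKETCH_ERR_POP, sketch_test and fail_path_taken are names I made up for illustration, and the real fn_fail block does more work (creating the error code and invoking the communicator's error handler).

```c
#include <stdio.h>

/* Hypothetical stand-in for MPIU_ERR_POP(): the jump to fn_fail is
 * hidden inside the macro, which is easy to overlook at the call site. */
#define SKETCH_ERR_POP(err_) do { (void)(err_); goto fn_fail; } while (0)

/* Illustration only: records whether the fn_fail block ran. */
static int fail_path_taken = 0;

/* Hypothetical stand-in for a function laid out like the one in test.c. */
static int sketch_test(int inner_result)
{
    int mpi_errno = inner_result;

    /* Looks like a plain check, but on error it never falls through. */
    if (mpi_errno != 0) SKETCH_ERR_POP(mpi_errno);

fn_exit:
    return mpi_errno;

fn_fail:
    /* The real code would wrap the error and call the error handler
     * here; with the default handler that ends in an abort. */
    fail_path_taken = 1;
    goto fn_exit;
}
```

Calling sketch_test(0) reaches fn_exit directly, while any nonzero value goes through fn_fail first, which is the path I had missed.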

> An error message like your user is getting usually indicates either  
> a programming error inside the library (our fault or possibly  
> yours), or memory corruption issues (usually the user's fault).   
> Have you tried valgrind on it yet?

Not yet. I am waiting on the dataset so I can try to reproduce here.


