[ExM Users] mkstatic questions

Tim Armstrong tim.g.armstrong at gmail.com
Fri May 30 12:49:16 CDT 2014


I see.  Based on the two lines of output you sent me, the problem is
something to do with message sizes on MPI.  Assuming your app isn't using
MPI internally, it's probably some communication that ADLB is doing.  The
error message would generally be caused by a mismatch of message size
between sender and receiver.  The most likely explanation in the ADLB
codebase is that the sender and receiver somehow disagree on sizes of
structs, which doesn't make a whole lot of sense unless something strange
was done during the build process, e.g. one file was compiled with
different compiler settings, or you somehow linked against different
versions of the same function.

It's possible that it's a bug in the ADLB codebase that has nothing to do
with how it was built, but it seems unlikely that something like that would
have escaped all the tests.  It might help to look at the Tcl code or Swift
that's being run, as well as to make sure that it runs correctly in a
different environment.

It would also be helpful to have a full log of the program output with
debug logging enabled, since that will tell me what ADLB was doing at the
time.

I'm not sure if I can help with debugging the problem without more info.

- Tim


On Fri, May 30, 2014 at 11:33 AM, Ketan Maheshwari <ketan at mcs.anl.gov>
wrote:

> I rebuilt the application recently without MPI. It seems to be working
> outside of Swift on Cetus compute nodes.
>
>
> On Fri, May 30, 2014 at 11:18 AM, Tim Armstrong <tim.g.armstrong at gmail.com
> > wrote:
>
>>  Regarding the MPI error - that seems strange.  There are multiple
>> places in the code that it might be.
>>
>>  One possible cause is if something funny happened in compiling/linking -
>> e.g. multiple compilers or versions of things linked together.  Have you
>> tried running the code locally?
>>
>> I'm a little perplexed because MPI tag 4 shouldn't be used in your
>> application - the message type (Iget) is only really used for
>> gemtc/coasters applications.  It would be helpful to debug further if I
>> could get a log from the run with ADLB debugging enabled at compile time
>> (--enable-log-debug for the ADLB configure stage, or setting EXM_DEBUG=1 in
>> exm-settings.sh depending on how you built it).
>>
>>  - Tim
>>
>
>