[mpich-discuss] Finalize abort in mpd
Darius Buntinas
buntinas at mcs.anl.gov
Tue Oct 25 14:56:40 CDT 2011
Can you send us a stack trace? (e.g., get a core dump, then use gdb to get the stack trace)
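A minimal sketch of that workflow, assuming a core file has already been written (core dumps usually have to be enabled first, for example with `ulimit -c unlimited` in the launching shell). The helper names and the paths in the usage line are hypothetical; the only gdb options used are `-batch` and `-ex`, and `thread apply all bt` prints a backtrace for every thread, which matters for a threaded application.

```python
import shutil
import subprocess
from pathlib import Path


def gdb_backtrace_command(executable, core_file):
    """Build a gdb command line that prints every thread's backtrace and exits."""
    return [
        "gdb",
        "-batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread in the core
        str(executable),
        str(core_file),
    ]


def collect_backtrace(executable, core_file):
    """Return gdb's output as text, or None if gdb or either file is missing."""
    if shutil.which("gdb") is None:
        return None
    if not (Path(executable).is_file() and Path(core_file).is_file()):
        return None
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,  # gdb's exit status is not treated as an error here
    )
    return result.stdout + result.stderr
```

Usage would look like `print(collect_backtrace("./my_app", "core.12345"))`, where both paths are placeholders for the real binary and the core file left behind by the failing rank.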
-d
On Oct 25, 2011, at 2:54 PM, Hiatt, Dave M wrote:
> Oddly enough, Hydra exhibits the same behavior. Smpd appears to be the only process manager that will run through. Would that suggest anything? What is different is that in this version of the app we are sending a materially larger volume of data back to Node 0. We develop and test in Windows 7, where we use smpd, and things are bulletproof. In the past, problems were easily recreated, so this is really puzzling. We build with gcc and have not had any MPI problems in the past when moving between RH and Windows, so the implications of this are dismaying.
>
> Here's the odd part: when the problem appears we've essentially completed the run. All the data has been received and pushed to the data files, and the compute nodes appear to have all called Finalize and are sitting on the barrier waiting for Node 0 to call Finalize.
>
> One other factoid: this is an OpenMP hybrid application. But again, that has not been an issue in the past.
>
>
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
> Sent: Tuesday, October 25, 2011 12:11 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] Finalize abort in mpd
>
> http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_I_don.27t_like_.3CWHATEVER.3E_about_mpd.2C_or_I.27m_having_a_problem_with_mpdboot.2C_can_you_fix_it.3F
>
> Just use hydra.
>
> -Dave
>
> On Oct 25, 2011, at 11:34 AM CDT, Hiatt, Dave M wrote:
>
>> We have been running with 1.21 for some time with no problems, but with our latest release we now get an error when Finalize is called if we are running mpd. If we run smpd on RH Linux there is no problem. I suspect this has been seen before, but I have had no luck with a Google search, so my apologies if it has already been answered. Could someone be so kind as to tell me what we have done to ourselves, if this is a known problem?
>>
>> "So they go on in strange paradox, decided only to be undecided, resolved to be irresolute, adamant for drift, solid for fluidity, all-powerful to be impotent."
>>
>> Dave M. Hiatt
>> Director, Risk Analytics
>> CitiMortgage
>> 1000 Technology Drive
>> O'Fallon, MO 63368-2240
>>
>> Telephone: 636-261-1408
>>
>>
>> _______________________________________________
>> mpich-discuss mailing list mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>