[mpich-discuss] There is any change in mpich2 compilers between ver.1.2 and ver.1.4.1p1?
Darius Buntinas
buntinas at mcs.anl.gov
Sun May 20 19:46:33 CDT 2012
The message you're seeing is the result of a segmentation fault in one of your processes. This usually indicates a bug in the application. The best way to diagnose it is to rerun the application with core files enabled, then open the core file in a debugger to see where the segmentation fault occurred, e.g.:
ulimit -c unlimited
mpiexec ...
Then look for a file called core.XXX (where XXX is the pid of the failed process) and open it in a debugger, e.g.:
gdb executable core.XXX
In gdb give the command
bt
to print the backtrace and see where the error occurred.
If you're running this on a Mac, the core file will be located in /cores, and if there are multiple core files in there already, you can find the one you're looking for by its creation time.
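Putting that together, here's a minimal sketch of the whole workflow, assuming the binary is the ./yoo.out from your script; the source file name and the pid suffix on the core file are placeholders, and recompiling with -g is just so the backtrace shows file and line numbers:

# recompile with debug symbols (yoo.cpp is a placeholder for your actual sources)
mpicxx -g -O0 -o yoo.out yoo.cpp

# allow core dumps in the shell that launches the job (remote shells need this set as well)
ulimit -c unlimited
mpiexec -f ./machine.list -n 8 ./yoo.out

# open the core from the crashed process (the pid suffix will differ) and get the backtrace
gdb ./yoo.out core.12345
(gdb) bt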
-d
On May 18, 2012, at 10:02 PM, 유경완 wrote:
> Hi, thanks for reading this mail.
>
>
> First of all, I really appreciate your work on the MPICH2 programs; I have used them to great benefit for clustering. Many thanks for that.
>
>
> But I have a small problem using it, so I would like to ask something. Sorry to bother you.
>
> The problem is that when I upgraded the cluster computers, I also upgraded MPICH2 from version 1.2 to 1.4.1p1,
>
> and the installation finished and mpiexec worked well with MPICH2 1.4.1p1.
>
> Then I tested compiling with mpicxx, and it seemed to work with no errors.
>
> But when I run mpiexec with the freshly compiled files, errors like the following appear...
>
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> [root at octofous2 yookw]# ./odengmorun 8
>
> =====================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = EXIT CODE: 11
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> =====================================================================================
> [proxy:0:1 at n002] HYD_pmcd_pmip_control_cmd_cb (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/pm/pmiserv/pmip_cb.c:928): assert (!closed) failed
> [proxy:0:1 at n002] HYDT_dmxu_poll_wait_for_event (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:1 at n002] main (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/pm/pmiserv/pmip.c:226): demux engine error waiting for event
> [mpiexec at octofous2.psl] HYDT_bscu_wait_for_completion (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
> [mpiexec at octofous2.psl] HYDT_bsci_wait_for_completion (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> [mpiexec at octofous2.psl] HYD_pmci_wait_for_completion (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
> [mpiexec at octofous2.psl] main (/home/octofous2/libraries/mpich2-1.4.1p1/src/pm/hydra/ui/mpich/mpiexec.c:405): process manager error waiting for completion
> 8 cpus
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>
> where odengmorun is:
>
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> #!/bin/bash
>
> mpiexec -f ./machine.list -n $1 ./yoo.out
> echo $1 cpus
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
> This is strange to me, because when I compiled the same code on the old computers, which had MPICH2 1.2, mpiexec worked...
>
> So the only changes are the MPICH2 version and perhaps some configure options (sorry, but I recently took over as cluster administrator and I don't know the old computers' configure options... but this time the new computer was configured with
>
> --with-pm=hydra:gforker:smpd --enable-fast=O3 -prefix=/home/octofous2/mpich2-install )
>
> Sorry to ask like this, but could you tell me about any changes in compiling between versions 1.2 and 1.4.1p1 that might be a clue to this problem?
>
>
> Thanks for reading.
>
> Best regards
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss