[mpich-discuss] A problem of carrying out mpich2

Yuriy Khalak khalak.yura at gmail.com
Tue Jun 15 12:45:18 CDT 2010


Dear Qianlin,

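Signal 9 (SIGKILL), which seems to be causing the crash, means the ranks were
killed from outside, for example by the kernel when a node runs out of memory;
a segmentation fault in the code being run by mpiexec would show up as signal
11 (SIGSEGV). I had something similar happen to me recently under C++ and
mpich: it turned out I was trying to delete the contents of a pointer twice.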
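
To double-check what a signal number means on a given machine, here is a
minimal Python sketch. The helper name describe_signal is made up for this
example, and the numbering mentioned below is the usual Linux one.

```python
import signal


def describe_signal(number):
    """Return the symbolic name for a signal number, or "unknown"."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to a signal on this platform.
        return "unknown"


# The two numbers relevant here: the one from the log and the segfault one.
for n in (9, 11):
    print(n, describe_signal(n))
```

On Linux this should report SIGKILL for 9 and SIGSEGV for 11.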

In my case Linux did a core dump, which I could trace with gdb and determine
approximately where the segmentation fault occurred. So if you can find a
core dump in your program's working directory, the problem is probably in
the VASP code, not in mpich2.
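
A core file is only written if the core-size limit of the process allows it,
so it may be worth checking that first. A minimal Python sketch, assuming a
Unix-like system (core_dumps_enabled is a made-up helper name):

```python
import resource


def core_dumps_enabled():
    """True if the soft core-file size limit permits writing a core file."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    # A soft limit of 0 disables core files; any other value allows them.
    return soft != 0


print("core dumps enabled:", core_dumps_enabled())
```

If this reports False, running "ulimit -c unlimited" in the shell that launches
the job should raise the limit, since child processes inherit it. A core file
can then be opened with gdb (gdb <executable> <core file>) and "bt" prints the
backtrace.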

Keep in mind that I'm just a user of mpich2 and very well could be wrong.

Regards,
             Yuriy

> Dear mpich2-support,
>
> Using mpif90, I have installed the parallel version of the commercial
> code VASP. Sometimes mpich2 works well with VASP, but it also fails for
> some VASP jobs with the following error messages:
> -----------------------------------
> running on    8 nodes
> distr:  one band on    1 nodes,    8 groups
> vasp.4.6.21  23Feb03 complex
> POSCAR found :  3 types and   30 ions
> LDA part: xc-table for Ceperly-Alder, Vosko type interpolation para-ferro
> POSCAR, INCAR and KPOINTS ok, starting setup
> WARNING: wrap around errors must be expected
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out < /dev/null &
> FFT: planning ...            2
> reading WAVECAR
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> mpiexec_qltang1 (handle_stdin_input 1089): stdin problem; if pgm is run in
background, redirect from /dev/null
> mpiexec_qltang1 (handle_stdin_input 1090):     e.g.: mpiexec -n 4 a.out <
/dev/null &
> WARNING: random wavefunctions but no delay for mixing, default for NELMDL
> entering main loop
>       N       E                     dE             d eps       ncg     rms          rms(c)
> rank 6 in job 36  qltang1_54199   caused collective abort of all ranks
>  exit status of rank 6: killed by signal 9
> rank 3 in job 36  qltang1_54199   caused collective abort of all ranks
>  exit status of rank 3: killed by signal 9
> -----------------------------
> I would like to know how to solve the above problem. Thank you a lot.
>
> Best regards,
> Qianlin