[mpich-discuss] MPI communication problem.(MPI Abort by user Aborting program!)

Martin Siegert siegert at sfu.ca
Thu Apr 10 19:17:46 CDT 2008


Hi,

try

mpirun.lsf /nfs/s04r2p1/wangxq_tj/espresso-3.2.3/bin/pw.x \
           -in /nfs/s04r2p1/wangxq_tj/cu.scf.in > cu.scf.out

See pages 29 and 30 of the espresso user guide. Most of the espresso
examples assume that stdin is connected on all nodes, which (as far
as I know) is not guaranteed by the MPI standard. The "-in inputfile"
option, however, should always work.
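
To illustrate the difference, here is a minimal sketch in plain bash (no MPI
involved). The function read_input is a hypothetical stand-in for a program
like pw.x, and /dev/null simulates a process whose stdin was not forwarded by
the launcher. Both are assumptions for illustration only, not how mpirun.lsf
or pw.x actually work internally.

```shell
#!/bin/bash
# Sketch: why reading from a named file is more robust than relying on stdin.

# Create a small two-line input file (contents are arbitrary).
input=$(mktemp)
trap 'rm -f "$input"' EXIT
printf 'line one\nline two\n' > "$input"

# Hypothetical stand-in for the real program: counts the input lines it sees.
# With "-in FILE" it opens FILE itself; otherwise it reads stdin.
read_input() {
    if [ "$1" = "-in" ]; then
        wc -l < "$2" | tr -d ' '
    else
        wc -l | tr -d ' '
    fi
}

# Process whose stdin is connected to the input file.
lines_stdin_connected=$(read_input < "$input")

# Process whose stdin is NOT connected (simulated with /dev/null):
# it effectively reads an empty file.
lines_stdin_missing=$(read_input < /dev/null)

# Same process, but given the file name: it no longer depends on stdin.
lines_with_in_option=$(read_input -in "$input" < /dev/null)

echo "stdin connected:     $lines_stdin_connected lines"
echo "stdin not connected: $lines_stdin_missing lines"
echo "-in option:          $lines_with_in_option lines"
```

The second case is the "effectively reading an empty file" situation from the
original question; passing the file name avoids it because each process opens
the file itself (which assumes the path is visible on every node, as it would
be on a shared /nfs file system).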

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
Client and Research Services               phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6

On Fri, Apr 11, 2008 at 07:56:32AM +0800, wangxinquan at tju.edu.cn wrote:
> Dear all,
> 
>      Recently I ran a test on the Nankai Stars HPC. The error message
> "MPI Abort by user Aborting program!Aborting program!" appeared when I ran
> a calculation on 2 CPUs across 2 nodes, but it worked well on 1 node.
> So I'm afraid it is an MPI communication problem.
>      After searching on Google, I found some hints. My communication library (MPI)
> might not be properly configured to allow input redirection (so that I
> am effectively reading an empty file).
>      The pwscf package needs to be configured to allow interactive execution.
> Do I need to adjust some parameters of MPICH?
> 
>      Any help will be deeply appreciated!
> 
> Calculation Details are as follows:
> ---------------------------------------------------------------------------------
> HPC background:
> Nankai Stars (http://202.113.29.200/introduce.htm)
> 800 Xeon 3.06 GHz CPU (400 nodes)
> 800 GB Memory    
> 53T High-Speed Storage    
> Myrinet
> Parallel jobs are run and debugged through the Platform LSF system.
> Mpich_gm driver:1.2.6..13a
> Test package: Espresso-3.2.3(www.pwscf.org)
> ---------------------------------------------------------------------------------
> 
> ---------------------------------------------------------------------------------
> Installation:
> ./configure CC=mpicc F77=mpif77 F90=mpif90
> modified make.sys file:
> IFLAGS=-I. -I/usr/local/mpich/1.2.6..13a/gm-2.1.3aa2nks3/smp/intel32/ssh/include
> MPI_LIBS=/usr/local/mpich/1.2.6..13a/gm-2.1.3aa2nks3/smp/intel32/ssh/lib/libmpichf90.a
> make all
> ---------------------------------------------------------------------------------
> 
> ---------------------------------------------------------------------------------
> Submit script :
> #!/bin/bash
> #BSUB -q normal
> #BSUB -J test.icymoon
> #BSUB -c 3:00
> #BSUB -a "mpich_gm"
> #BSUB -o %J.log
> #BSUB -n 2 
> 
> cd /nfs/s04r2p1/wangxq_tj
> echo "test icymoon"
> 
> mpirun.lsf /nfs/s04r2p1/wangxq_tj/espresso-3.2.3/bin/pw.x \
>            < /nfs/s04r2p1/wangxq_tj/cu.scf.in > cu.scf.out
> 
> echo "test icymoon end"
> ---------------------------------------------------------------------------------
> 
> ---------------------------------------------------------------------------------
> Output file (%J.log):
> 
> The output (if any) follows:
> 
> test icymoon
> [0]  MPI Abort by user Aborting program !
> [0] Aborting program!
> test icymoon end
> ---------------------------------------------------------------------------------
> 
> 
> Best regards, XQ Wang
> 
> =====================================
> 
> X.Q. Wang 
> 
> wangxinquan at tju.edu.cn
> 
> School of Chemical Engineering and Technology
> 
> Tianjin University
> 
> 92 Weijin Road, Tianjin, P. R. China
> 
> tel:86-22-27890268, fax: 86-22-27892301
> 
> ===================================== 



