[petsc-users] MPI_AllReduce error with -xcore-avx2 flags

Bikash Kanungo bikash at umich.edu
Thu Jan 28 02:13:38 CST 2016


Hi Jose,

Here is the complete error message:

[0]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Scalar value must be same on all processes, argument # 3
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014
[0]PETSC ERROR: Unknown Name on a intel-openmpi_ib named
comet-03-60.sdsc.edu by bikashk Thu Jan 28 00:09:17 2016
[0]PETSC ERROR: Configure options CFLAGS="-fPIC -xcore-avx2" FFLAGS="-fPIC
-xcore-avx2" CXXFLAGS="-fPIC -xcore-avx2"
--prefix=/opt/petsc/intel/openmpi_ib --with-mpi=true
--download-pastix=../pastix_5.2.2.12.tar.bz2
--download-ptscotch=../scotch_6.0.0_esmumps.tar.gz
--with-blas-lib="-Wl,--start-group
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.a
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_sequential.a
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_core.a
-Wl,--end-group -lpthread -lm" --with-lapack-lib="-Wl,--start-group
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_intel_lp64.a
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_sequential.a
/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_core.a
-Wl,--end-group -lpthread -lm"
--with-superlu_dist-include=/opt/superlu/intel/openmpi_ib/include
--with-superlu_dist-lib="-L/opt/superlu/intel/openmpi_ib/lib -lsuperlu"
--with-parmetis-dir=/opt/parmetis/intel/openmpi_ib
--with-metis-dir=/opt/parmetis/intel/openmpi_ib
--with-mpi-dir=/opt/openmpi/intel/ib
--with-scalapack-dir=/opt/scalapack/intel/openmpi_ib
--download-mumps=../MUMPS_4.10.0-p3.tar.gz
--download-blacs=../blacs-dev.tar.gz
--download-fblaslapack=../fblaslapack-3.4.2.tar.gz --with-pic=true
--with-shared-libraries=1 --with-hdf5=true
--with-hdf5-dir=/opt/hdf5/intel/openmpi_ib --with-debugging=false
[0]PETSC ERROR: #1 BVScaleColumn() line 380 in
/scratch/build/git/math-roll/BUILD/sdsc-slepc_intel_openmpi_ib-3.5.3/slepc-3.5.3/src/sys/classes/bv/interface/bvops.c
[0]PETSC ERROR: #2 BVOrthogonalize_GS() line 474 in
/scratch/build/git/math-roll/BUILD/sdsc-slepc_intel_openmpi_ib-3.5.3/slepc-3.5.3/src/sys/classes/bv/interface/bvorthog.c
[0]PETSC ERROR: #3 BVOrthogonalize() line 535 in
/scratch/build/git/math-roll/BUILD/sdsc-slepc_intel_openmpi_ib-3.5.3/slepc-3.5.3/src/sys/classes/bv/interface/bvorthog.c
[comet-03-60:27927] *** Process received signal ***
[comet-03-60:27927] Signal: Aborted (6)
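
As far as I can tell, the message comes from PETSc's collective-argument check
on the scaling factor passed to BVScaleColumn() (argument 3), which uses an
MPI_Allreduce to verify that the value is identical on all ranks. In case it
helps, here is a minimal standalone sketch that exercises the same
BVOrthogonalize() call path on a distributed BV (this is not my actual code;
the column count and vector length only roughly match the failing cases, and
passing NULL for the R factor is an assumption):

#include <slepcbv.h>

int main(int argc,char **argv)
{
  BV             X;
  Vec            v;
  PetscInt       j,N=400000,k=150;   /* roughly the sizes of the failing runs */
  PetscErrorCode ierr;

  ierr = SlepcInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = BVCreate(PETSC_COMM_WORLD,&X);CHKERRQ(ierr);
  ierr = BVSetSizes(X,PETSC_DECIDE,N,k);CHKERRQ(ierr);   /* let PETSc pick the local size */
  ierr = BVSetFromOptions(X);CHKERRQ(ierr);
  for (j=0;j<k;j++) {                                    /* fill each column with random entries */
    ierr = BVGetColumn(X,j,&v);CHKERRQ(ierr);
    ierr = VecSetRandom(v,NULL);CHKERRQ(ierr);
    ierr = BVRestoreColumn(X,j,&v);CHKERRQ(ierr);
  }
  ierr = BVOrthogonalize(X,NULL);CHKERRQ(ierr);          /* aborts here with the error above */
  ierr = BVDestroy(&X);CHKERRQ(ierr);
  ierr = SlepcFinalize();
  return ierr;
}

Running something like this against both the -xcore-avx2 build and the
minimal build should show whether the failure is reproducible outside my
application.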



On Thu, Jan 28, 2016 at 2:56 AM, Jose E. Roman <jroman at dsic.upv.es> wrote:

>
> > On 28 Jan 2016, at 8:32, Bikash Kanungo <bikash at umich.edu> wrote:
> >
> > Hi,
> >
> > I was trying to use the BVOrthogonalize() function in SLEPc. For smaller
> problems (10-20 vectors of length < 20,000) I'm able to use it without any
> trouble. For larger problems (> 150 vectors of length > 400,000) the code
> aborts, citing an MPI_AllReduce error with the following message:
> >
> > Scalar value must be same on all processes, argument # 3.
> >
> > I suspected that the PETSc build might be faulty and tried to
> build a minimalistic version omitting the previously used -xcore-avx2 flags
> from CFLAGS and CXXFLAGS. That seemed to fix the problem.
> >
> > What perplexes me is that I have been using the same code with the
> -xcore-avx2 flags in the PETSc build on a local cluster at the University of
> Michigan without any problem. It is only since moving to
> XSEDE's Comet machine that I started getting this MPI_AllReduce error with
> -xcore-avx2.
> >
> > Do you have any clue why the same PETSc build flag works on one machine
> but fails on another?
> >
> > Regards,
> > Bikash
> >
> > --
> > Bikash S. Kanungo
> > PhD Student
> > Computational Materials Physics Group
> > Mechanical Engineering
> > University of Michigan
> >
>
> Without the complete error message I cannot tell the exact point where it
> is failing.
> Jose
>
>


-- 
Bikash S. Kanungo
PhD Student
Computational Materials Physics Group
Mechanical Engineering
University of Michigan