[petsc-users] VecAssembly gives segmentation fault with MPI

Karl Rupp rupp at iue.tuwien.ac.at
Wed Apr 19 08:09:57 CDT 2017


Hi Francesco,

please don't drop petsc-users from the communication; keeping the list in
CC will likely get you better and faster answers.

Since your current build is with debugging turned off, please 
reconfigure with debugging turned on, as the error message says. Chances 
are good that you will get much more precise information about what went 
wrong.

Best regards,
Karli



On 04/19/2017 03:03 PM, Francesco Migliorini wrote:
> Hi, thank you for your answer!
>
> Unfortunately I cannot use Valgrind on the machine I am using, but I am
> sure that the error is in the VecAssembly calls. Here's the error message
> from PETSc:
>
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
> X to find memory corruption errors
> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
> and run
> [1]PETSC ERROR: to get more information on the crash.
> [1]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [1]PETSC ERROR: Signal received
> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015
> [1]PETSC ERROR: /u/migliorini/SPEED/SPEED on a arch-linux-opt named
> idra116 by migliorini Wed Apr 19 10:20:48 2017
> [1]PETSC ERROR: Configure options
> --prefix=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/petsc/3.6.3
> --with-petsc-arch=arch-linux-opt --with-fortran=1 --with-pic=1
> --with-debugging=0 --with-x=0 --with-blas-lapack=1
> --with-blas-lib=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/openblas/0.2.17/lib/libopenblas.so
> --with-lapack-lib=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/openblas/0.2.17/lib/libopenblas.so
> --with-boost=1
> --with-boost-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/boost/1.60.0
> --with-fftw=1
> --with-fftw-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/fftw/3.3.4
> --with-hdf5=1
> --with-hdf5-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/hdf5/1.8.16
> --with-hypre=1
> --with-hypre-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/hypre/2.11.0
> --with-metis=1
> --with-metis-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/metis/5
> --with-mumps=1
> --with-mumps-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/mumps/5.0.1
> --with-netcdf=1
> --with-netcdf-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/netcdf/4.4.0
> --with-p4est=1
> --with-p4est-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/p4est/1.1
> --with-parmetis=1
> --with-parmetis-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/metis/5
> --with-ptscotch=1
> --with-ptscotch-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/scotch/6.0.4
> --with-scalapack=1
> --with-scalapack-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/scalapack/2.0.2
> --with-suitesparse=1
> --with-suitesparse-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/suitesparse/4.5.1
> [1]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
> with errorcode 59.
>
> do in=1,nnod_loc is a loop over the nodes contained in the local vector,
> because the program reaches the PETSc initialization with multiple
> processes already running. I therefore thought PETSc was applied to each
> process separately, so that the global dimensions of the system were the
> local ones of each MPI process. Maybe it does not work this way...
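>
> (For comparison, a minimal sketch of the usual PETSc pattern, assuming
> nnod_loc is the number of nodes owned by each rank and reusing the other
> names from the code quoted below; nloc, istart and iend are new local
> variables introduced only for this sketch. A Vec on PETSC_COMM_WORLD is a
> single distributed object: each rank passes only its local size, lets
> PETSc determine the global size, and then queries the range of global
> indices it owns.)
>
>       Vec feP
>       PetscErrorCode perr
>       PetscInt nloc, istart, iend   ! plus the usual PETSc Fortran includes
>
>       ! each rank passes its LOCAL size; PETSc sums these into the
>       ! global size of the one distributed vector
>       nloc = 3*nnod_loc*max_time_deg
>       call VecCreate(PETSC_COMM_WORLD,feP,perr)
>       call VecSetSizes(feP,nloc,PETSC_DETERMINE,perr)
>       call VecSetFromOptions(feP,perr)
>
>       ! [istart,iend) is the block of GLOBAL indices owned by this rank;
>       ! VecSetValues expects global indices
>       call VecGetOwnershipRange(feP,istart,iend,perr)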
>
> 2017-04-19 13:25 GMT+02:00 Karl Rupp <rupp at iue.tuwien.ac.at
> <mailto:rupp at iue.tuwien.ac.at>>:
>
>     Hi Francesco,
>
>     please consider the following:
>
>      a) run your code through valgrind to locate the segmentation fault.
>     Maybe there is already a memory access problem in the sequential
>     version.
>
>      b) send any error messages as well as the stack trace.
>
>      c) what is your intent with "do in = nnod_loc"? Isn't nnod_loc the
>     number of local elements?
>
>     Best regards,
>     Karli
>
>
>
>
>     On 04/19/2017 12:26 PM, Francesco Migliorini wrote:
>
>         Hello!
>
>         I have an MPI code in which a linear system is created and solved
>         with PETSc. It works in a sequential run, but when I use multiple
>         cores the VecAssemblyBegin/End calls give a segmentation fault.
>         Here's a sample of my code:
>
>         call PetscInitialize(PETSC_NULL_CHARACTER,perr)
>
>               ind(1) = 3*nnod_loc*max_time_deg
>               call VecCreate(PETSC_COMM_WORLD,feP,perr)
>               call VecSetSizes(feP,PETSC_DECIDE,ind,perr)
>               call VecSetFromOptions(feP,perr)
>
>               do in = nnod_loc
>                  do jt = 1,mm
>                     ind(1) = 3*((in -1)*max_time_deg + (jt-1))
>                     fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +1)
>                     call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
>                     ind(1) = 3*((in -1)*max_time_deg + (jt-1)) +1
>                     fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +2)
>                     call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
>                     ind(1) = 3*((in -1)*max_time_deg + (jt-1)) +2
>                     fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +3)
>                     call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
>                  enddo
>               enddo
>               call VecAssemblyBegin(feP,perr)
>               call VecAssemblyEnd(feP,perr)
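>
>         (A sketch of how this loop might look for a distributed vector,
>         assuming nnod_loc counts only the nodes owned by this rank and the
>         vector was created with local size 3*nnod_loc*max_time_deg as in
>         the sketch further up; istart, iend and ic are new local variables
>         for the sketch. The local position is shifted by the start of the
>         rank's ownership range, because VecSetValues expects global
>         indices.)
>
>               PetscInt istart, iend, ic   ! added for this sketch
>
>               call VecGetOwnershipRange(feP,istart,iend,perr)
>               do in = 1,nnod_loc
>                  do jt = 1,mm
>                     do ic = 1,3
>                        ! global index = ownership offset + local position
>                        ind(1) = istart + 3*((in-1)*max_time_deg + (jt-1)) + (ic-1)
>                        fval(1) = fe(3*((in-1)*max_time_deg + (jt-1)) + ic)
>                        call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
>                     enddo
>                  enddo
>               enddo
>               call VecAssemblyBegin(feP,perr)
>               call VecAssemblyEnd(feP,perr)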
>
>         The vector has roughly 640,000 elements, but I am running on a
>         high-performance computer, so there shouldn't be memory issues.
>         Does anyone know where the problem is and how I can fix it?
>
>         Thank you,
>         Francesco Migliorini
>
>

