[petsc-users] errors when using elemental with petsc3.10.5

Smith, Barry F. bsmith at mcs.anl.gov
Thu Aug 22 21:12:29 CDT 2019


  Does this crash on a PETSc example that uses elemental? For example run 

   src/ksp/ksp/examples/tests/ex40.c with the arguments

   -pc_type lu -pc_factor_mat_solver_type elemental

  To get the stack trace 

  Add the command line option -start_in_debugger noxterm when you run your program. When it starts up, type c (for continue); when it crashes, type bt (for backtrace) and send all the output.
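
  As a rough sketch of these steps (not a definitive recipe), the shell snippet below prints the commands by default and only executes them when RUN=1 is set. It assumes PETSC_DIR (and PETSC_ARCH, if your build uses one) is already set in the environment and that the example can be launched directly as a single process; the /path/to/petsc fallback and the run and main helper names are hypothetical placeholders, not part of PETSc.

```shell
#!/bin/sh
# Dry-run sketch: print the commands; execute them only when RUN=1.
# Assumption: PETSC_DIR points at the PETSc source tree (placeholder below).
PETSC_DIR="${PETSC_DIR:-/path/to/petsc}"
EXAMPLE_DIR="$PETSC_DIR/src/ksp/ksp/examples/tests"

# Options quoted from the message above.
SOLVER_OPTS="-pc_type lu -pc_factor_mat_solver_type elemental"
DEBUG_OPTS="-start_in_debugger noxterm"

# Hypothetical helper: echo a command, and run it only if RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

main() {
    run cd "$EXAMPLE_DIR" || exit 1
    run make ex40 || exit 1
    # Unquoted on purpose so each option becomes its own argument.
    run ./ex40 $SOLVER_OPTS $DEBUG_OPTS
}

main
```

  Once the debugger prompt appears, type c to continue and bt after the crash, then send the full output.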

  Barry

> On Aug 22, 2019, at 7:54 PM, Matthew Knepley <knepley at gmail.com> wrote:
> 
> On Thu, Aug 22, 2019 at 8:45 PM Lailai Zhu via petsc-users <petsc-users at mcs.anl.gov> wrote:
> Thank you guys. After I removed the system's metis-related files,
> it compiles well and passes the check. However, when I try to
> use the elemental solver, it still does not work. I basically get
> a segmentation fault, which does not appear when I use
> the standard dense matrix and Jacobi solvers. Thanks in advance,
> 
> best,
> lailai
> 
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
> probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS 
> X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, 
> and run
> [0]PETSC ERROR: to get more information on the crash.
> 
> It would be really helpful to get a stack trace from the debugger.
> 
>   Thanks,
> 
>     Matt
>  
> 
> On 8/22/19 8:03 PM, Smith, Barry F. wrote:
> >
> >> On Aug 22, 2019, at 6:48 PM, Balay, Satish <balay at mcs.anl.gov> wrote:
> >>
> >> Compilers are supposed to prefer libraries in specified -L path before system stuff.
> >    Supposed to.
> >
> >> balay@es^~ $ ls /usr/lib/lib*metis*
> >> /usr/lib/libmetis.a    /usr/lib/libmetis.so.3.1  /usr/lib/libparmetis.so@     /usr/lib/libscotchmetis-5.1.so  /usr/lib/libscotchmetis.so@
> >> /usr/lib/libmetis.so@  /usr/lib/libparmetis.a    /usr/lib/libparmetis.so.3.1  /usr/lib/libscotchmetis.a
> >> balay@es^~ $
> >    This is really bad system management; there is no reason for them to be there, nor should they be there.
> >
> > bsmith@es:~$ ldd /usr/lib/libparmetis.so.3.1
> >       linux-vdso.so.1 =>  (0x00007fff9115e000)
> >       libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000)
> >       libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000)
> >       libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000)
> >       libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000)
> >       libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faf95468000)
> >       libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faf95228000)
> >       libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faf9501e000)
> >       libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf94e00000)
> >       /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000)
> >       libnuma.so.1 => /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 (0x00007faf94bf5000)
> >       libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000)
> >
> > You have something in /usr/lib referring to something in /soft/com/packages/pgi/  ??
> >
> > and of course that refers back to
> >
> > bsmith@es:~$ ls -l /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1
> > lrwxrwxrwx 1 fritz voice 38 Mar 29 15:59 /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 -> /usr/lib/x86_64-linux-gnu/libnuma.so.1
> >
> > I stand by my statement: it is bad policy to put any stuff like this in system directories.
> >
> >
> >> <<<<
> >>
> >> And we have these files installed and they don't cause problems. And it's not always practical to uninstall system stuff
> >> [esp on multi-user machines]
> >    I agree it is not always practical or possible to remove them.
> >
> >    barry
> >
> >> Satish
> >>
> >>
> >> On Thu, 22 Aug 2019, Smith, Barry F. wrote:
> >>
> >>>   You have a copy of parmetis installed in /usr/lib. This is a system directory, and many compilers and linkers automatically find libraries in that location; it is often difficult to keep the compilers/linkers from using them. In general you never want to install external software such as parmetis, PETSc, MPI, etc. in system directories (/usr/ and /usr/local).
> >>>
> >>>   You should delete this library (and the includes in /usr/include)
> >>>
> >>>   Barry
> >>>
> >>>
> >>>
> >>>
> >>>> On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users <petsc-users at mcs.anl.gov> wrote:
> >>>>
> >>>>
> >>>>> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world
> >>>> For some reason the wrong parmetis library is getting picked up. I don't know why.
> >>>>
> >>>> Can you copy/paste the log from the following?
> >>>>
> >>>> cd src/snes/examples/tutorials
> >>>> make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19
> >>>> ldd ex19
> >>>>
> >>>> cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib
> >>>> ldd *.so
> >>>>
> >>>> Satish
> >>>>
> >>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote:
> >>>>
> >>>>> hi, Satish,
> >>>>>
> >>>>> as you suggested, I compiled a new version using 3.11.3.
> >>>>> It compiles well, but errors occur during the check. I also attach
> >>>>> the errors from the check. Thanks very much,
> >>>>>
> >>>>> lailai
> >>>>>
> >>>>> On 8/22/19 4:16 PM, Balay, Satish wrote:
> >>>>>> Any reason for using  petsc-3.10.5 and not latest petsc-3.11?
> >>>>>>
> >>>>>> I suggest starting from scratch and rebuilding.
> >>>>>>
> >>>>>> And if you still have issues - send corresponding configure.log and make.log
> >>>>>>
> >>>>>> Satish
> >>>>>>
> >>>>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote:
> >>>>>>
> >>>>>>> sorry, Satish,
> >>>>>>>
> >>>>>>> but it does not seem to solve the problem.
> >>>>>>>
> >>>>>>> best,
> >>>>>>> lailai
> >>>>>>>
> >>>>>>> On 8/22/19 12:41 AM, Balay, Satish wrote:
> >>>>>>>> Can you run 'make' again and see if this error goes away?
> >>>>>>>>
> >>>>>>>> Satish
> >>>>>>>>
> >>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote:
> >>>>>>>>
> >>>>>>>>> hi, Satish,
> >>>>>>>>> I tried following your suggestion, but I get the following errors
> >>>>>>>>> when installing.
> >>>>>>>>> Here is my configuration:
> >>>>>>>>>
> >>>>>>>>> any ideas?
> >>>>>>>>>
> >>>>>>>>> best,
> >>>>>>>>> lailai
> >>>>>>>>>
> >>>>>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1
> >>>>>>>>> --with-batch=0  --with-mpi=1 --with-debugging=0  CXXOPTFLAGS="-g -O3"
> >>>>>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2
> >>>>>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1
> >>>>>>>>> --download-blacs=1  --download-scalapack=1  --download-hypre=1
> >>>>>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort
> >>>>>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1
> >>>>>>>>> --download-dscpack=1     --download-sprng=1 --download-superlu=1
> >>>>>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function
> >>>>>>>>> `MatCreate_SeqSBAIJ':
> >>>>>>>>> sbaij.c:(.text+0x1bc45): undefined reference to
> >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental'
> >>>>>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o:
> >>>>>>>>> relocation
> >>>>>>>>> R_X86_64_PC32 against undefined hidden symbol
> >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental'
> >>>>>>>>> can not be used when making a shared object
> >>>>>>>>> ld: final link failed: Bad value
> >>>>>>>>> gmakefile:86: recipe for target
> >>>>>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed
> >>>>>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1
> >>>>>>>>> make[2]: Leaving directory
> >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3'
> >>>>>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81:
> >>>>>>>>> recipe for target 'gnumake' failed
> >>>>>>>>> make[1]: *** [gnumake] Error 2
> >>>>>>>>> make[1]: Leaving directory
> >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3'
> >>>>>>>>> **************************ERROR*************************************
> >>>>>>>>>   Error during compile, check
> >>>>>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log
> >>>>>>>>>   Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to
> >>>>>>>>> petsc-maint at mcs.anl.gov
> >>>>>>>>>
> >>>>>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote:
> >>>>>>>>>> To install elemental - you use: --download-elemental=1 [not
> >>>>>>>>>> --download-elemental-commit=v0.87.7]
> >>>>>>>>>>
> >>>>>>>>>> Satish
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote:
> >>>>>>>>>>
> >>>>>>>>>>> hi, dear petsc developers,
> >>>>>>>>>>>
> >>>>>>>>>>> I am having a problem when using the external solver elemental.
> >>>>>>>>>>> I installed PETSc version 3.10.5 with the flag
> >>>>>>>>>>> --download-elemental-commit=v0.87.7
> >>>>>>>>>>> The installation seems to be OK. However, it seems that I am not
> >>>>>>>>>>> able to use the elemental solver.
> >>>>>>>>>>>
> >>>>>>>>>>> I followed this page
> >>>>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html
> >>>>>>>>>>> to interface the elemental solver, namely,
> >>>>>>>>>>> MatSetType(A,MATELEMENTAL);
> >>>>>>>>>>> or set it via the command line '-mat_type elemental',
> >>>>>>>>>>>
> >>>>>>>>>>> in either case, i will get the following error,
> >>>>>>>>>>>
> >>>>>>>>>>> [0]PETSC ERROR: --------------------- Error Message
> >>>>>>>>>>> --------------------------------------------------------------
> >>>>>>>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing
> >>>>>>>>>>> package:
> >>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external
> >>>>>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental
> >>>>>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> >>>>>>>>>>> for
> >>>>>>>>>>> trouble shooting.
> >>>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019
> >>>>>>>>>>>
> >>>>>>>>>>> May I ask whether there is a way, or some specific PETSc version,
> >>>>>>>>>>> that is able to use the elemental solver?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks in advance,
> >>>>>>>>>>>
> >>>>>>>>>>> best,
> >>>>>>>>>>> lailai
> >>>>>>>>>>>
> >>>>>
> >>>>>
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/


