[petsc-users] PETSc + Julia + MVAPICH2 = Segfault

Derek Gaston friedmud at gmail.com
Tue Dec 6 00:43:30 CST 2016


Quick update for the list:  Matt and I were emailing back and forth a bit
and I at least have a workaround for now.

It turns out that MVAPICH2 does include its own implementations of
malloc/calloc.  Matt believes (and I agree) that those should be private to
the library, though.  It looks like something about the way Julia is loading
the library is exposing them to PETSc.
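
As a quick sanity check (just a sketch; it assumes GNU nm and grep are on
the PATH, and the library path is the one from my stack trace), you can
confirm from Julia that the allocator symbols really are exported by the
shared library:

libmpi = "/home/gastdr/projects/falcon/root/lib/libmpi.so.12"
# List the defined dynamic symbols and keep only whole-word malloc/calloc
# hits; grep exits nonzero when nothing matches, hence ignorestatus.
run(pipeline(`nm -D --defined-only $libmpi`, ignorestatus(`grep -wE "malloc|calloc"`)))

If that prints "T malloc" / "T calloc" entries, the library is defining and
exporting its own allocator rather than keeping it private.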

For now, I've worked around the issue by hacking up the MVAPICH2 source to
remove the definitions of malloc/calloc... and it DOES fix the "problem"...
but it's definitely not the right answer.

I'm going to talk to the Julia guys here at MIT tomorrow and see if I can
get to the bottom of why those symbols are getting exposed when libmpi is
getting loaded.
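
My working theory (just a sketch of the mechanism, not a confirmed
diagnosis) is that the interesting knob is the flags on that dlopen.  In
Julia terms (Libdl comes from Base on 0.5; the path is my local libmpi):

libmpi = "/home/gastdr/projects/falcon/root/lib/libmpi.so.12"

# RTLD_GLOBAL puts libmpi's exported symbols -- including its malloc/calloc
# -- into the global namespace, where libraries loaded later (like libpetsc)
# can bind to them:
Libdl.dlopen(libmpi, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)

# With RTLD_LOCAL they stay private to this handle (MPI may genuinely need
# RTLD_GLOBAL for its own plugins, so this is only to illustrate the
# difference):
# Libdl.dlopen(libmpi, Libdl.RTLD_LAZY | Libdl.RTLD_LOCAL)

That would be one way for PETSc to end up inside MVAPICH2's allocator, which
matches the _int_malloc/calloc frames in libmpi.so.12 in the trace below.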

Thanks Matt, for the help!

Derek

On Mon, Dec 5, 2016 at 11:56 PM Derek Gaston <friedmud at gmail.com> wrote:

> Please excuse the slightly off-topic post, but I'm pulling my hair out
> here and I'm hoping someone else has seen this before.
>
> I'm calling PETSc from Julia and it's working great on my Mac with MPICH,
> but I'm seeing a segfault on my Linux cluster using MVAPICH2.  I get the
> same segfault both with the "official" PETSc.jl and my own smaller wrapper
> MiniPETSc.jl: https://github.com/friedmud/MiniPETSc.jl
>
> Here is the stack trace I'm seeing:
>
> signal (11): Segmentation fault
> while loading /home/gastdr/projects/falcon/julia_mpi.jl, in expression
> starting on line 5
> _int_malloc at /home/gastdr/projects/falcon/root/lib/libmpi.so.12 (unknown
> line)
> calloc at /home/gastdr/projects/falcon/root/lib/libmpi.so.12 (unknown line)
> PetscOptionsCreate at
> /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/options.c:2578
> PetscInitialize at
> /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/pinit.c:761
> PetscInitializeNoPointers at
> /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/pinit.c:111
> __init__ at /home/gastdr/.julia/v0.5/MiniPETSc/src/MiniPETSc.jl:14
>
> The script I'm running, run.jl, is simply:
>
> using MiniPETSc
>
> It feels like libmpi is not quite loaded correctly yet.  It does get
> loaded by MPI.jl here:
> https://github.com/JuliaParallel/MPI.jl/blob/master/src/MPI.jl#L29 and
> I've verified that that code runs before PETSc is initialized.
>
> It looks ok to me... and I've tried a few variations on that dlopen() call
> and nothing makes it better.
>
> BTW: MPI.jl is working fine on its own.  I can write pure MPI Julia apps
> and run them in parallel on the cluster.  Just need to get this
> initialization of PETSc straightened out.
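>
> For reference, the kind of pure-MPI script that runs fine here looks
> roughly like this (standard MPI.jl calls, nothing PETSc-specific):
>
> using MPI
> MPI.Init()
> comm = MPI.COMM_WORLD
> println("rank ", MPI.Comm_rank(comm), " of ", MPI.Comm_size(comm))
> MPI.Barrier(comm)
> MPI.Finalize()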
>
> Thanks for any help!
>
> Derek
>