[mpich-discuss] static compiling with mpich2

Jason Palmer jason at sccn.ucsd.edu
Mon Feb 1 14:29:07 CST 2010


So, the problem seems to be limited to gfortran (4.4.3) when compiling with
-static and making MPICH2 calls. I don't get any errors when compiling
without -static.

Gdb shows the main Fortran executable dying in a thread mutex call inside a
Fortran I/O function:
==================================================
(gdb) run
Starting program: /home/jason/amica_gcc/a.out
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in ?? ()
#1  0x000000000047c203 in __gthread_mutex_trylock (n=6, do_create=1) at ../../../gcc-4.4.3/libgfortran/../gcc/gthr-posix.h:767
#2  get_external_unit (n=6, do_create=1) at ../../../gcc-4.4.3/libgfortran/io/unit.c:330
#3  0x000000000047aa0f in data_transfer_init (dtp=0x7fffffffe0c0, read_flag=0) at ../../../gcc-4.4.3/libgfortran/io/transfer.c:2017
#4  0x0000000000400302 in MAIN__ ()
#5  0x000000000047627a in main (argc=1, argv=0x7fffffffe5b8) at ../../../gcc-4.4.3/libgfortran/fmain.c:21
===================================================

I recompiled gcc without --enable-threads=posix, and I get the same error.
The gcc build itself completes without errors.

I get the exact same seg fault, at the same point in execution, when I run
the f90pi example program (from examples) compiled with -static.

Example f90pi compiles and runs when compiled without -static, but it still
has I/O issues: it doesn't print anything on the screen until <enter> is
typed, then it prints the usual input message but immediately exits (without
error).

On the other hand, with cpi and cxxpi, everything works fine with or without
-static.

I ran "make testing", and got the following I/O-related problems:
==========================================================
Processing directory coll
Looking in ./coll/testlist
Unexpected output in bcast2: mpiexec_juggling.ucsd.edu (handle_sig_occurred
1144): job ending due to env var MPIEXEC_TIMEOUT=420
Program bcast2 exited without No Errors
Unexpected output in bcast3: mpiexec_juggling.ucsd.edu (handle_sig_occurred
1144): job ending due to env var MPIEXEC_TIMEOUT=420
Program bcast3 exited without No Errors
....
Processing directory io
Looking in ./f77/io/testlist
Unexpected output in iwriteatf: mpiexec_juggling.ucsd.edu
(handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
Program iwriteatf exited without No Errors
... 11 more identical errors in ./f77/io/testlist for similar functions
....
Processing directory io
Looking in ./cxx/io/testlist
Unexpected output in iwriteatx: mpiexec_juggling.ucsd.edu
(handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
Program iwriteatx exited without No Errors
... 23 more identical errors in ./cxx/io/testlist for similar functions
....
Processing directory threads
Looking in ./threads/testlist
Processing directory pt2pt
Looking in ./threads/pt2pt/testlist
Unexpected output in threads: mpiexec_juggling.ucsd.edu (handle_sig_occurred
1144): job ending due to env var MPIEXEC_TIMEOUT=600
Program threads exited without No Errors
Some programs (threads) may still be running:
pids = 14676
The executable (threads) will not be removed.
Unexpected output in multisend4: mpiexec_juggling.ucsd.edu
(handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
Program multisend4 exited without No Errors
Processing directory comm
Looking in ./threads/comm/testlist
Processing directory init
Looking in ./threads/init/testlist
Processing directory spawn
Looking in ./threads/spawn/testlist
41 tests failed out of 580
============================================================
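
A log like the one above can be grouped by test list with a short script. The following is only a minimal sketch in Python, not part of MPICH2: the helper name summarize_failures and the SAMPLE excerpt are illustrative, and it assumes the "Looking in <path>" and "Program <name> exited without No Errors" line formats shown in the output above.

```python
import re
from collections import defaultdict

# Hypothetical helper (not part of MPICH2). Assumes the line formats seen
# in the "make testing" output quoted in this message.
LOOKING = re.compile(r"^Looking in (\S+)")
FAILED = re.compile(r"^Program (\S+) exited without No Errors")


def summarize_failures(log_text):
    """Map each testlist path to the names of the tests that failed in it."""
    failures = defaultdict(list)
    current = None  # testlist most recently announced by a "Looking in" line
    for line in log_text.splitlines():
        match = LOOKING.match(line)
        if match:
            current = match.group(1)
            continue
        match = FAILED.match(line)
        if match:
            failures[current].append(match.group(1))
    return dict(failures)


# Short excerpt of the log above, used only as sample input.
SAMPLE = """\
Processing directory coll
Looking in ./coll/testlist
Unexpected output in bcast2: mpiexec_juggling.ucsd.edu (handle_sig_occurred
1144): job ending due to env var MPIEXEC_TIMEOUT=420
Program bcast2 exited without No Errors
Unexpected output in bcast3: mpiexec_juggling.ucsd.edu (handle_sig_occurred
1144): job ending due to env var MPIEXEC_TIMEOUT=420
Program bcast3 exited without No Errors
Processing directory io
Looking in ./f77/io/testlist
Unexpected output in iwriteatf: mpiexec_juggling.ucsd.edu
(handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
Program iwriteatf exited without No Errors
"""

if __name__ == "__main__":
    for testlist, names in summarize_failures(SAMPLE).items():
        print(f"{testlist}: {len(names)} failed ({', '.join(names)})")
```

Grouping by test list makes it easier to see that the timeouts cluster in the coll, io, and threads directories rather than being spread evenly over the suite.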

So apparently cxx is having I/O issues as well, but I didn't get any errors
on cxxpi with -static.

I'm using gcc-4.4.3 (and latest gmp, mpfr, and mpc) and Mpich2-1.2.1 on an
x86_64-linux host (AMD Opteron).

Has anyone else used gfortran 4.4.3 to compile MPICH2 programs statically?
Again, the above testing errors notwithstanding, things seem to work when
compiled dynamically, and when using C or C++ with or without -static.

Also, my test program (MPI_INIT + print + MPI_FINALIZE) compiles and runs
successfully when the MPI calls are commented out (using -static and mpif90),
so the problem isn't just gfortran plus -static; it seems to involve the
combination of gfortran, -static, MPICH2, and possibly threads.

Maybe I should try explicitly disabling threads?

Thanks again for your help!

Jason



-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Rajeev Thakur
Sent: Friday, January 29, 2010 5:26 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] static compiling with mpich2

You can also run "make testing" in the top-level mpich2 directory, which
will run the entire test suite in test/mpi, including a whole bunch of
f90 tests (can take an hour or more).

A simpler first step may be to run the example in examples/f90

Rajeev

> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Jason Palmer
> Sent: Friday, January 29, 2010 6:17 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] static compiling with mpich2
> 
> The problem does seem to be related to MPICH and/or gcc with 
> static compilation ... oddly, I only get a seg fault when I 
> both print (to stdout) and execute an MPI call; either one
> alone works.
> 
> I can compile using mpif90 with or without -static.
> 
> In the following simple fortran program:
> 
> ---------------------
> program main
> use mpi
> implicit none
> integer :: ierr
> 
> print *, 'This is a test ...'
> 
> call MPI_INIT(ierr)
> call MPI_FINALIZE(ierr)
> 
> end program
> ------------------------
> 
> when compiled with MPICH2 mpif90 and gcc-4.4.3, and -static, 
> no optimization, I get a seg fault.
> 
> If I comment out the print statement, there is no error (even 
> with static compile).
> If I comment out the MPI statements, the print statement 
> works and no error (even though compiled with mpif90).
> 
> I get the error with both "use mpi" and "include 'mpif.h'".
> 
> I re-compiled MPICH2 without --enable-fast, and with 
> --enable-g=dbg and --enable-debug-info (I got a compile error 
> when I tried to use --enable-error-checking, complaining about 
> an unknown error number).
> 
> I ran gdb, and the seg fault happens during a return from a 
> thread mutex call. I compiled gcc with --enable-threads=posix, 
> but there is no explicit threading in my code here.
> 
> The MPI routines work fine with static and no print 
> statement, and vice-versa. And somehow the combination seems 
> to die in a thread call.
> 
> I will try recompiling gcc without --enable-threads=posix to 
> see what that does (assuming no immediate fixes are forthcoming.)
> 
> Thanks for your help.
> 
> Jason
> 
> 
> 
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
> Sent: Friday, January 29, 2010 1:21 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] static compiling with mpich2
> 
> We (the MPICH2 developers) primarily compile statically and 
> we definitely use static libraries in many of our nightly 
> tests.  GCC is also well tested with MPICH2 and is totally 
> compatible on most platforms.
> 
> Have you used gdb or valgrind to determine where/why the 
> segfault is occurring?
> 
> If the segfault is coming from within MPICH2, can you try 
> configuring with "--enable-g=dbg --enable-error-checking" and 
> *without* any "--enable-fast" arguments?  Versions of MPICH2 
> built for performance are much less forgiving of errors in 
> application code.
> 
> -Dave
> 
> On Jan 29, 2010, at 3:03 PM, Jason Palmer wrote:
> 
> > Hi all,
> >
> > We recently switched from Mpich1 to Mpich2 as part of a general
> > upgrade, and I'm re-building a program that was previously compiled
> > as a static binary under Mpich1 on 64-bit RedHat Linux using
> > pathscale fortran on an AMD Opteron cluster.
> >
> > We are now using gcc-4.4.3 (instead of pathscale), and I can compile
> > and run the code with dynamic libraries, but when I compile with
> > -static, the compile succeeds, but the binary gives a segmentation
> > fault when executed, with or without mpiexec.
> >
> > I have tried compiling Mpich2 (using gcc 4.4.3) both with
> > --enable-sharedlibs=gcc and without, but both give the error in the
> > static version but no error in the non-static version.
> >
> > Has anyone had success with static compiling of Mpich2 / gcc
> > programs?
> >
> > Thanks,
> > Jason
> >
> >
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss