[mpich-discuss] static compiling with mpich2

Jason Palmer jason at sccn.ucsd.edu
Mon Feb 1 15:54:00 CST 2010


Ok, thanks. ... though it seems to me that with a speed increase of around
33% for static compilation (with mpich1), arguing against static on
principle, esp. for small/medium scale parallel, performance-sensitive
programs, is moot. I guess this is finally a reason to prefer C over
Fortran ... but I'm still not ready to make that concession.

-Jason

-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
Sent: Monday, February 01, 2010 12:49 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] static compiling with mpich2

So I misunderstood your original question.  We don't usually link  
statically by using the "-static" flag to gcc/gfortran/etc.  Some  
googling around turned up the following two relevant pages:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33581
http://www.mail-archive.com/gcc@gcc.gnu.org/msg26770.html

The basic upshot of which was "don't use `-static', especially with  
threads".  When I said that we usually link statically, I meant that  
we compile without "--enable-sharedlibs=...".  That will generate a  
static library archive (libmpich.a) for MPICH2, but the system  
libraries will still be linked dynamically.
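
A rough way to tell the two cases apart is to look at what ldd reports
for the resulting binary. The sketch below is only an illustration: the
function name is made up, and it assumes the usual glibc ldd messages.
A binary linked against libmpich.a but with dynamic system libraries
should come out as "dynamic"; one linked with -static should come out
as "static".

```shell
# Classify a binary from the text that ldd prints for it.
# Reads ldd output on stdin and prints "static" or "dynamic".
# classify_ldd_output is a hypothetical helper name, not part of MPICH2.
classify_ldd_output() {
    # glibc's ldd reports one of these phrases for a fully static binary
    if grep -q -e 'not a dynamic executable' -e 'statically linked'; then
        echo static
    else
        echo dynamic
    fi
}

# Usage (a.out is a placeholder name):
#   ldd ./a.out 2>&1 | classify_ldd_output
```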

Jakub Jelinek proposes using
"-Wl,--whole-archive -lnptl -Wl,--no-whole-archive" to sort this out,
but I don't know anything about this solution.  It sounds rather
hack-ish to me.
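
For reference, here is roughly what that suggestion would look like on a
link line. This is an untested sketch: it only prints the command rather
than running it, and the compiler, source, and output names are
placeholders. The idea, as far as I understand it, is to force every
object from the thread library into the static link so that weakly
referenced pthread symbols are not left unresolved.

```shell
# Print (do not run) a fully static link line that wraps the thread
# library in --whole-archive / --no-whole-archive.
# FC, SRC, OUT and THREAD_LIB are placeholder variables for illustration.
static_link_line() {
    fc=${FC:-mpif90}
    src=${SRC:-test.f90}
    out=${OUT:-test_static}
    # Library name as quoted above; on many glibc systems the NPTL
    # library is linked as -lpthread instead (set THREAD_LIB if so).
    lib=${THREAD_LIB:--lnptl}
    echo "$fc -static $src -o $out -Wl,--whole-archive $lib -Wl,--no-whole-archive"
}

static_link_line
```

Setting THREAD_LIB=-lpthread before calling the function prints the
variant for systems where the thread library goes by that name.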

-Dave

On Feb 1, 2010, at 2:29 PM, Jason Palmer wrote:

> So, the problem seems to be limited to gfortran (4.4.3) compiling with
> -static and MPICH2 commands. I don't get any errors when compiling
> without -static.
>
> Gdb shows the main Fortran executable dying in a thread mutex call
> inside a Fortran I/O function:
> ==================================================
> (gdb) run
> Starting program: /home/jason/amica_gcc/a.out
> [Thread debugging using libthread_db enabled]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000000000 in ?? ()
> (gdb) where
> #0  0x0000000000000000 in ?? ()
> #1  0x000000000047c203 in __gthread_mutex_trylock (n=6, do_create=1) at
> ../../../gcc-4.4.3/libgfortran/../gcc/gthr-posix.h:767
> #2  get_external_unit (n=6, do_create=1) at
> ../../../gcc-4.4.3/libgfortran/io/unit.c:330
> #3  0x000000000047aa0f in data_transfer_init (dtp=0x7fffffffe0c0,
> read_flag=0) at ../../../gcc-4.4.3/libgfortran/io/transfer.c:2017
> #4  0x0000000000400302 in MAIN__ ()
> #5  0x000000000047627a in main (argc=1, argv=0x7fffffffe5b8) at
> ../../../gcc-4.4.3/libgfortran/fmain.c:21
> ===================================================
>
> I recompiled gcc without --enable-threads=posix, and I get the same
> error. I get no errors when building gcc.
>
> I get the exact same seg fault, at the same point in execution, when I
> run the f90pi example program compiled with -static.
>
> Example f90pi compiles and runs when compiled without -static, but it
> still has I/O issues: it doesn't print anything on the screen until
> <enter> is typed, then it prints the usual input message but immediately
> exits (without error).
>
> On the other hand, with cpi and cxxpi, everything works fine with or
> without -static.
>
> I ran make testing, and got the following io related problems:
> ==========================================================
> Processing directory coll
> Looking in ./coll/testlist
> Unexpected output in bcast2: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=420
> Program bcast2 exited without No Errors
> Unexpected output in bcast3: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=420
> Program bcast3 exited without No Errors
> ....
> Processing directory io
> Looking in ./f77/io/testlist
> Unexpected output in iwriteatf: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
> Program iwriteatf exited without No Errors
> ... 11 more identical errors in ./f77/io/testlist for similar functions
> ....
> Processing directory io
> Looking in ./cxx/io/testlist
> Unexpected output in iwriteatx: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
> Program iwriteatx exited without No Errors
> ... 23 more identical errors in ./cxx/io/testlist for similar functions
> ....
> Processing directory threads
> Looking in ./threads/testlist
> Processing directory pt2pt
> Looking in ./threads/pt2pt/testlist
> Unexpected output in threads: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=600
> Program threads exited without No Errors
> Some programs (threads) may still be running:
> pids = 14676
> The executable (threads) will not be removed.
> Unexpected output in multisend4: mpiexec_juggling.ucsd.edu (handle_sig_occurred 1144): job ending due to env var MPIEXEC_TIMEOUT=180
> Program multisend4 exited without No Errors
> Processing directory comm
> Looking in ./threads/comm/testlist
> Processing directory init
> Looking in ./threads/init/testlist
> Processing directory spawn
> Looking in ./threads/spawn/testlist
> 41 tests failed out of 580
> ============================================================
>
> So apparently cxx is having I/O issues as well, but I didn't get any
> errors on cxxpi with -static.
>
> I'm using gcc-4.4.3 (and the latest gmp, mpfr, and mpc) and MPICH2-1.2.1
> on an x86_64-linux host (AMD Opteron).
>
> Has anyone else used gfortran 4.4.3 to compile MPICH2 programs
> statically? Again, the above testing errors notwithstanding, things seem
> to work when compiled dynamically, and when using C or C++ with or
> without -static.
>
> Also, my test program (MPI_INIT + print + MPI_FINALIZE) compiles and
> runs successfully when the MPI calls are commented out (using -static
> and mpif90), so the problem doesn't seem to be limited to gfortran and
> -static, but seems to involve gfortran, -static, MPICH2, and possibly
> threads.
>
> Maybe I should try explicitly disabling threads?
>
> Thanks again for your help!
>
> Jason
>
>
>
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Rajeev Thakur
> Sent: Friday, January 29, 2010 5:26 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] static compiling with mpich2
>
> You can also run "make testing" in the top-level mpich2 directory,  
> which
> will run the entire test suite in test/mpi, including a whole bunch of
> f90 tests (can take an hour or more).
>
> A simpler first step may be to run the example in examples/f90
>
> Rajeev
>
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov
>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Jason Palmer
>> Sent: Friday, January 29, 2010 6:17 PM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] static compiling with mpich2
>>
>> The problem does seem to be related to MPICH and/or gcc with
>> static compilation ... oddly, I only get a seg fault when I
>> try to print (to stdout) and execute an MPI instruction, but
>> either alone works.
>>
>> I can compile using mpif90 with or without -static.
>>
>> In the following simple fortran program:
>>
>> ---------------------
>> program main
>> use mpi
>> implicit none
>> integer :: ierr
>>
>> print *, 'This is a test ...'
>>
>> call MPI_INIT(ierr)
>> call MPI_FINALIZE(ierr)
>>
>> end program
>> ------------------------
>>
>> when compiled with MPICH2 mpif90 and gcc-4.4.3, and -static,
>> no optimization, I get a seg fault.
>>
>> If I comment out the print statement, there is no error (even
>> with static compile).
>> If I comment out the MPI statements, the print statement
>> works and no error (even though compiled with mpif90).
>>
>> I get the error both with the "use mpi" and include mpif.h.
>>
>> I re-compiled MPICH2 without --enable-fast, and with
>> --enable-g=dbg and --enable-debug-info (I got a compile error
>> when I tried to use --enable-error-checking, complaining about
>> an unknown error number).
>>
>> I ran gdb, and the seg fault happens during a return from a
>> thread mutex call. I compiled gcc with --enable-threads=posix,
>> but there is no explicit threading in my code here.
>>
>> The MPI routines work fine with static and no print
>> statement, and vice-versa. And somehow the combination seems
>> to die in a thread call.
>>
>> I will try recompiling gcc without --enable-threads=posix to
>> see what that does (assuming no immediate fixes are forthcoming.)
>>
>> Thanks for your help.
>>
>> Jason
>>
>>
>>
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov
>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
>> Sent: Friday, January 29, 2010 1:21 PM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] static compiling with mpich2
>>
>> We (the MPICH2 developers) primarily compile statically and
>> we definitely use static libraries in many of our nightly
>> tests.  GCC is also well tested with MPICH2 and is totally
>> compatible on most platforms.
>>
>> Have you used gdb or valgrind to determine where/why the
>> segfault is occurring?
>>
>> If the segfault is coming from within MPICH2, can you try
>> configuring with "--enable-g=dbg --enable-error-checking" and
>> *without* any "--enable-fast" arguments?  Versions of MPICH2
>> built for performance are much less forgiving of errors in
>> application code.
>>
>> -Dave
>>
>> On Jan 29, 2010, at 3:03 PM, Jason Palmer wrote:
>>
>>> Hi all,
>>>
>>> We recently switched from MPICH1 to MPICH2 as part of a general
>>> upgrade, and I'm re-building a program that was previously compiled
>>> as a static binary under MPICH1 on 64-bit RedHat Linux using
>>> pathscale fortran on an AMD Opteron cluster.
>>>
>>> We are now using gcc-4.4.3 (instead of pathscale), and I can compile
>>> and run the code with dynamic libraries, but when I compile with
>>> -static, the compile succeeds, but the binary gives a segmentation
>>> fault when executed, with or without mpiexec.
>>>
>>> I have tried compiling MPICH2 (using gcc 4.4.3) both with
>>> --enable-sharedlibs=gcc and without, but both give the error in the
>>> static version but no error in the non-static version.
>>>
>>> Has anyone had success with static compiling of MPICH2 / gcc
>>> programs?
>>>
>>> Thanks,
>>> Jason
>>>
>>>
>>> _______________________________________________
>>> mpich-discuss mailing list
>>> mpich-discuss at mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
