[MPICH] build mpich2 with Myrinet GM

Wei-keng Liao wkliao at ece.northwestern.edu
Thu Feb 28 11:38:09 CST 2008


I got an error when I applied the patch:
mercury::mpich2-1.0.6p1(11:34am) #448% patch -p0 < gm.patch
patching file src/mpid/ch3/include/mpidpre.h
patching file src/mpid/ch3/channels/nemesis/nemesis/net_mod/gm_module/gm_module_impl.h
Hunk #1 succeeded at 51 (offset -1 lines).
patching file src/mpid/ch3/channels/nemesis/nemesis/net_mod/gm_module/gm_module_poll.c
patching file src/mpid/ch3/channels/nemesis/nemesis/net_mod/gm_module/gm_module_send.c
Hunk #2 FAILED at 233.
Hunk #3 succeeded at 265 (offset -80 lines).
Hunk #4 succeeded at 343 (offset -81 lines).
1 out of 4 hunks FAILED -- saving rejects to file src/mpid/ch3/channels/nemesis/nemesis/net_mod/gm_module/gm_module_send.c.rej

mercury::mpich2-1.0.6p1(11:36am) #450% cat src/mpid/ch3/channels/nemesis/nemesis/net_mod/gm_module/gm_module_send.c.rej
***************
*** 237,243 ****
  {
      int mpi_errno = MPI_SUCCESS;
      char *dataptr;
-     int datalen;
      int complete;  
  
      while (active_send || !SEND_Q_EMPTY())
--- 233,239 ----
  {
      int mpi_errno = MPI_SUCCESS;
      char *dataptr;
+     MPIDI_msg_sz_t datalen;
      int complete;  
  
      while (active_send || !SEND_Q_EMPTY())
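
For reference, the rejected hunk is just a one-line type change. Assuming the
function in gm_module_send.c still matches the context shown above, it can be
applied by hand by replacing the declaration

    int datalen;

with

    MPIDI_msg_sz_t datalen;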

Wei-keng


On Thu, 28 Feb 2008, Darius Buntinas wrote:

> 
> Thanks for reporting this.  Here's a patch that should fix it.  Let me know if
> you have any more trouble.
> 
> Thanks,
> -d
> 
> On 02/27/2008 10:18 PM, Wei-keng Liao wrote:
> > OK, the patch fixed the problem and I was able to build mpich. But when
> > I ran the alltoallv test in test/mpi/coll using 4 processes, it failed with
> > the error message:
> >   rank 2 in job 1 tg-c527_40397 caused collective abort of all ranks
> >   exit status of rank 2: killed by signal 9 
> > 
> > Running gdb on the core dump shows
> > (gdb) where
> > #0  0x20000000001c9120 in ?? ()
> > #1  0x40000000000a71d0 in send_pkt ()
> > #2  0x40000000000a6530 in MPID_nem_gm_iSendContig ()
> > #3  0x40000000000ab000 in MPIDI_CH3_iSendv ()
> > #4  0x400000000003e890 in MPIDI_CH3_EagerContigIsend ()
> > #5  0x4000000000048160 in MPID_Isend ()
> > #6  0x400000000000ec00 in MPIC_Isend ()
> > #7  0x400000000000a8e0 in MPIR_Alltoallv ()
> > #8  0x400000000000b2d0 in PMPI_Alltoallv ()
> > #9  0x4000000000003670 in main ()
> > #10 0x40000000000a71d0 in send_pkt ()
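> > 
> > The test is essentially a small alltoallv exchange; a minimal sketch along
> > these lines (hypothetical code, not the actual test/mpi/coll source)
> > exercises the same eager isend path that appears in the backtrace:
> > 
> >     #include <mpi.h>
> >     #include <stdio.h>
> >     #include <stdlib.h>
> > 
> >     int main(int argc, char **argv)
> >     {
> >         int rank, size, i;
> >         int *sendbuf, *recvbuf, *counts, *displs;
> > 
> >         MPI_Init(&argc, &argv);
> >         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >         MPI_Comm_size(MPI_COMM_WORLD, &size);
> > 
> >         /* each rank sends one int to every rank, contiguous layout */
> >         sendbuf = malloc(size * sizeof(int));
> >         recvbuf = malloc(size * sizeof(int));
> >         counts  = malloc(size * sizeof(int));
> >         displs  = malloc(size * sizeof(int));
> >         for (i = 0; i < size; i++) {
> >             sendbuf[i] = rank * 100 + i;
> >             counts[i]  = 1;
> >             displs[i]  = i;
> >         }
> > 
> >         MPI_Alltoallv(sendbuf, counts, displs, MPI_INT,
> >                       recvbuf, counts, displs, MPI_INT, MPI_COMM_WORLD);
> > 
> >         printf("rank %d: recvbuf[0] = %d\n", rank, recvbuf[0]);
> >         free(sendbuf); free(recvbuf); free(counts); free(displs);
> >         MPI_Finalize();
> >         return 0;
> >     }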
> > 
> > Wei-keng
> > 
> > 
> > On Wed, 27 Feb 2008, Darius Buntinas wrote:
> > 
> > > Sorry about that.  I guess I didn't test this on an itanium after making
> > > some changes there.
> > >
> > > I've attached a patch file that should fix this.  I'm still not sure why
> > > it's not working with your intel compiler.
> > >
> > > Apply the patch like this (from the mpich2 source directory)
> > >   patch -p0 < ia64_atomics.patch
> > >
> > > Then do a make clean and make.
> > >
> > > -d
> > >
> > > On 02/27/2008 11:33 AM, Wei-keng Liao wrote:
> > > > I got a different error when building mpich with gcc 3.2.2, while
> > > > compiling the file nemesis/src/mpid_nem_alloc.c. (I used ifort for the
> > > > FC environment variable.)
> > > >
> > > > In file included from ../include/mpid_nem_impl.h:13,
> > > >                  from mpid_nem_alloc.c:7:
> > > > ../include/mpid_nem_atomics.h: In function `MPID_NEM_SWAP':
> > > > ../include/mpid_nem_atomics.h:27: warning: dereferencing `void *' pointer
> > > > ../include/mpid_nem_atomics.h: In function `MPID_NEM_CAS':
> > > > ../include/mpid_nem_atomics.h:54: warning: dereferencing `void *' pointer
> > > > ../include/mpid_nem_atomics.h: In function `MPID_NEM_FETCH_AND_INC':
> > > > ../include/mpid_nem_atomics.h:164: parse error before string constant
> > > >
> > > > Also, I tried Intel icc 8.1.037 and it failed with the same message as
> > > > icc 9.0.032 and 9.1.046.
> > > >
> > > > Wei-keng
> > > >
> > > >
> > > > On Tue, 26 Feb 2008, Darius Buntinas wrote:
> > > > > It looks like the icc compiler you're using doesn't like the gcc-style
> > > > > inline assembly code.
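> > > > >
> > > > > For context, the atomics header defines swap/CAS with gcc-style inline
> > > > > assembly behind compiler/architecture guards, and when nothing matches,
> > > > > the build falls through to that #error. A hypothetical sketch of the
> > > > > pattern (x86 shown for illustration, not the actual ia64 code in
> > > > > mpid_nem_atomics.h):
> > > > >
> > > > >     #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
> > > > >     static inline int nem_swap(volatile int *ptr, int val)
> > > > >     {
> > > > >         /* xchg atomically swaps the register with memory */
> > > > >         __asm__ __volatile__("xchgl %0,%1"
> > > > >                              : "=r"(val), "+m"(*ptr)
> > > > >                              : "0"(val)
> > > > >                              : "memory");
> > > > >         return val;   /* old value of *ptr */
> > > > >     }
> > > > >     #else
> > > > >     #error No swap function defined for this architecture
> > > > >     #endif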
> > > > >
> > > > > What version of icc do you have?
> > > > > Can you try compiling with gcc instead of icc?
> > > > >
> > > > > -d
> > > > >
> > > > > On 02/26/2008 12:32 PM, Wei-keng Liao wrote:
> > > > > > Attached are 3 files:
> > > > > >
> > > > > > out.configure  -  stdout from configure
> > > > > > out.make       -  stdout from make
> > > > > > config.log
> > > > > >
> > > > > > Wei-keng
> > > > > >
> > > > > > On Tue, 26 Feb 2008, Darius Buntinas wrote:
> > > > > >
> > > > > > > Can you send us the output of configure as well as config.log?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > -d
> > > > > > >
> > > > > > > On 02/26/2008 11:35 AM, Wei-keng Liao wrote:
> > > > > > > > I got an error during make:
> > > > > > > >
> > > > > > > > ../include/mpid_nem_atomics.h(31): catastrophic error: #error directive: No swap function defined for this architecture
> > > > > > > >   #error No swap function defined for this architecture
> > > > > > > >    ^
> > > > > > > > compilation aborted for mpid_nem_alloc.c (code 4)
> > > > > > > >
> > > > > > > > I am using configure options:
> > > > > > > >           --with-device=ch3:nemesis:gm  \
> > > > > > > >           --with-gm=/opt/gm \
> > > > > > > >           --enable-f77 --enable-f90 --enable-cxx \
> > > > > > > >           --enable-fast \
> > > > > > > >           --enable-romio \
> > > > > > > >           --without-mpe \
> > > > > > > >           --with-file-system=ufs
> > > > > > > >
> > > > > > > > and the command "uname -a" on the machine is
> > > > > > > > Linux tg-login4 2.4.21-309.tg1 #1 SMP Thu Jun 1 17:07:28 CDT 2006 ia64 unknown
> > > > > > > >
> > > > > > > > I am using Intel compiler v 9.1.043
> > > > > > > >
> > > > > > > > Wei-keng
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, 26 Feb 2008, Darius Buntinas wrote:
> > > > > > > > > On 02/26/2008 10:08 AM, Wei-keng Liao wrote:
> > > > > > > > > > I have a few questions on building mpich2-1.0.6p1 with the
> > > > > > > > > > Myrinet GM library.
> > > > > > > > > >
> > > > > > > > > > On my target machine, the GM library (include, lib, bin, etc.)
> > > > > > > > > > is in /opt/gm. According to the MPICH README, I used the 2
> > > > > > > > > > options below when configuring:
> > > > > > > > > >     --with-device=ch3:nemesis:gm  and --with-gm=/opt/gm
> > > > > > > > > >
> > > > > > > > > > I can see both libgm.a and libgm.so are in /opt/gm/lib.
> > > > > > > > > >
> > > > > > > > > > Q1: Do I need other configure options or environment variable
> > > > > > > > > >     settings (in addition to CC, FC, CXX, F90)? Should I set
> > > > > > > > > >     LDFLAGS to "-L/opt/gm/lib -lgm" ?
> > > > > > > > > Nope, the --with-gm=/opt/gm should take care of all of that for you.
> > > > > > > > >
> > > > > > > > > > Q2: Since nemesis does not support MPI dynamic process routines
> > > > > > > > > >     yet and I need those routines, can I use
> > > > > > > > > >     --with-device=ch3:sock:gm instead?
> > > > > > > > > No, only nemesis supports gm.
> > > > > > > > >
> > > > > > > > > > Q3: Do I need anything else (source code, libraries) from
> > > > > > > > > >     Myrinet to build mpich? Or is /opt/gm good enough?
> > > > > > > > > All you need is libgm.a and gm.h.
> > > > > > > > >
> > > > > > > > > > Q4: Once mpich is built, is there a way to verify that GM is
> > > > > > > > > >     actually used?
> > > > > > > > > Well, you should see a performance improvement over using sockets.
> > > > > > > > > Run a ping-pong test; you should see latencies around 10us or less.
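> > > > > > > > >
> > > > > > > > > A minimal ping-pong sketch (just an illustration, not an official
> > > > > > > > > benchmark) would be something like the program below; run it on 2
> > > > > > > > > processes. Over GM the one-way latency should be in the ~10us
> > > > > > > > > range, while over TCP sockets it is typically much higher.
> > > > > > > > >
> > > > > > > > >     #include <mpi.h>
> > > > > > > > >     #include <stdio.h>
> > > > > > > > >
> > > > > > > > >     int main(int argc, char **argv)
> > > > > > > > >     {
> > > > > > > > >         int rank, i, iters = 10000;
> > > > > > > > >         char buf = 0;
> > > > > > > > >         double t0, t1;
> > > > > > > > >
> > > > > > > > >         MPI_Init(&argc, &argv);
> > > > > > > > >         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> > > > > > > > >
> > > > > > > > >         MPI_Barrier(MPI_COMM_WORLD);
> > > > > > > > >         t0 = MPI_Wtime();
> > > > > > > > >         for (i = 0; i < iters; i++) {
> > > > > > > > >             if (rank == 0) {
> > > > > > > > >                 MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
> > > > > > > > >                 MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
> > > > > > > > >                          MPI_STATUS_IGNORE);
> > > > > > > > >             } else if (rank == 1) {
> > > > > > > > >                 MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
> > > > > > > > >                          MPI_STATUS_IGNORE);
> > > > > > > > >                 MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
> > > > > > > > >             }
> > > > > > > > >         }
> > > > > > > > >         t1 = MPI_Wtime();
> > > > > > > > >
> > > > > > > > >         /* one-way latency = round-trip time / 2 */
> > > > > > > > >         if (rank == 0)
> > > > > > > > >             printf("latency: %.2f us\n",
> > > > > > > > >                    (t1 - t0) / iters / 2.0 * 1e6);
> > > > > > > > >
> > > > > > > > >         MPI_Finalize();
> > > > > > > > >         return 0;
> > > > > > > > >     }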
> > > > > > > > >
> > > > > > > > > -d
> > > > > > > > >
> > >
> > 
> 
> 



