[mpich-discuss] nemesis
Anthony Chan
chan at mcs.anl.gov
Mon Aug 17 02:16:28 CDT 2009
Removing --enable-sharedlibs=gcc from your configure command will build
static instead of shared libraries.
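
As a rough sketch, this is the configure line quoted further down with only that one option dropped. The prefix and Open-MX paths are the original poster's and may differ on your system. CONFIGURE_ARGS is just an illustrative variable name, and the echo only prints the command so it can be checked before running it from the top-level mpich2-1.1.1 directory.

```shell
# Options from the quoted configure line, minus --enable-sharedlibs=gcc.
# CONFIGURE_ARGS is an illustrative name; the paths are the poster's own.
CONFIGURE_ARGS="--prefix=/opt/mpich2-install \
--with-device=ch3:nemesis:mx \
--with-mx-lib=/opt/open-mx/lib/ \
--with-mx-include=/opt/open-mx/include/"

# Print the resulting command for review instead of running it directly.
echo ./configure $CONFIGURE_ARGS
```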
----- shenqian at tsinghua.org.cn wrote:
> > Hum... this is strange.
> > I'm going to check this and let you know.
> > All I can say is that Nemesis/MX *should* be compatible. This could
> > also be a bug in Open-MX.
> > Did you try to compile everything statically, without creating
> > shared libs?
>
> Which option should I use: --enable-sharedlibs=none or
> --enable-dynamiclibs=none?
>
>
> >
> > Guillaume
> >
> >
> > shenqian at tsinghua.org.cn wrote:
> > > Hi Rajeev,
> > >
> > > I ran "make testing" in the top-level mpich2-1.1.1 directory and got
> > > a great many errors: 94 of the 553 tests in summary.xml failed.
> > > My configure options are:
> > >
> > > ./configure --prefix=/opt/mpich2-install
> > >   --with-device=ch3:nemesis:mx --with-mx-lib=/opt/open-mx/lib/
> > >   --with-mx-include=/opt/open-mx/include/ --enable-sharedlibs=gcc
> > >
> > > I also built mpich2-1.1.1 with the default settings and ran "make
> > > testing" in the top-level directory. All 553 tests passed, with no
> > > failures in summary.xml.
> > >
> > > So is Nemesis/MX really compatible with Open-MX? Or are there
> > > configure options I am missing?
> > >
> > > Thanks,
> > > Qian Shen
> > >
> > >
> > >> You can also run "make testing" in the top-level mpich2 directory.
> > >> It will run the entire test suite in test/mpi. If they run, it
> > >> would indicate there is something wrong with your program.
> > >>
> > >> Rajeev
> > >>
> > >>
> > >>> -----Original Message-----
> > >>> From: mpich-discuss-bounces at mcs.anl.gov
> > >>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> > >>> shenqian at tsinghua.org.cn
> > >>> Sent: Sunday, August 16, 2009 8:54 AM
> > >>> To: mpich-discuss at mcs.anl.gov
> > >>> Subject: Re: [mpich-discuss] nemesis
> > >>>
> > >>> Hi Guillaume,
> > >>>
> > >>> Thanks for your response.
> > >>>
> > >>> Because the program is another company's property, I can't
> > >>> send it to you :-( If I can reproduce it with my own
> > >>> program, I will send it to you as soon as possible.
> > >>>
> > >>> I'm trying to use Nemesis/MX on top of Open-MX for high
> > >>> performance. I got a good performance improvement by using
> > >>> Open-MX instead of TCP/IP, but the hang in
> > >>> MPI_Barrier() has been puzzling me lately. Could you suggest a
> > >>> way to debug it?
> > >>>
> > >>> Thanks,
> > >>> Qian Shen
> > >>>
> > >>>
> > >>>> Hello,
> > >>>>
> > >>>> Yes, Nemesis/MX is supposed to be compatible with Open-MX.
> > >>>> Could you send me your example program so that I can find out
> > >>>> what the problem is?
> > >>>>
> > >>>>
> > >>>> Thanks.
> > >>>> Guillaume
> > >>>>
> > >>>> shenqian at tsinghua.org.cn wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> I built mpich2-1.1.1 on top of open-mx-1.1.1 with
> > >>>>> ch3:nemesis:mx enabled. Nemesis/MX should be compatible with
> > >>>>> Open-MX, shouldn't it? I tested the examples shipped with mpich2,
> > >>>>> and they work well. But my MPI program always hangs in
> > >>>>> MPI_Barrier(). The output messages are:
> > >>>>>
> > >>>>> Open-MX: Send request (seqnum 105 sesnum 0) timeout, already sent
> > >>>>> 1001 times, resetting partner status
> > >>>>> Open-MX: Cleaning partner 00:11:09:5b:7d:16 endpoint 0
> > >>>>> Open-MX: Dropped 1 pending send requests to partner
> > >>>>>
> > >>>>> It seems that the requests cannot be sent to the other
> > >>>>> nodes. If I switch to the traditional TCP/IP stack and use the
> > >>>>> ch3:nemesis device, the program runs successfully.
> > >>>>>
> > >>>>> Could anyone tell me how to handle this issue?
> > >>>>>
> > >>>>> Regards,
> > >>>>> Qian Shen