[mpich-discuss] nemesis
shenqian at tsinghua.org.cn
Mon Aug 17 00:37:17 CDT 2009
> Hmm... this is strange.
> I'm going to check this and let you know.
> All I can say is that Nemesis/MX *should* be compatible. This could
> also be a bug in Open-MX.
> Did you try to compile everything statically and without creating shared
> libs?
Which option should I use: --enable-sharedlibs=none or --enable-dynamiclibs=none?
>
> Guillaume
>
>
> shenqian at tsinghua.org.cn wrote:
> > Hi Rajeev,
> >
> > I ran "make testing" in the top-level mpich2-1.1.1 directory and got a great many errors: 94 of the 553 tests listed in summary.xml failed. My configure options are:
> >
> > ./configure --prefix=/opt/mpich2-install --with-device=ch3:nemesis:mx --with-mx-lib=/opt/open-mx/lib/ --with-mx-include=/opt/open-mx/include/ --enable-sharedlibs=gcc
> >
> > I also built mpich2-1.1.1 with the default settings and ran "make testing" in the top-level directory. All 553 tests passed, with no failures in summary.xml.
> >
> > So is Nemesis/MX really compatible with Open-MX? Or am I missing some configure options?
> >
> > Thanks,
> > Qian Shen
> >
> >
> >> You can also run "make testing" in the top-level mpich2 directory. It
> >> will run the entire test suite in test/mpi. If the tests run successfully,
> >> it would indicate there is something wrong with your program.
> >>
> >> Rajeev
> >>
> >>
> >>> -----Original Message-----
> >>> From: mpich-discuss-bounces at mcs.anl.gov
> >>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> >>> shenqian at tsinghua.org.cn
> >>> Sent: Sunday, August 16, 2009 8:54 AM
> >>> To: mpich-discuss at mcs.anl.gov
> >>> Subject: Re: [mpich-discuss] nemesis
> >>>
> >>> Hi Guillaume,
> >>>
> >>> Thanks for your response.
> >>>
> >>> Because the program is another company's property, I can't
> >>> send it to you :-( If I can reproduce it with my own
> >>> program, I will send it to you as soon as possible.
> >>>
> >>> I'm trying to use Nemesis/MX on top of Open-MX for high
> >>> performance. I got a good performance improvement by using
> >>> Open-MX instead of TCP/IP, but the hang in MPI_Barrier()
> >>> has been puzzling me for days. Could you suggest a
> >>> way to debug it?
> >>>
> >>> Thanks,
> >>> Qian Shen
> >>>
> >>>
> >>>> Hello,
> >>>>
> >>>> Yes, Nemesis/MX is supposed to be compatible with Open-MX. Could you
> >>>> send me your example program so that I can find out what the
> >>>> problem is?
> >>>>
> >>>>
> >>>> Thanks.
> >>>> Guillaume
> >>>>
> >>>> shenqian at tsinghua.org.cn wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I built mpich2-1.1.1 on top of open-mx-1.1.1 with
> >>>>> ch3:nemesis:mx enabled. Nemesis/MX should be compatible with
> >>>>> Open-MX, shouldn't it? I tested the examples shipped with mpich2,
> >>>>> and they work well. But my MPI program always hangs in
> >>>>> MPI_Barrier(). The output messages are:
> >>>>>
> >>>>> Open-MX: Send request (seqnum 105 sesnum 0) timeout, already sent
> >>>>> 1001 times, resetting partner status
> >>>>> Open-MX: Cleaning partner 00:11:09:5b:7d:16 endpoint 0
> >>>>> Open-MX: Dropped 1 pending send requests to partner
> >>>>>
> >>>>> It seems that the requests cannot be sent to the other
> >>>>> nodes. If I switch to the traditional TCP/IP stack, using the
> >>>>> ch3:nemesis device, the program runs successfully.
> >>>>>
> >>>>> Could anyone tell me how to handle this issue?
> >>>>>
> >>>>> Regards,
> >>>>> Qian Shen
>
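
For anyone triaging a similar hang, a small script can tally the Open-MX messages from the original report per partner, which helps show whether a single peer (here 00:11:09:5b:7d:16) is the one timing out. This is only a sketch: the regular expressions are modelled on the three messages quoted in this thread, not on any Open-MX specification, and the function name summarize_openmx_log is made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the three Open-MX messages quoted in this thread
# (an assumption; other Open-MX versions may word them differently).
TIMEOUT_RE = re.compile(
    r"Open-MX: Send request \(seqnum (\d+) sesnum (\d+)\) timeout, "
    r"already sent (\d+) times"
)
CLEAN_RE = re.compile(
    r"Open-MX: Cleaning partner ([0-9a-fA-F:]{17}) endpoint (\d+)"
)
DROP_RE = re.compile(
    r"Open-MX: Dropped (\d+) pending send requests? to partner"
)


def summarize_openmx_log(text):
    """Count send timeouts, cleaned partners and dropped requests."""
    # Re-join wrapped lines: a line that does not start with "Open-MX:"
    # is treated as the continuation of the previous message.
    messages = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Open-MX:") or not messages:
            messages.append(line)
        else:
            messages[-1] += " " + line

    timeouts = 0
    cleaned = Counter()  # keyed by (MAC address, endpoint number)
    dropped = 0
    for msg in messages:
        if TIMEOUT_RE.search(msg):
            timeouts += 1
        match = CLEAN_RE.search(msg)
        if match:
            cleaned[(match.group(1).lower(), int(match.group(2)))] += 1
        match = DROP_RE.search(msg)
        if match:
            dropped += int(match.group(1))

    return {
        "timeouts": timeouts,
        "cleaned_partners": dict(cleaned),
        "dropped_requests": dropped,
    }


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py < captured_stderr.txt
    print(summarize_openmx_log(sys.stdin.read()))
```

If one MAC address dominates the cleaned_partners counts, the link or the Open-MX setup on that particular node is a reasonable first place to look.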