[MOAB-dev] failed to run parallel test case: moab-5.1.0 with mpich-3.3

Lorenzo Botti bottilorenzo at gmail.com
Tue Jun 11 10:40:33 CDT 2019


Dear Vijay,
the examples run fine with mpich-3.3.1 so I guess there was some
problem with the previous mpich release (3.3).
Sorry for bothering you.
Bests
Lorenzo
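
For anyone hitting a similar hang, the manual check suggested further down in this thread can be wrapped in a time limit so that a hanging test is reported instead of blocking the shell. This is only a minimal sketch: it assumes GNU coreutils `timeout` is installed, and the function name `run_with_limit` and the 120 second default are illustrative choices, not part of MOAB or mpich.

```shell
#!/bin/sh
# Time limit in seconds; 120 is an arbitrary illustrative default.
TIME_LIMIT="${TIME_LIMIT:-120}"

# run_with_limit is a hypothetical helper, not something MOAB provides.
# It runs the given command under GNU timeout and prints PASS, FAIL or HANG.
run_with_limit() {
    timeout "$TIME_LIMIT" "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "PASS: $*"
    elif [ "$status" -eq 124 ]; then
        # GNU timeout exits with status 124 when the time limit expires.
        echo "HANG: $* (still running after ${TIME_LIMIT}s)"
    else
        echo "FAIL: $* (exit status $status)"
    fi
    return "$status"
}
```

With the function defined, the two tests named in the thread could be launched from build/test/parallel as `run_with_limit mpiexec -n 2 ./parallel_unit_tests` and `run_with_limit mpiexec -n 2 ./parallel_hdf5_test`.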

On Mon, Jun 10, 2019 at 14:41 Lorenzo Botti
<bottilorenzo at gmail.com> wrote:
>
> Dear Vijay,
> I'm also surprised by this issue.
> Tomorrow I will have the possibility to launch the tests individually,
> I don't have access to the computer right now.
> It would be great if you could test the same setup.
> Thanks a lot!
> Lorenzo
>
> On Sun, Jun 9, 2019 at 22:16 Vijay S. Mahadevan
> <vijay.m at gmail.com> wrote:
> >
> > Lorenzo,
> >
> > I routinely run with mpich-3.2 on my MacBook with clang and have not
> > seen any issues either in-source or out-of-source. Our buildbot builds
> > do use mpich-3.1 on Ubuntu Trusty and there are no issues there
> > either. However, I currently don't see any builds with mpich-3.3 -
> > though I would be surprised to see a different behavior with a newer
> > minor MPI release. We can test it out and let you know if the failures
> > can be replicated.
> >
> > Vijay
> >
> > On Sun, Jun 9, 2019 at 1:00 PM Lorenzo Botti <bottilorenzo at gmail.com> wrote:
> > >
> > > Dear Vijay,
> > > I tried to run the tests individually last week and some of them hung. I do not remember exactly which ones, but I can tell you in more detail if you need to know. Anyway, since something is not working correctly, I'd like to know whether your builds cover mpich-3.3.
> > > Bests
> > > Lorenzo
> > >
> > > On Sun, Jun 9, 2019, 14:59 Vijay S. Mahadevan <vijay.m at gmail.com> wrote:
> > >>
> > >> Dear Lorenzo,
> > >>
> > >> Sorry about my delayed response. I was on travel this past week and
> > >> couldn't reply to you immediately.
> > >>
> > >> It is puzzling that the tests in the parallel folder are getting built
> > >> as expected but "make check" does not run them correctly. I can see
> > >> that the mpiexec program was set correctly during configuration, so
> > >> the next logical check here would be to try and launch the test
> > >> programs manually to see whether they succeed.
> > >>
> > >> Can you do the following:
> > >> cd build/test/parallel && mpiexec -n 2 ./parallel_unit_tests &&
> > >> mpiexec -n 2 ./parallel_hdf5_test
> > >>
> > >> Both of those tests should run to completion successfully if everything
> > >> is set up correctly. If these succeed, I would be out of ideas until we
> > >> can replicate it locally. Let me know the result of the above
> > >> experiment.
> > >>
> > >> Thanks,
> > >> Vijay
> > >>
> > >> On Sun, Jun 2, 2019 at 6:22 AM Lorenzo Botti <bottilorenzo at gmail.com> wrote:
> > >> >
> > >> > > Can you do the following two variations:
> > >> > >
> > >> > > 1) Re-verifying your current approach
> > >> > >   a) Go to test/parallel and do `make clean && make check | tee make_check.log`
> > >> > >   b) Send us make_check.log so that we can see if things compile but
> > >> > > do not run or nothing actually happens there.
> > >> > >
> > >> >
> > >> > Please find in attachment the make_check.log... so yes this is the
> > >> > point where it hangs.
> > >> >
> > >> > > 2) Retry out-of-source
> > >> > >   a) Perform `make distclean` in your in-source build
> > >> > >   b) Do: `mkdir build && cd build && ../configure <configure arguments>
> > >> > > && make all && make check`
> > >> > >
> > >> > > Let me know if (1) yields something useful and if not, (2) should
> > >> > > resolve the issue. If (2) does work, then we may have a problem with
> > >> > > in-source builds and I can check that on a workstation to replicate
> > >> > > the problem. Hope my instructions above aren't confusing.
> > >> > >
> > >> >
> > >> > I got the same behavior with out-of-source, following your instructions,
> > >> > see attached config.log
> > >> >
> > >> > Bests
> > >> > Lorenzo

