[mpich2-dev] regarding checkpointing using BLCR in MPICH2

Hao Yang hao.yang0614 at gmail.com
Mon Dec 12 14:28:45 CST 2011


Hi, all:

I see checkpoint failures when I use MPICH with BLCR on my machine.

We tried inserting MPI_Iprobe calls into the program, but the second
checkpoint still failed.

Alternatively, we tried setting MPICH_ASYNC_PROGRESS to enable an MPI
progress thread, which might help the processes execute the checkpoint
algorithm. We first built MPICH with --enable-async-progress and then ran
"mpiexec -env MPICH_ASYNC_PROGRESS 1 -n 5 ./mpitest", but we got the error
"Assertion failed in file async.c at line 52: !mpi_errno. internal ABORT -
process 0".

Does anyone know how to fix this? Thank you.