[mpich-discuss] Asking standard checkpoint in MPICH2

Pavan Balaji balaji at mcs.anl.gov
Tue May 11 08:28:30 CDT 2010


Bagus,

Checkpointing support is only available from 1.3a2 onward; it is not in
1.2.1p1. Also, MPICH2 does not package BLCR itself -- it only *uses* BLCR.
So you'll always need to install BLCR separately and point MPICH2 to it
using the --with-blcr option (if BLCR is installed in a standard location,
configure might auto-detect it).
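
As a rough sketch only (this is not an official recipe), the configure
line could be assembled as shown here. The flags are the ones mentioned in
this thread; the helper name blcr_configure_cmd and the /home/user/blcr
path are made up for illustration, and the script only prints the command
rather than running it:

```shell
#!/bin/sh
# Dry-run sketch: prints the configure command instead of running it.

# blcr_configure_cmd [PREFIX]
# PREFIX is a hypothetical BLCR install location; omit it when BLCR is
# installed in a standard system location.
blcr_configure_cmd() {
    cmd="./configure --enable-checkpointing --with-hydra-ckpointlib=blcr"
    if [ -n "${1:-}" ]; then
        # Point MPICH2 at a non-standard BLCR install.
        cmd="$cmd --with-blcr=$1"
    fi
    printf '%s\n' "$cmd"
}

blcr_configure_cmd                      # BLCR in a standard location
blcr_configure_cmd "/home/user/blcr"    # example path, not a real default
```

If the printed line looks right, run it from the top of the MPICH2
source tree.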

  -- Pavan

On 05/10/2010 10:10 PM, Bagus Jati Santoso wrote:
> Hi Darius,
> 
> OK. Thank you for answering my email.
> 
> And how about MPICH2 1.2.1p1, the more stable version?
> Do I need to install BLCR first for this version?
> 
> Since 1.2.1p1 has support for BLCR too, which version do you suggest
> for working with the BLCR package, 1.2.1p1 or 1.3a2?
> 
> Thank you very much,
> 
> Bagus
> 
> On Tue, May 11, 2010 at 4:17 AM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> 
>     Hi Bagus,
> 
>     Sorry, I haven't written up the documentation on this yet.  You'll
>     need to install BLCR and configure MPICH2 with the following
>     configure options:
> 
>     --with-hydra-ckpointlib=blcr --enable-checkpointing
> 
>     If you didn't install BLCR in a standard system location (e.g., if
>     you installed it in your home directory), then you'll need to
>     specify the install location using the --with-blcr= configure option
>     as well.  Also, make sure that your LD_LIBRARY_PATH is set correctly
>     if necessary.
> 
>     Once you configure and make, you'll need to make sure the BLCR
>     kernel modules are loaded on each machine.  Use the
>     -ckpoint-interval option for mpiexec to specify how often to take
>     checkpoints.  You'll also need to specify the location where the
>     checkpoint files should be written using the -ckpoint-prefix option
>     (make sure the directory exists).
> 
>     To restart from a checkpoint, specify the same number of processes as
>     the original run and the -ckpoint-prefix option, but leave off the
>     name of the executable.
> 
>     Let us know how this works for you.  Remember that you're using a
>     beta version, so you might still encounter some bugs.
> 
>     -d
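
A minimal dry-run sketch of the run and restart steps described above,
assuming 4 processes, an interval of 60, a checkpoint directory
/tmp/ckpts and an executable ./app (all hypothetical values; the unit of
the interval is not stated in this thread). The helper names
ckpt_run_cmd and ckpt_restart_cmd are invented for illustration, -n is
the usual mpiexec process-count option, and the script only prints the
command lines:

```shell
#!/bin/sh
# Dry-run sketch: prints mpiexec command lines instead of running them.

# ckpt_run_cmd NPROCS INTERVAL PREFIX EXECUTABLE
# Initial run that takes periodic checkpoints under PREFIX.
ckpt_run_cmd() {
    printf 'mpiexec -ckpoint-interval %s -ckpoint-prefix %s -n %s %s\n' \
        "$2" "$3" "$1" "$4"
}

# ckpt_restart_cmd NPROCS PREFIX
# Restart: same process count and prefix as the original run,
# with the executable name left off.
ckpt_restart_cmd() {
    printf 'mpiexec -ckpoint-prefix %s -n %s\n' "$2" "$1"
}

CKPT_DIR="/tmp/ckpts"   # hypothetical; the directory must exist before the run
printf 'mkdir -p %s\n' "$CKPT_DIR"
ckpt_run_cmd 4 60 "$CKPT_DIR" ./app
ckpt_restart_cmd 4 "$CKPT_DIR"
```

The BLCR kernel modules still have to be loaded on every machine before
either command will work.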

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji
