[mpich-discuss] Asking standard checkpoint in MPICH2
Darius Buntinas
buntinas at mcs.anl.gov
Tue May 11 10:35:15 CDT 2010
Checkpointing is only supported in 1.3a2. In 1.2.1.p1, Hydra gained some
checkpointing utility features, but MPICH2 itself has no checkpointing
support in that release.
-d
On 05/10/2010 10:10 PM, Bagus Jati Santoso wrote:
> Hi Darius,
>
> OK. Thank you for answering my email.
>
> And how about MPICH2 1.2.1.p1, the more stable version?
> Do I need to install BLCR first for this version?
>
> Since 1.2.1.p1 has support for BLCR too, which version do you suggest
> for working with the BLCR package, 1.2.1.p1 or 1.3a2?
>
> Thank you very much,
>
> Bagus
>
> On Tue, May 11, 2010 at 4:17 AM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>
> Hi Bagus,
>
> Sorry, I haven't written up the documentation on this yet. You'll
> need to install BLCR, and configure mpich2 with the following
> configure options:
>
> --with-hydra-ckpointlib=blcr --enable-checkpointing
>
> If you didn't install BLCR in a standard system location (e.g., if
> you installed it in your home directory), then you'll need to
> specify the install location using the --with-blcr= configure option
> as well. Also, make sure that your LD_LIBRARY_PATH is set correctly
> if necessary.
>
> Once you configure and make, you'll need to make sure the BLCR
> kernel modules are loaded on each machine. Use the
> -ckpoint-interval option for mpiexec to specify how often to take
> checkpoints. You'll also need to specify the location where the
> checkpoint files should be written using the -ckpoint-prefix option
> (make sure the directory exists).
>
> To restart from a checkpoint, run mpiexec with the same number of
> processes as the original run and the -ckpoint-prefix option, but leave
> off the name of the executable.
>
> Let us know how this works for you. Remember that you're using a
> beta version, so you might still encounter some bugs.
>
> -d
>
>
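For reference, the steps quoted above might be put together roughly as
follows. This is only a dry-run sketch: it prints the command lines instead
of running them. The BLCR install location, the lib subdirectory, the
checkpoint prefix, the process count, the interval value (assumed here to be
in seconds), the -n flag usage and the ./app executable name are all
hypothetical placeholders, not values taken from the message. Only the
configure options and the -ckpoint-interval / -ckpoint-prefix options come
from the text above.

```shell
#!/bin/sh
# Dry-run sketch: every function prints a command line; nothing is executed.
# All values below are hypothetical placeholders.
BLCR_PREFIX="$HOME/blcr"   # assumed non-standard BLCR install location
CKPT_PREFIX="/tmp/ckpt"    # assumed checkpoint location (directory must already exist)
NPROCS=4                   # assumed process count; must be the same on restart
INTERVAL=60                # assumed checkpoint interval; units assumed to be seconds

# Configure MPICH2 with BLCR checkpointing; --with-blcr is only needed
# when BLCR is not installed in a standard system location.
configure_cmd() {
    printf '%s\n' "./configure --with-hydra-ckpointlib=blcr --enable-checkpointing --with-blcr=$BLCR_PREFIX"
}

# Make the BLCR libraries visible at run time (the lib/ subdirectory is an assumption).
ldpath_cmd() {
    printf '%s\n' "export LD_LIBRARY_PATH=$BLCR_PREFIX/lib:\$LD_LIBRARY_PATH"
}

# Initial run: take periodic checkpoints and write them under CKPT_PREFIX.
run_cmd() {
    printf '%s\n' "mpiexec -ckpoint-interval $INTERVAL -ckpoint-prefix $CKPT_PREFIX -n $NPROCS ./app"
}

# Restart: same process count and prefix, but no executable name.
restart_cmd() {
    printf '%s\n' "mpiexec -ckpoint-prefix $CKPT_PREFIX -n $NPROCS"
}

configure_cmd
ldpath_cmd
run_cmd
restart_cmd
```

The printed lines can be reviewed and then run by hand once the BLCR kernel
modules are loaded on each machine.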