[mpich-discuss] Coordinated Checkpoint without making checkpoint images
Darius Buntinas
buntinas at mcs.anl.gov
Tue Nov 1 14:20:47 CDT 2011
Hi Mehdi,
You'll need to modify the MPIDI_nem_ckpt_finish() function in mpid_nem_ckpt.c. This function is called once the checkpoint protocol has completed and the process is ready to be checkpointed. The checkpoint itself is taken between the sem_post and the sem_wait in that function: the BLCR checkpoint thread (i.e., the ckpt_cb function) waits on ckpt_sem before taking the checkpoint. If you're doing this without BLCR, you'll also need to modify the mechanism that initiates the checkpoint. Normally the checkpoint thread sets MPIDI_nem_ckpt_start_checkpoint = TRUE at rank 0, so you'll need to do that yourself.
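To make the sem_post/sem_wait handshake concrete, here is a minimal standalone sketch using plain POSIX threads and semaphores. It is not the MPICH source: only the names ckpt_sem and ckpt_cb come from the description above, while cont_sem, take_image, images_taken, demo_ckpt_finish() and run_round() are hypothetical names made up for illustration, and the "image" is just a counter. It assumes a Linux-like system where unnamed POSIX semaphores are available.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

static sem_t ckpt_sem;    /* posted once the checkpoint protocol has completed */
static sem_t cont_sem;    /* hypothetical: posted when the checkpoint step is over */
static bool take_image;   /* hypothetical switch: false = coordination only */
static int images_taken;  /* stand-in for "a checkpoint image was written" */

/* Stand-in for the checkpoint thread (the role ckpt_cb plays above). */
static void *ckpt_cb(void *arg)
{
    (void)arg;
    sem_wait(&ckpt_sem);      /* block until the protocol has completed */
    if (take_image)
        images_taken++;       /* placeholder for actually writing an image */
    sem_post(&cont_sem);      /* let the main thread continue */
    return NULL;
}

/* Stand-in for the end of the protocol (the role MPIDI_nem_ckpt_finish()
 * plays above): the checkpoint happens between the post and the wait. */
static void demo_ckpt_finish(void)
{
    sem_post(&ckpt_sem);      /* tell the checkpoint thread we are ready */
    sem_wait(&cont_sem);      /* wait until the checkpoint step is done */
}

/* Runs one coordination round; returns how many images were "written". */
int run_round(bool with_image)
{
    pthread_t tid;
    int rc;

    take_image = with_image;
    images_taken = 0;

    rc = sem_init(&ckpt_sem, 0, 0);
    assert(rc == 0);
    rc = sem_init(&cont_sem, 0, 0);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, ckpt_cb, NULL);
    assert(rc == 0);

    demo_ckpt_finish();       /* protocol finished: do the handshake */

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;                 /* silence unused warning under NDEBUG */

    sem_destroy(&ckpt_sem);
    sem_destroy(&cont_sem);
    return images_taken;
}
```

In this sketch, run_round(true) goes through the handshake and records one image, while run_round(false) goes through the same handshake without recording anything, which is roughly the "coordination only" behaviour asked about. Where exactly to place such a switch in the real code is something you'd have to work out in mpid_nem_ckpt.c.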
-d
On Oct 27, 2011, at 2:33 PM, Mohammed El Mehdi DIOURI wrote:
> Hi,
>
> I was wondering whether, with mpich2, we can run coordinated checkpointing without making the checkpoint images.
> That is, for each checkpoint interval, we would only play the messages that enable the corresponding coordination?
>
> Thanks for your help,
>
> Mehdi.