[mpich-discuss] Coordinated Checkpoint without making checkpoint images

Mohammed El Mehdi DIOURI mehdi.diouri at ens-lyon.fr
Wed Nov 9 10:07:19 CST 2011


Hi Darius,

Thanks for your answer.

In my configuration, mpich is built with the blcr checkpoint library.
So I don't need to modify anything in the blcr code?

All I need to change is the MPIDI_nem_ckpt_finish() function in mpid_nem_ckpt.c of mpich?

Thank you for your help,

Mehdi.


On Nov 1, 2011, at 8:20 PM, Darius Buntinas wrote:

> Hi Mehdi,
> 
> You'll need to modify the MPIDI_nem_ckpt_finish() function in mpid_nem_ckpt.c.  This function is called when the checkpoint protocol has completed and we're ready to take a checkpoint of the process.  The checkpoint is taken between the sem_post and the sem_wait: the blcr checkpoint thread (i.e., the ckpt_cb function) waits on ckpt_sem before taking the checkpoint.  If you're doing this without blcr, you'll also need to modify the mechanism that initiates the checkpoint.  Normally the checkpoint thread sets MPIDI_nem_ckpt_start_checkpoint = TRUE at rank 0, so you'll need to do that yourself.
> 
> -d
> 
> 
> On Oct 27, 2011, at 2:33 PM, Mohammed El Mehdi DIOURI wrote:
> 
>> Hi,
>> 
>> I was wondering whether mpich2 can run coordinated checkpointing without writing the checkpoint images.
>> I mean, at each checkpoint interval, we would only exchange the messages that perform the corresponding coordination?
>> 
>> Thanks for your help,
>> 
>> Mehdi.
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> 
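The handshake Darius describes (the main thread posts a semaphore, the checkpoint thread wakes up, and the main thread waits until it is told to continue) can be sketched outside MPICH as a small pthreads program. This is only an illustration under stated assumptions, not the code in mpid_nem_ckpt.c: ckpt_sem is the name used in the thread, while cont_sem, take_image, images_written, ckpt_thread, ckpt_finish and run_coordination_round are hypothetical names introduced here. The place where a real checkpoint library would be called is reduced to a counter, so passing 0 models "coordination without an image".

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical stand-ins for illustration; these are not the MPICH sources. */
static sem_t ckpt_sem;       /* posted once the coordination protocol is done */
static sem_t cont_sem;       /* assumed name: posted when the process may resume */
static int take_image;       /* 0 = coordination only, 1 = also "write" an image */
static int images_written;   /* counts simulated checkpoint images */

/* Plays the role of the checkpoint thread (ckpt_cb in the thread above). */
static void *ckpt_thread(void *arg)
{
    (void)arg;
    sem_wait(&ckpt_sem);         /* block until the protocol has completed */
    if (take_image) {
        images_written++;        /* placeholder for the real checkpoint call */
    }
    sem_post(&cont_sem);         /* let the main thread continue */
    return NULL;
}

/* Plays the role of MPIDI_nem_ckpt_finish(). */
static void ckpt_finish(void)
{
    sem_post(&ckpt_sem);         /* the checkpoint may be taken now */
    sem_wait(&cont_sem);         /* wait until the checkpoint thread is done */
}

/* Runs one coordination round; returns the number of images written,
 * or -1 if a semaphore or thread could not be created. */
int run_coordination_round(int with_image)
{
    pthread_t tid;

    take_image = with_image;
    images_written = 0;

    if (sem_init(&ckpt_sem, 0, 0) != 0) {
        return -1;
    }
    if (sem_init(&cont_sem, 0, 0) != 0) {
        sem_destroy(&ckpt_sem);
        return -1;
    }
    if (pthread_create(&tid, NULL, ckpt_thread, NULL) != 0) {
        sem_destroy(&ckpt_sem);
        sem_destroy(&cont_sem);
        return -1;
    }

    ckpt_finish();               /* the post/wait pair brackets the checkpoint */
    pthread_join(tid, NULL);

    sem_destroy(&ckpt_sem);
    sem_destroy(&cont_sem);

    /* Sanity check: an image is counted only when one was requested. */
    assert(images_written == (with_image ? 1 : 0));
    return images_written;
}
```

Calling run_coordination_round(0) goes through the same post/wait handshake but skips the image step, which is the behaviour asked about in the original question. On Linux the sketch builds with -pthread; unnamed POSIX semaphores (sem_init) are not available on every platform.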


