[mpich-discuss] MPICH2 Checkpointing Error with BLCR

Manisha Chauhan manisha.chauhan at yahoo.co.in
Mon Sep 24 01:29:43 CDT 2012


Hi all,

Could someone please help me resolve the problem described below? I have not received a reply so far, and I have not been able to resolve it on my own. I would be very grateful for any help.

Thanks & Regards
Manisha Chauhan


----- Forwarded Message -----
From: Manisha Chauhan <manisha.chauhan at yahoo.co.in>
To: "mpich-discuss at mcs.anl.gov" <mpich-discuss at mcs.anl.gov> 
Sent: Saturday, 22 September 2012 9:37 AM
Subject: MPICH2 Checkpointing Error with BLCR
 

Hi,

I am working on checkpointing my MPI application. I installed both Hydra and BLCR. I have also checked "mpiexec --info", and it shows the checkpointing library as blcr, but I am still not able to checkpoint my application.

The first attempt prints "requesting checkpoint" and returns with "checkpoint completed", but the context file is empty. The next checkpoint attempt ends with the following error.


MPICH2 version: 1.4.1

[proxy:0:0 at tom-laptop] requesting checkpoint
[proxy:0:0 at tom-laptop] HYDT_ckpoint_checkpoint (./tools/ckpoint/ckpoint.c:111): Previous checkpoint has not completed.
[proxy:0:0 at tom-laptop] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:902): checkpoint suspend failed
[proxy:0:0 at tom-laptop] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0 at tom-laptop] main (./pm/pmiserv/pmip.c:210): demux engine error waiting for event
[mpiexec at tom-laptop] control_cb (./pm/pmiserv/pmiserv_cb.c:201): assert (!closed) failed
[mpiexec at tom-laptop] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec at tom-laptop] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
[mpiexec at tom-laptop] main (./ui/mpich/mpiexec.c:325): process manager error waiting for completion
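
The empty context file mentioned above can be confirmed independently of the MPICH2 output with a small check like the one below. This is only a sketch under stated assumptions: the function name find_empty_files is hypothetical, and it assumes all checkpoint files end up under one directory that is passed in as an argument. It says nothing about why the file is empty; it only lists zero-byte files.

```python
import os


def find_empty_files(ckpt_dir):
    """Return sorted relative paths of zero-byte regular files under ckpt_dir.

    ckpt_dir is a hypothetical argument: whichever directory the
    checkpoint context files are written to.
    """
    empty = []
    for root, _dirs, files in os.walk(ckpt_dir):
        for name in files:
            path = os.path.join(root, name)
            # Only regular files with size 0 count as "empty" here.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(os.path.relpath(path, ckpt_dir))
    return sorted(empty)
```

Calling find_empty_files on the checkpoint directory after "checkpoint completed" is printed would return the names of any context files that were created but never written to.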

Can you please help me find out what the issue is?

Regards
Manisha