Dear all,<br><br>I have a question.<br>I have a cluster of 11 machines running Debian 5.0, MPICH2-1.3a2, and BLCR-0.8.2.<br><br>I want to run some simulations and measure the performance.<br>Here is my scenario:<br>When a node finishes taking its checkpoint, it sends its checkpoint file to the next 3 computers.<br>
Once the sending is finished, each node holds 3 checkpoint files from the 3 previous computers, and it then XORs these 3 checkpoint files together.<br><br>I know how to send a node's checkpoint file to other computers.<br>But where should I put the source code for this mechanism? Or does anyone have a solution for how to implement it?<br>
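For reference, here is a minimal sketch in C of the XOR step described above. It is only an illustration, not part of MPICH2 or BLCR: the function name `xor_checkpoints`, the file paths, and the choice to treat shorter files as zero-padded are all assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an MPICH2 or BLCR API): XOR n_in input files
 * byte by byte into out_path. Inputs of different lengths are handled by
 * treating the shorter files as if padded with zero bytes (an assumption).
 * Returns 0 on success, -1 on any error. */
int xor_checkpoints(const char *in_paths[], int n_in, const char *out_path)
{
    enum { MAX_IN = 8, BUF = 4096 };
    unsigned char acc[BUF], buf[BUF];
    FILE *in[MAX_IN] = {0};
    FILE *out = NULL;
    int rc = -1;

    if (n_in < 1 || n_in > MAX_IN)
        return -1;

    for (int i = 0; i < n_in; i++) {
        in[i] = fopen(in_paths[i], "rb");
        if (!in[i])
            goto done;
    }
    out = fopen(out_path, "wb");
    if (!out)
        goto done;

    for (;;) {
        size_t longest = 0;
        memset(acc, 0, sizeof acc);           /* XOR identity */
        for (int i = 0; i < n_in; i++) {
            size_t got = fread(buf, 1, BUF, in[i]);
            if (ferror(in[i]))
                goto done;
            for (size_t k = 0; k < got; k++)
                acc[k] ^= buf[k];             /* fold this file's block in */
            if (got > longest)
                longest = got;
        }
        if (longest == 0)                     /* every input is exhausted */
            break;
        if (fwrite(acc, 1, longest, out) != longest)
            goto done;
    }
    rc = 0;

done:
    for (int i = 0; i < n_in; i++)
        if (in[i])
            fclose(in[i]);
    if (out && fclose(out) != 0)
        rc = -1;
    return rc;
}
```

Each node could call this once the 3 files from the previous computers have arrived, for example `xor_checkpoints(paths, 3, "parity.bin")`, where `paths` and `parity.bin` are placeholder names. The output is as long as the longest input, so the original file sizes would have to be recorded separately if a lost checkpoint is to be rebuilt exactly.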
<br>I would really appreciate an answer.<br><br>Thank you very much for your help and attention.<br><br><br>Best regards,<br><br>Bagus<br><br>