[mpich-discuss] Fault tolerance - not stable.
Anatoly G
anatolyrishon at gmail.com
Wed Jan 11 07:03:40 CST 2012
>
> Hi Darius.
> Thank you for your response.
> I changed the code according to your suggestion.
> Results:
> Sometimes the expected process fails as intended, but one more process
> fails unexpectedly as well.
> I attach code & logs.
> Execution command:
> mpiexec.hydra -genvall -disable-auto-cleanup -f machines_student.txt -n 24
> -launcher=rsh mpi_rcv_waitany 100000 1000000 3 5 1 logs/mpi_rcv_waitany_1_
>
>
> - I run 24 processes on 3 computers.
> - 100000 iterations
> - Expected failure of process 3 on iteration 5.
> - In addition, I got a failure of *process 18*. Its log, with its errors,
> is mpi_rcv_waitany_1__r18.log.
>
> Can you please review my code again and give me some tips to fix the
> application?
>
> Anatoly.
>
>
> On Tue, Jan 10, 2012 at 9:32 PM, Anatoly G <anatolyrishon at gmail.com> wrote:
>
>>
>>
>> ---------- Forwarded message ----------
>> From: Darius Buntinas <buntinas at mcs.anl.gov>
>> Date: Tue, Jan 10, 2012 at 8:11 PM
>> Subject: Re: [mpich-discuss] Fault tolerance - not stable.
>> To: mpich-discuss at mcs.anl.gov
>>
>>
>>
>> I took a look at mpi_rcv_waitany.cpp, and I found a couple of issues.
>> I'm not sure if this is the problem, but we should fix these first.
>>
>> In Rcv_WaitAny(), in the while(true) loop, you do a waitany, but then you
>> iterate over the requests and do a test. I don't think this is what you
>> want to do. When waitany returns, mRcvRequests[slaveIdx] will be set to
>> MPI_REQUEST_NULL, so the subsequent test will return MPI_SUCCESS, and you
>> may not register a failure.
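>>
>> To illustrate (a minimal sketch; "reqs" and "n" stand in for the request
>> array and its size): once the waitany has completed a slot, a follow-up
>> MPI_Test on that slot just reports completion of a null request, so the
>> failure can be lost.
>>
>>     int idx, flag;
>>     MPI_Status status;
>>     MPI_Waitany(n, reqs, &idx, &status);  /* on return, reqs[idx] == MPI_REQUEST_NULL */
>>     MPI_Test(&reqs[idx], &flag, &status); /* MPI_SUCCESS with flag == 1, even if the
>>                                              peer process has already died */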
>>
>> Also, if all requests have previously completed, and you call waitany,
>> then it will return and set slaveIdx to MPI_UNDEFINED, so we need to
>> consider that case.
>>
>> Another issue is that you post a receive for the slave after it
>> completes, but never wait on that request. This is not allowed in MPI (but
>> you can probably get away with this most of the time).
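>>
>> A minimal sketch of draining such a receive that will never be matched
>> (extraReq is a placeholder for that last posted request):
>>
>>     MPI_Cancel(&extraReq);                  /* cancel the pending receive */
>>     MPI_Wait(&extraReq, MPI_STATUS_IGNORE); /* complete and free the request,
>>                                                whether or not the cancel succeeded */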
>>
>> I _think_ what you want to do is this:
>>
>> while (mSlavesFinished < mSlaves) {
>>     retErr = MPI_Waitany(mRcvRequests.size(), &*mRcvRequests.begin(),
>>                          &slaveIdx, &status);
>>     slaveRank = slaveIdx + 1;
>>
>>     if (retErr != MPI_SUCCESS) {
>>         char Msg[256];
>>         sprintf(Msg, "From rank %d, fail - request deallocated", slaveRank);
>>         handleMPIerror(mFpLog, Msg, retErr, &status);
>>         mRcvRequests[slaveIdx] = MPI_REQUEST_NULL;
>>         mIsSlaveLives[slaveIdx] = 0;
>>         ++mSlavesFinished;
>>         continue;
>>     }
>>
>>     /* if all requests have been completed, we should have exited the
>>        loop already */
>>     assert(slaveIdx != MPI_UNDEFINED);
>>     /* if the slave is dead, we should not be able to receive a
>>        message from it */
>>     assert(mIsSlaveLives[slaveIdx]);
>>
>>     ++mSlavesRcvIters[slaveIdx];
>>     if (mSlavesRcvIters[slaveIdx] == nIters) {
>>         ++mSlavesFinished;
>>         fprintf(mFpLog, "\n\nFrom rank %d, Got number = %d\n ",
>>                 slaveRank, mRcvNumsBuf[slaveIdx]);
>>         fprintf(mFpLog, "Slave %d finished\n\n", slaveIdx + 1);
>>     } else {
>>         MPI_Irecv(&(mRcvNumsBuf[slaveIdx]), 1, MPI::INT, slaveRank,
>>                   MPI::ANY_TAG, MPI_COMM_WORLD, &(mRcvRequests[slaveIdx]));
>>     }
>> }
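>>
>> Note that the retErr checks above only report anything other than
>> MPI_SUCCESS if the communicator's error handler is MPI_ERRORS_RETURN
>> (with the default MPI_ERRORS_ARE_FATAL the job simply aborts). If the
>> test does not do this already, a one-line sketch right after MPI_Init:
>>
>>     MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);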
>>
>> Give this a try and see how it works.
>>
>> -d
>>
>>
>>
>> On Jan 10, 2012, at 12:50 AM, Anatoly G wrote:
>>
>> > Dear mpich-discuss,
>> > I have a problem while using the fault tolerance feature with the MPICH2
>> Hydra process manager.
>> > The results are not consistent: sometimes the tests pass, sometimes they
>> stall.
>> > If you execute the command line written below in a loop, after a number of
>> iterations the test stalls.
>> > Can you please help me with this problem?
>> >
>> > There are 3 tests. All 3 tests use the same model: a master with a number
>> of slaves. Communication operations are point-to-point.
>> >
>> > The slave algorithm is the same for all 3 tests:
>> > for N times:
>> >     MPI_Send integer to master
>> >     if iteration == fail_iteration (parameter) && rank == fail_rank:
>> >         cause a divide-by-zero exception (A = 5.0; B = 0.0; C = A / B;)
>> >     MPI_Recv(master)
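>> > A minimal sketch of this slave loop (nIters, failIter, failRank and scale
>> are placeholders for the actual test parameters; the master is rank 0):
>> >
>> >     for (int i = 0; i < nIters; ++i) {
>> >         int value = rank * scale + i;                       /* payload sent to the master */
>> >         MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
>> >         if (i == failIter && rank == failRank) {
>> >             double A = 5.0, B = 0.0;
>> >             double C = A / B;   /* as in the test; a plain FP divide by zero
>> >                                    only traps if FP exceptions are enabled */
>> >             (void)C;
>> >         }
>> >         MPI_Recv(&value, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD,
>> >                  MPI_STATUS_IGNORE);
>> >     }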
>> >
>> > Master algorithm, Test1 (mpi_send_rcv_waitany.cpp):
>> > • For each slave, call MPI_Irecv
>> > • While not all N messages from each slave have been received:
>> >     • MPI_Waitany(slaveIdx)
>> >     • if slaveIdx is alive: MPI_Irecv(slaveIdx)
>> >     • else: mark it as finished
>> > • MPI_Send to all slaves
>> >
>> > Master algorithm, Test2 (mpi_send_sync.cpp):
>> > • slave = first slave
>> > • While not all N messages from each slave have been received:
>> >     • MPI_Recv(slave)
>> >     • if slave is alive: pass to the next live slave
>> >     • else: mark it as finished
>> > • MPI_Send to all slaves
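>> > A minimal sketch of this Test2 receive loop (recvCount, alive, nIters and
>> mSlaves are placeholders; slaves are ranks 1..mSlaves):
>> >
>> >     int slave = 1, finished = 0, value;
>> >     MPI_Status status;
>> >     while (finished < mSlaves) {
>> >         int err = MPI_Recv(&value, 1, MPI_INT, slave, MPI_ANY_TAG,
>> >                            MPI_COMM_WORLD, &status);
>> >         if (err != MPI_SUCCESS || ++recvCount[slave] == nIters) {
>> >             alive[slave] = 0;              /* failed or finished all iterations */
>> >             ++finished;
>> >         }
>> >         do {                               /* pass to the next live slave */
>> >             slave = (slave % mSlaves) + 1;
>> >         } while (finished < mSlaves && !alive[slave]);
>> >     }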
>> >
>> > Master algorithm, Test3 (mpi_send_async.cpp):
>> > Same as Test2, but instead of MPI_Recv I use MPI_Irecv + MPI_Wait.
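>> > That is, each blocking receive is replaced by roughly the following (a
>> sketch; value, slave and status as in the Test2 loop above):
>> >
>> >     MPI_Request req;
>> >     MPI_Irecv(&value, 1, MPI_INT, slave, MPI_ANY_TAG, MPI_COMM_WORLD, &req);
>> >     int err = MPI_Wait(&req, &status);  /* behaves like the blocking MPI_Recv */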
>> >
>> > When a test stalls, I connect a debugger to the master process.
>> > The process stalls in MPI_Recv or MPI_Irecv.
>> > I think the stall is caused by the following sequence:
>> > • The master receives an integer from a slave.
>> > • It tests the slave - it is OK.
>> > • The slave fails.
>> > • The master tries to perform MPI_Irecv or MPI_Recv on the failed slave.
>> > The problem happens both on the cluster (student_machines.txt) and on a
>> single machine (machine_student1.txt).
>> >
>> > Execution lines:
>> > • /space/local/hydra/bin/mpiexec.hydra -genvall
>> -disable-auto-cleanup -f machine_student1.txt -n 8 -launcher=rsh
>> mpi_rcv_waitany 100000 1000000 3 10 1 logs/mpi_rcv_waitany_it_9/res_
>> > • /space/local/hydra/bin/mpiexec.hydra -genvall
>> -disable-auto-cleanup -f student_machines.txt -n 12 -launcher=rsh
>> mpi_rcv_waitany 100000 1000000 3 10 1 logs/mpi_rcv_waitany_it_9/res_
>> > The test performs 100000 iterations between the master and each slave.
>> > 1000000 is a scale number used to distinguish between the sequences of
>> integers exchanged by the master and each slave.
>> > 3 - rank of the process that causes the failure (fail_rank).
>> > 10 - fail iteration. On iteration 10 the process with rank 3 will cause a
>> divide-by-zero exception.
>> > 1 logs/mpi_rcv_waitany_it_9/res_ defines the log file.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> <machine_student1.txt> <machines_student.txt> <mpi_rcv_waitany.cpp> <mpi_send_async.cpp> <mpi_send_sync.cpp> <mpi_test_incl.h>