[mpich-discuss] How to implement this case

Xiao Li shinelee.thewise at gmail.com
Mon Jan 3 22:19:17 CST 2011


Hi Eric,

The if statement may still cause an error; I think it should be altered as follows.

 if ++trunk_sent[index] != M do
     MPI_Irecv(buffer[index], index, requests[index])
 else
     //remove the MPI_Request object of the finished process, or else it
     //might hang forever
     remove requests[index]
 end
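
In real MPI (C), I imagine the corrected master loop would look roughly like
the sketch below. CHUNK_SIZE, the tag, and the file-writing step are
placeholder assumptions, not code from this thread. Note that MPI_Waitany
already sets the completed handle to MPI_REQUEST_NULL and ignores null
handles in later calls, so "removing" the finished request happens
automatically as long as we simply do not re-post it.

    #include <mpi.h>
    #include <stdlib.h>

    #define CHUNK_SIZE (1 << 20)        /* assumed chunk size in bytes */

    static void master_collect(int N, int M)
    {
        char        (*buffer)[CHUNK_SIZE] = malloc((size_t)N * CHUNK_SIZE);
        MPI_Request *requests             = malloc(N * sizeof *requests);
        int         *trunk_sent           = calloc(N, sizeof *trunk_sent);
        int i, index;
        MPI_Status status;

        /* post the first receive for every worker (worker i is rank i+1) */
        for (i = 0; i < N; i++)
            MPI_Irecv(buffer[i], CHUNK_SIZE, MPI_CHAR, i + 1, 0,
                      MPI_COMM_WORLD, &requests[i]);

        /* N*M chunks in total, harvested in completion order */
        for (i = 0; i < N * M; i++) {
            MPI_Waitany(N, requests, &index, &status);

            /* ... write buffer[index] to the file for worker `index` here ... */

            if (++trunk_sent[index] != M)
                /* more chunks expected from this worker: reuse its buffer */
                MPI_Irecv(buffer[index], CHUNK_SIZE, MPI_CHAR, index + 1, 0,
                          MPI_COMM_WORLD, &requests[index]);
            /* else: nothing to do -- MPI_Waitany has already set
               requests[index] to MPI_REQUEST_NULL */
        }

        free(buffer); free(requests); free(trunk_sent);
    }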

cheers
Xiao

On Mon, Jan 3, 2011 at 4:24 PM, Xiao Li <shinelee.thewise at gmail.com> wrote:

> Hi Eric,
>
> You are right. An extra MPI_Irecv would be executed at the end. Thanks for
> your comment.
>
> cheers
> Xiao
>
>
> On Mon, Jan 3, 2011 at 4:16 PM, Eric A. Borisch <eborisch at ieee.org> wrote:
>
>> Looks about right... I'm assuming there is a <do actual work here> to be
>> inserted between the MPI_Waitany and MPI_Irecv within the N*M-sized loop....
>> and I count from 0 rather than 1 by force of habit... :)
>>
>> I think the logic below will attempt one more MPI_Irecv than desired;
>> perhaps change
>>
>> if trunk_sent[index] != M do
>>    MPI_Irecv(buffer[index], index, requests[index])
>>    trunk_sent[index]++
>> end
>>
>> to
>>
>> if ++trunk_sent[index] != M do
>>    MPI_Irecv(buffer[index], index, requests[index])
>> end
>>
>>  -Eric
>>
>>
>> On Mon, Jan 3, 2011 at 3:00 PM, Xiao Li <shinelee.thewise at gmail.com>wrote:
>>
>>> Hi Eric,
>>>
>>> Thanks for your detailed suggestion. After reading the MPI documentation,
>>> I propose the following algorithm:
>>>
>>> //begin collecting data for the first trunk
>>> for i=1 to N do
>>>     MPI_Irecv(buffer[i], i, requests[i])
>>> end
>>> //set data sending counter
>>> for i=1 to N do
>>>     trunk_sent[i] = 0
>>> end
>>> //begin collecting data
>>> for i=1 to N*M do
>>>     MPI_Waitany(N, requests, &index, &status)
>>>     if trunk_sent[index] != M do
>>>          MPI_Irecv(buffer[index], index, requests[index])
>>>          trunk_sent[index]++
>>>     end
>>> end
>>>
>>> May I ask for your opinion of this algorithm?
>>>
>>> cheers
>>> Xiao
>>>
>>>
>>> On Mon, Jan 3, 2011 at 3:31 PM, Eric A. Borisch <eborisch at ieee.org>wrote:
>>>
>>>> Xiao,
>>>>
>>>> You should be able to get by with just N buffers, one for each client.
>>>> After you have processed the i-th iteration for client n, re-issue an
>>>> MPI_Irecv with the same buffer. This will match up with the next MPI_Send
>>>> from client n. You don't have to worry about synchronizing -- the MPI_Irecv
>>>> does not need to be posted before the MPI_Send. (But the MPI_Send won't
>>>> complete until it has been, of course...)
>>>>
>>>> You could always roll your own sockets, but MPI does a nice job of
>>>> managing connections and messages for you. In addition, MPI can be used
>>>> fairly efficiently on a wide range of interconnects, from shared memory to
>>>> Infiniband with little to no change on the user's part.
>>>>
>>>> In addition, you could likely improve performance in MPI by having two
>>>> sets (call them A and B) of buffers to send from on each worker; one is in
>>>> the "send" state (let's call this one A, started with an MPI_Isend after it
>>>> was initially filled) while you're filling B. After B is filled, initiate a
>>>> new MPI_Isend (very quick) on B and then wait for A's first send (MPI_Wait)
>>>> to complete. Once the first send on A is completed, you can start populating
>>>> A with the next iteration's output, initiate A's send, wait for B's send to
>>>> complete, and the cycle begins again.
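>>>>
>>>> Roughly, the worker side of that double buffering might look like the
>>>> following sketch (CHUNK_SIZE, the tag, and fill_chunk() are placeholders,
>>>> not anything specific to your code):
>>>>
>>>>     #include <mpi.h>
>>>>
>>>>     #define CHUNK_SIZE (1 << 20)     /* assumed chunk size in bytes */
>>>>
>>>>     /* placeholder for the real work: compute chunk i into buf */
>>>>     void fill_chunk(int i, char *buf);
>>>>
>>>>     static void worker_send(int M, int master_rank)
>>>>     {
>>>>         static char bufA[CHUNK_SIZE], bufB[CHUNK_SIZE];
>>>>         char *fill = bufA, *in_flight = bufB;
>>>>         MPI_Request req = MPI_REQUEST_NULL;  /* nothing sent yet */
>>>>         int i;
>>>>
>>>>         for (i = 0; i < M; i++) {
>>>>             /* compute the next chunk while the previous one is sending */
>>>>             fill_chunk(i, fill);
>>>>
>>>>             /* wait for the previous MPI_Isend to finish; a null request
>>>>                completes immediately on the first iteration */
>>>>             MPI_Wait(&req, MPI_STATUS_IGNORE);
>>>>
>>>>             /* swap buffers and start sending the freshly filled one */
>>>>             { char *tmp = in_flight; in_flight = fill; fill = tmp; }
>>>>             MPI_Isend(in_flight, CHUNK_SIZE, MPI_CHAR, master_rank, 0,
>>>>                       MPI_COMM_WORLD, &req);
>>>>         }
>>>>
>>>>         /* make sure the final chunk is actually delivered */
>>>>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>>>>     }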
>>>>
>>>> This approach allows you to overlay communication and computation times,
>>>> and still works with the MPI_Waitany() approach to harvesting completed jobs
>>>> in first-completed order on the master. This is an almost trivial thing to
>>>> implement in MPI, but achieving it with sockets requires (IMHO) much more
>>>> programmer overhead...
>>>>
>>>> Just my 2c.
>>>>
>>>>  Eric
>>>>
>>>>
>>>> On Mon, Jan 3, 2011 at 1:24 PM, Xiao Li <shinelee.thewise at gmail.com>wrote:
>>>>
>>>>> Hi Eric,
>>>>>
>>>>> Assume I have N workers and M chunks of data to send from each worker;
>>>>> then I would have to create N*M data buffers for MPI_Irecv. Is this
>>>>> method too costly?
>>>>>
>>>>> Or would raw socket programming be better, like traditional
>>>>> client/server socket programming, where the master listens on a port
>>>>> and spawns a new thread to accept each worker's data storage request?
>>>>>
>>>>> cheers
>>>>> Xiao
>>>>>
>>>>>
>>>>> On Mon, Jan 3, 2011 at 2:13 PM, Eric A. Borisch <eborisch at ieee.org>wrote:
>>>>>
>>>>>> Look at the documentation for MPI_Irecv and MPI_Testany ... these
>>>>>> should help you do what you want.
>>>>>>
>>>>>>  Eric
>>>>>>
>>>>>> On Mon, Jan 3, 2011 at 12:45 PM, Xiao Li <shinelee.thewise at gmail.com>wrote:
>>>>>>
>>>>>>> Hi MPICH2 people,
>>>>>>>
>>>>>>> I have an application composed of a single master and many workers. The
>>>>>>> requirement is very simple: the workers finish some jobs and send their
>>>>>>> data to the master, and the master stores each worker's data in a
>>>>>>> separate file. I can simply use MPI_Send on the worker side to send data
>>>>>>> to the master, but the master does not know the order in which the data
>>>>>>> will arrive; some workers are fast while others are slow. More
>>>>>>> specifically, suppose there are 5 workers; then the send order may be
>>>>>>> 1,3,4,5,2 or 2,5,4,1,3. If I just write a for loop for(i=1 to 5) on the
>>>>>>> master side with MPI_Recv to get the data, the master and the faster
>>>>>>> workers may have to wait for a long time. I know MPI_Gather could
>>>>>>> implement this, but I am not sure whether MPI_Gather works in parallel
>>>>>>> or is just a sequence of MPI_Recv calls. Another issue is that my data
>>>>>>> is extremely large: more than 1 GB needs to be sent to the master. If I
>>>>>>> divide the data into pieces, I do not think MPI_Gather will work. I also
>>>>>>> considered raw socket programming, but I do not think it is good
>>>>>>> practice. Would you please give me some suggestions?
>>>>>>>
>>>>>>> cheers
>>>>>>> Xiao
>>>>>>>