[mpich-discuss] MPI application design

Ronald Paloschi ronald at audaces.com.br
Tue Aug 18 08:35:12 CDT 2009


Hi Diego,

Well, you have a point there, for sure. This will be a long research 
effort for me, because I don't know any of these tools.

Thanks for your help!

Ronald

Diego M. Vadell escreveu:
> Hi Ronald,
>
>     IMHO, you may be better off with a job scheduler (Torque or SGE) tying 
> all that together than programming it in MPI. The job scheduler knows which 
> machines in your cluster are available, and can queue and run the jobs 
> whenever they are free. Just to not reinvent the wheel.
>
> Cheers,
>  -- Diego.
>
> On Monday 17 August 2009 10:22:51 ronald at audaces.com.br wrote:
>   
>> Hi all,
>>
>> This is my first message to this mailing list. I'm just entering this
>> world of MPI, and MPICH seems to be a wonderful implementation from what
>> I've seen so far.
>>
>> My question is not really about MPICH itself but about more general MPI
>> application design. So I hope you MPI/MPICH experts can help me.
>>
>> Well, let's get to the problem:
>>
>> In our company, we have a CPU-intensive algorithm that we already run in a
>> multi-process, "manually controlled" (non-MPI) environment. The core
>> algorithm is a third-party library (not parallel), so we don't have access
>> to the code and cannot implement parallelism in the algorithm itself.
>>
>> But it becomes a candidate for MPI if we look at the problem at a higher
>> level: we have a set of problems that can each be solved independently by
>> this algorithm. The parallelism consists of solving the problems
>> concurrently, with each run of the serial algorithm in its own process.
>> And of course we get the whole message-passing mechanism (which is a big
>> part of the difficulty in parallel applications, and MPI handles it very
>> well).
>>
>> We plan to use it like this: every machine on the network contributes its
>> processors to the cluster (some machines give 1, others 2, others 4
>> processors), and each processor runs one problem at a time. A central
>> process will coordinate the workers, manage the processes, and collect
>> the results (using MPI).
>>
>> An example: my job consists of 100 problems to be solved, and on my
>> network I have 10 processing units (processors on slave machines). Let's
>> say we want to run our algorithm (a heuristic procedure) for 10 minutes
>> on each problem. For each problem, our controller will:
>> - See if there is a free processor.
>> - Send the problem to that processor so it can compute a solution.
>> - Receive the solution back and store it.
>> - Repeat until all 100 problems are solved.
>> In this case it will take 100 min. to solve all 100 problems (10 waves of
>> 10 parallel workers, 10 minutes each; run serially it would take 1000
>> min.).
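The arithmetic in the quoted example can be checked with a few lines of plain Python (no MPI involved; the numbers are the ones from the example above):

```python
import math

n_problems = 100      # independent problems to solve
n_workers = 10        # processors available across the cluster
minutes_each = 10     # fixed time budget per problem (heuristic run)

# Each "wave" runs n_workers problems in parallel, so the total wall-clock
# time is the number of waves times the per-problem time budget.
waves = math.ceil(n_problems / n_workers)
parallel_minutes = waves * minutes_each
serial_minutes = n_problems * minutes_each

print(parallel_minutes)  # → 100
print(serial_minutes)    # → 1000
```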
>>
>> This is the kind of parallelism that we need. I have thought of several
>> ways to implement it, but I would like to hear what you think.
>>
>> At this point in my studies I'm leaning toward doing it this way:
>> - The controlling unit is a server that accepts connections.
>> - For every new process it needs, it launches (via mpiexec) a worker
>> process, passing the connection as a parameter to the executor.
>> - The executor reports its progress to the server through MPI.
>> - This continues until all the work is done.
>>
>>
>> What do you think? Any help will be appreciated; I'm really stuck by my
>> lack of experience in choosing which way to go.
>>
>> Thanks in advance.
>>
>> Ronald
>>     
>
>
>
>   
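The controller loop described in the quoted message (see if a processor is free, send a problem, receive the result, repeat) is the classic MPI master/worker pattern. As a rough sketch of that scheduling logic only, here it is modeled with Python's concurrent.futures, using threads as stand-ins for MPI worker ranks; in a real MPI program each worker would be a separate process on its own machine, and the submit/result pair would become tagged MPI_Send/MPI_Recv calls. The `solve` function is a hypothetical placeholder for the third-party algorithm:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def solve(problem):
    # Placeholder for the third-party serial algorithm; it just squares
    # the input so the sketch is runnable end to end.
    return problem * problem

def controller(problems, n_workers):
    # Threads stand in for MPI worker ranks here: each problem is
    # dispatched as soon as a worker is free, and results are collected
    # as they complete -- the "free processor -> send -> receive ->
    # repeat" loop from the message above.
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {pool.submit(solve, p): p for p in problems}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results

if __name__ == "__main__":
    answers = controller(range(10), n_workers=4)
    print(answers[7])  # → 49
```

With real MPI the same shape holds: the master keeps a list of idle ranks, sends each pending problem to an idle rank, and posts a receive for whichever rank finishes first, until all results are in.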


