[mpich2-dev] Porting MPI to a new n-way multi-core platform...

Dave Goodell goodell at mcs.anl.gov
Mon Jun 7 09:12:27 CDT 2010


On Jun 7, 2010, at 7:23 AM CDT, 顏士敦 wrote:

> Dear all, 
> 
> I’m a complete newcomer to MPI, and I’ve been assigned, on a tight schedule, to port MPI to a multi-core system that is currently under development. 
> I would greatly appreciate it if anyone could show me where to start. 

Good luck, that sounds like quite a bit of work.

> We’re currently developing a many-core / multi-core system consisting of more than 10 ARM processors. 
> The system is built by connecting several 4-way SMP boards with Gigabit Ethernet. 
> The OS on the cores will be Linux together with an RTOS such as uC/OS-II. 
> I’m going to replace MPI’s underlying communication mechanism, currently TCP/IP sockets, with our own hardware-specific mechanism. 
> 
> I’ve done some searching, and it seems the most popular free MPI implementations are MPICH2 and OpenMPI. 
> Some say MPICH2 is more portable than OpenMPI, so I tried hacking on its source. 
> However, the sheer amount of source code overwhelmed me pretty quickly. 
> 
> Could someone please tell me whether I’m headed in the right direction? 
> So far I’m stuck on the following questions: 
> * Is there a porting guide available? 
> * Is there any abstraction layer for the socket communication mechanism? Which files should I look into? 

You can choose to customize MPICH2 at many different levels.  You can choose to write your own implementation of larger sections of code to get greater flexibility at the cost of additional development time.  Alternatively, you can customize just a small portion of the code if the current code is nearly compatible with your target architecture and performance goals.

In terms of ports to networks, you have three main choices for implementation.  You can implement a "device", a "channel", or a Nemesis "network module" (netmod).  If you are on a tight timeline and are relatively unfamiliar with MPI, I would recommend staying away from a device implementation.  It is a _lot_ of work unless your network API already closely matches the device interface.  If it suits your needs, the netmod interface is probably your best bet.  That interface is documented here: http://wiki.mcs.anl.gov/mpich2/index.php/Nemesis_Network_Module_API

You can use the "mx" and "tcp" netmod implementations as examples for your own code.
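
To give a feel for what a network-module-style plug-in looks like in general, here is a minimal C sketch: a table of function pointers with a trivial loopback backend behind it.  Every name in it (toy_netmod, loop_send, and so on) is hypothetical and invented for illustration.  It is not the Nemesis netmod API; the real entry points are the ones described on the wiki page above and used in the "mx" and "tcp" sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL interface, for illustration only.  These names are NOT
 * the real Nemesis netmod API. */
typedef struct toy_netmod {
    const char *name;
    int (*init)(void);
    int (*send)(int dest_rank, const void *buf, size_t len);
    /* Returns 1 if a message was delivered, 0 if none is pending, -1 on error. */
    int (*poll)(void *buf, size_t cap, size_t *len_out);
} toy_netmod;

/* A loopback "network" with a single message slot stands in for the
 * hardware-specific transport. */
#define TOY_SLOT_SIZE 256
static char   toy_slot[TOY_SLOT_SIZE];
static size_t toy_slot_len;
static int    toy_slot_full;

static int loop_init(void)
{
    toy_slot_len = 0;
    toy_slot_full = 0;
    return 0;
}

static int loop_send(int dest_rank, const void *buf, size_t len)
{
    (void)dest_rank;               /* loopback: every message comes back to us */
    assert(buf != NULL);
    if (toy_slot_full || len > TOY_SLOT_SIZE)
        return -1;                 /* slot busy or message too large */
    memcpy(toy_slot, buf, len);
    toy_slot_len = len;
    toy_slot_full = 1;
    return 0;
}

static int loop_poll(void *buf, size_t cap, size_t *len_out)
{
    if (!toy_slot_full)
        return 0;                  /* nothing pending */
    if (toy_slot_len > cap)
        return -1;                 /* caller's buffer is too small */
    memcpy(buf, toy_slot, toy_slot_len);
    *len_out = toy_slot_len;
    toy_slot_full = 0;
    return 1;
}

/* The upper layer only ever sees this table, never the transport itself. */
static const toy_netmod toy_loopback = {
    "loopback", loop_init, loop_send, loop_poll
};
```

The point of the pattern is that the code above the table calls only init/send/poll, so swapping the loopback functions for ones that drive your own interconnect does not touch the upper layers.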

> * To port MPI to an RTOS, which APIs should the OS provide? 
> Glancing at the source I found MALLOC/FREE and sprintf(), but I think there must be many more. 

It depends on how exotic your RTOS is.  The more POSIX-ish it is, the easier it will be.  Off the top of my head, I don't know of any specific examples where this has been done before, although someone else on the list might.  Your best bet is probably to try building a stock MPICH2 and just see where you run into trouble.  MPICH2 does expect a (nearly) C89-compliant C standard library, as well as some common system calls.  If you are missing these things as well, porting MPICH2 will be much more time-consuming.
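
Before attempting a full build, a quick way to get a first impression of the RTOS C library is a tiny smoke test like the sketch below.  The function name libc_smoke_test is hypothetical, and the set of calls it exercises is only an assumption: the ones mentioned in this thread (malloc/free, sprintf) plus a few other C89 staples.  It is not a list of what MPICH2 actually requires; building stock MPICH2 remains the real test.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL helper: returns 0 if every check passes, otherwise the
 * number of the first failing check. */
static int libc_smoke_test(void)
{
    char text[32];
    char *heap;
    long parsed;

    /* Check 1: dynamic allocation works at all. */
    heap = (char *)malloc(64);
    if (heap == NULL)
        return 1;

    /* Check 2: basic memory/string routines. */
    memset(heap, 'a', 63);
    heap[63] = '\0';
    if (strlen(heap) != 63) {
        free(heap);
        return 2;
    }
    free(heap);

    /* Check 3: formatted output into a buffer. */
    assert(sizeof text > strlen("42:rank"));   /* guard against overflow */
    if (sprintf(text, "%d:%s", 42, "rank") != 7)
        return 3;
    if (strcmp(text, "42:rank") != 0)
        return 3;

    /* Check 4: string-to-number conversion. */
    parsed = strtol("1234", NULL, 10);
    if (parsed != 1234L)
        return 4;

    return 0;
}
```

If even a test of this size fails to compile or link on the target, that is a strong hint that the C library gap needs to be closed first.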

You mentioned Linux earlier together with your RTOS.  If you end up running MPICH2 on Linux, that should take care of most/all of your OS portability issues, leaving you to only deal with hardware/network issues.

Unfortunately, I don't think we natively support ARM right now when using Nemesis.  So an early stumbling point will be that you will need to implement atomic operations for ARM in the src/openpa directory.  Which version/flavor of ARM is this?  IIRC, some older types of ARM core don't have a very diverse or powerful set of atomic operations, which might make this difficult in that case.
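
To illustrate why the available set of atomic primitives matters, here is a minimal sketch of how a fetch-and-add can be emulated with nothing but a compare-and-swap loop.  It uses C11 <stdatomic.h>, which is an assumption about your toolchain, and toy_fetch_and_add is a hypothetical name; this is not the OpenPA interface, only the general technique that a port for a core with few native atomics would need to express in OpenPA's own terms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* HYPOTHETICAL helper, not an OpenPA function: fetch-and-add built only
 * from compare-and-swap.  Returns the value seen before the addition. */
static int toy_fetch_and_add(atomic_int *target, int delta)
{
    int seen;

    assert(target != NULL);
    seen = atomic_load(target);

    /* If another thread changed *target in the meantime, the CAS fails and
     * stores the current value into `seen`, so the loop retries with a
     * freshly computed sum. */
    while (!atomic_compare_exchange_weak(target, &seen, seen + delta)) {
        /* retry */
    }
    return seen;
}
```

On a core without any compare-and-swap or equivalent primitive, even this fallback is unavailable, which is where a port gets difficult.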

As another test, you can also configure with "--with-device=ch3:sock" to get around the atomic operation problem, but then you will be using TCP with an older channel instead.

> Is there any list about this? 

This list (mpich2-dev at mcs.anl.gov) or possibly mpich-discuss at mcs.anl.gov are probably the best resources for MPICH2 porting and development issues.

-Dave


