[mpich2-dev] MPIX vs. MPI etc.

Pavan Balaji balaji at mcs.anl.gov
Sun Aug 19 22:56:29 CDT 2012


Of course, there are no guarantees with MPIX_ functions and we can do 
whatever we want with them.  But it's trivial to just remove those 
functions; in fact, that is less work than leaving them in unimplemented.

Why would you do more work for a worse design?
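
To make the trade-off concrete, here is a minimal sketch in C of what an
application is pushed into when a stub exists but does nothing useful.
All of the names (demo_ibarrier_stub, demo_ibarrier_working, demo_barrier,
demo_sync, DEMO_ERR_NOT_IMPLEMENTED) are made up for illustration and are
not MPICH symbols.  The stub links fine, so a link-only configure check
would report it as available; the only way to find out otherwise is to
call it and inspect the return code.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not real MPICH symbols. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_NOT_IMPLEMENTED = 1 };

typedef int (*demo_ibarrier_fn)(void);

/* Non-functional stub: the symbol exists and links, but it only
 * reports that the operation is not implemented. */
int demo_ibarrier_stub(void) { return DEMO_ERR_NOT_IMPLEMENTED; }

/* What a working implementation would look like to the caller. */
int demo_ibarrier_working(void) { return DEMO_SUCCESS; }

/* Blocking fallback that the application can always rely on. */
int demo_barrier(void) { return DEMO_SUCCESS; }

/* Try the nonblocking call; fall back to the blocking one on failure.
 * Returns 1 if the nonblocking path was used, 0 if we fell back. */
int demo_sync(demo_ibarrier_fn ibarrier)
{
    if (ibarrier() == DEMO_SUCCESS)
        return 1;

    int rc = demo_barrier();
    assert(rc == DEMO_SUCCESS);
    (void)rc; /* keeps rc "used" when asserts are compiled out */
    return 0;
}
```

With these definitions, demo_sync(demo_ibarrier_stub) returns 0 and
demo_sync(demo_ibarrier_working) returns 1.  If the stub were simply
absent, the application could make the same decision at configure time
and never need the runtime probe.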

  -- Pavan

On 08/19/2012 10:53 PM, Jeff Hammond wrote:
> Okay, but then why can't IBM implement MPIX_Ibarrier in such a way
> that it returns an error every time?  Bill seemed to object to that,
> which indicates that MPIX_Ibarrier must be implemented in a way that
> complies with MPI-3 or not at all.  That is inconsistent with the
> assertion that it's outside the standard.  I completely understand why
> Bill wants what he wants; I am just trying to reconcile the
> triple-meaning of MPIX that I observe.
>
> I guess I don't see what's so wrong with IBM implementing
> MPIX_Ibarrier just like it does the following, which all call
> MPID_Abort.
>
> anlvpn252:src jhammond$ grep "int MPID" misc/mpid_unimpl.c
> int MPID_Close_port(const char *port_name)
> int MPID_Open_port(MPID_Info *info_ptr,
> int MPID_Comm_accept(char *port_name,
> int MPID_Comm_connect(const char *port_name,
> int MPID_Comm_disconnect(MPID_Comm *comm_ptr)
> int MPID_Comm_spawn_multiple(int count
>
> I don't really see any reason that these can't be implemented, by the
> way, but IBM doesn't like them for obvious reasons.
>
> Jeff
>
> On Sun, Aug 19, 2012 at 10:42 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>
>> MPIX_ functions were always meant to be extensions beyond the MPI standard.
>> The fact that these functions might get into the standard in MPI-3 does not
>> change that fact.  Apart from ARMCI-MPI, everything else follows this rule
>> correctly.  ARMCI-MPI does not use MPIX_ names at all; it is just in the
>> wrong directory.
>>
>>   -- Pavan
>>
>>
>> On 08/19/2012 10:36 PM, Jeff Hammond wrote:
>>>
>>> I don't really care if we resolve this for all time tonight, I just
>>> think it's worth noting that the current situation is confusing if one
>>> tries to be logical about it.
>>>
>>> Jeff
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Jeff Hammond <jhammond at alcf.anl.gov>
>>> Date: Sun, Aug 19, 2012 at 10:33 PM
>>> Subject: Re: [mpich-ibm] pamid code contribution - based on mpich2 1.5b2
>>> To: Pavan Balaji <balaji at mcs.anl.gov>
>>> Cc: William Gropp <wgropp at illinois.edu>, mpich-ibm at mcs.anl.gov
>>>
>>>
>>> I agree that a user will be annoyed if an MPIX_Ibarrier symbol exists
>>> in MPICH2-1.5's PAMID but it doesn't work, however...
>>>
>>> This refers to the MPIX implementations of MPI-3 that are not yet
>>> standardized, does it not?  Is MPICH2-1.5 going to be an
>>> implementation of MPI-2.2 or MPI-3?  If the former, what is the
>>> argument for not letting the PAMID implementation of MPICH2-1.5 catch
>>> on fire and burn down the machine room when users call MPIX_Ibarrier?
>>> They have no reasonable expectation that this should work.  Only
>>> informed experts should be calling MPIX anyways, no?
>>>
>>> This further confirms that the MPIX situation is confusing.  MPICH2
>>> has MPIX for MPI-3 that are portable except for BGQ, MPIX that have no
>>> meaning except on BGQ, and MPIX that aren't in the standard but work
>>> everywhere because they sit on top of MPI.  It also strengthens my
>>> argument that there's no sense in trying to make MPIX user-friendly
>>> since these details are not something a novice can distinguish.
>>>
>>> The best solution is for IBM to implement all the MPIX functions in
>>> MPICH2-1.5, of course, starting with NBC and RMA :-)
>>>
>>> Jeff
>>>
>>> On Sun, Aug 19, 2012 at 10:26 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>>>
>>>>
>>>> Agreed.  I like Bill's suggestion better.
>>>>
>>>>    -- Pavan
>>>>
>>>>
>>>> On 08/19/2012 10:24 PM, William Gropp wrote:
>>>>>
>>>>>
>>>>> I have a different concern about this (not about the IBM patch but the
>>>>> underlying approach).  I think it is a mistake to include unimplemented
>>>>> routines at all in the library.   In particular, there is no easy way to
>>>>> write a configure test for this without running the code; the model that
>>>>> configure uses for checking for the availability of a function is
>>>>> whether a program containing it can be linked.  Applications should be
>>>>> encouraged to use standard tools for determining capabilities.
>>>>>     Including non-functional stubs breaks this model, and does so
>>>>> unnecessarily.
>>>>>
>>>>> Bill
>>>>>
>>>>> William Gropp
>>>>> Director, Parallel Computing Institute
>>>>> Deputy Director for Research
>>>>> Institute for Advanced Computing Applications and Technologies
>>>>> Paul and Cynthia Saylor Professor of Computer Science
>>>>> University of Illinois Urbana-Champaign
>>>>>
>>>>>
>>>>>
>>>>> On Aug 20, 2012, at 9:43 AM, Pavan Balaji wrote:
>>>>>
>>>>>> 5. Patch #3: For the MPI-3 functionality that's not implemented, it
>>>>>> looks like you are just calling an Abort.  I'd recommend adding a new
>>>>>> error code MPIX_ERR_NOT_IMPLEMENTED and return that instead.  That
>>>>>> way, applications might be able to do something useful with those
>>>>>> functions.
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Pavan Balaji
>>>> http://www.mcs.anl.gov/~balaji
>>>> _______________________________________________
>>>> mpich-ibm mailing list
>>>> mpich-ibm at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-ibm
>>>
>>>
>>>
>>>
>>> --
>>> Jeff Hammond
>>> Argonne Leadership Computing Facility
>>> University of Chicago Computation Institute
>>> jhammond at alcf.anl.gov / (630) 252-5381
>>> http://www.linkedin.com/in/jeffhammond
>>> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
>>>
>>>
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>
>
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

