[mpich2-dev] MPIX vs. MPI etc.
Jeff Hammond
jhammond at alcf.anl.gov
Sun Aug 19 23:02:57 CDT 2012
> Of course, there are no guarantees with MPIX_ functions and we can do
> whatever we want with them. But it's trivial to just remove those
> functions; in fact, less work than just not implementing them.
Look, I would not do it the stupid way; I just want a hard verdict on
whether the stupid way is forbidden or merely discouraged.
You and Bill seem to disagree on this.
Ultimately, it doesn't really matter. See below.
> Why would you do more work for a worse design?
Maybe you should move this thread back to mpich2-ibm and ask them why
they proposed this in the first place :-)
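
To make the disagreement concrete, here is a minimal sketch of why a stub
that always returns an error looks fine to a link-only check but not to
anything that actually runs it. None of these names are real MPI or MPICH
symbols: sketch_ibarrier_stub, sketch_barrier and the SKETCH_* codes are
hypothetical stand-ins, and no such error code is assumed to exist in any
release.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT real MPI or MPICH symbols. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_IMPLEMENTED = 1 };

typedef int (*barrier_fn)(void);

/* A stub of the kind under discussion: it links, but always fails. */
int sketch_ibarrier_stub(void) { return SKETCH_ERR_NOT_IMPLEMENTED; }

/* A working blocking fallback (a no-op in this sketch). */
int sketch_barrier(void) { return SKETCH_SUCCESS; }

/* All a link-style check can learn: the symbol resolved to something. */
int symbol_present(barrier_fn fn) { return fn != NULL; }

/* What only a run-time probe can learn: the call actually succeeds. */
int works_at_runtime(barrier_fn fn)
{
    return fn != NULL && fn() == SKETCH_SUCCESS;
}

/* An application that gets an error code back can at least fall back. */
int barrier_with_fallback(void)
{
    int rc = sketch_ibarrier_stub();
    if (rc == SKETCH_ERR_NOT_IMPLEMENTED)
        rc = sketch_barrier();
    return rc;
}
```

symbol_present() reports the stub as available, works_at_runtime() does
not, and barrier_with_fallback() only succeeds because the stub returns
an error code instead of aborting. If the stub were simply left out, the
link-style check alone would give the right answer.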
> On 08/19/2012 10:53 PM, Jeff Hammond wrote:
>>
>> Okay, but then why can't IBM implement MPIX_Ibarrier in such a way
>> that it returns an error every time? Bill seemed to object to that,
>> which indicates that MPIX_Ibarrier must be implemented in a way that
>> complies with MPI-3 or not at all. That is inconsistent with the
>> assertion that it's outside the standard. I completely understand why
>> Bill wants what he wants; I am just trying to reconcile the
>> triple-meaning of MPIX that I observe.
>>
>> I guess I don't see what's so wrong with IBM implementing
>> MPIX_Ibarrier just like it does the following, which all call
>> MPID_Abort.
>>
>> anlvpn252:src jhammond$ grep "int MPID" misc/mpid_unimpl.c
>> int MPID_Close_port(const char *port_name)
>> int MPID_Open_port(MPID_Info *info_ptr,
>> int MPID_Comm_accept(char *port_name,
>> int MPID_Comm_connect(const char *port_name,
>> int MPID_Comm_disconnect(MPID_Comm *comm_ptr)
>> int MPID_Comm_spawn_multiple(int count
>>
>> I don't really see any reason that these can't be implemented, by the
>> way, but IBM doesn't like them for obvious reasons.
>>
>> Jeff
>>
>> On Sun, Aug 19, 2012 at 10:42 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>>
>>>
>>> MPIX_ functions were always meant to be extensions beyond the MPI standard.
>>> The fact that these functions might get into the standard in MPI-3 does not
>>> change that fact. Apart from ARMCI-MPI, everything else follows this rule
>>> correctly. ARMCI-MPI does not use MPIX_ names at all; it is just in the
>>> wrong directory.
>>>
>>> -- Pavan
>>>
>>>
>>> On 08/19/2012 10:36 PM, Jeff Hammond wrote:
>>>>
>>>>
>>>> I don't really care if we resolve this for all time tonight, I just
>>>> think it's worth noting that the current situation is confusing if one
>>>> tries to be logical about it.
>>>>
>>>> Jeff
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Jeff Hammond <jhammond at alcf.anl.gov>
>>>> Date: Sun, Aug 19, 2012 at 10:33 PM
>>>> Subject: Re: [mpich-ibm] pamid code contribution - based on mpich2 1.5b2
>>>> To: Pavan Balaji <balaji at mcs.anl.gov>
>>>> Cc: William Gropp <wgropp at illinois.edu>, mpich-ibm at mcs.anl.gov
>>>>
>>>>
>>>> I agree that a user will be annoyed if an MPIX_Ibarrier symbol exists
>>>> in MPICH2-1.5's PAMID but it doesn't work, however...
>>>>
>>>> This refers to the MPIX implementations of MPI-3 that are not yet
>>>> standardized, does it not? Is MPICH2-1.5 going to be an
>>>> implementation of MPI-2.2 or MPI-3? If the former, what is the
>>>> argument for not letting the PAMID implementation of MPICH2-1.5 catch
>>>> on fire and burn down the machine room when users call MPIX_Ibarrier?
>>>> They have no reasonable expectation that this should work. Only
>>>> informed experts should be calling MPIX anyways, no?
>>>>
>>>> This further confirms that the MPIX situation is confusing. MPICH2
>>>> has MPIX for MPI-3 that are portable except for BGQ, MPIX that have no
>>>> meaning except on BGQ, and MPIX that aren't in the standard but work
>>>> everywhere because they sit on top of MPI. It also strengthens my
>>>> argument that there's no sense in trying to make MPIX user-friendly
>>>> since these details are not something a novice can distinguish.
>>>>
>>>> The best solution is for IBM to implement all the MPIX functions in
>>>> MPICH2-1.5, of course, starting with NBC and RMA :-)
>>>>
>>>> Jeff
>>>>
>>>> On Sun, Aug 19, 2012 at 10:26 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Agreed. I like Bill's suggestion better.
>>>>>
>>>>> -- Pavan
>>>>>
>>>>>
>>>>> On 08/19/2012 10:24 PM, William Gropp wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have a different concern about this (not about the IBM patch but the
>>>>>> underlying approach). I think it is a mistake to include unimplemented
>>>>>> routines at all in the library. In particular, there is no easy way to
>>>>>> write a configure test for this without running the code; the model that
>>>>>> configure uses for checking for the availability of a function is
>>>>>> whether a program containing it can be linked. Applications should be
>>>>>> encouraged to use standard tools for determining capabilities.
>>>>>> Including non-functional stubs breaks this model, and does so
>>>>>> unnecessarily.
>>>>>>
>>>>>> Bill
>>>>>>
>>>>>> William Gropp
>>>>>> Director, Parallel Computing Institute
>>>>>> Deputy Director for Research
>>>>>> Institute for Advanced Computing Applications and Technologies
>>>>>> Paul and Cynthia Saylor Professor of Computer Science
>>>>>> University of Illinois Urbana-Champaign
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Aug 20, 2012, at 9:43 AM, Pavan Balaji wrote:
>>>>>>
>>>>>>> 5. Patch #3: For the MPI-3 functionality that's not implemented, it
>>>>>>> looks like you are just calling an Abort. I'd recommend adding a new
>>>>>>> error code MPIX_ERR_NOT_IMPLEMENTED and return that instead. That
>>>>>>> way, applications might be able to do something useful with those
>>>>>>> functions.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Pavan Balaji
>>>>> http://www.mcs.anl.gov/~balaji
>>>>> _______________________________________________
>>>>> mpich-ibm mailing list
>>>>> mpich-ibm at lists.mcs.anl.gov
>>>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-ibm
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Hammond
>>>> Argonne Leadership Computing Facility
>>>> University of Chicago Computation Institute
>>>> jhammond at alcf.anl.gov / (630) 252-5381
>>>> http://www.linkedin.com/in/jeffhammond
>>>> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
>>>>
>>>>
>>>
>>> --
>>> Pavan Balaji
>>> http://www.mcs.anl.gov/~balaji
>>
>>
>>
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond