[petsc-dev] https://www.dursi.ca/post/hpc-is-dying-and-mpi-is-killing-it.html

Matthew Knepley knepley at gmail.com
Sun Mar 17 19:37:53 CDT 2019


On Sun, Mar 17, 2019 at 8:02 PM Jeff Hammond <jeff.science at gmail.com> wrote:

> When this was written, I was convinced that Dursi was wrong about
> everything because one of the key arguments against MPI was
> fault-intolerance, which I was sure was going to be solved soon.  However,
> LLNL has done everything in their power to torpedo MPI fault-tolerance in
> MPI-4 for the past 3+ years and I am no longer optimistic about MPI's
> ability to grow outside of traditional HPC because of the forum's inability
> to take fault-tolerance seriously.  It's also unclear that we can get by
> without it in a post-exascale world.
>

I am interested to hear that. I have never thought much of the software
fault tolerance argument. First, I do not think people will buy machines
that fail all the time. Second, I do not think software fault tolerance
is a very promising avenue, looking at the failure of standardization on
almost everything.

  Matt


> Jeff
>
> On Sun, Mar 17, 2019 at 4:33 PM Matthew Knepley via petsc-dev <
> petsc-dev at mcs.anl.gov> wrote:
>
>> On Sun, Mar 17, 2019 at 5:34 PM Jed Brown via petsc-dev <
>> petsc-dev at mcs.anl.gov> wrote:
>>
>>> "Smith, Barry F. via petsc-dev" <petsc-dev at mcs.anl.gov> writes:
>>>
>>> >   I stumbled on this today; I should have seen it years ago.
>>>
>>> https://lists.mpich.org/pipermail/devel/2015-April/000536.html
>>>
>>> https://twitter.com/KpDooty/status/585763759777574912
>>>
>>> Proposing, as replacements for MPI, systems with no successful libraries
>>> is a strange hill to die on.  I'd like a system with more flexible
>>> process management and better support/conventions for interactive
>>> environments (where interrupts and programming bugs are common).  That
>>> would help reduce impedance mismatch between MPI and the likes of Spark
>>> and Dask.  Any discussion of such tools should include an explanation of
>>> why MPI has been/is used by machine learning groups including Watson,
>>> Baidu, Bing, and OpenAI.
>>>
>>
>> I wasted 30s skimming this. I should have used that to recognize the
>> author as a member of the FLASH team before reading.
>>
>>    Matt
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>