[petsc-dev] fork for programming models debate (was "Using multiple mallocs with PETSc")
Jed Brown
jed at jedbrown.org
Tue Mar 14 22:52:16 CDT 2017
Jeff Hammond <jeff.science at gmail.com> writes:
> On Mon, Mar 13, 2017 at 8:08 PM, Jed Brown <jed at jedbrown.org> wrote:
>>
>> Jeff Hammond <jeff.science at gmail.com> writes:
>>
>> > OpenMP did not prevent OpenCL,
>>
>> This programming model isn't really intended for architectures with
>> persistent caches.
>>
>
> It's not clear to me how much this should matter in a good implementation.
> The lack of implementation effort for OpenCL on cache-coherent CPU
> architectures appears to be a more significant issue.

How do you keep data resident in cache between kernel launches?

>> > C11, C++11
>>
>> These are basically pthreads, which predates OpenMP.
>>
>
> I'm not sure why it matters which one came first. POSIX standardized
> threads in 1995, while OpenMP was first standardized in 1997. However, the
> first serious Pthreads implementation in Linux was in 2003.

And the first serious OpenMP on OS X was when?

> OpenMP standardized the best practices identified in Kuck, SGI and
> Cray directives, just like POSIX presumably standardized best
> practices in OS threads from various Unix implementations.
>
> C++11 and beyond have concurrency features beyond just threads. You
> probably hate all of them because they are C++, and in any case I won't
> argue, because I don't see anything that's implemented better
>
>>
>> > or Fortran 2008
>>
>> A different language and doesn't play well with others.
>>
>
> Sure, but you could use Fortran 2003 features to interoperate between C and
> Fortran if you wanted to leverage Fortran 2008 concurrency features in an
> ISO-compliant way. I'm not suggesting you want to do this, but I dispute
> the suggestion that Fortran does not play nice with C.

I think the above qualifies as not playing nicely in this context.

> Fortran coarray images are OS processes in every implementation I know,
> although the standard does not explicitly require this implementation. The
> situation is identical to that of MPI, although there are actually MPI
> implementations based upon OS threads rather than OS processes (and they
> require compiler or OS magic to deal with non-heap data).
>
> Both of the widely available Fortran coarray implementations use MPI-3 RMA
> under the hood and all of the ones I know about define an image to be an OS
> process.

Are you trying to sell PETSc on MPI?

>> > from introducing parallelism. Not sure if your comment was meant to be
>> > serious,
>>
>> Partially. It was just enough to give the appearance of a solution
>> while not really being a solution.
>>
>
> It still isn't clear what you actually want. You appear to reject every
> standard API for enabling explicit vectorization for CPU execution
> (Fortran, OpenMP, OpenCL), which suggests that (1) you do not believe in
> vectorization, (2) you think that autovectorizing compilers are sufficient,
> (3) you think vector code is necessarily a non-portable software construct,
> or (4) you do not think vectorization is relevant to PETSc.

So OpenMP is strictly about vectorization, with nothing to do with
threads, and MPI is sufficient? I don't have a problem with that, but I
will probably stick to attributes and intrinsics instead of omp simd, at
least until it matures and demonstrates feature parity.

Have you tried writing a BLIS microkernel using omp simd? Is it any
good?
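
For readers unfamiliar with the two styles being compared, here is a minimal sketch in C, assuming a GCC- or Clang-compatible compiler. It is not a BLIS microkernel and is not taken from PETSc. The function names (axpy_omp_simd, axpy_vector_ext) and the v4d type are hypothetical, and the vector_size extension is used only as one possible reading of "attributes"; real intrinsics would be target-specific (for example AVX on x86).

```c
#include <stddef.h>
#include <string.h>

/* Variant 1: OpenMP SIMD directive.
 * Build with -fopenmp-simd (or -fopenmp) to enable the pragma. Without the
 * flag the pragma is ignored and the loop is still correct scalar code. */
void axpy_omp_simd(size_t n, double a, const double *restrict x,
                   double *restrict y)
{
  #pragma omp simd
  for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

/* Variant 2: explicit vector type via the GCC/Clang vector_size attribute.
 * v4d holds four doubles (32 bytes); this is a hypothetical helper type. */
typedef double v4d __attribute__((vector_size(32)));

void axpy_vector_ext(size_t n, double a, const double *restrict x,
                     double *restrict y)
{
  const v4d va = {a, a, a, a};        /* broadcast the scalar by hand */
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    v4d vx, vy;
    memcpy(&vx, x + i, sizeof vx);    /* load without assuming alignment */
    memcpy(&vy, y + i, sizeof vy);
    vy += va * vx;                    /* element-wise multiply and add */
    memcpy(y + i, &vy, sizeof vy);    /* store the four results */
  }
  for (; i < n; i++) y[i] += a * x[i]; /* scalar remainder loop */
}
```

In the first variant the compiler chooses the vector width and handles the remainder. In the second, the width, the loads and stores, and the tail are all spelled out by hand, which is roughly the extra control (and the extra burden) that hand-written kernels trade for.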