[petsc-users] with-openmp error with hypre

Smith, Barry F. bsmith at mcs.anl.gov
Tue Feb 13 21:07:48 CST 2018



> On Feb 13, 2018, at 8:56 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> I agree with Matt: flat 64 will be faster, I would expect, but this code has global metadata that would have to be replicated in a full-scale run.

  Use MPI 3 shared memory to expose the "global metadata" and forget this thread nonsense.
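
   For what it's worth, here is a minimal sketch of that MPI-3 shared-memory approach (not PETSc-specific; the array name, its length, and the fill loop are made up purely for illustration). One rank per node allocates the window with MPI_Win_allocate_shared on a node-local communicator, and every other rank on that node gets a direct pointer to the same storage via MPI_Win_shared_query:

#include <mpi.h>

/* Sketch: share one read-only "metadata" array per node through an MPI-3
   shared-memory window instead of replicating it on every rank (or using
   threads).  The length n and the fill loop are placeholders. */
int main(int argc, char **argv)
{
  MPI_Comm  nodecomm;
  MPI_Win   win;
  double   *meta;                   /* node-local pointer to the shared array */
  MPI_Aint  nbytes, qsize;
  int       noderank, disp_unit;
  const MPI_Aint n = 1000000;       /* hypothetical metadata length */

  MPI_Init(&argc, &argv);

  /* communicator containing only the ranks that share this node's memory */
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &nodecomm);
  MPI_Comm_rank(nodecomm, &noderank);

  /* rank 0 of each node allocates the storage, everyone else asks for 0 bytes */
  nbytes = (noderank == 0) ? n * (MPI_Aint)sizeof(double) : 0;
  MPI_Win_allocate_shared(nbytes, sizeof(double), MPI_INFO_NULL, nodecomm, &meta, &win);

  /* the other ranks on the node get a usable pointer to rank 0's segment */
  if (noderank) MPI_Win_shared_query(win, 0, &qsize, &disp_unit, &meta);

  /* rank 0 fills the metadata once; afterwards every rank reads meta[] directly */
  MPI_Win_fence(0, win);
  if (!noderank) { MPI_Aint i; for (i = 0; i < n; i++) meta[i] = (double)i; }
  MPI_Win_fence(0, win);

  /* ... hand meta to the rest of the code as a read-only array ... */

  MPI_Win_free(&win);
  MPI_Comm_free(&nodecomm);
  MPI_Finalize();
  return 0;
}

   The point is one copy of the metadata per node instead of one per rank, with plain MPI ranks doing the numerics and no OpenMP anywhere.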

> We are just doing single-socket tests now (I think).
> 
> We have been tracking down what look like compiler bugs, and we have only looked at peak performance to make sure we are not wasting our time with threads.

   You are wasting your time. There are better ways to deal with global metadata than with threads.

> 
> I agree 16x4 VS 64 would be interesting to see.
> 
> Mark
> 
> 
> 
> On Tue, Feb 13, 2018 at 2:02 PM, Kong, Fande <fande.kong at inl.gov> wrote:
> Curious about the comparison of 16x4 VS 64.
> 
> Fande,
> 
> On Tue, Feb 13, 2018 at 11:44 AM, Bakytzhan Kallemov <bkallemov at lbl.gov> wrote:
> Hi,
> I am not sure about the 64-rank flat run;
> unfortunately I did not save the logs since it is easy to rerun, but for 16 ranks, here is the plot I got of KSPSolve time for different numbers of threads.
> Baky
> 
> On 02/13/2018 10:28 AM, Matthew Knepley wrote:
>> On Tue, Feb 13, 2018 at 11:30 AM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>> > On Feb 13, 2018, at 10:12 AM, Mark Adams <mfadams at lbl.gov> wrote:
>> >
>> > FYI, we were able to get hypre with threads working on KNL on Cori by going down to -O1 optimization. We are getting about 2x speedup with 4 threads and 16 MPI processes per socket. Not bad.
>> 
>>   In other words, using 16 MPI processes with 4 threads per process is twice as fast as running with 64 MPI processes?  Could you send the -log_view output for these two cases?
>> 
>> Is that what you mean? I took it to mean
>> 
>>   We ran 16 MPI processes and got time T.
>>   We ran 16 MPI processes with 4 threads each and got time T/2.
>> 
>> I would likely eat my shirt if 16x4 was 2x faster than 64.
>> 
>>   Matt
>>  
>> 
>> >
>> > The error (flatlined or slightly diverging hypre solves) occurred even in flat MPI runs built with openmp=1.
>> 
>>   But the answers are wrong as soon as you turn on OpenMP?
>> 
>>    Thanks
>> 
>>     Barry
>> 
>> 
>> >
>> > We are going to test the Haswell nodes next.
>> >
>> > On Thu, Jan 25, 2018 at 4:16 PM, Mark Adams <mfadams at lbl.gov> wrote:
>> > Baky (cc'ed) is getting a strange error on Cori/KNL at NERSC. Using maint, it runs fine with -with-openmp=0, and it runs fine with -with-openmp=1 and gamg, but with hypre and -with-openmp=1, even running with flat MPI, the solver seems to flatline (see the attached output, and notice that the residual starts to creep after a few time steps).
>> >
>> > Maybe you can suggest a hypre test that I can run? (A minimal sketch appears after the quoted thread below.)
>> >
>> 
>> 
>> 
>> 
>> -- 
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>> 
>> https://www.cse.buffalo.edu/~knepley/
> 
> 
> 
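
   Regarding the hypre test requested above: below is a minimal sketch of a standalone check, assuming a PETSc build configured with hypre (e.g. --download-hypre). It is not an official PETSc or hypre test, and the problem (a 1-D Laplacian with a constant right-hand side) and its size are arbitrary; it solves with BoomerAMG, the default PCHYPRE type, and prints the converged reason and final residual norm.

#include <petscksp.h>

/* Sketch of a hypre sanity check: assemble a 1-D Laplacian, solve with
   BoomerAMG through PCHYPRE, and report how the solve ended.  Requires a
   PETSc build configured with hypre; size and right-hand side are arbitrary. */
int main(int argc, char **argv)
{
  Mat                A;
  Vec                x, b;
  KSP                ksp;
  PC                 pc;
  PetscInt           i, rstart, rend, n = 1000;
  PetscReal          rnorm;
  KSPConvergedReason reason;
  PetscErrorCode     ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL);CHKERRQ(ierr);

  /* tridiagonal (-1, 2, -1) matrix distributed across the ranks */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    if (i > 0)   {ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n-1) {ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE);CHKERRQ(ierr);   /* boomeramg is the default hypre type */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* so -ksp_monitor, -pc_hypre_type, ... apply */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
  ierr = KSPGetResidualNorm(ksp, &rnorm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "converged reason %d, residual norm %g\n", (int)reason, (double)rnorm);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

   Running it with the openmp=0 build and again with the openmp=1 build (flat MPI, and once more with OMP_NUM_THREADS > 1), adding -ksp_monitor_true_residual, should show whether the creeping residual reproduces outside the application.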


