<div dir="ltr"><div dir="ltr">Dammit, didn't reply-all. Sorry.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman <<a href="mailto:ptbauman@gmail.com">ptbauman@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 21, 2022 at 8:37 AM Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">"Paul T. Bauman" <<a href="mailto:ptbauman@gmail.com" target="_blank">ptbauman@gmail.com</a>> writes:<br>
<br>
> On Fri, Jan 21, 2022 at 8:19 AM Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:<br>
><br>
>> Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>> writes:<br>
>><br>
>> > Two questions about hypre on HIP:<br>
>> ><br>
>> > * I am doing this now. Is this correct?<br>
>> ><br>
>> > '--download-hypre',<br>
>> > '--download-hypre-configure-arguments=--enable-unified-memory',<br>
>><br>
><br>
> Apologies for interjecting, but I want to point out here that a pretty good<br>
> chunk of BoomerAMG is ported to the GPU and you may not need this<br>
> unified-memory option. I point this out because you will get substantially<br>
> better performance without this option, i.e. using "native" GPU memory. I<br>
> do not know the intricacies of the PETSc/HYPRE/GPU interaction so maybe<br>
> PETSc won't handle the CPU->GPU memcopies for you (I'm assuming vecs, mats<br>
> are assembled on the CPU) in which case you might need the option. And if<br>
> you do run into code paths in BoomerAMG that are not ported to the GPU and<br>
> you want to use them, I'd be very interested to know what the options are<br>
> that are missing a GPU port.<br>
<br>
We have matrices and vectors assembled on the device and logic to pass the device data to Hypre. Stefano knows the details.<br>
<br>
Will the option --enable-unified-memory hurt performance if we provide all data on the device?<br></blockquote><div><br></div><div>Yes. HYPRE's memory model is set up so that ALL GPU allocations are either "native" (i.e. [cuda,hip]Malloc) or, if unified memory is enabled, ALL unified memory (i.e. [cuda,hip]MallocManaged). Regarding HIP, an HMM implementation of hipMallocManaged is planned, but it is not yet delivered AFAIK (and it will *not* support gfx906, e.g. RVII, FYI), so, today, under the covers, hipMallocManaged is calling hipHostMalloc. So, today, all your unified memory allocations in HYPRE on HIP are doing CPU-pinned memory accesses, and performance is just truly terrible (as you might expect).<br></div></div></div>
</blockquote></div></div>
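<div dir="ltr">To make the choice discussed in this thread concrete, here is a minimal sketch of how the PETSc configure options quoted by Mark could be built with or without the unified-memory argument. The helper name hypre_configure_options and its unified_memory parameter are hypothetical, introduced only for illustration; the two option strings are the ones quoted in the thread, and any other options a HIP build needs are deliberately left out.</div>

```python
def hypre_configure_options(unified_memory=False):
    """Return the hypre-related PETSc configure options (hypothetical helper).

    With unified_memory=False, hypre is built with its default "native"
    GPU allocations. With unified_memory=True, the --enable-unified-memory
    argument quoted in the thread is passed through to hypre's configure.
    """
    options = ['--download-hypre']
    if unified_memory:
        # Assumption: this is the only hypre argument being toggled.
        options.append(
            '--download-hypre-configure-arguments=--enable-unified-memory'
        )
    return options


if __name__ == '__main__':
    # Print the options for a build that uses native GPU memory.
    for opt in hypre_configure_options(unified_memory=False):
        print(opt)
```

<div dir="ltr">If matrices and vectors are already assembled on the device, as Jed describes, leaving unified_memory at False matches the recommendation above.</div>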