<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div> Yes, the PetscKokkosScalar is just a PetscScalar in the Kokkos memory space (same with CUDA). <div class=""><br class=""></div><div class=""> Using variable names that give hints of where the variable lives is good but I think also indicating with a specific type is also useful to understand the code easily without needing to hunt around what lives where. Unfortunately the compiler cannot do type checking but for me it is still useful trying to understand the code.</div><div class=""><br class=""></div><div class=""> Barry<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Dec 13, 2020, at 9:23 PM, Mark Adams <<a href="mailto:mfadams@lbl.gov" class="">mfadams@lbl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><br class=""></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Dec 13, 2020 at 4:34 PM Barry Smith <<a href="mailto:bsmith@petsc.dev" class="">bsmith@petsc.dev</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;" class=""><br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Dec 13, 2020, at 8:39 AM, Mark Adams <<a href="mailto:mfadams@lbl.gov" target="_blank" class="">mfadams@lbl.gov</a>> wrote:</div><br class=""><div class=""><div dir="ltr" class="">So how do I declare <div class=""><br class=""></div><div class="">struct myMat {<div class=""> PetscScalar *a;</div><div class=""> PetscInt *i, *j, nrows;</div><div class="">}</div><div class=""><br class=""></div><div class="">so that I can use, or give me an alternative idiom, </div><div class=""><br class=""></div><div class="">struct myMat h_mat, *d_mat;<br class=""></div><div class="">....</div><div class="">cudaMemcpy( d_mat, &h_mat, sizeof(myMat), cudaMemcpyHostToDevice))<br class=""></div></div></div></div></blockquote><div class=""><br class=""></div> This will copy the pointers (not the entries in the arrays) which means in both d_mat and h_mat a and i,j refer to Kokkos (or CUDA etc) memory so </div></div></blockquote><div class=""><br class=""></div><div class="">Yes. Note, h_mat is, in my usage, a temporary container. It is a mix of host and device memory and is just there to stuff device pointers and raw (meta) data, to be copied whole onto the device.</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;" class=""><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div dir="ltr" class=""><div class="">struct myMat {<div class=""> PetscKokkosScalar *a;</div><div class=""> PetscKokkosInt *i, *j;</div></div></div></blockquote> PetscInt nrows;<blockquote type="cite" class=""><div dir="ltr" class=""><div class=""><div class="">}</div></div></div></blockquote><br class=""></div><div class=""> the nrows is annoying because in d_mat it refers to device memory while in h_mat it refers to host memory so we cannot label it clearly.</div></div></blockquote><div class=""><br class=""></div><div class="">Kokkos wraps all this ugliness, but in raw Cuda I do it this manual way. It's low level and you have to keep track in your head (names help) what is going on. But this mixed h_mat is just around in a local scope. A better name for it might be h_temp_mat.</div><div class=""><br class=""></div><div class="">I think your PetscKokkosScalar is a Scalar in the default execution's memory space.</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;" class=""><div class=""><br class=""></div><div class=""> Sorry, by code reader I meant someone reading or maintaining the code. </div><div class=""><br class=""></div><div class=""> To me the above struct is clearer, I know immediately that *a and *i, *j point to Kokkos space. While labeled as PetscScalar *a I have to figure out somehow by examining other code where it points (this is a waste of time I think). </div></div></blockquote><div class=""><br class=""></div><div class="">I think Junchao is doing something like this in aijkokkosimpl.hpp, as I said. He and I use variable names to make it clear where they are (eg, d_mat for a Mat on the device). Junchoa had to use Kokkos data type to talk to Kokkos, but we both use naming conventions.</div><div class=""><br class=""></div><div class="">Here is a whole method that does this idiom. I have a Mat with a "data" pointer (this is a bad name but this was a void* at first) to its device "self". I'm not sure how your suggestion would impact this.</div><div class=""><br class=""></div><div class="">PETSC_EXTERN PetscErrorCode LandauCUDACreateMatMaps(P4estVertexMaps *maps, pointInterpolationP4est (*points)[LANDAU_MAX_Q_FACE], PetscInt Nf, PetscInt Nq)<br class="">{<br class=""> P4estVertexMaps h_maps;<br class=""> PetscFunctionBegin;<br class=""> h_maps.num_elements =maps->num_elements;<br class=""> h_maps.num_face = maps->num_face;<br class=""> h_maps.num_reduced = maps->num_reduced;<br class=""> h_maps.deviceType = maps->deviceType;<br class=""> h_maps.Nf = Nf;<br class=""> h_maps.Nq = Nq;<br class=""> CUDA_SAFE_CALL(cudaMalloc((void **)&h_maps.c_maps, maps->num_reduced * sizeof *points));<br class=""> CUDA_SAFE_CALL(cudaMemcpy( h_maps.c_maps, maps->c_maps, maps->num_reduced * sizeof *points, cudaMemcpyHostToDevice));<br class=""> CUDA_SAFE_CALL(cudaMalloc((void **)&h_maps.gIdx, maps->num_elements * sizeof *maps->gIdx));<br class=""> CUDA_SAFE_CALL(cudaMemcpy( h_maps.gIdx, maps->gIdx, maps->num_elements * sizeof *maps->gIdx, cudaMemcpyHostToDevice));<br class=""> CUDA_SAFE_CALL(cudaMalloc((void **)&maps->data, sizeof(P4estVertexMaps)));<br class=""> CUDA_SAFE_CALL(cudaMemcpy( maps->data, &h_maps, sizeof(P4estVertexMaps), cudaMemcpyHostToDevice));<br class=""> PetscFunctionReturn(0);<br class="">}<br class=""></div><div class=""><br class=""></div><div class="">Mark</div></div></div>
</div></blockquote></div><br class=""></div></body></html>