[petsc-dev] Modify 3rd party lib
Stefano Zampini
stefano.zampini at gmail.com
Fri Apr 24 16:12:06 CDT 2020
Sherry,
You may want to include the attached patch in SUPERLU_DIST master.
It removes the print statements that are not guarded by the PRNTlevel macro.
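
For context, here is a minimal sketch of what guarding a print with the macro looks like. It is not taken from the patch: `report_factor_stats` is a hypothetical helper rather than a SuperLU_DIST routine, and defaulting PRNTlevel to 0 when the build does not define it is an assumption of this sketch.

```c
#include <stdio.h>

/* Assumption of this sketch: if the build does not pass -DPRNTlevel=N,
   fall back to 0 (quiet). */
#ifndef PRNTlevel
#define PRNTlevel 0
#endif

/* Hypothetical helper, not a SuperLU_DIST function. It returns the number
   of diagnostic lines it printed so the effect of the guards is observable. */
static int report_factor_stats(int nsupers)
{
    int printed = 0;
#if ( PRNTlevel >= 1 )
    /* Summary output: compiled in only when PRNTlevel is at least 1. */
    printf("nsupers = %d\n", nsupers);
    ++printed;
#endif
#if ( PRNTlevel >= 2 )
    /* Verbose output: compiled in only when PRNTlevel is at least 2. */
    printf("(verbose) per-supernode details for %d supernodes\n", nsupers);
    ++printed;
#endif
    (void)nsupers; /* avoids an unused-parameter warning when PRNTlevel is 0 */
    return printed;
}
```

Compiled without any -DPRNTlevel flag, a call such as report_factor_stats(134) prints nothing and returns 0; compiling with -DPRNTlevel=2 enables both messages. A print that sits outside such a guard is emitted regardless of the level, which is the case the patch targets.
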
On Tue, Apr 21, 2020 at 2:12 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>
> On Mon, Apr 20, 2020 at 10:28 PM Xiaoye S. Li <xsli at lbl.gov> wrote:
>
>> Mark,
>> thanks for debugging this! Indeed, I confirm -- that particular "free"
>> should be a regular free instead of cudaFreeHost(), because that data
>> structure is not allocated by cudaHostAlloc(). I have been running this
>> CUDA code on Summit, but somehow the bug didn't show up there.
>>
>
> Odd, but it seems to work fine for me now. E.g., I get a speedup of 6x on a
> ~50K-equation 3D system (Q3 elements with 2 dof per vertex).
>
>
>>
>> I just updated the master branch with this fix. It will be included in a
>> future release.
>>
>> As for PRNTlevel>=2, perhaps check your CMake build script. It should be
>> set to 0 for a production build.
>>
>>
> I don't see where that gets set. PRNTlevel does not seem to be in our
> repo. I see it in 'MAKE_INC/make.cuda_gpu: -DDEBUGlevel=0
> -DPRNTlevel=1 -DPROFlevel=0', but I think it is set at >= 2. I have
> manually disabled the print statements (~ 5 places).
>
> Thanks,
> Mark
>
>
>> Sherry
>>
>>
>> On Sun, Apr 19, 2020 at 6:32 PM Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> Also, we have PRNTlevel>=2 in SuperLU_dist. This is causing a lot of
>>> output. It's not clear where that is set (it's a #define)
>>>
>>> On Sun, Apr 19, 2020 at 9:28 PM Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>> Sherry, I found the problem.
>>>>
>>>> I added this print statement to dDestroy_LU
>>>>
>>>>     nb = CEILING(nsupers, grid->npcol);
>>>>     for (i = 0; i < nb; ++i)
>>>>         if ( Llu->Lrowind_bc_ptr[i] ) {
>>>>             fprintf(stderr, "dDestroy_LU: GPU free Llu->Lnzval_bc_ptr[%d/%d] = %p, "
>>>>                     "CPU free Llu->Lrowind_bc_ptr = %p\n",
>>>>                     i, nb, Llu->Lnzval_bc_ptr[i], Llu->Lrowind_bc_ptr[i]);
>>>>             SUPERLU_FREE (Llu->Lrowind_bc_ptr[i]);
>>>> #ifdef GPU_ACC
>>>>             checkCuda(cudaFreeHost(Llu->Lnzval_bc_ptr[i]));
>>>> #else
>>>>             SUPERLU_FREE (Llu->Lnzval_bc_ptr[i]);
>>>> #endif
>>>>         }
>>>>
>>>> And I see:
>>>>
>>>> 1 SNES Function norm 1.245977692562e-04
>>>>
>>>> dDestroy_LU: GPU free Llu->Lnzval_bc_ptr[0/134] = 0x4ff9b000, CPU free Llu->Lrowind_bc_ptr = 0x4ff9a000
>>>> ex112d: cudahook.cc:762: CUresult host_free_callback(void*): Assertion `cacheNode != __null' failed.
>>>>
>>>> This looks like Lnzval_bc_ptr is on the CPU, so I removed the GPU_ACC
>>>> stuff and it works now.
>>>>
>>>> I see this in distribution. Perhaps this is a serial-run bug?
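
The failure reported above is an allocator mismatch: cudaFreeHost() was called on a pointer that, per the diagnosis in this thread, did not come from a pinned-host allocation. As an illustration only, here is a minimal sketch of one way to avoid that class of bug by recording the allocator next to the pointer. It is not the fix that went into SuperLU_DIST (that fix was to use the regular free), every name in it is hypothetical, and it deliberately makes no CUDA calls so it builds without the toolkit; plain malloc/free stand in for both paths.

```c
#include <stdlib.h>

/* Which allocator produced a buffer. HOST_PINNED stands in for memory that
   would come from cudaHostAlloc()/cudaMallocHost() in real GPU code. */
typedef enum { HOST_PAGEABLE, HOST_PINNED } mem_kind_t;

/* Hypothetical wrapper: the pointer travels together with its origin. */
typedef struct {
    void      *ptr;
    mem_kind_t kind;
} tracked_buf_t;

/* Counters that make the chosen free path observable in this sketch. */
static int pinned_frees = 0;
static int plain_frees  = 0;

static tracked_buf_t tracked_alloc(size_t nbytes, mem_kind_t kind)
{
    tracked_buf_t b;
    b.ptr  = malloc(nbytes); /* stand-in: a real pinned path would use the CUDA allocator */
    b.kind = kind;
    return b;
}

static void tracked_free(tracked_buf_t *b)
{
    if (b->ptr == NULL) return;
    if (b->kind == HOST_PINNED) {
        ++pinned_frees;      /* real code would call cudaFreeHost(b->ptr) here */
    } else {
        ++plain_frees;       /* real code would call the regular free here */
    }
    free(b->ptr);            /* stand-in: both kinds are malloc-backed in this sketch */
    b->ptr = NULL;
}
```

With this shape, the free routine dispatches on how the buffer was actually allocated rather than on a compile-time flag such as GPU_ACC, so a buffer allocated on the regular heap is never handed to the pinned-memory free.
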
>>>>
>>>> On Sun, Apr 19, 2020 at 5:58 PM Xiaoye S. Li <xsli at lbl.gov> wrote:
>>>>
>>>>> Mark,
>>>>> you should fork a branch of your own to do this.
>>>>>
>>>>> Sherry
>>>>>
>>>>> On Sun, Apr 19, 2020 at 2:54 PM Stefano Zampini <
>>>>> stefano.zampini at gmail.com> wrote:
>>>>>
>>>>>> First, commit your changes to the superlu_dist branch, then rerun
>>>>>> configure with
>>>>>>
>>>>>> --download-superlu_dist-commit=HEAD
>>>>>>
>>>>>>
>>>>>> > On Apr 20, 2020, at 12:50 AM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>>> >
>>>>>> > I would like to modify SuperLU_dist, but if I change the source and
>>>>>> > configure, it says no need to reconfigure, use --force. I use --force and it
>>>>>> > seems to clobber my changes. Can I tell configure to build but not
>>>>>> > download SuperLU?
>>>>>>
>>>>>>
--
Stefano
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch_superlu_dist
Type: application/octet-stream
Size: 998 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20200425/2d08b030/attachment.obj>