[petsc-users] Error with VecDestroy_MPIFFTW+0x61

Smith, Barry F. bsmith at mcs.anl.gov
Sun Apr 14 23:57:39 CDT 2019


  http://www.fftw.org/doc/Memory-Allocation.html

  The issue isn't really the fftw_malloc(); it is that the current code is broken if the user calls VecDuplicate(), because the new vectors don't have the correct extra space needed by FFTW, which could cause random crashes and incorrect answers.
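
To make the "extra space" concrete, here is a minimal sketch in plain C. It assumes the simplest case described in the FFTW documentation, an in-place real-to-complex transform, where the last dimension of n real samples must be padded to 2*(n/2 + 1) doubles. The helper names are hypothetical, and the MPI interface may require a different (possibly larger) local allocation than this formula, so this is an illustration of the problem rather than what the PETSc code computes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of doubles an in-place real-to-complex
   transform needs when the last dimension holds n real samples.
   The output has n/2 + 1 complex values, i.e. 2*(n/2 + 1) doubles. */
size_t padded_real_length(size_t n)
{
  return 2 * (n / 2 + 1);
}

/* Hypothetical helper: doubles that a plain length-n copy (what a
   generic duplicate would allocate) is missing. */
size_t missing_padding(size_t n)
{
  return padded_real_length(n) - n;
}
```

A duplicate that allocates only n entries is short by missing_padding(n) entries, which is the kind of overrun that can show up as random crashes or wrong answers.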

  The correct fix is to have VecDuplicate_FFTW_fin(), VecDuplicate_FFTW_fout(), and VecDuplicate_FFTW_bout() routines. MatCreateVecsFFTW_FFTW() would
attach the matrix to each new vector with PetscObjectCompose(); the duplicate routines would then call PetscObjectQuery() to get the matrix and call
MatCreateVecsFFTW_FFTW() to create the duplicate.
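
The pattern proposed above can be sketched without PETSc as a toy model. Everything here (ToyFFTMat, ToyVec, toy_create_fft_vec, toy_duplicate, toy_destroy) is hypothetical and only stands in for the real objects: the parent pointer plays the role of the matrix attached with PetscObjectCompose(), and reading it back plays the role of PetscObjectQuery(). It is a sketch of the idea under those assumptions, not the PETSc implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the FFT matrix; not a PETSc type. */
typedef struct {
  size_t n_local;     /* logical local length */
  size_t alloc_local; /* length FFTW actually requires (>= n_local) */
} ToyFFTMat;

/* Hypothetical stand-in for a vector. */
typedef struct {
  double          *data;
  size_t           n;        /* logical length */
  size_t           capacity; /* allocated length */
  const ToyFFTMat *parent;   /* "composed" matrix; NULL for ordinary vectors */
} ToyVec;

/* Plays the role of MatCreateVecsFFTW_FFTW(): allocates with the padding
   and records the parent matrix on the new vector. */
ToyVec *toy_create_fft_vec(const ToyFFTMat *A)
{
  ToyVec *v = malloc(sizeof *v);
  if (!v) return NULL;
  v->data = calloc(A->alloc_local, sizeof *v->data);
  if (!v->data) { free(v); return NULL; }
  v->n        = A->n_local;
  v->capacity = A->alloc_local;
  v->parent   = A;
  return v;
}

/* Plays the role of a VecDuplicate_FFTW_*() routine: if a parent matrix
   is attached, re-create the vector through it so the padding is kept. */
ToyVec *toy_duplicate(const ToyVec *v)
{
  if (v->parent) return toy_create_fft_vec(v->parent);

  /* Ordinary vector: a plain copy of the layout is enough. */
  ToyVec *w = malloc(sizeof *w);
  if (!w) return NULL;
  w->data = calloc(v->n, sizeof *w->data);
  if (!w->data) { free(w); return NULL; }
  w->n        = v->n;
  w->capacity = v->n;
  w->parent   = NULL;
  return w;
}

void toy_destroy(ToyVec *v)
{
  if (v) { free(v->data); free(v); }
}
```

Because the duplicate goes back through the creation routine, the same allocator is used for every FFT vector, so a custom destroy routine never sees memory that came from a different allocator.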

   Pretty easy. Sajid, do you want to give it a try?

But note: one has to be very careful about which vectors one duplicates and to use them in the right places. For example, one could duplicate a fin vector but use it in an fout or bout location, and there is no error checking for that in PETSc or FFT. To add error checking, one could attach to the FFT vectors a marker indicating whether each is fin, fout, or bout; calls to MatMult_FFTW() etc. would then check the markers on the vectors and raise an error if they are incompatible.
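
A minimal sketch of that marker check in plain C follows. The types and function names are hypothetical, and it assumes that the forward transform reads a fin vector and writes an fout vector, while the backward transform reads an fout vector and writes a bout vector; that assignment of roles is an assumption made for the illustration.

```c
#include <assert.h>

/* Hypothetical role marker attached to each FFT vector. */
typedef enum { ROLE_NONE = 0, ROLE_FIN, ROLE_FOUT, ROLE_BOUT } ToyFFTRole;

/* Hypothetical stand-in for a vector carrying its marker. */
typedef struct {
  ToyFFTRole role;
} ToyTaggedVec;

/* Check the arguments of a forward transform (assumed: fin -> fout).
   Returns 0 if compatible, 1 if the input is wrong, 2 if the output is wrong. */
int toy_check_forward(const ToyTaggedVec *x, const ToyTaggedVec *y)
{
  if (x->role != ROLE_FIN)  return 1;
  if (y->role != ROLE_FOUT) return 2;
  return 0;
}

/* Check the arguments of a backward transform (assumed: fout -> bout).
   Returns 0 if compatible, 1 if the input is wrong, 2 if the output is wrong. */
int toy_check_backward(const ToyTaggedVec *x, const ToyTaggedVec *y)
{
  if (x->role != ROLE_FOUT) return 1;
  if (y->role != ROLE_BOUT) return 2;
  return 0;
}
```

With such a check, passing a duplicated fin vector as the output of the forward transform is reported immediately instead of silently corrupting memory.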

    Barry

I find this FFTW model of requiring extra space in the arrays to be a horrifically fragile API. Perhaps I misunderstand it.



> On Apr 14, 2019, at 8:28 PM, Matthew Knepley via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> On Sun, Apr 14, 2019 at 9:12 PM Sajid Ali <sajidsyed2021 at u.northwestern.edu> wrote:
> Just to confirm, there's no error when running with one rank. The error occurs only with mpirun -np x (x>1). 
> 
> This is completely broken. I attached a version that will work in parallel, but it's ugly.
> 
> PETSc People:
> MatCreateVecsFFTW() calls fftw_malloc()!!! in parallel. What possible motivation could there be?
> This causes a failure because the custom destroy calls fftw_free(). VecDuplicate() calls PetscMalloc(),
> but then the custom destroy calls fftw_free() on that memory and chokes on the header we put on all
> allocated blocks. It's not easy to see who wrote the fftw_malloc() lines, but it seems to be at least 8 years
> ago. I can convert them to PetscMalloc(), but do we have any tests that would make us confident that
> this is not wrecking something? Is anyone familiar with this part of the code?
> 
>   Matt
>  
> Attaching the error log.
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/
> <ex_modify.c>


