[petsc-users] Error with VecDestroy_MPIFFTW+0x61

Smith, Barry F. bsmith at mcs.anl.gov
Mon Apr 15 13:44:14 CDT 2019

  There are two distinct issues here. 

1) The use of fftw_malloc(). This is a relatively minor issue, but it is what causes the crash in the code: VecDuplicate() uses PetscMalloc() to obtain the array, but when the vector is destroyed fftw_free() is called on that array. 

2) The padding that FFTW needs in its vectors. Look at the MatCreateVecsFFTW_FFTW() source code:

      alloc_local = fftw_mpi_local_size_1d(dim[0],comm,FFTW_FORWARD,FFTW_ESTIMATE,&local_n0,&local_0_start,&local_n1,&local_1_start);
      if (fin) {
        data_fin  = (fftw_complex*)fftw_malloc(sizeof(fftw_complex)*alloc_local);
        ierr      = VecCreateMPIWithArray(comm,1,local_n0,N,(const PetscScalar*)data_fin,fin);CHKERRQ(ierr);

Note that fftw_mpi_local_size_1d() returns a "size" alloc_local that is used to size the fftw_malloc() allocation, and the resulting array is then passed into VecCreateMPIWithArray(). This size is LARGER than local_n0, the local size of the MPI vector. FFTW routines assume that the "extra space" is available at the end of the arrays and use it as "work" space. The fin, fout, and bout vectors sometimes have different local sizes and different amounts of padding (see the code for exact details). 

When you call VecDuplicate() on a fin vector, it does not know about the padding, so it creates a new vector whose array has only local_n0 entries, that is, without any padding at the end. If you pass this new vector into an FFTW routine, the routine will access out-of-bounds memory and may crash or produce incorrect results.  
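
To make the lost padding concrete, here is a minimal sketch in plain C. It uses neither PETSc nor FFTW: ToyVec and every toy_* function are hypothetical stand-ins invented for illustration, and the sizes are made up. It only models the bookkeeping, a logical length local_n next to a larger allocation alloc_local, and compares a duplicate that sizes its array from local_n with one that carries alloc_local along.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a distributed vector; this is NOT the PETSc Vec type. */
typedef struct {
  size_t  local_n;     /* logical local length (local_n0 in the excerpt) */
  size_t  alloc_local; /* entries actually allocated, >= local_n */
  double *array;
} ToyVec;

/* Create a vector whose array includes the trailing work space. */
int toy_create(size_t local_n, size_t alloc_local, ToyVec *v)
{
  if (alloc_local < local_n) return 1;
  v->array = calloc(alloc_local, sizeof(double));
  if (!v->array) return 1;
  v->local_n     = local_n;
  v->alloc_local = alloc_local;
  return 0;
}

/* Duplicate that only knows the logical length: the padding is lost. */
int toy_duplicate_naive(const ToyVec *src, ToyVec *dst)
{
  return toy_create(src->local_n, src->local_n, dst);
}

/* Duplicate that also carries the allocation size: the padding survives. */
int toy_duplicate_padded(const ToyVec *src, ToyVec *dst)
{
  return toy_create(src->local_n, src->alloc_local, dst);
}

/* Number of trailing entries available as work space. */
size_t toy_padding(const ToyVec *v)
{
  return v->alloc_local - v->local_n;
}

void toy_destroy(ToyVec *v)
{
  free(v->array);
  v->array = NULL;
  v->local_n = v->alloc_local = 0;
}
```

With local_n = 8 and alloc_local = 12, the naive duplicate ends up with no trailing entries, so a routine that writes into the assumed work space past entry 8 would run off the end of its array; the padded duplicate keeps all 4.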

Note that if one replaced all the fftw_malloc() calls with PetscMalloc(), the out-of-bounds error described above would still be there. Vectors created by VecDuplicate() would not have the extra padding they need.

My proposed fix, outlined earlier, resolves both problems at the same time.


A final note, just to make it absolutely clear: the use of fftw_malloc() has nothing to do with the extra padding at the end of the vector arrays! The extra padding comes from the call alloc_local = fftw_mpi_local_size_1d() and from using alloc_local to determine the size of the array to allocate. fftw_malloc() itself does not put any extra padding at the end.
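
The separation between "which allocator" and "how much padding" can be sketched in plain C as well. Again, nothing here is PETSc or FFTW API: OwnedBuf, plain_alloc()/plain_free(), and tracked_alloc()/tracked_free() are hypothetical names, and the tracked pair is only a stand-in for any library allocator that keeps its own bookkeeping. The sketch assumes a buffer that records the release routine matching its allocator; the padding depends only on the requested count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*AllocFn)(size_t bytes);
typedef void  (*FreeFn)(void *ptr);

/* Hypothetical buffer that remembers how it must be released. */
typedef struct {
  size_t  local_n;     /* logical length */
  size_t  alloc_local; /* allocated length, >= local_n */
  double *array;
  FreeFn  release;     /* must match the allocator that produced array */
} OwnedBuf;

/* Allocator pair 1: thin wrappers around malloc/free. */
void *plain_alloc(size_t bytes) { return malloc(bytes); }
void  plain_free(void *ptr)     { free(ptr); }

/* Allocator pair 2: keeps a count of live allocations. */
static long tracked_live = 0;
void *tracked_alloc(size_t bytes)
{
  void *p = malloc(bytes);
  if (p) tracked_live++;
  return p;
}
void tracked_free(void *ptr)
{
  if (ptr) { tracked_live--; free(ptr); }
}
long tracked_live_count(void) { return tracked_live; }

/* The padding comes from alloc_local alone, whichever allocator is used. */
int owned_create(size_t local_n, size_t alloc_local,
                 AllocFn alloc, FreeFn release, OwnedBuf *b)
{
  if (alloc_local < local_n) return 1;
  b->array = alloc(alloc_local * sizeof(double));
  if (!b->array) return 1;
  b->local_n     = local_n;
  b->alloc_local = alloc_local;
  b->release     = release;
  return 0;
}

size_t owned_padding(const OwnedBuf *b)
{
  return b->alloc_local - b->local_n;
}

/* Release through the recorded routine, never through a guessed one. */
void owned_destroy(OwnedBuf *b)
{
  b->release(b->array);
  b->array = NULL;
}
```

Creating one buffer with each allocator pair and the same (local_n, alloc_local) gives identical padding, and destroying each through its recorded release routine leaves the tracked allocator's live count at zero. Releasing a tracked buffer with plain_free() instead would leave that count wrong, which is a mild analogue of issue 1.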

> On Apr 15, 2019, at 11:47 AM, Sajid Ali <sajidsyed2021 at u.northwestern.edu> wrote:
> Hi Barry & Matt, 
> I'd be happy to contribute a patch once I understand what's going on. 
> @Matt, where is the padding occurring? In VecCreateFFTW I see that each process looks up the dimension of the array it's supposed to hold and asks for memory to hold that via fftw_malloc (which, as you say, is just a wrapper around a SIMD-aligned malloc). Is the crash occurring because the first vector was created via fftw_malloc and duplicated via PetscMalloc, and they happen to have different alignment sizes (FFTW was compiled with simd=avx2 since I'm using a Broadwell Xeon, and PetscMalloc aligns to PETSC_MEMALIGN)?
> PS: I've only ever used FFTW via the Python interface (I automated the build but couldn't automate testing of pyfftw-mpi, since Cython coverage reporting is confusing).  
> Thank You,
> Sajid Ali
> Applied Physics
> Northwestern University
