# [petsc-users] Usage of parallel FFT for doing batch 1d-FFTs over the columns of a dense 2d-matrix

Matthew Knepley knepley at gmail.com
Fri Dec 4 07:07:27 CST 2020

```On Fri, Dec 4, 2020 at 7:47 AM Roland Richter <roland.richter at ntnu.no>
wrote:

> Ideally those FFTs could be handled in parallel, after they are not
> depending on each other. Is that possible with MatFFT, or should I rather
> use FFTW for that?
>
> Which FFTs? The rows on each process are handled in parallel. For the rows
on a process, you could call FFTW yourself to try and get
instruction level parallelism over rows, but if the rows are long I don't
think this would matter much. I guess you could measure. If you
wanted parallelism among rows, you could divide the matrix more finely. If
you wanted parallelism within a row, you would need to change
the matrix storage  format. I am not sure if any packages do this,
certainly not any I know of. You could treat the rows as PETSc vectors and
get parallelism this way I guess.

Thanks,

Matt

> Thanks,
>
> Roland
> Am 04.12.20 um 13:19 schrieb Matthew Knepley:
>
> On Fri, Dec 4, 2020 at 5:32 AM Roland Richter <roland.richter at ntnu.no>
> wrote:
>
>> Hei,
>>
>> I am currently working on a problem which requires a large amount of
>> transformations of a field E(r, t) from time space to Fourier space E(r,
>> w) and back. The field is described in a 2d-matrix, with the r-dimension
>> along the columns and the t-dimension along the rows.
>>
>> For the transformation from time to frequency space and back I therefore
>> have to apply a 1d-FFT operation over each row of my matrix. For my
>> earlier attempts I used armadillo as matrix library and FFTW for doing
>> the transformations. Here I could use fftw_plan_many_dft to do all FFTs
>> at the same time. Unfortunately, armadillo does not support MPI, and
>> therefore I had to switch to PETSc for larger matrices.
>>
>> Based on the examples (such as example 143) PETSc has a way of doing
>> FFTs internally by creating an FFT object (using MatCreateFFT).
>> Unfortunately, I can not see how I could use that object to conduct the
>> operation described above without having to iterate over each row in my
>> original matrix (i.e. doing it sequential, not in parallel).
>>
>> Ideally I could distribute the FFTs such over my nodes that each node
>> takes several rows of the original matrix and applies the FFT to each of
>> them. As example, for a matrix with a size of 4x4 and two nodes node 0
>> would take row 0 and 1, while node 1 takes row 2 and 3, to avoid
>> unnecessary memory transfer between the nodes while conducting the FFTs.
>> Is that something PETSc can do, too?
>>
>
> The way I understand our setup (I did not write it), we use plan_many_dft
> to handle
> multiple dof FFTs, but these would be interlaced. You want many FFTs for
> non-interlaced
> storage, which is not something we do right now. You could definitely call
> FFTW directly
> if you want.
>
> Second, above it seems like you just want serial FFTs. You can definitely
> create a MatFFT
> with PETSC_COMM_SELF, and apply it to each row in the local rows, or
> create the plan
> yourself for the stack of rows.
>
>    Thanks,
>
>      Matt
>
>
>> Thanks!
>>
>> Regards,
>>
>> Roland
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their