[Nek5000-users] fields on a regular grid

nek5000-users at lists.mcs.anl.gov
Sun Mar 6 04:14:33 CST 2011



Dear Supriyo,

I'm quite certain we don't have what you desire, particularly
since your input .rea file is not itself lexicographically ordered.

One suggestion that I think would be of value is to generate your
.rea file as a single tensor-product (i.e., lexicographically ordered)
entity by explicitly inputting the x-, y-, and z-ordinates.  The new
box file would look something like:

10 10 10 nelx,nely,nelz
0 .05 .1 .2 .35 .5 .65 .8 .9 .95 1. 
0 .05 .1 .2 .35 .5 .65 .8 .9 .95 1.
0 .05 .1 .2 .35 .5 .65 .8 .9 .95 1.


Note that, because of the way genbox was written, an input line can
have at most 80 characters, so if your list of ordinates does not fit
on a single line, you must wrap it across several lines.  As an example,
the following yields the same result as above:


10 10 10 nelx,nely,nelz
0 .05 .1 .2 .35 
.5 .65 .8 .9 .95 1. 
0 .05 .1 .2 .35 
.5 .65 .8 .9 .95 1.
0 .05 .1 .2 .35 
.5 .65 .8 .9 .95 1.


I use a script or short f77 program to generate my desired
coordinate lists.   (The "negative" nelx option was just
a shortcut to automatically provide commonly used distributions.)
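
For what it's worth, a minimal stand-alone sketch of such a program (the
program name and the cosine-clustered distribution below are purely
illustrative, not anything built into Nek) might look like:

       program gencoord
c      Write the nelx+1 ordinates of a wall-clustered distribution on
c      [0,1], five per line, so each line stays well under genbox's
c      80-character limit.
       parameter (nelx=10)
       real x(0:nelx)
       pi = 4.*atan(1.)
       do i=0,nelx
          x(i) = 0.5*(1.-cos(pi*i/nelx))
       enddo
       write(6,1) (x(i),i=0,nelx)
    1  format(5f10.6)
       end

You would then paste its output (once per direction) below the
"nelx,nely,nelz" line of the .box file.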

With the data organized in this way it will be easier to transfer
into other formats.  For example, you'll have some hope of doing
tensor-product operations.   If you already have a lot of data
in your old layout, you can use the grid-to-grid interpolation
feature to transfer the data to the tensor-product-ordered mesh.
See g2gi on the wiki page for how to do this.   You should check
the accuracy of the g2gi output -- I would recommend interpolating
the coordinates (xm1,ym1,zm1) as a test.

Coming back to your original question, even once the data is
in a regular SEM array, we don't have any parallel mechanism
to repartition the n x n x n grid data onto P processors, because
we have never had a need for one.   We do have off-line serial routines
that can transfer data from an nelx x nely x nelz array of Nth-order
elements to an n x n x n array; this allows for analysis by FFTs, etc.

Also, if the elements on each processor form a "perfect" block
decomposition of the space, then you could locally reorganize the
data into that processor's sub-block of the global n x n x n array.
That is, if you had, say, 64 processors and an 8x8x8 array of elements,
with a contiguous 2x2x2 block of elements on each processor, then the
remapping would be entirely local.
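
For instance, the index arithmetic for such a layout might look like the
following sketch (the variable names and the x-fastest rank ordering are
assumptions about your setup, not anything Nek provides):

       integer p,mpx,mpy,mpz,px,py,pz,n
       integer i0,i1,j0,j1,k0,k1

c      p = this processor's rank (0-based) on an mpx x mpy x mpz
c          processor grid
c      n = global points per direction (n/mpx, etc., assumed to
c          divide evenly)
       px = mod(p,mpx)
       py = mod(p/mpx,mpy)
       pz = p/(mpx*mpy)

       i0 = px*(n/mpx) + 1       ! this processor holds the sub-block
       i1 = (px+1)*(n/mpx)       ! (i0:i1, j0:j1, k0:k1) of the global
       j0 = py*(n/mpy) + 1       ! n x n x n array
       j1 = (py+1)*(n/mpy)
       k0 = pz*(n/mpz) + 1
       k1 = (pz+1)*(n/mpz)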

As I type this, I realize that we do have a tool that could be
used to re-order the data.   I'm going to describe it under
the assumption that your data is a proper tensor product of
nelx x nely x nelz elements --- if it is not, you simply need
to figure out where, in such an array, element eg (out of nelgv
elements total) is located.

Assuming the (nelx,nely,nelz) layout such that nelgv=nelx*nely*nelz,
then

       include 'SIZE'
       include 'TOTAL'
       include 'ZPER'
       integer e,eg,ex,ey,ez

       do e=1,nelv
          eg = lglel(e)                              ! local --> global element number
          call get_exyz(ex,ey,ez,eg,nelx,nely,nelz)  ! global number --> (ex,ey,ez)
          :
          :
          :
       enddo
would tell you the (ex,ey,ez) index for element e in your (nelx,nely,nelz)
array.   From this, you can discern your local 1D point sets

     {i,j,k} in [1:n,1:n,1:n]

required for each element.  Given this, you get a set of values

     {xi,yj,zk} in [0,1]^3

and you can use a tensor product of 1D interpolators to map from
element e to the desired u_ijk values.  Use fd_weights_full in
Nek to generate the polynomial interpolation weights from the SEM
nodes to your 1D coordinates; each call returns one row of the
interpolation matrix.  Then you need to apply this interpolator in
tensor-product fashion to each element e, e=1,nelv, on each processor.
("grep weights *.f" in the trunk/nek directory to see example
usage.)
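
To make that a bit more concrete, here is the sort of thing I have in
mind (a sketch only, untested: the declarations would go at the top of
your routine and the rest inside the do e=1,nelv loop shown earlier;
m, rout, sout, tout, uout, and the work arrays are names I am inventing
here, and rout/sout/tout would hold the output points that fall inside
element e, mapped to its reference coordinates in [-1,1]):

       parameter (m=8)                 ! output points per element, per direction
       integer i,k,a
       real rout(m),sout(m),tout(m)
       real jx(m,lx1),jyt(ly1,m),jzt(lz1,m)   ! 1D interpolation matrices
       real w1(m,ly1,lz1),w2(m,m,lz1)         ! tensor-product work arrays
       real uout(m,m,m),wk(lx1)

c      Build the 1D interpolation matrices, one row per output point
c      (the "0" argument requests interpolation only, no derivatives).
       do i=1,m
          call fd_weights_full(rout(i),zgm1(1,1),nx1-1,0,wk)
          do a=1,nx1
             jx(i,a) = wk(a)
          enddo
          call fd_weights_full(sout(i),zgm1(1,2),ny1-1,0,wk)
          do a=1,ny1
             jyt(a,i) = wk(a)
          enddo
          call fd_weights_full(tout(i),zgm1(1,3),nz1-1,0,wk)
          do a=1,nz1
             jzt(a,i) = wk(a)
          enddo
       enddo

c      Apply uout = (Jz x Jy x Jx) u to, e.g., the x-velocity of element e.
       call mxm(jx,m,vx(1,1,1,e),nx1,w1,ny1*nz1)     ! interpolate in r
       do k=1,nz1
          call mxm(w1(1,1,k),m,jyt,ny1,w2(1,1,k),m)  ! ... then in s
       enddo
       call mxm(w2,m*m,jzt,nz1,uout,m)               ! ... then in t

The result uout(i,j,k) holds the values at the (xi,yj,zk) points inside
element e, which you can then copy into your portion of the global
n x n x n array.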

You then need to organize your n x n x n data according to your desired
processor layout.   If this requires data movement, I think you can
use crystal_ituple_transfer and sort (grep crystal *.f).

The key here is to exploit the underlying tensor-product layout
of both the input and output data.  This allows you to use a
tensor-product of 1D interpolators, rather than general-purpose
off-grid interpolation, which is more expensive and generally
less accurate.   A (very) good idea is to choose your "n" distribution
so that none of the desired output points lies on a spectral-element
boundary.  Otherwise, there is ambiguity about which element or
processor is responsible for keeping that value.  Even if you adopt
a tie-breaking rule such as less-than-or-equal, floating-point
round-off can still put the data somewhere other than where you
think it should be.
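
For example (just one of many ways to arrange this), if you take n to be
a multiple of nelx and place the output points at the "cell centers",

     x_i = (i - 1/2)/n ,    i = 1,...,n,

then no x_i can ever land on an element boundary j/nelx:  that would
require i = j*(n/nelx) + 1/2, which is impossible for integer i.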

If this seems complicated, well, it is -- primarily because you
want to move between two data spaces.   Again, we do have these
tools in serial (which is where we would typically post-process
under such circumstances).


Paul







> 
> 
> Dear Paul,
> Thanks for your prompt reply. We greatly appreciate it.
> 
> Yes, our domain is box-shaped (3D). Our mesh is the same in all three
> directions. Please find attached the .box file for your reference.
> 
> We are running NEK in parallel. We want the array to be equipartitioned 
> among all
> the processors. Array size = cube_root(number_elements)*order_polynomial.
> 
> We do not want the regular array to be output to a file.
> 
> with regards,
> 
> Supriyo
> 
> 
> 
> On Sat, 5 Mar 2011, nek5000-users at lists.mcs.anl.gov wrote:
> 
> > Dear NEK developers,
> >
> > We are trying to get velocity and temperature fields on a regular grid as
> > separate arrays (not being dumped in some file).
> > map2reg in postprocess.f seems to be doing that. However we are confused
> > with the dimensionality of the output array in that function.
> >
> > Please help us in this regard.
> >
> > With best wishes
> >
> > Supriyo Paul


