how to initialize a parallel vector from one process?

Jed Brown jed at 59A2.org
Sun Jul 6 04:27:10 CDT 2008


I've done something similar when reading and writing netCDF in parallel where
the files are too large to be stored on a single processor.  NetCDF-4 makes this
obsolete, but here's the idea:

* The new parallel job makes a PETSc DA and uses PETSC_DECIDE for the partitioning.

* Process zero reads the header and broadcasts the dimensions.

* Each process determines the index range it needs to interpolate the file data
  onto the locally owned computational grid.  Send this index range to rank
  zero.

* Rank zero reads each block sequentially (the netCDF API can read an
  (imin,imax)x(jmin,jmax)x(kmin,kmax) block) and sends it to the appropriate process.

* Each process does the necessary interpolation locally.

I've found that this performs just fine for many GiB of state and hundreds of
processors.  You have to get your hands a bit dirty with MPI.  Of course, there
are simpler, pure-PETSc solutions if you can fit the whole state on process
zero or if you can use the PETSc binary format.

Jed


On Sun 2008-07-06 11:22, Tonellot, Thierry-Laurent D wrote:
> Hi,
> 
>  
> 
> At the very beginning of my application I need to read data in a database to
> initialize a 3D vector distributed over several processes.
> 
> The database I’m using can only be accessed by one process (for instance,
> process 0). Moreover, due to performance issues, we need to limit the number of
> requests to the database. Therefore the data need to be read in slices, for
> instance (z,x) slices.
> 
> A typical initialization would then consist of the following pseudocode:
> 
>  
> 
> Loop over y
>     If (rank == 0)
>         Read slice y
>         Send each process the appropriate part of the data slice
>     Else
>         Receive data
>     Endif
> End loop
> 
>  
> 
> This process is quite heavy and its performance will probably depend on how
> it is implemented.
> 
>  
> 
> I’m wondering if there is any way to perform this initialization efficiently
> using PETSc?
> 
>  
> 
> I’m also considering other packages to handle distributed arrays, and I’m
> wondering how a package like Global Arrays compares with PETSc/DA?
> 
>  
> 
> For instance, Global Arrays seems to have a feature that partly solves my
> problem above: the function “ga_fill_patch”, which fills only a region of
> the parallel vector and can be called by any process…
> 
>  
> 
> Thank you in advance,
> 
>  
> 
> Thierry
> 

