version 0.9.1 is released

rene.redler redler at ccrl-nece.de
Wed Oct 8 06:09:18 CDT 2003


Hello,

> Short summary: parallel-netcdf-0.9.1 is available.  

I have just tried out the new version on the SX, and it works quite 
nicely. Performance and scalability have not been checked yet.

Attached to this mail you'll find an updated README.SX that takes the 
latest modifications into account.

Unfortunately, setting FFLAGS to -Wl"-h nodefs", as is required to build 
the Fortran tests on the SX, leads to a make error in nf_test. Since -Wl is 
actually a loader option, it makes more sense to set FLDFLAGS instead of 
FFLAGS. With that change, however, the make in fandc fails because its 
Makefile.in does not know about a variable FLDFLAGS. I therefore modified 
the Makefile.in in fandc (replaced MPIF77 by LINK.F and MPICC by LINK.c 
where appropriate, and added an "include ../../macros.make"), following 
what is already done in nf_test. This works on the SX and hopefully does 
on other architectures as well.

A minor issue in csnap.c:

Lines like (line 176):

  size_t start_3d[3] = { kstart, jstart, istart };

are not accepted by the SX C compiler, which is rather strict with respect 
to the C standard. As far as I know, the C standard requires the 
initializers kstart, jstart and istart to be constant expressions, which is 
obviously not the case here. I have attached a modified csnap.c, which 
should (I hope) also work with compilers other than the SX C compiler.
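
For illustration, here is a minimal sketch of the portable form, which 
declares the array first and assigns the run-time offsets element by 
element. The helper name set_start_3d is hypothetical and is not part of 
csnap.c; the attached file does the assignments inline. The restriction 
applies to C89 initializer lists for arrays; C99 relaxed it for automatic 
variables, so a C99 compiler would probably accept the original line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of csnap.c: fill a start vector from
   offsets that are only known at run time. */
void set_start_3d(size_t start[3], int kstart, int jstart, int istart)
{
  /* Offsets must be non-negative before conversion to size_t. */
  assert(kstart >= 0 && jstart >= 0 && istart >= 0);

  /* Element-wise assignment instead of
       size_t start[3] = { kstart, jstart, istart };
     which needs constant expressions under C89. */
  start[0] = (size_t) kstart;
  start[1] = (size_t) jstart;
  start[2] = (size_t) istart;
}
```

Calling set_start_3d(start_3d, kstart, jstart, istart) once find_locnx has 
set the offsets should give the same start vector as the original 
initializer.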

Rene


   _______________________________________________________________

      René Redler
      C&C Research Laboratories
      NEC Europe Ltd.                   Tel: +49 (0)2241 925240
      Rathausallee 10                   Fax: +49 (0)2241 925299 
      53757 Sankt Augustin              URL: www.ccrl-nece.de/~redler
   _______________________________________________________________
-------------- next part --------------
# Current notes for NEC SX; based on pnetcdf version 0.9.1; Oct 8, 2003

. On the SX, integer*1 does not exist, so no "int1" interfaces will be built.
  However, the declarations for the int1 functions still exist in pnetcdf.inc,
  which the compiler does not like.  Add -Wl"-h nodefs" to the FLDFLAGS
  environment variable to work around this problem.  In a future release, we
  will generate pnetcdf.inc 

. Our build process builds libpnetcdf.a from the objects in src/lib, then, if
  the Fortran interface is being built, adds the objects in src/libf to the
  already-existing library. NEC's make implementation does not like this: it
  sees that the .a file is newer than the .f files and thus skips adding the
  object files to the library.

  To work around this, do the usual "configure; make", but then change to
  src/libf and run "make" a second time.

. The SX cross-compiler environment is not supported yet. The configure and
  make steps have to be run on the SX directly.

. With the following environment variables, libpnetcdf.a has been
  built successfully:

    MPICC=mpic++
    MPIF77=mpif90
    FC=f90
    CC=c++
    FLDFLAGS=-Wl"-h nodefs"

  Built on NEC SX6 with:
   - C++/SX Compiler Rev.055 2003/03/25
   - f90/SX Compiler Rev.270 2003/04/04

. Compiler 


Robert Latham <robl at mcs.anl.gov>, Rene Redler <redler at ccrl-nece.de>
-------------- next part --------------
/******************************************************************************

  This is to test MPInetcdf, the parallel netCDF library being developed
  at Argonne National Lab and Northwestern Univ.

  This code writes one or two arrays, tt[k][j][i] (and smf[j][i], if
  'only_3d' is 0), into the file 'csnap.nc.' It then reads the field(s)
  from the file, and compares with the original field values.
 
  i=longitude, j=latitude, k=level
 
  To run: Set the global sizes, parallel decomposition and other I/O
  parameters below.
 
  By Woo-Sun Yang and Chris Ding
  NERSC, Lawrence Berkeley National Laboratory

 *****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>
#include <limits.h>
#include <float.h>
#include <mpi.h>
#include <pnetcdf.h>

/*** Field parameters ***/

const int totsiz_3d[3] = { 256, 256, 256 }; /* global sizes of 3D field */
int totsiz_2d[2];                           /* global sizes of 2D field */
int locsiz_3d[3];                           /* local sizes of 3D fields */
int locsiz_2d[2];                           /* local sizes of 2D fields */
int istart, jstart, kstart;                 /* offsets of 3D field */

const int random_fields = 0;                /* random fields? 1 or 0 */
const int only_3d       = 1;                /* I/O 3D field only? 1 or 0 */

int has_2d;                                 /* contains valid 2D data? */

const int nwrites = 5;                      /* number of write samples */
const int nreads  = 5;                      /* number of read samples */

const int fillmode = NC_NOFILL;             /* NC_FILL or NC_NOFILL; actually
                                               prefilling not supported */

/*** Parallel domain decomposition parameters ***/

MPI_Comm comm_cart;                         /* Cartesian communicator */
int mype;                                   /* rank in comm_cart */
int totpes;                                 /* total number of PEs */
int numpes[3] = {   0,   1,   1 };          /* number of PEs along axes;
                                               determined by MPI where
                                               a zero is specified */
int pe_coords[3];                           /* Cartesian PE coords */

/*** function prototypes ***/

void find_locnx(int nx, int mype, int totpes, int *locnx, int *xbegin);
void write_file(char *filename, double *t);
void read_file(char *filename, double *t);
void get_fields(double *tt, double *smf);
void compare_vec(double *a, double *b, int ndims, int *sizes, int corr_data);


int main(int argc, char *argv[]) {
  int isperiodic[3] = {0, 0, 0};
  int reorder = 0;
  double t[20], t_g[20];
  double file_size;
  double rates_l[4], rates_g[4];
  int i;

  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&totpes);

  MPI_Dims_create(totpes,3,numpes);
  MPI_Cart_create(MPI_COMM_WORLD,3,numpes,isperiodic,reorder,&comm_cart);
  MPI_Comm_rank(comm_cart,&mype);
  MPI_Cart_coords(comm_cart,mype,3,pe_coords);

/*
   Determine local sizes for tt (locsiz_3d) and smf (locsiz_2d).
   Also determine whether the current processor contains valid 2D data.
   Compute file_size in 1e6 Bytes
 */

  find_locnx(totsiz_3d[0],pe_coords[0],numpes[0],&locsiz_3d[0],&kstart);
  find_locnx(totsiz_3d[1],pe_coords[1],numpes[1],&locsiz_3d[1],&jstart);
  find_locnx(totsiz_3d[2],pe_coords[2],numpes[2],&locsiz_3d[2],&istart);

  totsiz_2d[0] = totsiz_3d[1];
  totsiz_2d[1] = totsiz_3d[2];

  locsiz_2d[0] = locsiz_3d[1];
  locsiz_2d[1] = locsiz_3d[2];

  has_2d = (! only_3d) && (pe_coords[0] == numpes[0] - 1);

  if (only_3d)
    file_size = (((double) totsiz_3d[0])*((double) totsiz_3d[1])
               * ((double) totsiz_3d[2])) * 1.0e-6 * sizeof(double);
  else
    file_size = (((double) totsiz_3d[0])*((double) totsiz_3d[1])
               * ((double) totsiz_3d[2])
               + ((double) totsiz_2d[0])*((double) totsiz_2d[1]))
               * 1.0e-6 * sizeof(double);

/* Print data decomposition information */

  if (mype == 0)
    printf("mype  pe_coords    totsiz_3d         locsiz_3d       "
           "kstart,jstart,istart\n");

  MPI_Barrier(comm_cart);

  printf("%3d   %2d %2d %2d  %4d %4d %4d    %4d %4d %4d   %6d %6d %6d\n",
         mype, pe_coords[0], pe_coords[1], pe_coords[2],
         totsiz_3d[0], totsiz_3d[1], totsiz_3d[2],
         locsiz_3d[0], locsiz_3d[1], locsiz_3d[2],
         kstart, jstart, istart);

/* Write and then read back */

  for (i=0; i < 20; t[i++] = DBL_MAX);   /* ready for timing */

  write_file("csnap.nc", &t[ 0]);
  read_file ("csnap.nc", &t[10]);

/* Compute I/O rates */

  rates_l[0] = file_size / t[1];             /* write rate */
  rates_l[1] = file_size /(t[0] + t[1]);     /* effective write rate */
  rates_l[2] = file_size / t[11];            /* read rate */
  rates_l[3] = file_size /(t[10] + t[11]);   /* effective read rate */

  MPI_Allreduce(rates_l, rates_g, 4, MPI_DOUBLE, MPI_MIN, comm_cart);
  MPI_Allreduce(t, t_g, 20, MPI_DOUBLE, MPI_MAX, comm_cart);

  if (mype == 0) {
     printf("File size: %10.3e MB\n", file_size);
     printf("    Write: %9.3f MB/s  (eff., %9.3f MB/s)\n",
            rates_g[0], rates_g[1]);
     printf("    Read : %9.3f MB/s  (eff., %9.3f MB/s)\n",
            rates_g[2], rates_g[3]);
     printf(" %c %10.3e %3d %10.3e %10.3e %8.3f %10.3e %10.3e %8.3f\n",
           ((fillmode == NC_FILL) ? 'f' : 'n'), file_size, totpes,
            t_g[0], t_g[1], rates_g[1], t_g[10], t_g[11], rates_g[3]);
  }

  MPI_Comm_free(&comm_cart);
  MPI_Finalize();
  return 0;
}


void write_file(char *filename, double *t) {
  double *tt = NULL;
  double *smf = NULL;
  double t1, t2, t3;
  int dim_id[3];
  int lon_id, lat_id, lev_id;
  int ierr;
  int file_id;
  int t_id, smf_id;
  int ii;
  size_t start_3d[3];
  size_t count_3d[3];
  size_t start_2d[2];
  size_t count_2d[2];

  start_3d[0] = kstart;
  start_3d[1] = jstart;
  start_3d[2] = istart;
  count_3d[0] = locsiz_3d[0];
  count_3d[1] = locsiz_3d[1];
  count_3d[2] = locsiz_3d[2];
  start_2d[0] = jstart;
  start_2d[1] = istart;
  count_2d[0] = locsiz_2d[0];
  count_2d[1] = locsiz_2d[1];
  
  tt = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));

  if (has_2d)
    smf = malloc(locsiz_2d[0]*locsiz_2d[1]*sizeof(double));
  else
    smf = malloc(sizeof(double));

  for (ii = 1; ii <= nwrites; ii++) {

    if(mype == 0) unlink(filename);

    get_fields(tt, smf);
    MPI_Barrier(comm_cart);

    t1 = MPI_Wtime();

    ierr = ncmpi_create(comm_cart, filename, NC_CLOBBER, MPI_INFO_NULL,
                        &file_id);

/*  ierr = nc_set_fill(file_id,fillmode,&old_fillmode); */

    ierr = ncmpi_def_dim(file_id,"level",    (size_t) totsiz_3d[0],&lev_id);
    ierr = ncmpi_def_dim(file_id,"latitude", (size_t) totsiz_3d[1],&lat_id);
    ierr = ncmpi_def_dim(file_id,"longitude",(size_t) totsiz_3d[2],&lon_id);

    dim_id[0] = lev_id; dim_id[1] = lat_id; dim_id[2] = lon_id;

    ierr = ncmpi_def_var(file_id,"t",NC_DOUBLE,3,dim_id,&t_id);

    if (! only_3d)
      ierr = ncmpi_def_var(file_id,"smf",NC_DOUBLE,2,&dim_id[1],&smf_id);

    ierr = ncmpi_enddef(file_id);

    t2 = MPI_Wtime();

    ierr = ncmpi_put_vara_double_all(file_id,t_id,start_3d,count_3d,tt);

    if (! only_3d) {
      ierr = ncmpi_begin_indep_data(file_id);

      if (has_2d)
        ierr = ncmpi_put_vara_double(file_id,smf_id,start_2d,count_2d,smf);

      ierr = ncmpi_end_indep_data(file_id);
    }

    ierr = ncmpi_close(file_id);

    MPI_Barrier(comm_cart);
    t3 = MPI_Wtime();

    if (t2 - t1 < t[0]) t[0] = t2 - t1;
    if (t3 - t2 < t[1]) t[1] = t3 - t2;
    if (mype == 0) printf("write %d: %9.3e %9.3e\n", ii, t2-t1, t3-t2);
  }

  free(tt);
  free(smf);
}


void read_file(char *filename, double *t) {
  double *tt  = NULL;
  double *smf = NULL;
  double *buf = NULL;
  double t1, t2, t3;
  double dt1, dt2 = 0.;   /* dt2 stays 0 when the 2D field is not read */
  int ncid;
  int vid_t, vid_smf;
  int i, j, k, ii, ierr;

  size_t start_3d[3];
  size_t count_3d[3];
  size_t start_2d[2];
  size_t count_2d[2];

  start_3d[0] = kstart;
  start_3d[1] = jstart;
  start_3d[2] = istart;
  count_3d[0] = locsiz_3d[0];
  count_3d[1] = locsiz_3d[1];
  count_3d[2] = locsiz_3d[2];
  start_2d[0] = jstart;
  start_2d[1] = istart;
  count_2d[0] = locsiz_2d[0];
  count_2d[1] = locsiz_2d[1];

  tt = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));

  if (has_2d)
    smf = malloc(locsiz_2d[0]*locsiz_2d[1]*sizeof(double));
  else
    smf = malloc(sizeof(double));

  buf = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));

  get_fields(tt, smf);

  for (ii = 1; ii <= nreads; ii++) {

    double *ptr = buf;
    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
          *ptr++ = 4.444;

    MPI_Barrier(comm_cart);
    t1 = MPI_Wtime();

    ierr = ncmpi_open(comm_cart, filename, NC_NOWRITE, MPI_INFO_NULL, &ncid);

    ierr = ncmpi_inq_varid(ncid,"t",&vid_t);
    if (! only_3d) ierr = ncmpi_inq_varid(ncid,"smf",&vid_smf);

    t2 = MPI_Wtime();

    ierr = ncmpi_get_vara_double_all(ncid,vid_t,start_3d,count_3d,buf);

    dt1 = MPI_Wtime();
    if (ii == 1) compare_vec(tt,buf,3,locsiz_3d,1);
    dt1 = MPI_Wtime() - dt1;

    if (! only_3d) {
      ierr = ncmpi_begin_indep_data(ncid);

      if (has_2d)
        ierr = ncmpi_get_vara_double(ncid,vid_smf,start_2d,count_2d,buf);

      dt2 = MPI_Wtime();
      if (ii == 1) compare_vec(smf,buf,2,locsiz_2d,has_2d);
      dt2 = MPI_Wtime() - dt2;

      ierr = ncmpi_end_indep_data(ncid);
    }

    ierr = ncmpi_close(ncid);

    MPI_Barrier(comm_cart);
    t3 = MPI_Wtime();

    if (t2 - t1 < t[0]) t[0] = t2 - t1;
    if ((t3 - t2) - (dt1 + dt2) < t[1]) t[1] = (t3 - t2) - (dt1 + dt2);
    if (mype == 0) printf(" read %d: %9.3e %9.3e\n", ii, t2-t1,
                          (t3-t2)-(dt1+dt2));
  }

  free(tt);
  free(smf);
  free(buf);
}


void find_locnx(int nx, int mype, int totpes, int *locnx, int *xbegin) {
  int xremain;

  *locnx = nx / totpes;
  xremain = nx - totpes*(*locnx);
  if (mype < xremain) (*locnx)++;
  *xbegin = mype*(nx/totpes) + xremain;
  if (mype < xremain) *xbegin += mype - xremain;
}


void get_fields(double *tt, double *smf) {
  int i, j, k;

  if (random_fields) {
    unsigned int seed = (INT_MAX / totpes) * mype;
    srand(seed);

    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
          *tt++ = ((double) (rand())) / (RAND_MAX + 1.);

    if (has_2d)
      for (j = 0; j < locsiz_2d[0]; j++)
        for (i = 0; i < locsiz_2d[1]; i++)
           *smf++ = ((double) (rand())) / (RAND_MAX + 1.);
  }
  else {
    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
           *tt++ = (istart + i + 1 + totsiz_3d[2]*(jstart + j
                                   + totsiz_3d[1]*(kstart + k)))*1.e-3;

    if (has_2d)
      for (j = 0; j < locsiz_2d[0]; j++)
        for (i = 0; i < locsiz_2d[1]; i++)
           *smf++ = (istart + i + 1 + totsiz_2d[1]*(jstart + j))*1.e-2;
  }
}


void compare_vec(double *a, double *b, int ndims, int *sizes, int corr_data) {
  double diff, delta, delmax, delmin;
  double ws[5], wr[5];
  int totsiz, i;

  if (corr_data) {
    totsiz = 1;
    for (i = 0; i < ndims; i++)
      totsiz = totsiz * sizes[i];

    ws[0] = 0.;           /*  diff    */
    ws[1] = 0.;           /*  sumsq   */
    ws[2] = totsiz;       /*  totsiz  */
    ws[3] = 0.;           /*  delmax  */
    ws[4] = DBL_MAX;      /*  delmin  */

    for (i = 0; i < totsiz; i++) {
      delta = (a[i] - b[i]) * (a[i] - b[i]);
      ws[0] = ws[0] + delta;
      ws[1] = ws[1] + a[i] * a[i];
      if (delta > ws[3]) ws[3] = delta;
      if (delta < ws[4]) ws[4] = delta;
    }
  }
  else {
    ws[0] = ws[1] = ws[2] = ws[3] = 0.;
    ws[4] = DBL_MAX;
  }

  MPI_Allreduce( ws,     wr,     3, MPI_DOUBLE, MPI_SUM, comm_cart);
  MPI_Allreduce(&ws[3], &delmax, 1, MPI_DOUBLE, MPI_MAX, comm_cart);
  MPI_Allreduce(&ws[4], &delmin, 1, MPI_DOUBLE, MPI_MIN, comm_cart);

  diff   = sqrt(wr[0]/wr[1]);           /*  Normalized error */
  delmax = sqrt(wr[2]*delmax/wr[1]);    /*  Normalized max difference */
  delmin = sqrt(wr[2]*delmin/wr[1]);    /*  Normalized min difference */

  if (mype == 0)
    printf("diff, delmax, delmin = %9.3e %9.3e %9.3e\n", diff, delmax, delmin);
}
-------------- next part --------------
srcdir = @srcdir@
VPATH = @srcdir@

include ../../macros.make

ALL: pnctestf pnctest csnap pnf_test

INCDIR = $(srcdir)/../../src/lib
INCDIRF = $(srcdir)/../../src/libf/
LNKDIR = ../../src/lib

MPICC = @MPICC@
MPIF77 = @MPIF77@

EXECS = pnctestf pnctest csnap pnf_test

pnctestf: pnctestf.F
	$(LINK.F) -o pnctestf $(srcdir)/pnctestf.F -I$(INCDIRF) -L$(LNKDIR) -lpnetcdf -lm $(LIBS)

pnctest: pnctest.c
	$(LINK.c) -o pnctest $(srcdir)/pnctest.c -I$(INCDIR) -L$(LNKDIR) -lpnetcdf -lm $(LIBS)

csnap:  csnap.o
	$(LINK.c) -L$(LNKDIR) -o csnap csnap.o -lpnetcdf -lm $(LIBS)

csnap.o:  csnap.c
	$(MPICC) -I$(INCDIR) -c $(srcdir)/csnap.c

pnf_test:  pnf_test.o
	$(LINK.F) -L$(LNKDIR) -o pnf_test pnf_test.o -lpnetcdf -lm $(LIBS)

pnf_test.o:  pnf_test.F
	$(MPIF77) $(FFLAGS) -I$(INCDIRF) -c $(srcdir)/pnf_test.F 

clean:
	rm -f $(EXECS) *.o *.nc


