version 0.9.1 is released
rene.redler
redler at ccrl-nece.de
Wed Oct 8 06:09:18 CDT 2003
Hello,
> Short summary: parallel-netcdf-0.9.1 is available.
I have just tried out the new version on the SX, and it works quite
nicely. Performance and scalability have not been checked yet.
Attached to this mail you'll find an updated README.SX that takes the
latest modifications into account.
Unfortunately, setting FFLAGS to -Wl"-h nodefs", as is required to build
the Fortran tests on SX, leads to a make error in nf_test. Since -Wl is
actually an option for the loader, it makes more sense to set FLDFLAGS
instead of FFLAGS. With that change, however, the fandc make fails
because its Makefile.in does not know about any FLDFLAGS variable. I
modified the Makefile.in in fandc (replaced MPIF77 by LINK.F and MPICC by
LINK.c where appropriate and added an "include ../../macros.make")
following what is already done in nf_test. This works on the SX and
hopefully on other architectures as well.
A minor issue in csnap.c:
Lines like (line 176):
size_t start_3d[3] = { kstart, jstart, istart };
are not accepted by the SX C compiler, which is rather strict with respect
to the C standard. As far as I know, the C standard requires the
initializers kstart, jstart and istart to be constant expressions, which
is obviously not the case here. I have attached a modified csnap.c which
should also work with compilers other than the SX C compiler (I hope).
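
To illustrate the change: C89 requires constant expressions in a brace-enclosed initializer list for an array, while C99 relaxed this for automatic variables, which is why many compilers accept the original line. A minimal sketch of the portable form follows; the helper name set_start_3d and its parameters are made up for this example and do not appear in csnap.c, which simply assigns the elements inline.

```c
#include <stddef.h>

/*
 * Portable replacement for
 *     size_t start_3d[3] = { kstart, jstart, istart };
 * which strict C89 compilers reject because the initializers are
 * run-time values. Declaring the array first and assigning each
 * element afterwards is valid in C89 and every later standard.
 *
 * set_start_3d is a hypothetical helper used only in this sketch.
 */
void set_start_3d(size_t start[3], int kstart, int jstart, int istart)
{
    start[0] = (size_t) kstart;   /* level offset     */
    start[1] = (size_t) jstart;   /* latitude offset  */
    start[2] = (size_t) istart;   /* longitude offset */
}
```

A caller declares "size_t start_3d[3];" and then calls set_start_3d(start_3d, kstart, jstart, istart) before passing the array to the put/get routines.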
Rene
_______________________________________________________________
René Redler
C&C Research Laboratories
NEC Europe Ltd. Tel: +49 (0)2241 925240
Rathausallee 10 Fax: +49 (0)2241 925299
53757 Sankt Augustin URL: www.ccrl-nece.de/~redler
_______________________________________________________________
-------------- next part --------------
# Current notes for NEC SX; based on pnetcdf version 0.9.1; Oct 8, 2003
. On the SX, integer*1 does not exist, so no "int1" interfaces will be built.
However, the declarations for the int1 functions still exist in pnetcdf.inc,
which the compiler does not like. Add -Wl"-h nodefs" to the FLDFLAGS
environment variable to work around this problem. In a future release, we
will generate pnetcdf.inc
. Our build process builds libpnetcdf.a from the objects in src/lib, then if
the fortran interface is being built, adds the objects in src/libf to the
already-existing library. NEC's make implementation does not like this: it
sees that the .a file is newer than the .f files and thus skips adding the
object files to the library.
To work around this, do the usual "configure; make", but then change to
src/libf and run "make" a 2nd time.
. SX Cross compiler environment is not supported yet. configure and make steps
have to be invoked on the SX directly.
. With the following environment variables, libpnetcdf.a has been
built successfully:
MPICC=mpic++
MPIF77=mpif90
FC=f90
CC=c++
FLDFLAGS=-Wl"-h nodefs"
Built on NEC SX6 with:
- C++/SX Compiler Rev.055 2003/03/25
- f90/SX Compiler Rev.270 2003/04/04
. Compiler
Robert Latham <robl at mcs.anl.gov>, Rene Redler <redler at ccrl-nece.de>
-------------- next part --------------
/******************************************************************************
This is to test MPInetcdf, the parallel netCDF library being developed
at Argonne National Lab and Northwestern Univ.
This code writes one or two arrays, tt[k][j][i] (and smf[j][i], if
'only_3d' is 0), into the file 'csnap.nc.' It then reads the field(s)
from the file, and compares with the original field values.
i=longitude, j=latitude, k=level
To run: Set the global sizes, parallel decomposition and other I/O
parameters below.
By Woo-Sun Yang and Chris Ding
NERSC, Lawrence Berkeley National Laboratory
*****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>
#include <limits.h>
#include <float.h>
#include <mpi.h>
#include <pnetcdf.h>
/*** Field parameters ***/
const int totsiz_3d[3] = { 256, 256, 256 }; /* global sizes of 3D field */
int totsiz_2d[2]; /* global sizes of 2D field */
int locsiz_3d[3]; /* local sizes of 3D fields */
int locsiz_2d[2]; /* local sizes of 2D fields */
int istart, jstart, kstart; /* offsets of 3D field */
const int random_fields = 0; /* random fields? 1 or 0 */
const int only_3d = 1; /* I/O 3D field only? 1 or 0 */
int has_2d; /* contains valid 2D data? */
const int nwrites = 5; /* number of write samples */
const int nreads = 5; /* number of read samples */
const int fillmode = NC_NOFILL; /* NC_FILL or NC_NOFILL; actually
prefilling not supported */
/*** Parallel domain decomposition parameters ***/
MPI_Comm comm_cart; /* Cartesian communicator */
int mype; /* rank in comm_cart */
int totpes; /* total number of PEs */
int numpes[3] = { 0, 1, 1 }; /* number of PEs along axes;
determined by MPI where
a zero is specified */
int pe_coords[3]; /* Cartesian PE coords */
/*** function prototypes ***/
void find_locnx(int nx, int mype, int totpes, int *locnx, int *xbegin);
void write_file(char *filename, double *t);
void read_file(char *filename, double *t);
void get_fields(double *tt, double *smf);
void compare_vec(double *a, double *b, int ndims, int *sizes, int corr_data);
int main(int argc, char *argv[]) {
  int isperiodic[3] = {0, 0, 0};
  int reorder = 0;
  double t[20], t_g[20];
  double file_size;
  double rates_l[4], rates_g[4];
  int i;
  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&totpes);
  MPI_Dims_create(totpes,3,numpes);
  MPI_Cart_create(MPI_COMM_WORLD,3,numpes,isperiodic,reorder,&comm_cart);
  MPI_Comm_rank(comm_cart,&mype);
  MPI_Cart_coords(comm_cart,mype,3,pe_coords);
  /*
    Determine local sizes for tt (locsiz_3d) and smf (locsiz_2d).
    Also determine whether the current processor contains valid 2D data.
    Compute file_size in 1e6 Bytes
  */
  find_locnx(totsiz_3d[0],pe_coords[0],numpes[0],&locsiz_3d[0],&kstart);
  find_locnx(totsiz_3d[1],pe_coords[1],numpes[1],&locsiz_3d[1],&jstart);
  find_locnx(totsiz_3d[2],pe_coords[2],numpes[2],&locsiz_3d[2],&istart);
  totsiz_2d[0] = totsiz_3d[1];
  totsiz_2d[1] = totsiz_3d[2];
  locsiz_2d[0] = locsiz_3d[1];
  locsiz_2d[1] = locsiz_3d[2];
  has_2d = (! only_3d) && (pe_coords[0] == numpes[0] - 1);
  if (only_3d)
    file_size = (((double) totsiz_3d[0])*((double) totsiz_3d[1])
                 * ((double) totsiz_3d[2])) * 1.0e-6 * sizeof(double);
  else
    file_size = (((double) totsiz_3d[0])*((double) totsiz_3d[1])
                 * ((double) totsiz_3d[2])
                 + ((double) totsiz_2d[0])*((double) totsiz_2d[1]))
                * 1.0e-6 * sizeof(double);
  /* Print data decomposition information */
  if (mype == 0)
    printf("mype pe_coords totsiz_3d locsiz_3d "
           "kstart,jstart,istart\n");
  MPI_Barrier(comm_cart);
  printf("%3d %2d %2d %2d %4d %4d %4d %4d %4d %4d %6d %6d %6d\n",
         mype, pe_coords[0], pe_coords[1], pe_coords[2],
         totsiz_3d[0], totsiz_3d[1], totsiz_3d[2],
         locsiz_3d[0], locsiz_3d[1], locsiz_3d[2],
         kstart, jstart, istart);
  /* Write and then read back */
  for (i=0; i < 20; t[i++] = DBL_MAX); /* ready for timing */
  write_file("csnap.nc", &t[ 0]);
  read_file ("csnap.nc", &t[10]);
  /* Compute I/O rates */
  rates_l[0] = file_size / t[1]; /* write rate */
  rates_l[1] = file_size /(t[0] + t[1]); /* effective write rate */
  rates_l[2] = file_size / t[11]; /* read rate */
  rates_l[3] = file_size /(t[10] + t[11]); /* effective read rate */
  MPI_Allreduce(rates_l, rates_g, 4, MPI_DOUBLE, MPI_MIN, comm_cart);
  MPI_Allreduce(t, t_g, 20, MPI_DOUBLE, MPI_MAX, comm_cart);
  if (mype == 0) {
    printf("File size: %10.3e MB\n", file_size);
    printf(" Write: %9.3f MB/s (eff., %9.3f MB/s)\n",
           rates_g[0], rates_g[1]);
    printf(" Read : %9.3f MB/s (eff., %9.3f MB/s)\n",
           rates_g[2], rates_g[3]);
    printf(" %c %10.3e %3d %10.3e %10.3e %8.3f %10.3e %10.3e %8.3f\n",
           ((fillmode == NC_FILL) ? 'f' : 'n'), file_size, totpes,
           t_g[0], t_g[1], rates_g[1], t_g[10], t_g[11], rates_g[3]);
  }
  MPI_Comm_free(&comm_cart);
  MPI_Finalize();
  return 0;
}
void write_file(char *filename, double *t) {
  double *tt = NULL;
  double *smf = NULL;
  double t1, t2, t3;
  int dim_id[3];
  int lon_id, lat_id, lev_id;
  int ierr;
  int file_id;
  int t_id, smf_id;
  int ii;
  size_t start_3d[3];
  size_t count_3d[3];
  size_t start_2d[2];
  size_t count_2d[2];
  start_3d[0] = kstart;
  start_3d[1] = jstart;
  start_3d[2] = istart;
  count_3d[0] = locsiz_3d[0];
  count_3d[1] = locsiz_3d[1];
  count_3d[2] = locsiz_3d[2];
  start_2d[0] = jstart;
  start_2d[1] = istart;
  count_2d[0] = locsiz_2d[0];
  count_2d[1] = locsiz_2d[1];
  tt = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));
  if (has_2d)
    smf = malloc(locsiz_2d[0]*locsiz_2d[1]*sizeof(double));
  else
    smf = malloc(sizeof(double));
  for (ii = 1; ii <= nwrites; ii++) {
    if(mype == 0) unlink(filename);
    get_fields(tt, smf);
    MPI_Barrier(comm_cart);
    t1 = MPI_Wtime();
    ierr = ncmpi_create(comm_cart, filename, NC_CLOBBER, MPI_INFO_NULL,
                        &file_id);
    /* ierr = nc_set_fill(file_id,fillmode,&old_fillmode); */
    ierr = ncmpi_def_dim(file_id,"level", (size_t) totsiz_3d[0],&lev_id);
    ierr = ncmpi_def_dim(file_id,"latitude", (size_t) totsiz_3d[1],&lat_id);
    ierr = ncmpi_def_dim(file_id,"longitude",(size_t) totsiz_3d[2],&lon_id);
    dim_id[0] = lev_id; dim_id[1] = lat_id; dim_id[2] = lon_id;
    ierr = ncmpi_def_var(file_id,"t",NC_DOUBLE,3,dim_id,&t_id);
    if (! only_3d)
      ierr = ncmpi_def_var(file_id,"smf",NC_DOUBLE,2,&dim_id[1],&smf_id);
    ierr = ncmpi_enddef(file_id);
    t2 = MPI_Wtime();
    ierr = ncmpi_put_vara_double_all(file_id,t_id,start_3d,count_3d,tt);
    if (! only_3d) {
      ierr = ncmpi_begin_indep_data(file_id);
      if (has_2d)
        ierr = ncmpi_put_vara_double(file_id,smf_id,start_2d,count_2d,smf);
      ierr = ncmpi_end_indep_data(file_id);
    }
    ierr = ncmpi_close(file_id);
    MPI_Barrier(comm_cart);
    t3 = MPI_Wtime();
    if (t2 - t1 < t[0]) t[0] = t2 - t1;
    if (t3 - t2 < t[1]) t[1] = t3 - t2;
    if (mype == 0) printf("write %d: %9.3e %9.3e\n", ii, t2-t1, t3-t2);
  }
  free(tt);
  free(smf);
}
void read_file(char *filename, double *t) {
  double *tt = NULL;
  double *smf = NULL;
  double *buf = NULL;
  double t1, t2, t3;
  /* Time spent in compare_vec, excluded from the read timing.
     Initialized because dt2 is never assigned when only_3d is set. */
  double dt1 = 0.0, dt2 = 0.0;
  int ncid;
  int vid_t, vid_smf;
  int i, j, k, ii, ierr;
  size_t start_3d[3];
  size_t count_3d[3];
  size_t start_2d[2];
  size_t count_2d[2];
  start_3d[0] = kstart;
  start_3d[1] = jstart;
  start_3d[2] = istart;
  count_3d[0] = locsiz_3d[0];
  count_3d[1] = locsiz_3d[1];
  count_3d[2] = locsiz_3d[2];
  start_2d[0] = jstart;
  start_2d[1] = istart;
  count_2d[0] = locsiz_2d[0];
  count_2d[1] = locsiz_2d[1];
  tt = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));
  if (has_2d)
    smf = malloc(locsiz_2d[0]*locsiz_2d[1]*sizeof(double));
  else
    smf = malloc(sizeof(double));
  buf = malloc(locsiz_3d[0]*locsiz_3d[1]*locsiz_3d[2]*sizeof(double));
  get_fields(tt, smf);
  for (ii = 1; ii <= nreads; ii++) {
    double *ptr = buf;
    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
          *ptr++ = 4.444;
    MPI_Barrier(comm_cart);
    t1 = MPI_Wtime();
    ierr = ncmpi_open(comm_cart, filename, NC_NOWRITE, MPI_INFO_NULL, &ncid);
    ierr = ncmpi_inq_varid(ncid,"t",&vid_t);
    if (! only_3d) ierr = ncmpi_inq_varid(ncid,"smf",&vid_smf);
    t2 = MPI_Wtime();
    ierr = ncmpi_get_vara_double_all(ncid,vid_t,start_3d,count_3d,buf);
    dt1 = MPI_Wtime();
    if (ii == 1) compare_vec(tt,buf,3,locsiz_3d,1);
    dt1 = MPI_Wtime() - dt1;
    if (! only_3d) {
      ierr = ncmpi_begin_indep_data(ncid);
      if (has_2d)
        ierr = ncmpi_get_vara_double(ncid,vid_smf,start_2d,count_2d,buf);
      dt2 = MPI_Wtime();
      if (ii == 1) compare_vec(smf,buf,2,locsiz_2d,has_2d);
      dt2 = MPI_Wtime() - dt2;
      ierr = ncmpi_end_indep_data(ncid);
    }
    ierr = ncmpi_close(ncid);
    MPI_Barrier(comm_cart);
    t3 = MPI_Wtime();
    if (t2 - t1 < t[0]) t[0] = t2 - t1;
    if ((t3 - t2) - (dt1 + dt2) < t[1]) t[1] = (t3 - t2) - (dt1 + dt2);
    if (mype == 0) printf(" read %d: %9.3e %9.3e\n", ii, t2-t1,
                          (t3-t2)-(dt1+dt2));
  }
  free(tt);
  free(smf);
  free(buf);
}
void find_locnx(int nx, int mype, int totpes, int *locnx, int *xbegin) {
  int xremain;
  *locnx = nx / totpes;
  xremain = nx - totpes*(*locnx);
  if (mype < xremain) (*locnx)++;
  *xbegin = mype*(nx/totpes) + xremain;
  if (mype < xremain) *xbegin += mype - xremain;
}
void get_fields(double *tt, double *smf) {
  int i, j, k;
  if (random_fields) {
    unsigned int seed = (INT_MAX / totpes) * mype;
    srand(seed);
    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
          *tt++ = ((double) (rand())) / (RAND_MAX + 1.);
    if (has_2d)
      for (j = 0; j < locsiz_2d[0]; j++)
        for (i = 0; i < locsiz_2d[1]; i++)
          *smf++ = ((double) (rand())) / (RAND_MAX + 1.);
  }
  else {
    for (k = 0; k < locsiz_3d[0]; k++)
      for (j = 0; j < locsiz_3d[1]; j++)
        for (i = 0; i < locsiz_3d[2]; i++)
          *tt++ = (istart + i + 1 + totsiz_3d[2]*(jstart + j
                   + totsiz_3d[1]*(kstart + k)))*1.e-3;
    if (has_2d)
      for (j = 0; j < locsiz_2d[0]; j++)
        for (i = 0; i < locsiz_2d[1]; i++)
          *smf++ = (istart + i + 1 + totsiz_2d[1]*(jstart + j))*1.e-2;
  }
}
void compare_vec(double *a, double *b, int ndims, int *sizes, int corr_data) {
  double diff, delta, delmax, delmin;
  double ws[5], wr[5];
  int totsiz, i;
  if (corr_data) {
    totsiz = 1;
    for (i = 0; i < ndims; i++)
      totsiz = totsiz * sizes[i];
    ws[0] = 0.; /* diff */
    ws[1] = 0.; /* sumsq */
    ws[2] = totsiz; /* totsiz */
    ws[3] = 0.; /* delmax */
    ws[4] = DBL_MAX; /* delmin */
    for (i = 0; i < totsiz; i++) {
      delta = (a[i] - b[i]) * (a[i] - b[i]);
      ws[0] = ws[0] + delta;
      ws[1] = ws[1] + a[i] * a[i];
      if (delta > ws[3]) ws[3] = delta;
      if (delta < ws[4]) ws[4] = delta;
    }
  }
  else {
    ws[0] = ws[1] = ws[2] = ws[3] = 0.;
    ws[4] = DBL_MAX;
  }
  MPI_Allreduce( ws, wr, 3, MPI_DOUBLE, MPI_SUM, comm_cart);
  MPI_Allreduce(&ws[3], &delmax, 1, MPI_DOUBLE, MPI_MAX, comm_cart);
  MPI_Allreduce(&ws[4], &delmin, 1, MPI_DOUBLE, MPI_MIN, comm_cart);
  diff = sqrt(wr[0]/wr[1]); /* Normalized error */
  delmax = sqrt(wr[2]*delmax/wr[1]); /* Normalized max difference */
  delmin = sqrt(wr[2]*delmin/wr[1]); /* Normalized min difference */
  if (mype == 0)
    printf("diff, delmax, delmin = %9.3e %9.3e %9.3e\n", diff, delmax, delmin);
}
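
The block decomposition that find_locnx performs along each axis can be tried without MPI. The sketch below restates the same arithmetic in a standalone function; the name block_range and its parameter names are chosen for this example only and are not part of csnap.c. It assumes npes > 0 and 0 <= rank < npes.

```c
/*
 * Split nx grid points over npes ranks as evenly as possible.
 * The first (nx % npes) ranks receive one extra point, and *begin is
 * the global index of the first point owned by the given rank.
 * This is the same arithmetic as find_locnx, written with min(rank, rem).
 *
 * block_range is a hypothetical name used only in this sketch.
 */
void block_range(int nx, int rank, int npes, int *count, int *begin)
{
    int base = nx / npes;   /* points every rank gets           */
    int rem  = nx % npes;   /* leftover points to hand out      */

    *count = base + (rank < rem ? 1 : 0);
    *begin = rank * base + (rank < rem ? rank : rem);
}
```

For example, 10 points over 3 ranks gives counts 4, 3, 3 with starting offsets 0, 4, 7, so the ranges are contiguous and cover the whole axis.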
-------------- next part --------------
srcdir = @srcdir@
VPATH = @srcdir@
include ../../macros.make
ALL: pnctestf pnctest csnap pnf_test
INCDIR = $(srcdir)/../../src/lib
INCDIRF = $(srcdir)/../../src/libf/
LNKDIR = ../../src/lib
MPICC = @MPICC@
MPIF77 = @MPIF77@
EXECS = pnctestf pnctest csnap pnf_test
pnctestf: pnctestf.F
	$(LINK.F) -o pnctestf $(srcdir)/pnctestf.F -I$(INCDIRF) -L$(LNKDIR) -lpnetcdf -lm $(LIBS)
pnctest: pnctest.c
	$(LINK.c) -o pnctest $(srcdir)/pnctest.c -I$(INCDIR) -L$(LNKDIR) -lpnetcdf -lm $(LIBS)
csnap: csnap.o
	$(LINK.c) -L$(LNKDIR) -o csnap csnap.o -lpnetcdf -lm $(LIBS)
csnap.o: csnap.c
	$(MPICC) -I$(INCDIR) -c $(srcdir)/csnap.c
pnf_test: pnf_test.o
	$(LINK.F) -L$(LNKDIR) -o pnf_test pnf_test.o -lpnetcdf -lm $(LIBS)
pnf_test.o: pnf_test.F
	$(MPIF77) $(FFLAGS) -I$(INCDIRF) -c $(srcdir)/pnf_test.F
clean:
	rm -f $(EXECS) *.o *.nc