[Fwd: pnetcdf & Open MPI]

Dries Kimpe Dries.Kimpe at wis.kuleuven.be
Thu May 4 05:38:11 CDT 2006


The facts:

* parallel netcdf compiles with both Open MPI (svn trunk) and mpich2.
* With Open MPI, all tests fail with the following message:

Testing write ... Error: Unsupported datatype passed to
ADIOI_Count_contiguous_blocks
[lts.mydomain.be:26763] [0,0,0] ORTE_ERROR_LOG: Not found in file
../../../../orte/mca/pls/base/pls_base_proxy.c at line 189

(This happens for both independent and collective writes, no matter what the underlying variable type is.)

Steps to reproduce the problem:

1) build Open MPI trunk revision 9809

Special configure options used: --enable-static --enable-shared
gcc:
gcc (GCC) 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2) verify that the correct compiler is being called

lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpic++ -showme
g++ -I/home/lts/openmpi/include -I/home/lts/openmpi/include/openmpi -pthread -L/home/lts/openmpi/lib
-lmpi_cxx -lmpi -lorte -lopal -lrt -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpicc -showme
gcc -I/home/lts/openmpi/include -I/home/lts/openmpi/include/openmpi -pthread -L/home/lts/openmpi/lib
-lmpi -lorte -lopal -lrt -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpicc --version
gcc (GCC) 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


3) unpack parallel-netcdf-1.0.1.tar.bz2
mkdir build
cd build
mkdir openmpi
cd openmpi
../../configure --prefix=/home/lts/openmpi --disable-fortran CC=mpicc
make
make install


4) go to test directory
make
(-> the Fortran tests fail to compile, which is 'normal' here given --disable-fortran)

Try out test/test_double:

lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ ./test_write test.nc
Testing write ... ADIOI_GEN_DELETE (line 22): **io No such file or directoryError: Unsupported
datatype passed to ADIOI_Count_contiguous_blocks
[mhd3:24861] [0,0,0] ORTE_ERROR_LOG: Not found in file
../../../../orte/mca/pls/base/pls_base_proxy.c at line 189

Verify that the correct libraries are being used:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ ldd test_write
	linux-gate.so.1 =>  (0xffffe000)
	libmpi.so.0 => /home/lts/openmpi/lib/libmpi.so.0 (0xb7e3d000)
	liborte.so.0 => /home/lts/openmpi/lib/liborte.so.0 (0xb7da5000)
	libopal.so.0 => /home/lts/openmpi/lib/libopal.so.0 (0xb7d71000)
	librt.so.1 => /lib/librt.so.1 (0xb7d47000)
	libdl.so.2 => /lib/libdl.so.2 (0xb7d43000)
	libnsl.so.1 => /lib/libnsl.so.1 (0xb7d2e000)
	libutil.so.1 => /lib/libutil.so.1 (0xb7d29000)
	libm.so.6 => /lib/libm.so.6 (0xb7d07000)
	libpthread.so.0 => /lib/libpthread.so.0 (0xb7cf5000)
	libc.so.6 => /lib/libc.so.6 (0xb7be2000)
	/lib/ld-linux.so.2 (0xb7f53000)
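
The library check above can also be scripted. What follows is only a minimal sketch and not part of the original report: it parses `ldd` output and lists any MPI libraries that resolve outside an expected prefix. The function name `mpi_libs_outside_prefix` and the list of library name prefixes are assumptions made for illustration.

```python
import re

# Matches lines such as
#   "libmpi.so.0 => /home/lts/openmpi/lib/libmpi.so.0 (0xb7e3d000)".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")

# Name prefixes treated as Open MPI libraries (assumption, taken from the
# ldd listing in this report).
MPI_LIB_NAMES = ("libmpi", "liborte", "libopal")


def mpi_libs_outside_prefix(ldd_output, prefix):
    """Return {library: path} for MPI libraries not resolved under prefix."""
    wanted = prefix.rstrip("/") + "/"
    wrong = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match is None:
            # Lines without a resolved path (linux-gate.so.1, the dynamic
            # loader) are skipped.
            continue
        name, path = match.groups()
        if name.startswith(MPI_LIB_NAMES) and not path.startswith(wanted):
            wrong[name] = path
    return wrong
```

Fed the `ldd test_write` listing above with the prefix `/home/lts/openmpi`, the function returns an empty dict, which matches the manual check: all three Open MPI libraries come from the intended installation.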


Same behaviour on multiple CPUs:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ mpirun -np 2 ./test_write test.nc
Testing write ... ADIOI_GEN_DELETE (line 22): **io No such file or directoryError: Unsupported
datatype passed to ADIOI_Count_contiguous_blocks
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
1 additional process aborted (not shown)



With test_dtype:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_dtype $ ./test_nonblocking test.nc
testing memory subarray layout ...
ADIOI_GEN_DELETE (line 22): **io No such file or directory	 Filesize = 2.024MB, MAX_Memory_needed =
4.048MB

Initialization:  NDIMS = 3, NATIVE_ETYPE = float, NC_TYPE = NC_DOUBLE

	 NC Var_1 Shape:	 [17, 51, 153] Always ORDER_C
	 NC Var_2 Shape:	 [153, 51, 17] Always ORDER_C
	 Memory Array Shape:	 [17, 51, 153] MPI_ORDER_C
	 Memory Array Copys: buf1 for write, buf2 for read back (and compare)

Logical Array Partition:	 BLOCK partition along all dimensions

Access Pattern (subarray):  NPROCS = 1

	 Proc  0 of  1:  starts = [ 0,  0,  0], counts = [17, 51, 153]

TEST1:
	 [nonblocking] all procs writing their subarrays into Var_1 ...
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
[mhd3:27627] [0,0,0] ORTE_ERROR_LOG: Not found in file
../../../../orte/mca/pls/base/pls_base_proxy.c at line 189
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_dtype $
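
Since every failing test prints the same message, captured test output can be scanned for it. The snippet below is a rough sketch and not part of the original report: `count_datatype_errors` is a hypothetical helper, and it assumes the output may have been line-wrapped (as in this mail), so it collapses whitespace before searching.

```python
# Failure message common to all the runs shown in this report.
ROMIO_ERROR = "Unsupported datatype passed to ADIOI_Count_contiguous_blocks"


def count_datatype_errors(output):
    """Count occurrences of the failure message in captured test output."""
    # Collapse newlines and repeated spaces so that a message broken across
    # two lines is still found.
    flat = " ".join(output.split())
    return flat.count(ROMIO_ERROR)
```

A count greater than zero for a given test run would indicate the same failure as in the logs above.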


I already searched Google for this kind of error, but found nothing useful.

   Greetings,
   Dries



-------------- next part --------------
An embedded message was scrubbed...
From: Dries Kimpe <Dries.Kimpe at wis.kuleuven.be>
Subject: pnetcdf & Open MPI 
Date: Tue, 02 May 2006 22:20:08 +0200
Size: 1283
URL: <http://lists.mcs.anl.gov/pipermail/parallel-netcdf/attachments/20060504/ad9bce59/attachment.eml>

