[Fwd: pnetcdf & Open MPI]
Dries Kimpe
Dries.Kimpe at wis.kuleuven.be
Thu May 4 05:38:11 CDT 2006
The facts:
* Parallel netCDF compiles with both Open MPI (svn trunk) and MPICH2.
* With Open MPI, all tests fail with the following message:
Testing write ... Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
[lts.mydomain.be:26763] [0,0,0] ORTE_ERROR_LOG: Not found in file ../../../../orte/mca/pls/base/pls_base_proxy.c at line 189
(This happens for both independent and collective writes, no matter what the underlying variable type is.)
Steps to reproduce the problem:
1) build Open MPI trunk revision 9809
Special configure options used: --enable-static --enable-shared
gcc:
gcc (GCC) 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
2) verify that the correct compiler is being called
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpic++ -showme
g++ -I/home/lts/openmpi/include -I/home/lts/openmpi/include/openmpi -pthread -L/home/lts/openmpi/lib -lmpi_cxx -lmpi -lorte -lopal -lrt -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpicc -showme
gcc -I/home/lts/openmpi/include -I/home/lts/openmpi/include/openmpi -pthread -L/home/lts/openmpi/lib -lmpi -lorte -lopal -lrt -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi $ mpicc --version
gcc (GCC) 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
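The wrapper check in step 2 can also be scripted. What follows is only a minimal Python sketch, not something used for this report: wrapper_paths is a hypothetical helper name, and the expected prefix /home/lts/openmpi is simply taken from the output above. It splits the -showme line and lists any -I or -L directory that lies outside that prefix.

```python
import shlex


def wrapper_paths(showme_output):
    """Return (compiler, include_dirs, lib_dirs) from a wrapper's -showme line."""
    tokens = shlex.split(showme_output)
    includes = [t[2:] for t in tokens if t.startswith("-I")]
    libdirs = [t[2:] for t in tokens if t.startswith("-L")]
    return tokens[0], includes, libdirs


# Output of "mpicc -showme" as quoted in this report
showme = (
    "gcc -I/home/lts/openmpi/include -I/home/lts/openmpi/include/openmpi "
    "-pthread -L/home/lts/openmpi/lib -lmpi -lorte -lopal -lrt -ldl "
    "-Wl,--export-dynamic -lnsl -lutil -lm -ldl"
)

compiler, includes, libdirs = wrapper_paths(showme)

# Assumption: every MPI include/library directory should live under this prefix
prefix = "/home/lts/openmpi"
stray = [p for p in includes + libdirs if not p.startswith(prefix)]

print(compiler, "stray paths:", stray)
```

An empty list of stray paths means the wrapper only pulls headers and libraries from the intended Open MPI installation.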
3) unpack parallel-netcdf-1.0.1.tar.bz2
mkdir build
cd build
mkdir ompi
cd ompi
../../configure --prefix=/home/lts/openmpi --disable-fortran CC=mpicc
make
make install
4) go to test directory
make
(-> the Fortran tests fail to compile, which is 'normal' here, since pnetcdf was configured with --disable-fortran)
Try out test/test_double:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ ./test_write test.nc
Testing write ... ADIOI_GEN_DELETE (line 22): **io No such file or directoryError: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
[mhd3:24861] [0,0,0] ORTE_ERROR_LOG: Not found in file ../../../../orte/mca/pls/base/pls_base_proxy.c at line 189
Verify that the correct libraries are being used:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ ldd test_write
linux-gate.so.1 => (0xffffe000)
libmpi.so.0 => /home/lts/openmpi/lib/libmpi.so.0 (0xb7e3d000)
liborte.so.0 => /home/lts/openmpi/lib/liborte.so.0 (0xb7da5000)
libopal.so.0 => /home/lts/openmpi/lib/libopal.so.0 (0xb7d71000)
librt.so.1 => /lib/librt.so.1 (0xb7d47000)
libdl.so.2 => /lib/libdl.so.2 (0xb7d43000)
libnsl.so.1 => /lib/libnsl.so.1 (0xb7d2e000)
libutil.so.1 => /lib/libutil.so.1 (0xb7d29000)
libm.so.6 => /lib/libm.so.6 (0xb7d07000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb7cf5000)
libc.so.6 => /lib/libc.so.6 (0xb7be2000)
/lib/ld-linux.so.2 (0xb7f53000)
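The same library check can be done mechanically. This is a rough Python sketch under stated assumptions: resolved_libs is a hypothetical helper, the sample text is an excerpt of the ldd output above, and the rule that libmpi, liborte and libopal must resolve under /home/lts/openmpi/lib is my own expectation for this installation.

```python
import re

# Matches lines of the form "name => /path/to/lib (0xaddress)"
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def resolved_libs(ldd_output):
    """Map each shared library name to the path ldd resolved it to."""
    libs = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match:  # lines without a resolved path (e.g. linux-gate) are skipped
            libs[match.group(1)] = match.group(2)
    return libs


# Excerpt of the ldd output quoted in this report
sample = """\
linux-gate.so.1 =>  (0xffffe000)
libmpi.so.0 => /home/lts/openmpi/lib/libmpi.so.0 (0xb7e3d000)
liborte.so.0 => /home/lts/openmpi/lib/liborte.so.0 (0xb7da5000)
libopal.so.0 => /home/lts/openmpi/lib/libopal.so.0 (0xb7d71000)
libc.so.6 => /lib/libc.so.6 (0xb7be2000)
/lib/ld-linux.so.2 (0xb7f53000)
"""

libs = resolved_libs(sample)
mpi_libs = {name: path for name, path in libs.items()
            if name.startswith(("libmpi", "liborte", "libopal"))}

# Assumption: the Open MPI libraries should all come from this directory
expected_dir = "/home/lts/openmpi/lib/"
misplaced = {name: path for name, path in mpi_libs.items()
             if not path.startswith(expected_dir)}

print("MPI libraries:", sorted(mpi_libs))
print("misplaced:", misplaced)
```

If misplaced comes back empty, the binary is not accidentally picking up another MPI installation (for example the mpich2 build) at run time.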
Same behaviour on multiple CPUs:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_double $ mpirun -np 2 ./test_write test.nc
Testing write ... ADIOI_GEN_DELETE (line 22): **io No such file or directoryError: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
1 additional process aborted (not shown)
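To compare runs (different process counts, different tests) it may help to count how often the failure signature shows up in captured output. The snippet below is only a sketch: count_signature is a hypothetical helper, and the sample is the two-process output above with a line wrap in the middle of the message, which the helper undoes before matching.

```python
SIGNATURE = "Unsupported datatype passed to ADIOI_Count_contiguous_blocks"


def count_signature(output):
    """Count occurrences of the failure signature in captured test output."""
    # Collapse all whitespace (including line wraps) to single spaces so that
    # a signature split across two lines is still found.
    flat = " ".join(output.split())
    return flat.count(SIGNATURE)


# Output of "mpirun -np 2 ./test_write test.nc" as quoted in this report
sample = """\
Testing write ... ADIOI_GEN_DELETE (line 22): **io No such file or directoryError: Unsupported
datatype passed to ADIOI_Count_contiguous_blocks
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
1 additional process aborted (not shown)
"""

print("signature occurrences:", count_signature(sample))
```

For the two-process run the count equals the number of processes, which fits every rank hitting the same error.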
With test_dtype:
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_dtype $ ./test_nonblocking test.nc
testing memory subarray layout ...
ADIOI_GEN_DELETE (line 22): **io No such file or directory Filesize = 2.024MB, MAX_Memory_needed = 4.048MB
Initialization: NDIMS = 3, NATIVE_ETYPE = float, NC_TYPE = NC_DOUBLE
NC Var_1 Shape: [17, 51, 153] Always ORDER_C
NC Var_2 Shape: [153, 51, 17] Always ORDER_C
Memory Array Shape: [17, 51, 153] MPI_ORDER_C
Memory Array Copys: buf1 for write, buf2 for read back (and compare)
Logical Array Partition: BLOCK partition along all dimensions
Access Pattern (subarray): NPROCS = 1
Proc 0 of 1: starts = [ 0, 0, 0], counts = [17, 51, 153]
TEST1:
[nonblocking] all procs writing their subarrays into Var_1 ...
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks
[mhd3:27627] [0,0,0] ORTE_ERROR_LOG: Not found in file ../../../../orte/mca/pls/base/pls_base_proxy.c at line 189
lts at mhd3 ~/work/pnetcdf/parallel-netcdf-1.0.1/build/ompi/test/test_dtype $
I already searched Google for this kind of error, but found nothing useful.
Greetings,
Dries