pnetcdf nfmpi_put_vara_real_all problem
Wong, David
Wong.David-C at epa.gov
Mon Sep 23 08:45:50 CDT 2013
Hi Wei-keng,
Thanks for pointing that out. It works.
Cheers,
David
________________________________________
From: Wei-keng Liao <wkliao at ece.northwestern.edu>
Sent: Thursday, September 19, 2013 11:44 AM
To: Wong, David
Cc: parallel-netcdf at mcs.anl.gov
Subject: Re: pnetcdf nfmpi_put_vara_real_all problem
Since Edison is a Cray, could you try compiling it with ftn,
i.e. using the command "ftn epa.F -o epa"?
ftn is a wrapper around ifort when the module PrgEnv-intel is loaded.
(If you see a preprocessor error, please replace the line
include 'pnetcdf.inc'
with
#include 'pnetcdf.inc'
)
Wei-keng
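
The suggestion above can be sketched as a short shell session. This is only a minimal sketch under stated assumptions, not the exact procedure used in the thread: the scratch file name epa_demo.F, its three-line contents, and the sed expression are hypothetical; only the "ftn epa.F -o epa" command form and the two include lines come from the message. It rewrites the Fortran include statement as a preprocessor directive, then builds with the Cray wrapper if one is on the PATH.

```shell
#!/bin/sh
# Work on a scratch file so a real epa.F is left untouched
# (file name and contents are hypothetical stand-ins).
src=epa_demo.F
printf "      program epa\n      include 'pnetcdf.inc'\n      end\n" > "$src"

# Rewrite the Fortran include statement as a preprocessor directive.
# A '#' directive has to start in column 1, so leading blanks are dropped.
# Note: this pattern is case-sensitive and only matches lowercase 'include'.
sed "s/^[[:space:]]*include[[:space:]]*'pnetcdf.inc'/#include 'pnetcdf.inc'/" \
    "$src" > "$src.tmp" && mv "$src.tmp" "$src"

# Show the rewritten line.
grep -n "pnetcdf.inc" "$src"

# Build with the Cray wrapper only if it exists on this machine.
if command -v ftn >/dev/null 2>&1; then
    ftn "$src" -o epa_demo
else
    echo "ftn not found; skipping the build step"
fi
```

Unlike the hand-written ifort lines quoted below, the wrapper call carries no -I or -L flags: on Cray systems ftn is expected to pick up the include and library paths of the loaded modules (here cray-mpich and parallel-netcdf), which removes one way of mixing headers and libraries from different builds.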
On Sep 19, 2013, at 8:55 AM, Wong, David wrote:
> Hi Wei-keng,
>
> I compiled it the same way I compile my other model. Here are the details:
>
> 1052 edison01:junk > module list
> Currently Loaded Modulefiles:
>  1) modules/3.2.6.7                    12) dmapp/6.0.1-1.0500.7263.9.31.ari      23) cray-mpich/6.0.2
>  2) nsg/1.2.0                          13) gni-headers/3.0-1.0500.7161.11.4.ari  24) torque/4.2.3.h5
>  3) eswrap/1.0.19-1.010001.264.0       14) xpmem/0.1-2.0500.41356.1.11.ari       25) moab/7.2.3-r19-b121-SUSE11
>  4) switch/1.0-1.0500.41328.1.120.ari  15) job/1.5.5-0.1_2.0500.41368.1.92.ari   26) altd/1.0
>  5) craype-network-aries               16) csa/3.0.0-1_2.0500.41366.1.129.ari    27) darshan/2.2.7-106
>  6) craype/1.06                        17) dvs/2.3_0.9.0-1.0500.1522.1.180       28) usg-default-modules/1.0
>  7) intel/13.1.3.192                   18) rca/1.0.0-2.0500.41336.1.120.ari      29) cray-hdf5/1.8.9
>  8) cray-libsci/12.1.01                19) atp/1.6.3                             30) cray-netcdf/4.2.1.1
>  9) udreg/2.3.2-1.0500.6756.2.10.ari   20) PrgEnv-intel/5.0.41                   31) parallel-netcdf/1.3.1
> 10) ugni/5.0-1.0500.0.3.306.ari        21) craype-ivybridge
> 11) pmi/4.0.1-1.0000.9725.84.2.ari     22) cray-shmem/6.0.2
> 1053 edison01:junk > module display parallel-netcdf/1.3.1
> -------------------------------------------------------------------
> /opt/cray/modulefiles/parallel-netcdf/1.3.1:
>
> setenv CRAY_PARALLEL_NETCDF_DIR /opt/cray/parallel-netcdf/1.3.1
> setenv CRAY_PARALLEL_NETCDF_VERSION 1.3.1
> prepend-path PE_PRODUCT_LIST CRAY_PARALLEL_NETCDF
> setenv CRAY_PARALLEL_NETCDF 74
> setenv INTEL_PARALLEL_NETCDF 120
> setenv PGI_PARALLEL_NETCDF 119
> prepend-path PATH /opt/cray/parallel-netcdf/1.3.1/bin
> setenv PARALLEL_NETCDF_DIR /opt/cray/parallel-netcdf/1.3.1/intel/120
> prepend-path CRAY_LD_LIBRARY_PATH /opt/cray/parallel-netcdf/1.3.1/intel/120/lib
> prepend-path MANPATH /opt/cray/parallel-netcdf/1.3.1/man
> -------------------------------------------------------------------
>
> 1054 edison01:junk > module display cray-mpich/6.0.2
> -------------------------------------------------------------------
> /opt/cray/modulefiles/cray-mpich/6.0.2:
>
> conflict parallel-netcdf
> conflict cray-mpich
> conflict cray-mpich2
> conflict xt-mpich2
> conflict xt-mpt
> setenv CRAY_MPICH2_VER 6.0.2
> setenv CRAY_MPICH2_ROOTDIR /opt/cray/mpt/6.0.2
> setenv CRAY_MPICH2_BASEDIR /opt/cray/mpt/6.0.2/gni
> setenv CRAY_MPICH2_DIR /opt/cray/mpt/6.0.2/gni/mpich2-intel/130
> setenv MPICH_DIR /opt/cray/mpt/6.0.2/gni/mpich2-intel/130
> prepend-path PE_PRODUCT_LIST CRAY_MPICH2
> setenv CRAY_MPICH2 81
> setenv INTEL_MPICH2 130
> setenv PGI_MPICH2 121
> prepend-path CRAY_LD_LIBRARY_PATH /opt/cray/mpt/6.0.2/gni/mpich2-intel/130/lib
> prepend-path MANPATH /opt/cray/mpt/6.0.2/gni/man/mpich2
> module-whatis cray-mpich - Cray MPICH2 Message Passing Interface
> -------------------------------------------------------------------
>
> 1055 edison01:junk > make
> ifort -c -C -traceback -fixed -132 -O3 -I /opt/cray/parallel-netcdf/1.3.1/intel/120/include -I /opt/cray/mpt/6.0.2/gni/mpich2-intel/130/include -I. epa.F
> ifort epa.o -L/opt/cray/netcdf/4.2.1.1/intel/120/lib -lnetcdf -lnetcdff -L/opt/cray/parallel-netcdf/1.3.1/intel/120/lib -lpnetcdf -o epa.x
>
>
> Cheers,
> David
>
>
> ________________________________________
> From: Wei-keng Liao <wkliao at ece.northwestern.edu>
> Sent: Thursday, September 19, 2013 9:34 AM
> To: Wong, David
> Cc: parallel-netcdf at mcs.anl.gov
> Subject: Re: pnetcdf nfmpi_put_vara_real_all problem
>
> David,
>
> This is a different error from the one you reported previously.
> It is most likely caused by compiling and linking against the
> wrong MPI or PnetCDF libraries, as nfmpi_create is the first
> PnetCDF call in the test program.
>
> I googled the error message, and it seems you compiled with
> the wrong MPI header file, which may mean the PnetCDF library was
> built with a different MPI compiler from the one you used to
> compile this test program.
>
> Could you show us how you compile the program?
> Also, which version of PnetCDF did you compile against?
>
>
> Wei-keng
>
> On Sep 19, 2013, at 7:35 AM, Wong, David wrote:
>
>> Hi Wei-keng,
>>
>> It failed at the following step:
>>
>> err = nfmpi_create(MPI_COMM_WORLD, filename, NF_CLOBBER,
>> + MPI_INFO_NULL, ncid)
>>
>> with the following error message (from one of the processors):
>>
>> Rank 1 [Thu Sep 19 04:27:50 2013] [c5-0c1s4n3] Fatal error in MPI_Comm_test_inter: Invalid communicator, error stack:
>> MPI_Comm_test_inter(110): MPI_Comm_test_inter(comm=0x84000002, flag=0x7fffffff8f90) failed
>> MPI_Comm_test_inter(83).: Invalid communicator
>> Rank 2 [Thu Sep 19 04:27:50 2013] [c5-0c1s4n3] Fatal error in MPI_Comm_test_inter: Invalid communicator, error stack:
>> MPI_Comm_test_inter(110): MPI_Comm_test_inter(comm=0x84000002, flag=0x7fffffff8f90) failed
>> MPI_Comm_test_inter(83).: Invalid communicator
>> Rank 3 [Thu Sep 19 04:27:50 2013] [c5-0c1s4n3] Fatal error in MPI_Comm_test_inter: Invalid communicator, error stack:
>> MPI_Comm_test_inter(110): MPI_Comm_test_inter(comm=0x84000002, flag=0x7fffffff8f90) failed
>> MPI_Comm_test_inter(83).: Invalid communicator
>> Rank 0 [Thu Sep 19 04:27:50 2013] [c5-0c1s4n3] Fatal error in MPI_Comm_test_inter: Invalid communicator, error stack:
>> MPI_Comm_test_inter(110): MPI_Comm_test_inter(comm=0x84000004, flag=0x7fffffff8f90) failed
>> MPI_Comm_test_inter(83).: Invalid communicator
>> forrtl: error (76): Abort trap signal
>> Image PC Routine Line Source
>> libc.so.6 00002AAAAB92CB35 Unknown Unknown Unknown
>> libc.so.6 00002AAAAB92E111 Unknown Unknown Unknown
>> epa.x 0000000000435E42 Unknown Unknown Unknown
>> epa.x 000000000042B160 Unknown Unknown Unknown
>> epa.x 000000000042B30D Unknown Unknown Unknown
>> libmpich_intel.so 00002AAAAC248B83 Unknown Unknown Unknown
>> epa.x 0000000000523C3E Unknown Unknown Unknown
>> epa.x 000000000051E508 Unknown Unknown Unknown
>> epa.x 000000000051DBC5 Unknown Unknown Unknown
>> epa.x 00000000004245EF MAIN__ 33 epa.F
>> epa.x 000000000042454C Unknown Unknown Unknown
>> libc.so.6 00002AAAAB918C16 Unknown Unknown Unknown
>> epa.x 000000000042444D Unknown Unknown Unknown
>>
>> Regarding Rob's suggestion, I am using aprun or mpirun to launch the executable, so I don't know how to invoke valgrind at this time.
>>
>> Cheers,
>> David
>>
>> --
>> David C. Wong Ph.D.
>> Atmospheric Modeling and Analysis Division
>> National Exposure Research Laboratory
>> US Environmental Protection Agency
>> Mail Drop E243-03
>> 109 T. W. Alexander Dr.
>> Research Triangle Park, NC 27711
>> 919-541-3400 919-541-1379 (fax)
>>
>> ________________________________________
>> From: Wei-keng Liao <wkliao at ece.northwestern.edu>
>> Sent: Wednesday, September 18, 2013 7:13 PM
>> To: Wong, David
>> Cc: parallel-netcdf at mcs.anl.gov
>> Subject: Re: pnetcdf nfmpi_put_vara_real_all problem
>>
>> Hi, David,
>>
>> Could you please try the attached program and let us know if it
>> generates the same error? It is written based on the information
>> you provided.
>>
>