segmentation fault
Wei-keng Liao
wkliao at ece.northwestern.edu
Thu Aug 2 22:47:57 CDT 2012
Hi, Jialin,
You are mistakenly using the dimension ID, rather than the dimension length, as the count.
count[j] = dimids[j];
Shouldn't it be the following?
count[j] = dim_sizes[dimids[j]];
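For illustration only, here is a minimal sketch of the start/count setup with that fix applied. It is not taken from the PnetCDF examples: the helper name block_partition and the type sketch_offset (a stand-in for MPI_Offset so the fragment compiles without MPI) are assumptions made for this sketch. It keeps the even split of the first dimension from the posted program, which ignores any remainder.

```c
#include <assert.h>

/* Assumption: stand-in for MPI_Offset so this compiles without MPI. */
typedef long long sketch_offset;

/*
 * Hypothetical helper. Splits the variable's first dimension evenly across
 * ranks (remainder ignored, as in the posted program) and reads every other
 * dimension in full. Returns the number of elements this rank would read.
 */
sketch_offset block_partition(const sketch_offset *dim_sizes,
                              const int *dimids, int var_ndims,
                              int nprocs, int rank,
                              sketch_offset *start, sketch_offset *count)
{
    sketch_offset nelems;
    int j;

    assert(var_ndims > 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    count[0] = dim_sizes[dimids[0]] / nprocs;
    start[0] = count[0] * rank;
    nelems = count[0];

    for (j = 1; j < var_ndims; j++) {
        start[j] = 0;
        /* dimids[j] is only an index; the length lives in dim_sizes[]. */
        count[j] = dim_sizes[dimids[j]];
        nelems *= count[j];
    }
    return nelems;
}
```

In the posted program the same logic would use MPI_Offset directly, and the returned element count would size the data buffer passed to ncmpi_get_vara_int_all.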
Also, please avoid using NFS when doing parallel I/O.
It can cause inconsistent data and unexpected results.
Wei-keng
On Aug 2, 2012, at 8:49 PM, Liu, Jaln wrote:
> Hi,
> Can anybody help me with the following problem? I would appreciate it.
>
> When I run the program on a 16-node cluster, I get a segmentation fault.
>
> The code was downloaded from the pnetcdf website and modified slightly:
> source:
> #include <stdlib.h>
> #include <mpi.h>
> #include <pnetcdf.h>
> #include <stdio.h>
>
> static void handle_error(int status)
> {
>     fprintf(stderr, "%s\n", ncmpi_strerror(status));
>     exit(-1);
> }
>
>
> int main(int argc, char **argv) {
>
>     int rank, nprocs;
>     int ret, varid, ncfile, ndims, nvars, ngatts, unlimited;
>     int var_ndims, var_natts;
>     MPI_Offset *dim_sizes, var_size;
>     MPI_Offset *start, *count;
>
>     char varname[NC_MAX_NAME+1];
>     int dimids[NC_MAX_VAR_DIMS];
>     nc_type type;
>
>     int i, j;
>
>     int *data;
>
>     MPI_Init(&argc, &argv);
>
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>     char * FILE_NAME="nfs:output.nc";
>     ret = ncmpi_open(MPI_COMM_WORLD, FILE_NAME, NC_NOWRITE, MPI_INFO_NULL,
>                      &ncfile);
>     if (ret != NC_NOERR) handle_error(ret);
>
>
>     ret = ncmpi_inq(ncfile, &ndims, &nvars, &ngatts, &unlimited);
>     if (ret != NC_NOERR) handle_error(ret);
>
>     dim_sizes = calloc(ndims, sizeof(MPI_Offset));
>
>     for(i=0; i<ndims; i++) {
>         ret = ncmpi_inq_dimlen(ncfile, i, &(dim_sizes[i]) );
>         if (ret != NC_NOERR) handle_error(ret);
>     }
>
>     for(i=0; i<nvars; i++) {
>         ret = ncmpi_inq_var(ncfile, i, varname, &type, &var_ndims, dimids,
>                             &var_natts);
>         if (ret != NC_NOERR) handle_error(ret);
>
>         start = calloc(var_ndims, sizeof(MPI_Offset));
>         count = calloc(var_ndims, sizeof(MPI_Offset));
>
>         start[0] = (dim_sizes[dimids[0]]/nprocs)*rank;
>         count[0] = (dim_sizes[dimids[0]]/nprocs);
>         var_size = count[0];
>
>         for (j=1; j<var_ndims; j++) {
>             start[j] = 0;
>             count[j] = dimids[j];
>             var_size *= count[j];
>         }
>
>         switch(type) {
>         case NC_INT:
>             data = calloc(var_size, sizeof(int));
>
>             ret = ncmpi_get_vara_int_all(ncfile, i, start, count, data);
>
>             if (ret != NC_NOERR) handle_error(ret);
>
>             break;
>         default:
>             /* we can do this for all the known netcdf types but this
>              * example is already getting too long */
>             fprintf(stderr, "unsupported NetCDF type \n");
>         }
>
>         free(start);
>         free(count);
>         if (data != NULL) free(data);
>
>     }
>     ret = ncmpi_close(ncfile);
>     if (ret != NC_NOERR) handle_error(ret);
>
>     MPI_Finalize();
>     return 0;
> }
>
> the file output.nc:
> netcdf output {
> dimensions:
>         d1 = 16 ;
> variables:
>         int v1(d1) ;
>         int v2(d1) ;
>
> // global attributes:
>         :string = "Hello World\n",
>             "" ;
> data:
>
>  v1 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
>
>  v2 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
> }
>
> Compiling command:
> mpicc -o preadsegfault preadsegfault.c
> Run the program:
> mpirun -np 4 ./preadsegfault
>
>
> Best Regards,
> Jialin Liu, Ph.D student.
> Computer Science Department
> Texas Tech University
> Phone: 806.742.3513(x241)
> Office:Engineer Center 304
> http://myweb.ttu.edu/jialliu/
> <preadsegfault.c><output.nc>