segmentation fault

Wei-keng Liao wkliao at ece.northwestern.edu
Thu Aug 2 22:47:57 CDT 2012


Hi, Jialin,

You are mistakenly using the dimension ID, rather than the dimension length, as the count:
            count[j] = dimids[j];

Shouldn't it be the following?
            count[j] = dim_sizes[dimids[j]];
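
To make the difference concrete, here is a minimal sketch of the start/count
computation with that fix applied. It is only an illustration under a few
assumptions: Offset is a stand-in typedef for MPI_Offset so the snippet builds
without MPI, and compute_slab is a hypothetical helper name that is not part of
PnetCDF or of the posted program.

```c
#include <assert.h>

/* Assumption: stand-in for MPI_Offset so the sketch compiles without MPI. */
typedef long long Offset;

/* Hypothetical helper (not a PnetCDF call): fills start[] and count[] for
 * one rank and returns the number of elements that rank would read. */
static Offset compute_slab(const Offset *dim_sizes, const int *dimids,
                           int var_ndims, int nprocs, int rank,
                           Offset *start, Offset *count)
{
    Offset var_size;
    int j;

    assert(var_ndims >= 1 && nprocs > 0);

    /* Split the first dimension evenly across ranks, as the posted code does. */
    count[0] = dim_sizes[dimids[0]] / nprocs;
    start[0] = count[0] * rank;
    var_size = count[0];

    /* Read the remaining dimensions in full: use the dimension LENGTH,
     * looked up through the dimension ID, not the ID itself. */
    for (j = 1; j < var_ndims; j++) {
        start[j] = 0;
        count[j] = dim_sizes[dimids[j]];
        var_size *= count[j];
    }
    return var_size;
}
```

For example, with dimension lengths {16, 8, 4}, a variable defined on
dimensions 0 and 2, 4 processes, and rank 1, this gives start = {4, 0},
count = {4, 4}, and 16 elements. The original assignment would have set
count[1] to 2, which is the dimension ID and not its length.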

Also, please avoid using NFS when doing parallel I/O.
It can cause inconsistent data and unexpected results.

Wei-keng

On Aug 2, 2012, at 8:49 PM, Liu, Jaln wrote:

> Hi,
> Can anybody help me with the following problem? I appreciate it.
> 
> When I run the program on a 16-node cluster, I get a segmentation fault.
> 
> The code was downloaded from the PnetCDF website and modified slightly:
> source:
> #include <stdlib.h>
> #include <mpi.h>
> #include <pnetcdf.h>
> #include <stdio.h>
> 
> static void handle_error(int status)
> {
>    fprintf(stderr, "%s\n", ncmpi_strerror(status));
>    exit(-1);
> }
> 
> 
> int main(int argc, char **argv) {
> 
>    int rank, nprocs;
>    int ret, varid,ncfile, ndims, nvars, ngatts, unlimited;
>    int var_ndims, var_natts;
>    MPI_Offset *dim_sizes, var_size;
>    MPI_Offset *start, *count;
> 
>    char varname[NC_MAX_NAME+1];
>    int dimids[NC_MAX_VAR_DIMS];
>    nc_type type;
> 
>    int i, j;
> 
>    int *data;
> 
>    MPI_Init(&argc, &argv);
> 
>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>    char * FILE_NAME="nfs:output.nc";
>    ret = ncmpi_open(MPI_COMM_WORLD, FILE_NAME, NC_NOWRITE, MPI_INFO_NULL,
>            &ncfile);
>    if (ret != NC_NOERR) handle_error(ret);
> 
> 
>    ret = ncmpi_inq(ncfile, &ndims, &nvars, &ngatts, &unlimited);
>    if (ret != NC_NOERR) handle_error(ret);
> 
>    dim_sizes = calloc(ndims, sizeof(MPI_Offset));
> 
>    for(i=0; i<ndims; i++)  {
>        ret = ncmpi_inq_dimlen(ncfile, i, &(dim_sizes[i]) );
>        if (ret != NC_NOERR) handle_error(ret);
>    }
> 
>    for(i=0; i<nvars; i++) {
>       ret = ncmpi_inq_var(ncfile, i, varname, &type, &var_ndims, dimids,
>                &var_natts);
>        if (ret != NC_NOERR) handle_error(ret);
> 
>        start = calloc(var_ndims, sizeof(MPI_Offset));
>        count = calloc(var_ndims, sizeof(MPI_Offset));
> 
>        start[0] = (dim_sizes[dimids[0]]/nprocs)*rank;
>        count[0] = (dim_sizes[dimids[0]]/nprocs);
>        var_size = count[0];
> 
>        for (j=1; j<var_ndims; j++) {
>            start[j] = 0;
>            count[j] = dimids[j];
>            var_size *= count[j];
>        }
> 
>        switch(type) {
>            case NC_INT:
>                data = calloc(var_size, sizeof(int));
> 
>                ret = ncmpi_get_vara_int_all(ncfile, i, start, count, data);
> 
>                if (ret != NC_NOERR) handle_error(ret);
> 
>                break;
>            default:
>                /* we can do this for all the known netcdf types but this
>                 * example is already getting too long  */
>                fprintf(stderr, "unsupported NetCDF type \n");
>        }
> 
>        free(start);
>        free(count);
>        if (data != NULL) free(data);
> 
>    }
>    ret = ncmpi_close(ncfile);
>    if (ret != NC_NOERR) handle_error(ret);
> 
>    MPI_Finalize();
>    return 0;
> }
> 
> the file output.nc:
> netcdf output {
> dimensions:
>    d1 = 16 ;
> variables:
>    int v1(d1) ;
>    int v2(d1) ;
> 
> // global attributes:
>        :string = "Hello World\n",
>            "" ;
> data:
> 
> v1 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
> 
> v2 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
> }
> 
> Compiling command:
> mpicc -o preadsegfault preadsegfault.c
> Run the program:
> mpirun -np 4 ./preadsegfault
> 
> 
> Best Regards,
> Jialin Liu, Ph.D student.
> Computer Science Department
> Texas Tech University
> Phone: 806.742.3513(x241)
> Office:Engineer Center 304
> http://myweb.ttu.edu/jialliu/
> <preadsegfault.c><output.nc>


