segmentation fault

Liu, Jaln jaln.liu at ttu.edu
Fri Aug 3 17:15:36 CDT 2012


Hi Dr. Liao,

Thanks, I fixed the problem with your help. I think the source file in pnetcdf website should also be updated,

here is the original link
https://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/examples/tutorial/pnetcdf-read-standard.c

Could you give me more information about why using NFS can lead to inconsistent data? I found in the following link
http://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/README#L19
that romio can accept both nfs and pvfs2

Best Regards,
Jialin Liu, Ph.D student.
Computer Science Department
Texas Tech University
Phone: 806.742.3513(x241)
Office:Engineer Center 304
http://myweb.ttu.edu/jialliu/
________________________________________
From: parallel-netcdf-bounces at lists.mcs.anl.gov [parallel-netcdf-bounces at lists.mcs.anl.gov] on behalf of Wei-keng Liao [wkliao at ece.northwestern.edu]
Sent: Thursday, August 02, 2012 10:47 PM
To: parallel-netcdf at mcs.anl.gov
Subject: Re: segmentation fault

Hi, Jialin,

You are mistakenly using dimension ID in count.
            count[j] = dimids[j];

Shouldn't it be the following?
            count[j] = dim_sizes[dimids[j]];

Also, please avoid using NFS when doing parallel I/O.
It can cause inconsistent data and unexpected results.

Wei-keng

On Aug 2, 2012, at 8:49 PM, Liu, Jaln wrote:

> Hi,
> Can anybody help me with the following problem? I appreciate it.
>
> when I run the program in a 16 nodes cluster, I met the segmentation fault error
>
> the code is downloaded from pnetcdf website and is modified a little:
> source:
> #include <stdlib.h>
> #include <mpi.h>
> #include <pnetcdf.h>
> #include <stdio.h>
>
> static void handle_error(int status)
> {
>    fprintf(stderr, "%s\n", ncmpi_strerror(status));
>    exit(-1);
> }
>
>
> int main(int argc, char **argv) {
>
>    int rank, nprocs;
>    int ret, varid,ncfile, ndims, nvars, ngatts, unlimited;
>    int var_ndims, var_natts;;
>    MPI_Offset *dim_sizes, var_size;
>    MPI_Offset *start, *count;
>
>    char varname[NC_MAX_NAME+1];
>    int dimids[NC_MAX_VAR_DIMS];
>    nc_type type;
>
>    int i, j;
>
>    int *data;
>
>    MPI_Init(&argc, &argv);
>
>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>    char * FILE_NAME="nfs:output.nc";
>    ret = ncmpi_open(MPI_COMM_WORLD, FILE_NAME, NC_NOWRITE, MPI_INFO_NULL,
>            &ncfile);
>    if (ret != NC_NOERR) handle_error(ret);
>
>
>    ret = ncmpi_inq(ncfile, &ndims, &nvars, &ngatts, &unlimited);
>    if (ret != NC_NOERR) handle_error(ret);
>
>    dim_sizes = calloc(ndims, sizeof(MPI_Offset));
>
>    for(i=0; i<ndims; i++)  {
>        ret = ncmpi_inq_dimlen(ncfile, i, &(dim_sizes[i]) );
>        if (ret != NC_NOERR) handle_error(ret);
>    }
>
>    for(i=0; i<nvars; i++) {
>       ret = ncmpi_inq_var(ncfile, i, varname, &type, &var_ndims, dimids,
>                &var_natts);
>        if (ret != NC_NOERR) handle_error(ret);
>
>        start = calloc(var_ndims, sizeof(MPI_Offset));
>        count = calloc(var_ndims, sizeof(MPI_Offset));
>
>        start[0] = (dim_sizes[dimids[0]]/nprocs)*rank;
>        count[0] = (dim_sizes[dimids[0]]/nprocs);
>        var_size = count[0];
>
>        for (j=1; j<var_ndims; j++) {
>            start[j] = 0;
>            count[j] = dimids[j];
>            var_size *= count[j];
>        }
>
>        switch(type) {
>            case NC_INT:
>                data = calloc(var_size, sizeof(int));
>
>                ret = ncmpi_get_vara_int_all(ncfile, i, start, count, data);
>
>                if (ret != NC_NOERR) handle_error(ret);
>
>                break;
>            default:
>                /* we can do this for all the known netcdf types but this
>                 * example is already getting too long  */
>                fprintf(stderr, "unsupported NetCDF type \n");
>        }
>
>        free(start);
>        free(count);
>        if (data != NULL) free(data);
>
> }
>    ret = ncmpi_close(ncfile);
>    if (ret != NC_NOERR) handle_error(ret);
>
>    MPI_Finalize();
>    return 0;
> }
>
> the file output.nc:
> netcdf output {
> dimensions:
>    d1 = 16 ;
> variables:
>    int v1(d1) ;
>    int v2(d1) ;
>
> // global attributes:
>        :string = "Hello World\n",
>            "" ;
> data:
>
> v1 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
>
> v2 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 ;
> }
>
> Compiling command:
> mpicc -o preadsegfault preadsegfault.c
> Run the program:
> mpirun -np 4 ./preadsegfault
>
>
> Best Regards,
> Jialin Liu, Ph.D student.
> Computer Science Department
> Texas Tech University
> Phone: 806.742.3513(x241)
> Office:Engineer Center 304
> http://myweb.ttu.edu/jialliu/
> <preadsegfault.c><output.nc>



More information about the parallel-netcdf mailing list