failure to dup communicator in pnetcdf

Rob Ross rross at mcs.anl.gov
Wed Sep 7 10:52:21 CDT 2011


If we ever use that communicator internally, then we need to dup it.  
Rookie move on our part! -- Rob

On Sep 7, 2011, at 10:50 AM, Rob Latham wrote:

> I noticed last week that parallel-netcdf does not duplicate the
> communicator passed to the library as part of ncmpi_open/nfmpi_open
> and ncmpi_create/nfmpi_create.
>
> It's been 8 years and no one has complained, and a lot of the heavy
> lifting is done in the MPI-IO library (which does duplicate the
> communicator).  Still, I'm about to commit this patch unless someone
> can remind me why we do not need it...
>
>
> Index: src/lib/mpincio.c
> ===================================================================
> --- src/lib/mpincio.c	(revision 923)
> +++ src/lib/mpincio.c	(working copy)
> @@ -136,7 +136,7 @@ ncmpiio_create(MPI_Comm     comm,
>
>     nciop->mpiomode  = MPI_MODE_RDWR;
>     nciop->mpioflags = 0;
> -    nciop->comm      = comm;
> +    MPI_Comm_dup(comm, &(nciop->comm));
>
>     ncmpiio_extract_hints(nciop, info);
>
> @@ -147,7 +147,7 @@ ncmpiio_create(MPI_Comm     comm,
>         /* to avoid calling MPI_File_set_size() later, let process 0 check
>            if the file exists. If not, no need to call MPI_File_set_size */
>         int rank;
> -        MPI_Comm_rank(comm, &rank);
> +        MPI_Comm_rank(nciop->comm, &rank);
>         if (rank == 0) { /* check if file exists */
>             if (access(path, F_OK) == 0) { /* but is this only available in Linux? */
>                 /* file does exist, so delete it */
> @@ -164,7 +164,8 @@ ncmpiio_create(MPI_Comm     comm,
> #endif
>     }
>
> -    mpireturn = MPI_File_open(comm, (char *)path, mpiomode, info, &nciop->collective_fh);
> +    mpireturn = MPI_File_open(nciop->comm, (char *)path, mpiomode,
> +            info, &nciop->collective_fh);
>     if (mpireturn != MPI_SUCCESS) {
>         int rank;
>         MPI_Comm_rank(comm, &rank);
> @@ -215,11 +216,12 @@ ncmpiio_open(MPI_Comm     comm,
>
>     nciop->mpiomode  = mpiomode;
>     nciop->mpioflags = 0;
> -    nciop->comm      = comm;
> +    MPI_Comm_dup(comm, &(nciop->comm));
>
>     ncmpiio_extract_hints(nciop, info);
>
> -    mpireturn = MPI_File_open(comm, (char *)path, mpiomode, info, &nciop->collective_fh);
> +    mpireturn = MPI_File_open(nciop->comm, (char *)path, mpiomode,
> +            info, &nciop->collective_fh);
>     if (mpireturn != MPI_SUCCESS) {
>         int rank;
>         MPI_Comm_rank(comm, &rank);
> @@ -321,6 +323,9 @@ ncmpiio_close(ncio *nciop, int doUnlink) {
>     MPI_Info_free(&(nciop->mpiinfo));
> #endif
>
> +  if (nciop->comm != MPI_COMM_NULL) {
> +      MPI_Comm_free(&(nciop->comm));
> +  }
>
>   ncmpiio_free(nciop);
>
>
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA


