[MPICH] Behaviour if MPI_File_open fails on some nodes

James S Perrin james.s.perrin at manchester.ac.uk
Wed Aug 22 05:46:51 CDT 2007


Hi,
	That works fine for me.
Thanks
James

Robert Latham wrote:
> On Tue, Aug 21, 2007 at 10:11:47AM -0500, Robert Latham wrote:
>> On Tue, Aug 21, 2007 at 10:52:12AM +0100, James S Perrin wrote:
>>> Sorry to be pedantic, but did you use a leading dir path that didn't 
>>> exist on the other processors? I get the correct behaviour if the dir 
>>> paths exist on all processors but the files don't.
>> Thanks! That was the missing factor.   Ok, that should definitely not
>> happen, and since I can reproduce it here, I should be able to fix it
>> soon.  I'll send you a patch.
> 
> Please try this patch.  I'm a little wary of putting yet another
> Allreduce in the open path, but I don't think there's another way.
> 
> Index: src/mpi/romio/adio/common/ad_fstype.c
> ===================================================================
> RCS file: /home/MPI/cvsMaster/romio/adio/common/ad_fstype.c,v
> retrieving revision 1.54
> diff -u -w -p -r1.54 ad_fstype.c
> --- src/mpi/romio/adio/common/ad_fstype.c       12 Mar 2007 20:40:40 -0000     1.54
> +++ src/mpi/romio/adio/common/ad_fstype.c       21 Aug 2007 16:41:56 -0000
> @@ -503,7 +503,7 @@ tables in a reasonable way. -- Rob, 06/0
>  void ADIO_ResolveFileType(MPI_Comm comm, char *filename, int *fstype, 
>                           ADIOI_Fns **ops, int *error_code)
>  {
> -    int myerrcode, file_system, min_code;
> +    int myerrcode, file_system, min_code, max_code;
>      char *tmp;
>      static char myname[] = "ADIO_RESOLVEFILETYPE";
>  
> @@ -514,6 +514,15 @@ void ADIO_ResolveFileType(MPI_Comm comm,
>         ADIO_FileSysType_fncall(filename, &file_system, &myerrcode);
>         if (myerrcode != MPI_SUCCESS) {
>             *error_code = myerrcode;
> +       }
> +
> +       /* the check for file system type will hang if any process got an error
> +        * in ADIO_FileSysType_fncall (this could happen if a full path exists
> +        * on one node but not on others, and no prefix like ufs: was provided)
> +        */
> +       MPI_Allreduce(error_code, &max_code, 1, MPI_INT, MPI_MAX, comm);
> +       if (max_code != MPI_SUCCESS)  {
> +               *error_code = max_code;
>             return;
>         }
> 
> 
>> ==rob
>>
> 
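
For anyone reading this in the archive, here is a rough sketch of the
agreement step the patch adds, written in plain C with no MPI calls so
it can be compiled anywhere. SIM_SUCCESS and agreed_error_code() are
made-up names for illustration and are not part of ROMIO. The sketch
assumes, as the patch does, that every error code compares greater
than the success value, so a maximum over all ranks is non-success
whenever any single rank failed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for MPI_SUCCESS (hypothetical name). Assumed to be the
 * smallest value any rank can report. */
#define SIM_SUCCESS 0

/*
 * Hypothetical helper, not ROMIO code. Given the local error code of
 * each rank, it returns the value that every rank would hold after
 *     MPI_Allreduce(&local, &agreed, 1, MPI_INT, MPI_MAX, comm);
 * Precondition: nranks >= 1.
 */
int agreed_error_code(const int *local_codes, size_t nranks)
{
    int max_code = local_codes[0];
    for (size_t i = 1; i < nranks; i++) {
        if (local_codes[i] > max_code) {
            max_code = local_codes[i];
        }
    }
    return max_code;
}
```

Because every rank ends up with the same value, either all of them
return early or none do, so no rank is left waiting in a later
collective call that the failed ranks never reach.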


-- 
------------------------------------------------------------------------
James S. Perrin,                  | email: james.perrin at manchester.ac.uk
Research Computing Services,      | web:   www.mc.manchester.ac.uk
Kilburn Building, The University, | tel:   +44 161 275 6945
Manchester, England. M13 9PL.     | fax:   +44 161 275 0637
------------------------------------------------------------------------
"The test of intellect is the refusal to belabour the obvious"
- Alfred Bester
------------------------------------------------------------------------



