filename prefixes
Gerald Creager
gerry.creager at tamu.edu
Wed Aug 11 18:47:04 CDT 2010
On the system I'm working with, I can't use the MPICH envVars such as:
MPICH_MPIIO_HINTS_DISPLAY 1
MPICH_MPIIO_HINTS "wrfout*:striping_factor=64"
Therefore, to set striping on the wrfout files, with a Lustre file
system and SGE as the batch queuing environment, I've gotta find where
the wrfout files get created and add a couple lines of code so that
they're created with the stripe count set appropriately (somewhere
between 16 and 64, I think). What I intend to do eventually is get that
folded back into WRF as a namelist parameter, so that those of us using
pnetcdf (needed once the proc count gets past ~512 or so on this
system) have a simple, per-file way to control striping on parallel
file systems.
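
For anyone following along, here is a minimal sketch of the kind of
change I mean. It assumes the MPI-IO layer honors the reserved
"striping_factor" hint at file creation and that ROMIO was built with
Lustre support; neither is confirmed for this system. The names
format_stripe_hint, create_striped and the USE_MPI build flag are
hypothetical, not anything in WRF or pnetcdf.

```c
#include <stdio.h>
#include <string.h>

/* Format a stripe count as the string value that MPI_Info_set() expects.
 * Returns 0 on success, -1 if the count is not positive or the buffer
 * is too small to hold the digits plus the terminating NUL. */
int format_stripe_hint(int stripe_count, char *buf, size_t buflen)
{
    if (stripe_count <= 0)
        return -1;
    int n = snprintf(buf, buflen, "%d", stripe_count);
    return (n > 0 && (size_t)n < buflen) ? 0 : -1;
}

#ifdef USE_MPI  /* hypothetical flag: define it when building with mpicc */
#include <mpi.h>

/* Create (or open) a file for writing and pass the stripe count as an
 * MPI-IO hint. Striping hints generally only matter when the file is
 * first created. Returns the MPI error code from MPI_File_open. */
int create_striped(MPI_Comm comm, const char *path, int stripe_count,
                   MPI_File *fh)
{
    char value[16];
    MPI_Info info;
    int rc;

    if (format_stripe_hint(stripe_count, value, sizeof value) != 0)
        return MPI_ERR_ARG;

    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", value);
    rc = MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                       info, fh);
    MPI_Info_free(&info);
    return rc;
}
#endif
```

With pnetcdf the same MPI_Info object would presumably be handed to
ncmpi_create() rather than to MPI_File_open() directly, since
ncmpi_create() takes an info argument.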
I've looked at Johnsen's work on using Lustre on the Cray XT5. It doesn't
apply to my environment, more's the pity.
Thanks, Gerry
Don Morton wrote:
> I've used pnetcdf with WRF, using the nocolons option. I'm not sure
> specifically what you're asking now, but I can send you my notes if it
> helps...
>
> On Wed, Aug 11, 2010 at 3:13 PM, Gerald Creager
> <gerry.creager at tamu.edu> wrote:
>
> It's a namelist.input spec: NOCOLONS
>
> I'm sorting thru some other issues with pnetcdf and WRF right now...
> I'm having to change it so it'll create wrfout_dxx files with the
> striping info correct at file creation. If anyone's had to do this,
> I'd appreciate a clue...
>
> gerry
>
> Jim Edwards wrote:
>
> Hi Johnny,
>
> I think that the real problem may be that WRF uses the colon
> character in filenames and the filesystem reserves this same
> character for special use. I think that there is a compile
> option for wrf not to use colons.
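
A simplified sketch of that diagnosis, not the actual ROMIO code: it
assumes the library treats any colon in the name as the end of a
file-system prefix, and it only knows the "pfs:" and "lustre:" prefixes
mentioned in this thread. The names classify_filename and
replace_colons are hypothetical. The second helper mimics swapping ':'
for '_' in the file name, which is what the nocolons option is
understood to do.

```c
#include <string.h>

typedef enum { NO_PREFIX, KNOWN_PREFIX, UNKNOWN_PREFIX } prefix_kind;

/* Prefixes used in this sketch only; a real MPI-IO library knows more. */
static const char *known_prefixes[] = { "pfs:", "PFS:", "lustre:", "LUSTRE:" };

/* Classify a file name the way a prefix-based resolver might:
 * no colon means no prefix; a colon with an unrecognized leading
 * string is the error case seen in this thread. */
prefix_kind classify_filename(const char *filename)
{
    size_t i;

    if (strchr(filename, ':') == NULL)
        return NO_PREFIX;

    for (i = 0; i < sizeof known_prefixes / sizeof known_prefixes[0]; i++) {
        size_t len = strlen(known_prefixes[i]);
        if (strncmp(filename, known_prefixes[i], len) == 0)
            return KNOWN_PREFIX;
    }
    return UNKNOWN_PREFIX;
}

/* Replace every ':' in the name with '_' in place. */
void replace_colons(char *filename)
{
    char *p;

    for (p = filename; *p != '\0'; p++) {
        if (*p == ':')
            *p = '_';
    }
}
```

Under these assumptions, wrfout_d01_2006-07-25_00:00:00 is classified
as UNKNOWN_PREFIX, while the same name with underscores in place of the
colons is classified as NO_PREFIX and would fall through to normal
file-system detection.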
>
> Jim
>
> On Wed, Aug 11, 2010 at 4:44 PM, Johnny Chang
> <Johnny.Chang at nasa.gov> wrote:
>
> Hello,
>
> I am helping a user trouble-shoot a runtime error using
> parallel-netcdf version 1.1.1 and mvapich2/1.2p1/intel-PIC.
>
> The error message is:
>
> 0: MPI_File_open : File does not exist, error stack:
> ADIO_RESOLVEFILETYPE_PREFIX(546): Invalid file name wrfout_d01_2006-07-25_00:00:00
> open_hist_w : error opening wrfout_d01_2006-07-25_00:00:00 for writing. ***
>
> While googling the ADIO_RESOLVEFILETYPE_PREFIX error, we found the
> ad_fstype.c code containing:
>
> 477 /*
> 478 ADIO_FileSysType_prefix - determines file system type for a file using
> 479 a prefix on the file name. upper layer should have already determined
> 480 that a prefix is present.
> 481
> 482 Input Parameters:
> 483 . filename - path to file, including prefix (xxx:)
> 484
> 485 Output Parameters:
> 486 . fstype - pointer to integer in which to store file system type (ADIO_XXX)
> 487 . error_code - pointer to integer in which to store error code
> 488
> 489 Returns MPI_SUCCESS in error_code on success. Filename not having a prefix
> 490 is considered an error. Except for on Windows systems where the default is NTFS.
> 491
> 492 */
> 493 static void ADIO_FileSysType_prefix(char *filename, int *fstype, int *error_code)
> 494 {
> 495     static char myname[] = "ADIO_RESOLVEFILETYPE_PREFIX";
> 496     *error_code = MPI_SUCCESS;
> 497
> 498     if (!strncmp(filename, "pfs:", 4) || !strncmp(filename, "PFS:", 4)) {
> 499         *fstype = ADIO_PFS;
> 500     }
>
> ...
>
>
> 557 #else
> 558     *fstype = 0;
> 559     /* --BEGIN ERROR HANDLING-- */
> 560     *error_code = MPIO_Err_create_code(MPI_SUCCESS, MPIR_ERR_RECOVERABLE,
> 561                                        myname, __LINE__, MPI_ERR_NO_SUCH_FILE,
> 562                                        "**filename", "**filename %s", filename);
> 563     /* --END ERROR HANDLING-- */
> 564 #endif
> 565 }
> 566 }
>
> which seems to indicate that the MVAPICH2 library is expecting parallel-netcdf
> to pre-pend a prefix on the filename passed to the MVAPICH2 library.
>
> We are running on a Lustre filesystem. So, we think that the parallel-netcdf
> library should have passed the "lustre:" or "LUSTRE:" prefix along with the
> actual filename. Are we right in this interpretation of the error?
>
> If so, then perhaps the parallel-netcdf library was not built correctly?
>
> Here is the beginning part of config.log:
>
>
> ------------------------------------------------------------------------
>
> This file contains any messages produced by compilers while
> running configure, to aid debugging if configure makes a mistake.
>
> It was created by configure, which was
> generated by GNU Autoconf 2.61. Invocation command line was
>
> $ ./configure --prefix=/nasa/parallel-netcdf/1.1.1/mvapich2
> --with-mpi=/nasa/mvapich2/1.2p1/intel-PIC
>
> ## --------- ##
> ## Platform. ##
> ## --------- ##
>
> hostname = pbspl1
> uname -m = x86_64
> uname -r = 2.6.16.60-0.42.5.03schamp-nasa
> uname -s = Linux
> uname -v = #1 SMP Tue Nov 10 20:46:20 UTC 2009
>
> /usr/bin/uname -p = unknown
> /bin/uname -X = unknown
>
> /bin/arch = x86_64
> /usr/bin/arch -k = unknown
> /usr/convex/getsysinfo = unknown
> /usr/bin/hostinfo = unknown
> /bin/machine = unknown
> /usr/bin/oslevel = unknown
> /bin/universe = unknown
>
> PATH: /nasa/intel/Compiler/11.1/046/bin/intel64
> PATH: /nasa/intel/Compiler/11.1/046/mkl/tools/environment
> PATH: /nasa/mvapich2/1.2p1/intel-PIC/bin
> PATH: /u/jrappley/bin
>
> If the problem is in the parallel-netcdf build, let us know
> what is the fix.
>
> Thanks in advance!
>
> Johnny
> -- Johnny Chang
> 650-604-4356
>
>
>
> --
> Gerry Creager -- gerry.creager at tamu.edu
> Texas Mesonet -- AATLT, Texas A&M University
> Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
> Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
>
>
>
>
> --
> Arctic Region Supercomputing Center
> http://weather.arsc.edu/
--
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843