[mpich-discuss] failed to submit a MPICH-related job on a cluster

Pavan Balaji balaji at mcs.anl.gov
Fri Mar 16 16:35:08 CDT 2012


mpirun.lsf is not something we provide.  It might have been created by 
someone else on your system (e.g., your system administrator).

You might want to look at this thread for more information.

http://lists.mcs.anl.gov/pipermail/mpich-discuss/2012-March/012006.html
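
The log you pasted looks consistent with that. The wrapper's command line starts with "-a mpich2", and "-a" is not among the options in the Hydra usage text, which would explain "unrecognized argument a". Here is a rough Python sketch of that check. It assumes mpiexec scans its options left to right until it reaches the executable name. The OPTION_ARITY table and the first_unrecognized helper are hypothetical names made up for illustration; the options are copied from the usage text in your log, and this is not how Hydra itself is implemented.

```python
import shlex

# Options copied from the usage text in the quoted log, mapped to the number
# of values each one consumes. The table is partial on purpose.
OPTION_ARITY = {
    "-genv": 2, "-genvlist": 1, "-genvnone": 0, "-genvall": 0,
    "-f": 1, "-hosts": 1, "-wdir": 1, "-configfile": 1,
    "-env": 2, "-envlist": 1, "-envnone": 0, "-envall": 0,
    "-n": 1, "-np": 1,
    "-launcher": 1, "-rmk": 1,  # assumption: each takes exactly one value
}


def first_unrecognized(args):
    """Return the first dash-prefixed token missing from OPTION_ARITY, or None."""
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("-"):
            # Reached the executable name; everything after it belongs to it.
            return None
        if token not in OPTION_ARITY:
            return token
        # Skip the option itself plus the values it consumes.
        i += 1 + OPTION_ARITY[token]
    return None


# Arguments taken from the "Job" line of the quoted log.
wrapper_args = shlex.split(
    "-a mpich2 -n 12 -f /etc/hosts -launcher ssh ./ccsm.exe"
)

print(first_unrecognized(wrapper_args))      # the full wrapper argument list
print(first_unrecognized(wrapper_args[2:]))  # the same list without "-a mpich2"
```

If the first call reports "-a" and the second reports nothing, the rest of the command line is something Hydra's mpiexec should accept on its own, and the fix belongs in the wrapper script rather than in MPICH2.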

  -- Pavan

On 01/30/2012 11:34 PM, Rui Mei wrote:
> Dear all,
>
> I compiled our model and created the ccsm.exe file on a Linux cluster,
> but the job failed after I submitted it to run with mpirun.lsf.
> I googled the error message and found many threads online. It seems
> to be related to how the process manager is set up.
>
> The administrator configured MPICH2 with all available process managers.
> The log file mentions "Hydra", but the mpich2_wrapper file tries to use
> MPD. I am not sure whether this inconsistency is the cause. Any thoughts
> or suggestions would be appreciated.
>
> Thanks,
> Rui
>
> Here is the log file:
>
> "[mpiexec at cn60] match_arg (./utils/args/args.c:122): unrecognized argument a
> [mpiexec at cn60] HYDU_parse_array (./utils/args/args.c:140): argument matching returned error
> [mpiexec at cn60] parse_args (./ui/mpich/utils.c:1387): error parsing input array
> [mpiexec at cn60] HYD_uii_mpx_get_parameters (./ui/mpich/utils.c:1475): error parsing config args
>
> Usage: ./mpiexec [global opts] [exec1 local opts] : [exec2 local opts] : ...
>
> Global options (passed to all executables):
>
>    Global environment options:
>      -genv {name} {value}             environment variable name and value
>      -genvlist {env1,env2,...}        environment variable list to pass
>      -genvnone                        do not pass any environment variables
>      -genvall                         pass all environment variables not managed
>                                            by the launcher (default)
>
>    Other global options:
>      -f {name}                        file containing the host names
>      -hosts {host list}               comma separated host list
>      -wdir {dirname}                  working directory to use
>      -configfile {name}               config file containing MPMD launch options
>
>
> Local options (passed to individual executables):
>
>    Local environment options:
>      -env {name} {value}              environment variable name and value
>      -envlist {env1,env2,...}         environment variable list to pass
>      -envnone                         do not pass any environment variables
>      -envall                          pass all environment variables (default)
>
>    Other local options:
>      -n/-np {value}                   number of processes
>      {exec_name} {args}               executable name and arguments
>
>
> Hydra specific options (treated as global):
>
>    Launch options:
>      -launcher                        launcher to use ( ssh rsh fork slurm ll lsf sge manual persist)
>      -launcher-exec                   executable to use to launch processes
>      -enable-x/-disable-x             enable or disable X forwarding
>
>    Resource management kernel options:
>      -rmk                             resource management kernel to use ( user slurm ll lsf sge pbs)
>
>    Hybrid programming options:
>      -ranks-per-proc                  assign so many ranks to each process
>
>    Processor topology options:
>      -binding                         process-to-core binding mode
>      -topolib                         processor topology library ( hwloc plpa)
>
>    Checkpoint/Restart options:
>      -ckpoint-interval                checkpoint interval
>      -ckpoint-prefix                  checkpoint file prefix
>      -ckpoint-num                     checkpoint number to restart
>      -ckpointlib                      checkpointing library (none)
>
>    Demux engine options:
>      -demux                           demux engine ( poll select)
>
>    Other Hydra options:
>      -verbose                         verbose mode
>      -info                            build information
>      -print-all-exitcodes             print exit codes of all processes
>      -iface                           network interface to use
>      -ppn                             processes per node
>      -profile                         turn on internal profiling
>      -prepend-rank                    prepend rank to output
>      -prepend-pattern                 prepend pattern to output
>      -outfile-pattern                 direct stdout to file
>      -errfile-pattern                 direct stderr to file
>      -nameserver                      name server information (host:port format)
>      -disable-auto-cleanup            don't cleanup processes on error
>      -disable-hostname-propagation    let MPICH2 auto-detect the hostname
>      -order-nodes                     order nodes as ascending/descending cores
>
> Please see the intructions provided at
> http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager
> for further details
>
> Job  /usr/share/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/mpich2_wrapper -a mpich2 -n 12 -f /etc/hosts -launcher ssh ./ccsm.exe
>
> TID   HOST_NAME   COMMAND_LINE            STATUS            TERMINATION_TIME
> ===== ========== ================  =======================  ===================
> 00001 cn60                         Undefined
> "
>
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

