[mpich-discuss] UNUSUAL MPICH-2 ERROR

Jeff Hammond jhammond at alcf.anl.gov
Tue Dec 13 07:52:40 CST 2011


I already looked through the IBM BGP admin Redbook but didn't see
anything.  I will ask our sysadmins where the docs are when I get to
work.

As far as I know, there is no need for MPI to be aware of the
scheduler on BGP.  We use Cobalt instead of Slurm/Moab or LoadLeveler
and the build instructions for BGP-MPI I have used did not include
anything about Cobalt.
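
One way to check whether the stock wrappers really reference the
scheduler is to scan them for a marker string. The following is only a
minimal sketch: the function name check_wrappers and the search pattern
"loadl" are assumptions made for illustration, not part of any IBM or
MPICH2 tooling, and the default directory is simply the path quoted
later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list the mpi* files in a directory and report
# whether each one mentions LoadLeveler. The pattern "loadl" is an
# assumed marker; adjust it to whatever string your site actually uses.
check_wrappers() {
    dir="${1:-/bgsys/drivers/ppcfloor/comm/bin}"
    found=0
    for f in "$dir"/mpi*; do
        # Skip the unexpanded glob and anything that is not a regular file.
        [ -f "$f" ] || continue
        found=1
        if grep -qi 'loadl' "$f"; then
            echo "$f: mentions LoadLeveler"
        else
            echo "$f: no LoadLeveler reference"
        fi
    done
    [ "$found" -eq 1 ] || echo "no mpi* wrappers found in $dir"
}
```

Calling check_wrappers with no argument scans the default directory;
passing a different directory scans that one instead. A wrapper with no
match does not prove it is scheduler-independent, it only shows that
the assumed marker string does not appear in it.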

I recommend you try two different solutions at this point:
1. Contact IBM support about MPI problems with Slurm/Moab.
2. Follow the directions on the DCMF home page and try to build the
BGP-specific MPI from source.

We should probably take this thread offline at this point since your
issues are more BGP-related than they are MPICH2-related.

Best,

Jeff

On Tue, Dec 13, 2011 at 7:31 AM, Sticks Mabakane <SMabakane at csir.co.za> wrote:
> Dear Jeff,
>
> We are currently moving our Blue Gene system from the LoadLeveler resource
> manager to the Moab scheduling system. Moab is running very well with
> Slurm on the Blue Gene, but the problem is MPI. Unfortunately, all the
> MPI wrappers in /bgsys/drivers/ppcfloor/comm/bin/ were compiled against
> the LoadLeveler libraries. Is there any Blue Gene documentation that you
> think might help me?
>
> Regards,
>
> Sticks
>
>>>> Jeff Hammond <jhammond at alcf.anl.gov> 13/12/2011 15:04 >>>
>
> You should not have to install MPICH2 on any Blue Gene system.
> Please look in /bgsys/drivers/ppcfloor/comm/bin/ for the mpi*
> compiler wrapper scripts.  If they are not present, please follow the
> IBM documentation on how to install the system software via RPM.  Do
> not try to install MPI on a Blue Gene by downloading the latest
> MPICH2.
>
> Jeff
>
> On Tue, Dec 13, 2011 at 6:30 AM, Sticks Mabakane <smabakane at csir.co.za>
> wrote:
>> Dear Mpich-2,
>>
>> I am an administrator of a Blue Gene system and am currently trying to
>> install mpich-2. I have installed mpich-2 without any errors. However,
>> when I try to run the DL_POLY application with mpich-2, it gives the
>> following error:
>>
>> [mpiexec at chpcsn] HYD_pmcd_pmi_alloc_pg_scratch
>> (./pm/pmiserv/pmiserv_utils.c:595): assert (pg->pg_process_count *
>> sizeof(struct HYD_pmcd_pmi_ecount)) failed
>> [mpiexec at chpcsn] HYD_pmci_launch_procs (./pm/pmiserv/pmiserv_pmci.c:103):
>> error allocating pg scratch space
>> [mpiexec at chpcsn] main (./ui/mpich/mpiexec.c:401): process manager returned
>> error launching processes
>> ~
>>
>> I have also tried to issue the command mpirun --help, and it showed the
>> same error message as follows:
>>
>> chpcsn:/bgsys/drivers/ppcfloor/bin # /bgsys/drivers/ppcfloor/bin/mpirun
>> --help
>> [mpiexec at chpcsn] HYDU_parse_array (./utils/args/args.c:137):
>> [mpiexec at chpcsn] parse_args (./ui/mpich/utils.c:1387): error parsing input
>> array
>> [mpiexec at chpcsn] HYD_uii_mpx_get_parameters (./ui/mpich/utils.c:1438):
>> unable to parse user arguments
>>
>> Usage: ./mpiexec [global opts] [exec1 local opts] : [exec2 local opts] :
>> ...
>>
>> Global options (passed to all executables):
>>
>>   Global environment options:
>>     -genv {name} {value}             environment variable name and value
>>     -genvlist {env1,env2,...}        environment variable list to pass
>>     -genvnone                        do not pass any environment variables
>>     -genvall                         pass all environment variables not managed
>>                                           by the launcher (default)
>>                                                        :
>>                                                        :
>>                                                        :
>>
>> Mpich-2 was compiled using gcc-4.1.2 and G Fortran 95. The Blue Gene
>> system is running Linux on PowerPC. Please find the config.log file
>> attached.
>>
>> Regards,
>>
>> Sticks
>>
>>
>> --
>> This message is subject to the CSIR's copyright terms and conditions,
>> e-mail
>> legal notice, and implemented Open Document Format (ODF) standard.
>> The full disclaimer details can be found at
>> http://www.csir.co.za/disclaimer.html.
>>
>>
>> This message has been scanned for viruses and dangerous content by
>> MailScanner,
>> and is believed to be clean.
>>
>>
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>
>
>
> --
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki-old.alcf.anl.gov/index.php/User:Jhammond
>



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki-old.alcf.anl.gov/index.php/User:Jhammond

