[petsc-dev] Fwd: [petsc-users] pctype hmpi

Barry Smith bsmith at mcs.anl.gov
Fri Aug 31 13:25:13 CDT 2012


   HMPI usage in PETSc relies on sending function pointer addresses between MPI processes; this used to be fine, since when you start several copies of the same executable on a machine, each has the same virtual addresses for its instructions.  Yes, this is a bit hacky, but it worked great.

    With ASLR (address space layout randomization) this model breaks.   On Linux I found a way that supposedly turns off ASLR, sysctl -w kernel.randomize_va_space=0, but on Apple there does not seem to be an easy system-wide way to turn it off.
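
To make the failure mode concrete: an absolute function address is the image's load base plus a fixed offset, and ASLR changes the load base per process. The sketch below is a minimal illustration, not PETSc code. It assumes every process runs the same executable image and that the platform allows converting function pointers to integers (as POSIX systems do). All names (hmpi_demo_work, hmpi_demo_anchor, demo_encode, demo_decode) are hypothetical. It sends the offset from a known anchor symbol instead of the raw address, and rebuilds the pointer from the receiver's own anchor.

```c
#include <stdint.h>

/* Hypothetical stand-in for a routine that a worker process should run. */
static int hmpi_demo_work(int x) { return 2 * x; }

/* Any fixed symbol in the same executable image can serve as the anchor. */
static void hmpi_demo_anchor(void) {}

typedef int (*demo_fn)(int);

/* Sender side: compute the offset of f from the anchor. Assumption: ASLR
   shifts the whole image by one slide, so this offset is the same in every
   process that runs the same executable. */
static intptr_t demo_encode(demo_fn f)
{
  return (intptr_t)((uintptr_t)f - (uintptr_t)&hmpi_demo_anchor);
}

/* Receiver side: rebuild the address from this process's own anchor. */
static demo_fn demo_decode(intptr_t offset)
{
  return (demo_fn)((uintptr_t)&hmpi_demo_anchor + (uintptr_t)offset);
}
```

In a real run the value returned by demo_encode() is what would travel between MPI processes; the receiving process would call demo_decode() and then invoke the result. This only holds for functions inside the same image as the anchor, so it would not cover functions living in separately loaded libraries.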

     What to do? It looks like I have to switch to requiring shared libraries, passing function names as strings, and looking the functions up dynamically.  Any thoughts on alternatives?
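
The name-passing idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the PETSc implementation: instead of a shared-library lookup (which on POSIX systems would typically go through dlsym()), it uses a static table compiled into every process so the example stays self-contained. The names worker_fn, worker_double, worker_negate, worker_table, and worker_lookup are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for routines that worker processes may be asked
   to run. */
typedef int (*worker_fn)(int);

static int worker_double(int x) { return 2 * x; }
static int worker_negate(int x) { return -x; }

/* Every process builds the same name-to-function table, so a name means
   the same thing everywhere regardless of where ASLR placed the code. */
struct worker_entry {
  const char *name;
  worker_fn   fn;
};

static const struct worker_entry worker_table[] = {
  {"worker_double", worker_double},
  {"worker_negate", worker_negate},
};

/* Receiver side: map a received name back to a local function pointer.
   Returns NULL if the name is not registered. */
static worker_fn worker_lookup(const char *name)
{
  size_t i, n = sizeof(worker_table) / sizeof(worker_table[0]);
  for (i = 0; i < n; i++) {
    if (strcmp(worker_table[i].name, name) == 0) return worker_table[i].fn;
  }
  return NULL;
}
```

The string (or the table index) is what would be sent between MPI processes; each receiver resolves it locally with worker_lookup(). The trade-off against a dlsym()-style lookup is that every callable function has to be registered explicitly, but no shared libraries are required.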

   Barry


Begin forwarded message:

> From: Barry Smith <bsmith at mcs.anl.gov>
> Subject: Re: [petsc-users] pctype hmpi
> Date: August 31, 2012 1:17:07 PM CDT
> To: PETSc users list <petsc-users at mcs.anl.gov>
> Reply-To: PETSc users list <petsc-users at mcs.anl.gov>
> 
> 
> On Aug 30, 2012, at 10:02 PM, George Pau <gpau at lbl.gov> wrote:
> 
>> Hi Barry,
>> 
>> I tried with the addition of 
>> 
>> -hmpi_spawn_size 3
>> 
>> but I am still getting the same error though.
> 
>    The EXACT same error? Or some other error?
> 
>     What happens if you run with the -hmpi_merge_size <size> option instead?
> 
>   Barry
> 
> 1) I am getting a crash with the spawn version that I suspect is due to spawn-related bugs in the MPICH version I am using.
> 
> 2) I am getting errors with the merge version due to Apple's ASLR, which they make hard to turn off.
> 
> 
>> I am using mpich2.  Any other options to try?
>> 
>> George
>> 
>> 
>> On Aug 30, 2012, at 7:28 PM, Barry Smith wrote:
>> 
>>> 
>>> On Aug 30, 2012, at 7:24 PM, George Pau <gpau at lbl.gov> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I have some issues using -pc_type hmpi.  I used the same settings found at 
>>>> 
>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCHMPI.html
>>>> 
>>>> i.e. 
>>>> -pc_type hmpi 
>>>> -ksp_type preonly
>>>> -hmpi_ksp_type cg 
>>>> -hmpi_pc_type hypre 
>>>> -hmpi_pc_hypre_type boomeramg
>>>> 
>>>> My command  is
>>>> 
>>>> mpiexec -n 1 myprogram
>>> 
>>> Sorry the documentation doesn't make this clearer. You need to start PETSc with special options to get the "worker" processes initialized. The manual page for PCHMPI says
>>> 
>>> See PetscHMPIMerge() and PetscHMPISpawn() for two ways to start up MPI for use with this preconditioner
>>> 
>>> These will tell you which option to start PETSc up with.
>>> 
>>>  I will fix the PC so that it prints a far more useful error message.
>>> 
>>> 
>>> 
>>> Barry
>>> 
>>> 
>>>> 
>>>> But, I get 
>>>> 
>>>> [gilbert:4041] *** An error occurred in MPI_Bcast
>>>> [gilbert:4041] *** on communicator MPI_COMM_WORLD
>>>> [gilbert:4041] *** MPI_ERR_COMM: invalid communicator
>>>> [gilbert:4041] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>>> 
>>>> with openmpi.  I get a similar error with mpich2
>>>> 
>>>> Fatal error in PMPI_Bcast: Invalid communicator, error stack:
>>>> PMPI_Bcast(1478): MPI_Bcast(buf=0x7fffb683479c, count=1, MPI_INT, root=0, comm=0x0) failed
>>>> PMPI_Bcast(1418): Invalid communicator
>>>> 
>>>> I couldn't figure out what is wrong.    My petsc is version 3.3.3, the configuration is -with-debugging=0 --with-mpi-dir=/usr/lib/openmpi --download-hypre=1, and I am on an Ubuntu machine.
>>>> 
>>>> Note that with the default pc_type and ksp_type, everything is fine.  It was also tested with multiple processors.  I am wondering whether there are some options that I am not specifying correctly?
>>>> 
>>>> -- 
>>>> George Pau
>>>> Earth Sciences Division
>>>> Lawrence Berkeley National Laboratory
>>>> One Cyclotron, MS 74-120
>>>> Berkeley, CA 94720
>>>> 
>>>> (510) 486-7196
>>>> gpau at lbl.gov
>>>> http://esd.lbl.gov/about/staff/georgepau/
>>>> 
>>> 
>> 
> 



