[petsc-users] Problem with petsc-dev+openmpi

Satish Balay balay at mcs.anl.gov
Wed Apr 10 09:36:40 CDT 2013


Which branch of petsc-dev? Please send the relevant logs to petsc-maint.
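
As a general note, this particular Open MPI failure ("Data unpack would
read past end of buffer" during orte_init) is often a symptom of launching
the binary with an mpirun from a different Open MPI installation than the
one the executable was linked against. A minimal check, assuming PETSc was
configured with --download-openmpi so that the matching launcher lives
under $PETSC_DIR/$PETSC_ARCH/bin (adjust the paths for your install):

  which mpirun                    # launcher found first on PATH
  ldd ./ex2 | grep -i mpi         # Open MPI libraries ex2 is linked against
  $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 2 ./ex2    # launcher matching the build

If the first two point at different Open MPI installations, running with
the matching mpiexec (or cleaning up PATH and LD_LIBRARY_PATH so only one
installation is visible) usually resolves it.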

Satish

On Wed, 10 Apr 2013, Zhang wrote:

> Hi,
> 
> I installed petsc-dev. Everything was fine when compiling it.
> 
> But problems appeared with the downloaded openmpi-1.6.3 when I built ksp/ex2 and ran it with
> 
> 
> mpirun -n 2 ex2
> 
> The errors are as follows:
> 
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_util_nidmap_init failed
>   --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 118
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_env_module.c at line 174
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_ess_set_name failed
>   --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file runtime/orte_init.c at line 128
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>   ompi_mpi_init: orte_init failed
>   --> Returned "Data unpack would read past end of buffer" (-26) instead of "Success" (0)
> --------------------------------------------------------------------------
> [ubuntu:3237] *** An error occurred in MPI_Init_thread
> [ubuntu:3237] *** on a NULL communicator
> [ubuntu:3237] *** Unknown error
> [ubuntu:3237] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> An MPI process is aborting at a time when it cannot guarantee that all
> of its peer processes in the job will be killed properly.  You should
> double check that everything has shut down cleanly.
> 
>   Reason:     Before MPI_INIT completed
>   Local host: ubuntu
>   PID:        3237
> --------------------------------------------------------------------------
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 118
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_env_module.c at line 174
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file runtime/orte_init.c at line 128
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 3237 on
> node ubuntu exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> [ubuntu:03236] 1 more process has sent help message help-orte-runtime.txt / orte_init:startup:internal-failure
> [ubuntu:03236] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [ubuntu:03236] 1 more process has sent help message help-orte-runtime / orte_init:startup:internal-failure
> [ubuntu:03236] 1 more process has sent help message help-mpi-runtime / mpi_init:startup:internal-failure
> [ubuntu:03236] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
> [ubuntu:03236] 1 more process has sent help message help-mpi-runtime.txt / ompi mpi abort:cannot guarantee all killed
> 
> 
> Please help me solve this. Many thanks,
> 
> Zhenyu
> 
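
As the last lines of the log above suggest, the aggregated duplicate help
messages can be expanded by setting the named MCA parameter on the command
line, e.g.:

  mpirun --mca orte_base_help_aggregate 0 -n 2 ./ex2

which may expose additional detail worth including in the logs sent to
petsc-maint.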


