[petsc-users] Problem with petsc-dev+openmpi
Satish Balay
balay at mcs.anl.gov
Wed Apr 10 09:36:40 CDT 2013
Which branch of petsc-dev are you using? Please send the relevant logs to petsc-maint.
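For the branch and the logs, something like the following should be enough (a minimal sketch, assuming a git clone of petsc-dev and the standard PETSC_DIR/PETSC_ARCH layout; the exact log locations may differ between versions):

    cd $PETSC_DIR
    # report the checked-out branch and exact commit of petsc-dev
    git branch
    git log -1 --oneline
    # configure and build logs to attach to the petsc-maint report
    # (this is the usual petsc-dev layout; adjust if your tree differs)
    ls $PETSC_DIR/$PETSC_ARCH/conf/configure.log $PETSC_DIR/$PETSC_ARCH/conf/make.log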
Satish
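PS: Regarding the "mpirun -n 2 ex2" run quoted below: if Open MPI was installed via --download-openmpi, make sure the launcher matches the library the example was linked against; an mpirun picked up from PATH that belongs to a different Open MPI installation is one common cause of orte_init failures of this kind. A minimal sketch, assuming the usual petsc-dev tree layout and that --download-openmpi installed into $PETSC_DIR/$PETSC_ARCH:

    # build the tutorial example with the PETSc makefiles
    cd $PETSC_DIR/src/ksp/ksp/examples/tutorials
    make ex2
    # launch with the mpiexec that PETSc built, so launcher and libmpi match
    $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 2 ./ex2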
On Wed, 10 Apr 2013, Zhang wrote:
> Hi,
>
> I installed petsc-dev. Everything went fine when compiling it.
>
> But problems appeared with the downloaded openmpi-1.6.3 when I built /ksp/ex2 and ran it with
>
>
> mpirun -n 2 ex2
>
> The errors are as follows:
>
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_util_nidmap_init failed
> --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 118
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_env_module.c at line 174
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_set_name failed
> --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [ubuntu:03237] [[3831,1],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file runtime/orte_init.c at line 128
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: orte_init failed
> --> Returned "Data unpack would read past end of buffer" (-26) instead of "Success" (0)
> --------------------------------------------------------------------------
> [ubuntu:3237] *** An error occurred in MPI_Init_thread
> [ubuntu:3237] *** on a NULL communicator
> [ubuntu:3237] *** Unknown error
> [ubuntu:3237] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> An MPI process is aborting at a time when it cannot guarantee that all
> of its peer processes in the job will be killed properly. You should
> double check that everything has shut down cleanly.
>
> Reason: Before MPI_INIT completed
> Local host: ubuntu
> PID: 3237
> --------------------------------------------------------------------------
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 118
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_env_module.c at line 174
> [ubuntu:03240] [[3831,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file runtime/orte_init.c at line 128
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 3237 on
> node ubuntu exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> [ubuntu:03236] 1 more process has sent help message help-orte-runtime.txt / orte_init:startup:internal-failure
> [ubuntu:03236] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [ubuntu:03236] 1 more process has sent help message help-orte-runtime / orte_init:startup:internal-failure
> [ubuntu:03236] 1 more process has sent help message help-mpi-runtime / mpi_init:startup:internal-failure
> [ubuntu:03236] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
> [ubuntu:03236] 1 more process has sent help message help-mpi-runtime.txt / ompi mpi abort:cannot guarantee all killed
>
>
> Please help me solve this, and many thanks.
>
> Zhenyu
>