NetBSD port

Kevin.Buckley at Kevin.Buckley at
Tue Dec 15 22:26:50 CST 2009

Hi again,

I thought I had got things working, but maybe not completely.

I did this and stuff worked:

./configure  --with-c++-support --with-hdf5=/usr/pkg
PETSC_ARCH=netbsdelf5.0.-c-debug; export PETSC_ARCH
make all
make install
make test
cd src/snes/examples/tutorials/
make ex19
./ex19 -contours

Nice pictures!

I then moved the example ex19 source and the makefile out of the
distribution tree to somewhere else and built it against the
installed stuff and ran it: that worked too.

export PETSC_DIR=/vol/grid/pkg/petsc-3.0.0-p7
make ex19
./ex19 -dmmg_nlevels 4 -snes_monitor_draw
./ex19 -contours

I then built the package that needs PETSc, PISM, from Univ Alaska at
Fairbanks, and ran that.

What I then found is that the PISM stuff would fail if we launched it
into a Sun Grid Engine environment with more than two processors.

It also ran when simply launched with mpiexec onto a four-processor
machine, but not onto a four-machine grid.

I saw this block of error messages from a 4-node submission:

[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see[2]PETSC
ERROR: or try on linux or man libgmalloc on Apple to
find memory corruption errors
[2]PETSC ERROR: likely location of problem given in stack below
[2]PETSC ERROR: ---------------------  Stack Frames
[2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[2]PETSC ERROR:       INSTEAD the line number of the start of the function
[2]PETSC ERROR:       is given.
[2]PETSC ERROR: [2] VecScatterCreateCommon_PtoS line 1699
[2]PETSC ERROR: [2] VecScatterCreate_PtoS line 1508
[2]PETSC ERROR: [2] VecScatterCreate line 833 src/vec/vec/utils/vscat.c
[2]PETSC ERROR: [2] DACreate2d line 338 src/dm/da/src/da2.c
[2]PETSC ERROR: --------------------- Error Message
[2]PETSC ERROR: Signal received!
[2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul  6 11:33:34
CDT 2009
[2]PETSC ERROR: See docs/changes/index.html for recent updates.
[2]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[2]PETSC ERROR: See docs/index.html for manual pages.
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
[2]PETSC ERROR: /vol/grid/pkg/pism-0.2.1/bin/pismv on a netbsdelf named by golledni Wed Dec 16 15:49:09 2009
[2]PETSC ERROR: Libraries linked from /vol/grid/pkg/petsc-3.0.0-p7/lib
[2]PETSC ERROR: Configure run at Mon Dec 14 17:02:49 2009
[2]PETSC ERROR: Configure options --with-c++-support --with-hdf5=/usr/pkg
--prefix=/vol/grid/pkg/petsc-3.0.0-p7 --with-shared=0
[2]PETSC ERROR: User provided function() line 0 in unknown directory
unknown file
mpirun has exited due to process rank 2 with PID 4365 on
node exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

and this block of messages from an 8-node submission:

[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[3]PETSC ERROR: or see[3]PETSC
ERROR: or try on linux or man libgmalloc on Apple to
find memory corruption errors
[3]PETSC ERROR: likely location of problem given in stack below
[3]PETSC ERROR: ---------------------  Stack Frames
[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see[2]PETSC
ERROR: or try on linux or man
 libgmalloc on Apple to find memory corruption errors
[2]PETSC ERROR: likely location of problem given in stack below
[2]PETSC ERROR: ---------------------  Stack Frames
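
Because output from several ranks is interleaved in logs like the one
above, a small filter can help show which ranks reported which signal.
This is only a minimal sketch, assuming the job output has been saved
to a file; the function name summarise_signals and the file name
job.log are hypothetical and not part of PETSc, PISM or Open MPI.

```shell
#!/bin/sh
# Hypothetical helper: list each rank that reported "Caught signal number N"
# in a saved PETSc error log, together with the signal number and name.
summarise_signals() {
    # Matches lines such as:
    #   [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
    # and rewrites them as "rank 2: signal 11 (SEGV)".
    # sort -u drops repeats; the ordering is lexical, not numeric.
    sed -n 's/^\[\([0-9][0-9]*\)\]PETSC ERROR: Caught signal number \([0-9][0-9]*\) \([A-Za-z]*\).*/rank \1: signal \2 (\3)/p' "$1" | sort -u
}

# Only run when a log file is named on the command line.
if [ "$#" -gt 0 ]; then
    summarise_signals "$1"
fi
```

Saved as a script and run with job.log as its argument, it prints one
line per rank and signal, which makes it easier to tell the ranks that
hit a SEGV apart from those that were merely terminated afterwards.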

I then went back and tried to run the PETSc example and found similar
behaviour: things run when submitted to a two-node "grid" but not to a
four-node one. The error message block was:

[0]PETSC ERROR: --------------------- Error Message
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
[0]PETSC ERROR: Memory allocated 90628 Memory used by process 0
[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[0]PETSC ERROR: Memory requested 320!
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul  6 11:33:34
CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: /home/rialto1/kingstlind/kevin/PETSc/ex19 on a netbsdelf
named by kingstlind Wed Dec 16 16:45:39 2009
[0]PETSC ERROR: Libraries linked from /vol/grid/pkg/petsc-3.0.0-p7/lib
[0]PETSC ERROR: Configure run at Mon Dec 14 17:02:49 2009
[0]PETSC ERROR: Configure options --with-c++-support --with-hdf5=/usr/pkg
--prefix=/vol/grid/pkg/petsc-3.0.0-p7 --with-shared=0
[0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c
[0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c
[0]PETSC ERROR: PetscFListAdd() line 235 in src/sys/dll/reg.c
[0]PETSC ERROR: MatRegister() line 140 in src/mat/interface/matreg.c
[0]PETSC ERROR: MatRegisterAll() line 106 in src/mat/interface/matregis.c
[0]PETSC ERROR: MatInitializePackage() line 54 in
[0]PETSC ERROR: MatCreate() line 74 in src/mat/utils/gcreate.c
[0]PETSC ERROR: DAGetInterpolation_2D_Q1() line 308 in
[0]PETSC ERROR: DAGetInterpolation() line 879 in src/dm/da/src/dainterp.c
[0]PETSC ERROR: DMGetInterpolation() line 144 in src/dm/da/utils/dm.c
[0]PETSC ERROR: DMMGSetDM() line 309 in src/snes/utils/damg.c
[0]PETSC ERROR: main() line 108 in src/snes/examples/tutorials/ex19.c
mpirun has exited due to process rank 0 with PID 9757 on
node exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
[1]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the
batch system) has told this process to end
[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[1]PETSC ERROR: or see[2]PETSC
[] opal_sockaddr2str failed:Unknown error
(return code 4)
[3]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the
batch system) has told this process to end
[3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[3]PETSC ERROR: or see[3]PETSC
ERROR: or try on linux or man libgmalloc on Apple to
find memory corruption errors

Do the PETSc error messages suggest anything wrong with my PETSc, or do
they point to underlying problems with Open MPI?

Any suggestions/insight welcome,

Kevin M. Buckley                                  Room:  CO327
School of Engineering and                         Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand

More information about the petsc-dev mailing list