[petsc-dev] Fwd: [petsc-users] Unexpected "Out of memory error" with SLEPC

Barry Smith bsmith at mcs.anl.gov
Thu Jun 26 12:31:03 CDT 2014


It would be nice if we could automatically detect this issue more often.

   Barry


Begin forwarded message:

> From: Mathis Friesdorf <mathisfriesdorf at gmail.com>
> Subject: Re: [petsc-users] Unexpected "Out of memory error" with SLEPC
> Date: June 26, 2014 at 5:08:23 AM CDT
> To: Karl Rupp <rupp at iue.tuwien.ac.at>
> Cc: <petsc-users at mcs.anl.gov>, <slepc-maint at grycap.upv.es>
> 
> Dear Karl,
> 
> thanks a lot! This indeed worked. I recompiled PETSC with
> 
> ./configure --with-64-bit-indices=1
> PETSC_ARCH = arch-linux2-c-debug-int64
> 
> and the error is gone. Again thanks for your help! 
> 
> All the best, Mathis
> 
> 
> On Wed, Jun 25, 2014 at 12:42 PM, Karl Rupp <rupp at iue.tuwien.ac.at> wrote:
> Hi Mathis,
> 
> this looks very much like an integer overflow:
> http://www.mcs.anl.gov/petsc/documentation/faq.html#with-64-bit-indices
> 
> Best regards,
> Karli
> 
> 
> On 06/25/2014 12:31 PM, Mathis Friesdorf wrote:
> Dear all,
> 
> after a very useful email exchange with Jed Brown quite a while ago, I
> was able to find the lowest eigenvalue of a large matrix which is
> constructed as a tensor product. Admittedly the solution is a bit
> hacked, but is based on a Matrix shell and Armadillo and therefore
> reasonably fast. The problem seems to work well for smaller systems, but
> once the vectors reach a certain size, I get "out of memory" errors. I
> have tested the initialization of a vector of that size and
> multiplication by the matrix. This works fine and takes roughly 20GB of
> memory. There are 256 GB available, so I see no reason why the esp
> solvers should complain. Does anyone have an idea what goes wrong here?
> The error message is not very helpful and claims that a memory is
> requested that is way beyond any reasonable number: /Memory requested
> 18446744056529684480./
> 
> 
> Thanks and all the best, Mathis Friesdorf
> 
> 
> *Output of the Program:*
> /mathis at n180:~/localisation$ ./local_plus 27/
> /System Size: 27
> 
> --------------------------------------------------------------------------
> [[30558,1],0]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
> 
> Module: OpenFabrics (openib)
>    Host: n180
> 
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0]PETSC ERROR: too large an object or bleeding by not properly
> [0]PETSC ERROR: destroying unneeded objects.
> [0]PETSC ERROR: Memory allocated 3221286704 Memory used by process
> 3229827072
> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0]PETSC ERROR: Memory requested 18446744056529684480!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./local_plus on a arch-linux2-cxx-debug named n180 by
> mathis Wed Jun 25 12:23:01 2014
> [0]PETSC ERROR: Libraries linked from /home/mathis/bin_nodebug/lib
> [0]PETSC ERROR: Configure run at Wed Jun 25 00:03:34 2014
> [0]PETSC ERROR: Configure options PETSC_DIR=/home/mathis/petsc-3.4.4
> --with-debugging=1 COPTFLAGS="-O3 -march=p4 -mtune=p4" --with-fortran=0
> -with-mpi=1 --with-mpi-dir=/usr/lib/openmpi --with-clanguage=cxx
> --prefix=/home/mathis/bin_nodebug
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: PetscMallocAlign() line 46 in
> /home/mathis/petsc-3.4.4/src/sys/memory/mal.c
> [0]PETSC ERROR: PetscTrMallocDefault() line 189 in
> /home/mathis/petsc-3.4.4/src/sys/memory/mtr.c
> [0]PETSC ERROR: VecDuplicateVecs_Contiguous() line 62 in
> src/vec/contiguous.c
> [0]PETSC ERROR: VecDuplicateVecs() line 589 in
> /home/mathis/petsc-3.4.4/src/vec/vec/interface/vector.c
> [0]PETSC ERROR: EPSAllocateSolution() line 51 in src/eps/interface/mem.c
> [0]PETSC ERROR: EPSSetUp_KrylovSchur() line 141 in
> src/eps/impls/krylov/krylovschur/krylovschur.c
> [0]PETSC ERROR: EPSSetUp() line 147 in src/eps/interface/setup.c
> [0]PETSC ERROR: EPSSolve() line 90 in src/eps/interface/solve.c
> [0]PETSC ERROR: main() line 48 in local_plus.cpp
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 55.
> 
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> 
> /
> *Output of make:
> */mathis at n180:~/localisation$ make local_plus
> 
> mpicxx -o local_plus.o -c -Wall -Wwrite-strings -Wno-strict-aliasing
> -Wno-unknown-pragmas -g   -fPIC
> -I/home/mathis/armadillo-4.300.8/include -lblas -llapack
> -L/home/mathis/armadillo-4.300.8 -O3 -larmadillo -fomit-frame-pointer
> -I/home/mathis/bin_nodebug/include -I/home/mathis/bin_nodebug/include
> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
>   -D__INSDIR__= -I/home/mathis/bin_nodebug
> -I/home/mathis/bin_nodebug//include -I/home/mathis/bin_nodebug/include
> local_plus.cpp
> local_plus.cpp:22:0: warning: "__FUNCT__" redefined [enabled by default]
> In file included from /home/mathis/bin_nodebug/include/petscvec.h:10:0,
>                   from local_plus.cpp:10:
> /home/mathis/bin_nodebug/include/petscviewer.h:386:0: note: this is the
> location of the previous definition
> g++ -o local_plus local_plus.o -Wl,-rpath,/home/mathis/bin_nodebug//lib
> -L/home/mathis/bin_nodebug//lib -lslepc
> -Wl,-rpath,/home/mathis/bin_nodebug/lib -L/home/mathis/bin_nodebug/lib
>   -lpetsc -llapack -lblas -lpthread -Wl,-rpath,/usr/lib/openmpi/lib
> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.7
> -L/usr/lib/gcc/x86_64-linux-gnu/4.7 -Wl,-rpath,/usr/lib/x86_64-linux-gnu
> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu
> -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte
> -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl
> /bin/rm -f local_plus.o
> /
> *Code:
> *///Author: Mathis Friesdorf
> //MathisFriesdorf at gmail.com <mailto:MathisFriesdorf at gmail.com>
> 
> 
> static char help[] = "1D chain\n";
> 
> #include <iostream>
> #include <typeinfo>
> #include <armadillo>
> #include <petscsys.h>
> #include <petscvec.h>
> #include <petscmat.h>
> #include <slepcsys.h>
> #include <slepceps.h>
> #include <math.h>
> #include <assert.h>
> 
> PetscErrorCode BathMult(Mat H, Vec x,Vec y);
> PetscInt       L=30,d=2,dsys;
> PetscErrorCode ierr;
> arma::mat hint = "1.0 0 0 0.0; 0 -1.0 2.0 0; 0 2.0 -1.0 0; 0 0 0 1.0;";
> 
> #define __FUNCT__ "main"
> int main(int argc, char **argv)
> {
>    Mat H;
>    EPS eps;
>    Vec xr,xi;
>    PetscScalar kr,ki;
>    PetscInt j, nconv;
> 
>    L = strtol(argv[1],NULL,10);
>    dsys = pow(d,L);
>    printf("%s","System Size: ");
>    printf("%i",L);
>    printf("%s","\n");
>    SlepcInitialize(&argc,&argv,(char*)0,help);
> 
>    MatCreateShell(PETSC_COMM_WORLD,dsys,dsys,dsys,dsys,NULL,&H);
>    MatShellSetOperation(H,MATOP_MULT,(void(*)())BathMult);
>    ierr = MatGetVecs(H,NULL,&xr); CHKERRQ(ierr);
>    ierr = MatGetVecs(H,NULL,&xi); CHKERRQ(ierr);
> 
>    ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr);
>    ierr = EPSSetOperators(eps, H, NULL); CHKERRQ(ierr);
>    ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr);
>    ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); CHKERRQ(ierr);
>    ierr = EPSSetFromOptions( eps ); CHKERRQ(ierr);
>    ierr = EPSSolve(eps); CHKERRQ(ierr);
>    ierr = EPSGetConverged(eps, &nconv); CHKERRQ(ierr);
>    for (j=0; j<1; j++) {
>      EPSGetEigenpair(eps, j, &kr, &ki, xr, xi);
>      printf("%s","Lowest Eigenvalue: ");
>      PetscPrintf(PETSC_COMM_WORLD,"%9F",kr);
>      PetscPrintf(PETSC_COMM_WORLD,"\n");
>    }
>    EPSDestroy(&eps);
> 
>    ierr = SlepcFinalize();
>    return 0;
> }
> #undef __FUNCT__
> 
> #define __FUNCT__ "BathMult"
> PetscErrorCode BathMult(Mat H, Vec x, Vec y)
> {
>    PetscInt l;
>    uint slice;
>    PetscScalar *arrayin,*arrayout;
> 
>    VecGetArray(x,&arrayin);
>    VecGetArray(y,&arrayout);
>    arma::cube A = arma::cube(arrayin,1,1,pow(d,L),
>        /*copy_aux_mem*/false,/*strict*/true);
>    arma::mat result = arma::mat(arrayout,pow(d,L),1,
>        /*copy_aux_mem*/false,/*strict*/true);
>    for (l=0;l<L-1;l++){
>      A.reshape(pow(d,L-2-l),pow(d,2),pow(d,l));
>      result.reshape(pow(d,L-l),pow(d,l));
>      for (slice=0;slice<A.n_slices;slice++){
>        result.col(slice) += vectorise(A.slice(slice)*hint);
>      }
>    }
>    arrayin = A.memptr();
>    ierr = VecRestoreArray(x,&arrayin); CHKERRQ(ierr);
>    arrayout = result.memptr();
>    ierr = VecRestoreArray(y,&arrayout); CHKERRQ(ierr);
>    PetscFunctionReturn(0);
> }
> #undef __FUNCT__/*
> *
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20140626/1efdc8a1/attachment.html>


More information about the petsc-dev mailing list