<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Hong,<br>
      <br>
      Sorry for my late reply and thanks for the fix. Does that mean
      that I will be able to run that matrix on 10 procs in the future
      (petsc 3.6.2?)?<br>
      <br>
      Thanks<br>
      <br>
      Anthony<br>
      <br>
      <br>
      On 08/11/2015 09:58 AM, Hong wrote:<br>
    </div>
    <blockquote
cite="mid:CAGCphBtMUJQ4XsayVCv5jFJ9=m3iwZTder2Y+hr7xXypRGknNQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">Anthony,
        <div>I pushed a fix</div>
        <div><a moz-do-not-send="true"
href="https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf">https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf</a></div>
        <div><br>
        </div>
        <div>Once it passes our nightly tests, I'll merge it to
          petsc-maint, then petsc-dev.</div>
        <div>Thanks for reporting it!</div>
        <div><br>
        </div>
        <div>Hong<br>
          <div class="gmail_extra"><br>
            <div class="gmail_quote">On Mon, Aug 10, 2015 at 4:27 PM,
              Barry Smith <span dir="ltr"><<a moz-do-not-send="true"
                  href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
                  Anthony,<br>
                <br>
                   This crash is in PETSc code before it calls the
                SuperLU_DIST numeric factorization; likely we have a
                mistake such as assuming a process has at least one row
                of the matrix and need to fix it.<br>
                <br>
                   Barry<br>
                <br>
                > 0x00007fe6ba609297 in
                MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50,<br>
                >    A=0x14a6a70, info=0x19099f8)<br>
                >    at
/home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368<br>
                > 368           colA_start = rstart + ajj[0]; /* the
                smallest global col index of A */<br>
                <br>
                <br>
                <br>
                > On Aug 10, 2015, at 3:50 PM, Anthony Haas <<a
                  moz-do-not-send="true"
                  href="mailto:aph@email.arizona.edu">aph@email.arizona.edu</a>>
                wrote:<br>
                ><br>
                > Hi Sherry,<br>
                ><br>
                > I recently submitted a matrix for which I noticed
                that Superlu_dist was hanging when running on 4
                processors with parallel symbolic factorization. I have
                been using the latest version of Superlu_dist and the
                code is not hanging anymore. However, I noticed that
                when running the same matrix (I have attached the
                matrix), the code crashes with the following SIGSEGV
                when running on 10 procs (with or without parallel
                symbolic factorization). It is probably overkill to run
                such a 'small' matrix on 10 procs but I thought that it
                might still be useful to report the problem?? See below
                for the error obtained when running with gdb and also a
                code snippet to reproduce the error.<br>
                ><br>
                > Thanks,<br>
                ><br>
                ><br>
                > Anthony<br>
                ><br>
                ><br>
                ><br>
                > 1) ERROR in GDB<br>
                ><br>
                > Program received signal SIGSEGV, Segmentation
                fault.<br>
                > 0x00007fe6ba609297 in
                MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50,<br>
                >    A=0x14a6a70, info=0x19099f8)<br>
                >    at
/home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368<br>
                > 368           colA_start = rstart + ajj[0]; /* the
                smallest global col index of A */<br>
                > (gdb)<br>
                ><br>
                ><br>
                ><br>
                > 2) PORTION OF CODE TO REPRODUCE ERROR<br>
                ><br>
                >    Subroutine HowBigLUCanBe(rank)<br>
                ><br>
                >      IMPLICIT NONE<br>
                ><br>
                >      integer(i4b),intent(in) :: rank<br>
                >      integer(i4b)            :: i,ct<br>
                >      real(dp)                :: begin,endd<br>
                >      complex(dpc)            :: sigma<br>
                ><br>
                >      PetscErrorCode ierr<br>
                ><br>
                ><br>
                >      if (rank==0) call cpu_time(begin)<br>
                ><br>
                >      if (rank==0) then<br>
                >         write(*,*)<br>
                >         write(*,*)'Testing How Big LU Can Be...'<br>
                >         write(*,*)'============================'<br>
                >         write(*,*)<br>
                >      endif<br>
                ><br>
                >      !sigma = (1.0d0,0.0d0)<br>
                >      !call
                MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on
                exit A = A-sigma*B<br>
                ><br>
                >      !call
                MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
                ><br>
                > !.....Write Matrix to ASCII and Binary Format<br>
                >      !call
                PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr)<br>
                >      !call MatView(DXX,viewer,ierr)<br>
                >      !call PetscViewerDestroy(viewer,ierr)<br>
                ><br>
                >      !call
PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr)<br>
                >      !call MatView(A,viewer,ierr)<br>
                >      !call PetscViewerDestroy(viewer,ierr)<br>
                ><br>
                > !...Load a Matrix in Binary Format<br>
                >      call
PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr)<br>
                >      call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr)<br>
                >      call MatSetType(DLOAD,MATAIJ,ierr)<br>
                >      call MatLoad(DLOAD,viewer,ierr)<br>
                >      call PetscViewerDestroy(viewer,ierr)<br>
                ><br>
                >      !call
                MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
                ><br>
                ><br>
                > !.....Create Linear Solver Context<br>
                >      call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)<br>
                ><br>
                > !.....Set operators. Here the matrix that defines
                the linear system also serves as the preconditioning
                matrix.<br>
                >      !call
                KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
                !aha commented and replaced by next line<br>
                ><br>
                >      !call KSPSetOperators(ksp,A,A,ierr) !
                remember: here A = A-sigma*B<br>
                >      call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) !
                remember: here A = A-sigma*B<br>
                ><br>
                > !.....Set Relative and Absolute Tolerances and Uses
                Default for Divergence Tol<br>
                >      tol = 1.e-10<br>
                >      call
KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr)<br>
                ><br>
                > !.....Set the Direct (LU) Solver<br>
                >      call KSPSetType(ksp,KSPPREONLY,ierr)<br>
                >      call KSPGetPC(ksp,pc,ierr)<br>
                >      call PCSetType(pc,PCLU,ierr)<br>
                >      call
                PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr)
                ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS<br>
                ><br>
                > !.....Create Right-Hand-Side Vector<br>
                >      !call
                MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr)<br>
                >      !call
                MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr)<br>
                ><br>
                >      call
                MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr)<br>
                >      call
                MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr)<br>
                ><br>
                >      call
                MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr)<br>
                ><br>
                >      allocate(xwork1(IendA-IstartA))<br>
                >      allocate(loc(IendA-IstartA))<br>
                ><br>
                >      ct=0<br>
                >      do i=IstartA,IendA-1<br>
                >         ct=ct+1<br>
                >         loc(ct)=i<br>
                >         xwork1(ct)=(1.0d0,0.0d0)<br>
                >      enddo<br>
                ><br>
                >      call
                VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr)<br>
                >      call VecZeroEntries(sol,ierr)<br>
                ><br>
                >      deallocate(xwork1,loc)<br>
                ><br>
                > !.....Assemble Vectors<br>
                >      call VecAssemblyBegin(frhs,ierr)<br>
                >      call VecAssemblyEnd(frhs,ierr)<br>
                ><br>
                > !.....Solve the Linear System<br>
                >      call KSPSolve(ksp,frhs,sol,ierr)<br>
                ><br>
                >      !call
                VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
                ><br>
                >      if (rank==0) then<br>
                >         call cpu_time(endd)<br>
                >         write(*,*)<br>
                >         print '("Total time for HowBigLUCanBe =
                ",f21.3," seconds.")',endd-begin<br>
                >      endif<br>
                ><br>
                >      call SlepcFinalize(ierr)<br>
                ><br>
                >      STOP<br>
                ><br>
                ><br>
                >    end Subroutine HowBigLUCanBe<br>
                ><br>
                > <<a moz-do-not-send="true"
                  href="http://Amat_binary.m.info" rel="noreferrer"
                  target="_blank">Amat_binary.m.info</a>><br>
                <br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>