<div dir="ltr">Anthony,<div>I just patched the petsc-maint branch.</div><div><br></div><div>Your matrix Amat_binary.m has empty diagonal blocks. Most PETSc solvers require the matrix diagonal entries to be allocated as 'non-zero', i.e., explicit zero values must be inserted at these entries. I suggest you add zeros to the diagonal of Amat_binary.m during its buildup. This would enable the PETSc solvers, as well as other packages.</div><div><br></div><div>Again, thanks for reporting the bug.</div><div><br></div><div>Hong</div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 11, 2015 at 1:33 PM, Satish Balay <span dir="ltr"><<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">yes - the patch will be in petsc 3.6.2.<br>
<br>
However - you can grab the patch right now and start using it.<br>
<br>
If using a 3.6.1 tarball - you can download the (raw) patch from<br>
the URL below and apply with:<br>
<br>
cd petsc-3.6.1<br>
patch -Np1 < patchfile<br>
<br>
If using a git clone - you can do:<br>
<br>
git fetch<br>
git checkout ceeba3afeff0c18262ed13ef92e2508ca68b0ecf<br>
<br>
Satish<br>
<span class=""><br>
On Tue, 11 Aug 2015, Anthony Haas wrote:<br>
<br>
> Hi Hong,<br>
><br>
> Sorry for my late reply and thanks for the fix. Does that mean that I will be<br>
> able to run that matrix on 10 procs in the future (petsc 3.6.2?)?<br>
><br>
> Thanks<br>
><br>
> Anthony<br>
><br>
><br>
> On 08/11/2015 09:58 AM, Hong wrote:<br>
> > Anthony,<br>
> > I pushed a fix<br>
> > <a href="https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf" rel="noreferrer" target="_blank">https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf</a><br>
> ><br>
> > Once it passes our nightly tests, I'll merge it to petsc-maint, then<br>
> > petsc-dev.<br>
> > Thanks for reporting it!<br>
> ><br>
> > Hong<br>
> ><br>
> > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a><br>
</span><span class="">> > <mailto:<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>>> wrote:<br>
> ><br>
> ><br>
> > Anthony,<br>
> ><br>
> > This crash is in PETSc code before it calls the SuperLU_DIST<br>
> > numeric factorization; likely we have a mistake such as assuming a<br>
> > process has at least one row of the matrix and need to fix it.<br>
> ><br>
> > Barry<br>
> ><br>
> > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, A=0x14a6a70, info=0x19099f8)<br>
> > > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368<br>
> > > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */<br>
> ><br>
> ><br>
> ><br>
> > > On Aug 10, 2015, at 3:50 PM, Anthony Haas <<a href="mailto:aph@email.arizona.edu">aph@email.arizona.edu</a><br>
</span><div><div class="h5">> > <mailto:<a href="mailto:aph@email.arizona.edu">aph@email.arizona.edu</a>>> wrote:<br>
> > ><br>
> > > Hi Sherry,<br>
> > ><br>
> > > I recently submitted a matrix for which I noticed that<br>
> > Superlu_dist was hanging when running on 4 processors with<br>
> > parallel symbolic factorization. I have been using the latest<br>
> > version of Superlu_dist and the code is not hanging anymore.<br>
> > However, I noticed that when running the same matrix (I have<br>
> > attached the matrix), the code crashes with the following SIGSEGV<br>
> > when running on 10 procs (with or without parallel symbolic<br>
> > factorization). It is probably overkill to run such a 'small'<br>
> > matrix on 10 procs but I thought that it might still be useful to<br>
> > report the problem?? See below for the error obtained when running<br>
> > with gdb and also a code snippet to reproduce the error.<br>
> > ><br>
> > > Thanks,<br>
> > ><br>
> > ><br>
> > > Anthony<br>
> > ><br>
> > ><br>
> > ><br>
> > > 1) ERROR in GDB<br>
> > ><br>
> > > Program received signal SIGSEGV, Segmentation fault.<br>
> > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, A=0x14a6a70, info=0x19099f8)<br>
> > > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368<br>
> > > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */<br>
> > > (gdb)<br>
> > ><br>
> > ><br>
> > ><br>
> > > 2) PORTION OF CODE TO REPRODUCE ERROR<br>
> > ><br>
> > > Subroutine HowBigLUCanBe(rank)<br>
> > ><br>
> > > IMPLICIT NONE<br>
> > ><br>
> > > integer(i4b),intent(in) :: rank<br>
> > > integer(i4b) :: i,ct<br>
> > > real(dp) :: begin,endd<br>
> > > complex(dpc) :: sigma<br>
> > ><br>
> > > PetscErrorCode ierr<br>
> > ><br>
> > ><br>
> > > if (rank==0) call cpu_time(begin)<br>
> > ><br>
> > > if (rank==0) then<br>
> > > write(*,*)<br>
> > > write(*,*)'Testing How Big LU Can Be...'<br>
> > > write(*,*)'============================'<br>
> > > write(*,*)<br>
> > > endif<br>
> > ><br>
> > > !sigma = (1.0d0,0.0d0)<br>
> > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit A = A-sigma*B<br>
> > ><br>
> > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
> > ><br>
> > > !.....Write Matrix to ASCII and Binary Format<br>
> > > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr)<br>
> > > !call MatView(DXX,viewer,ierr)<br>
> > > !call PetscViewerDestroy(viewer,ierr)<br>
> > ><br>
> > > !call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr)<br>
> > > !call MatView(A,viewer,ierr)<br>
> > > !call PetscViewerDestroy(viewer,ierr)<br>
> > ><br>
> > > !...Load a Matrix in Binary Format<br>
> > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr)<br>
> > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr)<br>
> > > call MatSetType(DLOAD,MATAIJ,ierr)<br>
> > > call MatLoad(DLOAD,viewer,ierr)<br>
> > > call PetscViewerDestroy(viewer,ierr)<br>
> > ><br>
> > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
> > ><br>
> > ><br>
> > > !.....Create Linear Solver Context<br>
> > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)<br>
> > ><br>
> > > !.....Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix.<br>
> > > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line<br>
> > ><br>
> > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B<br>
> > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = A-sigma*B<br>
> > ><br>
> > > !.....Set Relative and Absolute Tolerances and Uses Default for Divergence Tol<br>
> > > tol = 1.e-10<br>
> > > call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr)<br>
> > ><br>
> > > !.....Set the Direct (LU) Solver<br>
> > > call KSPSetType(ksp,KSPPREONLY,ierr)<br>
> > > call KSPGetPC(ksp,pc,ierr)<br>
> > > call PCSetType(pc,PCLU,ierr)<br>
> > > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS<br>
> > ><br>
> > > !.....Create Right-Hand-Side Vector<br>
> > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr)<br>
> > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr)<br>
> > ><br>
> > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr)<br>
> > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr)<br>
> > ><br>
> > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr)<br>
> > ><br>
> > > allocate(xwork1(IendA-IstartA))<br>
> > > allocate(loc(IendA-IstartA))<br>
> > ><br>
> > > ct=0<br>
> > > do i=IstartA,IendA-1<br>
> > > ct=ct+1<br>
> > > loc(ct)=i<br>
> > > xwork1(ct)=(1.0d0,0.0d0)<br>
> > > enddo<br>
> > ><br>
> > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr)<br>
> > > call VecZeroEntries(sol,ierr)<br>
> > ><br>
> > > deallocate(xwork1,loc)<br>
> > ><br>
> > > !.....Assemble Vectors<br>
> > > call VecAssemblyBegin(frhs,ierr)<br>
> > > call VecAssemblyEnd(frhs,ierr)<br>
> > ><br>
> > > !.....Solve the Linear System<br>
> > > call KSPSolve(ksp,frhs,sol,ierr)<br>
> > ><br>
> > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr)<br>
> > ><br>
> > > if (rank==0) then<br>
> > > call cpu_time(endd)<br>
> > > write(*,*)<br>
> > > print '("Total time for HowBigLUCanBe = ",f21.3," seconds.")',endd-begin<br>
> > > endif<br>
> > ><br>
> > > call SlepcFinalize(ierr)<br>
> > ><br>
> > > STOP<br>
> > ><br>
> > ><br>
> > > end Subroutine HowBigLUCanBe<br>
> > ><br>
</div></div>> > > <<a href="http://Amat_binary.m.info" rel="noreferrer" target="_blank">Amat_binary.m.info</a> <<a href="http://Amat_binary.m.info" rel="noreferrer" target="_blank">http://Amat_binary.m.info</a>>><br>
> ><br>
> ><br>
><br>
><br>
<br>
</blockquote></div><br></div></div></div>
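Hong's suggestion at the top of the thread (allocate every diagonal entry during matrix buildup, inserting an explicit 0.0 where the assembly would otherwise leave the entry empty) can be sketched as follows. This Python snippet is only an illustration of the idea, not PETSc's API; the function name and the triplet layout are invented here. In actual PETSc code one would make the equivalent MatSetValues calls with a 0.0 value for each owned diagonal entry during assembly, before MatAssemblyBegin/End.

```python
def assemble_with_diagonal(entries, n):
    """Given assembled triplets {(row, col): value} for an n x n matrix,
    insert an explicit 0.0 for any diagonal entry (i, i) that the
    assembly left unallocated, and return the sorted triplet list."""
    coo = dict(entries)
    for i in range(n):
        # setdefault only adds (i, i) when it is missing, so existing
        # diagonal values are never overwritten
        coo.setdefault((i, i), 0.0)
    return sorted(coo.items())

# a 3x3 matrix whose assembly skipped the (1,1) and (2,2) diagonal entries
entries = {(0, 0): 2.0, (0, 1): -1.0, (1, 2): 3.0, (2, 0): 5.0}
triplets = assemble_with_diagonal(entries, 3)
```

After this pass the diagonal entries (1,1) and (2,2) exist as structural nonzeros with value 0.0, which is what solvers that index the diagonal (such as the SuperLU_DIST interface in the traceback above) expect to find.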