From dominik at itis.ethz.ch Mon Jan 2 08:10:50 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 2 Jan 2012 15:10:50 +0100 Subject: [petsc-users] MatGetRowMinAbs - argument is not optional Message-ID: According to: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetRowMinAbs.html the 3rd arg should be optional, but if it is omitted, a compile error occurs (too few arguments...). The same happens with MatGetRowMaxAbs. Dominik From dominik at itis.ethz.ch Mon Jan 2 08:33:51 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 2 Jan 2012 15:33:51 +0100 Subject: [petsc-users] Problems with MatGetRowMinAbs Message-ID: I am doing something as simple as: PetscReal mmin = 0.0; Vec vmin = 0; PetscInt msize = 0, nsize = 0; ierr = MatGetSize(Av, &msize, &nsize); CHKERRQ(ierr); ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, msize, &vmin); CHKERRQ(ierr); ierr = MatGetRowMinAbs(Av, vmin, PETSC_NULL); CHKERRQ(ierr); or in words, setting the size for the vector storing min values in Av's rows to the actual number of rows, and I am surprised to get: [4]PETSC ERROR: Nonconforming object sizes! [4]PETSC ERROR: Nonconforming matrix and vector! pointing to the line with MatGetRowMinAbs. Values in msize and nsize are what they are expected to be, the matrix is assembled. What am I doing wrong here? Thanks Dominik From dominik at itis.ethz.ch Mon Jan 2 08:35:00 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 2 Jan 2012 15:35:00 +0100 Subject: [petsc-users] MatGetRowMinAbs - argument is not optional In-Reply-To: References: Message-ID: I figure "optional" here was supposed to be PETSC_NULL. On Mon, Jan 2, 2012 at 3:10 PM, Dominik Szczerba wrote: > According to: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetRowMinAbs.html > > the 3rd arg should be optional, but if it is omitted, a compile error > occurs (too few arguments...). The same happens with MatGetRowMaxAbs. > > Dominik From dominik at itis.ethz.ch Mon Jan 2 08:42:20 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 2 Jan 2012 15:42:20 +0100 Subject: [petsc-users] Problems with MatGetRowMinAbs In-Reply-To: References: Message-ID: I figured it out myself: the local size must also match. Dominik On Mon, Jan 2, 2012 at 3:33 PM, Dominik Szczerba wrote: > I am doing something as simple as: > > PetscReal mmin = 0.0; > Vec vmin = 0; > PetscInt msize = 0, nsize = 0; > ierr = MatGetSize(Av, &msize, &nsize); CHKERRQ(ierr); > ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, msize, &vmin); > CHKERRQ(ierr); > ierr = MatGetRowMinAbs(Av, vmin, PETSC_NULL); CHKERRQ(ierr); > > or in words, setting the size for the vector storing min values in > Av's rows to the actual number of rows, and I am surprised to get: > > [4]PETSC ERROR: Nonconforming object sizes! > [4]PETSC ERROR: Nonconforming matrix and vector! > > pointing to the line with MatGetRowMinAbs. Values in msize and nsize > are what they are expected to be, the matrix is assembled. What am I > doing wrong here? > > Thanks > Dominik From jedbrown at mcs.anl.gov Mon Jan 2 13:18:14 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 2 Jan 2012 13:18:14 -0600 Subject: [petsc-users] MatGetRowMinAbs - argument is not optional In-Reply-To: References: Message-ID: On Mon, Jan 2, 2012 at 08:35, Dominik Szczerba wrote: > I figure "optional" here was supposed to be PETSC_NULL.
yes, I updated the docs for petsc-dev to clarify -------------- next part -------------- An HTML attachment was scrubbed... URL: From rm93 at buffalo.edu Mon Jan 2 17:16:00 2012 From: rm93 at buffalo.edu (Reza Madankan) Date: Mon, 2 Jan 2012 18:16:00 -0500 Subject: [petsc-users] multiplication of large matrices using MatMatMult Message-ID: Hello everyone; I am trying to compose a covariance matrix which is of size 72576 x 72576, by multiplication of a vector and its transpose, i.e. MatMatMult(Ypcq,YpcqT,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&InnProd) where, Ypcq is a vector with 72576 elements and YpcqT is its transpose. Unfortunately Petsc returns out of memory message while running the code: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 0 Memory used by process 15732801536 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. [0]PETSC ERROR: Memory requested 18446744068420534272! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 10:28:33 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./dec5pbst23 on a linux-imp named k07n14.ccr.buffalo.edu by rm93 Mon Jan 2 18:10:28 2012 [0]PETSC ERROR: Libraries linked from /util/petsc/petsc-3.2-p3/linux-impi-mkl/lib [0]PETSC ERROR: Configure run at Fri Oct 21 08:36:23 2011 [0]PETSC ERROR: Configure options --CC=/util/intel/impi/ 4.0.3.008/intel64/bin/mpiicc --FC=/util/intel/impi/ 4.0.3.008/intel64/bin/mpiifort --CXX=/util/intel/impi/ 4.0.3.008/intel64/bin/mpiicpc--with-blas-lapack-dir=/util/intel/composer_xe_2011_sp1/mkl/lib/intel64 --download-hypre=1 --with-debugging=0 -PETSC_ARCH=linux-impi-mkl --with-shared-libraries=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 49 in src/sys/memory/mal.c [0]PETSC ERROR: PetscFreeSpaceGet() line 13 in src/mat/utils/freespace.c [0]PETSC ERROR: MatMatMultSymbolic_SeqAIJ_SeqAIJ() line 76 in src/mat/impls/aij/seq/matmatmult.c [0]PETSC ERROR: MatMatMult_SeqAIJ_SeqAIJ() line 21 in src/mat/impls/aij/seq/matmatmult.c [0]PETSC ERROR: MatMatMult() line 8246 in src/mat/interface/matrix.c [0]PETSC ERROR: main() line 331 in "unknowndirectory/"dec5pbst23.c application called MPI_Abort(MPI_COMM_WORLD, 55) - process 0 Is there any way to get rid of this error? Thanks in advance, Reza -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 2 17:24:44 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 2 Jan 2012 17:24:44 -0600 Subject: [petsc-users] multiplication of large matrices using MatMatMult In-Reply-To: References: Message-ID: On Mon, Jan 2, 2012 at 17:16, Reza Madankan wrote: > I am trying to compose a covariance matrix which is of size 72576 x 72576, > by multiplication of a vector and its transpose, i.e. > > MatMatMult(Ypcq,YpcqT,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&InnProd) > > where, Ypcq is a vector with 72576 elements and YpcqT is its transpose. 
> This matrix is dense, right? Would you consider something less than the full covariance matrix, such as a low rank approximation? > Unfortunately Petsc returns out of memory message while running the code: > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 0 Memory used by process 15732801536 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 18446744068420534272! > These huge numbers indicate memory corruption somewhere. I suggest running a smaller problem size under Valgrind. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rm93 at buffalo.edu Mon Jan 2 17:45:56 2012 From: rm93 at buffalo.edu (Reza Madankan) Date: Mon, 2 Jan 2012 18:45:56 -0500 Subject: [petsc-users] multiplication of large matrices using MatMatMult In-Reply-To: References: Message-ID: Yes, the matrix is dense. So, you mean that there is no way to evaluate the matrix, exactly? .... I have run the code for smaller size of Ypcq and it's working. But for larger sizes I get the error that I mentioned. On Jan 2, 2012 6:24 PM, "Jed Brown" wrote: > On Mon, Jan 2, 2012 at 17:16, Reza Madankan wrote: > >> I am trying to compose a covariance matrix which is of size 72576 x >> 72576, by multiplication of a vector and its transpose, i.e. >> >> MatMatMult(Ypcq,YpcqT,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&InnProd) >> >> where, Ypcq is a vector with 72576 elements and YpcqT is its transpose. >> > > This matrix is dense, right? Would you consider something less than the > full covariance matrix, such as a low rank approximation? > > >> Unfortunately Petsc returns out of memory message while running the code: >> >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Out of memory. This could be due to allocating >> [0]PETSC ERROR: too large an object or bleeding by not properly >> [0]PETSC ERROR: destroying unneeded objects. >> [0]PETSC ERROR: Memory allocated 0 Memory used by process 15732801536 >> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. >> [0]PETSC ERROR: Memory requested 18446744068420534272! >> > > These huge numbers indicate memory corruption somewhere. I suggest running > a smaller problem size under Valgrind. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 2 17:51:01 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 2 Jan 2012 17:51:01 -0600 Subject: [petsc-users] multiplication of large matrices using MatMatMult In-Reply-To: References: Message-ID: On Mon, Jan 2, 2012 at 17:45, Reza Madankan wrote: > Yes, the matrix is dense. So, you mean that there is no way to evaluate > the matrix, exactly? You can evaluate it, but there is almost always a way to get the "science" result without computing all the entries. Such an algorithm would be faster. > .... I have run the code for smaller size of Ypcq and it's working. But > for larger sizes I get the error that I mentioned. It's possible that the corruption is caused by some integer overflow and not from actually running out of memory, but it's definitely worth checking that smaller sizes do not have some small error that is detectable by Valgrind (e.g. 
an uninitialized variable that happens to be equal to 0 at small sizes, or an off-by-one error that does not cause problems for small sizes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmareddy84 at gmail.com Mon Jan 2 22:21:43 2012 From: dharmareddy84 at gmail.com (Dharmendar Reddy) Date: Mon, 2 Jan 2012 22:21:43 -0600 Subject: [petsc-users] slepc eigenvectors Message-ID: Hello, I have a query regarding the eigenvectors computed in slepc. I am solving a genralized eigenvalue problem. I have attached the A and B matrices with this email. If i run slepc solver with default options are arpack, i get one set of vectors (complex) as solution. If i run with eps_type lapack I get real vectors. A is hermitian, and B is positive definite. ( the actual problem is a schrodinger equation for particle in infinite potential well, so the solution will be of the form sin(x)). I check the solution in matlab using eig(A,B) i get real vectors. Looks like there is some unitary transformation involved here, can you tell me what could be going on. i copy a small portion of the eigen vector of the lowest magnitude eigenvlaue (=0.0887) ---Method: (slepc and eps_type lapack) or matlab----- (-0.101596582735892,0.000000000000000E+000) (-0.200421875537261,0.000000000000000E+000) (-0.293780182034781,0.000000000000000E+000) (-0.379124930994127,0.000000000000000E+000) ... ... ... (-0.293780182033444,0.000000000000000E+000) (-0.200421875536298,0.000000000000000E+000) (-0.101596582735387,0.000000000000000E+000) ------------------------------------------------------------------------------ ---Method: (slepc and eps_type defualt or arpack) ---- (5.602609025416389E-002,8.475224384072830E-002) (0.110523934800485,0.167192667375096) (0.162006974547097,0.245072510835553) (0.209070886310831,0.316267414979582) (0.250431889351034,0.378835368586700) (0.284961763219882,0.431069680779720) (0.311718623092706,0.471545535910556) (0.329972611445050,0.499158857936955) (0.339225807211469,0.513156427631836) (0.339225807166595,0.513156427588630) (0.329972611486755,0.499158857980068) (0.311718623054404,0.471545535864886) (0.284961763251251,0.431069680822535) (0.250431889322221,0.378835368543795) (0.209070886332945,0.316267415014661) (0.162006974528570,0.245072510805346) (0.110523934811968,0.167192667394530) (5.602609024797538E-002,8.475224382992022E-002) -- ----------------------------------------------------- Dharmendar Reddy Palle Graduate Student Microelectronics Research center, University of Texas at Austin, 10100 Burnet Road, Bldg. 160 MER 2.608F, TX 78758-4445 e-mail: dharmareddy84 at gmail.com Phone: +1-512-350-9082 United States of America. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Amat.m Type: application/octet-stream Size: 3044 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Bmat.m Type: application/octet-stream Size: 3010 bytes Desc: not available URL: From jroman at dsic.upv.es Tue Jan 3 01:59:45 2012 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 3 Jan 2012 08:59:45 +0100 Subject: [petsc-users] slepc eigenvectors In-Reply-To: References: Message-ID: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> On 03/01/2012, Dharmendar Reddy wrote: > Hello, > I have a query regarding the eigenvectors computed in slepc. I am solving a genralized eigenvalue problem. 
I have attached the A and B matrices with this email. If i run slepc solver with default options are arpack, i get one set of vectors (complex) as solution. If i run with eps_type lapack I get real vectors. A is hermitian, and B is positive definite. ( the actual problem is a schrodinger equation for particle in infinite potential well, so the solution will be of the form sin(x)). I check the solution in matlab using eig(A,B) i get real vectors. Looks like there is some unitary transformation involved here, can you tell me what could be going on. > > i copy a small portion of the eigen vector of the lowest magnitude eigenvlaue (=0.0887) > ---Method: (slepc and eps_type lapack) or matlab----- > (-0.101596582735892,0.000000000000000E+000) > (-0.200421875537261,0.000000000000000E+000) > (-0.293780182034781,0.000000000000000E+000) > (-0.379124930994127,0.000000000000000E+000) > ... > ... > ... > (-0.293780182033444,0.000000000000000E+000) > (-0.200421875536298,0.000000000000000E+000) > (-0.101596582735387,0.000000000000000E+000) > ------------------------------------------------------------------------------ > ---Method: (slepc and eps_type defualt or arpack) ---- > > > (5.602609025416389E-002,8.475224384072830E-002) > (0.110523934800485,0.167192667375096) (0.162006974547097,0.245072510835553) > (0.209070886310831,0.316267414979582) (0.250431889351034,0.378835368586700) > (0.284961763219882,0.431069680779720) (0.311718623092706,0.471545535910556) > (0.329972611445050,0.499158857936955) (0.339225807211469,0.513156427631836) > (0.339225807166595,0.513156427588630) (0.329972611486755,0.499158857980068) > (0.311718623054404,0.471545535864886) (0.284961763251251,0.431069680822535) > (0.250431889322221,0.378835368543795) (0.209070886332945,0.316267415014661) > (0.162006974528570,0.245072510805346) (0.110523934811968,0.167192667394530) > (5.602609024797538E-002,8.475224382992022E-002) I cannot reproduce the problem. I always get the correct eigenvector. Are you doing the computation in real arithmetic? Are you setting the problem type to EPS_GHEP? Jose From agrayver at gfz-potsdam.de Tue Jan 3 03:22:42 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Tue, 03 Jan 2012 10:22:42 +0100 Subject: [petsc-users] Differential operators in PETSc Message-ID: <4F02C8E2.1090803@gfz-potsdam.de> Hello, I want to implement general procedures to generate discretized differential operators (like grad,div,curl,laplacian) using PETSc and staggered structured grids. Before I start, I am wondering if there is something similar in PETSc already? 
Regards, Alexander From Johannes.Huber at unibas.ch Tue Jan 3 04:11:23 2012 From: Johannes.Huber at unibas.ch (Johannes.Huber at unibas.ch) Date: Tue, 03 Jan 2012 11:11:23 +0100 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView Message-ID: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> Hi all, I have the following code snippet: { static int iCall(0); Vec v = m_State->vec(); // Get the Petsc vector from a libmesh PetscVector VecAssemblyBegin(v); VecAssemblyEnd(v); // Assembly works fine PetscViewer File; PetscViewerCreate(MPI_COMM_WORLD,&File); char Filename[32]; sprintf(Filename,"State_%03d.m",iCall++); PetscViewerASCIIOpen(MPI_COMM_WORLD,Filename,&File); PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_MATLAB); VecView(v,File); PetscViewerDestroy(&File); // File is created and looks good VecAssemblyBegin(v); // Here it crashes VecAssemblyEnd(v); } When running this code on two processes, I receive the error message: [1]PETSC ERROR: VecAssemblyBegin_MPI() line 1010 in src/vec/vec/impls/mpi/pdvec.c [1]PETSC ERROR: VecAssemblyBegin() line 219 in src/vec/vec/interface/vector.c Taking a look in the mentioned file there is the following code: ierr = MPI_Allreduce(&xin->stash.insertmode,&addv,1,MPI_INT,MPI_BOR,comm);CHKERRQ(ierr); if (addv == (ADD_VALUES|INSERT_VALUES)) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_NOTSAMETYPE,"Some processors inserted values while others added"); What am I doing wrong? Thanks in advance, Hannes ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From jedbrown at mcs.anl.gov Tue Jan 3 06:28:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 06:28:35 -0600 Subject: [petsc-users] Differential operators in PETSc In-Reply-To: <4F02C8E2.1090803@gfz-potsdam.de> References: <4F02C8E2.1090803@gfz-potsdam.de> Message-ID: On Tue, Jan 3, 2012 at 03:22, Alexander Grayver wrote: > I want to implement general procedures to generate discretized > differential operators (like grad,div,curl,laplacian) using PETSc and > staggered structured grids. > Before I start, I am wondering if there is something similar in PETSc > already? > PETSc does not have specific support for staggered grids at this time. There are a few ways to do it, but none are ideal and we have a proposal pending in which one item is to improve staggered grid support (but we won't hear back for several months). If you want to do this, it would be useful to discuss how to manage staggered grids. The petsc-dev list would be the best place for those discussions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 3 06:34:30 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 06:34:30 -0600 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> Message-ID: On Tue, Jan 3, 2012 at 04:11, wrote: > I have the following code snippet: > > { > static int iCall(0); > Vec v = m_State->vec(); // Get the Petsc vector from a libmesh > PetscVector > The problem is earlier, where you call VecSetValues() with ADD_VALUES in one place and with INSERT_VALUES in another place, perhaps through overloaded operators. 
> VecAssemblyBegin(v); > VecAssemblyEnd(v); > // Assembly works fine > PetscViewer File; > PetscViewerCreate(MPI_COMM_**WORLD,&File); > char Filename[32]; > sprintf(Filename,"State_%03d.**m",iCall++); > PetscViewerASCIIOpen(MPI_COMM_**WORLD,Filename,&File); > PetscViewerSetFormat(File,**PETSC_VIEWER_ASCII_MATLAB); > VecView(v,File); > PetscViewerDestroy(&File); > // File is created and looks good > VecAssemblyBegin(v); // Here it crashes > VecAssemblyEnd(v); > } > > When running this code on two processes, I receive the error message: > [1]PETSC ERROR: VecAssemblyBegin_MPI() line 1010 in > src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: VecAssemblyBegin() line 219 in > src/vec/vec/interface/vector.c > Is this the *whole* error message? It should have printed quite a bit more than just this. > > Taking a look in the mentioned file there is the following code: > ierr = MPI_Allreduce(&xin->stash.**insertmode,&addv,1,MPI_INT,** > MPI_BOR,comm);CHKERRQ(ierr); > if (addv == (ADD_VALUES|INSERT_VALUES)) SETERRQ(PETSC_COMM_SELF,PETSC_**ERR_ARG_NOTSAMETYPE,"Some > processors inserted values while others added"); > > What am I doing wrong? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Johannes.Huber at unibas.ch Tue Jan 3 06:41:29 2012 From: Johannes.Huber at unibas.ch (Johannes.Huber at unibas.ch) Date: Tue, 03 Jan 2012 13:41:29 +0100 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> Message-ID: <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> Hi Jed, thanks for your answer. The first assembly works well, and I would agree, if the first assmebly crashed. However, it's the second assembly call and in between those two calls, all I'm doing is viewing the vector. Quoting Jed Brown : > On Tue, Jan 3, 2012 at 04:11, wrote: > >> I have the following code snippet: >> >> { >> static int iCall(0); >> Vec v = m_State->vec(); // Get the Petsc vector from a libmesh >> PetscVector >> > > The problem is earlier, where you call VecSetValues() with ADD_VALUES in > one place and with INSERT_VALUES in another place, perhaps through > overloaded operators. > > >> VecAssemblyBegin(v); >> VecAssemblyEnd(v); >> // Assembly works fine >> PetscViewer File; >> PetscViewerCreate(MPI_COMM_**WORLD,&File); >> char Filename[32]; >> sprintf(Filename,"State_%03d.**m",iCall++); >> PetscViewerASCIIOpen(MPI_COMM_**WORLD,Filename,&File); >> PetscViewerSetFormat(File,**PETSC_VIEWER_ASCII_MATLAB); >> VecView(v,File); >> PetscViewerDestroy(&File); >> // File is created and looks good >> VecAssemblyBegin(v); // Here it crashes >> VecAssemblyEnd(v); >> } >> >> When running this code on two processes, I receive the error message: >> [1]PETSC ERROR: VecAssemblyBegin_MPI() line 1010 in >> src/vec/vec/impls/mpi/pdvec.c >> [1]PETSC ERROR: VecAssemblyBegin() line 219 in >> src/vec/vec/interface/vector.c >> > > Is this the *whole* error message? It should have printed quite a bit more > than just this. > > >> >> Taking a look in the mentioned file there is the following code: >> ierr = MPI_Allreduce(&xin->stash.**insertmode,&addv,1,MPI_INT,** >> MPI_BOR,comm);CHKERRQ(ierr); >> if (addv == (ADD_VALUES|INSERT_VALUES)) >> SETERRQ(PETSC_COMM_SELF,PETSC_**ERR_ARG_NOTSAMETYPE,"Some >> processors inserted values while others added"); >> >> What am I doing wrong? >> > ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. 
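To make Jed's diagnosis in the VecAssemblyBegin thread above concrete, the sketch below shows the rule behind the "Some processors inserted values while others added" error: VecSetValues() calls with INSERT_VALUES and with ADD_VALUES must be separated by a VecAssemblyBegin()/VecAssemblyEnd() pair, and if different ranks are left in different modes the next assembly aborts with exactly that message. This is a stand-alone illustration, not the original libmesh code; the vector size, index, and value are made up, and the calls use petsc-3.2-style signatures such as VecDestroy(&v).

/* Minimal sketch of the insert-mode rule discussed in this thread. */
#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            v;
  PetscInt       row = 0;
  PetscScalar    val = 1.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 10, &v);CHKERRQ(ierr);

  /* Insert values, then assemble. */
  ierr = VecSetValues(v, 1, &row, &val, INSERT_VALUES);CHKERRQ(ierr);
  /* ierr = VecSetValues(v, 1, &row, &val, ADD_VALUES);  <-- adding here,
     before the assembly below, is the kind of mode mixing (possibly hidden
     behind an overloaded operator) that leads to the error above. */
  ierr = VecAssemblyBegin(v);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(v);CHKERRQ(ierr);

  /* After assembling, it is fine to switch modes and add values. */
  ierr = VecSetValues(v, 1, &row, &val, ADD_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(v);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(v);CHKERRQ(ierr);

  ierr = VecDestroy(&v);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

In short, every process should finish one insert mode and call the assembly routines before any process switches to the other mode.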
From tim.gallagher at gatech.edu Tue Jan 3 06:47:05 2012 From: tim.gallagher at gatech.edu (Tim Gallagher) Date: Tue, 03 Jan 2012 07:47:05 -0500 (EST) Subject: [petsc-users] Differential operators in PETSc In-Reply-To: Message-ID: I would also be interested in discussing staggered grids, both how to do them correctly and best ways to do them with what we've got in PETSc currently. Tim ----- Original Message ----- From: "Jed Brown" To: "PETSc users list" Sent: Tuesday, January 3, 2012 7:28:35 AM Subject: Re: [petsc-users] Differential operators in PETSc On Tue, Jan 3, 2012 at 03:22, Alexander Grayver < agrayver at gfz-potsdam.de > wrote: I want to implement general procedures to generate discretized differential operators (like grad,div,curl,laplacian) using PETSc and staggered structured grids. Before I start, I am wondering if there is something similar in PETSc already? PETSc does not have specific support for staggered grids at this time. There are a few ways to do it, but none are ideal and we have a proposal pending in which one item is to improve staggered grid support (but we won't hear back for several months). If you want to do this, it would be useful to discuss how to manage staggered grids. The petsc-dev list would be the best place for those discussions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 3 06:59:55 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 06:59:55 -0600 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> Message-ID: On Tue, Jan 3, 2012 at 06:41, wrote: > The first assembly works well, and I would agree, if the first assmebly > crashed. However, it's the second assembly call and in between those two > calls, all I'm doing is viewing the vector. Use a debugger to set a breakpoint in VecSetValues(); maybe starting after your first assemble. Also try Valgrind, it could be memory corruption. You can also break in the first VecAssemblyBegin and do (gdb) p &vec->stash.insertmode $1 = (InsertMode *) 0xADDRESS (gdb) wat *$1 Hardware watchpoint 3: *$1 (gdb) c ... breaks when insertmode is modified for any reason. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Tue Jan 3 09:21:29 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 3 Jan 2012 18:51:29 +0330 Subject: [petsc-users] A question about KSP solver Message-ID: Dear Developers, How can I make KSP to ignore the convergence check after prescribed number of linear iterations? Thanks, BehZad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 3 09:28:46 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 09:28:46 -0600 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 09:21, behzad baghapour wrote: > How can I make KSP to ignore the convergence check after prescribed number > of linear iterations? -ksp_max_it N you will get back KSP_DIVERGED_ITS if you call KSPGetConvergedReason(), but you can decide that is okay. -------------- next part -------------- An HTML attachment was scrubbed... 
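To illustrate Jed's reply just above, here is a small sketch of capping the linear iteration count and then accepting KSP_DIVERGED_ITS as a usable result. It assumes ksp, b, and x were created and configured elsewhere; the cap of 40 is an arbitrary example and has the same effect as passing -ksp_max_it 40 on the command line.

/* Sketch: solve with a hard iteration cap and accept KSP_DIVERGED_ITS. */
#include <petscksp.h>

PetscErrorCode SolveWithIterationCap(KSP ksp, Vec b, Vec x)
{
  KSPConvergedReason reason;
  PetscInt           its;
  PetscErrorCode     ierr;

  PetscFunctionBegin;
  /* Same effect as the command-line option -ksp_max_it 40. */
  ierr = KSPSetTolerances(ksp, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 40);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
  if (reason == KSP_DIVERGED_ITS) {
    /* Hit the iteration cap; decide here that the current iterate is acceptable. */
    ierr = PetscPrintf(PETSC_COMM_WORLD, "KSP stopped at the %D-iteration cap; accepting the iterate\n", its);CHKERRQ(ierr);
  } else if (reason < 0) {
    /* Any other negative reason is a genuine failure. */
    ierr = PetscPrintf(PETSC_COMM_WORLD, "KSP diverged, reason %d\n", (int)reason);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}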
URL: From behzad.baghapour at gmail.com Tue Jan 3 09:37:33 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 3 Jan 2012 19:07:33 +0330 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: Thanks, I am using SNES and I thought KSPGetConvergedReason() may automaticly checked after each Newton iteration and then the solution is stopped with this message: Linear solve did not converge due to DIVERGED_ITS iterations 40 Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE So how can I tell KSP in SNES to ignore the convergence check? On Tue, Jan 3, 2012 at 6:58 PM, Jed Brown wrote: > On Tue, Jan 3, 2012 at 09:21, behzad baghapour > wrote: > >> How can I make KSP to ignore the convergence check after prescribed >> number of linear iterations? > > > -ksp_max_it N > > you will get back KSP_DIVERGED_ITS if you call KSPGetConvergedReason(), > but you can decide that is okay. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 3 09:43:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 09:43:21 -0600 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 09:37, behzad baghapour wrote: > Thanks, I am using SNES and I thought KSPGetConvergedReason() may > automaticly checked after each Newton iteration and then the solution is > stopped with this message: > > Linear solve did not converge due to DIVERGED_ITS iterations 40 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > > So how can I tell KSP in SNES to ignore the convergence check? > -snes_max_linear_solve_fail N http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetMaxLinearSolveFailures.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Tue Jan 3 09:50:10 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 3 Jan 2012 19:20:10 +0330 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: Thanks, Is it possible to handle the convergence check after each SNES iteration too? I know a little about SNESSetConvergenceTest() but I don't know how to use it? On Tue, Jan 3, 2012 at 7:13 PM, Jed Brown wrote: > On Tue, Jan 3, 2012 at 09:37, behzad baghapour > wrote: > >> Thanks, I am using SNES and I thought KSPGetConvergedReason() may >> automaticly checked after each Newton iteration and then the solution is >> stopped with this message: >> >> Linear solve did not converge due to DIVERGED_ITS iterations 40 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE >> >> So how can I tell KSP in SNES to ignore the convergence check? >> > > -snes_max_linear_solve_fail N > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetMaxLinearSolveFailures.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 3 09:55:04 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 09:55:04 -0600 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 09:50, behzad baghapour wrote: > Thanks, Is it possible to handle the convergence check after each SNES > iteration too? I know a little about SNESSetConvergenceTest() but I don't > know how to use it? You implement the convegence test. 
There are examples and links to source code: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetConvergenceTest.html http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESDefaultConverged.html#SNESDefaultConverged -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Tue Jan 3 09:56:10 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 3 Jan 2012 19:26:10 +0330 Subject: [petsc-users] A question about KSP solver In-Reply-To: References: Message-ID: Thank you very much. I will work on it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 3 10:56:11 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 3 Jan 2012 10:56:11 -0600 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> Message-ID: <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> On Jan 3, 2012, at 6:59 AM, Jed Brown wrote: > On Tue, Jan 3, 2012 at 06:41, wrote: > The first assembly works well, and I would agree, if the first assmebly crashed. However, it's the second assembly call and in between those two calls, all I'm doing is viewing the vector. > > Use a debugger to set a breakpoint in VecSetValues(); maybe starting after your first assemble. Also try Valgrind, it could be memory corruption. http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > You can also break in the first VecAssemblyBegin and do > > (gdb) p &vec->stash.insertmode > $1 = (InsertMode *) 0xADDRESS > (gdb) wat *$1 > Hardware watchpoint 3: *$1 > (gdb) c > ... breaks when insertmode is modified for any reason. From dharmareddy84 at gmail.com Tue Jan 3 11:55:16 2012 From: dharmareddy84 at gmail.com (Dharmendar Reddy) Date: Tue, 3 Jan 2012 11:55:16 -0600 Subject: [petsc-users] slepc eigenvectors In-Reply-To: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> References: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> Message-ID: I use EPS_GHEP and PetscScalar is complex. I am wondering why i see this result. You can see from the eigenvectors in the previous email that the magnitudes of the components match. For the lapack/matlab solution the phase is pi (180 degres) for each component where as for defualt or arpack method phase is 56.53 degrees for each component. I will prepare a test case and email the code. thanks Reddy On Tue, Jan 3, 2012 at 1:59 AM, Jose E. Roman wrote: > > On 03/01/2012, Dharmendar Reddy wrote: > > > Hello, > > I have a query regarding the eigenvectors computed in slepc. I > am solving a genralized eigenvalue problem. I have attached the A and B > matrices with this email. If i run slepc solver with default options are > arpack, i get one set of vectors (complex) as solution. If i run with > eps_type lapack I get real vectors. A is hermitian, and B is positive > definite. ( the actual problem is a schrodinger equation for particle in > infinite potential well, so the solution will be of the form sin(x)). I > check the solution in matlab using eig(A,B) i get real vectors. Looks like > there is some unitary transformation involved here, can you tell me what > could be going on. 
> > > > i copy a small portion of the eigen vector of the lowest magnitude > eigenvlaue (=0.0887) > > ---Method: (slepc and eps_type lapack) or matlab----- > > (-0.101596582735892,0.000000000000000E+000) > > (-0.200421875537261,0.000000000000000E+000) > > (-0.293780182034781,0.000000000000000E+000) > > (-0.379124930994127,0.000000000000000E+000) > > ... > > ... > > ... > > (-0.293780182033444,0.000000000000000E+000) > > (-0.200421875536298,0.000000000000000E+000) > > (-0.101596582735387,0.000000000000000E+000) > > > ------------------------------------------------------------------------------ > > ---Method: (slepc and eps_type defualt or arpack) ---- > > > > > > (5.602609025416389E-002,8.475224384072830E-002) > > (0.110523934800485,0.167192667375096) > (0.162006974547097,0.245072510835553) > > (0.209070886310831,0.316267414979582) > (0.250431889351034,0.378835368586700) > > (0.284961763219882,0.431069680779720) > (0.311718623092706,0.471545535910556) > > (0.329972611445050,0.499158857936955) > (0.339225807211469,0.513156427631836) > > (0.339225807166595,0.513156427588630) > (0.329972611486755,0.499158857980068) > > (0.311718623054404,0.471545535864886) > (0.284961763251251,0.431069680822535) > > (0.250431889322221,0.378835368543795) > (0.209070886332945,0.316267415014661) > > (0.162006974528570,0.245072510805346) > (0.110523934811968,0.167192667394530) > > (5.602609024797538E-002,8.475224382992022E-002) > > I cannot reproduce the problem. I always get the correct eigenvector. Are > you doing the computation in real arithmetic? Are you setting the problem > type to EPS_GHEP? > > Jose > > > > -- ----------------------------------------------------- Dharmendar Reddy Palle Graduate Student Microelectronics Research center, University of Texas at Austin, 10100 Burnet Road, Bldg. 160 MER 2.608F, TX 78758-4445 e-mail: dharmareddy84 at gmail.com Phone: +1-512-350-9082 United States of America. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Tue Jan 3 12:07:17 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 3 Jan 2012 12:07:17 -0600 Subject: [petsc-users] slepc eigenvectors In-Reply-To: References: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> Message-ID: Dharmendar, If your matrix is real, and you want real eigenvectors, you should set PetscScalar to be real. I'm sure that you are aware that the phase of eigenvectors is arbitrary; there is no reason to assume that a complex eigensolver would pick out a purely real eigenvector when it exists. Jack On Tue, Jan 3, 2012 at 11:55 AM, Dharmendar Reddy wrote: > I use EPS_GHEP and PetscScalar is complex. I am wondering why i see this > result. > You can see from the eigenvectors in the previous email that the > magnitudes of the components match. For the lapack/matlab solution the > phase is pi (180 degres) for each component where as for defualt or arpack > method phase is 56.53 degrees for each component. I will prepare a test > case and email the code. > > thanks > Reddy > > > On Tue, Jan 3, 2012 at 1:59 AM, Jose E. Roman wrote: > >> >> On 03/01/2012, Dharmendar Reddy wrote: >> >> > Hello, >> > I have a query regarding the eigenvectors computed in slepc. I >> am solving a genralized eigenvalue problem. I have attached the A and B >> matrices with this email. If i run slepc solver with default options are >> arpack, i get one set of vectors (complex) as solution. If i run with >> eps_type lapack I get real vectors. A is hermitian, and B is positive >> definite. 
( the actual problem is a schrodinger equation for particle in >> infinite potential well, so the solution will be of the form sin(x)). I >> check the solution in matlab using eig(A,B) i get real vectors. Looks like >> there is some unitary transformation involved here, can you tell me what >> could be going on. >> > >> > i copy a small portion of the eigen vector of the lowest magnitude >> eigenvlaue (=0.0887) >> > ---Method: (slepc and eps_type lapack) or matlab----- >> > (-0.101596582735892,0.000000000000000E+000) >> > (-0.200421875537261,0.000000000000000E+000) >> > (-0.293780182034781,0.000000000000000E+000) >> > (-0.379124930994127,0.000000000000000E+000) >> > ... >> > ... >> > ... >> > (-0.293780182033444,0.000000000000000E+000) >> > (-0.200421875536298,0.000000000000000E+000) >> > (-0.101596582735387,0.000000000000000E+000) >> > >> ------------------------------------------------------------------------------ >> > ---Method: (slepc and eps_type defualt or arpack) ---- >> > >> > >> > (5.602609025416389E-002,8.475224384072830E-002) >> > (0.110523934800485,0.167192667375096) >> (0.162006974547097,0.245072510835553) >> > (0.209070886310831,0.316267414979582) >> (0.250431889351034,0.378835368586700) >> > (0.284961763219882,0.431069680779720) >> (0.311718623092706,0.471545535910556) >> > (0.329972611445050,0.499158857936955) >> (0.339225807211469,0.513156427631836) >> > (0.339225807166595,0.513156427588630) >> (0.329972611486755,0.499158857980068) >> > (0.311718623054404,0.471545535864886) >> (0.284961763251251,0.431069680822535) >> > (0.250431889322221,0.378835368543795) >> (0.209070886332945,0.316267415014661) >> > (0.162006974528570,0.245072510805346) >> (0.110523934811968,0.167192667394530) >> > (5.602609024797538E-002,8.475224382992022E-002) >> >> I cannot reproduce the problem. I always get the correct eigenvector. Are >> you doing the computation in real arithmetic? Are you setting the problem >> type to EPS_GHEP? >> >> Jose >> >> >> >> > > > -- > ----------------------------------------------------- > Dharmendar Reddy Palle > Graduate Student > Microelectronics Research center, > University of Texas at Austin, > 10100 Burnet Road, Bldg. 160 > MER 2.608F, TX 78758-4445 > e-mail: dharmareddy84 at gmail.com > Phone: +1-512-350-9082 > United States of America. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmareddy84 at gmail.com Tue Jan 3 12:48:21 2012 From: dharmareddy84 at gmail.com (Dharmendar Reddy) Date: Tue, 3 Jan 2012 12:48:21 -0600 Subject: [petsc-users] slepc eigenvectors In-Reply-To: References: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> Message-ID: Hello Jack, I get what you say. For this specific test case, the Hamiltonian is real hermitian and it can be complex hermitian in some problems that i may work with later on, so i kept the PetscScalar to be complex. I understand that it can be arbitrary and it doesn't matter as long as i work with magnitudes and inner products. But In my case, i am looking at the following problem: A1 psi1 = lambda1 B1 psi1 : problem1 A2 psi2 = lambda2 B1 psi2 : problem2 Where A1 and A2 are different Hamiltonian, since i use the same spatial discretization (B1=B2 here). I am interested in inner products between eigenvectors of problem1 and problem2. 
If i use the same solver for problem1 and problem2, as long as the arbitrary phase picked up for eigenvectors of problem1 and problem2 is same, the inner products will be consistent but if the solver picks phase phi1 and phase phi2 (phi1 /= phi2) then i am not sure how to interpret the inner product innerproduct = psi1^H B1 psi2 On Tue, Jan 3, 2012 at 12:07 PM, Jack Poulson wrote: > Dharmendar, > > If your matrix is real, and you want real eigenvectors, you should set > PetscScalar to be real. I'm sure that you are aware that the phase of > eigenvectors is arbitrary; there is no reason to assume that a complex > eigensolver would pick out a purely real eigenvector when it exists. > > Jack > > > On Tue, Jan 3, 2012 at 11:55 AM, Dharmendar Reddy > wrote: > >> I use EPS_GHEP and PetscScalar is complex. I am wondering why i see this >> result. >> You can see from the eigenvectors in the previous email that the >> magnitudes of the components match. For the lapack/matlab solution the >> phase is pi (180 degres) for each component where as for defualt or arpack >> method phase is 56.53 degrees for each component. I will prepare a test >> case and email the code. >> >> thanks >> Reddy >> >> >> On Tue, Jan 3, 2012 at 1:59 AM, Jose E. Roman wrote: >> >>> >>> On 03/01/2012, Dharmendar Reddy wrote: >>> >>> > Hello, >>> > I have a query regarding the eigenvectors computed in slepc. >>> I am solving a genralized eigenvalue problem. I have attached the A and B >>> matrices with this email. If i run slepc solver with default options are >>> arpack, i get one set of vectors (complex) as solution. If i run with >>> eps_type lapack I get real vectors. A is hermitian, and B is positive >>> definite. ( the actual problem is a schrodinger equation for particle in >>> infinite potential well, so the solution will be of the form sin(x)). I >>> check the solution in matlab using eig(A,B) i get real vectors. Looks like >>> there is some unitary transformation involved here, can you tell me what >>> could be going on. >>> > >>> > i copy a small portion of the eigen vector of the lowest magnitude >>> eigenvlaue (=0.0887) >>> > ---Method: (slepc and eps_type lapack) or matlab----- >>> > (-0.101596582735892,0.000000000000000E+000) >>> > (-0.200421875537261,0.000000000000000E+000) >>> > (-0.293780182034781,0.000000000000000E+000) >>> > (-0.379124930994127,0.000000000000000E+000) >>> > ... >>> > ... >>> > ... 
>>> > (-0.293780182033444,0.000000000000000E+000) >>> > (-0.200421875536298,0.000000000000000E+000) >>> > (-0.101596582735387,0.000000000000000E+000) >>> > >>> ------------------------------------------------------------------------------ >>> > ---Method: (slepc and eps_type defualt or arpack) ---- >>> > >>> > >>> > (5.602609025416389E-002,8.475224384072830E-002) >>> > (0.110523934800485,0.167192667375096) >>> (0.162006974547097,0.245072510835553) >>> > (0.209070886310831,0.316267414979582) >>> (0.250431889351034,0.378835368586700) >>> > (0.284961763219882,0.431069680779720) >>> (0.311718623092706,0.471545535910556) >>> > (0.329972611445050,0.499158857936955) >>> (0.339225807211469,0.513156427631836) >>> > (0.339225807166595,0.513156427588630) >>> (0.329972611486755,0.499158857980068) >>> > (0.311718623054404,0.471545535864886) >>> (0.284961763251251,0.431069680822535) >>> > (0.250431889322221,0.378835368543795) >>> (0.209070886332945,0.316267415014661) >>> > (0.162006974528570,0.245072510805346) >>> (0.110523934811968,0.167192667394530) >>> > (5.602609024797538E-002,8.475224382992022E-002) >>> >>> I cannot reproduce the problem. I always get the correct eigenvector. >>> Are you doing the computation in real arithmetic? Are you setting the >>> problem type to EPS_GHEP? >>> >>> Jose >>> >>> >>> >>> >> >> >> -- >> ----------------------------------------------------- >> Dharmendar Reddy Palle >> Graduate Student >> Microelectronics Research center, >> University of Texas at Austin, >> 10100 Burnet Road, Bldg. 160 >> MER 2.608F, TX 78758-4445 >> e-mail: dharmareddy84 at gmail.com >> Phone: +1-512-350-9082 >> United States of America. >> >> > -- ----------------------------------------------------- Dharmendar Reddy Palle Graduate Student Microelectronics Research center, University of Texas at Austin, 10100 Burnet Road, Bldg. 160 MER 2.608F, TX 78758-4445 e-mail: dharmareddy84 at gmail.com Phone: +1-512-350-9082 United States of America. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Tue Jan 3 13:02:10 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 3 Jan 2012 13:02:10 -0600 Subject: [petsc-users] slepc eigenvectors In-Reply-To: References: <792EDE2D-23AD-4E97-B150-A5450B675B7B@dsic.upv.es> Message-ID: No matter what the relative phases of the eigenvectors of your first and second solutions are, the magnitude of the inner product will remain unchanged. I do not think there is any meaning in the phase of that inner product. If there was, you could rotate the returned eigenvectors using whatever extra information you have that determines their phase (I am skeptical that there is any). Jack On Tue, Jan 3, 2012 at 12:48 PM, Dharmendar Reddy wrote: > Hello Jack, > I get what you say. For this specific test case, the > Hamiltonian is real hermitian and it can be complex hermitian in some > problems that i may work with later on, so i kept the PetscScalar to be > complex. I understand that it can be arbitrary and it doesn't matter as > long as i work with magnitudes and inner products. But In my case, i am > looking at the following problem: > A1 psi1 = lambda1 B1 psi1 : problem1 > A2 psi2 = lambda2 B1 psi2 : problem2 > > Where A1 and A2 are different Hamiltonian, since i use the same spatial > discretization (B1=B2 here). I am interested in inner products between > eigenvectors of problem1 and problem2. 
If i use the same solver for > problem1 and problem2, as long as the arbitrary phase picked up for > eigenvectors of problem1 and problem2 is same, the inner products will be > consistent but if the solver picks phase phi1 and phase phi2 (phi1 /= phi2) > then i am not sure how to interpret the inner product > > innerproduct = psi1^H B1 psi2 > > > On Tue, Jan 3, 2012 at 12:07 PM, Jack Poulson wrote: > >> Dharmendar, >> >> If your matrix is real, and you want real eigenvectors, you should set >> PetscScalar to be real. I'm sure that you are aware that the phase of >> eigenvectors is arbitrary; there is no reason to assume that a complex >> eigensolver would pick out a purely real eigenvector when it exists. >> >> Jack >> >> >> On Tue, Jan 3, 2012 at 11:55 AM, Dharmendar Reddy < >> dharmareddy84 at gmail.com> wrote: >> >>> I use EPS_GHEP and PetscScalar is complex. I am wondering why i see this >>> result. >>> You can see from the eigenvectors in the previous email that the >>> magnitudes of the components match. For the lapack/matlab solution the >>> phase is pi (180 degres) for each component where as for defualt or arpack >>> method phase is 56.53 degrees for each component. I will prepare a test >>> case and email the code. >>> >>> thanks >>> Reddy >>> >>> >>> On Tue, Jan 3, 2012 at 1:59 AM, Jose E. Roman wrote: >>> >>>> >>>> On 03/01/2012, Dharmendar Reddy wrote: >>>> >>>> > Hello, >>>> > I have a query regarding the eigenvectors computed in slepc. >>>> I am solving a genralized eigenvalue problem. I have attached the A and B >>>> matrices with this email. If i run slepc solver with default options are >>>> arpack, i get one set of vectors (complex) as solution. If i run with >>>> eps_type lapack I get real vectors. A is hermitian, and B is positive >>>> definite. ( the actual problem is a schrodinger equation for particle in >>>> infinite potential well, so the solution will be of the form sin(x)). I >>>> check the solution in matlab using eig(A,B) i get real vectors. Looks like >>>> there is some unitary transformation involved here, can you tell me what >>>> could be going on. >>>> > >>>> > i copy a small portion of the eigen vector of the lowest magnitude >>>> eigenvlaue (=0.0887) >>>> > ---Method: (slepc and eps_type lapack) or matlab----- >>>> > (-0.101596582735892,0.000000000000000E+000) >>>> > (-0.200421875537261,0.000000000000000E+000) >>>> > (-0.293780182034781,0.000000000000000E+000) >>>> > (-0.379124930994127,0.000000000000000E+000) >>>> > ... >>>> > ... >>>> > ... 
>>>> > (-0.293780182033444,0.000000000000000E+000) >>>> > (-0.200421875536298,0.000000000000000E+000) >>>> > (-0.101596582735387,0.000000000000000E+000) >>>> > >>>> ------------------------------------------------------------------------------ >>>> > ---Method: (slepc and eps_type defualt or arpack) ---- >>>> > >>>> > >>>> > (5.602609025416389E-002,8.475224384072830E-002) >>>> > (0.110523934800485,0.167192667375096) >>>> (0.162006974547097,0.245072510835553) >>>> > (0.209070886310831,0.316267414979582) >>>> (0.250431889351034,0.378835368586700) >>>> > (0.284961763219882,0.431069680779720) >>>> (0.311718623092706,0.471545535910556) >>>> > (0.329972611445050,0.499158857936955) >>>> (0.339225807211469,0.513156427631836) >>>> > (0.339225807166595,0.513156427588630) >>>> (0.329972611486755,0.499158857980068) >>>> > (0.311718623054404,0.471545535864886) >>>> (0.284961763251251,0.431069680822535) >>>> > (0.250431889322221,0.378835368543795) >>>> (0.209070886332945,0.316267415014661) >>>> > (0.162006974528570,0.245072510805346) >>>> (0.110523934811968,0.167192667394530) >>>> > (5.602609024797538E-002,8.475224382992022E-002) >>>> >>>> I cannot reproduce the problem. I always get the correct eigenvector. >>>> Are you doing the computation in real arithmetic? Are you setting the >>>> problem type to EPS_GHEP? >>>> >>>> Jose >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> ----------------------------------------------------- >>> Dharmendar Reddy Palle >>> Graduate Student >>> Microelectronics Research center, >>> University of Texas at Austin, >>> 10100 Burnet Road, Bldg. 160 >>> MER 2.608F, TX 78758-4445 >>> e-mail: dharmareddy84 at gmail.com >>> Phone: +1-512-350-9082 >>> United States of America. >>> >>> >> > > > -- > ----------------------------------------------------- > Dharmendar Reddy Palle > Graduate Student > Microelectronics Research center, > University of Texas at Austin, > 10100 Burnet Road, Bldg. 160 > MER 2.608F, TX 78758-4445 > e-mail: dharmareddy84 at gmail.com > Phone: +1-512-350-9082 > United States of America. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Tue Jan 3 16:58:10 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Tue, 03 Jan 2012 23:58:10 +0100 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc Message-ID: <4F038802.6040307@gmail.com> Hi, I'm running a 3D Fortran CFD code. The grid used is Cartesian. The current code is partitioned in the z direction for MPI. For e.g. for total size z = 10, if partitioned into 5 cpus, it'll become size z = 2 for each cpu. Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center. I read about load balancing software. I wonder if it will improve the performance/speed of my code. If so, what are the available choices for use with PETSc and Fortran? Are ParMETIS, Zoltan or Isorropi recommended? Thanks! -- Yours sincerely, TAY wee-beng From jedbrown at mcs.anl.gov Tue Jan 3 17:03:54 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 17:03:54 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F038802.6040307@gmail.com> References: <4F038802.6040307@gmail.com> Message-ID: On Tue, Jan 3, 2012 at 16:58, TAY wee-beng wrote: > I'm running a 3D Fortran CFD code. The grid used is Cartesian. The current > code is partitioned in the z direction for MPI. > > For e.g. 
for total size z = 10, if partitioned into 5 cpus, it'll become > size z = 2 for each cpu. > > Uneven grids are used to reduce the number of grids and the main bulk of > grids clusters around the center. > > I read about load balancing software. I wonder if it will improve the > performance/speed of my code. > > If so, what are the available choices for use with PETSc and Fortran? Are > ParMETIS, Zoltan or Isorropi recommended? > I would just use MatPartitioning (usually calling into ParMetis underneath) if you want an unstructured partition. Zoltan (and its more C++/Epetra-ified Isorropia interface) provides some assistance for moving application data, but I haven't found it to be easier to use than just moving the data myself and it adds an additional dependency. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 3 17:57:03 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 3 Jan 2012 17:57:03 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> Message-ID: On Jan 3, 2012, at 5:03 PM, Jed Brown wrote: > On Tue, Jan 3, 2012 at 16:58, TAY wee-beng wrote: > I'm running a 3D Fortran CFD code. The grid used is Cartesian. The current code is partitioned in the z direction for MPI. > > For e.g. for total size z = 10, if partitioned into 5 cpus, it'll become size z = 2 for each cpu. > > Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center. > > I read about load balancing software. I wonder if it will improve the performance/speed of my code. > > If so, what are the available choices for use with PETSc and Fortran? Are ParMETIS, Zoltan or Isorropi recommended? > > I would just use MatPartitioning (usually calling into ParMetis underneath) if you want an unstructured partition. Zoltan (and its more C++/Epetra-ified Isorropia interface) provides some assistance for moving application data, but I haven't found it to be easier to use than just moving the data myself and it adds an additional dependency. Huh? Since it is a structured cartesian mesh code you just want to split up the z direction so that each process has an equal number of grid points (which likely you are already doing, suddenly introducing unstructured partitioning on top of this seems insane). Then when you run with -log_summary you can see the load balance in work (flops) and time for each part of the computation and determine if they are close to being equal. It's crazy to do a done of code development without knowing if load balancing is the problem. Also if you have only 2 sets of z direction values per process you are going to be doing way to much communication relative to the computation. Why not use 3d decomposition by slicing cleanly in all three directions, the end result will require much less communication. Barry From jedbrown at mcs.anl.gov Tue Jan 3 18:03:45 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 3 Jan 2012 18:03:45 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> Message-ID: On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: > Huh? 
Since it is a structured cartesian mesh code you just want to split > up the z direction so that each process has an equal number of grid points I may have misunderstood this: "Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center." If the grid is structured, then I agree to just use a good structured decomposition. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 3 18:11:09 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 3 Jan 2012 18:11:09 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> Message-ID: <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> On Jan 3, 2012, at 6:03 PM, Jed Brown wrote: > On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: > Huh? Since it is a structured cartesian mesh code you just want to split up the z direction so that each process has an equal number of grid points > > I may have misunderstood this: "Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center." I interpreted this to mean that it is using a graded mesh in certain (or all) coordinate directions. I could be wrong. Barry > > If the grid is structured, then I agree to just use a good structured decomposition. From rudolph at berkeley.edu Tue Jan 3 21:57:18 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Tue, 3 Jan 2012 19:57:18 -0800 Subject: [petsc-users] -log_summary problem Message-ID: > On Tue, Dec 20, 2011 at 19:35, Max Rudolph wrote: > When I run my code with the -log_summary option, it hangs indefinitely after displaying: > > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 0.00164938 > > Is this a common problem, and if so, how do I fix it? This does not happen when I run the example programs - only my own code, so I must be at fault but without an error message I am not sure where to start. I am using petsc-3.1-p7. Thanks for your help. > > Are all processes calling PetscFinalize()? > > How did you set -log_summary? It should be provided at the time you invoke PetscInitialize() on all processes. > > Try running in a debugger, then break when it hangs and print the stack trace. I found the problem, or at least a workaround. I have a PetscRandom and freed it in the second to last line of my main subroutine: ... ierr= PetscRandomCreate(PETSC_COMM_WORLD, &r);CHKERRQ(ierr); ierr = PetscRandomSetType(r,PETSCRAND48);CHKERRQ(ierr); ... ierr = PetscRandomDestroy( r );CHKERRQ(ierr); ierr = PetscFinalize(); } If I comment out the line with PetscRandomDestroy, -log_summary seems to work. Max -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 3 22:08:25 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 3 Jan 2012 22:08:25 -0600 Subject: [petsc-users] -log_summary problem In-Reply-To: References: Message-ID: Max, If you have a stand alone PETSc program that ONLY does the randomcreate/destroy (between the PetscInitialize/Finalize()) does it still hang? Suggest running under valgrind to determine if memory corruption is the cause. 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind Barry On Jan 3, 2012, at 9:57 PM, Max Rudolph wrote: >> On Tue, Dec 20, 2011 at 19:35, Max Rudolph wrote: >> When I run my code with the -log_summary option, it hangs indefinitely after displaying: >> >> ======================================================================================================================== >> Average time to get PetscTime(): 9.53674e-08 >> Average time for MPI_Barrier(): 0.00164938 >> >> Is this a common problem, and if so, how do I fix it? This does not happen when I run the example programs - only my own code, so I must be at fault but without an error message I am not sure where to start. I am using petsc-3.1-p7. Thanks for your help. >> >> Are all processes calling PetscFinalize()? >> >> How did you set -log_summary? It should be provided at the time you invoke PetscInitialize() on all processes. >> >> Try running in a debugger, then break when it hangs and print the stack trace. > > I found the problem, or at least a workaround. I have a PetscRandom and freed it in the second to last line of my main subroutine: > ... > ierr= PetscRandomCreate(PETSC_COMM_WORLD, &r);CHKERRQ(ierr); > ierr = PetscRandomSetType(r,PETSCRAND48);CHKERRQ(ierr); > ... > ierr = PetscRandomDestroy( r );CHKERRQ(ierr); > ierr = PetscFinalize(); > } > > If I comment out the line with PetscRandomDestroy, -log_summary seems to work. > > Max From Sanjay.Kharche at liverpool.ac.uk Wed Jan 4 08:35:54 2012 From: Sanjay.Kharche at liverpool.ac.uk (Kharche, Sanjay) Date: Wed, 4 Jan 2012 14:35:54 +0000 Subject: [petsc-users] standalone code Message-ID: <04649ABFF695C94F8E6CF3BBBA9B16650F2FA510@BHEXMBX1.livad.liv.ac.uk> Dear All After having worked through several examples, I am now looking for some example reaction-diffusion C code that uses PetSc and does an implicit solution for the PDE. Can you suggest something? ta Sanjay -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 08:44:47 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 08:44:47 -0600 Subject: [petsc-users] standalone code In-Reply-To: <04649ABFF695C94F8E6CF3BBBA9B16650F2FA510@BHEXMBX1.livad.liv.ac.uk> References: <04649ABFF695C94F8E6CF3BBBA9B16650F2FA510@BHEXMBX1.livad.liv.ac.uk> Message-ID: On Wed, Jan 4, 2012 at 08:35, Kharche, Sanjay < Sanjay.Kharche at liverpool.ac.uk> wrote: > After having worked through several examples, I am now looking for some > example reaction-diffusion C code that uses PetSc and does an implicit > solution for the PDE. Can you suggest something? I take it you are interested in the time-dependent case. These two are good places to start src/ts/examples/tutorials/ex22.c - 1D advection-reaction src/ts/examples/tutorials/ex25.c - 1D "Brusselator" reaction-diffusion This example has more quirks, the physics is more complicated, and not all the algorithmic combinations work. src/ts/examples/tutorials/ex10.c - 1D non-equilibrium radiation-diffusion with Saha ionization model For 2D and 3D, there are other examples, either with different physics or not time-dependent. But the code structure is the same regardless of the dimension, so I recommend looking at the simple examples above (ex22 and ex25). -------------- next part -------------- An HTML attachment was scrubbed... 
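For orientation, a rough sketch of the residual such a code hands to TSSetIFunction() (not taken from ex22/ex25; the AppCtx fields, the logistic reaction term, and the serial Dirichlet end treatment are made-up illustrations) for F(t,u,u_t) = u_t - D*u_xx - R(u) in 1D:

  #include <petscts.h>

  typedef struct {        /* hypothetical user context */
    PetscReal D;          /* diffusion coefficient */
    PetscReal k;          /* reaction rate */
    PetscReal h;          /* uniform grid spacing */
  } AppCtx;

  /* Residual F(t,u,udot) = udot - D*u_xx - R(u); serial and 1D only.
     A parallel code would use a DMDA and ghost-point updates instead. */
  PetscErrorCode FormIFunction(TS ts,PetscReal t,Vec U,Vec Udot,Vec F,void *ptr)
  {
    AppCtx         *user = (AppCtx*)ptr;
    PetscErrorCode ierr;
    PetscInt       i,n;
    PetscScalar    *u,*udot,*f;

    PetscFunctionBegin;
    ierr = VecGetSize(U,&n);CHKERRQ(ierr);
    ierr = VecGetArray(U,&u);CHKERRQ(ierr);
    ierr = VecGetArray(Udot,&udot);CHKERRQ(ierr);
    ierr = VecGetArray(F,&f);CHKERRQ(ierr);
    for (i=0; i<n; i++) {
      if (i == 0 || i == n-1) {          /* hold the ends at u = 0, purely for illustration */
        f[i] = u[i];
      } else {
        PetscScalar uxx = (u[i-1] - 2.0*u[i] + u[i+1])/(user->h*user->h);
        f[i] = udot[i] - user->D*uxx - user->k*u[i]*(1.0 - u[i]);   /* logistic reaction term */
      }
    }
    ierr = VecRestoreArray(U,&u);CHKERRQ(ierr);
    ierr = VecRestoreArray(Udot,&udot);CHKERRQ(ierr);
    ierr = VecRestoreArray(F,&f);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

The TS wiring itself (TSCreate, TSSetIFunction, TSSetFromOptions, TSSolve) is best copied from ex22.c or ex25.c, since the exact calling sequences differ slightly between PETSc versions.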
URL: From dominik at itis.ethz.ch Wed Jan 4 09:18:59 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 4 Jan 2012 16:18:59 +0100 Subject: [petsc-users] Differential operators in PETSc In-Reply-To: References: Message-ID: A side question: would that mean that the said set of operators is there already for regular structured (topologically Cartesian) grids? On Tue, Jan 3, 2012 at 1:47 PM, Tim Gallagher wrote: > I would also be interested in discussing staggered grids, both how to do > them correctly and best ways to do them with what we've got in PETSc > currently. > > Tim > > ________________________________ > From: "Jed Brown" > To: "PETSc users list" > Sent: Tuesday, January 3, 2012 7:28:35 AM > Subject: Re: [petsc-users] Differential operators in PETSc > > > On Tue, Jan 3, 2012 at 03:22, Alexander Grayver > wrote: >> >> I want to implement general procedures to generate discretized >> differential operators (like grad,div,curl,laplacian) using PETSc and >> staggered structured grids. >> Before I start, I am wondering if there is something similar in PETSc >> already? > > > PETSc does not have specific support for staggered grids at this time. There > are a few ways to do it, but none are ideal and we have a proposal pending > in which one item is to improve staggered grid support (but we won't hear > back for several months). If you want to do this, it would be useful to > discuss how to manage staggered grids. The petsc-dev list would be the best > place for those discussions. > From agrayver at gfz-potsdam.de Wed Jan 4 09:25:44 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Wed, 04 Jan 2012 16:25:44 +0100 Subject: [petsc-users] Differential operators in PETSc In-Reply-To: References: <4F02C8E2.1090803@gfz-potsdam.de> Message-ID: <4F046F78.9010904@gfz-potsdam.de> Jed, I would like to discuss it, but I'm afraid I can't pose right questions due to the lack of wide knowledge regarding problem, since my understanding and motivation is only within my problem, which is vector Helmholtz equation. Would anybody more experienced like to start this discussion? Regards, Alexander On 03.01.2012 13:28, Jed Brown wrote: > On Tue, Jan 3, 2012 at 03:22, Alexander Grayver > > wrote: > > I want to implement general procedures to generate discretized > differential operators (like grad,div,curl,laplacian) using PETSc > and staggered structured grids. > Before I start, I am wondering if there is something similar in > PETSc already? > > > PETSc does not have specific support for staggered grids at this time. > There are a few ways to do it, but none are ideal and we have a > proposal pending in which one item is to improve staggered grid > support (but we won't hear back for several months). If you want to do > this, it would be useful to discuss how to manage staggered grids. The > petsc-dev list would be the best place for those discussions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 09:29:18 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 09:29:18 -0600 Subject: [petsc-users] Differential operators in PETSc In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 09:18, Dominik Szczerba wrote: > A side question: would that mean that the said set of operators is > there already for regular structured (topologically Cartesian) grids? > No, but the formulas are simple. Note that mimetic discretizations of divergence and curl require staggering. 
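To make "the formulas are simple" concrete, here is a rough sketch of assembling a centered-difference d/dx operator on a collocated, uniform 1D grid (the routine name, the one-sided treatment at the two ends, and the preallocation numbers are illustrative choices, not an existing PETSc interface). A mimetic/staggered gradient would instead map cell-centered values to face values, so the matrix would be rectangular and each row would hold a single 1/h difference across a face.

  #include <petscmat.h>

  PetscErrorCode BuildGradient1d(MPI_Comm comm,PetscInt n,PetscReal h,Mat *G)
  {
    PetscErrorCode ierr;
    PetscInt       i,rstart,rend,col[2];
    PetscScalar    v[2];

    PetscFunctionBegin;
    ierr = MatCreate(comm,G);CHKERRQ(ierr);
    ierr = MatSetSizes(*G,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
    ierr = MatSetFromOptions(*G);CHKERRQ(ierr);                      /* AIJ by default */
    ierr = MatSeqAIJSetPreallocation(*G,2,PETSC_NULL);CHKERRQ(ierr);
    ierr = MatMPIAIJSetPreallocation(*G,2,PETSC_NULL,1,PETSC_NULL);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(*G,&rstart,&rend);CHKERRQ(ierr);
    for (i=rstart; i<rend; i++) {
      if (i == 0) {                     /* one-sided at the left end */
        col[0] = 0;   col[1] = 1;   v[0] = -1.0/h; v[1] = 1.0/h;
      } else if (i == n-1) {            /* one-sided at the right end */
        col[0] = n-2; col[1] = n-1; v[0] = -1.0/h; v[1] = 1.0/h;
      } else {                          /* centered in the interior */
        col[0] = i-1; col[1] = i+1; v[0] = -0.5/h; v[1] = 0.5/h;
      }
      ierr = MatSetValues(*G,1,&i,2,col,v,INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(*G,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*G,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

The 2D/3D operators and the Laplacian follow the same pattern, just with larger stencils.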
-------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 09:34:47 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 09:34:47 -0600 Subject: [petsc-users] Differential operators in PETSc In-Reply-To: <4F046F78.9010904@gfz-potsdam.de> References: <4F02C8E2.1090803@gfz-potsdam.de> <4F046F78.9010904@gfz-potsdam.de> Message-ID: On Wed, Jan 4, 2012 at 09:25, Alexander Grayver wrote: > I would like to discuss it, but I'm afraid I can't pose right questions > due to the lack of wide knowledge regarding problem, since my understanding > and motivation is only within my problem, which is vector Helmholtz > equation. > Would anybody more experienced like to start this discussion? > I think the most important question is to figure out what API people want to interact with for staggered grids since that will guide the implementation. Staggered grids are unfortunately more dimension-dependent than non-staggered, so 2D versus 3D issues are more complicated. How would people like to address edge and face spaces in 3D? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 4 10:29:59 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 4 Jan 2012 17:29:59 +0100 Subject: [petsc-users] MatGetRowMinAbs returns negative number Message-ID: I have just noticed that MatGetRowMinAbs (but not the Max variant) in some cases returns a very small negative number like -1.62505e-17 (displayed as %g). Is this an epsilon effect or an indication of a problem? Regards, Dominik From knepley at gmail.com Wed Jan 4 10:40:37 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 4 Jan 2012 10:40:37 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 10:29 AM, Dominik Szczerba wrote: > I have just noticed that MatGetRowMinAbs (but not the Max variant) in > some cases returns a very small negative number like -1.62505e-17 > (displayed as %g). Is this an epsilon effect or an indication of a > problem? > Its not possible that was actually the minimum element? Matt > Regards, > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 10:48:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 10:48:21 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 10:40, Matthew Knepley wrote: > I have just noticed that MatGetRowMinAbs (but not the Max variant) in >> some cases returns a very small negative number like -1.62505e-17 >> (displayed as %g). Is this an epsilon effect or an indication of a >> problem? >> > > Its not possible that was actually the minimum element? > It is supposed to be returning the absolute value anyway, so no. Also, due to this patch of yours http://petsc.cs.iit.edu/petsc/petsc-dev/rev/e433a0 it should never return a value smaller than 1e-12 that isn't identically zero. I don't understand why this shift is being done, but I think that if you want a shift like this, it should at least be relative to the norm of the vector or something. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jan 4 11:02:07 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 4 Jan 2012 11:02:07 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 10:48 AM, Jed Brown wrote: > On Wed, Jan 4, 2012 at 10:40, Matthew Knepley wrote: > >> I have just noticed that MatGetRowMinAbs (but not the Max variant) in >>> some cases returns a very small negative number like -1.62505e-17 >>> (displayed as %g). Is this an epsilon effect or an indication of a >>> problem? >>> >> >> Its not possible that was actually the minimum element? >> > > It is supposed to be returning the absolute value anyway, so no. Also, due > to this patch of yours > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/e433a0 > > it should never return a value smaller than 1e-12 that isn't identically > zero. I don't understand why this shift is being done, but I think that if > you want a shift like this, it should at least be relative to the norm of > the vector or something. > Okay, here is what is wrong with the logic: 1) Its not a shift, it ignores values < 1.0e-12 2) The problem is on line 2.17 of the diff where it takes the first value as minimum, but does not take the absolute value 3) After that, no value is greater than 1.0e-12, so it does not change it You have a whole row of zeros. I will fix that line. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 11:05:41 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 11:05:41 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 11:02, Matthew Knepley wrote: > Okay, here is what is wrong with the logic: > > 1) Its not a shift, it ignores values < 1.0e-12 > you wrote that in the commit message > > 2) The problem is on line 2.17 of the diff where it takes the first value > as minimum, but does not take the absolute value > good catch -------------- next part -------------- An HTML attachment was scrubbed... URL: From Johannes.Huber at unibas.ch Wed Jan 4 11:59:22 2012 From: Johannes.Huber at unibas.ch (Johannes.Huber at unibas.ch) Date: Wed, 04 Jan 2012 18:59:22 +0100 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> Message-ID: <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> Hi all, thanks a lot for you help. I tracked the problem down to a minimal program and found, that it looks like I'm doing something wrong when using the matlab format. 
Here is the minimal program: #include static char help[] = "Matlab output.\n\n"; int main (int argc, char** argv) { PetscInitialize(&argc,&argv,(char*)0,help); Vec b; PetscInt ierr; int r; MPI_Comm_rank(PETSC_COMM_WORLD,&r); ierr=VecCreate(PETSC_COMM_WORLD,&b); CHKERRQ(ierr); ierr=VecSetSizes(b,10,PETSC_DECIDE); CHKERRQ(ierr); ierr=VecSetFromOptions(b); CHKERRQ(ierr); ierr=VecZeroEntries(b); CHKERRQ(ierr); ierr=VecAssemblyBegin(b); CHKERRQ(ierr); ierr=VecAssemblyEnd(b); CHKERRQ(ierr); PetscViewer File; ierr=PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Test.m",&File); CHKERRQ(ierr); ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_MATLAB); CHKERRQ(ierr); // crash //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_INDEX); CHKERRQ(ierr); //works //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_DENSE); CHKERRQ(ierr); //works ierr=VecView(b,File); CHKERRQ(ierr); ierr=PetscViewerDestroy(&File); CHKERRQ(ierr); printf("[%d]: %d\n",r,__LINE__); VecAssemblyBegin(b); printf("[%d]: %d\n",r,__LINE__); VecAssemblyEnd(b); ierr=VecDestroy(&b); CHKERRQ(ierr); MPI_Barrier(PETSC_COMM_WORLD); PetscFinalize(); return 0; } Does anybody know, what's wrong with this? Thanks a lot, Hannes Quoting Barry Smith : > > On Jan 3, 2012, at 6:59 AM, Jed Brown wrote: > >> On Tue, Jan 3, 2012 at 06:41, wrote: >> The first assembly works well, and I would agree, if the first >> assmebly crashed. However, it's the second assembly call and in >> between those two calls, all I'm doing is viewing the vector. >> >> Use a debugger to set a breakpoint in VecSetValues(); maybe >> starting after your first assemble. Also try Valgrind, it could be >> memory corruption. > > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > >> >> You can also break in the first VecAssemblyBegin and do >> >> (gdb) p &vec->stash.insertmode >> $1 = (InsertMode *) 0xADDRESS >> (gdb) wat *$1 >> Hardware watchpoint 3: *$1 >> (gdb) c >> ... breaks when insertmode is modified for any reason. > > ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From zonexo at gmail.com Wed Jan 4 13:18:43 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 04 Jan 2012 20:18:43 +0100 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> Message-ID: <4F04A613.5060909@gmail.com> Hi Barry and Jed, So the 1st step should be checking the load balancing. If it's more or less balanced, will slicing it in 3 directions further improve the speed? Another thing is that I hope to do some form of adaptive mesh refinement. I'm a bit confused. Are partitioning software like ParMETIS, Zoltan or Isorropia also used for adaptive mesh refinement? Or which open source software can do that with PETSc and in Fortran? I searched and got libMesh, for use with PETSc and paramesh, which is in Fortran. Yours sincerely, TAY wee-beng On 4/1/2012 1:11 AM, Barry Smith wrote: > On Jan 3, 2012, at 6:03 PM, Jed Brown wrote: > >> On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: >> Huh? Since it is a structured cartesian mesh code you just want to split up the z direction so that each process has an equal number of grid points >> >> I may have misunderstood this: "Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center." 
> I interpreted this to mean that it is using a graded mesh in certain (or all) coordinate directions. I could be wrong. > > Barry > >> If the grid is structured, then I agree to just use a good structured decomposition. From jedbrown at mcs.anl.gov Wed Jan 4 13:43:48 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 13:43:48 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F04A613.5060909@gmail.com> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F04A613.5060909@gmail.com> Message-ID: On Wed, Jan 4, 2012 at 13:18, TAY wee-beng wrote: > So the 1st step should be checking the load balancing. If it's more or > less balanced, will slicing it in 3 directions further improve the speed? > You want some combination of balancing and small surface area. Slicing in 3 directions usually improves this. Depending on the anisotropy in the physics, it could be better or worse for solver convergence rates. > > Another thing is that I hope to do some form of adaptive mesh refinement. > On a Cartesian mesh? With what sort of discretization. There are lots of packages for this, each one targeting some class of discretizations and problems. > > I'm a bit confused. Are partitioning software like ParMETIS, Zoltan or > Isorropia also used for adaptive mesh refinement? > > Or which open source software can do that with PETSc and in Fortran? I > searched and got libMesh, for use with PETSc and paramesh, which is in > Fortran. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 4 14:28:07 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 4 Jan 2012 14:28:07 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F04A613.5060909@gmail.com> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F04A613.5060909@gmail.com> Message-ID: On Jan 4, 2012, at 1:18 PM, TAY wee-beng wrote: > Hi Barry and Jed, > > So the 1st step should be checking the load balancing. If it's more or less balanced, will slicing it in 3 directions further improve the speed? > > Another thing is that I hope to do some form of adaptive mesh refinement. > > I'm a bit confused. Are partitioning software like ParMETIS, Zoltan or Isorropia also used for adaptive mesh refinement? > > Or which open source software can do that with PETSc and in Fortran? I searched and got libMesh, for use with PETSc and paramesh, which is in Fortran. Go with libmesh, it has an active community and mailing list for issues that come up. Barry > > Yours sincerely, > > TAY wee-beng > > > On 4/1/2012 1:11 AM, Barry Smith wrote: >> On Jan 3, 2012, at 6:03 PM, Jed Brown wrote: >> >>> On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: >>> Huh? Since it is a structured cartesian mesh code you just want to split up the z direction so that each process has an equal number of grid points >>> >>> I may have misunderstood this: "Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center." >> I interpreted this to mean that it is using a graded mesh in certain (or all) coordinate directions. I could be wrong. >> >> Barry >> >>> If the grid is structured, then I agree to just use a good structured decomposition. 
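For the structured route, the split in all three directions does not have to be hand-coded: a DMDA can choose the process grid. A minimal sketch (petsc-3.2-style names; the wrapper and its fixed dof/stencil arguments are only an illustration):

  #include <petscdmda.h>

  PetscErrorCode CreateBox3d(MPI_Comm comm,PetscInt mx,PetscInt my,PetscInt mz,DM *da)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = DMDACreate3d(comm,
                        DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,
                        DMDA_STENCIL_STAR,
                        mx,my,mz,                                /* global grid size */
                        PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,  /* process grid picked by PETSc */
                        1,                                       /* dof per grid point */
                        1,                                       /* stencil width */
                        PETSC_NULL,PETSC_NULL,PETSC_NULL,        /* or explicit per-rank point counts */
                        da);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

With PETSC_DECIDE for the process grid, the subdomains come out close to cubic, which cuts the communication relative to a z-only slicing; for a graded mesh, explicit lx/ly/lz arrays can be passed instead so that every process still owns the same number of grid points, which is what the -log_summary balance ratios end up measuring.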
From karpeev at mcs.anl.gov Wed Jan 4 14:30:22 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Wed, 4 Jan 2012 14:30:22 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F04A613.5060909@gmail.com> Message-ID: On Wed, Jan 4, 2012 at 2:28 PM, Barry Smith wrote: > > On Jan 4, 2012, at 1:18 PM, TAY wee-beng wrote: > > > Hi Barry and Jed, > > > > So the 1st step should be checking the load balancing. If it's more or > less balanced, will slicing it in 3 directions further improve the speed? > > > > Another thing is that I hope to do some form of adaptive mesh refinement. > > > > I'm a bit confused. Are partitioning software like ParMETIS, Zoltan or > Isorropia also used for adaptive mesh refinement? > > > > Or which open source software can do that with PETSc and in Fortran? I > searched and got libMesh, for use with PETSc and paramesh, which is in > Fortran. > > Go with libmesh, it has an active community and mailing list for issues > that come up. > And will soon have its own DM :-) > > Barry > > > > > Yours sincerely, > > > > TAY wee-beng > > > > > > On 4/1/2012 1:11 AM, Barry Smith wrote: > >> On Jan 3, 2012, at 6:03 PM, Jed Brown wrote: > >> > >>> On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: > >>> Huh? Since it is a structured cartesian mesh code you just want to > split up the z direction so that each process has an equal number of grid > points > >>> > >>> I may have misunderstood this: "Uneven grids are used to reduce the > number of grids and the main bulk of grids clusters around the center." > >> I interpreted this to mean that it is using a graded mesh in certain > (or all) coordinate directions. I could be wrong. > >> > >> Barry > >> > >>> If the grid is structured, then I agree to just use a good structured > decomposition. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 4 16:18:24 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 4 Jan 2012 16:18:24 -0600 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> Message-ID: <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> Hannes, Thanks for reporting the problem. It is a bug in our handling of PETSC_VIEWER_ASCII_MATLAB in parallel. I have fixed the repository copy in both petsc-3.2 and petsc-dev It will be fixed in the next patch release of 3.2 or Satish can send you the patch file; just ask for it at petsc-maint at mcs.anl.gov Barry > Hannes On Jan 4, 2012, at 11:59 AM, Johannes.Huber at unibas.ch wrote: > Hi all, > thanks a lot for you help. > I tracked the problem down to a minimal program and found, that it looks like I'm doing something wrong when using the matlab format. 
Here is the minimal program: > #include > static char help[] = "Matlab output.\n\n"; > int main (int argc, char** argv) > { > PetscInitialize(&argc,&argv,(char*)0,help); > Vec b; > PetscInt ierr; > int r; > MPI_Comm_rank(PETSC_COMM_WORLD,&r); > ierr=VecCreate(PETSC_COMM_WORLD,&b); CHKERRQ(ierr); > ierr=VecSetSizes(b,10,PETSC_DECIDE); CHKERRQ(ierr); > ierr=VecSetFromOptions(b); CHKERRQ(ierr); > ierr=VecZeroEntries(b); CHKERRQ(ierr); > ierr=VecAssemblyBegin(b); CHKERRQ(ierr); > ierr=VecAssemblyEnd(b); CHKERRQ(ierr); > > PetscViewer File; > ierr=PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Test.m",&File); CHKERRQ(ierr); > ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_MATLAB); CHKERRQ(ierr); // crash > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_INDEX); CHKERRQ(ierr); //works > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_DENSE); CHKERRQ(ierr); //works > ierr=VecView(b,File); CHKERRQ(ierr); > ierr=PetscViewerDestroy(&File); CHKERRQ(ierr); > > printf("[%d]: %d\n",r,__LINE__); > VecAssemblyBegin(b); > printf("[%d]: %d\n",r,__LINE__); > VecAssemblyEnd(b); > ierr=VecDestroy(&b); CHKERRQ(ierr); > MPI_Barrier(PETSC_COMM_WORLD); > PetscFinalize(); > return 0; > } > Does anybody know, what's wrong with this? > > Thanks a lot, > Hannes > > Quoting Barry Smith : > >> >> On Jan 3, 2012, at 6:59 AM, Jed Brown wrote: >> >>> On Tue, Jan 3, 2012 at 06:41, wrote: >>> The first assembly works well, and I would agree, if the first assmebly crashed. However, it's the second assembly call and in between those two calls, all I'm doing is viewing the vector. >>> >>> Use a debugger to set a breakpoint in VecSetValues(); maybe starting after your first assemble. Also try Valgrind, it could be memory corruption. >> >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> >>> >>> You can also break in the first VecAssemblyBegin and do >>> >>> (gdb) p &vec->stash.insertmode >>> $1 = (InsertMode *) 0xADDRESS >>> (gdb) wat *$1 >>> Hardware watchpoint 3: *$1 >>> (gdb) c >>> ... breaks when insertmode is modified for any reason. >> >> > > > > ---------------------------------------------------------------- > This message was sent using IMP, the Internet Messaging Program. > > From balay at mcs.anl.gov Wed Jan 4 16:21:38 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 4 Jan 2012 16:21:38 -0600 (CST) Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> Message-ID: The fix is here. http://petsc.cs.iit.edu/petsc/releases/petsc-3.2/rev/cd89aa6b51bc You can download the 'raw' patch [from the 'raw' link above] - save it as 'patchfile' - and apply with: patch -Np1 < patchfile Satish On Wed, 4 Jan 2012, Barry Smith wrote: > > Hannes, > > Thanks for reporting the problem. It is a bug in our handling of PETSC_VIEWER_ASCII_MATLAB in parallel. I have fixed the repository copy in both petsc-3.2 and petsc-dev It will be fixed in the next patch release of 3.2 or Satish can send you the patch file; just ask for it at petsc-maint at mcs.anl.gov > > Barry > > > > Hannes > > On Jan 4, 2012, at 11:59 AM, Johannes.Huber at unibas.ch wrote: > > > Hi all, > > thanks a lot for you help. 
> > I tracked the problem down to a minimal program and found, that it looks like I'm doing something wrong when using the matlab format. Here is the minimal program: > > #include > > static char help[] = "Matlab output.\n\n"; > > int main (int argc, char** argv) > > { > > PetscInitialize(&argc,&argv,(char*)0,help); > > Vec b; > > PetscInt ierr; > > int r; > > MPI_Comm_rank(PETSC_COMM_WORLD,&r); > > ierr=VecCreate(PETSC_COMM_WORLD,&b); CHKERRQ(ierr); > > ierr=VecSetSizes(b,10,PETSC_DECIDE); CHKERRQ(ierr); > > ierr=VecSetFromOptions(b); CHKERRQ(ierr); > > ierr=VecZeroEntries(b); CHKERRQ(ierr); > > ierr=VecAssemblyBegin(b); CHKERRQ(ierr); > > ierr=VecAssemblyEnd(b); CHKERRQ(ierr); > > > > PetscViewer File; > > ierr=PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Test.m",&File); CHKERRQ(ierr); > > ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_MATLAB); CHKERRQ(ierr); // crash > > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_INDEX); CHKERRQ(ierr); //works > > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_DENSE); CHKERRQ(ierr); //works > > ierr=VecView(b,File); CHKERRQ(ierr); > > ierr=PetscViewerDestroy(&File); CHKERRQ(ierr); > > > > printf("[%d]: %d\n",r,__LINE__); > > VecAssemblyBegin(b); > > printf("[%d]: %d\n",r,__LINE__); > > VecAssemblyEnd(b); > > ierr=VecDestroy(&b); CHKERRQ(ierr); > > MPI_Barrier(PETSC_COMM_WORLD); > > PetscFinalize(); > > return 0; > > } > > Does anybody know, what's wrong with this? > > > > Thanks a lot, > > Hannes > > > > Quoting Barry Smith : > > > >> > >> On Jan 3, 2012, at 6:59 AM, Jed Brown wrote: > >> > >>> On Tue, Jan 3, 2012 at 06:41, wrote: > >>> The first assembly works well, and I would agree, if the first assmebly crashed. However, it's the second assembly call and in between those two calls, all I'm doing is viewing the vector. > >>> > >>> Use a debugger to set a breakpoint in VecSetValues(); maybe starting after your first assemble. Also try Valgrind, it could be memory corruption. > >> > >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> > >> > >>> > >>> You can also break in the first VecAssemblyBegin and do > >>> > >>> (gdb) p &vec->stash.insertmode > >>> $1 = (InsertMode *) 0xADDRESS > >>> (gdb) wat *$1 > >>> Hardware watchpoint 3: *$1 > >>> (gdb) c > >>> ... breaks when insertmode is modified for any reason. > >> > >> > > > > > > > > ---------------------------------------------------------------- > > This message was sent using IMP, the Internet Messaging Program. > > > > > > From dominik at itis.ethz.ch Wed Jan 4 17:29:07 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 5 Jan 2012 00:29:07 +0100 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: Do you think this issue can potentially explain a behavior I observe: In my big block shell matrix there are two blocks submatrices: (say) B and B transposed. MaxAbs is exact same for both submatrices, MinAbs is zero for B (as expected), but small negative for the other (unexpected). Now question: do I have a bug (that I do not immediately see) or this can be hopefully only a trick in MinAbs? Thanks a lot. 
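One way to see whether the small negative value comes from the matrix entries or from MatGetRowMinAbs() itself is to recompute the row minima by hand over the locally owned rows of an AIJ-format matrix. A rough sketch (the routine name is made up; vmin is assumed to share the matrix's row layout, and only explicitly stored entries are visited, so an empty row gives 0 here):

  #include <petscmat.h>

  PetscErrorCode CheckRowMinAbs(Mat A,Vec vmin)
  {
    PetscErrorCode    ierr;
    PetscInt          i,ncols,rstart,rend;
    const PetscScalar *vals;

    PetscFunctionBegin;
    ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
    for (i=rstart; i<rend; i++) {
      PetscInt  j;
      PetscReal rowmin;
      ierr = MatGetRow(A,i,&ncols,PETSC_NULL,&vals);CHKERRQ(ierr);
      rowmin = (ncols > 0) ? PetscAbsScalar(vals[0]) : 0.0;
      for (j=1; j<ncols; j++) rowmin = PetscMin(rowmin,PetscAbsScalar(vals[j]));
      ierr = VecSetValue(vmin,i,(PetscScalar)rowmin,INSERT_VALUES);CHKERRQ(ierr);
      ierr = MatRestoreRow(A,i,&ncols,PETSC_NULL,&vals);CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(vmin);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(vmin);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Comparing this vector entry by entry with the MatGetRowMinAbs() output shows which side the stray negative number comes from.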
On Wed, Jan 4, 2012 at 6:05 PM, Jed Brown wrote: > On Wed, Jan 4, 2012 at 11:02, Matthew Knepley wrote: >> >> Okay, here is what is wrong with the logic: >> >> 1) Its not a shift, it ignores values < 1.0e-12 > > > you wrote that in the commit message > >> >> >> 2) The problem is on line 2.17 of the diff where it takes the first value >> as minimum, but does not take the absolute value > > > good catch From knepley at gmail.com Wed Jan 4 17:42:28 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 4 Jan 2012 17:42:28 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 5:29 PM, Dominik Szczerba wrote: > Do you think this issue can potentially explain a behavior I observe: > > In my big block shell matrix there are two blocks submatrices: (say) B > and B transposed. MaxAbs is exact same for both submatrices, MinAbs is > zero for B (as expected), but small negative for the other > (unexpected). > > Now question: do I have a bug (that I do not immediately see) or this > can be hopefully only a trick in MinAbs? > It was a bug in MinAbs. I am pushing the fix. Matt > Thanks a lot. > > On Wed, Jan 4, 2012 at 6:05 PM, Jed Brown wrote: > > On Wed, Jan 4, 2012 at 11:02, Matthew Knepley wrote: > >> > >> Okay, here is what is wrong with the logic: > >> > >> 1) Its not a shift, it ignores values < 1.0e-12 > > > > > > you wrote that in the commit message > > > >> > >> > >> 2) The problem is on line 2.17 of the diff where it takes the first > value > >> as minimum, but does not take the absolute value > > > > > > good catch > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 4 18:02:06 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 4 Jan 2012 18:02:06 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: Message-ID: <4A2E4300-5180-4636-B460-74CAA9386FB2@mcs.anl.gov> On Jan 4, 2012, at 5:42 PM, Matthew Knepley wrote: > On Wed, Jan 4, 2012 at 5:29 PM, Dominik Szczerba wrote: > Do you think this issue can potentially explain a behavior I observe: > > In my big block shell matrix there are two blocks submatrices: (say) B > and B transposed. MaxAbs is exact same for both submatrices, MinAbs is > zero for B (as expected), but small negative for the other > (unexpected). > > Now question: do I have a bug (that I do not immediately see) or this > can be hopefully only a trick in MinAbs? > > It was a bug in MinAbs. I am pushing the fix. To 3.2 I hope. Please send the patch link for 3.2 as Satish did just a little while ago for another bug. Barry > > Matt > > Thanks a lot. > > On Wed, Jan 4, 2012 at 6:05 PM, Jed Brown wrote: > > On Wed, Jan 4, 2012 at 11:02, Matthew Knepley wrote: > >> > >> Okay, here is what is wrong with the logic: > >> > >> 1) Its not a shift, it ignores values < 1.0e-12 > > > > > > you wrote that in the commit message > > > >> > >> > >> 2) The problem is on line 2.17 of the diff where it takes the first value > >> as minimum, but does not take the absolute value > > > > > > good catch > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener From jiangwen84 at gmail.com Wed Jan 4 20:23:11 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Wed, 4 Jan 2012 21:23:11 -0500 Subject: [petsc-users] How to use latest Metis5.0 in PETSc Message-ID: Hi guys, I do not know how to install PETSc with the latest version of Metis 5.0. I need to call some routines from Metis 5.0 in my PETSc codes. I tried to configure it with the command --with-metis-dir=METIS_5.0_DIR, but it does not work. Thanks, Wen -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 4 20:24:53 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 4 Jan 2012 20:24:53 -0600 Subject: [petsc-users] How to use latest Metis5.0 in PETSc In-Reply-To: References: Message-ID: On Wed, Jan 4, 2012 at 20:23, Wen Jiang wrote: > I do not know how to install PETSc with the latest version of Metis 5.0. I > need to call some routines from Metis 5.0 in my PETSc codes. I tried to > configure it with the command --with-metis-dir=METIS_5.0_DIR, but it does > not work. > does not work? send log files -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 4 20:30:59 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 4 Jan 2012 20:30:59 -0600 Subject: [petsc-users] How to use latest Metis5.0 in PETSc In-Reply-To: References: Message-ID: On Jan 4, 2012, at 8:24 PM, Jed Brown wrote: > On Wed, Jan 4, 2012 at 20:23, Wen Jiang wrote: > I do not know how to install PETSc with the latest version of Metis 5.0. I need to call some routines from Metis 5.0 in my PETSc codes. I tried to configure it with the command --with-metis-dir=METIS_5.0_DIR, but it does not work. > > does not work? > > send log files to petsc-maint at mcs.anl.gov And please do not send installation issues to petsc-users, please send them to petsc-maint at mcs.anl.gov petsc-users is for questions/suggestions that a variety of PETSc users may be interested in, not for installation problems. Barry From balay at mcs.anl.gov Wed Jan 4 21:44:50 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 4 Jan 2012 21:44:50 -0600 (CST) Subject: [petsc-users] How to use latest Metis5.0 in PETSc In-Reply-To: References: Message-ID: On Wed, 4 Jan 2012, Wen Jiang wrote: > Hi guys, > > I do not know how to install PETSc with the latest version of Metis 5.0. I > need to call some routines from Metis 5.0 in my PETSc codes. I tried to > configure it with the command --with-metis-dir=METIS_5.0_DIR, but it does > not work. 1. use petsc-dev 2. install latest cmake 2.8.5 or higher [if not already on your machine] and have it at the begining of your PATH 3. Now install petsc-dev with '--download-metis' configure options If you encounter problmes - send relavent logs to petsc-maint. Satish From Johannes.Huber at unibas.ch Thu Jan 5 02:39:13 2012 From: Johannes.Huber at unibas.ch (Johannes.Huber at unibas.ch) Date: Thu, 05 Jan 2012 09:39:13 +0100 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> Message-ID: <20120105093913.595846m3ijgyw1ip@webmail.unibas.ch> Hi, great, it works now. When I take a look to the valgrind output, I see a few lines about unaddressable bytes. Is this a reason to concern? 
BTW: I also see memory leaks from getpwuid. Does anybody know about a patch for this? Many thanks, Hannes Quoting Satish Balay : > The fix is here. > > http://petsc.cs.iit.edu/petsc/releases/petsc-3.2/rev/cd89aa6b51bc > > You can download the 'raw' patch [from the 'raw' link above] - save it > as 'patchfile' - and apply with: > > patch -Np1 < patchfile > > Satish > > On Wed, 4 Jan 2012, Barry Smith wrote: > >> >> Hannes, >> >> Thanks for reporting the problem. It is a bug in our handling >> of PETSC_VIEWER_ASCII_MATLAB in parallel. I have fixed the >> repository copy in both petsc-3.2 and petsc-dev It will be fixed >> in the next patch release of 3.2 or Satish can send you the patch >> file; just ask for it at petsc-maint at mcs.anl.gov >> >> Barry >> >> >> > Hannes >> >> On Jan 4, 2012, at 11:59 AM, Johannes.Huber at unibas.ch wrote: >> >> > Hi all, >> > thanks a lot for you help. >> > I tracked the problem down to a minimal program and found, that >> it looks like I'm doing something wrong when using the matlab >> format. Here is the minimal program: >> > #include >> > static char help[] = "Matlab output.\n\n"; >> > int main (int argc, char** argv) >> > { >> > PetscInitialize(&argc,&argv,(char*)0,help); >> > Vec b; >> > PetscInt ierr; >> > int r; >> > MPI_Comm_rank(PETSC_COMM_WORLD,&r); >> > ierr=VecCreate(PETSC_COMM_WORLD,&b); CHKERRQ(ierr); >> > ierr=VecSetSizes(b,10,PETSC_DECIDE); CHKERRQ(ierr); >> > ierr=VecSetFromOptions(b); CHKERRQ(ierr); >> > ierr=VecZeroEntries(b); CHKERRQ(ierr); >> > ierr=VecAssemblyBegin(b); CHKERRQ(ierr); >> > ierr=VecAssemblyEnd(b); CHKERRQ(ierr); >> > >> > PetscViewer File; >> > ierr=PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Test.m",&File); >> CHKERRQ(ierr); >> > ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_MATLAB); >> CHKERRQ(ierr); // crash >> > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_INDEX); >> CHKERRQ(ierr); //works >> > //ierr=PetscViewerSetFormat(File,PETSC_VIEWER_ASCII_DENSE); >> CHKERRQ(ierr); //works >> > ierr=VecView(b,File); CHKERRQ(ierr); >> > ierr=PetscViewerDestroy(&File); CHKERRQ(ierr); >> > >> > printf("[%d]: %d\n",r,__LINE__); >> > VecAssemblyBegin(b); >> > printf("[%d]: %d\n",r,__LINE__); >> > VecAssemblyEnd(b); >> > ierr=VecDestroy(&b); CHKERRQ(ierr); >> > MPI_Barrier(PETSC_COMM_WORLD); >> > PetscFinalize(); >> > return 0; >> > } >> > Does anybody know, what's wrong with this? >> > >> > Thanks a lot, >> > Hannes >> > >> > Quoting Barry Smith : >> > >> >> >> >> On Jan 3, 2012, at 6:59 AM, Jed Brown wrote: >> >> >> >>> On Tue, Jan 3, 2012 at 06:41, wrote: >> >>> The first assembly works well, and I would agree, if the first >> assmebly crashed. However, it's the second assembly call and in >> between those two calls, all I'm doing is viewing the vector. >> >>> >> >>> Use a debugger to set a breakpoint in VecSetValues(); maybe >> starting after your first assemble. Also try Valgrind, it could be >> memory corruption. >> >> >> >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> >> >> >> >>> >> >>> You can also break in the first VecAssemblyBegin and do >> >>> >> >>> (gdb) p &vec->stash.insertmode >> >>> $1 = (InsertMode *) 0xADDRESS >> >>> (gdb) wat *$1 >> >>> Hardware watchpoint 3: *$1 >> >>> (gdb) c >> >>> ... breaks when insertmode is modified for any reason. >> >> >> >> >> > >> > >> > >> > ---------------------------------------------------------------- >> > This message was sent using IMP, the Internet Messaging Program. 
>> > >> > >> >> > > ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From dominik at itis.ethz.ch Thu Jan 5 04:07:40 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 5 Jan 2012 11:07:40 +0100 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: <4A2E4300-5180-4636-B460-74CAA9386FB2@mcs.anl.gov> References: <4A2E4300-5180-4636-B460-74CAA9386FB2@mcs.anl.gov> Message-ID: Is it also possible that MaxAbs would be computed wrongly? I get a different value when I compute it myself. Dominik On Thu, Jan 5, 2012 at 1:02 AM, Barry Smith wrote: > > On Jan 4, 2012, at 5:42 PM, Matthew Knepley wrote: > >> On Wed, Jan 4, 2012 at 5:29 PM, Dominik Szczerba wrote: >> Do you think this issue can potentially explain a behavior I observe: >> >> In my big block shell matrix there are two blocks submatrices: (say) B >> and B transposed. MaxAbs is exact same for both submatrices, MinAbs is >> zero for B (as expected), but small negative for the other >> (unexpected). >> >> Now question: do I have a bug (that I do not immediately see) or this >> can be hopefully only a trick in MinAbs? >> >> It was a bug in MinAbs. I am pushing the fix. > > ? ?To 3.2 I hope. Please send the patch link for 3.2 as Satish did just a little while ago for another bug. > > ? Barry > >> >> ? ?Matt >> >> Thanks a lot. >> >> On Wed, Jan 4, 2012 at 6:05 PM, Jed Brown wrote: >> > On Wed, Jan 4, 2012 at 11:02, Matthew Knepley wrote: >> >> >> >> Okay, here is what is wrong with the logic: >> >> >> >> 1) Its not a shift, it ignores values < 1.0e-12 >> > >> > >> > you wrote that in the commit message >> > >> >> >> >> >> >> 2) The problem is on line 2.17 of the diff where it takes the first value >> >> as minimum, but does not take the absolute value >> > >> > >> > good catch >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From jedbrown at mcs.anl.gov Thu Jan 5 07:31:30 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 07:31:30 -0600 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <20120105093913.595846m3ijgyw1ip@webmail.unibas.ch> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> <20120105093913.595846m3ijgyw1ip@webmail.unibas.ch> Message-ID: On Thu, Jan 5, 2012 at 02:39, wrote: > When I take a look to the valgrind output, I see a few lines about > unaddressable bytes. Is this a reason to concern? > Please paste lines for questions like this. > BTW: I also see memory leaks from getpwuid. Does anybody know about a > patch for this? > I think this is a libc issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Jan 5 07:31:40 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 5 Jan 2012 17:01:40 +0330 Subject: [petsc-users] Divergence when using Line Search Message-ID: Dear Developers, I used SNES with LS cubic and basic. In some cases the cubic model can stabilised the global convergence BUT in some case it fails while the basic model has given better convergence !!! The error is as follows: . . . 
46: CFL = 3894.09, Nonlinear = 6.9086e-05, Linear = (64,4.54181e-10,1e-05) Linear solve converged due to CONVERGED_RTOL iterations 66 47: CFL = 3921.01, Nonlinear = 6.86117e-05, Linear = (66,3.51905e-10,1e-05) Linear solve converged due to CONVERGED_RTOL iterations 64 48: CFL = 3936.79, Nonlinear = 6.83367e-05, Linear = (64,4.50959e-10,1e-05) Linear solve converged due to CONVERGED_RTOL iterations 61 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Floating point exception! [0]PETSC ERROR: Infinite or not-a-number generated in norm! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 10:28:33 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./code on a linux-gnu named baghapour by baghapour Thu Jan 5 16:44:14 2012 [0]PETSC ERROR: Libraries linked from /home/baghapour/softs/petsc/linux-gnu-cxx-debug/lib [0]PETSC ERROR: Configure run at Wed Nov 9 19:16:47 2011 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-f2cblaslapack=1 --download-mpich=1 --with-clanguage=cxx --with-debugging=no --download-parms --download-hypre [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: VecNorm() line 167 in src/vec/vec/interface/rvector.c [0]PETSC ERROR: SNESLineSearchCubic() line 581 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve_LS() line 218 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 2676 in src/snes/interface/snes.c [0]PETSC ERROR: _petsc_NewtonTimeAdvance() line 131 in Newton.cpp I am wondering while the basic model can solve the problem but cubic model can not. Please help me what conditions may leads the above issue in line search. Regards, BehZad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 5 07:45:00 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 07:45:00 -0600 Subject: [petsc-users] Divergence when using Line Search In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 07:31, behzad baghapour wrote: > I used SNES with LS cubic and basic. In some cases the cubic model can > stabilised the global convergence BUT in some case it fails while the basic > model has given better convergence !!! > Yes, this happens sometimes. Which is better depends on the local shape of the nonlinearity. Just try both and use whichever performs better for your problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Jan 5 07:46:26 2012 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 5 Jan 2012 17:16:26 +0330 Subject: [petsc-users] Divergence when using Line Search In-Reply-To: References: Message-ID: OK. Thank you very much. On Thu, Jan 5, 2012 at 5:15 PM, Jed Brown wrote: > On Thu, Jan 5, 2012 at 07:31, behzad baghapour > wrote: > >> I used SNES with LS cubic and basic. In some cases the cubic model can >> stabilised the global convergence BUT in some case it fails while the basic >> model has given better convergence !!! >> > > Yes, this happens sometimes. 
Which is better depends on the local shape of > the nonlinearity. Just try both and use whichever performs better for your > problem. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Thu Jan 5 09:22:50 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 05 Jan 2012 16:22:50 +0100 Subject: [petsc-users] Multiple output using one viewer Message-ID: <4F05C04A.4060108@gfz-potsdam.de> Hello, To output many vectors I use following trick: call PetscViewerBinaryOpen(comm,'out',FILE_MODE_WRITE,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'A1.dat',ierr) call MatView(A1,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'A2.dat',ierr) call MatView(A2,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'A3.dat',ierr) call MatView(A3,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'A4.dat',ierr) call MatView(A4,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'V1.dat',ierr) call VecView(V1,viewer,ierr); CHKERRQ(ierr) call PetscViewerFileSetName(viewer,'V2.dat',ierr) call VecView(V2,viewer,ierr); CHKERRQ(ierr) call PetscViewerDestroy(viewer,ierr); CHKERRQ(ierr) In real application there are hundreds of calls like that. This is necessary for analyzing data somewhere in matlab. Eventually I get error: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Unable to open file! [0]PETSC ERROR: Cannot open .info file V2.info for writing! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Development HG revision: 199bab0ea052fc92ce8e4abb56afc442629a19c8 HG Date: Tue Dec 13 22:22:13 2011 -0800 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /home/test on a openmpi-i named node228 by agrayver Thu Jan 5 15:24:10 2012 [0]PETSC ERROR: Libraries linked from /home/lib/petsc-dev/openmpi-intel-complex-release-f-ds/lib [0]PETSC ERROR: Configure run at Wed Dec 14 10:54:32 2011 [0]PETSC ERROR: Configure options --with-petsc-arch=openmpi-intel-complex-release-f-ds --with-fortran-interfaces=1 --download-superlu --download-superlu_dist --download-mumps --download-pastix --download-parmetis --download-metis --download-ptscotch --with-scalapack-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_scalapack_lp64.a --with-scalapack-include=/opt/intel/Compiler/11.1/072/mkl/include --with-blacs-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_blacs_openmpi_lp64.a --with-blacs-include=/opt/intel/Compiler/11.1/072/mkl/include --with-mpi-dir=/opt/mpi/intel/openmpi-1.4.2 --with-scalar-type=complex --with-blas-lapack-dir=/opt/intel/Compiler/11.1/072/mkl/lib/em64t --with-precision=double --with-debugging=0 --with-fortran-kernels=1 --with-x=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscViewerFileSetName_Binary() line 1056 in /home/lib/petsc-dev/src/sys/viewer/impls/binary/binv.c [0]PETSC ERROR: PetscViewerFileSetName() line 595 in /home/lib/petsc-dev/src/sys/viewer/impls/ascii/filev.c Sorry! You were supposed to get help about: mpi-abort But I couldn't open the help file: /opt/mpi/intel/openmpi-1.4.2/share/openmpi/help-mpi-api.txt: Too many open files. Sorry! 
When I switch off verbose mode or reduce number of output files 3-4 times it works. My question is whether all file handles are closed correctly when I change file name? Regards, Alexander From jedbrown at mcs.anl.gov Thu Jan 5 09:30:03 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 09:30:03 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F05C04A.4060108@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> Message-ID: On Thu, Jan 5, 2012 at 09:22, Alexander Grayver wrote: > My question is whether all file handles are closed correctly when I change > file name? No, they are not. You should PetscViewerDestroy() and then create a new viewer. I think the file probably *should* be closed silently in this usage. The other alternative is to give an error, but leaking file handles is not okay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Thu Jan 5 09:32:06 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 05 Jan 2012 16:32:06 +0100 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> Message-ID: <4F05C276.2050905@gfz-potsdam.de> On 05.01.2012 16:30, Jed Brown wrote: > On Thu, Jan 5, 2012 at 09:22, Alexander Grayver > > wrote: > > My question is whether all file handles are closed correctly when > I change file name? > > > No, they are not. You should PetscViewerDestroy() and then create a > new viewer. > > > I think the file probably *should* be closed silently in this usage. > The other alternative is to give an error, but leaking file handles is > not okay. If you give error or just not close handle like now then what is the meaning of PetscViewerFileSetName? Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 5 09:34:33 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 09:34:33 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F05C276.2050905@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> Message-ID: On Thu, Jan 5, 2012 at 09:32, Alexander Grayver wrote: > If you give error or just not close handle like now then what is the > meaning of PetscViewerFileSetName? The (flawed) expectation was that you would only call that once. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Thu Jan 5 09:36:25 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 05 Jan 2012 16:36:25 +0100 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> Message-ID: <4F05C379.2040909@gfz-potsdam.de> On 05.01.2012 16:34, Jed Brown wrote: > The (flawed) expectation was that you would only call that once. Maybe this should be noted in the documentation? From jedbrown at mcs.anl.gov Thu Jan 5 09:40:30 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 09:40:30 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F05C379.2040909@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: On Thu, Jan 5, 2012 at 09:36, Alexander Grayver wrote: > Maybe this should be noted in the documentation? 
Yes, I think the old file should be closed (if it exists), but I'll wait for comment. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Thu Jan 5 09:41:57 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 05 Jan 2012 16:41:57 +0100 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> Message-ID: <4F05C4C5.9060102@gmail.com> Hi, I just did a -log_summary and attach the text file, running across 8 and 16 processors. My most important concern is whether the load is balanced across the processors. In 16 processors case, for the time, it seems that the ratio for many events are higher than 1, reaching up to 6.8 for VecScatterEnd and 132.1 (?) for MatAssemblyBegin. However, for the flops, ratios are 1 and 1.1. so which is more important to look at? time or flops? If it's time, does it mean my run is highly unbalanced? Thanks! Yours sincerely, TAY wee-beng On 4/1/2012 1:11 AM, Barry Smith wrote: > On Jan 3, 2012, at 6:03 PM, Jed Brown wrote: > >> On Tue, Jan 3, 2012 at 17:57, Barry Smith wrote: >> Huh? Since it is a structured cartesian mesh code you just want to split up the z direction so that each process has an equal number of grid points >> >> I may have misunderstood this: "Uneven grids are used to reduce the number of grids and the main bulk of grids clusters around the center." > I interpreted this to mean that it is using a graded mesh in certain (or all) coordinate directions. I could be wrong. > > Barry > >> If the grid is structured, then I agree to just use a good structured decomposition. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: profile.txt URL: From jedbrown at mcs.anl.gov Thu Jan 5 09:59:54 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 09:59:54 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F05C4C5.9060102@gmail.com> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F05C4C5.9060102@gmail.com> Message-ID: On Thu, Jan 5, 2012 at 09:41, TAY wee-beng wrote: > I just did a -log_summary and attach the text file, running across 8 and > 16 processors. My most important concern is whether the load is balanced > across the processors. > > In 16 processors case, for the time, it seems that the ratio for many > events are higher than 1, reaching up to 6.8 for VecScatterEnd > This takes about 1% of the run time and it's scaling well, so don't worry about it. > and 132.1 (?) for MatAssemblyBegin. > This is about 2% of run time, but it's not scaling. Do you compute a lot of matrix entries on processes that don't own the rows? Most of your solve time is going into PCSetUp() and PCApply, both of which are getting more expensive as you add processes. These are more than 10x more than spent in MatMult() and MatMult() takes slightly less time on more processes, so the increase isn't entirely due to memory issues. What methods are you using? > However, for the flops, ratios are 1 and 1.1. so which is more important > to look at? time or flops? > If you would rather do a lot of flops than solve the problem in a reasonable amount of time, you might as well use dense methods. 
;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 5 10:05:47 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 5 Jan 2012 10:05:47 -0600 Subject: [petsc-users] MatGetRowMinAbs returns negative number In-Reply-To: References: <4A2E4300-5180-4636-B460-74CAA9386FB2@mcs.anl.gov> Message-ID: On Thu, Jan 5, 2012 at 4:07 AM, Dominik Szczerba wrote: > Is it also possible that MaxAbs would be computed wrongly? I get a > different value when I compute it myself. Yes, that is possible. Here is the code: http://petsc.cs.iit.edu/petsc/petsc-dev/annotate/d1799f05bf01/src/mat/impls/aij/seq/aij.c#l2628 Matt > > Dominik > > On Thu, Jan 5, 2012 at 1:02 AM, Barry Smith wrote: > > > > On Jan 4, 2012, at 5:42 PM, Matthew Knepley wrote: > > > >> On Wed, Jan 4, 2012 at 5:29 PM, Dominik Szczerba > wrote: > >> Do you think this issue can potentially explain a behavior I observe: > >> > >> In my big block shell matrix there are two blocks submatrices: (say) B > >> and B transposed. MaxAbs is exact same for both submatrices, MinAbs is > >> zero for B (as expected), but small negative for the other > >> (unexpected). > >> > >> Now question: do I have a bug (that I do not immediately see) or this > >> can be hopefully only a trick in MinAbs? > >> > >> It was a bug in MinAbs. I am pushing the fix. > > > > To 3.2 I hope. Please send the patch link for 3.2 as Satish did just > a little while ago for another bug. > > > > Barry > > > >> > >> Matt > >> > >> Thanks a lot. > >> > >> On Wed, Jan 4, 2012 at 6:05 PM, Jed Brown wrote: > >> > On Wed, Jan 4, 2012 at 11:02, Matthew Knepley > wrote: > >> >> > >> >> Okay, here is what is wrong with the logic: > >> >> > >> >> 1) Its not a shift, it ignores values < 1.0e-12 > >> > > >> > > >> > you wrote that in the commit message > >> > > >> >> > >> >> > >> >> 2) The problem is on line 2.17 of the diff where it takes the first > value > >> >> as minimum, but does not take the absolute value > >> > > >> > > >> > good catch > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiangwen84 at gmail.com Thu Jan 5 14:12:24 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Thu, 5 Jan 2012 15:12:24 -0500 Subject: [petsc-users] call Metis routine in PETSc codes Message-ID: Hi, I built PETSc with parmetis, and tried to call a Metis routine in my PETSc code. But it cannot be linked correctly. I got the error like "undefined reference to 'METIS_PartMeshNodal(int*,int*,int*,int*,int*,int*,int*,int*,int*) " I don't know whether I should do anything special with my PETSc c++ code(s) in order to call a Metis routine. Thanks. Wen -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 5 14:16:09 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 5 Jan 2012 14:16:09 -0600 Subject: [petsc-users] call Metis routine in PETSc codes In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 2:12 PM, Wen Jiang wrote: > Hi, > > I built PETSc with parmetis, and tried to call a Metis routine in my PETSc > code. 
But it cannot be linked correctly. I got the error like > > "undefined reference to > 'METIS_PartMeshNodal(int*,int*,int*,int*,int*,int*,int*,int*,int*) " > > I don't know whether I should do anything special with my PETSc c++ > code(s) in order to call a Metis routine. > You MUST send the entire link line AND error message. Without data, this is just fortune telling. Matt > Thanks. > Wen > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 5 14:16:46 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 14:16:46 -0600 Subject: [petsc-users] call Metis routine in PETSc codes In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 14:12, Wen Jiang wrote: > I built PETSc with parmetis, and tried to call a Metis routine in my PETSc > code. But it cannot be linked correctly. I got the error like > > "undefined reference to > 'METIS_PartMeshNodal(int*,int*,int*,int*,int*,int*,int*,int*,int*) " > metis.h has extern "C" in the header, so you must have gotten this declaration from somewhere else. > > I don't know whether I should do anything special with my PETSc c++ > code(s) in order to call a Metis routine. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Jan 5 14:25:39 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 5 Jan 2012 14:25:39 -0600 Subject: [petsc-users] PETSC_HAVE_MPI_WIN_CREATE in petsc Message-ID: Hello, I browsed petsc source cdoe and found a macro PETSC_HAVE_MPI_WIN_CREATE. It seems petsc can use one-sided communication. I want to test MatMult()'s performance with one-sided. Then, how to enable it? Thanks! -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 5 14:27:38 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 5 Jan 2012 14:27:38 -0600 Subject: [petsc-users] PETSC_HAVE_MPI_WIN_CREATE in petsc In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 2:25 PM, Junchao Zhang wrote: > Hello, > I browsed petsc source cdoe and found a macro PETSC_HAVE_MPI_WIN_CREATE. > It seems petsc can use one-sided communication. I want to test > MatMult()'s performance with one-sided. > Then, how to enable it? > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate Matt > Thanks! > > -- Junchao Zhang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 5 14:29:32 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 14:29:32 -0600 Subject: [petsc-users] PETSC_HAVE_MPI_WIN_CREATE in petsc In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 14:25, Junchao Zhang wrote: > I browsed petsc source cdoe and found a macro PETSC_HAVE_MPI_WIN_CREATE. > It seems petsc can use one-sided communication. I want to test > MatMult()'s performance with one-sided. > Then, how to enable it? 
> Just run with -vecscatter_window Note that this currently uses MPI_Win_fence() for completion which makes it more synchronous than necessary (it could and should use MPI_Win_post(), MPI_Win_start(), MPI_Win_complete(), MPI_Win_wait()). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jan 5 18:17:58 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 5 Jan 2012 18:17:58 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: On Jan 5, 2012, at 9:40 AM, Jed Brown wrote: > On Thu, Jan 5, 2012 at 09:36, Alexander Grayver wrote: > Maybe this should be noted in the documentation? > > Yes, I think the old file should be closed (if it exists), but I'll wait for comment. I never thought about the case where someone called PetscViewerFileSetName() twice. I'm surprised that it works at all. Yes, it should (IMHO) be changed to close the old file if used twice. Barry BTW: It is also possible to put many vectors and matrices in the same file and read them in using several different ways using PetscBinaryRead(). I (personally) think any approach that involves creating hundreds of files is nuts. From jedbrown at mcs.anl.gov Thu Jan 5 18:28:23 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 18:28:23 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: On Thu, Jan 5, 2012 at 18:17, Barry Smith wrote: > BTW: It is also possible to put many vectors and matrices in the same file > and read them in using several different ways using PetscBinaryRead(). I > (personally) think any approach that involves creating hundreds of files is > nuts. > To be fair, such files (in PETSc binary format) do not have an index (e.g. to see the names and types of objects available in the file) or implement "seek" behavior (to just read the 19th vector). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jan 5 18:39:58 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 5 Jan 2012 18:39:58 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: <2D4094EC-4B5E-455B-B0A7-1D91B6B015A5@mcs.anl.gov> On Jan 5, 2012, at 6:28 PM, Jed Brown wrote: > On Thu, Jan 5, 2012 at 18:17, Barry Smith wrote: > BTW: It is also possible to put many vectors and matrices in the same file and read them in using several different ways using PetscBinaryRead(). I (personally) think any approach that involves creating hundreds of files is nuts. > > To be fair, such files (in PETSc binary format) do not have an index (e.g. to see the names and types of objects available in the file) or implement "seek" behavior (to just read the 19th vector). Yes, it is not an extremely well designed and implemented top quality binary format like HDF5. Nor does it intend to be. And it is not suitable if you need random access to hundreds of PETSc objects, but for simple things like making movies in MATLAB or statistical analysis it is often fine to store all the objects in one file. 
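For reference, a minimal sketch of that one-file pattern (a fragment only; the file name and the Vec/Mat variables pressure, velocity, A are made up for illustration, and ierr and the objects themselves are assumed to exist already): open a single binary viewer, view each object in turn, and read them back in the same order, since the format has no index.

    PetscViewer viewer;
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"fields.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
    ierr = VecView(pressure,viewer);CHKERRQ(ierr);      /* object 1 */
    ierr = VecView(velocity,viewer);CHKERRQ(ierr);      /* object 2 */
    ierr = MatView(A,viewer);CHKERRQ(ierr);             /* object 3 */
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

    /* read back in exactly the same order the objects were written */
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"fields.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
    ierr = VecLoad(pressure,viewer);CHKERRQ(ierr);
    ierr = VecLoad(velocity,viewer);CHKERRQ(ierr);
    ierr = MatLoad(A,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

On the MATLAB side the same file can then be read with PetscBinaryRead(), again taking the objects in the order they were written.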
Barry From rm93 at buffalo.edu Thu Jan 5 19:27:58 2012 From: rm93 at buffalo.edu (Reza Madankan) Date: Thu, 5 Jan 2012 20:27:58 -0500 Subject: [petsc-users] using PETSc commands in parallel for loop Message-ID: Hello; I had a quick question about using PETSc inside parallel for loop in C language. In more detail, I have couple of lines of matrix algebra which is written by using PETSc inside a for loop that I would like to parallelize it. Here is the code that I have written: MPI_Comm_size(MPI_COMM_WORLD,&Np); MPI_Comm_rank(MPI_COMM_WORLD,&myid); for (j=myid*(nw/Np);j<(myid+1)*(nw/Np);j++) { MatCreate(PETSC_COMM_WORLD,&Ypcq); MatSetSizes(Ypcq,PETSC_DECIDE,PETSC_DECIDE,ns*tindex_f,1); MatSetFromOptions(Ypcq); for (k=0; k From jedbrown at mcs.anl.gov Thu Jan 5 19:38:44 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 5 Jan 2012 19:38:44 -0600 Subject: [petsc-users] using PETSc commands in parallel for loop In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 19:27, Reza Madankan wrote: > I am trying to parallelize this loop using MPI, but I don't get the right > result. I appreciate if anyone could help me on that? > MPI is not a method for loop-level parallelization. Please read the PETSc manual examples examples to learn how MPI-based parallelism works. OpenMP is a different approach that specifically does loop-level parallelism, but its scope is more limited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Johannes.Huber at unibas.ch Fri Jan 6 00:46:53 2012 From: Johannes.Huber at unibas.ch (Johannes.Huber at unibas.ch) Date: Fri, 06 Jan 2012 07:46:53 +0100 Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> <20120105093913.595846m3ijgyw1ip@webmail.unibas.ch> Message-ID: <20120106074653.17381rd8ey2xe4t9@webmail.unibas.ch> Quoting Jed Brown : > On Thu, Jan 5, 2012 at 02:39, wrote: > >> When I take a look to the valgrind output, I see a few lines about >> unaddressable bytes. Is this a reason to concern? >> > > Please paste lines for questions like this. 
> Here are the lines: ==22507== Unaddressable byte(s) found during client check request ==22507== at 0x4E34BFD: check_mem_is_defined_untyped (libmpiwrap.c:953) ==22507== by 0x4E499AA: walk_type (libmpiwrap.c:691) ==22507== by 0x4E4D7A3: PMPI_Allreduce (libmpiwrap.c:924) ==22507== by 0x74846D0: MPIR_Get_contextid (in /usr/lib/libmpich.so.1.2) ==22507== by 0x7484CF1: MPIR_Comm_copy (in /usr/lib/libmpich.so.1.2) ==22507== by 0x747D280: PMPI_Comm_dup (in /usr/lib/libmpich.so.1.2) ==22507== by 0x4E48C4A: PMPI_Comm_dup (libmpiwrap.c:2110) ==22507== by 0x586878D: PetscCommDuplicate (tagm.c:149) ==22507== by 0x54718AD: PetscHeaderCreate_Private (inherit.c:51) ==22507== by 0x5891130: VecCreate (veccreate.c:39) ==22507== by 0x400E3A: main (Test.C:10) ==22507== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) free'd and ==22508== Unaddressable byte(s) found during client check request ==22508== at 0x4E34BFD: check_mem_is_defined_untyped (libmpiwrap.c:953) ==22508== by 0x4E499AA: walk_type (libmpiwrap.c:691) ==22508== by 0x4E4D7A3: PMPI_Allreduce (libmpiwrap.c:924) ==22508== by 0x74846D0: MPIR_Get_contextid (in /usr/lib/libmpich.so.1.2) ==22508== by 0x7484CF1: MPIR_Comm_copy (in /usr/lib/libmpich.so.1.2) ==22508== by 0x747D280: PMPI_Comm_dup (in /usr/lib/libmpich.so.1.2) ==22508== by 0x4E48C4A: PMPI_Comm_dup (libmpiwrap.c:2110) ==22508== by 0x586878D: PetscCommDuplicate (tagm.c:149) ==22508== by 0x54718AD: PetscHeaderCreate_Private (inherit.c:51) ==22508== by 0x5891130: VecCreate (veccreate.c:39) ==22508== by 0x400E3A: main (Test.C:10) ==22508== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) free'd ==22508== --22508-- REDIR: 0x744f8c0 (PMPI_Attr_put) redirected to 0x4e46b46 (PMPI_Attr_put) ... --22508-- REDIR: 0x74df090 (PMPI_Recv) redirected to 0x4e4e4b4 (PMPI_Recv) ==22508== Uninitialised byte(s) found during client check request ==22508== at 0x4E49738: PMPI_Get_count (libmpiwrap.c:953) ==22508== by 0x4E4E704: PMPI_Recv (libmpiwrap.c:419) ==22508== by 0x56BE525: VecView_MPI_ASCII (pdvec.c:78) ==22508== by 0x56C1BDA: VecView_MPI (pdvec.c:837) ==22508== by 0x58A9C9D: VecView (vector.c:746) ==22508== by 0x400EEA: main (Test.C:21) ==22508== Address 0x7fefffe30 is on thread 1's stack ==22508== Uninitialised value was created by a stack allocation ==22508== at 0x56BD490: VecView_MPI_ASCII (pdvec.c:36) ==22508== ==22508== Uninitialised byte(s) found during client check request ==22508== at 0x4E49738: PMPI_Get_count (libmpiwrap.c:953) ==22508== by 0x56BE57F: VecView_MPI_ASCII (pdvec.c:79) ==22508== by 0x56C1BDA: VecView_MPI (pdvec.c:837) ==22508== by 0x58A9C9D: VecView (vector.c:746) ==22508== by 0x400EEA: main (Test.C:21) ==22508== Address 0x7fefffe30 is on thread 1's stack ==22508== Uninitialised value was created by a stack allocation ==22508== at 0x56BD490: VecView_MPI_ASCII (pdvec.c:36) > >> BTW: I also see memory leaks from getpwuid. Does anybody know about a >> patch for this? >> > > I think this is a libc issue. > ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. 
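Coming back to the earlier question about parallelizing a for loop with MPI: a minimal sketch of the ownership-range idiom that PETSc programs use instead is below. The function name and the formula for the entries are placeholders; the point is that each rank loops only over the rows it owns and then calls the collective assembly routines.

    #include <petscvec.h>

    PetscErrorCode FillLocalPart(Vec x)
    {
      PetscInt       i, rstart, rend;
      PetscScalar    val;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      /* each rank sees only its own contiguous block of rows */
      ierr = VecGetOwnershipRange(x,&rstart,&rend);CHKERRQ(ierr);
      for (i = rstart; i < rend; i++) {
        val  = 1.0/(PetscReal)(i+1);               /* placeholder entry */
        ierr = VecSetValue(x,i,val,INSERT_VALUES);CHKERRQ(ierr);
      }
      /* assembly is collective: every rank must call it, even if it set nothing */
      ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
      ierr = VecAssemblyEnd(x);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }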
From agrayver at gfz-potsdam.de Fri Jan 6 03:09:56 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Fri, 06 Jan 2012 10:09:56 +0100 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: <4F06BA64.4020402@gfz-potsdam.de> On 06.01.2012 01:17, Barry Smith wrote: > On Jan 5, 2012, at 9:40 AM, Jed Brown wrote: > >> On Thu, Jan 5, 2012 at 09:36, Alexander Grayver wrote: >> Maybe this should be noted in the documentation? >> >> Yes, I think the old file should be closed (if it exists), but I'll wait for comment. > I never thought about the case where someone called PetscViewerFileSetName() twice. I'm surprised that it works at all. > > Yes, it should (IMHO) be changed to close the old file if used twice. > > Barry > > BTW: It is also possible to put many vectors and matrices in the same file and read them in using several different ways using PetscBinaryRead(). I (personally) think any approach that involves creating hundreds of files is nuts. This is not always convinient to store everything in one file, but in some cases I do want to use it. I haven't found any examples on that. Do I have to use FILE_MODE_APPEND and then write? What happens if file doesn't exist? Regards, Alexander From jedbrown at mcs.anl.gov Fri Jan 6 06:45:42 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 6 Jan 2012 06:45:42 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F06BA64.4020402@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> Message-ID: On Fri, Jan 6, 2012 at 03:09, Alexander Grayver wrote: > This is not always convinient to store everything in one file, but in some > cases I do want to use it. I haven't found any examples on that. Do I have > to use FILE_MODE_APPEND and then write? What happens if file doesn't exist? You just create a viewer and then call MatView(), VecView(), etc, repeatedly for each object you want to put in the file (e.g. once per time step). No need for FILE_MODE_APPEND and unless you want to append to an existing file. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Fri Jan 6 06:51:46 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Fri, 06 Jan 2012 13:51:46 +0100 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> Message-ID: <4F06EE62.6060901@gfz-potsdam.de> On 06.01.2012 13:45, Jed Brown wrote: > On Fri, Jan 6, 2012 at 03:09, Alexander Grayver > > wrote: > > This is not always convinient to store everything in one file, but > in some cases I do want to use it. I haven't found any examples on > that. Do I have to use FILE_MODE_APPEND and then write? What > happens if file doesn't exist? > > > You just create a viewer and then call MatView(), VecView(), etc, > repeatedly for each object you want to put in the file (e.g. once per > time step). No need for FILE_MODE_APPEND and unless you want to append > to an existing file. 
Ok, that was my meaning to use PetscViewerFileSetName from the beginning since if you have let's say ten different objects (Mat and Vec) and you need to output them at each iteration (time step or frequency for multi-freqs modeling) you need ten viewer objects which is not cool I guess, that is why I started to use one viewer and change just a name of the file. And to be honest I don't see any reason why having ten viewers is better than calling PetscViewerFileSetName ten times. Regards, Alexander -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 6 07:13:32 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 6 Jan 2012 07:13:32 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F06EE62.6060901@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> <4F06EE62.6060901@gfz-potsdam.de> Message-ID: On Fri, Jan 6, 2012 at 06:51, Alexander Grayver wrote: > Ok, that was my meaning to use PetscViewerFileSetName from the beginning > since if you have let's say ten different objects (Mat and Vec) and you > need to output them at each iteration (time step or frequency for > multi-freqs modeling) you need ten viewer objects which is not cool I > guess, that is why I started to use one viewer and change just a name of > the file. > And to be honest I don't see any reason why having ten viewers is better > than calling PetscViewerFileSetName ten times. > Ten open file handles is not a big deal, just use ten viewers. There is file system metadata overhead associated with opening and closing files and it can get quite large on parallel file systems. If you planned to have thousands of open file handles, you should rethink your output format. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Jan 6 08:35:00 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 6 Jan 2012 08:35:00 -0600 (CST) Subject: [petsc-users] Error in VecAssemeblyBegin after VecView In-Reply-To: <20120106074653.17381rd8ey2xe4t9@webmail.unibas.ch> References: <20120103111123.11696oxrgxp0hgzv@webmail.unibas.ch> <20120103134129.12977c8zc6k5a73t@webmail.unibas.ch> <1204C7B9-DBEC-4708-A31A-2AC8BACD5385@mcs.anl.gov> <20120104185922.85472mk1dvoubm9m@webmail.unibas.ch> <3CD2B57B-3FEA-4BC0-BF9A-983521D1A050@mcs.anl.gov> <20120105093913.595846m3ijgyw1ip@webmail.unibas.ch> <20120106074653.17381rd8ey2xe4t9@webmail.unibas.ch> Message-ID: These messages are from within mpi. If you wish to have a valgrind clean mpi - build one with --download-mpich=1. [use a different PETSC_ARCH for this build] Satish On Fri, 6 Jan 2012, Johannes.Huber at unibas.ch wrote: > Quoting Jed Brown : > > > On Thu, Jan 5, 2012 at 02:39, wrote: > > > > > When I take a look to the valgrind output, I see a few lines about > > > unaddressable bytes. Is this a reason to concern? > > > > > > > Please paste lines for questions like this. 
> > > Here are the lines: > > ==22507== Unaddressable byte(s) found during client check request > ==22507== at 0x4E34BFD: check_mem_is_defined_untyped (libmpiwrap.c:953) > ==22507== by 0x4E499AA: walk_type (libmpiwrap.c:691) > ==22507== by 0x4E4D7A3: PMPI_Allreduce (libmpiwrap.c:924) > ==22507== by 0x74846D0: MPIR_Get_contextid (in /usr/lib/libmpich.so.1.2) > ==22507== by 0x7484CF1: MPIR_Comm_copy (in /usr/lib/libmpich.so.1.2) > ==22507== by 0x747D280: PMPI_Comm_dup (in /usr/lib/libmpich.so.1.2) > ==22507== by 0x4E48C4A: PMPI_Comm_dup (libmpiwrap.c:2110) > ==22507== by 0x586878D: PetscCommDuplicate (tagm.c:149) > ==22507== by 0x54718AD: PetscHeaderCreate_Private (inherit.c:51) > ==22507== by 0x5891130: VecCreate (veccreate.c:39) > ==22507== by 0x400E3A: main (Test.C:10) > ==22507== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) > free'd > > and > > ==22508== Unaddressable byte(s) found during client check request > ==22508== at 0x4E34BFD: check_mem_is_defined_untyped (libmpiwrap.c:953) > ==22508== by 0x4E499AA: walk_type (libmpiwrap.c:691) > ==22508== by 0x4E4D7A3: PMPI_Allreduce (libmpiwrap.c:924) > ==22508== by 0x74846D0: MPIR_Get_contextid (in /usr/lib/libmpich.so.1.2) > ==22508== by 0x7484CF1: MPIR_Comm_copy (in /usr/lib/libmpich.so.1.2) > ==22508== by 0x747D280: PMPI_Comm_dup (in /usr/lib/libmpich.so.1.2) > ==22508== by 0x4E48C4A: PMPI_Comm_dup (libmpiwrap.c:2110) > ==22508== by 0x586878D: PetscCommDuplicate (tagm.c:149) > ==22508== by 0x54718AD: PetscHeaderCreate_Private (inherit.c:51) > ==22508== by 0x5891130: VecCreate (veccreate.c:39) > ==22508== by 0x400E3A: main (Test.C:10) > ==22508== Address 0xffffffffffffffff is not stack'd, malloc'd or (recently) > free'd > ==22508== > --22508-- REDIR: 0x744f8c0 (PMPI_Attr_put) redirected to 0x4e46b46 > (PMPI_Attr_put) > > ... > > --22508-- REDIR: 0x74df090 (PMPI_Recv) redirected to 0x4e4e4b4 (PMPI_Recv) > ==22508== Uninitialised byte(s) found during client check request > ==22508== at 0x4E49738: PMPI_Get_count (libmpiwrap.c:953) > ==22508== by 0x4E4E704: PMPI_Recv (libmpiwrap.c:419) > ==22508== by 0x56BE525: VecView_MPI_ASCII (pdvec.c:78) > ==22508== by 0x56C1BDA: VecView_MPI (pdvec.c:837) > ==22508== by 0x58A9C9D: VecView (vector.c:746) > ==22508== by 0x400EEA: main (Test.C:21) > ==22508== Address 0x7fefffe30 is on thread 1's stack > ==22508== Uninitialised value was created by a stack allocation > ==22508== at 0x56BD490: VecView_MPI_ASCII (pdvec.c:36) > ==22508== > ==22508== Uninitialised byte(s) found during client check request > ==22508== at 0x4E49738: PMPI_Get_count (libmpiwrap.c:953) > ==22508== by 0x56BE57F: VecView_MPI_ASCII (pdvec.c:79) > ==22508== by 0x56C1BDA: VecView_MPI (pdvec.c:837) > ==22508== by 0x58A9C9D: VecView (vector.c:746) > ==22508== by 0x400EEA: main (Test.C:21) > ==22508== Address 0x7fefffe30 is on thread 1's stack > ==22508== Uninitialised value was created by a stack allocation > ==22508== at 0x56BD490: VecView_MPI_ASCII (pdvec.c:36) > > > > > > BTW: I also see memory leaks from getpwuid. Does anybody know about a > > > patch for this? > > > > > > > I think this is a libc issue. > > > > > > ---------------------------------------------------------------- > This message was sent using IMP, the Internet Messaging Program. 
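For reference, the configure step Satish describes might look something like this (the PETSC_ARCH name is only an example):

    ./configure PETSC_ARCH=arch-mpich-valgrind --download-mpich=1 --with-debugging=1

The application is then rebuilt and linked against that PETSC_ARCH before running it under valgrind again.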
> > From li at loyno.edu Fri Jan 6 16:18:39 2012 From: li at loyno.edu (Xuefeng Li) Date: Fri, 6 Jan 2012 16:18:39 -0600 (CST) Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: Hello, everyone! The stencil width of my DA to a DMMG object is created to be 1. It is later adjusted using DASetStencilWidth() to 2. DAGetInfo() confirmed that the stencil width is indeed 2. However, DAGetGhostCorners()/DAGetCorners() show that the ghost point width is 1, instead of 2. How and where can we inform Petsc that the stencil width has been adjusted? I am running Petsc-3.1-p8. Thanks in advance. Regards, --Xuefeng Li, (504)865-3340(phone) Like floating clouds, the heart rests easy Like flowing water, the spirit stays free http://www.loyno.edu/~li/home New Orleans, Louisiana (504)865-2051(fax) From jedbrown at mcs.anl.gov Fri Jan 6 16:22:27 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 6 Jan 2012 16:22:27 -0600 Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Fri, Jan 6, 2012 at 16:18, Xuefeng Li
  • wrote: > The stencil width of my DA to a DMMG object is > created to be 1. It is later adjusted using > DASetStencilWidth() to 2. > You are not supposed to do this. Either call DMDACreate2d() (or whichever dimension you want) or call the sequence ierr = DMDACreate(comm, da);CHKERRQ(ierr); ierr = DMDASetDim(*da, 2);CHKERRQ(ierr); ierr = DMDASetSizes(*da, M, N, 1);CHKERRQ(ierr); ierr = DMDASetNumProcs(*da, m, n, PETSC_DECIDE);CHKERRQ(ierr); ierr = DMDASetBoundaryType(*da, bx, by, DMDA_BOUNDARY_NONE);CHKERRQ(ierr); ierr = DMDASetDof(*da, dof);CHKERRQ(ierr); ierr = DMDASetStencilType(*da, stencil_type);CHKERRQ(ierr); ierr = DMDASetStencilWidth(*da, s);CHKERRQ(ierr); ierr = DMDASetOwnershipRanges(*da, lx, ly, PETSC_NULL);CHKERRQ(ierr); ierr = DMSetFromOptions(*da);CHKERRQ(ierr); ierr = DMSetUp(*da);CHKERRQ(ierr); I am updating all these functions now to give an error if you call them after DMSetUp(). > DAGetInfo() confirmed > that the stencil width is indeed 2. However, > DAGetGhostCorners()/**DAGetCorners() show that > the ghost point width is 1, instead of 2. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From li at loyno.edu Fri Jan 6 16:32:12 2012 From: li at loyno.edu (Xuefeng Li) Date: Fri, 6 Jan 2012 16:32:12 -0600 (CST) Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Fri, 6 Jan 2012, Jed Brown wrote: > On Fri, Jan 6, 2012 at 16:18, Xuefeng Li
  • wrote: > >> The stencil width of my DA to a DMMG object is >> created to be 1. It is later adjusted using >> DASetStencilWidth() to 2. >> > > You are not supposed to do this. > > Either call DMDACreate2d() (or whichever dimension you want) or call the > sequence > Understand. However, is there a way to change/adjust the stencil width AFTER the DA has been created and set up to a DMMG? Or, in general, can we attach different DA objects to a DMMG at different times? > ierr = DMDACreate(comm, da);CHKERRQ(ierr); > ierr = DMDASetDim(*da, 2);CHKERRQ(ierr); > ierr = DMDASetSizes(*da, M, N, 1);CHKERRQ(ierr); > ierr = DMDASetNumProcs(*da, m, n, PETSC_DECIDE);CHKERRQ(ierr); > ierr = DMDASetBoundaryType(*da, bx, by, DMDA_BOUNDARY_NONE);CHKERRQ(ierr); > ierr = DMDASetDof(*da, dof);CHKERRQ(ierr); > ierr = DMDASetStencilType(*da, stencil_type);CHKERRQ(ierr); > ierr = DMDASetStencilWidth(*da, s);CHKERRQ(ierr); > ierr = DMDASetOwnershipRanges(*da, lx, ly, PETSC_NULL);CHKERRQ(ierr); > ierr = DMSetFromOptions(*da);CHKERRQ(ierr); > ierr = DMSetUp(*da);CHKERRQ(ierr); > > > I am updating all these functions now to give an error if you call them > after DMSetUp(). > > >> DAGetInfo() confirmed >> that the stencil width is indeed 2. However, >> DAGetGhostCorners()/**DAGetCorners() show that >> the ghost point width is 1, instead of 2. >> > Regards, --Xuefeng Li, (504)865-3340(phone) Like floating clouds, the heart rests easy Like flowing water, the spirit stays free http://www.loyno.edu/~li/home New Orleans, Louisiana (504)865-2051(fax) From knepley at gmail.com Fri Jan 6 16:34:42 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 6 Jan 2012 16:34:42 -0600 Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Fri, Jan 6, 2012 at 4:32 PM, Xuefeng Li
  • wrote: > On Fri, 6 Jan 2012, Jed Brown wrote: > > On Fri, Jan 6, 2012 at 16:18, Xuefeng Li
  • wrote: >> >> The stencil width of my DA to a DMMG object is >>> created to be 1. It is later adjusted using >>> DASetStencilWidth() to 2. >>> >>> >> You are not supposed to do this. >> >> Either call DMDACreate2d() (or whichever dimension you want) or call the >> sequence >> >> Understand. > > However, is there a way to change/adjust the stencil width > AFTER the DA has been created and set up to a DMMG? Or, > in general, can we attach different DA objects to a DMMG > at different times? > 1) You cannot change this after setting up the DA. Just make another DA, since they are very lightweight. 2) I think structuring new code around DMMG is wrong. It has been removed in petsc-dev and is deprecated in 3.2. What are you trying to do? Thanks, Matt > ierr = DMDACreate(comm, da);CHKERRQ(ierr); >> ierr = DMDASetDim(*da, 2);CHKERRQ(ierr); >> ierr = DMDASetSizes(*da, M, N, 1);CHKERRQ(ierr); >> ierr = DMDASetNumProcs(*da, m, n, PETSC_DECIDE);CHKERRQ(ierr); >> ierr = DMDASetBoundaryType(*da, bx, by, DMDA_BOUNDARY_NONE);CHKERRQ(** >> ierr); >> ierr = DMDASetDof(*da, dof);CHKERRQ(ierr); >> ierr = DMDASetStencilType(*da, stencil_type);CHKERRQ(ierr); >> ierr = DMDASetStencilWidth(*da, s);CHKERRQ(ierr); >> ierr = DMDASetOwnershipRanges(*da, lx, ly, PETSC_NULL);CHKERRQ(ierr); >> ierr = DMSetFromOptions(*da);CHKERRQ(**ierr); >> ierr = DMSetUp(*da);CHKERRQ(ierr); >> >> >> I am updating all these functions now to give an error if you call them >> after DMSetUp(). >> >> >> DAGetInfo() confirmed >>> that the stencil width is indeed 2. However, >>> DAGetGhostCorners()/****DAGetCorners() show that >>> the ghost point width is 1, instead of 2. >>> >>> >> > Regards, > > --Xuefeng Li, (504)865-3340(phone) > Like floating clouds, the heart rests easy > Like flowing water, the spirit stays free > http://www.loyno.edu/~li/home > New Orleans, Louisiana (504)865-2051(fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 6 16:34:55 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 6 Jan 2012 16:34:55 -0600 Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Fri, Jan 6, 2012 at 16:32, Xuefeng Li
  • wrote: > However, is there a way to change/adjust the stencil width > AFTER the DA has been created and set up to a DMMG? Or, > in general, can we attach different DA objects to a DMMG > at different times? > 1. Please do not use DMMG, it has been almost completely removed. 2. You can create a new DM if you need to change its layout. -------------- next part -------------- An HTML attachment was scrubbed... URL: From li at loyno.edu Fri Jan 6 16:43:49 2012 From: li at loyno.edu (Xuefeng Li) Date: Fri, 6 Jan 2012 16:43:49 -0600 (CST) Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Fri, 6 Jan 2012, Matthew Knepley wrote: > On Fri, Jan 6, 2012 at 4:32 PM, Xuefeng Li
  • wrote: > >> However, is there a way to change/adjust the stencil width >> AFTER the DA has been created and set up to a DMMG? Or, >> in general, can we attach different DA objects to a DMMG >> at different times? >> > > 1) You cannot change this after setting up the DA. Just make another DA, > since > they are very lightweight. > > 2) I think structuring new code around DMMG is wrong. It has been removed in > petsc-dev and is deprecated in 3.2. What are you trying to do? > I am just trying to use the ghost points to share data among neighboring processes at one step during the whole iteration. At that particular step, the stencil width needs to be bigger. One can always set the stencil width to be big enough so that one does not need to adjust it. But this is just not as efficient as it can be. Thanks again. Regards, --Xuefeng Li, (504)865-3340(phone) Like floating clouds, the heart rests easy Like flowing water, the spirit stays free http://www.loyno.edu/~li/home New Orleans, Louisiana (504)865-2051(fax) From bsmith at mcs.anl.gov Fri Jan 6 20:58:46 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 6 Jan 2012 20:58:46 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <4F06EE62.6060901@gfz-potsdam.de> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> <4F06EE62.6060901@gfz-potsdam.de> Message-ID: <117A7ABA-9E8D-42C3-9506-F8C72B2B2248@mcs.anl.gov> On Jan 6, 2012, at 6:51 AM, Alexander Grayver wrote: > On 06.01.2012 13:45, Jed Brown wrote: >> On Fri, Jan 6, 2012 at 03:09, Alexander Grayver wrote: >> This is not always convinient to store everything in one file, but in some cases I do want to use it. I haven't found any examples on that. Do I have to use FILE_MODE_APPEND and then write? What happens if file doesn't exist? >> >> You just create a viewer and then call MatView(), VecView(), etc, repeatedly for each object you want to put in the file (e.g. once per time step). No need for FILE_MODE_APPEND and unless you want to append to an existing file. > > Ok, that was my meaning to use PetscViewerFileSetName from the beginning since if you have let's say ten different objects (Mat and Vec) and you need to output them at each iteration (time step or frequency for multi-freqs modeling) you need ten viewer objects I don't understand why you need ten viewer objects. Why not just dump all the objects into the one file (creating one Viewer object and never changing its name) one after each other and then in MATLAB read then back in one after the other. Barry > which is not cool I guess, that is why I started to use one viewer and change just a name of the file. > And to be honest I don't see any reason why having ten viewers is better than calling PetscViewerFileSetName ten times. > > Regards, > Alexander From bsmith at mcs.anl.gov Fri Jan 6 21:07:43 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 6 Jan 2012 21:07:43 -0600 Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Jan 6, 2012, at 4:43 PM, Xuefeng Li wrote: > On Fri, 6 Jan 2012, Matthew Knepley wrote: > >> On Fri, Jan 6, 2012 at 4:32 PM, Xuefeng Li
  • wrote: >> >>> However, is there a way to change/adjust the stencil width >>> AFTER the DA has been created and set up to a DMMG? Or, >>> in general, can we attach different DA objects to a DMMG >>> at different times? >>> >> >> 1) You cannot change this after setting up the DA. Just make another DA, >> since >> they are very lightweight. >> >> 2) I think structuring new code around DMMG is wrong. It has been removed in >> petsc-dev and is deprecated in 3.2. What are you trying to do? >> > I am just trying to use the ghost points to share > data among neighboring processes at one step > during the whole iteration. At that particular > step, the stencil width needs to be bigger. > > One can always set the stencil width to be big > enough so that one does not need to adjust it. > But this is just not as efficient as it can be. Just create a bigger-stencil DMDA that you use during those "special" steps. Barry > > Thanks again. > > Regards, > > --Xuefeng Li, (504)865-3340(phone) > Like floating clouds, the heart rests easy > Like flowing water, the spirit stays free > http://www.loyno.edu/~li/home > New Orleans, Louisiana (504)865-2051(fax) > From mmnasr at gmail.com Sat Jan 7 16:00:40 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sat, 7 Jan 2012 14:00:40 -0800 Subject: [petsc-users] GMRES solver Message-ID: Hi guys, I am trying to narrow down an issue with my Poisson solver. I have the following problem setup Laplace(f) = rhs(x,z,y) 0 <= x,y,z <= (Lx,Ly,Lz) I solve the Poisson equation in three dimensions with the analytical function f(x,y,z) defined by f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = 0.0. Second order descritization is used for the Poisson equation. Also, Neumann boundary condition is used everywhere, but I set the top-right-front node's value to zero to get rid of the Nullspaced matrix manually. I use 20 grid points in each direction. The problem is: I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the linear system. It takes 77,000 iterations to converge!!!! For the size of only 8,000 unknowns, even though the lsys is not preconditioned, I guess that is a LOT of iterations. Next, I setup the exact same problem in MATLAB and use their GMRES solver function. I set the same parameters and MATLAB tells me that it converges using only 3870 iterations. I know that there might be some internal differences between MATLAB and PETSc's implementations of this method, but given the fact that these two solvers are not preconditioned, I am wondering about this big difference? Any ideas? Best, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 7 16:14:51 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 7 Jan 2012 16:14:51 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: On Sat, Jan 7, 2012 at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > Hi guys, > > I am trying to narrow down an issue with my Poisson solver. > I have the following problem setup > > Laplace(f) = rhs(x,z,y) > 0 <= x,y,z <= (Lx,Ly,Lz) > > I solve the Poisson equation in three dimensions with the analytical > function f(x,y,z) defined by > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = > 0.0. > > Second order descritization is used for the Poisson equation. 
> Also, Neumann boundary condition is used everywhere, but I set the > top-right-front node's value to zero to get rid of the Nullspaced matrix > manually. > I use 20 grid points in each direction. > > The problem is: > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the > linear system. > It takes 77,000 iterations to converge!!!! > > For the size of only 8,000 unknowns, even though the lsys is not > preconditioned, I guess that is a LOT of iterations. > Next, I setup the exact same problem in MATLAB and use their GMRES solver > function. > I set the same parameters and MATLAB tells me that it converges using only > 3870 iterations. > 1) Matlab could be doing a lot of things. I am betting that they scale the problem, so -pc_type jacobi. 2) Why would anyone ever use GMRES without a preconditioner, particularly for a problem where several optimal PCs exist and are present in PETSc. Matt > I know that there might be some internal differences between MATLAB and > PETSc's implementations of this method, but given the fact that these two > solvers are not preconditioned, I am wondering about this big difference? > > Any ideas? > > Best, > Mohamad > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jan 7 19:39:14 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 7 Jan 2012 19:39:14 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > Hi guys, > > I am trying to narrow down an issue with my Poisson solver. > I have the following problem setup > > Laplace(f) = rhs(x,z,y) > 0 <= x,y,z <= (Lx,Ly,Lz) > > I solve the Poisson equation in three dimensions with the analytical function f(x,y,z) defined by > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = 0.0. > > Second order descritization is used for the Poisson equation. > Also, Neumann boundary condition is used everywhere, but I set the top-right-front node's value to zero to get rid of the Nullspaced matrix manually. Please don't do this. That results in a unnecessaryly huge condition number. Use KSPSetNullSpace.() Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. Barry > I use 20 grid points in each direction. > > The problem is: > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the linear system. > It takes 77,000 iterations to converge!!!! > > For the size of only 8,000 unknowns, even though the lsys is not preconditioned, I guess that is a LOT of iterations. > Next, I setup the exact same problem in MATLAB and use their GMRES solver function. > I set the same parameters and MATLAB tells me that it converges using only 3870 iterations. > > I know that there might be some internal differences between MATLAB and PETSc's implementations of this method, but given the fact that these two solvers are not preconditioned, I am wondering about this big difference? > > Any ideas? 
> > Best, > Mohamad > From jedbrown at mcs.anl.gov Sun Jan 8 16:42:01 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 8 Jan 2012 16:42:01 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <117A7ABA-9E8D-42C3-9506-F8C72B2B2248@mcs.anl.gov> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> <4F06EE62.6060901@gfz-potsdam.de> <117A7ABA-9E8D-42C3-9506-F8C72B2B2248@mcs.anl.gov> Message-ID: It lets you read a time series of one field without needing to read all fields. Note that we could implement a "next object" seek functionality that would partly alleviate this issue, but the reader code would still need to know how many fields were interlaced. On Jan 6, 2012 8:58 PM, "Barry Smith" wrote: > > On Jan 6, 2012, at 6:51 AM, Alexander Grayver wrote: > > > On 06.01.2012 13:45, Jed Brown wrote: > >> On Fri, Jan 6, 2012 at 03:09, Alexander Grayver < > agrayver at gfz-potsdam.de> wrote: > >> This is not always convinient to store everything in one file, but in > some cases I do want to use it. I haven't found any examples on that. Do I > have to use FILE_MODE_APPEND and then write? What happens if file doesn't > exist? > >> > >> You just create a viewer and then call MatView(), VecView(), etc, > repeatedly for each object you want to put in the file (e.g. once per time > step). No need for FILE_MODE_APPEND and unless you want to append to an > existing file. > > > > Ok, that was my meaning to use PetscViewerFileSetName from the beginning > since if you have let's say ten different objects (Mat and Vec) and you > need to output them at each iteration (time step or frequency for > multi-freqs modeling) you need ten viewer objects > > I don't understand why you need ten viewer objects. Why not just dump all > the objects into the one file (creating one Viewer object and never > changing its name) one after each other and then in MATLAB read then back > in one after the other. > > Barry > > > which is not cool I guess, that is why I started to use one viewer and > change just a name of the file. > > And to be honest I don't see any reason why having ten viewers is better > than calling PetscViewerFileSetName ten times. > > > > Regards, > > Alexander > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Sun Jan 8 17:13:16 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sun, 8 Jan 2012 15:13:16 -0800 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: Thanks Barry and Matt, Barry, Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. In my main code, I am actually doing what you suggested, i.e. GMRES + boomerAMG to solve for my Poisson equation. I have not used the KSPSetNullSpace() though. The problem is that my code (CFD, incompressible flow 3D) diverges after a long time integration and I am trying to find out why. The system that I have is a fairly big one, i.e. 100 million grid points and more. I see that pressure solution (which is obviously coupled to the velocity field) starts showing strange behavior. That's why I tried to first double check my pressure solver. 
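For reference, the KSPSetNullSpace() route Barry recommends would look roughly like the fragment below (only a sketch: ksp is assumed to be the KSP used for the pressure solve, ierr is assumed declared, and this replaces pinning one node to zero).

    MatNullSpace nullsp;
    /* the pure Neumann pressure problem has the constant vector in its null space */
    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nullsp);CHKERRQ(ierr);
    ierr = KSPSetNullSpace(ksp,nullsp);CHKERRQ(ierr);
    ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);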
Based on your experience, do you think that not using a nullspace() for the pressure solver for that linear system size could have caused it to diverge? Matt, 1) Matlab could be doing a lot of things. I am betting that they scale the problem, so -pc_type jacobi. That could be right. The reason that I relied on the MATLAB's gmres solver to behave exactly similar to PETSc was just their "help" saying that ************ X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is [] then a preconditioner is not applied. ************ Best, Mohamad On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: > > On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi guys, > > > > I am trying to narrow down an issue with my Poisson solver. > > I have the following problem setup > > > > Laplace(f) = rhs(x,z,y) > > 0 <= x,y,z <= (Lx,Ly,Lz) > > > > I solve the Poisson equation in three dimensions with the analytical > function f(x,y,z) defined by > > > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = > 0.0. > > > > Second order descritization is used for the Poisson equation. > > Also, Neumann boundary condition is used everywhere, but I set the > top-right-front node's value to zero to get rid of the Nullspaced matrix > manually. > > Please don't do this. That results in a unnecessaryly huge condition > number. Use KSPSetNullSpace.() > > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > Barry > > > I use 20 grid points in each direction. > > > > The problem is: > > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the > linear system. > > It takes 77,000 iterations to converge!!!! > > > > For the size of only 8,000 unknowns, even though the lsys is not > preconditioned, I guess that is a LOT of iterations. > > Next, I setup the exact same problem in MATLAB and use their GMRES > solver function. > > I set the same parameters and MATLAB tells me that it converges using > only 3870 iterations. > > > > I know that there might be some internal differences between MATLAB and > PETSc's implementations of this method, but given the fact that these two > solvers are not preconditioned, I am wondering about this big difference? > > > > Any ideas? > > > > Best, > > Mohamad > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sun Jan 8 17:33:20 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 8 Jan 2012 17:33:20 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: Missing the null space can definitely cause problems. I suggest checking unpreconditioned residuals. On Jan 8, 2012 5:13 PM, "Mohamad M. Nasr-Azadani" wrote: > Thanks Barry and Matt, > > Barry, > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > In my main code, I am actually doing what you suggested, i.e. GMRES + > boomerAMG to solve for my Poisson equation. I have not used the > KSPSetNullSpace() though. 
> The problem is that my code (CFD, incompressible flow 3D) diverges after a > long time integration and I am trying to find out why. > The system that I have is a fairly big one, i.e. 100 million grid points > and more. > I see that pressure solution (which is obviously coupled to the velocity > field) starts showing strange behavior. > That's why I tried to first double check my pressure solver. > > Based on your experience, do you think that not using a nullspace() for > the pressure solver for that linear system size could have caused it to > diverge? > > > Matt, > 1) Matlab could be doing a lot of things. I am betting that they scale the > problem, so -pc_type jacobi. > > That could be right. The reason that I relied on the MATLAB's gmres solver > to behave exactly similar to PETSc was just their "help" saying that > ************ > X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 > and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is > [] then a preconditioner is not applied. > ************ > > Best, > Mohamad > > On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: > >> >> On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: >> >> > Hi guys, >> > >> > I am trying to narrow down an issue with my Poisson solver. >> > I have the following problem setup >> > >> > Laplace(f) = rhs(x,z,y) >> > 0 <= x,y,z <= (Lx,Ly,Lz) >> > >> > I solve the Poisson equation in three dimensions with the analytical >> function f(x,y,z) defined by >> > >> > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K >> > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = >> 0.0. >> > >> > Second order descritization is used for the Poisson equation. >> > Also, Neumann boundary condition is used everywhere, but I set the >> top-right-front node's value to zero to get rid of the Nullspaced matrix >> manually. >> >> Please don't do this. That results in a unnecessaryly huge condition >> number. Use KSPSetNullSpace.() >> >> Also if you are really solving the Poisson problem you should use >> multigrid; if simple geometry then geometric multigrid if complicated >> geometry probably easier to use hypre BoomerAMG. No sane person solves >> Poisson problem with anything but a multigrid or FFT based solver. >> >> Barry >> >> > I use 20 grid points in each direction. >> > >> > The problem is: >> > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the >> linear system. >> > It takes 77,000 iterations to converge!!!! >> > >> > For the size of only 8,000 unknowns, even though the lsys is not >> preconditioned, I guess that is a LOT of iterations. >> > Next, I setup the exact same problem in MATLAB and use their GMRES >> solver function. >> > I set the same parameters and MATLAB tells me that it converges using >> only 3870 iterations. >> > >> > I know that there might be some internal differences between MATLAB and >> PETSc's implementations of this method, but given the fact that these two >> solvers are not preconditioned, I am wondering about this big difference? >> > >> > Any ideas? >> > >> > Best, >> > Mohamad >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Sun Jan 8 17:33:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 8 Jan 2012 17:33:21 -0600 Subject: [petsc-users] How/where to update DASetStencilWidth() info In-Reply-To: References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: To keep the stencil small for matrix preallocation, you can create two DMDAs. On Jan 6, 2012 9:07 PM, "Barry Smith" wrote: > > On Jan 6, 2012, at 4:43 PM, Xuefeng Li wrote: > > > On Fri, 6 Jan 2012, Matthew Knepley wrote: > > > >> On Fri, Jan 6, 2012 at 4:32 PM, Xuefeng Li
  • wrote: > >> > >>> However, is there a way to change/adjust the stencil width > >>> AFTER the DA has been created and set up to a DMMG? Or, > >>> in general, can we attach different DA objects to a DMMG > >>> at different times? > >>> > >> > >> 1) You cannot change this after setting up the DA. Just make another DA, > >> since > >> they are very lightweight. > >> > >> 2) I think structuring new code around DMMG is wrong. It has been > removed in > >> petsc-dev and is deprecated in 3.2. What are you trying to do? > >> > > I am just trying to use the ghost points to share > > data among neighboring processes at one step > > during the whole iteration. At that particular > > step, the stencil width needs to be bigger. > > > > One can always set the stencil width to be big > > enough so that one does not need to adjust it. > > But this is just not as efficient as it can be. > > Just create a bigger-stencil DMDA that you use during those "special" > steps. > > Barry > > > > > Thanks again. > > > > Regards, > > > > --Xuefeng Li, (504)865-3340(phone) > > Like floating clouds, the heart rests easy > > Like flowing water, the spirit stays free > > http://www.loyno.edu/~li/home > > New Orleans, Louisiana (504)865-2051(fax) > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Sun Jan 8 20:28:10 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sun, 8 Jan 2012 18:28:10 -0800 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: Thanks Jed, Maybe I was not 100% clear, the way that I dealt with the nullspace was to set the pressure to zero at one node in the entire domain. Best, Mohamad On Sun, Jan 8, 2012 at 3:33 PM, Jed Brown wrote: > Missing the null space can definitely cause problems. > > I suggest checking unpreconditioned residuals. > On Jan 8, 2012 5:13 PM, "Mohamad M. Nasr-Azadani" > wrote: > >> Thanks Barry and Matt, >> >> Barry, >> Also if you are really solving the Poisson problem you should use >> multigrid; if simple geometry then geometric multigrid if complicated >> geometry probably easier to use hypre BoomerAMG. No sane person solves >> Poisson problem with anything but a multigrid or FFT based solver. >> >> In my main code, I am actually doing what you suggested, i.e. GMRES + >> boomerAMG to solve for my Poisson equation. I have not used the >> KSPSetNullSpace() though. >> The problem is that my code (CFD, incompressible flow 3D) diverges after >> a long time integration and I am trying to find out why. >> The system that I have is a fairly big one, i.e. 100 million grid points >> and more. >> I see that pressure solution (which is obviously coupled to the velocity >> field) starts showing strange behavior. >> That's why I tried to first double check my pressure solver. >> >> Based on your experience, do you think that not using a nullspace() for >> the pressure solver for that linear system size could have caused it to >> diverge? >> >> >> Matt, >> 1) Matlab could be doing a lot of things. I am betting that they scale >> the problem, so -pc_type jacobi. >> >> That could be right. The reason that I relied on the MATLAB's gmres >> solver to behave exactly similar to PETSc was just their "help" saying that >> ************ >> X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 >> and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is >> [] then a preconditioner is not applied. 
>> ************ >> >> Best, >> Mohamad >> >> On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: >> >>> >>> On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: >>> >>> > Hi guys, >>> > >>> > I am trying to narrow down an issue with my Poisson solver. >>> > I have the following problem setup >>> > >>> > Laplace(f) = rhs(x,z,y) >>> > 0 <= x,y,z <= (Lx,Ly,Lz) >>> > >>> > I solve the Poisson equation in three dimensions with the analytical >>> function f(x,y,z) defined by >>> > >>> > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K >>> > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = >>> 0.0. >>> > >>> > Second order descritization is used for the Poisson equation. >>> > Also, Neumann boundary condition is used everywhere, but I set the >>> top-right-front node's value to zero to get rid of the Nullspaced matrix >>> manually. >>> >>> Please don't do this. That results in a unnecessaryly huge condition >>> number. Use KSPSetNullSpace.() >>> >>> Also if you are really solving the Poisson problem you should use >>> multigrid; if simple geometry then geometric multigrid if complicated >>> geometry probably easier to use hypre BoomerAMG. No sane person solves >>> Poisson problem with anything but a multigrid or FFT based solver. >>> >>> Barry >>> >>> > I use 20 grid points in each direction. >>> > >>> > The problem is: >>> > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve >>> the linear system. >>> > It takes 77,000 iterations to converge!!!! >>> > >>> > For the size of only 8,000 unknowns, even though the lsys is not >>> preconditioned, I guess that is a LOT of iterations. >>> > Next, I setup the exact same problem in MATLAB and use their GMRES >>> solver function. >>> > I set the same parameters and MATLAB tells me that it converges using >>> only 3870 iterations. >>> > >>> > I know that there might be some internal differences between MATLAB >>> and PETSc's implementations of this method, but given the fact that these >>> two solvers are not preconditioned, I am wondering about this big >>> difference? >>> > >>> > Any ideas? >>> > >>> > Best, >>> > Mohamad >>> > >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jan 8 20:47:45 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 8 Jan 2012 20:47:45 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: On Jan 8, 2012, at 5:13 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Barry and Matt, > > Barry, > Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. > > In my main code, I am actually doing what you suggested, i.e. GMRES + boomerAMG to solve for my Poisson equation. I have not used the KSPSetNullSpace() though. > The problem is that my code (CFD, incompressible flow 3D) diverges after a long time integration How do you know it is diverging? Because it looks weird? You are comparing it to something? How accurately are you solving the linear system? > and I am trying to find out why. > The system that I have is a fairly big one, i.e. 100 million grid points and more. > I see that pressure solution (which is obviously coupled to the velocity field) starts showing strange behavior. What strange behavior? 
Is the right hand side for the pressure solution reasonable but the solution "strange"? How do you know you are not giving "bad stuff" to the pressure solve? Barry > That's why I tried to first double check my pressure solver. > > Based on your experience, do you think that not using a nullspace() for the pressure solver for that linear system size could have caused it to diverge? > > > Matt, > 1) Matlab could be doing a lot of things. I am betting that they scale the problem, so -pc_type jacobi. > > That could be right. The reason that I relied on the MATLAB's gmres solver to behave exactly similar to PETSc was just their "help" saying that > ************ > X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 > and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is > [] then a preconditioner is not applied. > ************ > > Best, > Mohamad > > On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: > > On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi guys, > > > > I am trying to narrow down an issue with my Poisson solver. > > I have the following problem setup > > > > Laplace(f) = rhs(x,z,y) > > 0 <= x,y,z <= (Lx,Ly,Lz) > > > > I solve the Poisson equation in three dimensions with the analytical function f(x,y,z) defined by > > > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = 0.0. > > > > Second order descritization is used for the Poisson equation. > > Also, Neumann boundary condition is used everywhere, but I set the top-right-front node's value to zero to get rid of the Nullspaced matrix manually. > > Please don't do this. That results in a unnecessaryly huge condition number. Use KSPSetNullSpace.() > > Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. > > Barry > > > I use 20 grid points in each direction. > > > > The problem is: > > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the linear system. > > It takes 77,000 iterations to converge!!!! > > > > For the size of only 8,000 unknowns, even though the lsys is not preconditioned, I guess that is a LOT of iterations. > > Next, I setup the exact same problem in MATLAB and use their GMRES solver function. > > I set the same parameters and MATLAB tells me that it converges using only 3870 iterations. > > > > I know that there might be some internal differences between MATLAB and PETSc's implementations of this method, but given the fact that these two solvers are not preconditioned, I am wondering about this big difference? > > > > Any ideas? > > > > Best, > > Mohamad > > > > From jedbrown at mcs.anl.gov Sun Jan 8 20:54:05 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 8 Jan 2012 20:54:05 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: On Sun, Jan 8, 2012 at 20:28, Mohamad M. Nasr-Azadani wrote: > Maybe I was not 100% clear, the way that I dealt with the nullspace was to > set the pressure to zero at one node in the entire domain. Well, this pollutes the spectrum, although most preconditioners fix it. We recommend using KSPSetNullSpace() instead. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Sun Jan 8 21:33:33 2012 From: mmnasr at gmail.com (Mohamad M. 
Nasr-Azadani) Date: Sun, 8 Jan 2012 19:33:33 -0800 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: How do you know it is diverging? Because it looks weird? You are comparing it to something? While marching in time, it only takes 20-30 iterations to solve for pressure. After a very long integrattion in time, all of a sudden the pressure lsys does not converge even in 10,000 iterations. So, that's why I am saying it is converging. How accurately are you solving the linear system? I tried rtol = 1e-10 and 1e-12. What strange behavior? Is the right hand side for the pressure solution reasonable but the solution "strange"? How do you know you are not giving "bad stuff" to the pressure solve? For my case, I get huge pressure gradients close to the solid boundaries. That, indeed, causes very small delta_t's when I keep integrating. That should not happen cause at this stage of my simulations, nothing is really happening in the flow field. Very small velocities and velocity gradients. I am trying to find the problem, since I have only seen this for such big problem size, i.e. 100 million grid points. That makes it really hard to see if the I am feeding back rhs into the pressure linear system. Best, Mohamad On Sun, Jan 8, 2012 at 6:47 PM, Barry Smith wrote: > > On Jan 8, 2012, at 5:13 PM, Mohamad M. Nasr-Azadani wrote: > > > Thanks Barry and Matt, > > > > Barry, > > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > > > In my main code, I am actually doing what you suggested, i.e. GMRES + > boomerAMG to solve for my Poisson equation. I have not used the > KSPSetNullSpace() though. > > The problem is that my code (CFD, incompressible flow 3D) diverges after > a long time integration > > How do you know it is diverging? Because it looks weird? You are > comparing it to something? > > > How accurately are you solving the linear system? > > > > and I am trying to find out why. > > The system that I have is a fairly big one, i.e. 100 million grid points > and more. > > I see that pressure solution (which is obviously coupled to the velocity > field) starts showing strange behavior. > > What strange behavior? Is the right hand side for the pressure > solution reasonable but the solution "strange"? How do you know you are not > giving "bad stuff" to the pressure solve? > > Barry > > > > That's why I tried to first double check my pressure solver. > > > > Based on your experience, do you think that not using a nullspace() for > the pressure solver for that linear system size could have caused it to > diverge? > > > > > > Matt, > > 1) Matlab could be doing a lot of things. I am betting that they scale > the problem, so -pc_type jacobi. > > > > That could be right. The reason that I relied on the MATLAB's gmres > solver to behave exactly similar to PETSc was just their "help" saying that > > ************ > > X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 > > and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is > > [] then a preconditioner is not applied. > > ************ > > > > Best, > > Mohamad > > > > On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: > > > > On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > > > > > Hi guys, > > > > > > I am trying to narrow down an issue with my Poisson solver. 
> > > I have the following problem setup > > > > > > Laplace(f) = rhs(x,z,y) > > > 0 <= x,y,z <= (Lx,Ly,Lz) > > > > > > I solve the Poisson equation in three dimensions with the analytical > function f(x,y,z) defined by > > > > > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > > > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = > 0.0. > > > > > > Second order descritization is used for the Poisson equation. > > > Also, Neumann boundary condition is used everywhere, but I set the > top-right-front node's value to zero to get rid of the Nullspaced matrix > manually. > > > > Please don't do this. That results in a unnecessaryly huge condition > number. Use KSPSetNullSpace.() > > > > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > > > Barry > > > > > I use 20 grid points in each direction. > > > > > > The problem is: > > > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve > the linear system. > > > It takes 77,000 iterations to converge!!!! > > > > > > For the size of only 8,000 unknowns, even though the lsys is not > preconditioned, I guess that is a LOT of iterations. > > > Next, I setup the exact same problem in MATLAB and use their GMRES > solver function. > > > I set the same parameters and MATLAB tells me that it converges using > only 3870 iterations. > > > > > > I know that there might be some internal differences between MATLAB > and PETSc's implementations of this method, but given the fact that these > two solvers are not preconditioned, I am wondering about this big > difference? > > > > > > Any ideas? > > > > > > Best, > > > Mohamad > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Sun Jan 8 21:34:09 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sun, 8 Jan 2012 19:34:09 -0800 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: Thanks Jed for the point. Best, Mohamad On Sun, Jan 8, 2012 at 6:54 PM, Jed Brown wrote: > On Sun, Jan 8, 2012 at 20:28, Mohamad M. Nasr-Azadani wrote: > >> Maybe I was not 100% clear, the way that I dealt with the nullspace was >> to set the pressure to zero at one node in the entire domain. > > > Well, this pollutes the spectrum, although most preconditioners fix it. We > recommend using KSPSetNullSpace() instead. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jan 9 00:11:30 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jan 2012 00:11:30 -0600 Subject: [petsc-users] GMRES solver In-Reply-To: References: Message-ID: <4F10B2C6-47A0-4133-BD25-513C5B6DF904@mcs.anl.gov> On Jan 8, 2012, at 9:33 PM, Mohamad M. Nasr-Azadani wrote: > How do you know it is diverging? Because it looks weird? You are comparing it to something? > > While marching in time, it only takes 20-30 iterations to solve for pressure. > After a very long integrattion in time, all of a sudden the pressure lsys does not converge even in 10,000 iterations. So, that's why I am saying it is converging. Are you using the exact same matrix at each time-step or does the matrix change? > > How accurately are you solving the linear system? > I tried rtol = 1e-10 and 1e-12. > > What strange behavior? 
Is the right hand side for the pressure solution reasonable but the solution "strange"? How do you know you are not giving "bad stuff" to the pressure solve? > For my case, I get huge pressure gradients close to the solid boundaries. That, indeed, causes very small delta_t's when I keep integrating. That should not happen cause at this stage of my simulations, nothing is really happening in the flow field. Very small velocities and velocity gradients. > I am trying to find the problem, since I have only seen this for such big problem size, i.e. 100 million grid points. That makes it really hard to see if the I am feeding back rhs into the pressure linear system. > > Best, > Mohamad > > > > > > > On Sun, Jan 8, 2012 at 6:47 PM, Barry Smith wrote: > > On Jan 8, 2012, at 5:13 PM, Mohamad M. Nasr-Azadani wrote: > > > Thanks Barry and Matt, > > > > Barry, > > Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. > > > > In my main code, I am actually doing what you suggested, i.e. GMRES + boomerAMG to solve for my Poisson equation. I have not used the KSPSetNullSpace() though. > > The problem is that my code (CFD, incompressible flow 3D) diverges after a long time integration > > How do you know it is diverging? Because it looks weird? You are comparing it to something? > > > How accurately are you solving the linear system? > > > > and I am trying to find out why. > > The system that I have is a fairly big one, i.e. 100 million grid points and more. > > I see that pressure solution (which is obviously coupled to the velocity field) starts showing strange behavior. > > What strange behavior? Is the right hand side for the pressure solution reasonable but the solution "strange"? How do you know you are not giving "bad stuff" to the pressure solve? > > Barry > > > > That's why I tried to first double check my pressure solver. > > > > Based on your experience, do you think that not using a nullspace() for the pressure solver for that linear system size could have caused it to diverge? > > > > > > Matt, > > 1) Matlab could be doing a lot of things. I am betting that they scale the problem, so -pc_type jacobi. > > > > That could be right. The reason that I relied on the MATLAB's gmres solver to behave exactly similar to PETSc was just their "help" saying that > > ************ > > X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 > > and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M is > > [] then a preconditioner is not applied. > > ************ > > > > Best, > > Mohamad > > > > On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith wrote: > > > > On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > > > > > Hi guys, > > > > > > I am trying to narrow down an issue with my Poisson solver. > > > I have the following problem setup > > > > > > Laplace(f) = rhs(x,z,y) > > > 0 <= x,y,z <= (Lx,Ly,Lz) > > > > > > I solve the Poisson equation in three dimensions with the analytical function f(x,y,z) defined by > > > > > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > > > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) = 0.0. > > > > > > Second order descritization is used for the Poisson equation. 
> > > Also, Neumann boundary condition is used everywhere, but I set the top-right-front node's value to zero to get rid of the Nullspaced matrix manually. > > > > Please don't do this. That results in a unnecessaryly huge condition number. Use KSPSetNullSpace.() > > > > Also if you are really solving the Poisson problem you should use multigrid; if simple geometry then geometric multigrid if complicated geometry probably easier to use hypre BoomerAMG. No sane person solves Poisson problem with anything but a multigrid or FFT based solver. > > > > Barry > > > > > I use 20 grid points in each direction. > > > > > > The problem is: > > > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve the linear system. > > > It takes 77,000 iterations to converge!!!! > > > > > > For the size of only 8,000 unknowns, even though the lsys is not preconditioned, I guess that is a LOT of iterations. > > > Next, I setup the exact same problem in MATLAB and use their GMRES solver function. > > > I set the same parameters and MATLAB tells me that it converges using only 3870 iterations. > > > > > > I know that there might be some internal differences between MATLAB and PETSc's implementations of this method, but given the fact that these two solvers are not preconditioned, I am wondering about this big difference? > > > > > > Any ideas? > > > > > > Best, > > > Mohamad > > > > > > > > > From mmnasr at gmail.com Mon Jan 9 00:28:17 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sun, 8 Jan 2012 22:28:17 -0800 Subject: [petsc-users] GMRES solver In-Reply-To: <4F10B2C6-47A0-4133-BD25-513C5B6DF904@mcs.anl.gov> References: <4F10B2C6-47A0-4133-BD25-513C5B6DF904@mcs.anl.gov> Message-ID: Are you using the exact same matrix at each time-step or does the matrix change? Pressure lsys does not change over time. However, the diagonal of the velocity lsys's do change due to the variable time step. Best, M On Sun, Jan 8, 2012 at 10:11 PM, Barry Smith wrote: > > On Jan 8, 2012, at 9:33 PM, Mohamad M. Nasr-Azadani wrote: > > > How do you know it is diverging? Because it looks weird? You are > comparing it to something? > > > > While marching in time, it only takes 20-30 iterations to solve for > pressure. > > After a very long integrattion in time, all of a sudden the pressure > lsys does not converge even in 10,000 iterations. So, that's why I am > saying it is converging. > > Are you using the exact same matrix at each time-step or does the > matrix change? > > > > > > How accurately are you solving the linear system? > > I tried rtol = 1e-10 and 1e-12. > > > > What strange behavior? Is the right hand side for the pressure > solution reasonable but the solution "strange"? How do you know you are not > giving "bad stuff" to the pressure solve? > > For my case, I get huge pressure gradients close to the solid > boundaries. That, indeed, causes very small delta_t's when I keep > integrating. That should not happen cause at this stage of my simulations, > nothing is really happening in the flow field. Very small velocities and > velocity gradients. > > I am trying to find the problem, since I have only seen this for such > big problem size, i.e. 100 million grid points. That makes it really hard > to see if the I am feeding back rhs into the pressure linear system. > > > > Best, > > Mohamad > > > > > > > > > > > > > > On Sun, Jan 8, 2012 at 6:47 PM, Barry Smith wrote: > > > > On Jan 8, 2012, at 5:13 PM, Mohamad M. 
Nasr-Azadani wrote: > > > > > Thanks Barry and Matt, > > > > > > Barry, > > > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > > > > > In my main code, I am actually doing what you suggested, i.e. GMRES + > boomerAMG to solve for my Poisson equation. I have not used the > KSPSetNullSpace() though. > > > The problem is that my code (CFD, incompressible flow 3D) diverges > after a long time integration > > > > How do you know it is diverging? Because it looks weird? You are > comparing it to something? > > > > > > How accurately are you solving the linear system? > > > > > > > and I am trying to find out why. > > > The system that I have is a fairly big one, i.e. 100 million grid > points and more. > > > I see that pressure solution (which is obviously coupled to the > velocity field) starts showing strange behavior. > > > > What strange behavior? Is the right hand side for the pressure > solution reasonable but the solution "strange"? How do you know you are not > giving "bad stuff" to the pressure solve? > > > > Barry > > > > > > > That's why I tried to first double check my pressure solver. > > > > > > Based on your experience, do you think that not using a nullspace() > for the pressure solver for that linear system size could have caused it to > diverge? > > > > > > > > > Matt, > > > 1) Matlab could be doing a lot of things. I am betting that they scale > the problem, so -pc_type jacobi. > > > > > > That could be right. The reason that I relied on the MATLAB's gmres > solver to behave exactly similar to PETSc was just their "help" saying that > > > ************ > > > X = GMRES(A,B,RESTART,TOL,MAXIT,M1,M2) use preconditioner M or M=M1*M2 > > > and effectively solve the system inv(M)*A*X = inv(M)*B for X. If M > is > > > [] then a preconditioner is not applied. > > > ************ > > > > > > Best, > > > Mohamad > > > > > > On Sat, Jan 7, 2012 at 5:39 PM, Barry Smith > wrote: > > > > > > On Jan 7, 2012, at 4:00 PM, Mohamad M. Nasr-Azadani wrote: > > > > > > > Hi guys, > > > > > > > > I am trying to narrow down an issue with my Poisson solver. > > > > I have the following problem setup > > > > > > > > Laplace(f) = rhs(x,z,y) > > > > 0 <= x,y,z <= (Lx,Ly,Lz) > > > > > > > > I solve the Poisson equation in three dimensions with the analytical > function f(x,y,z) defined by > > > > > > > > f(x,z,y) = cos(2*pi*x/Lx)*cos(2*pi*y/Ly)*cos(2*pi*z/Lz) + K > > > > where Lx = Ly =Lz = 1.0 and K is a constant I use to set f(Lx,Ly,Lz) > = 0.0. > > > > > > > > Second order descritization is used for the Poisson equation. > > > > Also, Neumann boundary condition is used everywhere, but I set the > top-right-front node's value to zero to get rid of the Nullspaced matrix > manually. > > > > > > Please don't do this. That results in a unnecessaryly huge condition > number. Use KSPSetNullSpace.() > > > > > > Also if you are really solving the Poisson problem you should use > multigrid; if simple geometry then geometric multigrid if complicated > geometry probably easier to use hypre BoomerAMG. No sane person solves > Poisson problem with anything but a multigrid or FFT based solver. > > > > > > Barry > > > > > > > I use 20 grid points in each direction. > > > > > > > > The problem is: > > > > I use GMRES(20) without any preconditioners (rtol = 1e-12) to solve > the linear system. 
> > > > It takes 77,000 iterations to converge!!!! > > > > > > > > For the size of only 8,000 unknowns, even though the lsys is not > preconditioned, I guess that is a LOT of iterations. > > > > Next, I setup the exact same problem in MATLAB and use their GMRES > solver function. > > > > I set the same parameters and MATLAB tells me that it converges > using only 3870 iterations. > > > > > > > > I know that there might be some internal differences between MATLAB > and PETSc's implementations of this method, but given the fact that these > two solvers are not preconditioned, I am wondering about this big > difference? > > > > > > > > Any ideas? > > > > > > > > Best, > > > > Mohamad > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Mon Jan 9 09:57:40 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 09 Jan 2012 16:57:40 +0100 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F05C4C5.9060102@gmail.com> Message-ID: <4F0B0E74.80902@gmail.com> Hi Jed, On 5/1/2012 4:59 PM, Jed Brown wrote: > On Thu, Jan 5, 2012 at 09:41, TAY wee-beng > wrote: > > I just did a -log_summary and attach the text file, running across > 8 and 16 processors. My most important concern is whether the load > is balanced across the processors. > > In 16 processors case, for the time, it seems that the ratio for > many events are higher than 1, reaching up to 6.8 for VecScatterEnd > > > This takes about 1% of the run time and it's scaling well, so don't > worry about it. > > and 132.1 (?) for MatAssemblyBegin. > > > This is about 2% of run time, but it's not scaling. Do you compute a > lot of matrix entries on processes that don't own the rows? I only compute rows which the processor own. Can it be the memory allocation? I'll check on that. > > Most of your solve time is going into PCSetUp() and PCApply, both of > which are getting more expensive as you add processes. These are more > than 10x more than spent in MatMult() and MatMult() takes slightly > less time on more processes, so the increase isn't entirely due to > memory issues. > > What methods are you using? What do you mean methods? I am doing Cartesian grid 3d CFD, using fractional mtd which solves the momentum and Poisson eqns. I construct the linear eqn matrix and insert them in PETSc matrix/vectors. Then I solve using Bicsstab and hypre AMG respectively. Why is PCSetUp() and PCApply using more time? > > However, for the flops, ratios are 1 and 1.1. so which is more > important to look at? time or flops? > > > If you would rather do a lot of flops than solve the problem in a > reasonable amount of time, you might as well use dense methods. ;-) Thanks again! -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 9 10:07:34 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 10:07:34 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F0B0E74.80902@gmail.com> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F05C4C5.9060102@gmail.com> <4F0B0E74.80902@gmail.com> Message-ID: On Mon, Jan 9, 2012 at 09:57, TAY wee-beng wrote: > I only compute rows which the processor own. Can it be the memory > allocation? 
I'll check on that. > Usually memory allocation mistakes cost much more. > > Most of your solve time is going into PCSetUp() and PCApply, both of > which are getting more expensive as you add processes. These are more than > 10x more than spent in MatMult() and MatMult() takes slightly less time on > more processes, so the increase isn't entirely due to memory issues. > > What methods are you using? > > > What do you mean methods? I am doing Cartesian grid 3d CFD, > Then just use a regular partition. If your meshes have boundary layers, you might want to adjust the subdomain aspect ratios so that strongly coupled cells tend to reside on the same process, but don't bother with general graph partitions (like ParMETIS) because you will end up writing most of an unstructured CFD code in order to use those partitions efficiently. > using fractional mtd which solves the momentum and Poisson eqns. I > construct the linear eqn matrix and insert them in PETSc matrix/vectors. > Then I solve using Bicsstab and hypre AMG respectively. Why is PCSetUp() > and PCApply using more time? > It is expensive because BoomerAMG setup and apply is expensive. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Mon Jan 9 10:31:44 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Mon, 09 Jan 2012 17:31:44 +0100 Subject: [petsc-users] Convergence/accuracy degradation with increasing number of procs Message-ID: <4F0B1670.7010300@gfz-potsdam.de> I have tested default GMRES+ILU(0) solver for problem of size ~200000 and I observed that both convergence rate and accuracy degrade when I increase number of processes. E.g., number of iterations almost doubles when going from 1 to 4 processors. I noticed that ILU(0) is somehow represented by sequential matrices internally and feel that this might be a reason? Regards, Alexander From knepley at gmail.com Mon Jan 9 10:36:19 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jan 2012 10:36:19 -0600 Subject: [petsc-users] Convergence/accuracy degradation with increasing number of procs In-Reply-To: <4F0B1670.7010300@gfz-potsdam.de> References: <4F0B1670.7010300@gfz-potsdam.de> Message-ID: On Mon, Jan 9, 2012 at 10:31 AM, Alexander Grayver wrote: > I have tested default GMRES+ILU(0) solver for problem of size ~200000 and > I observed that both convergence rate and accuracy degrade when I increase > number of processes. > E.g., number of iterations almost doubles when going from 1 to 4 > processors. > > I noticed that ILU(0) is somehow represented by sequential matrices > internally and feel that this might be a reason? > The default is Block Jacobi+ILU(0) and GMRES(30) which will definitely degrade as the number of processes is increased. You would need to use an optimal preconditioner like Multigrid if you want a constant number of iterates. Matt > Regards, > Alexander > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
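One concrete way to adjust the subdomain aspect ratios mentioned above is to pass an explicit process grid to DMDACreate3d() instead of letting PETSc choose one. The following is a minimal sketch only, assuming the petsc-3.2-style DMDA interface; the 128^3 grid, the 4 x 2 x 2 process grid, and the variable names are illustrative and not taken from this thread (m*n*p must match the number of MPI processes, here 16):

    DM da;
    PetscErrorCode ierr;
    /* 128^3 global grid on 16 processes with a 4 x 2 x 2 process grid,
       so each subdomain owns a 32 x 64 x 64 block of grid points;
       lx,ly,lz are left as PETSC_NULL for an even split in each direction */
    ierr = DMDACreate3d(PETSC_COMM_WORLD,
                        DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE,
                        DMDA_STENCIL_STAR,
                        128, 128, 128,   /* global grid points M, N, P */
                        4, 2, 2,         /* process grid m, n, p */
                        1, 1,            /* dof, stencil width */
                        PETSC_NULL, PETSC_NULL, PETSC_NULL,
                        &da); CHKERRQ(ierr);

The lx, ly, lz arguments (which come up again further down in the thread) can additionally prescribe the exact number of grid points owned in each direction, which helps keep strongly coupled boundary-layer cells on the same process.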
URL: From agrayver at gfz-potsdam.de Mon Jan 9 10:37:55 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Mon, 09 Jan 2012 17:37:55 +0100 Subject: [petsc-users] Convergence/accuracy degradation with increasing number of procs In-Reply-To: References: <4F0B1670.7010300@gfz-potsdam.de> Message-ID: <4F0B17E3.9080706@gfz-potsdam.de> I see this is due to preconditioner, but could you explain shortly what particularly causes this problem? It's not obvious to me. Thanks. On 09.01.2012 17:36, Matthew Knepley wrote: > On Mon, Jan 9, 2012 at 10:31 AM, Alexander Grayver > > wrote: > > I have tested default GMRES+ILU(0) solver for problem of size > ~200000 and I observed that both convergence rate and accuracy > degrade when I increase number of processes. > E.g., number of iterations almost doubles when going from 1 to 4 > processors. > > I noticed that ILU(0) is somehow represented by sequential > matrices internally and feel that this might be a reason? > > > The default is Block Jacobi+ILU(0) and GMRES(30) which will definitely > degrade as the number of > processes is increased. You would need to use an optimal > preconditioner like Multigrid if you want > a constant number of iterates. > > Matt > > Regards, > Alexander > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jan 9 10:46:04 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Jan 2012 10:46:04 -0600 Subject: [petsc-users] Convergence/accuracy degradation with increasing number of procs In-Reply-To: <4F0B17E3.9080706@gfz-potsdam.de> References: <4F0B1670.7010300@gfz-potsdam.de> <4F0B17E3.9080706@gfz-potsdam.de> Message-ID: On Mon, Jan 9, 2012 at 10:37 AM, Alexander Grayver wrote: > ** > I see this is due to preconditioner, but could you explain shortly what > particularly causes this problem? It's not obvious to me. > The size of the block being factorized by ILU(0) decreases as p increases. Thus, it is a weaker preconditioner. I recommend Yousef Saad's book which explains all this clearly. Matt > Thanks. > > On 09.01.2012 17:36, Matthew Knepley wrote: > > On Mon, Jan 9, 2012 at 10:31 AM, Alexander Grayver < > agrayver at gfz-potsdam.de> wrote: > >> I have tested default GMRES+ILU(0) solver for problem of size ~200000 and >> I observed that both convergence rate and accuracy degrade when I increase >> number of processes. >> E.g., number of iterations almost doubles when going from 1 to 4 >> processors. >> >> I noticed that ILU(0) is somehow represented by sequential matrices >> internally and feel that this might be a reason? >> > > The default is Block Jacobi+ILU(0) and GMRES(30) which will definitely > degrade as the number of > processes is increased. You would need to use an optimal preconditioner > like Multigrid if you want > a constant number of iterates. > > Matt > > >> Regards, >> Alexander >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
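As an illustration of moving away from the default block Jacobi + ILU(0), the preconditioner can be selected either at run time (e.g. -pc_type hypre -pc_hypre_type boomeramg, or -pc_type ml with a --download-ml build) or in code. A minimal sketch, assuming an existing KSP named ksp and a PETSc build configured with hypre; the names here are illustrative, not from this thread:

    PC pc;
    PetscErrorCode ierr;
    ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
    ierr = PCSetType(pc, PCHYPRE); CHKERRQ(ierr);          /* use hypre */
    ierr = PCHYPRESetType(pc, "boomeramg"); CHKERRQ(ierr); /* BoomerAMG algebraic multigrid */
    ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);          /* still allow -pc_type ... to override */

Unlike one-level block Jacobi + ILU(0), a multigrid preconditioner of this kind typically keeps the iteration count roughly constant as the number of processes grows.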
URL: From agrayver at gfz-potsdam.de Mon Jan 9 10:48:36 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Mon, 09 Jan 2012 17:48:36 +0100 Subject: [petsc-users] Convergence/accuracy degradation with increasing number of procs In-Reply-To: References: <4F0B1670.7010300@gfz-potsdam.de> <4F0B17E3.9080706@gfz-potsdam.de> Message-ID: <4F0B1A64.1000801@gfz-potsdam.de> Thanks Matt, that was my guess actually. On 09.01.2012 17:46, Matthew Knepley wrote: > On Mon, Jan 9, 2012 at 10:37 AM, Alexander Grayver > > wrote: > > I see this is due to preconditioner, but could you explain shortly > what particularly causes this problem? It's not obvious to me. > > > The size of the block being factorized by ILU(0) decreases as p > increases. Thus, it is > a weaker preconditioner. I recommend Yousef Saad's book which explains > all this clearly. > > Matt > > Thanks. > > On 09.01.2012 17:36, Matthew Knepley wrote: >> On Mon, Jan 9, 2012 at 10:31 AM, Alexander Grayver >> > wrote: >> >> I have tested default GMRES+ILU(0) solver for problem of size >> ~200000 and I observed that both convergence rate and >> accuracy degrade when I increase number of processes. >> E.g., number of iterations almost doubles when going from 1 >> to 4 processors. >> >> I noticed that ILU(0) is somehow represented by sequential >> matrices internally and feel that this might be a reason? >> >> >> The default is Block Jacobi+ILU(0) and GMRES(30) which will >> definitely degrade as the number of >> processes is increased. You would need to use an optimal >> preconditioner like Multigrid if you want >> a constant number of iterates. >> >> Matt >> >> Regards, >> Alexander >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jan 9 13:01:38 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jan 2012 13:01:38 -0600 Subject: [petsc-users] [petsc-maint #101696] complex + float128 In-Reply-To: References: Message-ID: <3827E0BE-D340-4389-9762-EFB4F47C3FC5@mcs.anl.gov> We've never tried to do this. First you need to check if quadmath.h has quad complex stuff in it? If it does you need to add into petscmath.h the quad complex bindings for all the various math operations like PetscScalar and PetscSqrtScalar() etc Barry On Jan 9, 2012, at 12:05 PM, Xavier Garnaud wrote: > I am trying to build PETSc with complex number and quadruple precision, but > I get an error at the compiling stage. I do not get this error when I do > the same using real numbers. Are complex numbers incompatible with > quadruple precision? 
> Thank you very much, > Sincerely, > > Xavier > > From zonexo at gmail.com Mon Jan 9 13:11:00 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 09 Jan 2012 20:11:00 +0100 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F05C4C5.9060102@gmail.com> <4F0B0E74.80902@gmail.com> Message-ID: <4F0B3BC4.2080405@gmail.com> Hi Jed, On 9/1/2012 5:07 PM, Jed Brown wrote: > On Mon, Jan 9, 2012 at 09:57, TAY wee-beng > wrote: > > I only compute rows which the processor own. Can it be the memory > allocation? I'll check on that. > > > Usually memory allocation mistakes cost much more. > >> >> Most of your solve time is going into PCSetUp() and PCApply, both >> of which are getting more expensive as you add processes. These >> are more than 10x more than spent in MatMult() and MatMult() >> takes slightly less time on more processes, so the increase isn't >> entirely due to memory issues. >> >> What methods are you using? > > What do you mean methods? I am doing Cartesian grid 3d CFD, > > > Then just use a regular partition. If your meshes have boundary > layers, you might want to adjust the subdomain aspect ratios so that > strongly coupled cells tend to reside on the same process, but don't > bother with general graph partitions (like ParMETIS) because you will > end up writing most of an unstructured CFD code in order to use those > partitions efficiently. Can you explain a bit more about how to adjust the subdomain aspect ratios so that strongly coupled cells tend to reside on the same process? > > using fractional mtd which solves the momentum and Poisson eqns. I > construct the linear eqn matrix and insert them in PETSc > matrix/vectors. Then I solve using Bicsstab and hypre AMG > respectively. Why is PCSetUp() and PCApply using more time? > > > It is expensive because BoomerAMG setup and apply is expensive. So this is normal? Is there any suggestion to improve performance? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 9 13:15:52 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 13:15:52 -0600 Subject: [petsc-users] Software for load balancing to improve parallel performance to be used with PETSc In-Reply-To: <4F0B3BC4.2080405@gmail.com> References: <4F038802.6040307@gmail.com> <8E6421A2-7752-46B6-B632-53E72C12275E@mcs.anl.gov> <4F05C4C5.9060102@gmail.com> <4F0B0E74.80902@gmail.com> <4F0B3BC4.2080405@gmail.com> Message-ID: On Mon, Jan 9, 2012 at 13:11, TAY wee-beng wrote: > Can you explain a bit more about how to adjust the subdomain aspect ratios > so that strongly coupled cells tend to reside on the same process? > You can set the lx,ly,lz in DMDACreate3d(). > > >> using fractional mtd which solves the momentum and Poisson eqns. I >> construct the linear eqn matrix and insert them in PETSc matrix/vectors. >> Then I solve using Bicsstab and hypre AMG respectively. Why is PCSetUp() >> and PCApply using more time? >> > > It is expensive because BoomerAMG setup and apply is expensive. > > > So this is normal? Is there any suggestion to improve performance? > This is normal for BoomerAMG. You could use PCGAMG (-pc_type gamg, better with petsc-dev) or ML (--download-ml, then -pc_type ml) which are algebraic multigrid methods that are usually less expensive to setup and per iteration (but sometimes less strong). 
Geometric multigrid is another possibility. -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Mon Jan 9 16:42:52 2012 From: irving at naml.us (Geoffrey Irving) Date: Mon, 9 Jan 2012 14:42:52 -0800 Subject: [petsc-users] lying about nullspaces Message-ID: Hello, I have a matrix A and an orthogonal projection operator P onto a fairly large (but sparse) linear subspace. I want to solve the system PAPx = Pb If set up a MatNullSpace object with MatNullSpaceSetFunction, with KSP solve the system I want even if A is nonsingular? Thanks, Geoffrey From jedbrown at mcs.anl.gov Mon Jan 9 17:08:54 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 17:08:54 -0600 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: Yes, this usually works as long as the preconditioner for A is stable on the non-null subspace. On Jan 9, 2012 5:43 PM, "Geoffrey Irving" wrote: > Hello, > > I have a matrix A and an orthogonal projection operator P onto a > fairly large (but sparse) linear subspace. I want to solve the system > > PAPx = Pb > > If set up a MatNullSpace object with MatNullSpaceSetFunction, with KSP > solve the system I want even if A is nonsingular? > > Thanks, > Geoffrey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Mon Jan 9 17:20:22 2012 From: irving at naml.us (Geoffrey Irving) Date: Mon, 9 Jan 2012 15:20:22 -0800 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: Ah, right. I'm using incomplete Cholesky as a preconditioner, which usually won't be stable on the subspace. The subspace is derived from freezing the normal velocity of points involved in collisions, so it has no useful algebraic properties. It's not too difficult to symbolically apply P to A (it won't change the sparsity), but unfortunately that would make the sparsity pattern change each iteration, which would significantly increase the cost of ICC. I suppose the best alternative may be to add a large energy term of the form || x - P x ||^2 P is block diagonal with dense blocks, so ICC should have no problem fixing the resulting bad condition number. Thanks, Geoffrey On Mon, Jan 9, 2012 at 3:08 PM, Jed Brown wrote: > Yes, this usually works as long as the preconditioner for A is stable on the > non-null subspace. > > On Jan 9, 2012 5:43 PM, "Geoffrey Irving" wrote: >> >> Hello, >> >> I have a matrix A and an orthogonal projection operator P onto a >> fairly large (but sparse) linear subspace. ?I want to solve the system >> >> ? ?PAPx = Pb >> >> If set up a MatNullSpace object with MatNullSpaceSetFunction, with KSP >> solve the system I want even if A is nonsingular? >> >> Thanks, >> Geoffrey From irving at naml.us Mon Jan 9 17:24:18 2012 From: irving at naml.us (Geoffrey Irving) Date: Mon, 9 Jan 2012 15:24:18 -0800 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Mon, Jan 9, 2012 at 3:20 PM, Geoffrey Irving wrote: > Ah, right. ?I'm using incomplete Cholesky as a preconditioner, which > usually won't be stable on the subspace. ?The subspace is derived from > freezing the normal velocity of points involved in collisions, so it > has no useful algebraic properties. > > It's not too difficult to symbolically apply P to A (it won't change > the sparsity), but unfortunately that would make the sparsity pattern > change each iteration, which would significantly increase the cost of > ICC. 
Sorry, that sentence was poorly phrased. I meant that it's not too difficult to symbolically eliminate the relevant subspace from A, but that this would produce a matrix whose size changes each time around. I could also replace A with PAP directly without changing the sparsity pattern, but I imagine incomplete Cholesky might choke on the result, which would be a 3x3 block matrix with a lot of singular diagonal blocks. Geoffrey From jedbrown at mcs.anl.gov Mon Jan 9 18:30:38 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 18:30:38 -0600 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: If you make a consistent RHS, then ICC with a positive definite shift might be fine. You could certainly define a dynamic ordering that puts the "good" blocks first. If the transformation can identify the bad blocks, you could solve the embedded problem with PCREDISTRIBUTE. On Jan 9, 2012 6:24 PM, "Geoffrey Irving" wrote: > On Mon, Jan 9, 2012 at 3:20 PM, Geoffrey Irving wrote: > > Ah, right. I'm using incomplete Cholesky as a preconditioner, which > > usually won't be stable on the subspace. The subspace is derived from > > freezing the normal velocity of points involved in collisions, so it > > has no useful algebraic properties. > > > > It's not too difficult to symbolically apply P to A (it won't change > > the sparsity), but unfortunately that would make the sparsity pattern > > change each iteration, which would significantly increase the cost of > > ICC. > > Sorry, that sentence was poorly phrased. I meant that it's not too > difficult to symbolically eliminate the relevant subspace from A, but > that this would produce a matrix whose size changes each time around. > I could also replace A with PAP directly without changing the sparsity > pattern, but I imagine incomplete Cholesky might choke on the result, > which would be a 3x3 block matrix with a lot of singular diagonal > blocks. > > Geoffrey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Mon Jan 9 21:47:44 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Mon, 9 Jan 2012 19:47:44 -0800 Subject: [petsc-users] How does KSPSetNullSpace() work? Message-ID: Hi guys, It might be a naive question, but I am wondering how KSPSetNullSpace() works when it is passed to a linear system ksp context? Say, for instance, we have the simple case of Poisson equation solved in a square domain and Neumann boundary condition applied to all boundaries. Does it take the integral of the solution and set it to zero as an extra constraint? Thanks, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 9 21:57:02 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 22:57:02 -0500 Subject: [petsc-users] How does KSPSetNullSpace() work? In-Reply-To: References: Message-ID: On Mon, Jan 9, 2012 at 22:47, Mohamad M. Nasr-Azadani wrote: > It might be a naive question, but I am wondering how KSPSetNullSpace() > works when it is passed to a linear system ksp context? > Say, for instance, we have the simple case of Poisson equation solved in a > square domain and Neumann boundary condition applied to all boundaries. > Does it take the integral of the solution and set it to zero as an extra > constraint? > It just projects out whatever you provide as a null space, so the Krylov method effectively runs in the remaining subspace. 
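For the all-Neumann pressure Poisson case discussed earlier in this thread, attaching the constant null space takes only a few calls. The following is a minimal sketch, assuming petsc-3.2-era bindings, a KSP named ksp whose operator has already been set, and a right-hand-side vector b (the names are illustrative):

    MatNullSpace nullsp;
    PetscErrorCode ierr;
    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp); CHKERRQ(ierr);
    ierr = KSPSetNullSpace(ksp, nullsp); CHKERRQ(ierr);
    /* optionally make the right-hand side consistent as well; in petsc-3.2
       the third argument is an optional output vector */
    ierr = MatNullSpaceRemove(nullsp, b, PETSC_NULL); CHKERRQ(ierr);
    ierr = MatNullSpaceDestroy(&nullsp); CHKERRQ(ierr);

With this in place the Krylov method works in the subspace orthogonal to the constant vector, so there is no need to pin the pressure at a single node.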
-------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 9 22:30:59 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 9 Jan 2012 23:30:59 -0500 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Mon, Jan 9, 2012 at 18:20, Geoffrey Irving wrote: > The subspace is derived from > freezing the normal velocity of points involved in collisions, so it > has no useful algebraic properties. > About how many in practice, both as absolute numbers and as fraction of the total number of nodes? Are the elastic bodies closely packed enough to undergo locking (as in granular media). I ask because it affects the locality of the response to the constraints. > > It's not too difficult to symbolically apply P to A (it won't change > the sparsity), but unfortunately that would make the sparsity pattern > change each iteration, which would significantly increase the cost of > ICC. > It changes each time step or each nonlinear iteration, but as long as you need a few linear iterations, the cost of the fresh symbolic factorization is not likely to be high. I'm all for reusing data structures, but if you are just using ICC, it might not be worth it. Preallocating for the reduced matrix might be tricky. Note that you can also enforce the constraints using Lagrange multipliers. If the effect of the Lagrange multipliers are local, then you can likely get away with an Uzawa-type algorithm (perhaps combined with some form of multigrid for the unconstrained system). If the contact constraints cause long-range response, Uzawa-type methods may not converge as quickly, but there are still lots of alternatives. -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Mon Jan 9 23:08:56 2012 From: irving at naml.us (Geoffrey Irving) Date: Mon, 9 Jan 2012 21:08:56 -0800 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Mon, Jan 9, 2012 at 8:30 PM, Jed Brown wrote: > On Mon, Jan 9, 2012 at 18:20, Geoffrey Irving wrote: >> >> The subspace is derived from >> freezing the normal velocity of points involved in collisions, so it >> has no useful algebraic properties. > > > About how many in practice, both as absolute numbers and as fraction of the > total number of nodes? Are the elastic bodies closely packed enough to > undergo locking (as in granular media). I ask because it affects the > locality of the response to the constraints. I don't have this simulation up and running yet, but roughly I'd expect 0 to 10% of the nodes to be involved in collisions. I'm dealing only with kinematic object collisions at the moment, so pairs of close colliding nodes will have very similar collision normals, and therefore very similar constraint subspaces, and therefore shouldn't lock. >> It's not too difficult to symbolically apply P to A (it won't change >> the sparsity), but unfortunately that would make the sparsity pattern >> change each iteration, which would significantly increase the cost of >> ICC. > > It changes each time step or each nonlinear iteration, but as long as you > need a few linear iterations, the cost of the fresh symbolic factorization > is not likely to be high. I'm all for reusing data structures, but if you > are just using ICC, it might not be worth it. Preallocating for the reduced > matrix might be tricky. For now, I believe I can get away with a single linear iteration. 
Even if I need a few, the extra cost of the first linear solve appears to be drastic. However, it appears you're right that this isn't due to preconditioner setup. The first solve takes over 50 times as long as the other solves: step 1 dt = 0.00694444, time = 0 cg icc converged: iterations = 4, rtol = 0.001, error = 9.56519e-05 actual L2 residual = 1.10131e-05 max speed = 0.00728987 END step 1 0.6109 s step 2 dt = 0.00694444, time = 0.00694444 cg icc converged: iterations = 3, rtol = 0.001, error = 0.000258359 actual L2 residual = 3.13442e-05 max speed = 0.0148876 END step 2 0.0089 s Note that this is a very small problem, but even if it took 100x the iterations the first solve would still be significant more expensive than the second. However, if I pretend the nonzero pattern changes every iteration, I only see a 20% performance hit overall, so something else is happening on the first iteration. Do you know what it is? The results of -log_summary are attached if it helps. > Note that you can also enforce the constraints using Lagrange multipliers. > If the effect of the Lagrange multipliers are local, then you can likely get > away with an Uzawa-type algorithm (perhaps combined with some form of > multigrid for the unconstrained system). If the contact constraints cause > long-range response, Uzawa-type methods may not converge as quickly, but > there are still lots of alternatives. Lagrange multipliers are unfortunate since the system is otherwise definite. The effect of the constraints will in general be global, since they will often be the only force combating the net effect of gravity. In any case, if recomputing the preconditioner appears to be cheap, symbolic elimination is probably the way to go. Thanks, Geoffrey -------------- next part -------------- ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./sim on a darwin named tile.local with 1 processor, by irving Mon Jan 9 20:50:19 2012 Using Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011 Max Max/Min Avg Total Time (sec): 6.567e-01 1.00000 6.567e-01 Objects: 4.100e+01 1.00000 4.100e+01 Flops: 1.248e+07 1.00000 1.248e+07 1.248e+07 Flops/sec: 1.901e+07 1.00000 1.901e+07 1.901e+07 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 4.100e+01 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.5669e-01 100.0% 1.2482e+07 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 4.000e+01 97.6% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 25 1.0 2.0399e-03 1.0 5.25e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 42 0 0 0 0 42 0 0 0 2575 MatSolve 31 1.0 4.3278e-03 1.0 6.51e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 52 0 0 0 1 52 0 0 0 1505 MatCholFctrNum 6 1.0 1.2992e-02 1.0 1.70e+04 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 1 MatICCFactorSym 1 1.0 3.5391e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 2 1 0 0 0 2 0 MatAssemblyBegin 6 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 6 1.0 7.3647e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 6 1.0 6.2156e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 6 1.0 3.7446e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 59 1 0 0 0 60 0 VecTDot 38 1.0 5.4598e-05 1.0 2.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 3954 VecNorm 31 1.0 1.0037e-04 1.0 1.76e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1755 VecCopy 6 1.0 1.1921e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 38 1.0 6.2704e-05 1.0 2.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 3443 VecAYPX 19 1.0 6.8188e-05 1.0 9.09e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1333 KSPSetup 6 1.0 2.6941e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 7 0 0 0 0 8 0 KSPSolve 6 1.0 2.7211e-02 1.0 1.25e+07 1.0 0.0e+00 0.0e+00 3.6e+01 4100 0 0 88 4100 0 0 90 459 PCSetUp 6 1.0 2.0375e-02 1.0 1.70e+04 1.0 0.0e+00 0.0e+00 2.7e+01 3 0 0 0 66 3 0 0 0 68 1 PCApply 31 1.0 4.3375e-03 1.0 6.51e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 52 0 0 0 1 52 0 0 0 1502 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Matrix 2 2 1984184 0 Vector 11 11 265848 0 Krylov Solver 1 1 1144 0 Preconditioner 1 1 904 0 Index Set 25 25 200624 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 #PETSc Option Table entries: -ksp_max_it 100 -ksp_rtol 1e-3 -ksp_type cg -log_summary -pc_factor_levels 0 -pc_factor_mat_ordering_type nd -pc_type icc #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Mon Nov 21 14:00:28 2011 Configure options: --prefix=/opt/local --with-python --with-debugging=0 --with-c-support=1 --with-c++-support=1 --with-pic=fPIC --with-shared-libraries=0 --with-mpi=1 --PETSC_ARCH=darwin --prefix=/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/destroot/opt/local/lib/petsc --with-cc=/opt/local/bin/openmpicc --with-cxx=/opt/local/bin/openmpicxx --with-mpiexec=/opt/local/bin/openmpiexec --with-fc=/opt/local/bin/openmpif90 --LIBS=-lstdc++ ----------------------------------------- Libraries compiled on Mon Nov 21 14:00:28 2011 on tile.local Machine characteristics: Darwin-11.2.0-x86_64-i386-64bit Using PETSc directory: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5 Using PETSc arch: darwin ----------------------------------------- Using C compiler: /opt/local/bin/openmpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /opt/local/bin/openmpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/darwin/include -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/include -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/include -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/darwin/include -I/opt/local/include -I/opt/local/include/openmpi ----------------------------------------- Using C linker: /opt/local/bin/openmpicc Using Fortran linker: /opt/local/bin/openmpif90 Using libraries: -L/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/darwin/lib -L/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_petsc/petsc/work/petsc-3.2-p5/darwin/lib -lpetsc -L/opt/local/lib -lX11 -lpthread -llapack -lblas -ldl -lstdc++ -lmpi_f90 -lmpi_f77 -lmpi -lgfortran -L/opt/local/lib/gcc44/gcc/x86_64-apple-darwin11/4.4.6 -L/opt/local/lib/gcc44 -lgcc_s.10.5 -lSystem -ldl -lstdc++ ----------------------------------------- From wolfshow at gmail.com Tue Jan 10 02:27:34 2012 From: wolfshow at gmail.com (Fatcharm) Date: Tue, 10 Jan 2012 16:27:34 +0800 Subject: [petsc-users] Strongly nonlinear equation solved within the framework of PETSc Message-ID: Dear all, First, I would like to re-describe my problem. 
I want to numerically solve a strongly nonlinear fourth-order equation, which is used to describe the dynamics of a liquid film.Please find the form of the equation below, u_t = -(1/3)*C*(u^3*u_xxx)_x + (A*(u_x/u))_x "u" is the thickness of the film to be solved, is a function of "x" and time "t" , "C" and "A" are constant parameters. u_xxx is the 3-th order derivative. I wrote a PETSc programs for this problem, using the central finite difference scheme in the space and CN method in time . I start my PETSc program from the /petsc-3.2-p5/src/ts/examples/tutorials/ex13.c I wrote in my "RHSFunction" function: u = uarray[i]; ux = (uarray[i+1] - uarray[i-1]); uxx = (-2.0*u + uarray[i-1] + uarray[i+1]); uxxx = (uarray[i+2] - 2.0*uarray[i+1] + 2.0*uarray[i-1] - uarray[i-2]); uxxxx = (uarray[i+2] - 4.0*uarray[i+1] + 6.0*u - 4.0*uarray[i-1] + uarray[i-2]); ucx = -(user->c/3.0)*sx*(0.75*sx*u*u*ux*uxxx + sx*u*u*u*uxxxx); uax = (user->a)*sx*(-0.25*ux*ux/(u*u+l_res) + uxx/(u+l_res)); f[i] = ucx + uax; Also I provided the "RHSJacobian" to evaluate the changing Jacobian. Followed Jed's advice, I run the program with -snes_monitor -snes_converged_reason -ksp_converged_reason I found that "Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH", at the same time "Linear solve converged due to CONVERGED_RTOL iterations 5". I tested to solve this problem with -ts_type beuler, I found: timestep 0: time 0, solution norm 0.00899198, max 0.057735, min 0.001 Linear solve converged due to CONVERGED_RTOL iterations 30 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH timestep 1: time 0.001, solution norm 0.00899198, max 0.057735, min 0.001 Nonlinear solve converged due to CONVERGED_FNORM_ABS timestep 2: time 0.002, solution norm 0.00899198, max 0.057735, min 0.001 Linear solve converged due to CONVERGED_RTOL iterations 30 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH timestep 3: time 0.003, solution norm 0.00899198, max 0.057735, min 0.001 Nonlinear solve converged due to CONVERGED_FNORM_ABS timestep 4: time 0.004, solution norm 0.00899198, max 0.057735, min 0.001 Linear solve converged due to CONVERGED_RTOL iterations 30 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH timestep 5: time 0.005, solution norm 0.00899198, max 0.057735, min 0.001 Nonlinear solve converged due to CONVERGED_FNORM_ABS timestep 6: time 0.006, solution norm 0.00899198, max 0.057735, min 0.001 Linear solve converged due to CONVERGED_RTOL iterations 30 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH I read http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging I run with -pc_type lu, it was told that [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc LU! [0]PETSC ERROR: ------------------------------------------------------------------------ I run with -viewJacobian, the Jacobian looks reasonable. Only the value in the Jacobian is on the order of 10^7 or 10^8, Jed told me that "That may cause ill-conditioning"? If I run with -snes_type test -snes_test_display, I got timestep 0: time 0, solution norm 0.00899198, max 0.057735, min 0.001 Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. Run with -snes_test_display to show difference of hand-coded and finite difference Jacobian. 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Floating point exception! [0]PETSC ERROR: Infinite or not-a-number generated in norm! [0]PETSC ERROR: ------------------------------------------------------------------------ If I run with -snes_ls_monitor, I got timestep 0: time 0, solution norm 0.02004, max 0.0299803, min 0.0100197 Line search: gnorm after quadratic fit 6.651216893041e+04 Line search: Quadratically determined step, lambda=4.8079877662732573e-01 Line search: gnorm after quadratic fit 5.508857435695e+07 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989545851e+07 lambda=1.0000000000000002e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855229151e+07 lambda=1.0000000000000002e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989770948e+07 lambda=1.0000000000000003e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855206653e+07 lambda=1.0000000000000004e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989773203e+07 lambda=1.0000000000000004e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855206428e+07 lambda=1.0000000000000005e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989773226e+07 lambda=1.0000000000000005e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855206425e+07 lambda=1.0000000000000005e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989773226e+07 lambda=1.0000000000000006e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855206425e+07 lambda=1.0000000000000006e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508989773226e+07 lambda=1.0000000000000006e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 5.508855206425e+07 lambda=1.0000000000000007e-13 Line search: unable to find good step length! 
After 13 tries Line search: fnorm=6.6512168930411834e+04, gnorm=5.5088552064252406e+07, ynorm=2.0443556209235136e-01, minlambda=6.8680778683552649e-13, lambda=1.0000000000000007e-13, initial slope=-4.4238725909386473e+09 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH timestep 1: time 0.001, solution norm 0.0192692, max 0.021293, min 0.0161292 Line search: gnorm after quadratic fit 5.509183229235e+07 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262678284e+07 lambda=1.0000000000000002e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184917687e+07 lambda=1.0000000000000002e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262521680e+07 lambda=1.0000000000000003e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184934034e+07 lambda=1.0000000000000004e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262520119e+07 lambda=1.0000000000000004e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184934198e+07 lambda=1.0000000000000005e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262520104e+07 lambda=1.0000000000000005e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184934199e+07 lambda=1.0000000000000005e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262520104e+07 lambda=1.0000000000000006e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184934199e+07 lambda=1.0000000000000006e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509262520104e+07 lambda=1.0000000000000006e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 5.509184934199e+07 lambda=1.0000000000000007e-13 Line search: unable to find good step length! After 13 tries Line search: fnorm=9.5088545581815284e+04, gnorm=5.5091849341994494e+07, ynorm=1.4128453377933892e-01, minlambda=8.9929091375150137e-13, lambda=1.0000000000000007e-13, initial slope=-9.0419264362675228e+09 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH timestep 2: time 0.002, solution norm 0.0192692, max 0.021293, min 0.0161292 It is weird that if I run with -snes_converged_reason, I got Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH If I run with -snes_converged_reason and -info I found that Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE Could anyone please give me some suggestions? Thank you very much. Feng-Chao > > Message: 6 > Date: Thu, 29 Dec 2011 08:21:08 -0600 > From: Jed Brown > Subject: Re: [petsc-users] Strongly nonlinear equation solved within > the framework of PETSc > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Thu, Dec 29, 2011 at 08:10, Fatcharm wrote: > >> We can see that the SNES Function norm is extremely large. I think it >> is because that the initial value for the unknown function H(X,T) is >> quite small and there is some terms like (1/H)(dH/dX) or >> (1/H^2)(dH/dX) in my equations. >> > > That may cause ill-conditioning, but you could still scale the equations so > that the initial norm was of order 1. It shouldn't matter here though, > because most methods are unaffected by scaling. > > Are you computing an analytic Jacobian or using finite differencing? > > >> >> For "Linear solve did not converge due to DIVERGED_DTOL iterations >> 3270", does this mean I should change the ksp_type? 
>> > > It's important to solve the linear systems before worrying about > convergence rates for Newton methods. Try a direct solve on a small problem > first, then read this > > http://scicomp.stackexchange.com/questions/513/why-is-my-iterative-linear-solver-not-converging > > If you fix the linear solve issues, but SNES is still not converging, read > > http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 36, Issue 84 > ******************************************* > From Thomas.Witkowski at tu-dresden.de Tue Jan 10 02:45:47 2012 From: Thomas.Witkowski at tu-dresden.de (Thomas Witkowski) Date: Tue, 10 Jan 2012 09:45:47 +0100 Subject: [petsc-users] DMDA stencil for union jack triangular meshes Message-ID: <20120110094547.n69bnnnoggwgokcc@mail.zih.tu-dresden.de> I'm about to implement the multigrid preconditioner for my FEM code for the case of regulare 2D triangular meshes. My meshes have a union jack pattern, thus the nodes are equidistributed in both directions but the stencil varies between a 9-point and a 5-point stencil. Is it possible to fit this into the DMDA framework? Thomas From agrayver at gfz-potsdam.de Tue Jan 10 02:50:36 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Tue, 10 Jan 2012 09:50:36 +0100 Subject: [petsc-users] How does KSPSetNullSpace() work? In-Reply-To: References: Message-ID: <4F0BFBDC.6010809@gfz-potsdam.de> On 10.01.2012 04:57, Jed Brown wrote: > On Mon, Jan 9, 2012 at 22:47, Mohamad M. Nasr-Azadani > > wrote: > > It might be a naive question, but I am wondering how > KSPSetNullSpace() works when it is passed to a linear system ksp > context? > Say, for instance, we have the simple case of Poisson equation > solved in a square domain and Neumann boundary condition applied > to all boundaries. > Does it take the integral of the solution and set it to zero as an > extra constraint? > > > It just projects out whatever you provide as a null space, so the > Krylov method effectively runs in the remaining subspace. In examples listed on this page: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html You set nullspace without any particular information: ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp); ierr = KSPSetNullSpace(ksp, nullsp); What does it project and how works in this case? Regards, Alexander -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 10 07:43:28 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 10 Jan 2012 08:43:28 -0500 Subject: [petsc-users] How does KSPSetNullSpace() work? In-Reply-To: <4F0BFBDC.6010809@gfz-potsdam.de> References: <4F0BFBDC.6010809@gfz-potsdam.de> Message-ID: On Tue, Jan 10, 2012 at 03:50, Alexander Grayver wrote: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html > > You set nullspace without any particular information: > > ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, > &nullsp); > ierr = KSPSetNullSpace(ksp, nullsp); > > What does it project and how works in this case? This removes the constant null space. 
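In user code the whole setup is only a few lines. A minimal sketch for a singular Neumann-type problem (assuming a KSP named ksp whose operator really does have the constant vector in its null space; error checking abbreviated to CHKERRQ):

    MatNullSpace nullsp;
    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp);CHKERRQ(ierr);
    /* PETSC_TRUE means "the null space contains the constant vector" */
    ierr = KSPSetNullSpace(ksp, nullsp);CHKERRQ(ierr);
    ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);  /* the KSP keeps its own reference */

During the solve, each preconditioner application is followed by MatNullSpaceRemove(), which for the constant null space simply subtracts the mean of the vector (see the MatNullSpaceRemove() source quoted further down in this thread).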
The has_const argument is equivalent to creating a constant vector (but more efficient). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 10 07:52:45 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 10 Jan 2012 08:52:45 -0500 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Tue, Jan 10, 2012 at 00:08, Geoffrey Irving wrote: > For now, I believe I can get away with a single linear iteration. > Single linear iteration (e.g. one GMRES cycle) or single linear solve (e.g. one Newton step)? > Even if I need a few, the extra cost of the first linear solve appears > to be drastic. However, it appears you're right that this isn't due > to preconditioner setup. The first solve takes over 50 times as long > as the other solves: > > step 1 > dt = 0.00694444, time = 0 > cg icc converged: iterations = 4, rtol = 0.001, error = 9.56519e-05 > actual L2 residual = 1.10131e-05 > max speed = 0.00728987 > END step 1 0.6109 s > How are you measuring this time? In -log_summary, I see 0.02 seconds in KSPSolve(). Maybe the time you see is because there are lots of page faults until you get the code loaded into memory? > step 2 > dt = 0.00694444, time = 0.00694444 > cg icc converged: iterations = 3, rtol = 0.001, error = 0.000258359 > actual L2 residual = 3.13442e-05 > max speed = 0.0148876 > END step 2 0.0089 s > > Note that this is a very small problem, but even if it took 100x the > iterations the first solve would still be significant more expensive > than the second. However, if I pretend the nonzero pattern changes > every iteration, I only see a 20% performance hit overall, so > something else is happening on the first iteration. Do you know what > it is? The results of -log_summary are attached if it helps. > > > Note that you can also enforce the constraints using Lagrange > multipliers. > > If the effect of the Lagrange multipliers are local, then you can likely > get > > away with an Uzawa-type algorithm (perhaps combined with some form of > > multigrid for the unconstrained system). If the contact constraints cause > > long-range response, Uzawa-type methods may not converge as quickly, but > > there are still lots of alternatives. > > Lagrange multipliers are unfortunate since the system is otherwise > definite. The effect of the constraints will in general be global, > since they will often be the only force combating the net effect of > gravity. In any case, if recomputing the preconditioner appears to be > cheap, symbolic elimination is probably the way to go. > Well, if the Schur complement in the space of Lagrange multipliers is very well conditioned (or is preconditionable) and you have a good preconditioner for the positive definite part, then the saddle point formulation is not a big deal. The best method will be problem dependent, but this part of the design space is relevant when setup is high relative to solves (e.g. algebraic multigrid). -------------- next part -------------- An HTML attachment was scrubbed... 
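A side note on the timing question above: -log_summary aggregates all calls to an event such as KSPSolve() over the whole run, so isolating the first step from the later ones needs user-defined logging stages around the two phases. A minimal sketch (stage names and placement are illustrative only; in the real code the solves sit inside the application's step loop):

    PetscLogStage stage_first, stage_rest;
    ierr = PetscLogStageRegister("First step", &stage_first);CHKERRQ(ierr);
    ierr = PetscLogStageRegister("Later steps", &stage_rest);CHKERRQ(ierr);

    ierr = PetscLogStagePush(stage_first);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);            /* step 1 */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    ierr = PetscLogStagePush(stage_rest);CHKERRQ(ierr);
    /* ... remaining steps ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

With that in place, -log_summary reports KSPSolve, PCSetUp, MatCholFctrNum and the rest per stage, which makes it easy to see whether the extra 0.6 s in step 1 is spent inside the solver at all.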
URL: From jedbrown at mcs.anl.gov Tue Jan 10 08:00:18 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 10 Jan 2012 09:00:18 -0500 Subject: [petsc-users] DMDA stencil for union jack triangular meshes In-Reply-To: <20120110094547.n69bnnnoggwgokcc@mail.zih.tu-dresden.de> References: <20120110094547.n69bnnnoggwgokcc@mail.zih.tu-dresden.de> Message-ID: On Tue, Jan 10, 2012 at 03:45, Thomas Witkowski < Thomas.Witkowski at tu-dresden.de> wrote: > I'm about to implement the multigrid preconditioner for my FEM code for > the case of regulare 2D triangular meshes. My meshes have a union jack > pattern, thus the nodes are equidistributed in both directions but the > stencil varies between a 9-point and a 5-point stencil. Is it possible to > fit this into the DMDA framework? It's not easy to make DMCreateMatrix() preallocate for this arrangement, but you can overallocate (all 9-point), use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMSetMatrixPreallocateOnly.html and then assemble what you have. The resulting matrix will be compacted so that the used part is contiguous, so the run-time performance will be the same as with perfect preallocation (but needed to allocate slightly more memory). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 10 08:26:54 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jan 2012 08:26:54 -0600 Subject: [petsc-users] How does KSPSetNullSpace() work? In-Reply-To: References: <4F0BFBDC.6010809@gfz-potsdam.de> Message-ID: On Jan 10, 2012, at 7:43 AM, Jed Brown wrote: > On Tue, Jan 10, 2012 at 03:50, Alexander Grayver wrote: > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html > > You set nullspace without any particular information: > > ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp); > ierr = KSPSetNullSpace(ksp, nullsp); > > What does it project and how works in this case? > > This removes the constant null space. The has_const argument is equivalent to creating a constant vector (but more efficient). Alexandar, You need to learn to use etags or one of the other mechanisms for searching PETSc source code, see sections 13.8 Emacs Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 13.9 Vi and Vim Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 13.10EclipseUsers ..........................................175 13.11Qt Creator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 13.12 DevelopersStudioUsers ....................................177 13.13 XCodeUsers(TheAppleGUIDevelopmentSystem . . . . . . . . . . . . . . . . . . . . . 177 in the users manual: http://www.mcs.anl.gov/petsc/petsc-dev/docs/manual.pdf This way you can see for yourself exactly how PETSc is doing things. It will be quicker and more accurate than always asking us (especially since we have to look at the code to answer your question anyways). Anyways Here is how the multiplies and preconditioner applies are done in the KSP solvers #define KSP_MatMult(ksp,A,x,y) (!ksp->transpose_solve) ? MatMult(A,x,y) : MatMultTranspose(A,x,y) #define KSP_MatMultTranspose(ksp,A,x,y) (!ksp->transpose_solve) ? MatMultTranspose(A,x,y) : MatMult(A,x,y) #define KSP_PCApply(ksp,x,y) (!ksp->transpose_solve) ? 
(PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) : PCApplyTranspose(ksp->pc,x,y) #define KSP_PCApplyTranspose(ksp,x,y) (!ksp->transpose_solve) ? PCApplyTranspose(ksp->pc,x,y) : (PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) #define KSP_PCApplyBAorAB(ksp,x,y,w) (!ksp->transpose_solve) ? (PCApplyBAorAB(ksp->pc,ksp->pc_side,x,y,w) || KSP_RemoveNullSpace(ksp,y)) : PCApplyBAorABTranspose(ksp->pc,ksp->pc_side,x,y,w) #define KSP_PCApplyBAorABTranspose(ksp,x,y,w) (!ksp->transpose_solve) ? (PCApplyBAorABTranspose(ksp->pc,ksp->pc_side,x,y,w) || KSP_RemoveNullSpace(ksp,y)) : PCApplyBAorAB(ksp->pc,ksp->pc_side,x,y,w) while #define KSP_RemoveNullSpace(ksp,y) ((ksp->nullsp && ksp->pc_side == PC_LEFT) ? MatNullSpaceRemove(ksp->nullsp,y,PETSC_NULL) : 0) and /*@C MatNullSpaceRemove - Removes all the components of a null space from a vector. Collective on MatNullSpace Input Parameters: + sp - the null space context . vec - the vector from which the null space is to be removed - out - if this is requested (not PETSC_NULL) then this is a vector with the null space removed otherwise the removal is done in-place (in vec) Note: The user is not responsible for the vector returned and should not destroy it. Level: advanced .keywords: PC, null space, remove .seealso: MatNullSpaceCreate(), MatNullSpaceDestroy(), MatNullSpaceSetFunction() @*/ PetscErrorCode MatNullSpaceRemove(MatNullSpace sp,Vec vec,Vec *out) { PetscScalar sum; PetscInt i,N; PetscErrorCode ierr; PetscFunctionBegin; PetscValidHeaderSpecific(sp,MAT_NULLSPACE_CLASSID,1); PetscValidHeaderSpecific(vec,VEC_CLASSID,2); if (out) { PetscValidPointer(out,3); if (!sp->vec) { ierr = VecDuplicate(vec,&sp->vec);CHKERRQ(ierr); ierr = PetscLogObjectParent(sp,sp->vec);CHKERRQ(ierr); } ierr = VecCopy(vec,sp->vec);CHKERRQ(ierr); vec = *out = sp->vec; } if (sp->has_cnst) { ierr = VecGetSize(vec,&N);CHKERRQ(ierr); if (N > 0) { ierr = VecSum(vec,&sum);CHKERRQ(ierr); sum = sum/((PetscScalar)(-1.0*N)); ierr = VecShift(vec,sum);CHKERRQ(ierr); } } if (sp->n) { ierr = VecMDot(vec,sp->n,sp->vecs,sp->alpha);CHKERRQ(ierr); for (i=0; in; i++) sp->alpha[i] = -sp->alpha[i]; ierr = VecMAXPY(vec,sp->n,sp->alpha,sp->vecs);CHKERRQ(ierr); } if (sp->remove){ ierr = (*sp->remove)(sp,vec,sp->rmctx);CHKERRQ(ierr); } PetscFunctionReturn(0); } All I found within seconds using etags. Now you can see exactly where and how the null space is being removed. Barry From agrayver at gfz-potsdam.de Tue Jan 10 09:25:48 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Tue, 10 Jan 2012 16:25:48 +0100 Subject: [petsc-users] How does KSPSetNullSpace() work? In-Reply-To: References: <4F0BFBDC.6010809@gfz-potsdam.de> Message-ID: <4F0C587C.5030203@gfz-potsdam.de> Sorry Barry, I will try to follow this way, but it sometimes scares me to get into PETSc source code, so I dare to use your openness. Should give it up, though. :) On 10.01.2012 15:26, Barry Smith wrote: > On Jan 10, 2012, at 7:43 AM, Jed Brown wrote: > >> On Tue, Jan 10, 2012 at 03:50, Alexander Grayver wrote: >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html >> >> You set nullspace without any particular information: >> >> ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL,&nullsp); >> ierr = KSPSetNullSpace(ksp, nullsp); >> >> What does it project and how works in this case? >> >> This removes the constant null space. The has_const argument is equivalent to creating a constant vector (but more efficient). 
> Alexandar, > > You need to learn to use etags or one of the other mechanisms for searching PETSc source code, see sections > > 13.8 Emacs Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 > 13.9 Vi and Vim Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 > 13.10EclipseUsers ..........................................175 > 13.11Qt Creator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 13.12 > DevelopersStudioUsers ....................................177 13.13 > XCodeUsers(TheAppleGUIDevelopmentSystem . . . . . . . . . . . . . . . . . . . . . 177 > > in the users manual: http://www.mcs.anl.gov/petsc/petsc-dev/docs/manual.pdf > > This way you can see for yourself exactly how PETSc is doing things. It will be quicker and more accurate than always asking us (especially since we have to look at the code to answer your question anyways). Anyways > > Here is how the multiplies and preconditioner applies are done in the KSP solvers > > #define KSP_MatMult(ksp,A,x,y) (!ksp->transpose_solve) ? MatMult(A,x,y) : MatMultTranspose(A,x,y) > #define KSP_MatMultTranspose(ksp,A,x,y) (!ksp->transpose_solve) ? MatMultTranspose(A,x,y) : MatMult(A,x,y) > #define KSP_PCApply(ksp,x,y) (!ksp->transpose_solve) ? (PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) : PCApplyTranspose(ksp->pc,x,y) > #define KSP_PCApplyTranspose(ksp,x,y) (!ksp->transpose_solve) ? PCApplyTranspose(ksp->pc,x,y) : (PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) > #define KSP_PCApplyBAorAB(ksp,x,y,w) (!ksp->transpose_solve) ? (PCApplyBAorAB(ksp->pc,ksp->pc_side,x,y,w) || KSP_RemoveNullSpace(ksp,y)) : PCApplyBAorABTranspose(ksp->pc,ksp->pc_side,x,y,w) > #define KSP_PCApplyBAorABTranspose(ksp,x,y,w) (!ksp->transpose_solve) ? (PCApplyBAorABTranspose(ksp->pc,ksp->pc_side,x,y,w) || KSP_RemoveNullSpace(ksp,y)) : PCApplyBAorAB(ksp->pc,ksp->pc_side,x,y,w) > > while > > #define KSP_RemoveNullSpace(ksp,y) ((ksp->nullsp&& ksp->pc_side == PC_LEFT) ? MatNullSpaceRemove(ksp->nullsp,y,PETSC_NULL) : 0) > > and > > /*@C > MatNullSpaceRemove - Removes all the components of a null space from a vector. > > Collective on MatNullSpace > > Input Parameters: > + sp - the null space context > . vec - the vector from which the null space is to be removed > - out - if this is requested (not PETSC_NULL) then this is a vector with the null space removed otherwise > the removal is done in-place (in vec) > > Note: The user is not responsible for the vector returned and should not destroy it. 
> > Level: advanced > > .keywords: PC, null space, remove > > .seealso: MatNullSpaceCreate(), MatNullSpaceDestroy(), MatNullSpaceSetFunction() > @*/ > PetscErrorCode MatNullSpaceRemove(MatNullSpace sp,Vec vec,Vec *out) > { > PetscScalar sum; > PetscInt i,N; > PetscErrorCode ierr; > > PetscFunctionBegin; > PetscValidHeaderSpecific(sp,MAT_NULLSPACE_CLASSID,1); > PetscValidHeaderSpecific(vec,VEC_CLASSID,2); > > if (out) { > PetscValidPointer(out,3); > if (!sp->vec) { > ierr = VecDuplicate(vec,&sp->vec);CHKERRQ(ierr); > ierr = PetscLogObjectParent(sp,sp->vec);CHKERRQ(ierr); > } > ierr = VecCopy(vec,sp->vec);CHKERRQ(ierr); > vec = *out = sp->vec; > } > > if (sp->has_cnst) { > ierr = VecGetSize(vec,&N);CHKERRQ(ierr); > if (N> 0) { > ierr = VecSum(vec,&sum);CHKERRQ(ierr); > sum = sum/((PetscScalar)(-1.0*N)); > ierr = VecShift(vec,sum);CHKERRQ(ierr); > } > } > > if (sp->n) { > ierr = VecMDot(vec,sp->n,sp->vecs,sp->alpha);CHKERRQ(ierr); > for (i=0; in; i++) sp->alpha[i] = -sp->alpha[i]; > ierr = VecMAXPY(vec,sp->n,sp->alpha,sp->vecs);CHKERRQ(ierr); > } > > if (sp->remove){ > ierr = (*sp->remove)(sp,vec,sp->rmctx);CHKERRQ(ierr); > } > PetscFunctionReturn(0); > } > > All I found within seconds using etags. Now you can see exactly where and how the null space is being removed. > > Barry > > > From knepley at gmail.com Tue Jan 10 13:26:07 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Jan 2012 13:26:07 -0600 Subject: [petsc-users] Strongly nonlinear equation solved within the framework of PETSc In-Reply-To: References: Message-ID: On Tue, Jan 10, 2012 at 2:27 AM, Fatcharm wrote: > Dear all, > > First, I would like to re-describe my problem. I want to numerically > solve a strongly nonlinear fourth-order equation, which is used to > describe the dynamics of a liquid film.Please find the form of the > equation below, > > u_t = -(1/3)*C*(u^3*u_xxx)_x + (A*(u_x/u))_x > > "u" is the thickness of the film to be solved, is a function of "x" > and time "t" , "C" and "A" are constant parameters. > u_xxx is the 3-th order derivative. > > I wrote a PETSc programs for this problem, using the central finite > difference scheme in the space and CN method in time . > > I start my PETSc program from the > /petsc-3.2-p5/src/ts/examples/tutorials/ex13.c > > I wrote in my "RHSFunction" function: > u = uarray[i]; > ux = (uarray[i+1] - uarray[i-1]); > uxx = (-2.0*u + uarray[i-1] + uarray[i+1]); > uxxx = (uarray[i+2] - 2.0*uarray[i+1] + 2.0*uarray[i-1] - > uarray[i-2]); > uxxxx = (uarray[i+2] - 4.0*uarray[i+1] + 6.0*u - > 4.0*uarray[i-1] + uarray[i-2]); > ucx = -(user->c/3.0)*sx*(0.75*sx*u*u*ux*uxxx + sx*u*u*u*uxxxx); > uax = (user->a)*sx*(-0.25*ux*ux/(u*u+l_res) + uxx/(u+l_res)); > f[i] = ucx + uax; > > Also I provided the "RHSJacobian" to evaluate the changing Jacobian. > > Followed Jed's advice, I run the program with -snes_monitor > -snes_converged_reason -ksp_converged_reason > > I found that "Nonlinear solve did not converge due to > DIVERGED_LINE_SEARCH", at the same time "Linear solve converged due to > CONVERGED_RTOL iterations 5". > This sounds like an error in your Jacobian. Run a small problem with -snes_fd. This is in the FAQ. 
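Concretely, runs of the following form cover both checks on a small grid (./myprog is just a stand-in for the user's executable; all options below are standard petsc-3.2 options already mentioned in this thread):

    ./myprog -snes_type test -snes_test_display          (compare hand-coded and finite-difference Jacobians)
    ./myprog -ts_type beuler -snes_fd -snes_monitor -snes_converged_reason
                                                          (time-step with a finite-difference Jacobian instead)

Note that the comparison is only meaningful once the residual and Jacobian evaluate to finite values; the "Infinite or not-a-number generated in norm" error quoted further down under -snes_type test has to be resolved first.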
> I tested to solve this problem with -ts_type beuler, I found: > > timestep 0: time 0, solution norm 0.00899198, max 0.057735, min 0.001 > Linear solve converged due to CONVERGED_RTOL iterations 30 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > timestep 1: time 0.001, solution norm 0.00899198, max 0.057735, min 0.001 > Nonlinear solve converged due to CONVERGED_FNORM_ABS > timestep 2: time 0.002, solution norm 0.00899198, max 0.057735, min 0.001 > Linear solve converged due to CONVERGED_RTOL iterations 30 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > timestep 3: time 0.003, solution norm 0.00899198, max 0.057735, min 0.001 > Nonlinear solve converged due to CONVERGED_FNORM_ABS > timestep 4: time 0.004, solution norm 0.00899198, max 0.057735, min 0.001 > Linear solve converged due to CONVERGED_RTOL iterations 30 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > timestep 5: time 0.005, solution norm 0.00899198, max 0.057735, min 0.001 > Nonlinear solve converged due to CONVERGED_FNORM_ABS > timestep 6: time 0.006, solution norm 0.00899198, max 0.057735, min 0.001 > Linear solve converged due to CONVERGED_RTOL iterations 30 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > > I read > http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging > > I run with -pc_type lu, it was told that > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc LU! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > Always run o n1 process first, and never never never never set the type explicitly to MPIAIJ. Use AIJ, which falls back to SEQAIJ on 1 processes. Matt > I run with -viewJacobian, the Jacobian looks reasonable. Only the > value in the Jacobian is on the order of 10^7 or 10^8, Jed told me > that "That may cause ill-conditioning"? > > If I run with -snes_type test -snes_test_display, I got > > timestep 0: time 0, solution norm 0.00899198, max 0.057735, min 0.001 > Testing hand-coded Jacobian, if the ratio is > O(1.e-8), the hand-coded Jacobian is probably correct. > Run with -snes_test_display to show difference > of hand-coded and finite difference Jacobian. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Floating point exception! > [0]PETSC ERROR: Infinite or not-a-number generated in norm! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > > If I run with -snes_ls_monitor, I got > > timestep 0: time 0, solution norm 0.02004, max 0.0299803, min 0.0100197 > Line search: gnorm after quadratic fit 6.651216893041e+04 > Line search: Quadratically determined step, > lambda=4.8079877662732573e-01 > Line search: gnorm after quadratic fit 5.508857435695e+07 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989545851e+07 lambda=1.0000000000000002e-02 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855229151e+07 lambda=1.0000000000000002e-03 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989770948e+07 lambda=1.0000000000000003e-04 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855206653e+07 lambda=1.0000000000000004e-05 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989773203e+07 lambda=1.0000000000000004e-06 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855206428e+07 lambda=1.0000000000000005e-07 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989773226e+07 lambda=1.0000000000000005e-08 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855206425e+07 lambda=1.0000000000000005e-09 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989773226e+07 lambda=1.0000000000000006e-10 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855206425e+07 lambda=1.0000000000000006e-11 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508989773226e+07 lambda=1.0000000000000006e-12 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.508855206425e+07 lambda=1.0000000000000007e-13 > Line search: unable to find good step length! 
After 13 tries > Line search: fnorm=6.6512168930411834e+04, > gnorm=5.5088552064252406e+07, ynorm=2.0443556209235136e-01, > minlambda=6.8680778683552649e-13, lambda=1.0000000000000007e-13, > initial slope=-4.4238725909386473e+09 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > timestep 1: time 0.001, solution norm 0.0192692, max 0.021293, min > 0.0161292 > Line search: gnorm after quadratic fit 5.509183229235e+07 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262678284e+07 lambda=1.0000000000000002e-02 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184917687e+07 lambda=1.0000000000000002e-03 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262521680e+07 lambda=1.0000000000000003e-04 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184934034e+07 lambda=1.0000000000000004e-05 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262520119e+07 lambda=1.0000000000000004e-06 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184934198e+07 lambda=1.0000000000000005e-07 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262520104e+07 lambda=1.0000000000000005e-08 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184934199e+07 lambda=1.0000000000000005e-09 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262520104e+07 lambda=1.0000000000000006e-10 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184934199e+07 lambda=1.0000000000000006e-11 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509262520104e+07 lambda=1.0000000000000006e-12 > Line search: Cubic step no good, shrinking lambda, current gnorm > 5.509184934199e+07 lambda=1.0000000000000007e-13 > Line search: unable to find good step length! After 13 tries > Line search: fnorm=9.5088545581815284e+04, > gnorm=5.5091849341994494e+07, ynorm=1.4128453377933892e-01, > minlambda=8.9929091375150137e-13, lambda=1.0000000000000007e-13, > initial slope=-9.0419264362675228e+09 > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > timestep 2: time 0.002, solution norm 0.0192692, max 0.021293, min > 0.0161292 > > > > > It is weird that if I run with -snes_converged_reason, I got > > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > > If I run with -snes_converged_reason and -info I found that > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE > > > Could anyone please give me some suggestions? > > Thank you very much. > > Feng-Chao > > > > > > > > Message: 6 > > Date: Thu, 29 Dec 2011 08:21:08 -0600 > > From: Jed Brown > > Subject: Re: [petsc-users] Strongly nonlinear equation solved within > > the framework of PETSc > > To: PETSc users list > > Message-ID: > > JXWuNc1PNJA at mail.gmail.com> > > Content-Type: text/plain; charset="utf-8" > > > > On Thu, Dec 29, 2011 at 08:10, Fatcharm wrote: > > > >> We can see that the SNES Function norm is extremely large. I think it > >> is because that the initial value for the unknown function H(X,T) is > >> quite small and there is some terms like (1/H)(dH/dX) or > >> (1/H^2)(dH/dX) in my equations. > >> > > > > That may cause ill-conditioning, but you could still scale the equations > so > > that the initial norm was of order 1. It shouldn't matter here though, > > because most methods are unaffected by scaling. > > > > Are you computing an analytic Jacobian or using finite differencing? 
> > > > > >> > >> For "Linear solve did not converge due to DIVERGED_DTOL iterations > >> 3270", does this mean I should change the ksp_type? > >> > > > > It's important to solve the linear systems before worrying about > > convergence rates for Newton methods. Try a direct solve on a small > problem > > first, then read this > > > > > http://scicomp.stackexchange.com/questions/513/why-is-my-iterative-linear-solver-not-converging > > > > If you fix the linear solve issues, but SNES is still not converging, > read > > > > > http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: > > < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20111229/0e223952/attachment.htm > > > > > > ------------------------------ > > > > _______________________________________________ > > petsc-users mailing list > > petsc-users at mcs.anl.gov > > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > > > > End of petsc-users Digest, Vol 36, Issue 84 > > ******************************************* > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Tue Jan 10 17:28:43 2012 From: irving at naml.us (Geoffrey Irving) Date: Tue, 10 Jan 2012 15:28:43 -0800 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Tue, Jan 10, 2012 at 5:52 AM, Jed Brown wrote: > On Tue, Jan 10, 2012 at 00:08, Geoffrey Irving wrote: >> >> For now, I believe I can get away with a single linear iteration. > > > Single linear iteration (e.g. one GMRES cycle) or single linear solve (e.g. > one Newton step)? Single linear solve (one Newton step). >> Even if I need a few, the extra cost of the first linear solve appears >> to be drastic. ?However, it appears you're right that this isn't due >> to preconditioner setup. ?The first solve takes over 50 times as long >> as the other solves: >> >> ? ?step 1 >> ? ? ?dt = 0.00694444, time = 0 >> ? ? ?cg icc converged: iterations = 4, rtol = 0.001, error = 9.56519e-05 >> ? ? ?actual L2 residual = 1.10131e-05 >> ? ? ?max speed = 0.00728987 >> ? ?END step 1 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?0.6109 s > > > How are you measuring this time? In -log_summary, I see 0.02 seconds in > KSPSolve(). Maybe the time you see is because there are lots of page faults > until you get the code loaded into memory? It turns out the initial overhead was due to a bug in my computation of row lengths. It's much faster with the bug fixed. Is there a way to detect reallocations so as avoid this kind of error in future? I looked through the seqaij code and didn't see anything obvious, and also couldn't find a function to compute the actual final row lengths. 
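(For what it is worth, one thing that does report this after the fact is MatGetInfo(): its mallocs field counts how many times MatSetValues() had to allocate beyond the preallocated space, and nz_allocated/nz_used give the totals, though not per-row lengths. A minimal sketch, assuming an already-assembled matrix A:

    MatInfo info;
    ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "nz allocated %g, nz used %g, mallocs during assembly %g\n",
                       (double)info.nz_allocated, (double)info.nz_used, (double)info.mallocs);CHKERRQ(ierr);

Running with -info should also print the number of mallocs performed during MatSetValues() when the matrix is assembled.)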
Thanks, Geoffrey From jedbrown at mcs.anl.gov Tue Jan 10 22:23:06 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 10 Jan 2012 23:23:06 -0500 Subject: [petsc-users] Strongly nonlinear equation solved within the framework of PETSc In-Reply-To: References: Message-ID: On Tue, Jan 10, 2012 at 03:27, Fatcharm wrote: > It is weird that if I run with -snes_converged_reason, I got > > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > > If I run with -snes_converged_reason and -info I found that > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE > Wait, identical options except for -info are giving different results? This sounds like a memory error, could you try Valgrind? -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Wed Jan 11 02:43:46 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Wed, 11 Jan 2012 09:43:46 +0100 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: <117A7ABA-9E8D-42C3-9506-F8C72B2B2248@mcs.anl.gov> References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> <4F06BA64.4020402@gfz-potsdam.de> <4F06EE62.6060901@gfz-potsdam.de> <117A7ABA-9E8D-42C3-9506-F8C72B2B2248@mcs.anl.gov> Message-ID: <4F0D4BC2.1080101@gfz-potsdam.de> Jed, Barry, Thank you for your advices. I restructured output and got rid of PetscViewerFileSetName by having several viewers and putting more things in one file. Regards, Alexander On 07.01.2012 03:58, Barry Smith wrote: > On Jan 6, 2012, at 6:51 AM, Alexander Grayver wrote: > >> On 06.01.2012 13:45, Jed Brown wrote: >>> On Fri, Jan 6, 2012 at 03:09, Alexander Grayver wrote: >>> This is not always convinient to store everything in one file, but in some cases I do want to use it. I haven't found any examples on that. Do I have to use FILE_MODE_APPEND and then write? What happens if file doesn't exist? >>> >>> You just create a viewer and then call MatView(), VecView(), etc, repeatedly for each object you want to put in the file (e.g. once per time step). No need for FILE_MODE_APPEND and unless you want to append to an existing file. >> Ok, that was my meaning to use PetscViewerFileSetName from the beginning since if you have let's say ten different objects (Mat and Vec) and you need to output them at each iteration (time step or frequency for multi-freqs modeling) you need ten viewer objects > I don't understand why you need ten viewer objects. Why not just dump all the objects into the one file (creating one Viewer object and never changing its name) one after each other and then in MATLAB read then back in one after the other. > > Barry > >> which is not cool I guess, that is why I started to use one viewer and change just a name of the file. >> And to be honest I don't see any reason why having ten viewers is better than calling PetscViewerFileSetName ten times. >> >> Regards, >> Alexander From Andrew.Parker2 at baesystems.com Wed Jan 11 04:36:36 2012 From: Andrew.Parker2 at baesystems.com (Parker, Andrew (UK Filton)) Date: Wed, 11 Jan 2012 10:36:36 -0000 Subject: [petsc-users] 3.2-p6 Message-ID: Hi, Just before Christmas I was kindly provided a patch for the pbjacob method for 7x7 matrices: pbjacobi.patch. I applied this to 3.2-p5 at it worked fine, but I wonder when/if it will make it to a release and if so will it be on p6? When is p6 scheduled as I need to plan an internal release based on this? 
Cheers again, Andy ******************************************************************** This email and any attachments are confidential to the intended recipient and may also be privileged. If you are not the intended recipient please delete it from your system and notify the sender. You should not copy it or use it for any purpose nor disclose or distribute its contents to any other person. ******************************************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jan 11 08:16:19 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 11 Jan 2012 08:16:19 -0600 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Tue, Jan 10, 2012 at 5:28 PM, Geoffrey Irving wrote: > On Tue, Jan 10, 2012 at 5:52 AM, Jed Brown wrote: > > On Tue, Jan 10, 2012 at 00:08, Geoffrey Irving wrote: > >> > >> For now, I believe I can get away with a single linear iteration. > > > > > > Single linear iteration (e.g. one GMRES cycle) or single linear solve > (e.g. > > one Newton step)? > > Single linear solve (one Newton step). > > >> Even if I need a few, the extra cost of the first linear solve appears > >> to be drastic. However, it appears you're right that this isn't due > >> to preconditioner setup. The first solve takes over 50 times as long > >> as the other solves: > >> > >> step 1 > >> dt = 0.00694444, time = 0 > >> cg icc converged: iterations = 4, rtol = 0.001, error = 9.56519e-05 > >> actual L2 residual = 1.10131e-05 > >> max speed = 0.00728987 > >> END step 1 0.6109 s > > > > > > How are you measuring this time? In -log_summary, I see 0.02 seconds in > > KSPSolve(). Maybe the time you see is because there are lots of page > faults > > until you get the code loaded into memory? > > It turns out the initial overhead was due to a bug in my computation > of row lengths. It's much faster with the bug fixed. Is there a way > to detect reallocations so as avoid this kind of error in future? I > looked through the seqaij code and didn't see anything obvious, and > also couldn't find a function to compute the actual final row lengths. > MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE); http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetOption.html Matt > Thanks, > Geoffrey > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 11 08:32:57 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 11 Jan 2012 09:32:57 -0500 Subject: [petsc-users] lying about nullspaces In-Reply-To: References: Message-ID: On Wed, Jan 11, 2012 at 09:16, Matthew Knepley wrote: > MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE); > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetOption.html > If MatSetFromOptions() is called, these options can be used -mat_new_nonzero_allocation_err -mat_new_nonzero_location_err -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Jan 11 10:01:29 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Jan 2012 10:01:29 -0600 (CST) Subject: [petsc-users] 3.2-p6 In-Reply-To: References: Message-ID: petsc-3.2-p6 tarball is now available for download. And yes - it should have the patch you mention. 
Satish On Wed, 11 Jan 2012, Parker, Andrew (UK Filton) wrote: > Hi, > > > > Just before Christmas I was kindly provided a patch for the pbjacob > method for 7x7 matrices: pbjacobi.patch. I applied this to 3.2-p5 at it > worked fine, but I wonder when/if it will make it to a release and if so > will it be on p6? When is p6 scheduled as I need to plan an internal > release based on this? > > > > Cheers again, > > Andy > > > ******************************************************************** > This email and any attachments are confidential to the intended > recipient and may also be privileged. If you are not the intended > recipient please delete it from your system and notify the sender. > You should not copy it or use it for any purpose nor disclose or > distribute its contents to any other person. > ******************************************************************** > > From Andrew.Parker2 at baesystems.com Wed Jan 11 10:03:16 2012 From: Andrew.Parker2 at baesystems.com (Parker, Andrew (UK Filton)) Date: Wed, 11 Jan 2012 16:03:16 -0000 Subject: [petsc-users] 3.2-p6 In-Reply-To: References: Message-ID: Ok Great thank, I can't see it on the main web page, should I look elsewhere? Cheers, Andy -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Satish Balay Sent: 11 January 2012 16:01 To: PETSc users list Subject: Re: [petsc-users] 3.2-p6 *** WARNING *** This message has originated outside your organisation, either from an external partner or the Global Internet. Keep this in mind if you answer this message. To report a suspicious email, follow the instructions on the Global Intranet at " / " petsc-3.2-p6 tarball is now available for download. And yes - it should have the patch you mention. Satish On Wed, 11 Jan 2012, Parker, Andrew (UK Filton) wrote: > Hi, > > > > Just before Christmas I was kindly provided a patch for the pbjacob > method for 7x7 matrices: pbjacobi.patch. I applied this to 3.2-p5 at it > worked fine, but I wonder when/if it will make it to a release and if so > will it be on p6? When is p6 scheduled as I need to plan an internal > release based on this? > > > > Cheers again, > > Andy > > > ******************************************************************** > This email and any attachments are confidential to the intended > recipient and may also be privileged. If you are not the intended > recipient please delete it from your system and notify the sender. > You should not copy it or use it for any purpose nor disclose or > distribute its contents to any other person. > ******************************************************************** > > From balay at mcs.anl.gov Wed Jan 11 10:08:55 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Jan 2012 10:08:55 -0600 (CST) Subject: [petsc-users] 3.2-p6 In-Reply-To: References: Message-ID: You should see the links to the p6 tarballs at the download webpage: http://www.mcs.anl.gov/petsc/download/index.html [if you still see p5 links - hit the 'reload' button of your browser] The direct links are: http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.2-p6.tar.gz http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-lite-3.2-p6.tar.gz Satish On Wed, 11 Jan 2012, Parker, Andrew (UK Filton) wrote: > Ok Great thank, I can't see it on the main web page, should I look > elsewhere? 
> > Cheers, > Andy > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Satish Balay > Sent: 11 January 2012 16:01 > To: PETSc users list > Subject: Re: [petsc-users] 3.2-p6 > > > *** WARNING *** > > This message has originated outside your organisation, > either from an external partner or the Global Internet. > Keep this in mind if you answer this message. > To report a suspicious email, follow the instructions > on the Global Intranet at " / " > > > > > petsc-3.2-p6 tarball is now available for download. > > And yes - it should have the patch you mention. > > Satish > > On Wed, 11 Jan 2012, Parker, Andrew (UK Filton) wrote: > > > Hi, > > > > > > > > Just before Christmas I was kindly provided a patch for the pbjacob > > method for 7x7 matrices: pbjacobi.patch. I applied this to 3.2-p5 at > it > > worked fine, but I wonder when/if it will make it to a release and if > so > > will it be on p6? When is p6 scheduled as I need to plan an internal > > release based on this? > > > > > > > > Cheers again, > > > > Andy > > > > > > ******************************************************************** > > This email and any attachments are confidential to the intended > > recipient and may also be privileged. If you are not the intended > > recipient please delete it from your system and notify the sender. > > You should not copy it or use it for any purpose nor disclose or > > distribute its contents to any other person. > > ******************************************************************** > > > > > > > From bsmith at mcs.anl.gov Wed Jan 11 21:39:35 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 11 Jan 2012 21:39:35 -0600 Subject: [petsc-users] Strongly nonlinear equation solved within the framework of PETSc In-Reply-To: References: Message-ID: <0D90791A-151C-4B4F-89E0-118AE4FC419E@mcs.anl.gov> On Jan 10, 2012, at 10:23 PM, Jed Brown wrote: > On Tue, Jan 10, 2012 at 03:27, Fatcharm wrote: > It is weird that if I run with -snes_converged_reason, I got > > Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH > > If I run with -snes_converged_reason and -info I found that > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE > > Wait, identical options except for -info are giving different results? > > This sounds like a memory error, could you try Valgrind? See http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind From xyuan at lbl.gov Thu Jan 12 12:51:29 2012 From: xyuan at lbl.gov (Xuefei Yuan (Rebecca)) Date: Thu, 12 Jan 2012 10:51:29 -0800 Subject: [petsc-users] The test input for petsc-dev/src/ksp/ksp/examples/tutorials/ex10.c Message-ID: <1497B86E-6D27-4ABD-8A39-1C74729D56BA@lbl.gov> Hello all, Is there any sample matrix and rhs data files for the test of ex10.c? Thanks, Rebecca From balay at mcs.anl.gov Thu Jan 12 13:09:08 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 12 Jan 2012 13:09:08 -0600 (CST) Subject: [petsc-users] The test input for petsc-dev/src/ksp/ksp/examples/tutorials/ex10.c In-Reply-To: <1497B86E-6D27-4ABD-8A39-1C74729D56BA@lbl.gov> References: <1497B86E-6D27-4ABD-8A39-1C74729D56BA@lbl.gov> Message-ID: On Thu, 12 Jan 2012, Xuefei (Rebecca) Yuan wrote: > Hello all, > > Is there any sample matrix and rhs data files for the test of ex10.c? 
You can get some sample datafiles [mat,vec] from http://ftp.mcs.anl.gov/pub/petsc/matrices and use with ex10 Satish From xyuan at lbl.gov Thu Jan 12 14:04:39 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Thu, 12 Jan 2012 12:04:39 -0800 Subject: [petsc-users] The test input for petsc-dev/src/ksp/ksp/examples/tutorials/ex10.c In-Reply-To: References: <1497B86E-6D27-4ABD-8A39-1C74729D56BA@lbl.gov> Message-ID: <29FB9AB6-A25C-4536-8A1E-617D62F71AFE@lbl.gov> Hi Satish, Thanks very much! Best regards, Rebecca On Jan 12, 2012, at 11:09 AM, Satish Balay wrote: > On Thu, 12 Jan 2012, Xuefei (Rebecca) Yuan wrote: > >> Hello all, >> >> Is there any sample matrix and rhs data files for the test of ex10.c? > > > You can get some sample datafiles [mat,vec] from > http://ftp.mcs.anl.gov/pub/petsc/matrices and use with ex10 > > Satish From Patrick.TAMAIN at cea.fr Fri Jan 13 07:22:19 2012 From: Patrick.TAMAIN at cea.fr (TAMAIN Patrick 207314) Date: Fri, 13 Jan 2012 13:22:19 +0000 Subject: [petsc-users] Block LU preconditioning Message-ID: <1C8B2DC22010F44D8D91EDCB0B2E8DED1535C7E3@EXDAG0-B1.intra.cea.fr> Hi, I am working on an application that requires the inversion of an implicit operator that is more or less a very anisotropic 3D diffusion operator (about 1E6 ratio between diffusion coefficient in 1 direction with respect to the 2 others). The matrix is therefore very badly conditioned. By direction splitting, I can split the system in 2 pieces: A = B + C where B is block diagonal and contains all the coefficients corresponding to the high diffusion part of the operator. B seems like a good preconditioner for the system (I checked in low resolution cases with matlab that applying B^-1 to A reduces its condition number by a factor of close to 1E6), but B is ill-conditioned. However, since it is block diagonal and that it will not vary in time (contrary to C), it seems to me that it would be a good idea to perform once and for all an LU factorization of B and use it to precondition A each time the ksp solver is called afterwards. I somehow can't figure out how to do this in practise with PETSC. I tried to configure the KSP like this call KSPSetOperators(myKSP,A,B,SAME_PRECONDITIONER,ierr) and then run PETSC with the following options: -pc_type asm -sub_pc_type lu -ksp_type bcgsl but the computation time becomes awful, far more than if I try to do separately the LU decomposition of the blocks of B with MUMPS. Would you have some advice / hints on how to implement such a preconditioner for my problem? Thanks in advance. Patrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 13 07:42:14 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 13 Jan 2012 08:42:14 -0500 Subject: [petsc-users] Block LU preconditioning In-Reply-To: <1C8B2DC22010F44D8D91EDCB0B2E8DED1535C7E3@EXDAG0-B1.intra.cea.fr> References: <1C8B2DC22010F44D8D91EDCB0B2E8DED1535C7E3@EXDAG0-B1.intra.cea.fr> Message-ID: On Fri, Jan 13, 2012 at 08:22, TAMAIN Patrick 207314 wrote: > I somehow can?t figure out how to do this in practise with PETSC. I tried > to configure the KSP like this**** > > call KSPSetOperators(myKSP,A,B,SAME_PRECONDITIONER,ierr) **** > > and then run PETSC with the following options:**** > > -pc_type asm -sub_pc_type lu -ksp_type bcgsl**** > > but the computation time becomes awful, far more than if I try to do > separately the LU decomposition of the blocks of B with MUMPS. 
> Make sure the subdomains are chosen to align with the direction you want to couple strongly (and perhaps span the domain if that is the method you want). There are APIs to set parallel subdomains that would couple on a subcommunicator. I might consider using the original operator and just setting up the domain with this structure. The fill will be reasonably low if you have long, skinny domains. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jan 13 08:15:53 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 13 Jan 2012 08:15:53 -0600 Subject: [petsc-users] Block LU preconditioning In-Reply-To: <1C8B2DC22010F44D8D91EDCB0B2E8DED1535C7E3@EXDAG0-B1.intra.cea.fr> References: <1C8B2DC22010F44D8D91EDCB0B2E8DED1535C7E3@EXDAG0-B1.intra.cea.fr> Message-ID: <41DD9140-E1A1-4B88-9E40-7A4F3E8FB583@mcs.anl.gov> On Jan 13, 2012, at 7:22 AM, TAMAIN Patrick 207314 wrote: > Hi, > > I am working on an application that requires the inversion of an implicit operator that is more or less a very anisotropic 3D diffusion operator (about 1E6 ratio between diffusion coefficient in 1 direction with respect to the 2 others). The matrix is therefore very badly conditioned. By direction splitting, I can split the system in 2 pieces: > > A = B + C > > where B is block diagonal and contains all the coefficients corresponding to the high diffusion part of the operator. B seems like a good preconditioner for the system (I checked in low resolution cases with matlab that applying B^-1 to A reduces its condition number by a factor of close to 1E6), but B is ill-conditioned. However, since it is block diagonal and that it will not vary in time (contrary to C), it seems to me that it would be a good idea to perform once and for all an LU factorization of B and use it to precondition A each time the ksp solver is called afterwards. > > I somehow can?t figure out how to do this in practise with PETSC. I tried to configure the KSP like this > call KSPSetOperators(myKSP,A,B,SAME_PRECONDITIONER,ierr) > and then run PETSC with the following options: > -pc_type asm -sub_pc_type lu -ksp_type bcgsl > but the computation time becomes awful, far more than if I try to do separately the LU decomposition of the blocks of B with MUMPS. So if you run with -pc_type lu -pc_factor_mat_solver_package mumps it works well in terms of convergence rate but -pc_type asm -sub_pc_type lu performs poorly. Then it is a question of you are using ASM as a preconditioner for B which apparently it is not working well. So you need to focus on what makes a good preconditioner for B. Barry > > Would you have some advice / hints on how to implement such a preconditioner for my problem? > > Thanks in advance. > > Patrick > From junchao.zhang at gmail.com Fri Jan 13 11:18:27 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 13 Jan 2012 11:18:27 -0600 Subject: [petsc-users] Sparse matrix partitioning in PETSc Message-ID: Hello, When I use MatLoad() to load a sparse matrix, I want each process to have equal number of nonzeros, instead of equal number of rows. How could I achieve that in PETSc? Thanks! -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... 
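Going back to the block LU preconditioning thread above (Patrick's question and Barry's reply), a minimal C sketch of the kind of setup Barry describes: the ill-conditioned block-diagonal matrix B is factored once with MUMPS and reused as the preconditioner for solves with A. This is only an illustration against the petsc-3.2 interface; the function name, the communicator PETSC_COMM_WORLD, and the matrices/vectors A, B, b, x are assumptions (not from the original posts), and PETSc must have been built with MUMPS support for MATSOLVERMUMPS to be available.

  #include <petscksp.h>

  /* Solve A x = b, preconditioning with an LU factorization of the
     block-diagonal matrix B; SAME_PRECONDITIONER keeps that single
     factorization across later solves with the same KSP. */
  PetscErrorCode SolveWithBlockLU(Mat A, Mat B, Vec b, Vec x)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,B,SAME_PRECONDITIONER);CHKERRQ(ierr);
    ierr = KSPSetType(ksp,KSPBCGSL);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);                             /* full LU of B ...              */
    ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); /* ... done in parallel by MUMPS */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

(In a time loop one would of course create the KSP once and only call KSPSolve() repeatedly, which is where SAME_PRECONDITIONER pays off.)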
URL: From rxk at cfdrc.com Fri Jan 13 11:28:12 2012 From: rxk at cfdrc.com (Ravi Kannan) Date: Fri, 13 Jan 2012 11:28:12 -0600 Subject: [petsc-users] metis for dev version of petsc Message-ID: <004901ccd218$ba199ab0$2e4cd010$@com> Hi all, For the last few years, we were using metis function (for partitioning) as follows: METIS_PartGraphKway(&nCells,&ia[0],&ja[0],&vwgt[0],&adjwgt[0],&wgtflag,&numf lag,&nParts,options,&edgecut,&part[0]); Recently, we started using the development version for some testing. This needs a more recent version of metis: METIS_SetDefaultOptions(&options[0]); METIS_PartGraphKway(&nCells, &ncon, &ia[0],&ja[0],&vwgt[0],&vsize[0],&adjwgt[0],&nParts,&tpwgts[0],&ubvec[0], &options[0],&edgecut,&part[0]); This has the same function name but with different argument list. We noticed that the latest version does not partition properly as before : we get just one cell in the 0th partition. Any inputs on this? Thanks, Ravi. _________________________________________ Ravi Kannan CFD Research Corporation Senior Scientist 256.726.4851 rxk at cfdrc.com _________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at mcs.anl.gov Fri Jan 13 11:33:04 2012 From: sean at mcs.anl.gov (Sean Farley) Date: Fri, 13 Jan 2012 11:33:04 -0600 Subject: [petsc-users] metis for dev version of petsc In-Reply-To: <004901ccd218$ba199ab0$2e4cd010$@com> References: <004901ccd218$ba199ab0$2e4cd010$@com> Message-ID: > > Recently, we started using the development version for some testing. This > needs a more recent version of metis: **** > > METIS_SetDefaultOptions(&options[0]);**** > > METIS_PartGraphKway(&nCells, &ncon, > &ia[0],&ja[0],&vwgt[0],&vsize[0],&adjwgt[0],&nParts,&tpwgts[0],&ubvec[0], > &options[0],&edgecut,&part[0]);**** > > This has the same function name but with different argument list.**** > > ** ** > > We noticed that the latest version does not partition properly as before : > we get just one cell in the 0th partition.**** > > ** ** > > Any inputs on this? > What are you setting for options? i.e. are you trying to use the default parameters or do you have your own weights? -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Fri Jan 13 12:06:52 2012 From: abhyshr at mcs.anl.gov (Shri) Date: Fri, 13 Jan 2012 12:06:52 -0600 (CST) Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: Message-ID: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> MatLoad() does not distribute the rows based on the number of non zeros. You'll need to first compute the number of rows on each process that gives you equal/nearly equal number of non zeros and then call MatSetSizes() followed by MatLoad(). ----- Original Message ----- > Hello, > When I use MatLoad() to load a sparse matrix, I want each process to > have equal number of nonzeros, instead of equal number of rows. > How could I achieve that in PETSc? > Thanks! > -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 13 12:56:35 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 13 Jan 2012 12:56:35 -0600 Subject: [petsc-users] metis for dev version of petsc In-Reply-To: References: <004901ccd218$ba199ab0$2e4cd010$@com> Message-ID: On Fri, Jan 13, 2012 at 11:33 AM, Sean Farley wrote: > Recently, we started using the development version for some testing. 
This >> needs a more recent version of metis: **** >> >> METIS_SetDefaultOptions(&options[0]);**** >> >> METIS_PartGraphKway(&nCells, &ncon, >> &ia[0],&ja[0],&vwgt[0],&vsize[0],&adjwgt[0],&nParts,&tpwgts[0],&ubvec[0], >> &options[0],&edgecut,&part[0]);**** >> >> This has the same function name but with different argument list.**** >> >> ** ** >> >> We noticed that the latest version does not partition properly as before >> : we get just one cell in the 0th partition.**** >> >> ** ** >> >> Any inputs on this? >> > > What are you setting for options? i.e. are you trying to use the default > parameters or do you have your own weights? > To follow up on Sean's note, is there a reason that this partitioning problem does not fit into MatPartitioning? If not, we would be very interested in extending our interface to support it. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rxk at cfdrc.com Fri Jan 13 17:33:55 2012 From: rxk at cfdrc.com (Ravi Kannan) Date: Fri, 13 Jan 2012 17:33:55 -0600 Subject: [petsc-users] metis for dev version of petsc In-Reply-To: References: <004901ccd218$ba199ab0$2e4cd010$@com> Message-ID: <00c701ccd24b$d0ff8a90$72fe9fb0$@com> Hi Sean, Works now : we had to set a few arguments to NULL. RAvi From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Friday, January 13, 2012 12:57 PM To: PETSc users list Subject: Re: [petsc-users] metis for dev version of petsc On Fri, Jan 13, 2012 at 11:33 AM, Sean Farley wrote: Recently, we started using the development version for some testing. This needs a more recent version of metis: METIS_SetDefaultOptions(&options[0]); METIS_PartGraphKway(&nCells, &ncon, &ia[0],&ja[0],&vwgt[0],&vsize[0],&adjwgt[0],&nParts,&tpwgts[0],&ubvec[0], &options[0],&edgecut,&part[0]); This has the same function name but with different argument list. We noticed that the latest version does not partition properly as before : we get just one cell in the 0th partition. Any inputs on this? What are you setting for options? i.e. are you trying to use the default parameters or do you have your own weights? To follow up on Sean's note, is there a reason that this partitioning problem does not fit into MatPartitioning? If not, we would be very interested in extending our interface to support it. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at mcs.anl.gov Fri Jan 13 17:43:12 2012 From: sean at mcs.anl.gov (Sean Farley) Date: Fri, 13 Jan 2012 17:43:12 -0600 Subject: [petsc-users] metis for dev version of petsc In-Reply-To: <00c701ccd24b$d0ff8a90$72fe9fb0$@com> References: <004901ccd218$ba199ab0$2e4cd010$@com> <00c701ccd24b$d0ff8a90$72fe9fb0$@com> Message-ID: > > Works now : we had to set a few arguments to NULL. > Glad to hear it. Yes, George changed the way to specify default parameters (old way was to set options[0] = 0) to just pass NULL for options. Unfortunately, ParMETIS still uses the old interface and this can cause a lot of head ache, and potential memory leak / crash. -------------- next part -------------- An HTML attachment was scrubbed... 
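For reference, a small sketch of the METIS 5.x calling sequence Sean describes, with the optional arrays passed as NULL so that METIS falls back to its defaults (unit weights, uniform target partition weights, default imbalance tolerance). The wrapper name and the assumption of a single balance constraint are illustrative, not taken from Ravi's code.

  #include <metis.h>

  /* Partition a CSR graph (xadj/adjncy) into nparts parts with METIS 5.x defaults. */
  int PartitionGraph(idx_t nvtxs, idx_t *xadj, idx_t *adjncy,
                     idx_t nparts, idx_t *part)
  {
    idx_t ncon = 1;                    /* one balance constraint              */
    idx_t objval;                      /* edge cut returned by METIS          */
    idx_t options[METIS_NOPTIONS];
    int   status;

    METIS_SetDefaultOptions(options);  /* or simply pass NULL for options below */

    /* vwgt, vsize, adjwgt, tpwgts and ubvec are NULL: no weights and
       uniform partition sizes                                              */
    status = METIS_PartGraphKway(&nvtxs, &ncon, xadj, adjncy,
                                 NULL, NULL, NULL, &nparts,
                                 NULL, NULL, options, &objval, part);
    return (status == METIS_OK) ? 0 : 1;
  }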
URL: From junchao.zhang at gmail.com Sat Jan 14 11:38:00 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 14 Jan 2012 11:38:00 -0600 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> References: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> Message-ID: Does PETSc provide convenient functions to compute this layout( i.e.,# rows on each processor), or I have to do it myself? I browsed PETSc document and did not find them. On Fri, Jan 13, 2012 at 12:06 PM, Shri wrote: > MatLoad() does not distribute the rows based on the number of non zeros. > You'll need to first compute the number of rows on each > process that gives you equal/nearly equal number of non zeros and then > call MatSetSizes() followed by MatLoad(). > > ------------------------------ > > Hello, > When I use MatLoad() to load a sparse matrix, I want each process to > have equal number of nonzeros, instead of equal number of rows. > How could I achieve that in PETSc? > > Thanks! > -- Junchao Zhang > > > -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Jan 14 11:41:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 14 Jan 2012 12:41:35 -0500 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> Message-ID: On Sat, Jan 14, 2012 at 12:38, Junchao Zhang wrote: > Does PETSc provide convenient functions to compute this layout( i.e.,# > rows on each processor), or I have to do it myself? > I browsed PETSc document and did not find them. > MatSetSizes() lets you specify local and/or global sizes. If you specify both, it checks that they are compatible. The implementation uses http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscSplitOwnership.html If these are not suitable for your purposes, can you be more specific about what you would like to do? -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sat Jan 14 11:53:17 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 14 Jan 2012 11:53:17 -0600 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> Message-ID: I want to load a sparse matrix from a file (in PETSc binary format). The matrix will be in MPIAIJ format. I want it is a balanced load: each processor will have nearly equal number of nonzeros. I don't come up with a sequence of PETSc calls to do that. On Sat, Jan 14, 2012 at 11:41 AM, Jed Brown wrote: > On Sat, Jan 14, 2012 at 12:38, Junchao Zhang wrote: > >> Does PETSc provide convenient functions to compute this layout( i.e.,# >> rows on each processor), or I have to do it myself? >> I browsed PETSc document and did not find them. >> > > MatSetSizes() lets you specify local and/or global sizes. If you specify > both, it checks that they are compatible. The implementation uses > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscSplitOwnership.html > > > If these are not suitable for your purposes, can you be more specific > about what you would like to do? > -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... 
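One possible sequence, following Shri's earlier suggestion of fixing the local row counts before the load (petsc-3.2 style MatLoad(Mat,PetscViewer)). This is an untested sketch; nlocal stands for whatever per-process row count you compute yourself (for example from the row-length header of the binary file) so that nonzeros come out roughly balanced, and the function name is illustrative.

  #include <petscmat.h>

  /* Load a matrix from a PETSc binary file with a caller-chosen row
     distribution: nlocal rows end up on this process. */
  PetscErrorCode LoadWithLocalRows(MPI_Comm comm, const char *file,
                                   PetscInt nlocal, Mat *A)
  {
    PetscViewer    viewer;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = PetscViewerBinaryOpen(comm,file,FILE_MODE_READ,&viewer);CHKERRQ(ierr);
    ierr = MatCreate(comm,A);CHKERRQ(ierr);
    ierr = MatSetType(*A,MATAIJ);CHKERRQ(ierr);
    /* pin the local row count; global sizes are taken from the file */
    ierr = MatSetSizes(*A,nlocal,PETSC_DECIDE,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = MatLoad(*A,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }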
URL: From abhyshr at mcs.anl.gov Sat Jan 14 12:10:42 2012 From: abhyshr at mcs.anl.gov (Shri) Date: Sat, 14 Jan 2012 12:10:42 -0600 (CST) Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: Message-ID: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> Does your matrix come from an unstructured grid? If so, then you'll need to use a partitioning package such as ParMETIS. Read section 3.5 of the manual http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf and also see the example http://www.mcs.anl.gov/petsc/petsc-current/src/mat/examples/tutorials/ex15.c.html ----- Original Message ----- > I want to load a sparse matrix from a file (in PETSc binary format). > The matrix will be in MPIAIJ format. > I want it is a balanced load: each processor will have nearly equal > number of nonzeros. > I don't come up with a sequence of PETSc calls to do that. > On Sat, Jan 14, 2012 at 11:41 AM, Jed Brown < jedbrown at mcs.anl.gov > > wrote: > > On Sat, Jan 14, 2012 at 12:38, Junchao Zhang < > > junchao.zhang at gmail.com > > > wrote: > > > Does PETSc provide convenient functions to compute this layout( > > > i.e.,# > > > rows on each processor), or I have to do it myself? > > > I browsed PETSc document and did not find them. > > MatSetSizes() lets you specify local and/or global sizes. If you > > specify both, it checks that they are compatible. The implementation > > uses > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscSplitOwnership.html > > If these are not suitable for your purposes, can you be more > > specific > > about what you would like to do? > -- > Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Jan 14 12:20:31 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 14 Jan 2012 13:20:31 -0500 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <1852867152.140422.1326478012466.JavaMail.root@zimbra.anl.gov> Message-ID: On Sat, Jan 14, 2012 at 12:53, Junchao Zhang wrote: > I want to load a sparse matrix from a file (in PETSc binary format). The > matrix will be in MPIAIJ format. > I want it is a balanced load: each processor will have nearly equal number > of nonzeros. > I don't come up with a sequence of PETSc calls to do that. > This is a reasonable thing to do for poorly balanced matrices and would be fairly easy to implement by summing the row indices, choosing offsets, and distributing. That is, it would be easy to add -matload_partition_by_nonzeros. In practice, for good distribution, especially on a different number of processes than original wrote a matrix, the matrix should be read, partitioned, and redistributed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sat Jan 14 12:32:19 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 14 Jan 2012 12:32:19 -0600 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> References: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> Message-ID: My matrices are from Florida matrix collection. I want to use them to test MatMult() Now I know I have to compute the partitioning myself. Thanks. On Sat, Jan 14, 2012 at 12:10 PM, Shri wrote: > Does your matrix come from an unstructured grid? > If so, then you'll need to use a partitioning package such as ParMETIS. 
> Read section 3.5 of the manual > http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf > and also see the example > > http://www.mcs.anl.gov/petsc/petsc-current/src/mat/examples/tutorials/ex15.c.html > > > > ------------------------------ > > I want to load a sparse matrix from a file (in PETSc binary format). The > matrix will be in MPIAIJ format. > I want it is a balanced load: each processor will have nearly equal number > of nonzeros. > I don't come up with a sequence of PETSc calls to do that. > > On Sat, Jan 14, 2012 at 11:41 AM, Jed Brown wrote: > >> On Sat, Jan 14, 2012 at 12:38, Junchao Zhang wrote: >> >>> Does PETSc provide convenient functions to compute this layout( i.e.,# >>> rows on each processor), or I have to do it myself? >>> I browsed PETSc document and did not find them. >>> >> >> MatSetSizes() lets you specify local and/or global sizes. If you specify >> both, it checks that they are compatible. The implementation uses >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscSplitOwnership.html >> >> >> If these are not suitable for your purposes, can you be more specific >> about what you would like to do? >> > > > > -- > Junchao Zhang > > > -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Jan 14 12:46:23 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 14 Jan 2012 13:46:23 -0500 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> Message-ID: On Sat, Jan 14, 2012 at 13:32, Junchao Zhang wrote: > My matrices are from Florida matrix collection. > These ASCII formats cannot be read efficiently in parallel. > I want to use them to test MatMult() > You just want to benchmark sparse matrix kernels? Some might consider it to be cheating, but you should consider also reordering the matrices using MatGetOrdering(mat,MATORDERINGRCM,&rperm,&cperm). This will tend to improve cache reuse and can offer a large (e.g. 2x depending on the matrix) speedup relative to the original ordering. -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sat Jan 14 13:07:43 2012 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 14 Jan 2012 13:07:43 -0600 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> Message-ID: >From PETSc FAQ, I learned to convert an ASCII matrix to binary sequentially, then read in PETSc in parallel Yes, I want to benchmark parallel SpMV on clusters, in various implementations, such as PETSc MatMult(). I'm not sure whether I should reorder a matrix before benchmarking. From the view of benchmarking, maybe I should also benchmark *bad *matrices. On Sat, Jan 14, 2012 at 12:46 PM, Jed Brown wrote: > On Sat, Jan 14, 2012 at 13:32, Junchao Zhang wrote: > >> My matrices are from Florida matrix collection. >> > > These ASCII formats cannot be read efficiently in parallel. > > >> I want to use them to test MatMult() >> > > You just want to benchmark sparse matrix kernels? > > Some might consider it to be cheating, but you should consider also > reordering the matrices using > MatGetOrdering(mat,MATORDERINGRCM,&rperm,&cperm). This will tend to improve > cache reuse and can offer a large (e.g. 2x depending on the matrix) speedup > relative to the original ordering. 
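For concreteness, the reordering Jed suggests could be applied once before the benchmark runs. A small sketch, meant for the sequential matrix before it is distributed; the function name is chosen here for illustration only:

  #include <petscmat.h>

  /* Return *B = P A P^T, where P is the reverse Cuthill-McKee ordering of A. */
  PetscErrorCode ReorderRCM(Mat A, Mat *B)
  {
    IS             rperm,cperm;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatGetOrdering(A,MATORDERINGRCM,&rperm,&cperm);CHKERRQ(ierr);
    ierr = MatPermute(A,rperm,cperm,B);CHKERRQ(ierr);  /* benchmark *B instead of A */
    ierr = ISDestroy(&rperm);CHKERRQ(ierr);
    ierr = ISDestroy(&cperm);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }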
> -- Junchao Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Jan 14 13:19:09 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 14 Jan 2012 14:19:09 -0500 Subject: [petsc-users] Sparse matrix partitioning in PETSc In-Reply-To: References: <462861672.142691.1326564642077.JavaMail.root@zimbra.anl.gov> Message-ID: On Sat, Jan 14, 2012 at 14:07, Junchao Zhang wrote: > I'm not sure whether I should reorder a matrix before benchmarking. From > the view of benchmarking, maybe I should also benchmark *bad *matrices. > Sure, but using a bad ordering is essentially a contrived difficulty. I can take a tridiagonal matrix, permute to a random ordering, and then every MatMult() implementation will be horrible because of horrible memory access. At the end of the day, what matters is the ability to solve science and engineering problems of interest. If you can get a factor of 2 without changing the problem (just by choosing a good ordering) then it's probably a good idea to do so. Just because some legacy application inadvertently used a bad ordering doesn't mean that you would actually do that for a practical problem. For the PDE problems and architectures that we have tested (and using decent orderings like RCM), we usually see a high fraction of peak memory bandwidth based on a performance model that assumes optimal vector reuse given cache size constraints. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.garnaud at ladhyx.polytechnique.fr Sun Jan 15 05:54:49 2012 From: xavier.garnaud at ladhyx.polytechnique.fr (Xavier Garnaud) Date: Sun, 15 Jan 2012 12:54:49 +0100 Subject: [petsc-users] [petsc-maint #101696] complex + float128 In-Reply-To: <3827E0BE-D340-4389-9762-EFB4F47C3FC5@mcs.anl.gov> References: <3827E0BE-D340-4389-9762-EFB4F47C3FC5@mcs.anl.gov> Message-ID: Thank you for your help. quadmath.h contains quad complex functions. I naively added the functions for quad complex. The beginning of the compilation works, but then it fails with an MPI related error. I will contact you again when i manage to make it work, Sincerely, Xavier On Mon, Jan 9, 2012 at 8:01 PM, Barry Smith wrote: > > We've never tried to do this. > > First you need to check if quadmath.h has quad complex stuff in it? If > it does you need to add into petscmath.h the quad complex bindings for all > the various math operations like PetscScalar and PetscSqrtScalar() etc > > Barry > > > On Jan 9, 2012, at 12:05 PM, Xavier Garnaud wrote: > > > I am trying to build PETSc with complex number and quadruple precision, > but > > I get an error at the compiling stage. I do not get this error when I do > > the same using real numbers. Are complex numbers incompatible with > > quadruple precision? > > Thank you very much, > > Sincerely, > > > > Xavier > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petscmath.h Type: text/x-chdr Size: 11401 bytes Desc: not available URL: From jedbrown at mcs.anl.gov Sun Jan 15 10:43:36 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 15 Jan 2012 10:43:36 -0600 Subject: [petsc-users] [petsc-maint #101696] complex + float128 In-Reply-To: References: <3827E0BE-D340-4389-9762-EFB4F47C3FC5@mcs.anl.gov> Message-ID: On Sun, Jan 15, 2012 at 05:54, Xavier Garnaud < xavier.garnaud at ladhyx.polytechnique.fr> wrote: > Thank you for your help. 
> > quadmath.h contains quad complex functions. I naively added the functions > for quad complex. The beginning of the compilation works, but then it fails > with an MPI related error. I will contact you again when i manage to make > it work, > What was the MPI error? It's likely that we are missing data types and MPI_Ops for quad complex. > > Sincerely, > > Xavier > > > > On Mon, Jan 9, 2012 at 8:01 PM, Barry Smith wrote: > >> >> We've never tried to do this. >> >> First you need to check if quadmath.h has quad complex stuff in it? If >> it does you need to add into petscmath.h the quad complex bindings for all >> the various math operations like PetscScalar and PetscSqrtScalar() etc >> >> Barry >> >> >> On Jan 9, 2012, at 12:05 PM, Xavier Garnaud wrote: >> >> > I am trying to build PETSc with complex number and quadruple precision, >> but >> > I get an error at the compiling stage. I do not get this error when I do >> > the same using real numbers. Are complex numbers incompatible with >> > quadruple precision? >> > Thank you very much, >> > Sincerely, >> > >> > Xavier >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sun Jan 15 16:18:33 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 15 Jan 2012 16:18:33 -0600 Subject: [petsc-users] [petsc-maint #101696] complex + float128 In-Reply-To: References: <3827E0BE-D340-4389-9762-EFB4F47C3FC5@mcs.anl.gov> Message-ID: On Sun, Jan 15, 2012 at 16:06, Xavier Garnaud < xavier.garnaud at ladhyx.polytechnique.fr> wrote: > Here is the error message I got. > > Building C object CMakeFiles/petsc.dir/src/sys/draw/interface/dsflush.c.o > /home/garnaud/local/petsc/petsc-3.2-p5/src/sys/draw/interface/dsetpause.c: > In function PetscDrawSetPause: > > /home/garnaud/local/petsc/petsc-3.2-p5/src/sys/draw/interface/dsetpause.c:33:3: > error: MPIU___FLOAT128 undeclared (first use in this function) > > /home/garnaud/local/petsc/petsc-3.2-p5/src/sys/draw/interface/dsetpause.c:33:3: > note: each undeclared identifier is reported only once for each function it > appears in > You will need something like the following, and also some code at about src/sys/objects/pinit.c:750 to create MPIU_C___FLOAT128_COMPLEX. Let us know if you work it out, otherwise I'll take a look at it later. --- a/include/petscmath.h +++ b/include/petscmath.h @@ -112,6 +112,9 @@ #define MPIU_SCALAR MPIU_C_COMPLEX #elif defined(PETSC_USE_REAL_DOUBLE) #define MPIU_SCALAR MPIU_C_DOUBLE_COMPLEX +#elif defined(PETSC_USE_REAL___FLOAT128) +#define MPIU_SCALAR MPIU_C___FLOAT128_COMPLEX +extern MPI_Datatype MPIU_C___FLOAT128_COMPLEX #endif /* PETSC_USE_REAL_* */ /* > > Thank you > > On Sun, Jan 15, 2012 at 5:43 PM, Jed Brown wrote: > > > On Sun, Jan 15, 2012 at 05:54, Xavier Garnaud < > > xavier.garnaud at ladhyx.polytechnique.fr> wrote: > > > >> Thank you for your help. > >> > >> quadmath.h contains quad complex functions. I naively added the > functions > >> for quad complex. The beginning of the compilation works, but then it > fails > >> with an MPI related error. I will contact you again when i manage to > make > >> it work, > >> > > > > What was the MPI error? > > > > It's likely that we are missing data types and MPI_Ops for quad complex. > > > > > >> > >> Sincerely, > >> > >> Xavier > >> > >> > >> > >> On Mon, Jan 9, 2012 at 8:01 PM, Barry Smith wrote: > >> > >>> > >>> We've never tried to do this. > >>> > >>> First you need to check if quadmath.h has quad complex stuff in it? 
> >>> If it does you need to add into petscmath.h the quad complex bindings > for > >>> all the various math operations like PetscScalar and PetscSqrtScalar() > etc > >>> > >>> Barry > >>> > >>> > >>> On Jan 9, 2012, at 12:05 PM, Xavier Garnaud wrote: > >>> > >>> > I am trying to build PETSc with complex number and quadruple > >>> precision, but > >>> > I get an error at the compiling stage. I do not get this error when I > >>> do > >>> > the same using real numbers. Are complex numbers incompatible with > >>> > quadruple precision? > >>> > Thank you very much, > >>> > Sincerely, > >>> > > >>> > Xavier > >>> > > >>> > > >>> > >>> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Mon Jan 16 02:16:59 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 16 Jan 2012 08:16:59 +0000 Subject: [petsc-users] DMDA with cell-centered finite volume method? Message-ID: Is it possible to use DMDAs in the context of a cell-centered finite volume method? For example, a 2D grid with 9-by-9 grid points would have 8-by-8 unknowns, located in the cell centers. After reading the documentation, I got the impression that the unknowns must be situated at the grid points, or am I mistaken? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Mon Jan 16 06:10:56 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 16 Jan 2012 06:10:56 -0600 Subject: [petsc-users] DMDA with cell-centered finite volume method? In-Reply-To: References: Message-ID: On Mon, Jan 16, 2012 at 02:16, Klaij, Christiaan wrote: > Is it possible to use DMDAs in the context of a cell-centered > finite volume method? For example, a 2D grid with 9-by-9 grid > points would have 8-by-8 unknowns, located in the cell > centers. After reading the documentation, I got the impression > that the unknowns must be situated at the grid points, or am I > mistaken? > Just make an 8x8 grid. Geometric refinement and coarsening is a little more delicate. I have used the hack of saying that the grid is periodic so that refinement is to 2n instead of 2n-1. You might want to use Q0 interpolation in this case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amesga1 at tigers.lsu.edu Mon Jan 16 18:20:02 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Mon, 16 Jan 2012 18:20:02 -0600 Subject: [petsc-users] SNESVI convergence spped Message-ID: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Dear all, I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 or: [A]{V}-{b}={0} here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. 
I would appreciate any suggestions or observations to increase the convergence speed? Best, Ata From karpeev at mcs.anl.gov Mon Jan 16 18:43:19 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Mon, 16 Jan 2012 18:43:19 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: What is the solution that you end up converging to, and what are the boundary conditions? Thanks. Dmitry. On Mon, Jan 16, 2012 at 6:20 PM, Ataollah Mesgarnejad < amesga1 at tigers.lsu.edu> wrote: > Dear all, > > I'm trying to use SNESVI to solve a quadratic problem with box > constraints. My problem in FE context reads: > > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - > (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 > > or: > > [A]{V}-{b}={0} > > here phi is the basis function, E and \alpha are positive constants, and > \epsilon is a positive regularization parameter in order of mesh > resolution. In this problem we expect V =1 a.e. and go to zero very fast > at some places. > I'm running this on a rather small problem (<500000 DOFS) on small number > of processors (<72). I expected SNESVI to converge in couple of iterations > (<10) since my A matrix doesn't change, however I'm experiencing a slow > convergence (~50-70 iterations). I checked KSP solver for SNES and it > converges with a few iterations. > > I would appreciate any suggestions or observations to increase the > convergence speed? > > Best, > Ata -------------- next part -------------- An HTML attachment was scrubbed... URL: From amesga1 at tigers.lsu.edu Mon Jan 16 18:49:03 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Mon, 16 Jan 2012 18:49:03 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: <90DD5423-A147-432D-8C56-7AD0070E4322@tigers.lsu.edu> > What is the solution that you end up converging to I get the correct solution. > , and what are the boundary conditions? > I have natural BCs everywhere ( dV/dn=0) so I don't force it explicitly. Ata > Thanks. > Dmitry. > > On Mon, Jan 16, 2012 at 6:20 PM, Ataollah Mesgarnejad wrote: > Dear all, > > I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: > > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 > > or: > > [A]{V}-{b}={0} > > here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. > I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. > > I would appreciate any suggestions or observations to increase the convergence speed? > > Best, > Ata > -------------- next part -------------- An HTML attachment was scrubbed... 
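A bare-bones sketch of the kind of VI setup being discussed in this thread, with the box 0 <= V <= 1 attached to the SNES (petsc-3.2 style). The solution vector V, residual r, Jacobian J, user context, and the FormFunction/FormJacobian callbacks are assumed to exist elsewhere; none of these names come from the original posts.

  #include <petscsnes.h>

  extern PetscErrorCode FormFunction(SNES,Vec,Vec,void*);  /* assumed, defined elsewhere */
  extern PetscErrorCode FormJacobian(SNES,Vec,Mat*,Mat*,MatStructure*,void*);

  PetscErrorCode SolveBoundedProblem(Vec V, Vec r, Mat J, void *user)
  {
    SNES           snes;
    Vec            xl,xu;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr);
    ierr = SNESSetFunction(snes,r,FormFunction,user);CHKERRQ(ierr);
    ierr = SNESSetJacobian(snes,J,J,FormJacobian,user);CHKERRQ(ierr);
    ierr = SNESSetType(snes,SNESVI);CHKERRQ(ierr);
    /* box constraints 0 <= V <= 1 */
    ierr = VecDuplicate(V,&xl);CHKERRQ(ierr);
    ierr = VecDuplicate(V,&xu);CHKERRQ(ierr);
    ierr = VecSet(xl,0.0);CHKERRQ(ierr);
    ierr = VecSet(xu,1.0);CHKERRQ(ierr);
    ierr = SNESVISetVariableBounds(snes,xl,xu);CHKERRQ(ierr);
    ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);           /* picks up -snes_vi_monitor etc. */
    ierr = SNESSolve(snes,PETSC_NULL,V);CHKERRQ(ierr);
    ierr = VecDestroy(&xl);CHKERRQ(ierr);
    ierr = VecDestroy(&xu);CHKERRQ(ierr);
    ierr = SNESDestroy(&snes);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }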
URL: From bsmith at mcs.anl.gov Mon Jan 16 20:05:30 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 16 Jan 2012 20:05:30 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. Barry On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: > Dear all, > > I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: > > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 > > or: > > [A]{V}-{b}={0} > > here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. > I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. > > I would appreciate any suggestions or observations to increase the convergence speed? > > Best, > Ata From karpeev at mcs.anl.gov Mon Jan 16 20:11:54 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Mon, 16 Jan 2012 20:11:54 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. Dmitry. On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: > > What do you get with -snes_vi_monitor it could be it is taking a while > to get the right active set. > > Barry > > On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: > > > Dear all, > > > > I'm trying to use SNESVI to solve a quadratic problem with box > constraints. My problem in FE context reads: > > > > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - > (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 > > > > or: > > > > [A]{V}-{b}={0} > > > > here phi is the basis function, E and \alpha are positive constants, and > \epsilon is a positive regularization parameter in order of mesh > resolution. In this problem we expect V =1 a.e. and go to zero very fast > at some places. > > I'm running this on a rather small problem (<500000 DOFS) on small > number of processors (<72). I expected SNESVI to converge in couple of > iterations (<10) since my A matrix doesn't change, however I'm experiencing > a slow convergence (~50-70 iterations). I checked KSP solver for SNES and > it converges with a few iterations. > > > > I would appreciate any suggestions or observations to increase the > convergence speed? 
> > > > Best, > > Ata > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Mon Jan 16 20:49:24 2012 From: bourdin at lsu.edu (Blaise Bourdin) Date: Mon, 16 Jan 2012 20:49:24 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: Hi, Ata and I are working together on this. The problem he describes is 1/2 of the iteration of our variational fracture code. In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 https://www.math.lsu.edu/~bourdin/377451-0000.png Blaise > It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, > but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. > Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. > > Dmitry. > > On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: > > What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. > > Barry > > On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: > > > Dear all, > > > > I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: > > > > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 > > > > or: > > > > [A]{V}-{b}={0} > > > > here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. > > I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. > > > > I would appreciate any suggestions or observations to increase the convergence speed? > > > > Best, > > Ata > > -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From karpeev at mcs.anl.gov Mon Jan 16 21:00:37 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Mon, 16 Jan 2012 21:00:37 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: I think looking at the output of -snes_vi_monitor, as Barry suggested, would be useful to see what really is going on. What initial guess are you using? If you initialize V = 1 then whether a degree of freedom belongs to the active set or not will depend only on the sign of the residual there. I imagine that only a few dofs will be driven away from the box boundary by a large E-term? Dmitry. On Mon, Jan 16, 2012 at 8:49 PM, Blaise Bourdin wrote: > Hi, > > Ata and I are working together on this. The problem he describes is 1/2 of > the iteration of our variational fracture code. > In our application, E is position dependant, and typically becomes very > large along very thin bands with width of the order of epsilon in the > domain. Essentially, we expect that V will remain exactly equal to 1 almost > everywhere, and will transition to 0 on these bands. Of course, we are > interested in the limit as epsilon goes to 0. > > If the problem indeed is that it takes many steps to add the degrees of > freedom. Is there any way to initialize manually the list of active > constraints? To give you an idea, here is a link to a picture of the type > of solution we expect. blue=1 > https://www.math.lsu.edu/~bourdin/377451-0000.png > > Blaise > > > > It seems to me that the problem is that ultimately ALL of the degrees of > freedom are in the active set, > but they get added to it a few at a time -- and there may even be some > "chatter" there -- necessitating many SNESVI steps. > Could it be that the regularization makes things worse? When \epsilon \ll > 1, the unconstrained solution is highly oscillatory, possibly further > exacerbating the problem. It's possible that it would be better if V just > diverged uniformly. Then nearly all of the degrees of freedom would bump > up against the upper obstacle all at once. > > Dmitry. > > On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: > >> >> What do you get with -snes_vi_monitor it could be it is taking a while >> to get the right active set. >> >> Barry >> >> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >> >> > Dear all, >> > >> > I'm trying to use SNESVI to solve a quadratic problem with box >> constraints. My problem in FE context reads: >> > >> > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - >> (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >> > >> > or: >> > >> > [A]{V}-{b}={0} >> > >> > here phi is the basis function, E and \alpha are positive constants, >> and \epsilon is a positive regularization parameter in order of mesh >> resolution. In this problem we expect V =1 a.e. and go to zero very fast >> at some places. >> > I'm running this on a rather small problem (<500000 DOFS) on small >> number of processors (<72). I expected SNESVI to converge in couple of >> iterations (<10) since my A matrix doesn't change, however I'm experiencing >> a slow convergence (~50-70 iterations). I checked KSP solver for SNES and >> it converges with a few iterations. >> > >> > I would appreciate any suggestions or observations to increase the >> convergence speed? >> > >> > Best, >> > Ata >> >> > > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 > http://www.math.lsu.edu/~bourdin > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Tue Jan 17 02:31:51 2012 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Tue, 17 Jan 2012 09:31:51 +0100 (CET) Subject: [petsc-users] Scalable Direct Coarse Solver for PCMG Message-ID: Hi, I'm solving Stokes problem in a coupled way with Vanka smoothers and MG and I need to choose a solver for the coarse grid correction. I've tried with an iterative solver too but it works only fairly on regular grids with few processes and otherwise performs very poorly with communication dominated calculations. I'd like to know if the tufo-fischer direct solver is suitable to solve saddle problems. The matrix is full rank. Thank you, Domenico From C.Klaij at marin.nl Tue Jan 17 01:44:10 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 17 Jan 2012 07:44:10 +0000 Subject: [petsc-users] DMGetMatrix segfault Message-ID: I'm learning to use DMs, below is a first try but it gives a segfault. Any ideas? #include int main(int argc, char **argv) { DM da0, da1; DMDABoundaryType bx = DMDA_BOUNDARY_PERIODIC, by = DMDA_BOUNDARY_PERIODIC; DMDAStencilType stype = DMDA_STENCIL_STAR; PetscInt Mx = 8, My = 8; PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); // create distributed array for Mx-by-My grid DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,1,1,PETSC_NULL,PETSC_NULL,&da0); DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,1,1,PETSC_NULL,PETSC_NULL,&da1); // mat nest from pack DM pack; Mat A; DMCompositeCreate(PETSC_COMM_WORLD,&pack); DMCompositeAddDM(pack,da0); DMCompositeAddDM(pack,da1); DMGetMatrix(pack,MATNEST,&A); MatView(A,PETSC_VIEWER_DEFAULT); PetscFinalize(); return 0; } [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] DMCompositeGetISLocalToGlobalMappings line 798 src/dm/impls/composite/pack.c [0]PETSC ERROR: [0] DMCreateLocalToGlobalMapping_Composite line 1490 src/dm/impls/composite/pack.c [0]PETSC ERROR: [0] DMGetLocalToGlobalMapping line 355 src/dm/interface/dm.c [0]PETSC ERROR: [0] DMGetMatrix_Composite line 222 src/dm/impls/composite/packm.c [0]PETSC ERROR: [0] DMGetMatrix line 569 src/dm/interface/dm.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./dmda-try3 on a linux_64b named lin0133 by cklaij Tue Jan 17 08:34:36 2012 [0]PETSC ERROR: Libraries linked from /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib [0]PETSC ERROR: Configure run at Mon Jan 16 14:03:34 2012 [0]PETSC ERROR: Configure options --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 --with-x=0 --with-mpe=0 --with-debugging=1 --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a --with-blas-lapack-dir=/opt/intel/mkl [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec has exited due to process rank 0 with PID 1445 on node lin0133 exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpiexec (as reported here). -------------------------------------------------------------------------- dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From C.Klaij at marin.nl Tue Jan 17 02:04:29 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 17 Jan 2012 08:04:29 +0000 Subject: [petsc-users] DMDA with cell-centered finite volume method? Message-ID: > > Is it possible to use DMDAs in the context of a cell-centered > > finite volume method? For example, a 2D grid with 9-by-9 grid > > points would have 8-by-8 unknowns, located in the cell > > centers. After reading the documentation, I got the impression > > that the unknowns must be situated at the grid points, or am I > > mistaken? > > > > Just make an 8x8 grid. Geometric refinement and coarsening is a little more > delicate. I have used the hack of saying that the grid is periodic so that > refinement is to 2n instead of 2n-1. You might want to use Q0 interpolation > in this case. Thanks Jed, but how would I store the grid point coordinates? The grid could be non-uniform (stretching, deforming) so I still need a 9-by-9 DM to store the grid point coordinates, right? dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Tue Jan 17 06:16:44 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 17 Jan 2012 06:16:44 -0600 Subject: [petsc-users] Scalable Direct Coarse Solver for PCMG In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 02:31, wrote: > I'd like to know if the tufo-fischer direct solver is suitable to solve > saddle problems. > Not unless you formulate A^T A which doubles the condition number. If you need a direct coarse level solver, I suggest using MUMPS or SuperLU_DIST, semi-redundantly if you have too many processors (e.g. thousands). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 17 06:32:38 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 17 Jan 2012 06:32:38 -0600 Subject: [petsc-users] DMGetMatrix segfault In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 01:44, Klaij, Christiaan wrote: > DMCompositeCreate(PETSC_COMM_WORLD,&pack); > DMCompositeAddDM(pack,da0); > DMCompositeAddDM(pack,da1); > Add this one line: DMSetUp(pack); I'll update petsc-dev to call DMSetUp() automatically when it is needed. > DMGetMatrix(pack,MATNEST,&A); > MatView(A,PETSC_VIEWER_DEFAULT); > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Jan 17 06:45:10 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 17 Jan 2012 06:45:10 -0600 Subject: [petsc-users] DMDA with cell-centered finite volume method? In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 02:04, Klaij, Christiaan wrote: > Thanks Jed, but how would I store the grid point coordinates? The grid > could > be non-uniform (stretching, deforming) so I still need a 9-by-9 DM to store > the grid point coordinates, right? > If you want to store coordinates that way, yes. One way is to create a separate DMDA for the coordinates (setting lx,ly,lz so the layout is compatible) and use PetscObjectCompose()/PetscObjectQuery() to attach it to the "solution" DM. If you need refinement/coarsening, you can implement it with whatever rule you want and set up the coarse/fine DM with an attached coordinate DM. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 17 08:16:06 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 17 Jan 2012 08:16:06 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> Message-ID: <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> Blaise, Let's not solve the problem until we know what the problem is. -snes_vi_monitor first then think about the cure Barry On Jan 16, 2012, at 8:49 PM, Blaise Bourdin wrote: > Hi, > > Ata and I are working together on this. The problem he describes is 1/2 of the iteration of our variational fracture code. > In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. > > If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? 
To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 > https://www.math.lsu.edu/~bourdin/377451-0000.png > > Blaise > > > >> It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, >> but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. >> Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. >> >> Dmitry. >> >> On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: >> >> What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. >> >> Barry >> >> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >> >> > Dear all, >> > >> > I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: >> > >> > (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >> > >> > or: >> > >> > [A]{V}-{b}={0} >> > >> > here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. >> > I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. >> > >> > I would appreciate any suggestions or observations to increase the convergence speed? >> > >> > Best, >> > Ata >> >> > > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin > > > > > > > From jedbrown at mcs.anl.gov Tue Jan 17 08:25:39 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 17 Jan 2012 08:25:39 -0600 Subject: [petsc-users] Scalable Direct Coarse Solver for PCMG In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 06:16, Jed Brown wrote: > Not unless you formulate A^T A which doubles the condition number. Jack pointed out my typo, of course this *squares* the condition number (not a good idea unless your coarse level is quite well-conditioned). -------------- next part -------------- An HTML attachment was scrubbed... URL: From amesga1 at tigers.lsu.edu Tue Jan 17 08:28:35 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Tue, 17 Jan 2012 08:28:35 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> Message-ID: Barry, I'm already running the program with -snes_vi_monitor. I'll update everyone in a few hours. Thanks, Ata On Jan 17, 2012, at 8:16 AM, Barry Smith wrote: > > Blaise, > > Let's not solve the problem until we know what the problem is. 
-snes_vi_monitor first then think about the cure > > Barry > > On Jan 16, 2012, at 8:49 PM, Blaise Bourdin wrote: > >> Hi, >> >> Ata and I are working together on this. The problem he describes is 1/2 of the iteration of our variational fracture code. >> In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. >> >> If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 >> https://www.math.lsu.edu/~bourdin/377451-0000.png >> >> Blaise >> >> >> >>> It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, >>> but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. >>> Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. >>> >>> Dmitry. >>> >>> On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: >>> >>> What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. >>> >>> Barry >>> >>> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >>> >>>> Dear all, >>>> >>>> I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: >>>> >>>> (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >>>> >>>> or: >>>> >>>> [A]{V}-{b}={0} >>>> >>>> here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. >>>> I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. >>>> >>>> I would appreciate any suggestions or observations to increase the convergence speed? >>>> >>>> Best, >>>> Ata >>> >>> >> >> -- >> Department of Mathematics and Center for Computation & Technology >> Louisiana State University, Baton Rouge, LA 70803, USA >> Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin >> >> >> >> >> >> >> > From praghanmor at gmail.com Tue Jan 17 12:26:15 2012 From: praghanmor at gmail.com (Rahul Praghanmor) Date: Tue, 17 Jan 2012 23:56:15 +0530 Subject: [petsc-users] unstructured finite volume method matrix assembly for partitioned mesh Message-ID: Dear Sir, I am working on a parallel unstructured finite volume solver.The solver is efficiently running in parallel using gauss seidel linear solver.The matrix is sparse and stored by CSR format.Now I want to implement a PETSc library to make the convergence faster.But I am going through a major problem as discussed below. If I partitioned a big domain say rectangular duct into 4 zones using parMetis.Each zone is solved in separate processor as fairly solved by gauss seidel linear solver.But I want to solve these zones by PETSc.How to do that?How to form a matrix with global numbering which is required format for PETSc to form a matrix?Does it necessarily important to form a global matrix? very few information available for assembling a matrix from unstructured finite volume method in PETSc. Thankx and regards, Rahul. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 17 12:41:32 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jan 2012 12:41:32 -0600 Subject: [petsc-users] unstructured finite volume method matrix assembly for partitioned mesh In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 12:26 PM, Rahul Praghanmor wrote: > Dear Sir, > I am working on a parallel unstructured finite volume solver.The > solver is efficiently running in parallel using gauss seidel linear > solver.The matrix is sparse and stored by CSR format.Now I want to > implement a PETSc library to make the convergence faster.But I am going > through a major problem as discussed below. > If I partitioned a big domain say rectangular duct into 4 zones > using parMetis.Each zone is solved in separate processor as fairly solved > by gauss seidel linear solver.But I want to solve these zones by PETSc.How > to do that?How to form a matrix with global numbering which is required > format for PETSc to form a matrix?Does it necessarily important to form a > global matrix? very few information available for assembling a matrix from > unstructured finite volume method in PETSc. > If you already run ParMetis, just number the rows it puts on each process consecutively. Matt > Thankx and regards, > Rahul. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Tue Jan 17 14:29:58 2012 From: stali at geology.wisc.edu (Tabrez Ali) Date: Tue, 17 Jan 2012 14:29:58 -0600 Subject: [petsc-users] preallocation and matrix storage Message-ID: <4F15DA46.4000107@geology.wisc.edu> Hello If we are not doing exact pre-allocation (I know this is despised by PETSc developers) then it seems that we need to specify an nz (per row) value of at least "max nonzeros in any row". 
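[Editor's note: a minimal sketch of the two SeqAIJ preallocation styles discussed here -- a single worst-case nz versus an exact per-row nnz array. This is an illustration only: the matrix size matches the -info output below, but the counts are placeholders, and the runs that follow were produced with the single-nz form.]

    Mat            A;
    PetscInt       i, n = 5236;   /* matrix dimension, as in the -info output below */
    PetscInt       *nnz;
    PetscErrorCode ierr;

    ierr = MatCreate(PETSC_COMM_SELF,&A);CHKERRQ(ierr);
    ierr = MatSetSizes(A,n,n,n,n);CHKERRQ(ierr);
    ierr = MatSetType(A,MATSEQAIJ);CHKERRQ(ierr);

    /* Style 1: one conservative count for every row, at least "max nonzeros in any row":
         ierr = MatSeqAIJSetPreallocation(A,18,PETSC_NULL);CHKERRQ(ierr);
       Style 2: exact per-row counts (no wasted space and no mallocs): */
    ierr = PetscMalloc(n*sizeof(PetscInt),&nnz);CHKERRQ(ierr);
    for (i=0; i<n; i++) nnz[i] = 18;  /* placeholder: put the true nonzero count of row i here */
    ierr = MatSeqAIJSetPreallocation(A,0,nnz);CHKERRQ(ierr);
    ierr = PetscFree(nnz);CHKERRQ(ierr);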
For example with nz=18 I do get 0 mallocs [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5236 X 5236; storage space: 22728 unneeded,71520 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 18 But with nz=16 I get 10 mallocs [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5236 X 5236; storage space: 12406 unneeded,71520 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 10 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 18 So the question is that why is the number of mallocs 10 (with nz=16) when the total storage space has been overestimated (because it says "12406 unneeded")? Or does "unneeded" mean something else? Thanks in advance Tabrez From knepley at gmail.com Tue Jan 17 14:23:41 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jan 2012 14:23:41 -0600 Subject: [petsc-users] preallocation and matrix storage In-Reply-To: <4F15DA46.4000107@geology.wisc.edu> References: <4F15DA46.4000107@geology.wisc.edu> Message-ID: On Tue, Jan 17, 2012 at 2:29 PM, Tabrez Ali wrote: > Hello > > If we are not doing exact pre-allocation (I know this is despised by PETSc > developers) then it seems that we need to specify an nz (per row) value of > at least "max nonzeros in any row". > > > For example with nz=18 I do get 0 mallocs > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5236 X 5236; storage space: > 22728 unneeded,71520 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 18 > > > But with nz=16 I get 10 mallocs > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5236 X 5236; storage space: > 12406 unneeded,71520 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 10 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 18 > > So the question is that why is the number of mallocs 10 (with nz=16) when > the total storage space has been overestimated (because it says "12406 > unneeded")? > > Or does "unneeded" mean something else? > We are doing things row-by-row here, so each row gets more space, but we do not continually move the rest of the matrix around. Matt > Thanks in advance > > Tabrez > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Tue Jan 17 14:51:30 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Tue, 17 Jan 2012 12:51:30 -0800 Subject: [petsc-users] To store the RHS vector b of KSPSolve. Message-ID: Hello all, I have a piece of code that need to store the RHS vector b of the ksp object via vecview. However, the routine is like: ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); ierr = DMMGSolve(dmmg);CHKERRQ(ierr); The rhs evaluation is in FormFunctionLocal(), how could store the rhs vector b? Thanks very much! Cheers, Rebecca From knepley at gmail.com Tue Jan 17 14:53:13 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jan 2012 14:53:13 -0600 Subject: [petsc-users] To store the RHS vector b of KSPSolve. In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 2:51 PM, Xuefei (Rebecca) Yuan wrote: > Hello all, > > I have a piece of code that need to store the RHS vector b of the ksp > object via vecview. 
> > However, the routine is like: > > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > ierr = DMMGSolve(dmmg);CHKERRQ(ierr); > > The rhs evaluation is in FormFunctionLocal(), how could store the rhs > vector b? > 1) In your nonlinear problem, what do you mean by b? 2) Don't use DMMG, use petsc-dev with regular SNES and SNESSetDM() Matt > Thanks very much! > > Cheers, > > Rebecca > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Tue Jan 17 15:01:39 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Tue, 17 Jan 2012 13:01:39 -0800 Subject: [petsc-users] To store the RHS vector b of KSPSolve. In-Reply-To: References: Message-ID: <8B22457B-5E43-4E23-91C6-250635091AE4@lbl.gov> Hello Matt, On Jan 17, 2012, at 12:53 PM, Matthew Knepley wrote: > On Tue, Jan 17, 2012 at 2:51 PM, Xuefei (Rebecca) Yuan wrote: > Hello all, > > I have a piece of code that need to store the RHS vector b of the ksp object via vecview. > > However, the routine is like: > > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); > ierr = DMMGSolve(dmmg);CHKERRQ(ierr); > > The rhs evaluation is in FormFunctionLocal(), how could store the rhs vector b? > > 1) In your nonlinear problem, what do you mean by b? > This b is refer to the linear problem's rhs, i.e., for the nonlinear problem F(u)=0 is the residual evaluation in FormFunctionLocal, and for the linear solver, Ax=b, this rhs vector b is -F(u^{k]), k is the iteration number for the nonlinear solver. I want to separate the Jacobian matrix and the rhs vector from the very beginning for testing some linear solver: ******* start solving for time = 0.50000 at time step = 1****** 0 SNES Function norm 1.242539468950e-02 start saving jacobian Linear solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 3.341795546738e-05 Linear solve converged due to CONVERGED_ITS iterations 1 2 SNES Function norm 1.329755764187e-08 Linear solve converged due to CONVERGED_ITS iterations 1 3 SNES Function norm 3.067585609727e-12 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE The jacobian matrix was saved before calling the linear solver provided by PETSc, and I need to get the corresponding rhs vector b with this jacobian matrix. > 2) Don't use DMMG, use petsc-dev with regular SNES and SNESSetDM() > I will look into that later. > Matt > Thanks very much! Best regards, Rebecca > Thanks very much! > > Cheers, > > Rebecca > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 17 15:12:23 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jan 2012 15:12:23 -0600 Subject: [petsc-users] To store the RHS vector b of KSPSolve. 
In-Reply-To: <8B22457B-5E43-4E23-91C6-250635091AE4@lbl.gov> References: <8B22457B-5E43-4E23-91C6-250635091AE4@lbl.gov> Message-ID: On Tue, Jan 17, 2012 at 3:01 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > On Jan 17, 2012, at 12:53 PM, Matthew Knepley wrote: > > On Tue, Jan 17, 2012 at 2:51 PM, Xuefei (Rebecca) Yuan wrote: > >> Hello all, >> >> I have a piece of code that need to store the RHS vector b of the ksp >> object via vecview. >> >> However, the routine is like: >> >> >> ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, >> FormJacobianLocal,0,0);CHKERRQ(ierr); >> ierr = DMMGSolve(dmmg);CHKERRQ(ierr); >> >> The rhs evaluation is in FormFunctionLocal(), how could store the rhs >> vector b? >> > > 1) In your nonlinear problem, what do you mean by b? > > > This b is refer to the linear problem's rhs, i.e., for the nonlinear > problem F(u)=0 is the residual evaluation in FormFunctionLocal, and for the > linear solver, Ax=b, this rhs vector b is -F(u^{k]), k is the iteration > number for the nonlinear solver. > -ksp_view_binary? Matt > I want to separate the Jacobian matrix and the rhs vector from the very > beginning for testing some linear solver: > > ******* start solving for time = 0.50000 at time step = 1****** > 0 SNES Function norm 1.242539468950e-02 > start saving jacobian > Linear solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 3.341795546738e-05 > Linear solve converged due to CONVERGED_ITS iterations 1 > 2 SNES Function norm 1.329755764187e-08 > Linear solve converged due to CONVERGED_ITS iterations 1 > 3 SNES Function norm 3.067585609727e-12 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE > > The jacobian matrix was saved before calling the linear solver provided by > PETSc, and I need to get the corresponding rhs vector b with this jacobian > matrix. > > 2) Don't use DMMG, use petsc-dev with regular SNES and SNESSetDM() > > I will look into that later. > > Matt > > > Thanks very much! > > Best regards, > > Rebecca > > > > Thanks very much! >> >> Cheers, >> >> Rebecca >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Tue Jan 17 15:46:15 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Tue, 17 Jan 2012 13:46:15 -0800 Subject: [petsc-users] To store the RHS vector b of KSPSolve. In-Reply-To: References: <8B22457B-5E43-4E23-91C6-250635091AE4@lbl.gov> Message-ID: <8F6DA7CA-A7D2-42DB-97DF-E967B07E5735@lbl.gov> > > -ksp_view_binary? > This will save all the ksp content in binary form (more than one set of jacobian and rhs), if I only want to save the first initial Jacobian and rhs, will let -ksp_max_it 1 -snes_max_it 1 give me that? Thanks, Rebecca From knepley at gmail.com Tue Jan 17 15:51:17 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jan 2012 15:51:17 -0600 Subject: [petsc-users] To store the RHS vector b of KSPSolve. 
In-Reply-To: <8F6DA7CA-A7D2-42DB-97DF-E967B07E5735@lbl.gov> References: <8B22457B-5E43-4E23-91C6-250635091AE4@lbl.gov> <8F6DA7CA-A7D2-42DB-97DF-E967B07E5735@lbl.gov> Message-ID: On Tue, Jan 17, 2012 at 3:46 PM, Xuefei (Rebecca) Yuan wrote: > > > > -ksp_view_binary? > > > > This will save all the ksp content in binary form (more than one set of > jacobian and rhs), if I only want to save the first initial Jacobian and > rhs, will let -ksp_max_it 1 -snes_max_it 1 give me that? > Yes Matt > Thanks, > > Rebecca > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Wed Jan 18 02:07:15 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 18 Jan 2012 08:07:15 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: I have two DMs which I add to a DMComposite and then use MATNEST when getting the corresponding matrix. This gives me block (0,0) and block (1,1). How do I set/get blocks (0,1) and (1,0)? Looking at ex28 I tried MatGetLocalSubMatrix but it gives a null arg... #include int main(int argc, char **argv) { DM da0, da1; DMDABoundaryType bx = DMDA_BOUNDARY_PERIODIC, by = DMDA_BOUNDARY_PERIODIC; DMDAStencilType stype = DMDA_STENCIL_STAR; PetscInt Mx = 8, My = 8; PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); // create distributed array for Q DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,2,1,PETSC_NULL,PETSC_NULL,&da0); // create distributed array for C DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,1,1,PETSC_NULL,PETSC_NULL,&da1); // mat nest from pack DM pack; Mat A; Vec X; DMCompositeCreate(PETSC_COMM_WORLD,&pack); DMCompositeAddDM(pack,da0); DMCompositeAddDM(pack,da1); DMSetUp(pack); DMGetMatrix(pack,MATNEST,&A); MatView(A,PETSC_VIEWER_DEFAULT); IS *is; Mat G; PetscInt col[1],ierr; PetscScalar vals[1]; DMCompositeGetLocalISs(pack,&is); MatGetLocalSubMatrix(A,is[0],is[1],&G); MatView(G,PETSC_VIEWER_DEFAULT); PetscFinalize(); return 0; } $ mpiexec -n 1 ./dmda-try3 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=128, cols=128 (0,1) : PETSC_NULL (1,0) : PETSC_NULL (1,1) : type=seqaij, rows=64, cols=64 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 1! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./dmda-try3 on a linux_64b named lin0133 by cklaij Wed Jan 18 09:00:37 2012 [0]PETSC ERROR: Libraries linked from /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib [0]PETSC ERROR: Configure run at Mon Jan 16 14:03:34 2012 [0]PETSC ERROR: Configure options --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 --with-x=0 --with-mpe=0 --with-debugging=1 --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a --with-blas-lapack-dir=/opt/intel/mkl [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatView() line 723 in src/mat/interface/matrix.c dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Wed Jan 18 06:22:16 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 18 Jan 2012 06:22:16 -0600 Subject: [petsc-users] access to matnest block (0,1) ? In-Reply-To: References: Message-ID: On Wed, Jan 18, 2012 at 02:07, Klaij, Christiaan wrote: > I have two DMs which I add to a DMComposite and then use MATNEST when > getting the corresponding matrix. This gives me block (0,0) and block > (1,1). How do I set/get blocks (0,1) and (1,0)? Looking at ex28 I tried > MatGetLocalSubMatrix but it gives a null arg... > So the problem is that we have no way of knowing what preallocation (nonzero pattern) _should_ go in the off-diagonal part. Unfortunately, the current preallocation mechanism (DMCompositeSetCoupling()) is a difficult thing to implement and the mechanism does not directly apply to MatNest. If you have ideas for a good preallocation API, I would like to hear it. I need to get back to the preallocation issue because it's an obvious wart in the multiphysics support (as long as we don't have fast dynamic preallocation, which is a somewhat viable alternative). What I would like is for the user to call MatGetLocalSubMatrix() for any blocks that they want allocated and set preallocation in terms of the local ordering. The current (unfortunate) solution for MatNest with off-diagonal parts is to create the submatrices after DMGetMatrix(), preallocate as you like, and copy the ISLocalToGlobalMappings over. 
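[Editor's note: an untested sketch of one way to realize the workaround described above -- create and preallocate the coupling block yourself, copy the local-to-global mappings from the diagonal blocks, and build the nest explicitly with MatCreateNest() instead of patching the MatNest returned by DMGetMatrix(). Function names follow the petsc-3.2/petsc-dev API of the time, the preallocation counts are placeholders, and da0/da1 are the two DMDAs from the code quoted below; treat this as an illustration, not a recipe.]

    Mat                    A00,A11,A01,A;
    Mat                    mats[4];
    ISLocalToGlobalMapping rmap0,cmap0,rmap1,cmap1;
    PetscInt               m0,n1;
    PetscErrorCode         ierr;

    ierr = DMGetMatrix(da0,MATAIJ,&A00);CHKERRQ(ierr);  /* diagonal block (0,0) */
    ierr = DMGetMatrix(da1,MATAIJ,&A11);CHKERRQ(ierr);  /* diagonal block (1,1) */

    /* create the coupling block (0,1) yourself and preallocate it as you like */
    ierr = MatGetLocalSize(A00,&m0,PETSC_NULL);CHKERRQ(ierr);
    ierr = MatGetLocalSize(A11,PETSC_NULL,&n1);CHKERRQ(ierr);
    ierr = MatCreate(PETSC_COMM_WORLD,&A01);CHKERRQ(ierr);
    ierr = MatSetSizes(A01,m0,n1,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = MatSetType(A01,MATAIJ);CHKERRQ(ierr);
    ierr = MatSeqAIJSetPreallocation(A01,5,PETSC_NULL);CHKERRQ(ierr);              /* placeholder counts */
    ierr = MatMPIAIJSetPreallocation(A01,5,PETSC_NULL,2,PETSC_NULL);CHKERRQ(ierr); /* placeholder counts */

    /* copy the local-to-global mappings over from the diagonal blocks */
    ierr = MatGetLocalToGlobalMapping(A00,&rmap0,&cmap0);CHKERRQ(ierr);
    ierr = MatGetLocalToGlobalMapping(A11,&rmap1,&cmap1);CHKERRQ(ierr);
    ierr = MatSetLocalToGlobalMapping(A01,rmap0,cmap1);CHKERRQ(ierr);

    /* assemble the 2x2 nest [A00 A01; 0 A11]; a NULL entry is treated as a zero block */
    mats[0] = A00; mats[1] = A01; mats[2] = PETSC_NULL; mats[3] = A11;
    ierr = MatCreateNest(PETSC_COMM_WORLD,2,PETSC_NULL,2,PETSC_NULL,mats,&A);CHKERRQ(ierr);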
> #include > > int main(int argc, char **argv) > { > > DM da0, da1; > DMDABoundaryType bx = DMDA_BOUNDARY_PERIODIC, by = DMDA_BOUNDARY_PERIODIC; > DMDAStencilType stype = DMDA_STENCIL_STAR; > PetscInt Mx = 8, My = 8; > > PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); > > // create distributed array for Q > > DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,2,1,PETSC_NULL,PETSC_NULL,&da0); > > // create distributed array for C > > DMDACreate2d(PETSC_COMM_WORLD,bx,by,stype,Mx,My,PETSC_DECIDE,PETSC_DECIDE,1,1,PETSC_NULL,PETSC_NULL,&da1); > > // mat nest from pack > DM pack; > Mat A; > Vec X; > DMCompositeCreate(PETSC_COMM_WORLD,&pack); > DMCompositeAddDM(pack,da0); > DMCompositeAddDM(pack,da1); > DMSetUp(pack); > DMGetMatrix(pack,MATNEST,&A); > MatView(A,PETSC_VIEWER_DEFAULT); > > IS *is; > Mat G; > PetscInt col[1],ierr; > PetscScalar vals[1]; > DMCompositeGetLocalISs(pack,&is); > MatGetLocalSubMatrix(A,is[0],is[1],&G); > MatView(G,PETSC_VIEWER_DEFAULT); > > PetscFinalize(); > > return 0; > > } > > > $ mpiexec -n 1 ./dmda-try3 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=128, cols=128 > (0,1) : PETSC_NULL > (1,0) : PETSC_NULL > (1,1) : type=seqaij, rows=64, cols=64 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 > CDT 2011 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./dmda-try3 on a linux_64b named lin0133 by cklaij Wed Jan > 18 09:00:37 2012 > [0]PETSC ERROR: Libraries linked from > /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib > [0]PETSC ERROR: Configure run at Mon Jan 16 14:03:34 2012 > [0]PETSC ERROR: Configure options > --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 > --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 > --with-x=0 --with-mpe=0 --with-debugging=1 > --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include > --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a > --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include > --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a > --with-blas-lapack-dir=/opt/intel/mkl > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatView() line 723 in src/mat/interface/matrix.c > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amesga1 at tigers.lsu.edu Wed Jan 18 09:40:43 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Wed, 18 Jan 2012 09:40:43 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> Message-ID: <26304804-B4D1-4357-AA9B-E3FFCCB7CA31@tigers.lsu.edu> Dear all, Just realized that my email didn't go through because of my attachments, so here it is: Sorry if it took a bit long to do the runs, I wasn't feeling well yesterday. I attached the output I get from a small problem (90elements, 621 DOFs ) with different SNESVI types (exodusII and command line outputs). As you can see rsaug exits with an error but ss and rs run (and their results are similar). However, after V goes to zero at a cross section line searches for both of them (rs,ss) fail?! Also as you can see KSP converges for every step. These are the tolerances I pass to SNES: user->KSP_default_rtol = 1e-12; user->KSP_default_atol = 1e-12; user->KSP_default_dtol = 1e3; user->KSP_default_maxit = 50000; user->psi_default_frtol = 1e-8; // snes_frtol user->psi_default_fatol = 1e-8; //snes_fatol user->psi_maxit = 500; //snes_maxit user->psi_max_funcs = 1000; //snes_max_func_its Ps: files are here: http://cl.ly/0Z001Z3y1k0Q0g0s2F2R thanks, Ata On Jan 17, 2012, at 8:16 AM, Barry Smith wrote: > > Blaise, > > Let's not solve the problem until we know what the problem is. -snes_vi_monitor first then think about the cure > > Barry > > On Jan 16, 2012, at 8:49 PM, Blaise Bourdin wrote: > >> Hi, >> >> Ata and I are working together on this. The problem he describes is 1/2 of the iteration of our variational fracture code. >> In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. >> >> If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 >> https://www.math.lsu.edu/~bourdin/377451-0000.png >> >> Blaise >> >> >> >>> It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, >>> but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. >>> Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. >>> >>> Dmitry. >>> >>> On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: >>> >>> What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. >>> >>> Barry >>> >>> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >>> >>>> Dear all, >>>> >>>> I'm trying to use SNESVI to solve a quadratic problem with box constraints. 
My problem in FE context reads: >>>> >>>> (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >>>> >>>> or: >>>> >>>> [A]{V}-{b}={0} >>>> >>>> here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. >>>> I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. >>>> >>>> I would appreciate any suggestions or observations to increase the convergence speed? >>>> >>>> Best, >>>> Ata >>> >>> >> >> -- >> Department of Mathematics and Center for Computation & Technology >> Louisiana State University, Baton Rouge, LA 70803, USA >> Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin >> >> >> >> >> >> >> > From jiangwen84 at gmail.com Wed Jan 18 10:32:45 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Wed, 18 Jan 2012 11:32:45 -0500 Subject: [petsc-users] generate entries on 'wrong' process Message-ID: Hi, I am working on FEM codes with spline-based element type. For 3D case, one element has 64 nodes and every two neighboring elements share 48 nodes. Thus regardless how I partition a mesh, there are still very large number of entries that have to write on the 'wrong' processor. And my code is running on clusters, the processes are sending between 550 and 620 Million packets per second across the network. My code seems IO-bound at this moment and just get stuck at the matrix assembly stage. A -info file is attached. Do I have other options to optimize my codes to be less io-intensive? Thanks in advance. [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2514537 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2525390 used [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. 
Not using Inode routines [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2525733 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_info Type: application/octet-stream Size: 24579 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Jan 18 11:07:21 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 18 Jan 2012 11:07:21 -0600 (CST) Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: You can do 2 things. 1. allocate sufficient stash space to avoid mallocs. You can do this with the following runtime command line options -vecstash_initial_size -matstash_initial_size 2. flush stashed values in stages instead of doing a single large communication at the end. MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) ... ... MatAssemblyBegin/End(MAT_FINAL_ASSEMBLY) Satish On Wed, 18 Jan 2012, Wen Jiang wrote: > Hi, > > I am working on FEM codes with spline-based element type. For 3D case, one > element has 64 nodes and every two neighboring elements share 48 nodes. > Thus regardless how I partition a mesh, there are still very large number > of entries that have to write on the 'wrong' processor. And my code is > running on clusters, the processes are sending between 550 and 620 Million > packets per second across the network. My code seems IO-bound at this > moment and just get stuck at the matrix assembly stage. A -info file is > attached. Do I have other options to optimize my codes to be less > io-intensive? > > Thanks in advance. > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. 
> [0] MatStashScatterBegin_Private(): No of messages: 1 > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2514537 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2525390 used > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2500281 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2500281 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2500281 used > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2525733 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > From jedbrown at mcs.anl.gov Wed Jan 18 11:39:49 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 18 Jan 2012 11:39:49 -0600 Subject: [petsc-users] Multiple output using one viewer In-Reply-To: References: <4F05C04A.4060108@gfz-potsdam.de> <4F05C276.2050905@gfz-potsdam.de> <4F05C379.2040909@gfz-potsdam.de> Message-ID: On Thu, Jan 5, 2012 at 18:17, Barry Smith wrote: > On Jan 5, 2012, at 9:40 AM, Jed Brown wrote: > > > On Thu, Jan 5, 2012 at 09:36, Alexander Grayver > wrote: > > Maybe this should be noted in the documentation? > > > > Yes, I think the old file should be closed (if it exists), but I'll wait > for comment. > > I never thought about the case where someone called > PetscViewerFileSetName() twice. I'm surprised that it works at all. > > Yes, it should (IMHO) be changed to close the old file if used twice. It works this way now. http://petsc.cs.iit.edu/petsc/petsc-dev/rev/3a98e6a0994d -------------- next part -------------- An HTML attachment was scrubbed... 
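[Editor's note: returning to Satish's reply above in the "generate entries on 'wrong' process" thread -- a minimal sketch of suggestion (2), flushing the stash in stages during the element loop. The names A, Ke, rows, cols and nelem, and the flush interval of 2000 elements, are invented for illustration; the 64 entries per element simply mirror the 64-node elements described in the question. Suggestion (1) is applied at run time with -matstash_initial_size <n> and -vecstash_initial_size <n>.]

    PetscInt    e, rows[64], cols[64];
    PetscScalar Ke[64*64];   /* element matrix for one 64-node element (1 dof/node assumed) */

    for (e = 0; e < nelem; e++) {
      /* ... compute Ke and the global row/column indices of element e ... */
      ierr = MatSetValues(A,64,rows,64,cols,Ke,ADD_VALUES);CHKERRQ(ierr);

      /* every few thousand elements, ship the stashed off-process entries now
         instead of letting them all pile up until the final assembly */
      if (e % 2000 == 1999) {
        ierr = MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
        ierr = MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
      }
    }
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);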
URL: From jedbrown at mcs.anl.gov Wed Jan 18 11:40:28 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 18 Jan 2012 11:40:28 -0600 Subject: [petsc-users] DMGetMatrix segfault In-Reply-To: References: Message-ID: On Tue, Jan 17, 2012 at 06:32, Jed Brown wrote: > I'll update petsc-dev to call DMSetUp() automatically when it is needed. > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/56deb0e7db8b -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 18 12:56:10 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 18 Jan 2012 12:56:10 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: <47754349-9741-4740-BBB4-F4B84EA07CEF@mcs.anl.gov> What is the symptom of "just got stuck". Send the results of the whole run with -log_summary to petsc-maint at mcs.anl.gov and we'll see how much time is in that communication. Barry On Jan 18, 2012, at 10:32 AM, Wen Jiang wrote: > Hi, > > I am working on FEM codes with spline-based element type. For 3D case, one element has 64 nodes and every two neighboring elements share 48 nodes. Thus regardless how I partition a mesh, there are still very large number of entries that have to write on the 'wrong' processor. And my code is running on clusters, the processes are sending between 550 and 620 Million packets per second across the network. My code seems IO-bound at this moment and just get stuck at the matrix assembly stage. A -info file is attached. Do I have other options to optimize my codes to be less io-intensive? > > Thanks in advance. > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 1 > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2514537 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2525390 used > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using Inode routines > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2525733 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > From rxk at cfdrc.com Wed Jan 18 14:03:43 2012 From: rxk at cfdrc.com (Ravi Kannan) Date: Wed, 18 Jan 2012 14:03:43 -0600 Subject: [petsc-users] [petsc-dev] boomerAmg scalability In-Reply-To: <2D67D12E-86C8-4A7E-BD5A-5B955004274C@columbia.edu> References: <9E6B0CE58F7CC24294FAB8EF9902362A75B89942E7@EXCHMBB.ornl.gov> <9E6B0CE58F7CC24294FAB8EF9902362A75B89B4875@EXCHMBB.ornl.gov> <9110062A-05FF-4096-8DC2-CDEC14E5F8CE@mcs.anl.gov> <0B0FC28A-D635-4803-9B38-4954DCE3974B@ornl.gov> <003701ccbb45$f2bb23f0$d8316bd0$@com> <007801cccb04$5751ad20$05f50760$@com> <349C5EE1-314F-4E03-AF6C-E20D4D3DBDCF@columbia.edu> <001a01cccbff$8aca1420$a05e3c60$@com> <0! ! 00601cccf08$8aa7c300 $9ff74900$@com> <9E0AFCB4-D283-4329-8B7F-3DB1C4116197@columbia.edu> <003501ccd08e$2bda4500$838ecf00$@com> <30973A31-D618-46F8-9831-8F87CC3C43D0@columbia.edu> <004201ccd181$840a5dc0$8c1f1940$@com> <4CFD9328-CC1A-4984-9BBA-FF0CA8D7CB64@columbia.edu> <2548B229-DB77-4397-A677-146F2F0E3C5C@columbia.edu> <004101ccd217$09c79fa0$1d56dee0$@com> <949849D4-6822-4ED6-8373-F84F6C3209F0@columbia.edu> <00be01ccd24b$b78dde90$26a99bb0$@com> <56683DBA-A733-45FC-AB5B-E18368442FC6@columbia.edu> <2D67D12E-86C8-4A7E-BD5A-5B955004274C@columbia.edu> Message-ID: <006f01ccd61c$47c0fc80$d742f580$@com> Hi Mark, Hong, As you might remember, the reason for this whole exercise was to obtain a solution for a very stiff problem. We did have Hypre Boomer amg. This did not scale, but gives correct solution. So we wanted an alternative; hence we approached you for gamg. However for certain cases, gamg crashes. Even for the working cases, it takes about 15-20 times more sweeps than the boomer-hypre. Hence it is cost-prohibitive. Hopefully this gamg solver can be improved in the near future, for users like us. Warm Regards, Ravi. From: Mark F. Adams [mailto:mark.adams at columbia.edu] Sent: Wednesday, January 18, 2012 9:56 AM To: Hong Zhang Cc: rxk at cfdrc.com Subject: Re: [petsc-dev] boomerAmg scalability Hong and Ravi, I fixed a bug with the 6x6 problem. There seemed to be a bug in MatTranposeMat with funny decomposition, that was not really verified. So we can wait for Ravi to continue with his tests a fix them as they arise. Mark ps, Ravi, I may not have cc'ed so I will send again. On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote: Ravi, I wrote a simple test ex163.c (attached) on MatTransposeMatMult(). Loading your 6x6 matrix gives no error from MatTransposeMatMult() using 1,2,...7 processes. 
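[Editor's note: the attached ex163.c is not preserved in the archive. Below is a rough sketch of what such a stand-alone MatTransposeMatMult() driver might look like -- load a matrix from a PETSc binary file named with -f, form C = A^T * A, and print it. The structure and names are the editor's guess at the petsc-3.2/petsc-dev era API, not the actual attachment; Hong's run of the real ex163 follows.]

    static char help[] = "Loads a matrix in PETSc binary format and forms C = A^T * A.\n";

    #include <petscmat.h>

    int main(int argc,char **args)
    {
      Mat            A,C;
      PetscViewer    fd;
      char           file[PETSC_MAX_PATH_LEN];
      PetscBool      flg;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc,&args,(char*)0,help);CHKERRQ(ierr);
      ierr = PetscOptionsGetString(PETSC_NULL,"-f",file,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr);

      /* read the matrix written by -ksp_view_binary (or MatView on a binary viewer) */
      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file,FILE_MODE_READ,&fd);CHKERRQ(ierr);
      ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
      ierr = MatSetType(A,MATAIJ);CHKERRQ(ierr);
      ierr = MatLoad(A,fd);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);

      ierr = MatTransposeMatMult(A,A,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&C);CHKERRQ(ierr);
      ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

      ierr = MatDestroy(&C);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return 0;
    }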
For example, petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f /Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput A: Matrix Object: 1 MPI processes type: mpiaij row 0: (0, 1.66668e+06) (1, -1.35) (3, -0.6) row 1: (0, -1.35) (1, 1.66667e+06) (2, -1.35) (4, -0.6) row 2: (1, -1.35) (2, 1.66667e+06) (5, -0.6) row 3: (0, -0.6) (3, 1.66668e+06) (4, -1.35) row 4: (1, -0.6) (3, -1.35) (4, 1.66667e+06) (5, -1.35) row 5: (2, -0.6) (4, -1.35) (5, 1.66667e+06) C = A^T * A: Matrix Object: 1 MPI processes type: mpiaij row 0: (0, 2.77781e+12) (1, -4.50002e+06) (2, 1.8225) (3, -2.00001e+06) (4, 1.62) row 1: (0, -4.50002e+06) (1, 2.77779e+12) (2, -4.50001e+06) (3, 1.62) (4, -2.00001e+06) (5, 1.62) row 2: (0, 1.8225) (1, -4.50001e+06) (2, 2.7778e+12) (4, 1.62) (5, -2.00001e+06) row 3: (0, -2.00001e+06) (1, 1.62) (3, 2.77781e+12) (4, -4.50002e+06) (5, 1.8225) row 4: (0, 1.62) (1, -2.00001e+06) (2, 1.62) (3, -4.50002e+06) (4, 2.77779e+12) (5, -4.50001e+06) row 5: (1, 1.62) (2, -2.00001e+06) (3, 1.8225) (4, -4.50001e+06) (5, 2.7778e+12) Do I miss something? Hong On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams wrote: Ravi, this system is highly diagonally dominate. I've fixed the code so you can pull and try again. I've decided to basically just do a one level method with DD systems. I don't know if that is the best semantics, I think Barry will hate it, because it gives you a one level solver when you asked for MG. It now picks up the coarse grid solver as the solver, which is wrong, so I need to fix this if we decide to stick with the current semantics. And again thanks for helping to pound on this code. Mark On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote: Hi Mark, Hong, Lets make it simpler. I fixed my partitiotion bug (in metis). Now there is a equidivision of cells. To simplify even further, lets run a much smaller case : with 6 cells (equations) in SERIAL. This one crashes. The out and the ksp_view_binary files are attached. Thanks, RAvi. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams Sent: Friday, January 13, 2012 3:00 PM To: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability Well, we do have a bug here. It should work with zero elements on a proc, but the code is being actively developed so you are really helping us to find these cracks. If its not too hard it would be nice if you could give use these matrices, before you fix it, so we can fix this bug. You can just send it to Hong and I (cc'ed). Mark On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote: Hi Mark,Hong Thanks for the observation w.r.t the proc 0 having 2 equations. This is a bug from our end. We will fix it and get back to you if needed. Thanks, Ravi. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams Sent: Thursday, January 12, 2012 10:03 PM To: Hong Zhang Cc: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability Ravi, can you run with -ksp_view_binary? This will produce two files. Hong, ex10 will read in these files and solve them. I will probably not be able to get to this until Monday. Also, this matrix has just two equations on proc 0 and and about 11000 on proc 1 so its is strangely balanced, in case that helps ... Mark On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote: Ravi, I need more info for debugging. Can you provide a simple stand alone code and matrices in petsc binary format that reproduce the error? 
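[Editor's note: for readers wondering how to provide a matrix in PETSc binary format for a report like this -- besides the -ksp_view_binary option mentioned elsewhere in the thread, any assembled matrix can be dumped explicitly. A minimal fragment follows; the file name is arbitrary, and A and ierr come from the surrounding code.]

    PetscViewer fd;
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat.bin",FILE_MODE_WRITE,&fd);CHKERRQ(ierr);
    ierr = MatView(A,fd);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);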
MatTransposeMatMult() for mpiaij is a newly developed subroutine - less than one month old and not well tested yet :-( I used petsc-dev/src/mat/examples/tests/ex94.c for testing. Thanks, Hong On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams wrote: It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is working on it. I'm hoping that your output will be enough for Hong to figure this out but I could not reproduce this problem with any of my tests. If Hong can not figure this out then we will need to get the matrix from you to reproduce this. Mark On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote: Hi Mark, Any luck with the gamg bug fix? Thanks, Ravi. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams Sent: Wednesday, January 11, 2012 1:54 PM To: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability This seems to be dying earlier than it was last week, so it looks like a new bug in MatTransposeMatMult. Mark On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote: On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan wrote: Hi Mark, I downloaded the dev version again. This time, the program crashes even earlier. Attached is the serial and parallel info outputs. Could you kindly take a look. It looks like this is a problem with MatMatMult(). Can you try to reproduce this using KSP ex10? You put your matrix in binary format and use -pc_type gamg. Then you can send us the matrix and we can track it down. Or are you running an example there? Thanks, Matt Thanks, Ravi. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams Sent: Monday, January 09, 2012 3:08 PM To: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability Yes its all checked it, just pull from dev. Mark On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote: Hi Mark, Thanks for your efforts. Do I need to do the install from scratch once again? Or some particular files (check out gamg.c for instance)? Thanks, Ravi. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams Sent: Friday, January 06, 2012 10:30 AM To: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability I think I found the problem. You will need to use petsc-dev to get the fix. Mark On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote: Ravi, I forgot but you can just use -ksp_view_binary to output the matrix data (two files). You could run it with two procs and a Jacobi solver to get it past the solve, where it writes the matrix (I believe). Mark On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote: Just send in another email with the attachment. From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Jed Brown Sent: Thursday, January 05, 2012 5:15 PM To: For users of the development version of PETSc Subject: Re: [petsc-dev] boomerAmg scalability On Thu, Jan 5, 2012 at 17:12, Ravi Kannan wrote: I have attached the verbose+info outputs for both the serial and the parallel (2 partitions). NOTE: the serial output at some location says PC=Jacobi! Is it implicitly converting the PC to a Jacobi? Looks like you forgot the attachment. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jiangwen84 at gmail.com Wed Jan 18 14:22:59 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Wed, 18 Jan 2012 15:22:59 -0500 Subject: [petsc-users] generate entries on 'wrong' process (Barry Smith) Message-ID: Hi Barry, The symptom of "just got stuck" is simply that the code just stays there and never moves on. One more thing is that all the processes are at 99% cpu utilization. I do see some network traffic between the head node and computation nodes. The quantity is very small, but the sheer number of packets is huge. The processes are sending between 550 and 620 Million packets per second across the network. Since my code never finishes, I cannot get the summary files by add -log_summary. any other way to get summary file? BTW, my codes are running without any problem on shared-memory desktop with any number of processes. On Wed, Jan 18, 2012 at 3:03 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: generate entries on 'wrong' process (Barry Smith) > 2. Re: [petsc-dev] boomerAmg scalability (Ravi Kannan) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 18 Jan 2012 12:56:10 -0600 > From: Barry Smith > Subject: Re: [petsc-users] generate entries on 'wrong' process > To: PETSc users list > Message-ID: <47754349-9741-4740-BBB4-F4B84EA07CEF at mcs.anl.gov> > Content-Type: text/plain; charset=us-ascii > > > What is the symptom of "just got stuck". Send the results of the whole > run with -log_summary to petsc-maint at mcs.anl.gov and we'll see how much > time is in that communication. > > Barry > > > On Jan 18, 2012, at 10:32 AM, Wen Jiang wrote: > > > Hi, > > > > I am working on FEM codes with spline-based element type. For 3D case, > one element has 64 nodes and every two neighboring elements share 48 nodes. > Thus regardless how I partition a mesh, there are still very large number > of entries that have to write on the 'wrong' processor. And my code is > running on clusters, the processes are sending between 550 and 620 Million > packets per second across the network. My code seems IO-bound at this > moment and just get stuck at the matrix assembly stage. A -info file is > attached. Do I have other options to optimize my codes to be less > io-intensive? > > > > Thanks in advance. > > > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 > mallocs. 
> > [0] MatStashScatterBegin_Private(): No of messages: 1 > > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 > mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 > mallocs. > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: > 0 unneeded,2514537 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode routines > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: > 0 unneeded,2525390 used > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using > Inode routines > > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: > 0 unneeded,2500281 used > > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode routines > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: > 0 unneeded,2500281 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode routines > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: > 0 unneeded,2500281 used > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode routines > > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: > 0 unneeded,2525733 used > > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode routines > > > > > > ------------------------------ > > Message: 2 > Date: Wed, 18 Jan 2012 14:03:43 -0600 > From: "Ravi Kannan" > Subject: Re: [petsc-users] [petsc-dev] boomerAmg scalability > To: "'Mark F. Adams'" > Cc: 'PETSc users list' > Message-ID: <006f01ccd61c$47c0fc80$d742f580$@com> > Content-Type: text/plain; charset="us-ascii" > > Hi Mark, Hong, > > > > As you might remember, the reason for this whole exercise was to obtain a > solution for a very stiff problem. > > > > We did have Hypre Boomer amg. This did not scale, but gives correct > solution. So we wanted an alternative; hence we approached you for gamg. > > > > However for certain cases, gamg crashes. Even for the working cases, it > takes about 15-20 times more sweeps than the boomer-hypre. Hence it is > cost-prohibitive. > > > > Hopefully this gamg solver can be improved in the near future, for users > like us. > > > > Warm Regards, > > Ravi. > > > > > > From: Mark F. Adams [mailto:mark.adams at columbia.edu] > Sent: Wednesday, January 18, 2012 9:56 AM > To: Hong Zhang > Cc: rxk at cfdrc.com > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Hong and Ravi, > > > > I fixed a bug with the 6x6 problem. 
There seemed to be a bug in > MatTranposeMat with funny decomposition, that was not really verified. So > we can wait for Ravi to continue with his tests a fix them as they arise. > > > > Mark > > ps, Ravi, I may not have cc'ed so I will send again. > > > > On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote: > > > > > > Ravi, > > I wrote a simple test ex163.c (attached) on MatTransposeMatMult(). > > Loading your 6x6 matrix gives no error from MatTransposeMatMult() > > using 1,2,...7 processes. > > For example, > > > > petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f > /Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput > > A: > > Matrix Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 1.66668e+06) (1, -1.35) (3, -0.6) > > row 1: (0, -1.35) (1, 1.66667e+06) (2, -1.35) (4, -0.6) > > row 2: (1, -1.35) (2, 1.66667e+06) (5, -0.6) > > row 3: (0, -0.6) (3, 1.66668e+06) (4, -1.35) > > row 4: (1, -0.6) (3, -1.35) (4, 1.66667e+06) (5, -1.35) > > row 5: (2, -0.6) (4, -1.35) (5, 1.66667e+06) > > > > C = A^T * A: > > Matrix Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 2.77781e+12) (1, -4.50002e+06) (2, 1.8225) (3, -2.00001e+06) > (4, 1.62) > > row 1: (0, -4.50002e+06) (1, 2.77779e+12) (2, -4.50001e+06) (3, 1.62) > (4, -2.00001e+06) (5, 1.62) > > row 2: (0, 1.8225) (1, -4.50001e+06) (2, 2.7778e+12) (4, 1.62) (5, > -2.00001e+06) > > row 3: (0, -2.00001e+06) (1, 1.62) (3, 2.77781e+12) (4, -4.50002e+06) > (5, 1.8225) > > row 4: (0, 1.62) (1, -2.00001e+06) (2, 1.62) (3, -4.50002e+06) (4, > 2.77779e+12) (5, -4.50001e+06) > > row 5: (1, 1.62) (2, -2.00001e+06) (3, 1.8225) (4, -4.50001e+06) (5, > 2.7778e+12) > > > > Do I miss something? > > > > Hong > > > > On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams > wrote: > > Ravi, this system is highly diagonally dominate. I've fixed the code so > you > can pull and try again. > > > > I've decided to basically just do a one level method with DD systems. I > don't know if that is the best semantics, I think Barry will hate it, > because it gives you a one level solver when you asked for MG. It now > picks > up the coarse grid solver as the solver, which is wrong, so I need to fix > this if we decide to stick with the current semantics. > > > > And again thanks for helping to pound on this code. > > > > Mark > > > > On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote: > > > > Hi Mark, Hong, > > > > Lets make it simpler. I fixed my partitiotion bug (in metis). Now there is > a > equidivision of cells. > > > > To simplify even further, lets run a much smaller case : with 6 cells > (equations) in SERIAL. This one crashes. The out and the ksp_view_binary > files are attached. > > > > Thanks, > > RAvi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Friday, January 13, 2012 3:00 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Well, we do have a bug here. It should work with zero elements on a proc, > but the code is being actively developed so you are really helping us to > find these cracks. > > > > If its not too hard it would be nice if you could give use these matrices, > before you fix it, so we can fix this bug. You can just send it to Hong > and > I (cc'ed). > > > > Mark > > > > On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote: > > > > Hi Mark,Hong > > > > Thanks for the observation w.r.t the proc 0 having 2 equations. This is a > bug from our end. We will fix it and get back to you if needed. 
> > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Thursday, January 12, 2012 10:03 PM > To: Hong Zhang > Cc: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Ravi, can you run with -ksp_view_binary? This will produce two files. > > > > Hong, ex10 will read in these files and solve them. I will probably not be > able to get to this until Monday. > > > > Also, this matrix has just two equations on proc 0 and and about 11000 on > proc 1 so its is strangely balanced, in case that helps ... > > > > Mark > > > > On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote: > > > > > > Ravi, > > > > I need more info for debugging. Can you provide a simple stand alone code > and matrices in petsc > > binary format that reproduce the error? > > > > MatTransposeMatMult() for mpiaij is a newly developed subroutine - less > than > one month old > > and not well tested yet :-( > > I used petsc-dev/src/mat/examples/tests/ex94.c for testing. > > > > Thanks, > > > > Hong > > On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams > wrote: > > It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is > working on it. > > > > I'm hoping that your output will be enough for Hong to figure this out but > I > could not reproduce this problem with any of my tests. > > > > If Hong can not figure this out then we will need to get the matrix from > you > to reproduce this. > > > > Mark > > > > > > On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote: > > > > > > Hi Mark, > > > > Any luck with the gamg bug fix? > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Wednesday, January 11, 2012 1:54 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > This seems to be dying earlier than it was last week, so it looks like a > new > bug in MatTransposeMatMult. > > > > Mark > > > > On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote: > > > > On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan wrote: > > Hi Mark, > > > > I downloaded the dev version again. This time, the program crashes even > earlier. Attached is the serial and parallel info outputs. > > > > Could you kindly take a look. > > > > It looks like this is a problem with MatMatMult(). Can you try to reproduce > this using KSP ex10? You put > > your matrix in binary format and use -pc_type gamg. Then you can send us > the > matrix and we can track > > it down. Or are you running an example there? > > > > Thanks, > > > > Matt > > > > > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Monday, January 09, 2012 3:08 PM > > > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > > > Yes its all checked it, just pull from dev. > > Mark > > > > On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote: > > > > Hi Mark, > > > > Thanks for your efforts. > > > > Do I need to do the install from scratch once again? Or some particular > files (check out gamg.c for instance)? > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. 
Adams > Sent: Friday, January 06, 2012 10:30 AM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > I think I found the problem. You will need to use petsc-dev to get the > fix. > > > > Mark > > > > On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote: > > > > Ravi, I forgot but you can just use -ksp_view_binary to output the matrix > data (two files). You could run it with two procs and a Jacobi solver to > get it past the solve, where it writes the matrix (I believe). > > Mark > > > > On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote: > > > > Just send in another email with the attachment. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Jed Brown > Sent: Thursday, January 05, 2012 5:15 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > On Thu, Jan 5, 2012 at 17:12, Ravi Kannan wrote: > > I have attached the verbose+info outputs for both the serial and the > parallel (2 partitions). NOTE: the serial output at some location says > PC=Jacobi! Is it implicitly converting the PC to a Jacobi? > > > > Looks like you forgot the attachment. > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120118/965a1679/attachment.htm > > > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 37, Issue 41 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 18 14:44:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 18 Jan 2012 14:44:21 -0600 Subject: [petsc-users] [petsc-dev] boomerAmg scalability In-Reply-To: <006f01ccd61c$47c0fc80$d742f580$@com> References: <9E6B0CE58F7CC24294FAB8EF9902362A75B89942E7@EXCHMBB.ornl.gov> <9E6B0CE58F7CC24294FAB8EF9902362A75B89B4875@EXCHMBB.ornl.gov> <9110062A-05FF-4096-8DC2-CDEC14E5F8CE@mcs.anl.gov> <0B0FC28A-D635-4803-9B38-4954DCE3974B@ornl.gov> <003701ccbb45$f2bb23f0$d8316bd0$@com> <007801cccb04$5751ad20$05f50760$@com> <349C5EE1-314F-4E03-AF6C-E20D4D3DBDCF@columbia.edu> <001a01cccbff$8aca1420$a05e3c60$@com> <9E0AFCB4-D283-4329-8B7F-3DB1C4116197@columbia.edu> <003501ccd08e$2bda4500$838ecf00$@com> <30973A31-D618-46F8-9831-8F87CC3C43D0@columbia.edu> <004201ccd181$840a5dc0$8c1f1940$@com> <4CFD9328-CC1A-4984-9BBA-FF0CA8D7CB64@columbia.edu> <2548B229-DB77-4397-A677-146F2F0E3C5C@columbia.edu> <004101ccd217$09c79fa0$1d56dee0$@com> <949849D4-6822-4ED6-8373-F84F6C3209F0@columbia.edu> <00be01ccd24b$b78dde90$26a99bb0$@com> <56683DBA-A733-45FC-AB5B-E18368442FC6@columbia.edu> <2D67D12E-86C8-4A7E-BD5A-5B955004274C@columbia.edu> <006f01ccd61c$47c0fc80$d742f580$@com> Message-ID: On Wed, Jan 18, 2012 at 14:03, Ravi Kannan wrote: > We did have Hypre Boomer amg. This did not scale, but gives correct > solution. So we wanted an alternative; hence we approached you for gamg.** > ** > > ** ** > > However for certain cases, gamg crashes. 
Even for the working cases, it > takes about 15-20 times more sweeps than the boomer-hypre. Hence it is > cost-prohibitive. > It would help to have a representative test matrix for us to test aggregation and smoothing strategies. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiangwen84 at gmail.com Wed Jan 18 14:55:16 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Wed, 18 Jan 2012 15:55:16 -0500 Subject: [petsc-users] generate entries on 'wrong' process (Satish Balay) Message-ID: Hi Satish, Thanks for your suggestion. I just tried both of these methods, but it seems that they did not work neither. After adding matstash_initial_size and vecstash_initial_size, stash uses 0 mallocs in mat_assembly stage. I cannot see much difference when I use MAT_FLUSH_ASSEMBLY after every element stiffness matrix are added. I list the last a few -info information where my codes gets stuck. [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 0 mallocs. [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 0 mallocs. [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 0 mallocs. [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 1 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 1 mallocs. On Wed, Jan 18, 2012 at 1:00 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: generate entries on 'wrong' process (Satish Balay) > 2. Re: Multiple output using one viewer (Jed Brown) > 3. Re: DMGetMatrix segfault (Jed Brown) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 18 Jan 2012 11:07:21 -0600 (CST) > From: Satish Balay > Subject: Re: [petsc-users] generate entries on 'wrong' process > To: PETSc users list > Message-ID: > Content-Type: TEXT/PLAIN; charset=US-ASCII > > You can do 2 things. > > 1. allocate sufficient stash space to avoid mallocs. > You can do this with the following runtime command line options > -vecstash_initial_size > -matstash_initial_size > > 2. flush stashed values in stages instead of doing a single > large communication at the end. > > > MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) > > MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) > ... > ... > > > MatAssemblyBegin/End(MAT_FINAL_ASSEMBLY) > > Satish > > > On Wed, 18 Jan 2012, Wen Jiang wrote: > > > Hi, > > > > I am working on FEM codes with spline-based element type. For 3D case, > one > > element has 64 nodes and every two neighboring elements share 48 nodes. > > Thus regardless how I partition a mesh, there are still very large > number > > of entries that have to write on the 'wrong' processor. 
And my code is > > running on clusters, the processes are sending between 550 and 620 > Million > > packets per second across the network. My code seems IO-bound at this > > moment and just get stuck at the matrix assembly stage. A -info file is > > attached. Do I have other options to optimize my codes to be less > > io-intensive? > > > > Thanks in advance. > > > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 > mallocs. > > [0] MatStashScatterBegin_Private(): No of messages: 1 > > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 > mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 > mallocs. > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2514537 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > > unneeded,2525390 used > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using > Inode > > routines > > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2525733 used > > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using > Inode > > routines > > > > > > ------------------------------ > > Message: 2 > Date: Wed, 18 Jan 2012 11:39:49 -0600 > From: Jed Brown > Subject: Re: [petsc-users] Multiple output using one viewer > To: PETSc users list > Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > On Thu, Jan 5, 2012 at 18:17, Barry Smith wrote: > > > On Jan 5, 2012, at 9:40 AM, Jed Brown wrote: > > > > > On Thu, Jan 5, 2012 at 09:36, Alexander Grayver < > agrayver at gfz-potsdam.de> > > wrote: > > > Maybe this should be noted in the documentation? > > > > > > Yes, I think the old file should be closed (if it exists), but I'll > wait > > for comment. > > > > I never thought about the case where someone called > > PetscViewerFileSetName() twice. I'm surprised that it works at all. > > > > Yes, it should (IMHO) be changed to close the old file if used twice. > > > It works this way now. > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/3a98e6a0994d > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120118/32ccd7d5/attachment-0001.htm > > > > ------------------------------ > > Message: 3 > Date: Wed, 18 Jan 2012 11:40:28 -0600 > From: Jed Brown > Subject: Re: [petsc-users] DMGetMatrix segfault > To: PETSc users list > Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > On Tue, Jan 17, 2012 at 06:32, Jed Brown wrote: > > > I'll update petsc-dev to call DMSetUp() automatically when it is needed. > > > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/56deb0e7db8b > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120118/ffb21c3f/attachment-0001.htm > > > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 37, Issue 40 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jan 18 15:05:56 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 18 Jan 2012 15:05:56 -0600 Subject: [petsc-users] generate entries on 'wrong' process (Barry Smith) In-Reply-To: References: Message-ID: <5AC2FA09-539A-4510-9F41-69F6E537AEBA@mcs.anl.gov> On Jan 18, 2012, at 2:22 PM, Wen Jiang wrote: > Hi Barry, > > The symptom of "just got stuck" is simply that the code just stays there and never moves on. One more thing is that all the processes are at 99% cpu utilization. I do see some network traffic between the head node and computation nodes. The quantity is very small, but the sheer number of packets is huge. The processes are sending between 550 and 620 Million packets per second across the network. > > Since my code never finishes, I cannot get the summary files by add -log_summary. any other way to get summary file? My guess is that you are running a larger problem on the this system and your preallocation for the matrix is wrong. While in the small run you sent the preallocation is correct. Usually the only thing that causes it to take forever is not the parallel communication but is the preallocation. After you create the matrix and set its preallocation call MatSetOption(mat, NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE); then run. It will stop with an error message if preallocation is wrong. 
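For reference, a minimal sketch of that check (the sizes and per-row nonzero counts below are placeholders for illustration, not values taken from your run):

    Mat            A;
    PetscErrorCode ierr;
    PetscInt       nlocal = 11391;  /* rows owned by this process (placeholder) */
    PetscInt       d_nz   = 294;    /* estimated nonzeros per row, diagonal block (placeholder) */
    PetscInt       o_nz   = 64;     /* estimated nonzeros per row, off-diagonal block (placeholder) */

    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
    ierr = MatMPIAIJSetPreallocation(A,d_nz,PETSC_NULL,o_nz,PETSC_NULL);CHKERRQ(ierr);
    /* error out instead of silently mallocing when an entry exceeds the preallocation */
    ierr = MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);
    /* ... MatSetValues() loop over elements, then MatAssemblyBegin/End(MAT_FINAL_ASSEMBLY) ... */

If the error message appears, increase d_nz/o_nz (or supply per-row arrays) until assembly completes without new allocations.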
Barry > > BTW, my codes are running without any problem on shared-memory desktop with any number of processes. > > On Wed, Jan 18, 2012 at 3:03 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: generate entries on 'wrong' process (Barry Smith) > 2. Re: [petsc-dev] boomerAmg scalability (Ravi Kannan) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 18 Jan 2012 12:56:10 -0600 > From: Barry Smith > Subject: Re: [petsc-users] generate entries on 'wrong' process > To: PETSc users list > Message-ID: <47754349-9741-4740-BBB4-F4B84EA07CEF at mcs.anl.gov> > Content-Type: text/plain; charset=us-ascii > > > What is the symptom of "just got stuck". Send the results of the whole run with -log_summary to petsc-maint at mcs.anl.gov and we'll see how much time is in that communication. > > Barry > > > On Jan 18, 2012, at 10:32 AM, Wen Jiang wrote: > > > Hi, > > > > I am working on FEM codes with spline-based element type. For 3D case, one element has 64 nodes and every two neighboring elements share 48 nodes. Thus regardless how I partition a mesh, there are still very large number of entries that have to write on the 'wrong' processor. And my code is running on clusters, the processes are sending between 550 and 620 Million packets per second across the network. My code seems IO-bound at this moment and just get stuck at the matrix assembly stage. A -info file is attached. Do I have other options to optimize my codes to be less io-intensive? > > > > Thanks in advance. > > > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. > > [0] MatStashScatterBegin_Private(): No of messages: 1 > > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2514537 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using Inode routines > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2525390 used > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines > > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2525733 used > > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines > > > > > > ------------------------------ > > Message: 2 > Date: Wed, 18 Jan 2012 14:03:43 -0600 > From: "Ravi Kannan" > Subject: Re: [petsc-users] [petsc-dev] boomerAmg scalability > To: "'Mark F. Adams'" > Cc: 'PETSc users list' > Message-ID: <006f01ccd61c$47c0fc80$d742f580$@com> > Content-Type: text/plain; charset="us-ascii" > > Hi Mark, Hong, > > > > As you might remember, the reason for this whole exercise was to obtain a > solution for a very stiff problem. > > > > We did have Hypre Boomer amg. This did not scale, but gives correct > solution. So we wanted an alternative; hence we approached you for gamg. > > > > However for certain cases, gamg crashes. Even for the working cases, it > takes about 15-20 times more sweeps than the boomer-hypre. Hence it is > cost-prohibitive. > > > > Hopefully this gamg solver can be improved in the near future, for users > like us. > > > > Warm Regards, > > Ravi. > > > > > > From: Mark F. Adams [mailto:mark.adams at columbia.edu] > Sent: Wednesday, January 18, 2012 9:56 AM > To: Hong Zhang > Cc: rxk at cfdrc.com > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Hong and Ravi, > > > > I fixed a bug with the 6x6 problem. There seemed to be a bug in > MatTranposeMat with funny decomposition, that was not really verified. So > we can wait for Ravi to continue with his tests a fix them as they arise. > > > > Mark > > ps, Ravi, I may not have cc'ed so I will send again. > > > > On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote: > > > > > > Ravi, > > I wrote a simple test ex163.c (attached) on MatTransposeMatMult(). > > Loading your 6x6 matrix gives no error from MatTransposeMatMult() > > using 1,2,...7 processes. 
> > For example, > > > > petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f > /Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput > > A: > > Matrix Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 1.66668e+06) (1, -1.35) (3, -0.6) > > row 1: (0, -1.35) (1, 1.66667e+06) (2, -1.35) (4, -0.6) > > row 2: (1, -1.35) (2, 1.66667e+06) (5, -0.6) > > row 3: (0, -0.6) (3, 1.66668e+06) (4, -1.35) > > row 4: (1, -0.6) (3, -1.35) (4, 1.66667e+06) (5, -1.35) > > row 5: (2, -0.6) (4, -1.35) (5, 1.66667e+06) > > > > C = A^T * A: > > Matrix Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 2.77781e+12) (1, -4.50002e+06) (2, 1.8225) (3, -2.00001e+06) > (4, 1.62) > > row 1: (0, -4.50002e+06) (1, 2.77779e+12) (2, -4.50001e+06) (3, 1.62) > (4, -2.00001e+06) (5, 1.62) > > row 2: (0, 1.8225) (1, -4.50001e+06) (2, 2.7778e+12) (4, 1.62) (5, > -2.00001e+06) > > row 3: (0, -2.00001e+06) (1, 1.62) (3, 2.77781e+12) (4, -4.50002e+06) > (5, 1.8225) > > row 4: (0, 1.62) (1, -2.00001e+06) (2, 1.62) (3, -4.50002e+06) (4, > 2.77779e+12) (5, -4.50001e+06) > > row 5: (1, 1.62) (2, -2.00001e+06) (3, 1.8225) (4, -4.50001e+06) (5, > 2.7778e+12) > > > > Do I miss something? > > > > Hong > > > > On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams > wrote: > > Ravi, this system is highly diagonally dominate. I've fixed the code so you > can pull and try again. > > > > I've decided to basically just do a one level method with DD systems. I > don't know if that is the best semantics, I think Barry will hate it, > because it gives you a one level solver when you asked for MG. It now picks > up the coarse grid solver as the solver, which is wrong, so I need to fix > this if we decide to stick with the current semantics. > > > > And again thanks for helping to pound on this code. > > > > Mark > > > > On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote: > > > > Hi Mark, Hong, > > > > Lets make it simpler. I fixed my partitiotion bug (in metis). Now there is a > equidivision of cells. > > > > To simplify even further, lets run a much smaller case : with 6 cells > (equations) in SERIAL. This one crashes. The out and the ksp_view_binary > files are attached. > > > > Thanks, > > RAvi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Friday, January 13, 2012 3:00 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Well, we do have a bug here. It should work with zero elements on a proc, > but the code is being actively developed so you are really helping us to > find these cracks. > > > > If its not too hard it would be nice if you could give use these matrices, > before you fix it, so we can fix this bug. You can just send it to Hong and > I (cc'ed). > > > > Mark > > > > On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote: > > > > Hi Mark,Hong > > > > Thanks for the observation w.r.t the proc 0 having 2 equations. This is a > bug from our end. We will fix it and get back to you if needed. > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Thursday, January 12, 2012 10:03 PM > To: Hong Zhang > Cc: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > Ravi, can you run with -ksp_view_binary? This will produce two files. > > > > Hong, ex10 will read in these files and solve them. I will probably not be > able to get to this until Monday. 
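A rough sketch of the -ksp_view_binary / ex10 loading step mentioned above (assuming the petsc-3.2 MatLoad/VecLoad interface; the file name is a placeholder, and whether a right-hand side follows the matrix depends on how the file was written):

    Mat            A;
    Vec            b;
    PetscViewer    fd;
    PetscErrorCode ierr;

    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"binaryoutput",FILE_MODE_READ,&fd);CHKERRQ(ierr);
    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatLoad(A,fd);CHKERRQ(ierr);    /* reads the first object (the matrix) */
    ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERRQ(ierr);
    ierr = VecLoad(b,fd);CHKERRQ(ierr);    /* reads the next object (the rhs), if stored */
    ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);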
> > > > Also, this matrix has just two equations on proc 0 and and about 11000 on > proc 1 so its is strangely balanced, in case that helps ... > > > > Mark > > > > On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote: > > > > > > Ravi, > > > > I need more info for debugging. Can you provide a simple stand alone code > and matrices in petsc > > binary format that reproduce the error? > > > > MatTransposeMatMult() for mpiaij is a newly developed subroutine - less than > one month old > > and not well tested yet :-( > > I used petsc-dev/src/mat/examples/tests/ex94.c for testing. > > > > Thanks, > > > > Hong > > On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams > wrote: > > It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is > working on it. > > > > I'm hoping that your output will be enough for Hong to figure this out but I > could not reproduce this problem with any of my tests. > > > > If Hong can not figure this out then we will need to get the matrix from you > to reproduce this. > > > > Mark > > > > > > On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote: > > > > > > Hi Mark, > > > > Any luck with the gamg bug fix? > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Wednesday, January 11, 2012 1:54 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > This seems to be dying earlier than it was last week, so it looks like a new > bug in MatTransposeMatMult. > > > > Mark > > > > On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote: > > > > On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan wrote: > > Hi Mark, > > > > I downloaded the dev version again. This time, the program crashes even > earlier. Attached is the serial and parallel info outputs. > > > > Could you kindly take a look. > > > > It looks like this is a problem with MatMatMult(). Can you try to reproduce > this using KSP ex10? You put > > your matrix in binary format and use -pc_type gamg. Then you can send us the > matrix and we can track > > it down. Or are you running an example there? > > > > Thanks, > > > > Matt > > > > > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Monday, January 09, 2012 3:08 PM > > > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > > > Yes its all checked it, just pull from dev. > > Mark > > > > On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote: > > > > Hi Mark, > > > > Thanks for your efforts. > > > > Do I need to do the install from scratch once again? Or some particular > files (check out gamg.c for instance)? > > > > Thanks, > > Ravi. > > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Mark F. Adams > Sent: Friday, January 06, 2012 10:30 AM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > I think I found the problem. You will need to use petsc-dev to get the fix. > > > > Mark > > > > On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote: > > > > Ravi, I forgot but you can just use -ksp_view_binary to output the matrix > data (two files). You could run it with two procs and a Jacobi solver to > get it past the solve, where it writes the matrix (I believe). > > Mark > > > > On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote: > > > > Just send in another email with the attachment. 
> > > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] > On Behalf Of Jed Brown > Sent: Thursday, January 05, 2012 5:15 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > > On Thu, Jan 5, 2012 at 17:12, Ravi Kannan wrote: > > I have attached the verbose+info outputs for both the serial and the > parallel (2 partitions). NOTE: the serial output at some location says > PC=Jacobi! Is it implicitly converting the PC to a Jacobi? > > > > Looks like you forgot the attachment. > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 37, Issue 41 > ******************************************* > From bsmith at mcs.anl.gov Wed Jan 18 15:08:10 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 18 Jan 2012 15:08:10 -0600 Subject: [petsc-users] generate entries on 'wrong' process (Satish Balay) In-Reply-To: References: Message-ID: <9581B9F4-0D6C-4CF8-9ADF-67D721EE9EB2@mcs.anl.gov> On Jan 18, 2012, at 2:55 PM, Wen Jiang wrote: > Hi Satish, > > Thanks for your suggestion. > > I just tried both of these methods, but it seems that they did not work neither. After adding matstash_initial_size and vecstash_initial_size, stash uses 0 mallocs in mat_assembly stage. I cannot see much difference when I use MAT_FLUSH_ASSEMBLY after every element stiffness matrix are added. I list the last a few -info information where my codes gets stuck. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 0 mallocs. > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 0 mallocs. > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 0 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 0 mallocs. > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 0 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 0 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 1 > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 1 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 1 mallocs. I am now 99.9% sure your problem is incorrect preallocation. See my previous email to determine the problem. Barry > > > > > On Wed, Jan 18, 2012 at 1:00 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." > > > Today's Topics: > > 1. Re: generate entries on 'wrong' process (Satish Balay) > 2. Re: Multiple output using one viewer (Jed Brown) > 3. 
Re: DMGetMatrix segfault (Jed Brown) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 18 Jan 2012 11:07:21 -0600 (CST) > From: Satish Balay > Subject: Re: [petsc-users] generate entries on 'wrong' process > To: PETSc users list > Message-ID: > Content-Type: TEXT/PLAIN; charset=US-ASCII > > You can do 2 things. > > 1. allocate sufficient stash space to avoid mallocs. > You can do this with the following runtime command line options > -vecstash_initial_size > -matstash_initial_size > > 2. flush stashed values in stages instead of doing a single > large communication at the end. > > > MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) > > MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) > ... > ... > > > MatAssemblyBegin/End(MAT_FINAL_ASSEMBLY) > > Satish > > > On Wed, 18 Jan 2012, Wen Jiang wrote: > > > Hi, > > > > I am working on FEM codes with spline-based element type. For 3D case, one > > element has 64 nodes and every two neighboring elements share 48 nodes. > > Thus regardless how I partition a mesh, there are still very large number > > of entries that have to write on the 'wrong' processor. And my code is > > running on clusters, the processes are sending between 550 and 620 Million > > packets per second across the network. My code seems IO-bound at this > > moment and just get stuck at the matrix assembly stage. A -info file is > > attached. Do I have other options to optimize my codes to be less > > io-intensive? > > > > Thanks in advance. > > > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. > > [0] MatStashScatterBegin_Private(): No of messages: 1 > > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2514537 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > > routines > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > > unneeded,2525390 used > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > > routines > > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [3] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using Inode > > routines > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > > routines > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [4] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > > routines > > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2525733 used > > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > > routines > > > > > > ------------------------------ > > Message: 2 > Date: Wed, 18 Jan 2012 11:39:49 -0600 > From: Jed Brown > Subject: Re: [petsc-users] Multiple output using one viewer > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Thu, Jan 5, 2012 at 18:17, Barry Smith wrote: > > > On Jan 5, 2012, at 9:40 AM, Jed Brown wrote: > > > > > On Thu, Jan 5, 2012 at 09:36, Alexander Grayver > > wrote: > > > Maybe this should be noted in the documentation? > > > > > > Yes, I think the old file should be closed (if it exists), but I'll wait > > for comment. > > > > I never thought about the case where someone called > > PetscViewerFileSetName() twice. I'm surprised that it works at all. > > > > Yes, it should (IMHO) be changed to close the old file if used twice. > > > It works this way now. > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/3a98e6a0994d > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 3 > Date: Wed, 18 Jan 2012 11:40:28 -0600 > From: Jed Brown > Subject: Re: [petsc-users] DMGetMatrix segfault > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Tue, Jan 17, 2012 at 06:32, Jed Brown wrote: > > > I'll update petsc-dev to call DMSetUp() automatically when it is needed. > > > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/56deb0e7db8b > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > petsc-users mailing list > petsc-users at mcs.anl.gov > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > > > End of petsc-users Digest, Vol 37, Issue 40 > ******************************************* > From praghanmor at gmail.com Wed Jan 18 23:52:59 2012 From: praghanmor at gmail.com (Rahul Praghanmor) Date: Thu, 19 Jan 2012 11:22:59 +0530 Subject: [petsc-users] unstructured finite volume method matrix assembly for partitioned mesh In-Reply-To: References: Message-ID: my apologies...!! I am using METIS for partitioning the domain. 
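Concretely, the consecutive numbering suggested in the quoted reply below might look like the following sketch (names and the preallocation guesses are placeholders; nlocal is the number of cells this process received from the METIS partitioning):

    Mat            A;
    PetscErrorCode ierr;
    PetscInt       nlocal = 1000;   /* cells owned by this process (placeholder) */
    PetscInt       rstart, rend;

    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
    ierr = MatMPIAIJSetPreallocation(A,7,PETSC_NULL,3,PETSC_NULL);CHKERRQ(ierr); /* rough FV stencil guess */
    ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
    /* local cell i owns global row rstart+i; the column indices are the global
       numbers of the neighbouring cells, so each process needs a local-to-global
       map for its halo cells before calling MatSetValues() */

With this numbering the CSR rows already stored per zone can be copied into the matrix with one MatSetValues() call per row, using global indices for the columns.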
On Wed, Jan 18, 2012 at 12:11 AM, Matthew Knepley wrote: > On Tue, Jan 17, 2012 at 12:26 PM, Rahul Praghanmor wrote: > >> Dear Sir, >> I am working on a parallel unstructured finite volume >> solver.The solver is efficiently running in parallel using gauss seidel >> linear solver.The matrix is sparse and stored by CSR format.Now I want to >> implement a PETSc library to make the convergence faster.But I am going >> through a major problem as discussed below. >> If I partitioned a big domain say rectangular duct into 4 zones >> using parMetis.Each zone is solved in separate processor as fairly solved >> by gauss seidel linear solver.But I want to solve these zones by PETSc.How >> to do that?How to form a matrix with global numbering which is required >> format for PETSc to form a matrix?Does it necessarily important to form a >> global matrix? very few information available for assembling a matrix from >> unstructured finite volume method in PETSc. >> > > If you already run ParMetis, just number the rows it puts on each process > consecutively. > > Matt > > >> Thankx and regards, >> Rahul. >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Zeus Numerix Pvt. Ltd. I2IT Campus, P-14 Rajiv Gandhi Infotech Park, Phase-1 Hinjewadi Pune -57 0ff. +91 64731511/9965 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 02:20:02 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 00:20:02 -0800 Subject: [petsc-users] VecView() and binary output Message-ID: Hi guys, I use VecView() in conjunction with PetscViewerBinaryOpen() to write the binary data associated with a 3D data obtained from DA structured arrays. I noticed that on my machine, the size of the file is 8 bytes bigger than the size of the data. Could someone tell me where that extra 8 byte is stored and how it is going help? best Mohamad * * -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Jan 19 02:42:57 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 19 Jan 2012 08:42:57 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: > > I have two DMs which I add to a DMComposite and then use MATNEST when > > getting the corresponding matrix. This gives me block (0,0) and block > > (1,1). How do I set/get blocks (0,1) and (1,0)? Looking at ex28 I tried > > MatGetLocalSubMatrix but it gives a null arg... > > > > So the problem is that we have no way of knowing what preallocation > (nonzero pattern) _should_ go in the off-diagonal part. Unfortunately, the > current preallocation mechanism (DMCompositeSetCoupling()) is a difficult > thing to implement and the mechanism does not directly apply to MatNest. If > you have ideas for a good preallocation API, I would like to hear it. I > need to get back to the preallocation issue because it's an obvious wart in > the multiphysics support (as long as we don't have fast dynamic > preallocation, which is a somewhat viable alternative). What I would like > is for the user to call MatGetLocalSubMatrix() for any blocks that they > want allocated and set preallocation in terms of the local ordering. 
> > The current (unfortunate) solution for MatNest with off-diagonal parts is > to create the submatrices after DMGetMatrix(), preallocate as you like, and > copy the ISLocalToGlobalMappings over. I see the problem, no ideas for a good general preallocation mechanism, sorry. Preallocation would depend on how the user discretizes the cross terms, wouldn't it? So how can we expect PETSc to "deduce" it from the diagonal blocks? As a user I would be happy to preallocate myself. Out of curiosity, how does it happen in ex28.c: I only see two matrices being defined (line 347 and 355), I don't see DMCompositeSetCoupling() at all, yet on line 271 block (0,1) is available... What about a less general (but important) case: saddle point problems arising from incompressible Stokes, Oseen and Navier-Stokes eqs. with Schur type preconditioning. In 2D with N cells and co-located variables arranged as (u1,...,uN,v1,...,vN,p1,...pN) the matrix would have the form [Q G, D 0] with Q a 2N-by-2N matrix, G a 2N-by-N matrix and D a N-by-2N matrix. Since the variables are co-located, they share the same partitioning but could have different stencils. How to use the "split local space", DMComposite and MATNEST in this case? dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From mmnasr at gmail.com Thu Jan 19 02:50:51 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 00:50:51 -0800 Subject: [petsc-users] A few questions about hdf5 viewer Message-ID: Hi guys, I have compiled petsc to use HDF5 package. I like to store the data from a parallel vector(s) (obtained from structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). I followed the example here http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html and everything looks fine. However, I had a couple questions: 1- When I am done writing the parallel vector obtained from the DA (and PETSC_COMM_WORLD), // Create the HDF5 viewer PetscViewerHDF5Open (PETSC_COMM_WORLD ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); // Write the H5 file VecView (gauss,H5viewer); // Cleaning stage PetscViewerDestroy (&H5viewer); how can I add data that are just simple 1-D numbers stored on local arrays. Easier said, I would like to add the structured grid coordinates (first all x's, then all y's, and then all z's) at the end (or to the beginning) of each data (*.h5) file. But the grid coordinates are stored locally on each machine and not derived from any parallel vectors or DA. I was thinking about creating vectors and viewers using PETSC_COMM_SELF but i am not sure if that is the right approach since that vector is created on all processors locally. 2- When using VecView() and HDF5 writer, what is the status of data compression? The reason that I am asking is that, I used the same example above and comparing two files saved via two different PetscViewers, i.e. (just) Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. In fact, it is slightly bigger than pure binary file!! Is there any command we have to set in Petsc to tell HDF5 viewer to use data compression? Thanks for your patience, Best, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From agrayver at gfz-potsdam.de Thu Jan 19 03:23:09 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 19 Jan 2012 10:23:09 +0100 Subject: [petsc-users] VecView() and binary output In-Reply-To: References: Message-ID: <4F17E0FD.8070305@gfz-potsdam.de> There is a header in the beginning of the file. The info in the header defines which object type is stored and its dimensions. The size of the header can vary depending on the stored object (Vec,Mat,etc) and type you use for indices (int32,int64). For Vec object and int32 you have one integer defining the code of the Vec object and another vector size. That sums up to 8 extra bytes. On 19.01.2012 09:20, Mohamad M. Nasr-Azadani wrote: > Hi guys, > > I use VecView() in conjunction with > > > PetscViewerBinaryOpen() > > > to write the binary data associated with a 3D data obtained from DA > structured arrays. > > > I noticed that on my machine, the size of the file is 8 bytes bigger > than the size of the data. > > > Could someone tell me where that extra 8 byte is stored and how it > is going help? > > > > best > > > Mohamad > > * > * -- Regards, Alexander -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Thu Jan 19 06:59:03 2012 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Thu, 19 Jan 2012 13:59:03 +0100 (CET) Subject: [petsc-users] [KSP] true norm residual Message-ID: <8b1650b201050b5f06358b1c54aaebde.squirrel@arcon.univ-st-etienne.fr> Hi, When I solve a linear system and monitor the true norm residual I got somethig like this (rtol is set to 1e-6) : Residual norms for stokes_ solve. 0 KSP preconditioned resid norm 1.270103385648e+00 true resid norm 2.319288792500e+06 ||r(i)||/||b|| 1.667494921925e-01 1 KSP preconditioned resid norm 5.669390085343e-01 true resid norm 1.031308013796e+06 ||r(i)||/||b|| 7.414776812212e-02 2 KSP preconditioned resid norm 2.347505837595e-01 true resid norm 5.772588330636e+05 ||r(i)||/||b|| 4.150307524801e-02 3 KSP preconditioned resid norm 1.374434713779e-01 true resid norm 2.045693242518e+05 ||r(i)||/||b|| 1.470788417874e-02 4 KSP preconditioned resid norm 4.177447095627e-02 true resid norm 1.127811207483e+05 ||r(i)||/||b|| 8.108604100745e-03 5 KSP preconditioned resid norm 1.979676092561e-02 true resid norm 1.175236295643e+05 ||r(i)||/||b|| 8.449575410292e-03 6 KSP preconditioned resid norm 7.297213488366e-03 true resid norm 1.135209489825e+05 ||r(i)||/||b|| 8.161795399204e-03 7 KSP preconditioned resid norm 3.043016937820e-03 true resid norm 1.156795165002e+05 ||r(i)||/||b|| 8.316989542594e-03 8 KSP preconditioned resid norm 1.655126721698e-03 true resid norm 1.141320219921e+05 ||r(i)||/||b|| 8.205729606261e-03 9 KSP preconditioned resid norm 4.928188158108e-04 true resid norm 1.143795580962e+05 ||r(i)||/||b|| 8.223526665338e-03 10 KSP preconditioned resid norm 2.038487964289e-04 true resid norm 1.141943101596e+05 ||r(i)||/||b|| 8.210207927522e-03 11 KSP preconditioned resid norm 1.116161343513e-04 true resid norm 1.142236376429e+05 ||r(i)||/||b|| 8.212316480350e-03 Linear solve converged due to CONVERGED_RTOL iterations 11. 
I'd be thankful if I could get some clarification on a few things: - the unpreconditioned norm doesn't converge but if I visualize my solution it looks ok - the solver says that the solution has converged due to RTOL but it doesn't seem that r(11)/r(0) is less than 1e-6 I use GMRES with MG Preconditioning and GMRES+PCSHELL as smoother. The PCSHELL at each level also implies a pre-solve scaling of the solution and rhs vectors but these are also unscaled in a post-solve step. thanks, Domenico From bsmith at mcs.anl.gov Thu Jan 19 07:39:52 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 19 Jan 2012 07:39:52 -0600 Subject: [petsc-users] [KSP] true norm residual In-Reply-To: <8b1650b201050b5f06358b1c54aaebde.squirrel@arcon.univ-st-etienne.fr> References: <8b1650b201050b5f06358b1c54aaebde.squirrel@arcon.univ-st-etienne.fr> Message-ID: <260C53DD-6781-43F6-B04F-7FE7DB345D52@mcs.anl.gov> > > I use GMRES with MG Preconditioning and GMRES+PCSHELL as smoother. The > PCSHELL at each level also implies a pre-solve scaling of the solution and > rhs vectors but these are also unscaled in a post-solve step. You cannot safely use GMRES inside GMRES so you should always run this case with -ksp_type fgmres You get unpredictable behavior if you use a nonlinear preconditioner inside GMRES. Barry We've debated changing the default KPS to fgmres to prevent this type of error, the problem is that is slight overkill in the case when the preconditioner is linear which is the more common case. On Jan 19, 2012, at 6:59 AM, domenico.borzacchiello at univ-st-etienne.fr wrote: > Hi, > When I solve a linear system and monitor the true norm residual I got > somethig like this (rtol is set to 1e-6) : > > > Residual norms for stokes_ solve. > 0 KSP preconditioned resid norm 1.270103385648e+00 true resid norm > 2.319288792500e+06 ||r(i)||/||b|| 1.667494921925e-01 > 1 KSP preconditioned resid norm 5.669390085343e-01 true resid norm > 1.031308013796e+06 ||r(i)||/||b|| 7.414776812212e-02 > 2 KSP preconditioned resid norm 2.347505837595e-01 true resid norm > 5.772588330636e+05 ||r(i)||/||b|| 4.150307524801e-02 > 3 KSP preconditioned resid norm 1.374434713779e-01 true resid norm > 2.045693242518e+05 ||r(i)||/||b|| 1.470788417874e-02 > 4 KSP preconditioned resid norm 4.177447095627e-02 true resid norm > 1.127811207483e+05 ||r(i)||/||b|| 8.108604100745e-03 > 5 KSP preconditioned resid norm 1.979676092561e-02 true resid norm > 1.175236295643e+05 ||r(i)||/||b|| 8.449575410292e-03 > 6 KSP preconditioned resid norm 7.297213488366e-03 true resid norm > 1.135209489825e+05 ||r(i)||/||b|| 8.161795399204e-03 > 7 KSP preconditioned resid norm 3.043016937820e-03 true resid norm > 1.156795165002e+05 ||r(i)||/||b|| 8.316989542594e-03 > 8 KSP preconditioned resid norm 1.655126721698e-03 true resid norm > 1.141320219921e+05 ||r(i)||/||b|| 8.205729606261e-03 > 9 KSP preconditioned resid norm 4.928188158108e-04 true resid norm > 1.143795580962e+05 ||r(i)||/||b|| 8.223526665338e-03 > 10 KSP preconditioned resid norm 2.038487964289e-04 true resid norm > 1.141943101596e+05 ||r(i)||/||b|| 8.210207927522e-03 > 11 KSP preconditioned resid norm 1.116161343513e-04 true resid norm > 1.142236376429e+05 ||r(i)||/||b|| 8.212316480350e-03 > > Linear solve converged due to CONVERGED_RTOL iterations 11. 
> > > I'd be thankful if I could get some clarification on a few things: > > - the unpreconditioned norm doesn't converge but if I visualize my > solution it looks ok > > - the solver says that the solution has converged due to RTOL but it doesn't > seem that r(11)/r(0) is less than 1e-6 > > I use GMRES with MG Preconditioning and GMRES+PCSHELL as smoother. The > PCSHELL at each level also implies a pre-solve scaling of the solution and > rhs vectors but these are also unscaled in a post-solve step. > > > thanks, > Domenico > > From agrayver at gfz-potsdam.de Thu Jan 19 08:30:39 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 19 Jan 2012 15:30:39 +0100 Subject: [petsc-users] MKL BLAS interface inconsistency Message-ID: <4F18290F.3080301@gfz-potsdam.de> Dear petsc team, I've been struggling with this problem for a long time and seeking for your advice now. Let's take simple petsc program: #include int main(int argc,char **args) { Vec b; PetscReal norm; PetscInitialize(&argc,&args,(char *)0,NULL); VecCreateSeq(PETSC_COMM_SELF,100,&b); VecSet(b,2.0); VecNorm(b,NORM_1_AND_2,&norm); return 0; } This program works well if I compile petsc with non-mkl blas/lapack. However, if I compile petsc with mkl blas/lapack this program crashes: zdotc, FP=7fff8f121e30 VecNorm_Seq, FP=7fff8f121f90 VecNorm_Seq, FP=7fff8f1220f0 VecNorm, FP=7fff8f122230 main, FP=7fff8f122290 It crashes in zdotc. You call this routine as following: *z = BLASdot_(&bn,xx,&one,xx,&one); When I look at the zdotc interface in mkl.h I see: void ZDOTC(MKL_Complex16 *pres, const MKL_INT *n, const MKL_Complex16 *x, const MKL_INT *incx, const MKL_Complex16 *y, const MKL_INT *incy); I also found example here: http://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-blas-cblas-and-lapack-compilinglinking-functions-fortran-and-cc-calls/ There are 6 input parameters. But in "classical" BLAS implementation 5 input and 1 output. For example, NORM_1 works well since interface to dzasum is the same. Any ideas? -- Regards, Alexander From jedbrown at mcs.anl.gov Thu Jan 19 08:57:24 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 19 Jan 2012 08:57:24 -0600 Subject: [petsc-users] access to matnest block (0,1) ? In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 02:42, Klaij, Christiaan wrote: > > > I have two DMs which I add to a DMComposite and then use MATNEST when > > > getting the corresponding matrix. This gives me block (0,0) and block > > > (1,1). How do I set/get blocks (0,1) and (1,0)? Looking at ex28 I tried > > > MatGetLocalSubMatrix but it gives a null arg... > > > > > > > So the problem is that we have no way of knowing what preallocation > > (nonzero pattern) _should_ go in the off-diagonal part. Unfortunately, > the > > current preallocation mechanism (DMCompositeSetCoupling()) is a difficult > > thing to implement and the mechanism does not directly apply to MatNest. > If > > you have ideas for a good preallocation API, I would like to hear it. I > > need to get back to the preallocation issue because it's an obvious wart > in > > the multiphysics support (as long as we don't have fast dynamic > > preallocation, which is a somewhat viable alternative). What I would like > > is for the user to call MatGetLocalSubMatrix() for any blocks that they > > want allocated and set preallocation in terms of the local ordering. 
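In code, the MatGetLocalSubMatrix() pattern Jed describes looks roughly like this, loosely following src/snes/examples/tutorials/ex28.c in petsc-dev. It is a sketch only: pack is an existing DMComposite (creation not shown), the row/column indices and the value are placeholders, and, as noted above, with MATNEST the (0,1) block has to exist already for the call to return a usable submatrix.

    DM   pack;            /* existing DMComposite (creation not shown)         */
    Mat  J,J01;           /* full Jacobian and a local view of its (0,1) block */
    IS   *is;             /* local index sets, one per field in the composite  */
    PetscInt    row = 0, col = 0;         /* placeholder local indices         */
    PetscScalar v   = 1.0;                /* placeholder value                 */
    PetscErrorCode ierr;

    ierr = DMGetMatrix(pack,MATNEST,&J);CHKERRQ(ierr);
    ierr = DMCompositeGetLocalISs(pack,&is);CHKERRQ(ierr);
    ierr = MatGetLocalSubMatrix(J,is[0],is[1],&J01);CHKERRQ(ierr);
    /* assemble the coupling block in the split local ordering */
    ierr = MatSetValuesLocal(J01,1,&row,1,&col,&v,ADD_VALUES);CHKERRQ(ierr);
    ierr = MatRestoreLocalSubMatrix(J,is[0],is[1],&J01);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);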
> > > > The current (unfortunate) solution for MatNest with off-diagonal parts is > > to create the submatrices after DMGetMatrix(), preallocate as you like, > and > > copy the ISLocalToGlobalMappings over. > > I see the problem, no ideas for a good general preallocation > mechanism, sorry. Preallocation would depend on how the user > discretizes the cross terms, wouldn't it? So how can we expect > PETSc to "deduce" it from the diagonal blocks? As a user I would > be happy to preallocate myself. > > Out of curiosity, how does it happen in ex28.c: I only see two > matrices being defined (line 347 and 355), I don't see > DMCompositeSetCoupling() at all, yet on line 271 block (0,1) is > available... > It only assembles the block diagonal when you use MatNest. if (!Buk) PetscFunctionReturn(0); /* Not assembling this block */ It builds the whole matrix when you use AIJ, but preallocation isn't correct. > > What about a less general (but important) case: saddle point > problems arising from incompressible Stokes, Oseen and > Navier-Stokes eqs. with Schur type preconditioning. In 2D with N > cells and co-located variables arranged > as (u1,...,uN,v1,...,vN,p1,...pN) the matrix would have the form > [Q G, D 0] with Q a 2N-by-2N matrix, G a 2N-by-N matrix and D a > N-by-2N matrix. Since the variables are co-located, they share > the same partitioning but could have different stencils. How to use > the "split local space", DMComposite and MATNEST in this case? > If you order this way, then you don't need DMComposite or MatNest (although you can still make a MatNest that operates in this ordering, we just don't have a way to make it automatically). -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 19 08:59:39 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 08:59:39 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani wrote: > Hi guys, > > I have compiled petsc to use HDF5 package. > > I like to store the data from a parallel vector(s) (obtained from > structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). > > I followed the example here > http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html > and everything looks fine. > > However, I had a couple questions: > > 1- When I am done writing the parallel vector obtained from the DA (and > PETSC_COMM_WORLD), > > // Create the HDF5 viewer > PetscViewerHDF5Open > (PETSC_COMM_WORLD > ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); > // Write the H5 file > VecView > (gauss,H5viewer); > // Cleaning stage > PetscViewerDestroy > (&H5viewer); > > how can I add data that are just simple 1-D numbers stored on local > arrays. > Easier said, I would like to add the structured grid coordinates (first > all x's, then all y's, and then all z's) at the end (or to the beginning) > of each data (*.h5) file. But the grid coordinates are stored locally on > each machine and not derived from any parallel vectors or DA. I was > thinking about creating vectors and viewers using PETSC_COMM_SELF but i am > not sure if that is the right approach since that vector is created on all > processors locally. > Use the DA coordinate mechanism and you can get the coordinates as a parallel Vec. > 2- When using VecView() and HDF5 writer, what is the status of data > compression? 
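A minimal sketch of the DA coordinate mechanism Matt points to above, assuming an existing 3-D DMDA called da and the H5viewer from the example in this thread; uniform coordinates are used only to keep the sketch short (the coordinate vector can also be filled by hand), and the DMDA* function names are those of petsc-3.2/petsc-dev of this period.

    Vec coords;
    PetscErrorCode ierr;
    ierr = DMDASetUniformCoordinates(da,0.0,1.0,0.0,1.0,0.0,1.0);CHKERRQ(ierr);
    ierr = DMDAGetCoordinates(da,&coords);CHKERRQ(ierr);   /* parallel Vec, 3 dof per node */
    ierr = PetscObjectSetName((PetscObject)coords,"coordinates");CHKERRQ(ierr);
    ierr = VecView(coords,H5viewer);CHKERRQ(ierr);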
> The reason that I am asking is that, I used the same example above and > comparing two files saved via two different PetscViewers, i.e. (just) > Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. > In fact, it is slightly bigger than pure binary file!! > Is there any command we have to set in Petsc to tell HDF5 viewer to use > data compression? > We do not support it. We are happy to take patches that enable this. Thanks, Matt > Thanks for your patience, > Best, > Mohamad > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Thu Jan 19 10:55:21 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Thu, 19 Jan 2012 17:55:21 +0100 Subject: [petsc-users] MKL BLAS interface inconsistency In-Reply-To: <4F18290F.3080301@gfz-potsdam.de> References: <4F18290F.3080301@gfz-potsdam.de> Message-ID: <4F184AF9.2000608@gfz-potsdam.de> This (dirty) patch solves the problem: diff -r a3e9ca59ab58 src/vec/vec/impls/seq/bvec2.c --- a/src/vec/vec/impls/seq/bvec2.c Tue Jan 17 22:04:05 2012 -0600 +++ b/src/vec/vec/impls/seq/bvec2.c Thu Jan 19 17:28:39 2012 +0100 @@ -232,12 +232,13 @@ PetscErrorCode ierr; PetscInt n = xin->map->n; PetscBLASInt one = 1, bn = PetscBLASIntCast(n); + PetscScalar cnorm; PetscFunctionBegin; if (type == NORM_2 || type == NORM_FROBENIUS) { ierr = VecGetArrayRead(xin,&xx);CHKERRQ(ierr); - *z = BLASdot_(&bn,xx,&one,xx,&one); - *z = PetscSqrtReal(*z); + zdotc(&cnorm,&bn,xx,&one,xx,&one); + *z = PetscSqrtReal(PetscAbsScalar(cnorm)); ierr = VecRestoreArrayRead(xin,&xx);CHKERRQ(ierr); ierr = PetscLogFlops(PetscMax(2.0*n-1,0.0));CHKERRQ(ierr); } else if (type == NORM_INFINITY) { The same is applied to mpi vector implementation from /petsc-dev/src/vec/vec/impls/mpi/pvec2.c Of course it works only if one uses Intel MKL BLAS/LAPACK and double complex arithmetics. Which is my case. On 19.01.2012 15:30, Alexander Grayver wrote: > Dear petsc team, > > I've been struggling with this problem for a long time and seeking for > your advice now. > Let's take simple petsc program: > > #include > int main(int argc,char **args) > { > Vec b; > PetscReal norm; > > PetscInitialize(&argc,&args,(char *)0,NULL); > > VecCreateSeq(PETSC_COMM_SELF,100,&b); > VecSet(b,2.0); > VecNorm(b,NORM_1_AND_2,&norm); > > return 0; > } > > This program works well if I compile petsc with non-mkl blas/lapack. > However, if I compile petsc with mkl blas/lapack this program crashes: > > zdotc, FP=7fff8f121e30 > VecNorm_Seq, FP=7fff8f121f90 > VecNorm_Seq, FP=7fff8f1220f0 > VecNorm, FP=7fff8f122230 > main, FP=7fff8f122290 > > It crashes in zdotc. You call this routine as following: > > *z = BLASdot_(&bn,xx,&one,xx,&one); > > When I look at the zdotc interface in mkl.h I see: > void ZDOTC(MKL_Complex16 *pres, const MKL_INT *n, const > MKL_Complex16 *x, const MKL_INT *incx, const MKL_Complex16 *y, const > MKL_INT *incy); > > I also found example here: > http://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-blas-cblas-and-lapack-compilinglinking-functions-fortran-and-cc-calls/ > There are 6 input parameters. But in "classical" BLAS implementation 5 > input and 1 output. > > For example, NORM_1 works well since interface to dzasum is the same. > > Any ideas? 
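The crash above comes down to two incompatible calling conventions for the same BLAS routine. Roughly, with the complex types simplified to C99 complex and the exact symbol name depending on the Fortran name mangling in use:

    #include <complex.h>

    /* 1) Function-return form that PETSc's BLASdot_ wrapper assumes: */
    double complex zdotc_(const int *n,const double complex *x,const int *incx,
                          const double complex *y,const int *incy);

    /* 2) Pointer-result form declared in MKL's mkl.h (what the patch above calls
          explicitly): the result comes back through the first argument. */
    void zdotc(double complex *pres,const int *n,const double complex *x,const int *incx,
               const double complex *y,const int *incy);

    /* A portable middle ground, where a CBLAS interface is available: */
    void cblas_zdotc_sub(const int n,const void *x,const int incx,
                         const void *y,const int incy,void *dotc);

Calling a library built for the second convention through a prototype of the first kind misinterprets the argument list, which is the sort of failure the backtrace above shows.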
> -- Regards, Alexander From mark.adams at columbia.edu Wed Jan 18 16:00:04 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Wed, 18 Jan 2012 14:00:04 -0800 Subject: [petsc-users] [petsc-dev] boomerAmg scalability In-Reply-To: <006f01ccd61c$47c0fc80$d742f580$@com> References: <9E6B0CE58F7CC24294FAB8EF9902362A75B89942E7@EXCHMBB.ornl.gov> <9E6B0CE58F7CC24294FAB8EF9902362A75B89B4875@EXCHMBB.ornl.gov> <9110062A-05FF-4096-8DC2-CDEC14E5F8CE@mcs.anl.gov> <0B0FC28A-D635-4803-9B38-4954DCE3974B@ornl.gov> <003701ccbb45$f2bb23f0$d8316bd0$@com> <007801cccb04$5751ad20$05f50760$@com> <349C5EE1-314F-4E03-AF6C-E20D4D3DBDCF@columbia.edu> <001a01cccbff$8aca1420$a05e3c60$@com> <0! ! ! 00601cccf08$8aa7c300 $9ff74900$@com> <9E0AFCB4-D283-4329-8B7F-3DB1C4116197@columbia.edu> <003501ccd08e$2bda4500$838ecf00$@com> <30973A31-D618-46F8-9831-8F87CC3C43D0@columbia.edu> <004201ccd181$840a5dc0$8c1f1940$@com> <4CFD9328-CC1A-4984-9BBA-FF0CA8D7CB64@columbia.edu> <2548B229-DB77-4397-A677-146F2F0E3C5C@columbia.edu> <004101ccd217$09c79fa0$1d56dee0$@com> <949849D4-6822-4ED6-8373-F84F6C3209F0@columbia.edu> <00be01ccd24b$b78dde90$26a99bb0$@com> <56683DBA-A733-45FC-AB5B-E18368442FC6@columbia.edu> <2D67D12E-86C8-4A7E-BD5A-5B955004274C@columbia.edu> <006f01ccd61c$47c0fc80$d742f580$@com> Message-ID: 15-20 times more iterations is huge. There are a few things to try. GAMG can get confused when it does an eigen solve by scaling of BC equations. -ksp_diagonal_scale should fix this. If the differences are still huge then I would have to take a look at the matrix. The default solver type is a simpler less optimal method. I debate what to make the default but this is another method: -pc_gamg_type sa This should not make a huge difference (2-3x at most). Also, if you run with -pc_gamg_verbose, this is small and useful. Mark On Jan 18, 2012, at 12:03 PM, Ravi Kannan wrote: > Hi Mark, Hong, > > As you might remember, the reason for this whole exercise was to obtain a solution for a very stiff problem. > > We did have Hypre Boomer amg. This did not scale, but gives correct solution. So we wanted an alternative; hence we approached you for gamg. > > However for certain cases, gamg crashes. Even for the working cases, it takes about 15-20 times more sweeps than the boomer-hypre. Hence it is cost-prohibitive. > > Hopefully this gamg solver can be improved in the near future, for users like us. > > Warm Regards, > Ravi. > > > From: Mark F. Adams [mailto:mark.adams at columbia.edu] > Sent: Wednesday, January 18, 2012 9:56 AM > To: Hong Zhang > Cc: rxk at cfdrc.com > Subject: Re: [petsc-dev] boomerAmg scalability > > Hong and Ravi, > > I fixed a bug with the 6x6 problem. There seemed to be a bug in MatTranposeMat with funny decomposition, that was not really verified. So we can wait for Ravi to continue with his tests a fix them as they arise. > > Mark > ps, Ravi, I may not have cc'ed so I will send again. > > On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote: > > > Ravi, > I wrote a simple test ex163.c (attached) on MatTransposeMatMult(). > Loading your 6x6 matrix gives no error from MatTransposeMatMult() > using 1,2,...7 processes. 
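In code, the operation Hong's ex163 exercises is essentially the following (a sketch; A is assumed to be an assembled MPIAIJ matrix loaded from the binary file with MatLoad):

    Mat A,C;
    PetscErrorCode ierr;
    /* ... A created as MPIAIJ and filled via MatLoad() from the -f file ... */
    ierr = MatTransposeMatMult(A,A,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&C);CHKERRQ(ierr); /* C = A^T * A */
    ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    ierr = MatDestroy(&C);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);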
> For example, > > petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f /Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput > A: > Matrix Object: 1 MPI processes > type: mpiaij > row 0: (0, 1.66668e+06) (1, -1.35) (3, -0.6) > row 1: (0, -1.35) (1, 1.66667e+06) (2, -1.35) (4, -0.6) > row 2: (1, -1.35) (2, 1.66667e+06) (5, -0.6) > row 3: (0, -0.6) (3, 1.66668e+06) (4, -1.35) > row 4: (1, -0.6) (3, -1.35) (4, 1.66667e+06) (5, -1.35) > row 5: (2, -0.6) (4, -1.35) (5, 1.66667e+06) > > C = A^T * A: > Matrix Object: 1 MPI processes > type: mpiaij > row 0: (0, 2.77781e+12) (1, -4.50002e+06) (2, 1.8225) (3, -2.00001e+06) (4, 1.62) > row 1: (0, -4.50002e+06) (1, 2.77779e+12) (2, -4.50001e+06) (3, 1.62) (4, -2.00001e+06) (5, 1.62) > row 2: (0, 1.8225) (1, -4.50001e+06) (2, 2.7778e+12) (4, 1.62) (5, -2.00001e+06) > row 3: (0, -2.00001e+06) (1, 1.62) (3, 2.77781e+12) (4, -4.50002e+06) (5, 1.8225) > row 4: (0, 1.62) (1, -2.00001e+06) (2, 1.62) (3, -4.50002e+06) (4, 2.77779e+12) (5, -4.50001e+06) > row 5: (1, 1.62) (2, -2.00001e+06) (3, 1.8225) (4, -4.50001e+06) (5, 2.7778e+12) > > Do I miss something? > > Hong > > On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams wrote: > Ravi, this system is highly diagonally dominate. I've fixed the code so you can pull and try again. > > I've decided to basically just do a one level method with DD systems. I don't know if that is the best semantics, I think Barry will hate it, because it gives you a one level solver when you asked for MG. It now picks up the coarse grid solver as the solver, which is wrong, so I need to fix this if we decide to stick with the current semantics. > > And again thanks for helping to pound on this code. > > Mark > > On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote: > > Hi Mark, Hong, > > Lets make it simpler. I fixed my partitiotion bug (in metis). Now there is a equidivision of cells. > > To simplify even further, lets run a much smaller case : with 6 cells (equations) in SERIAL. This one crashes. The out and the ksp_view_binary files are attached. > > Thanks, > RAvi. > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams > Sent: Friday, January 13, 2012 3:00 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > Well, we do have a bug here. It should work with zero elements on a proc, but the code is being actively developed so you are really helping us to find these cracks. > > If its not too hard it would be nice if you could give use these matrices, before you fix it, so we can fix this bug. You can just send it to Hong and I (cc'ed). > > Mark > > On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote: > > > Hi Mark,Hong > > Thanks for the observation w.r.t the proc 0 having 2 equations. This is a bug from our end. We will fix it and get back to you if needed. > > Thanks, > Ravi. > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams > Sent: Thursday, January 12, 2012 10:03 PM > To: Hong Zhang > Cc: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > Ravi, can you run with -ksp_view_binary? This will produce two files. > > Hong, ex10 will read in these files and solve them. I will probably not be able to get to this until Monday. > > Also, this matrix has just two equations on proc 0 and and about 11000 on proc 1 so its is strangely balanced, in case that helps ... 
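Collected in one place, the options Mark suggests earlier in this thread would be tried along the lines of the run below; the executable name and process count are placeholders, the extra monitor flags are only for diagnostics, and the GAMG option names are as spelled in Mark's message for the petsc-dev of this period.

    mpiexec -n 8 ./myapp -pc_type gamg -pc_gamg_type sa -ksp_diagonal_scale \
        -pc_gamg_verbose -ksp_monitor_true_residual -ksp_converged_reason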
> > Mark > > On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote: > > > > Ravi, > > I need more info for debugging. Can you provide a simple stand alone code and matrices in petsc > binary format that reproduce the error? > > MatTransposeMatMult() for mpiaij is a newly developed subroutine - less than one month old > and not well tested yet :-( > I used petsc-dev/src/mat/examples/tests/ex94.c for testing. > > Thanks, > > Hong > > On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams wrote: > It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is working on it. > > I'm hoping that your output will be enough for Hong to figure this out but I could not reproduce this problem with any of my tests. > > If Hong can not figure this out then we will need to get the matrix from you to reproduce this. > > Mark > > > On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote: > > > > Hi Mark, > > Any luck with the gamg bug fix? > > Thanks, > Ravi. > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams > Sent: Wednesday, January 11, 2012 1:54 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > This seems to be dying earlier than it was last week, so it looks like a new bug in MatTransposeMatMult. > > Mark > > On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote: > > > On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan wrote: > Hi Mark, > > I downloaded the dev version again. This time, the program crashes even earlier. Attached is the serial and parallel info outputs. > > Could you kindly take a look. > > It looks like this is a problem with MatMatMult(). Can you try to reproduce this using KSP ex10? You put > your matrix in binary format and use -pc_type gamg. Then you can send us the matrix and we can track > it down. Or are you running an example there? > > Thanks, > > Matt > > > > Thanks, > Ravi. > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams > Sent: Monday, January 09, 2012 3:08 PM > > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > > Yes its all checked it, just pull from dev. > Mark > > On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote: > > > Hi Mark, > > Thanks for your efforts. > > Do I need to do the install from scratch once again? Or some particular files (check out gamg.c for instance)? > > Thanks, > Ravi. > > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams > Sent: Friday, January 06, 2012 10:30 AM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > I think I found the problem. You will need to use petsc-dev to get the fix. > > Mark > > On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote: > > > Ravi, I forgot but you can just use -ksp_view_binary to output the matrix data (two files). You could run it with two procs and a Jacobi solver to get it past the solve, where it writes the matrix (I believe). > Mark > > On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote: > > > Just send in another email with the attachment. 
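The reproduction path suggested above, spelled out as a sketch: the application name is a placeholder, -ksp_view_binary writes the operator and right-hand side to a file named binaryoutput by default (the name that appears later in this thread), and the exact file option accepted by ex10 (-f or -f0) can be checked with -help for the PETSc version in use.

    # dump the operators from the failing configuration (Jacobi only to get past the solve)
    mpiexec -n 2 ./myapp -ksp_view_binary -pc_type jacobi
    # load them into src/ksp/ksp/examples/tutorials/ex10 and apply GAMG there
    mpiexec -n 2 ./ex10 -f0 binaryoutput -pc_type gamg -ksp_monitor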
> > From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Jed Brown > Sent: Thursday, January 05, 2012 5:15 PM > To: For users of the development version of PETSc > Subject: Re: [petsc-dev] boomerAmg scalability > > On Thu, Jan 5, 2012 at 17:12, Ravi Kannan wrote: > I have attached the verbose+info outputs for both the serial and the parallel (2 partitions). NOTE: the serial output at some location says PC=Jacobi! Is it implicitly converting the PC to a Jacobi? > > Looks like you forgot the attachment. > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 17:19:00 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 15:19:00 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: Thanks Mat, Use the DA coordinate mechanism and you can get the coordinates as a parallel Vec. well, that won't be working for me since although I use one DA and the parallel vectors derived from same DA, yet I am using staggered grid formulation. So, there the coordinates could be different for different vectors. Is there any other way around this ? On Thu, Jan 19, 2012 at 6:59 AM, Matthew Knepley wrote: > On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani > wrote: > >> Hi guys, >> >> I have compiled petsc to use HDF5 package. >> >> I like to store the data from a parallel vector(s) (obtained from >> structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). >> >> I followed the example here >> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html >> and everything looks fine. >> >> However, I had a couple questions: >> >> 1- When I am done writing the parallel vector obtained from the DA (and >> PETSC_COMM_WORLD), >> >> // Create the HDF5 viewer >> PetscViewerHDF5Open >> (PETSC_COMM_WORLD >> ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); >> // Write the H5 file >> VecView >> (gauss,H5viewer); >> // Cleaning stage >> PetscViewerDestroy >> (&H5viewer); >> >> how can I add data that are just simple 1-D numbers stored on local >> arrays. >> Easier said, I would like to add the structured grid coordinates (first >> all x's, then all y's, and then all z's) at the end (or to the beginning) >> of each data (*.h5) file. But the grid coordinates are stored locally on >> each machine and not derived from any parallel vectors or DA. I was >> thinking about creating vectors and viewers using PETSC_COMM_SELF but i am >> not sure if that is the right approach since that vector is created on all >> processors locally. >> > > Use the DA coordinate mechanism and you can get the coordinates as a > parallel Vec. > > >> 2- When using VecView() and HDF5 writer, what is the status of data >> compression? >> The reason that I am asking is that, I used the same example above and >> comparing two files saved via two different PetscViewers, i.e. (just) >> Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. >> In fact, it is slightly bigger than pure binary file!! >> Is there any command we have to set in Petsc to tell HDF5 viewer to use >> data compression? >> > > We do not support it. We are happy to take patches that enable this. 
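For anyone tempted to contribute such a patch, the hook in plain HDF5 is the dataset-creation property list; a rough sketch of what the viewer would have to do per dataset is shown below. These are raw HDF5 1.8 calls, not an existing PETSc interface, and the chunk size and compression level are arbitrary.

    hid_t   dcpl     = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk[1] = {4096};        /* compression requires a chunked layout */
    H5Pset_chunk(dcpl, 1, chunk);
    H5Pset_deflate(dcpl, 6);          /* gzip, compression level 6 */
    /* pass dcpl as the dataset-creation property list argument of H5Dcreate2() */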
> > Thanks, > > Matt > > >> Thanks for your patience, >> Best, >> Mohamad >> >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 17:19:49 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 15:19:49 -0800 Subject: [petsc-users] VecView() and binary output In-Reply-To: <4F17E0FD.8070305@gfz-potsdam.de> References: <4F17E0FD.8070305@gfz-potsdam.de> Message-ID: Thanks Alexander. That totally makes sense now. Best, Mohamad On Thu, Jan 19, 2012 at 1:23 AM, Alexander Grayver wrote: > ** > There is a header in the beginning of the file. The info in the header > defines which object type is stored and its dimensions. > The size of the header can vary depending on the stored object > (Vec,Mat,etc) and type you use for indices (int32,int64). > > For Vec object and int32 you have one integer defining the code of the Vec > object and another vector size. That sums up to 8 extra bytes. > > > On 19.01.2012 09:20, Mohamad M. Nasr-Azadani wrote: > > Hi guys, > > I use VecView() in conjunction with PetscViewerBinaryOpen() > to write the binary data associated with a 3D data obtained from DA > structured arrays. > I noticed that on my machine, the size of the file is 8 bytes bigger > than the size of the data. > Could someone tell me where that extra 8 byte is stored and how it is > going help? > > best > Mohamad > * > * > > > > -- > Regards, > Alexander > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 19 17:20:11 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 17:20:11 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 5:19 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Mat, > Use the DA coordinate mechanism and you can get the coordinates as a > parallel Vec. > > well, that won't be working for me since although I use one DA and the > parallel vectors derived from same DA, yet I am using staggered grid > formulation. So, there the coordinates could be different for different > vectors. > Is there any other way around this ? > I do not understand what you mean, be more specific. Matt > On Thu, Jan 19, 2012 at 6:59 AM, Matthew Knepley wrote: > >> On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani < >> mmnasr at gmail.com> wrote: >> >>> Hi guys, >>> >>> I have compiled petsc to use HDF5 package. >>> >>> I like to store the data from a parallel vector(s) (obtained from >>> structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). >>> >>> I followed the example here >>> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html >>> and everything looks fine. >>> >>> However, I had a couple questions: >>> >>> 1- When I am done writing the parallel vector obtained from the DA (and >>> PETSC_COMM_WORLD), >>> >>> // Create the HDF5 viewer >>> PetscViewerHDF5Open >>> (PETSC_COMM_WORLD >>> ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); >>> // Write the H5 file >>> VecView >>> (gauss,H5viewer); >>> // Cleaning stage >>> PetscViewerDestroy >>> (&H5viewer); >>> >>> how can I add data that are just simple 1-D numbers stored on local >>> arrays. 
>>> Easier said, I would like to add the structured grid coordinates (first >>> all x's, then all y's, and then all z's) at the end (or to the beginning) >>> of each data (*.h5) file. But the grid coordinates are stored locally on >>> each machine and not derived from any parallel vectors or DA. I was >>> thinking about creating vectors and viewers using PETSC_COMM_SELF but i am >>> not sure if that is the right approach since that vector is created on all >>> processors locally. >>> >> >> Use the DA coordinate mechanism and you can get the coordinates as a >> parallel Vec. >> >> >>> 2- When using VecView() and HDF5 writer, what is the status of data >>> compression? >>> The reason that I am asking is that, I used the same example above and >>> comparing two files saved via two different PetscViewers, i.e. (just) >>> Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. >>> In fact, it is slightly bigger than pure binary file!! >>> Is there any command we have to set in Petsc to tell HDF5 viewer to use >>> data compression? >>> >> >> We do not support it. We are happy to take patches that enable this. >> >> Thanks, >> >> Matt >> >> >>> Thanks for your patience, >>> Best, >>> Mohamad >>> >>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 17:26:35 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 15:26:35 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: Sorry for the confusion. I use only one DA parallel layout for my problem in spite of the fact that I use MAC-staggered grid (I added one extra cell to the end of the domain in each direction so that I overcome the difficulty of one extra cell associated with each velocity in the corresponding direction, i.e. one extra u-grid exists in x-direction or one extra v-grid exists in y-direction). So, the vectors (global and locals) are derived from same DA but they do not refer to the same physical location. I could fix this if I create DA for each velocity and scalar components and set the coordinates for each of them separately, but I would rather not do this at this point. I hope I was clear. Best, Mohamad On Thu, Jan 19, 2012 at 3:20 PM, Matthew Knepley wrote: > On Thu, Jan 19, 2012 at 5:19 PM, Mohamad M. Nasr-Azadani > wrote: > >> Thanks Mat, >> Use the DA coordinate mechanism and you can get the coordinates as a >> parallel Vec. >> >> well, that won't be working for me since although I use one DA and the >> parallel vectors derived from same DA, yet I am using staggered grid >> formulation. So, there the coordinates could be different for different >> vectors. >> Is there any other way around this ? >> > > I do not understand what you mean, be more specific. > > Matt > > >> On Thu, Jan 19, 2012 at 6:59 AM, Matthew Knepley wrote: >> >>> On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani < >>> mmnasr at gmail.com> wrote: >>> >>>> Hi guys, >>>> >>>> I have compiled petsc to use HDF5 package. 
>>>> >>>> I like to store the data from a parallel vector(s) (obtained from >>>> structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). >>>> >>>> I followed the example here >>>> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html >>>> and everything looks fine. >>>> >>>> However, I had a couple questions: >>>> >>>> 1- When I am done writing the parallel vector obtained from the DA (and >>>> PETSC_COMM_WORLD), >>>> >>>> // Create the HDF5 viewer >>>> PetscViewerHDF5Open >>>> (PETSC_COMM_WORLD >>>> ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); >>>> // Write the H5 file >>>> VecView >>>> (gauss,H5viewer); >>>> // Cleaning stage >>>> PetscViewerDestroy >>>> (&H5viewer); >>>> >>>> how can I add data that are just simple 1-D numbers stored on local >>>> arrays. >>>> Easier said, I would like to add the structured grid coordinates >>>> (first all x's, then all y's, and then all z's) at the end (or to the >>>> beginning) of each data (*.h5) file. But the grid coordinates are stored >>>> locally on each machine and not derived from any parallel vectors or DA. I >>>> was thinking about creating vectors and viewers using PETSC_COMM_SELF but i >>>> am not sure if that is the right approach since that vector is created on >>>> all processors locally. >>>> >>> >>> Use the DA coordinate mechanism and you can get the coordinates as a >>> parallel Vec. >>> >>> >>>> 2- When using VecView() and HDF5 writer, what is the status of data >>>> compression? >>>> The reason that I am asking is that, I used the same example above and >>>> comparing two files saved via two different PetscViewers, i.e. (just) >>>> Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. >>>> In fact, it is slightly bigger than pure binary file!! >>>> Is there any command we have to set in Petsc to tell HDF5 viewer to use >>>> data compression? >>>> >>> >>> We do not support it. We are happy to take patches that enable this. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks for your patience, >>>> Best, >>>> Mohamad >>>> >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 19 17:30:22 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 17:30:22 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 5:26 PM, Mohamad M. Nasr-Azadani wrote: > Sorry for the confusion. > I use only one DA parallel layout for my problem in spite of the fact that > I use MAC-staggered grid (I added one extra cell to the end of the domain > in each direction so that I overcome the difficulty of one extra cell > associated with each velocity in the corresponding direction, i.e. one > extra u-grid exists in x-direction or one extra v-grid exists in > y-direction). > So, the vectors (global and locals) are derived from same DA but they do > not refer to the same physical location. 
> I could fix this if I create DA for each velocity and scalar components > and set the coordinates for each of them separately, but I would rather not > do this at this point. > 1) What is wrong with more DAs? 2) Get the coordinate vector for one, copy it, and then shift it to get the others Matt > I hope I was clear. > Best, > Mohamad > > > > > On Thu, Jan 19, 2012 at 3:20 PM, Matthew Knepley wrote: > >> On Thu, Jan 19, 2012 at 5:19 PM, Mohamad M. Nasr-Azadani < >> mmnasr at gmail.com> wrote: >> >>> Thanks Mat, >>> Use the DA coordinate mechanism and you can get the coordinates as a >>> parallel Vec. >>> >>> well, that won't be working for me since although I use one DA and the >>> parallel vectors derived from same DA, yet I am using staggered grid >>> formulation. So, there the coordinates could be different for different >>> vectors. >>> Is there any other way around this ? >>> >> >> I do not understand what you mean, be more specific. >> >> Matt >> >> >>> On Thu, Jan 19, 2012 at 6:59 AM, Matthew Knepley wrote: >>> >>>> On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani < >>>> mmnasr at gmail.com> wrote: >>>> >>>>> Hi guys, >>>>> >>>>> I have compiled petsc to use HDF5 package. >>>>> >>>>> I like to store the data from a parallel vector(s) (obtained from >>>>> structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). >>>>> >>>>> I followed the example here >>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html >>>>> and everything looks fine. >>>>> >>>>> However, I had a couple questions: >>>>> >>>>> 1- When I am done writing the parallel vector obtained from the DA >>>>> (and PETSC_COMM_WORLD), >>>>> >>>>> // Create the HDF5 viewer >>>>> PetscViewerHDF5Open >>>>> (PETSC_COMM_WORLD >>>>> ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); >>>>> // Write the H5 file >>>>> VecView >>>>> (gauss,H5viewer); >>>>> // Cleaning stage >>>>> PetscViewerDestroy >>>>> (&H5viewer); >>>>> >>>>> how can I add data that are just simple 1-D numbers stored on local >>>>> arrays. >>>>> Easier said, I would like to add the structured grid coordinates >>>>> (first all x's, then all y's, and then all z's) at the end (or to the >>>>> beginning) of each data (*.h5) file. But the grid coordinates are stored >>>>> locally on each machine and not derived from any parallel vectors or DA. I >>>>> was thinking about creating vectors and viewers using PETSC_COMM_SELF but i >>>>> am not sure if that is the right approach since that vector is created on >>>>> all processors locally. >>>>> >>>> >>>> Use the DA coordinate mechanism and you can get the coordinates as a >>>> parallel Vec. >>>> >>>> >>>>> 2- When using VecView() and HDF5 writer, what is the status of data >>>>> compression? >>>>> The reason that I am asking is that, I used the same example above and >>>>> comparing two files saved via two different PetscViewers, i.e. (just) >>>>> Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. >>>>> In fact, it is slightly bigger than pure binary file!! >>>>> Is there any command we have to set in Petsc to tell HDF5 viewer to >>>>> use data compression? >>>>> >>>> >>>> We do not support it. We are happy to take patches that enable this. 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks for your patience, >>>>> Best, >>>>> Mohamad >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 19 17:30:48 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 19 Jan 2012 17:30:48 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 17:26, Mohamad M. Nasr-Azadani wrote: > I use only one DA parallel layout for my problem in spite of the fact that > I use MAC-staggered grid (I added one extra cell to the end of the domain > in each direction so that I overcome the difficulty of one extra cell > associated with each velocity in the corresponding direction, i.e. one > extra u-grid exists in x-direction or one extra v-grid exists in > y-direction). > So, the vectors (global and locals) are derived from same DA but they do > not refer to the same physical location. > This is fine, but you will have to manage coordinates yourself. I would create a DMDA with similar layout to the "solution" DMDA to hold the coordinates. You could even just create this thing for viewing. Then put your local coordinates into its global vector and view that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 18:00:19 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 16:00:19 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: 1) What is wrong with more DAs? Nothing. Just I never needed more than one DA. 2) Get the coordinate vector for one, copy it, and then shift it to get the others Well, I played around a bit and I realized that the coordinate vector returned from DMDAGetCoordinates() is actually of size Nx*Ny*Nz*3 (for DA 3D). This will double the size of my output file. What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) including the grid coordinates to the end of the *.h5 file and then later on, I can use any visualization software to load the data using those coordinates. I am using orthogonal grid, that's why I don't need all the (x,y,z) coordinates for each cell. Thanks again, Mohamad On Thu, Jan 19, 2012 at 3:30 PM, Matthew Knepley wrote: > On Thu, Jan 19, 2012 at 5:26 PM, Mohamad M. Nasr-Azadani > wrote: > >> Sorry for the confusion. >> I use only one DA parallel layout for my problem in spite of the fact >> that I use MAC-staggered grid (I added one extra cell to the end of the >> domain in each direction so that I overcome the difficulty of one extra >> cell associated with each velocity in the corresponding direction, i.e. one >> extra u-grid exists in x-direction or one extra v-grid exists in >> y-direction). 
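A sketch of the route Jed outlines above: reuse the coordinate DMDA (same layout as the solution DMDA, three dof per node) as a container for staggered coordinates and view only that. It assumes coordinates have already been attached to da (for example with DMDASetUniformCoordinates), that xu/yv/zw are 1-D staggered coordinate arrays addressable by global grid index (each process appears to hold them in full in this setup), and that H5viewer is the viewer from earlier in the thread; all of these names are illustrative.

    DM          cda;
    Vec         stag;
    DMDACoor3d  ***c;
    PetscInt    i,j,k,xs,ys,zs,xm,ym,zm;
    PetscErrorCode ierr;

    ierr = DMDAGetCoordinateDA(da,&cda);CHKERRQ(ierr);     /* dof=3 DMDA, layout matches da */
    ierr = DMCreateGlobalVector(cda,&stag);CHKERRQ(ierr);
    ierr = DMDAGetCorners(cda,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);
    ierr = DMDAVecGetArray(cda,stag,&c);CHKERRQ(ierr);
    for (k=zs; k<zs+zm; k++) {
      for (j=ys; j<ys+ym; j++) {
        for (i=xs; i<xs+xm; i++) {
          c[k][j][i].x = xu[i];   /* staggered u-point x coordinate */
          c[k][j][i].y = yv[j];   /* staggered v-point y coordinate */
          c[k][j][i].z = zw[k];   /* staggered w-point z coordinate */
        }
      }
    }
    ierr = DMDAVecRestoreArray(cda,stag,&c);CHKERRQ(ierr);
    ierr = PetscObjectSetName((PetscObject)stag,"staggered_coordinates");CHKERRQ(ierr);
    ierr = VecView(stag,H5viewer);CHKERRQ(ierr);
    ierr = VecDestroy(&stag);CHKERRQ(ierr);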
>> So, the vectors (global and locals) are derived from same DA but they do >> not refer to the same physical location. >> I could fix this if I create DA for each velocity and scalar components >> and set the coordinates for each of them separately, but I would rather not >> do this at this point. >> > > 1) What is wrong with more DAs? > > 2) Get the coordinate vector for one, copy it, and then shift it to get > the others > > Matt > > >> I hope I was clear. >> Best, >> Mohamad >> >> >> >> >> On Thu, Jan 19, 2012 at 3:20 PM, Matthew Knepley wrote: >> >>> On Thu, Jan 19, 2012 at 5:19 PM, Mohamad M. Nasr-Azadani < >>> mmnasr at gmail.com> wrote: >>> >>>> Thanks Mat, >>>> Use the DA coordinate mechanism and you can get the coordinates as a >>>> parallel Vec. >>>> >>>> well, that won't be working for me since although I use one DA and the >>>> parallel vectors derived from same DA, yet I am using staggered grid >>>> formulation. So, there the coordinates could be different for different >>>> vectors. >>>> Is there any other way around this ? >>>> >>> >>> I do not understand what you mean, be more specific. >>> >>> Matt >>> >>> >>>> On Thu, Jan 19, 2012 at 6:59 AM, Matthew Knepley wrote: >>>> >>>>> On Thu, Jan 19, 2012 at 2:50 AM, Mohamad M. Nasr-Azadani < >>>>> mmnasr at gmail.com> wrote: >>>>> >>>>>> Hi guys, >>>>>> >>>>>> I have compiled petsc to use HDF5 package. >>>>>> >>>>>> I like to store the data from a parallel vector(s) (obtained from >>>>>> structured DA in 3 dimensions) to file using VecView() in conjunction with PetscViewerHDF5Open(). >>>>>> >>>>>> I followed the example here >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html >>>>>> and everything looks fine. >>>>>> >>>>>> However, I had a couple questions: >>>>>> >>>>>> 1- When I am done writing the parallel vector obtained from the DA >>>>>> (and PETSC_COMM_WORLD), >>>>>> >>>>>> // Create the HDF5 viewer >>>>>> PetscViewerHDF5Open >>>>>> (PETSC_COMM_WORLD >>>>>> ,"gauss.h5",FILE_MODE_WRITE,&H5viewer); >>>>>> // Write the H5 file >>>>>> VecView >>>>>> (gauss,H5viewer); >>>>>> // Cleaning stage >>>>>> PetscViewerDestroy >>>>>> (&H5viewer); >>>>>> >>>>>> how can I add data that are just simple 1-D numbers stored on local >>>>>> arrays. >>>>>> Easier said, I would like to add the structured grid coordinates >>>>>> (first all x's, then all y's, and then all z's) at the end (or to the >>>>>> beginning) of each data (*.h5) file. But the grid coordinates are stored >>>>>> locally on each machine and not derived from any parallel vectors or DA. I >>>>>> was thinking about creating vectors and viewers using PETSC_COMM_SELF but i >>>>>> am not sure if that is the right approach since that vector is created on >>>>>> all processors locally. >>>>>> >>>>> >>>>> Use the DA coordinate mechanism and you can get the coordinates as a >>>>> parallel Vec. >>>>> >>>>> >>>>>> 2- When using VecView() and HDF5 writer, what is the status of data >>>>>> compression? >>>>>> The reason that I am asking is that, I used the same example above >>>>>> and comparing two files saved via two different PetscViewers, i.e. (just) >>>>>> Binary and HDF5 (Binary) the size is not reduced in the (*.h5) case. >>>>>> In fact, it is slightly bigger than pure binary file!! >>>>>> Is there any command we have to set in Petsc to tell HDF5 viewer to >>>>>> use data compression? >>>>>> >>>>> >>>>> We do not support it. We are happy to take patches that enable this. 
>>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks for your patience, >>>>>> Best, >>>>>> Mohamad >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Jan 19 18:02:48 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 19 Jan 2012 18:02:48 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani wrote: > What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) including > the grid coordinates to the end of the *.h5 file and then later on, I can > use any visualization software to load the data using those coordinates. I > am using orthogonal grid, that's why I don't need all the > (x,y,z) coordinates for each cell. If you want this special case, you have to manage it by hand. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 18:08:52 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 16:08:52 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: Thanks Jed. My solution to that was to create a 1D vector local to for instance processor zero that holds all the coordinates. Then after I dumped all the parallel data to the *.h5 file, I create another viewer on processor zero or PETSC_COMM_SELF and dump that new vector including tHe coordinates to the end of the existing file. Do you think that should be possible? Thanks, Mohamad On Thu, Jan 19, 2012 at 4:02 PM, Jed Brown wrote: > On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani wrote: > >> What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) including >> the grid coordinates to the end of the *.h5 file and then later on, I can >> use any visualization software to load the data using those coordinates. I >> am using orthogonal grid, that's why I don't need all the >> (x,y,z) coordinates for each cell. > > > If you want this special case, you have to manage it by hand. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 19 18:33:12 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 18:33:12 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 6:08 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Jed. > My solution to that was to create a 1D vector local to for instance > processor zero that holds all the coordinates. Then after I dumped all the > parallel data to the *.h5 file, I create another viewer on processor zero > or PETSC_COMM_SELF and dump that new vector including tHe coordinates to > the end of the existing file. > Do you think that should be possible? 
> You should just create a parallel Vec to hold the 1D data. DAGetLocalInfo() tells you all the local sizes. Matt > Thanks, > Mohamad > > > > > On Thu, Jan 19, 2012 at 4:02 PM, Jed Brown wrote: > >> On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani wrote: >> >>> What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) including >>> the grid coordinates to the end of the *.h5 file and then later on, I can >>> use any visualization software to load the data using those coordinates. I >>> am using orthogonal grid, that's why I don't need all the >>> (x,y,z) coordinates for each cell. >> >> >> If you want this special case, you have to manage it by hand. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 18:44:10 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 16:44:10 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: You should just create a parallel Vec to hold the 1D data. DAGetLocalInfo() tells you all the local sizes. Thanks Matt. But I am not sure how this would help since PETSC_COMM_WORLD includes all the processors and in all three directions whereas the 1-D data is only the size of Nx points. This vector is not big at all, I could even create is on one processor. The only concern that I have is can I create an *.h5 file using PETSC_COMM_WORLD, dump the data, close the file and then, re-open the file (append mode) with a different viewer created via PETSC_COMM_SELF, and dump the coordinates vector to the end of it? Of course, I only call PETSC_COMM_SELF on processor zero. Thanks, M On Thu, Jan 19, 2012 at 4:33 PM, Matthew Knepley wrote: > On Thu, Jan 19, 2012 at 6:08 PM, Mohamad M. Nasr-Azadani > wrote: > >> Thanks Jed. >> My solution to that was to create a 1D vector local to for instance >> processor zero that holds all the coordinates. Then after I dumped all the >> parallel data to the *.h5 file, I create another viewer on processor zero >> or PETSC_COMM_SELF and dump that new vector including tHe coordinates to >> the end of the existing file. >> Do you think that should be possible? >> > > You should just create a parallel Vec to hold the 1D data. > DAGetLocalInfo() tells you all the local sizes. > > Matt > > >> Thanks, >> Mohamad >> >> >> >> >> On Thu, Jan 19, 2012 at 4:02 PM, Jed Brown wrote: >> >>> On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani >> > wrote: >>> >>>> What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) >>>> including the grid coordinates to the end of the *.h5 file and then later >>>> on, I can use any visualization software to load the data using those >>>> coordinates. I am using orthogonal grid, that's why I don't need all the >>>> (x,y,z) coordinates for each cell. >>> >>> >>> If you want this special case, you have to manage it by hand. >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Jan 19 18:49:59 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 18:49:59 -0600 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 6:44 PM, Mohamad M. Nasr-Azadani wrote: > You should just create a parallel Vec to hold the 1D data. > DAGetLocalInfo() tells you all the local sizes. > > Thanks Matt. But I am not sure how this would help since PETSC_COMM_WORLD > includes all the processors and in all three directions whereas the 1-D > data is only the size of Nx points. > This vector is not big at all, I could even create is on one processor. > The only concern that I have is > can I create an *.h5 file using PETSC_COMM_WORLD, dump the data, close the > file and then, re-open the file (append mode) with a different viewer > created via PETSC_COMM_SELF, and dump the coordinates vector to the end of > it? Of course, I only call PETSC_COMM_SELF on processor zero. > Then give 0 sizes on everything but 0. Matt > Thanks, > M > > > On Thu, Jan 19, 2012 at 4:33 PM, Matthew Knepley wrote: > >> On Thu, Jan 19, 2012 at 6:08 PM, Mohamad M. Nasr-Azadani < >> mmnasr at gmail.com> wrote: >> >>> Thanks Jed. >>> My solution to that was to create a 1D vector local to for instance >>> processor zero that holds all the coordinates. Then after I dumped all the >>> parallel data to the *.h5 file, I create another viewer on processor zero >>> or PETSC_COMM_SELF and dump that new vector including tHe coordinates to >>> the end of the existing file. >>> Do you think that should be possible? >>> >> >> You should just create a parallel Vec to hold the 1D data. >> DAGetLocalInfo() tells you all the local sizes. >> >> Matt >> >> >>> Thanks, >>> Mohamad >>> >>> >>> >>> >>> On Thu, Jan 19, 2012 at 4:02 PM, Jed Brown wrote: >>> >>>> On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani < >>>> mmnasr at gmail.com> wrote: >>>> >>>>> What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) >>>>> including the grid coordinates to the end of the *.h5 file and then later >>>>> on, I can use any visualization software to load the data using those >>>>> coordinates. I am using orthogonal grid, that's why I don't need all the >>>>> (x,y,z) coordinates for each cell. >>>> >>>> >>>> If you want this special case, you have to manage it by hand. >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 18:52:11 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 16:52:11 -0800 Subject: [petsc-users] A few questions about hdf5 viewer In-Reply-To: References: Message-ID: Then give 0 sizes on everything but 0. Thanks Matt. Or can I just do everything all those operation on processor zero? Mohamad On Thu, Jan 19, 2012 at 4:49 PM, Matthew Knepley wrote: > On Thu, Jan 19, 2012 at 6:44 PM, Mohamad M. Nasr-Azadani > wrote: > >> You should just create a parallel Vec to hold the 1D data. >> DAGetLocalInfo() tells you all the local sizes. >> >> Thanks Matt. 
But I am not sure how this would help since PETSC_COMM_WORLD >> includes all the processors and in all three directions whereas the 1-D >> data is only the size of Nx points. >> This vector is not big at all, I could even create is on one processor. >> The only concern that I have is >> can I create an *.h5 file using PETSC_COMM_WORLD, dump the data, close >> the file and then, re-open the file (append mode) with a different viewer >> created via PETSC_COMM_SELF, and dump the coordinates vector to the end of >> it? Of course, I only call PETSC_COMM_SELF on processor zero. >> > > Then give 0 sizes on everything but 0. > > Matt > > >> Thanks, >> M >> >> >> On Thu, Jan 19, 2012 at 4:33 PM, Matthew Knepley wrote: >> >>> On Thu, Jan 19, 2012 at 6:08 PM, Mohamad M. Nasr-Azadani < >>> mmnasr at gmail.com> wrote: >>> >>>> Thanks Jed. >>>> My solution to that was to create a 1D vector local to for instance >>>> processor zero that holds all the coordinates. Then after I dumped all the >>>> parallel data to the *.h5 file, I create another viewer on processor zero >>>> or PETSC_COMM_SELF and dump that new vector including tHe coordinates to >>>> the end of the existing file. >>>> Do you think that should be possible? >>>> >>> >>> You should just create a parallel Vec to hold the 1D data. >>> DAGetLocalInfo() tells you all the local sizes. >>> >>> Matt >>> >>> >>>> Thanks, >>>> Mohamad >>>> >>>> >>>> >>>> >>>> On Thu, Jan 19, 2012 at 4:02 PM, Jed Brown wrote: >>>> >>>>> On Thu, Jan 19, 2012 at 18:00, Mohamad M. Nasr-Azadani < >>>>> mmnasr at gmail.com> wrote: >>>>> >>>>>> What I need is just to add 3 1-D arrays of (x[Nx]+y[Ny]+z[Nz]) >>>>>> including the grid coordinates to the end of the *.h5 file and then later >>>>>> on, I can use any visualization software to load the data using those >>>>>> coordinates. I am using orthogonal grid, that's why I don't need all the >>>>>> (x,y,z) coordinates for each cell. >>>>> >>>>> >>>>> If you want this special case, you have to manage it by hand. >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
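One way to try the reopen-and-append idea floated above, as a sketch only: whether FILE_MODE_APPEND is honoured by the HDF5 viewer, and whether mixing a PETSC_COMM_WORLD file with a later PETSC_COMM_SELF viewer on the same file behaves well, depends on the PETSc and HDF5 versions in use, so this is an assumption to be tested rather than a documented recipe. Nx, coord and rank stand in for the caller's data, and the file name matches the example later in this thread.

    if (!rank) {
      PetscViewer selfviewer;
      Vec         xc;
      PetscScalar *a;
      PetscInt    i;
      PetscErrorCode ierr;
      ierr = PetscViewerHDF5Open(PETSC_COMM_SELF,"temp.h5",FILE_MODE_APPEND,&selfviewer);CHKERRQ(ierr);
      ierr = VecCreateSeq(PETSC_COMM_SELF,Nx,&xc);CHKERRQ(ierr);
      ierr = VecGetArray(xc,&a);CHKERRQ(ierr);
      for (i=0; i<Nx; i++) a[i] = coord[i];   /* 1-D x coordinates held on rank 0 */
      ierr = VecRestoreArray(xc,&a);CHKERRQ(ierr);
      ierr = PetscObjectSetName((PetscObject)xc,"x_coordinates");CHKERRQ(ierr);
      ierr = VecView(xc,selfviewer);CHKERRQ(ierr);
      ierr = VecDestroy(&xc);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&selfviewer);CHKERRQ(ierr);
    }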
URL: From xyuan at lbl.gov Thu Jan 19 19:20:49 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Thu, 19 Jan 2012 17:20:49 -0800 Subject: [petsc-users] The ColPerm option for superlu Message-ID: Hello, I use ksp_view to get the information about the lu pc as follow: PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=262144, cols=262144 package used to perform factorization: superlu total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU run parameters: Equil: NO ColPerm: 3 IterRefine: 0 SymmetricMode: NO DiagPivotThresh: 1 PivotGrowth: NO ConditionNumber: YES RowPerm: 0 ReplaceTinyPivot: NO PrintStat: YES lwork: 0 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=262144, cols=262144 total: nonzeros=25969216, allocated nonzeros=25969216 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 65536 nodes, limit used is 5 In the SuperLU run parameters, how could I change ColPerm:3 to ColPerm has the value of 2? Is there any option that I can use to make such a change through PETSc? Thanks very much! Best regards, Rebecca From jedbrown at mcs.anl.gov Thu Jan 19 19:59:15 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 19 Jan 2012 19:59:15 -0600 Subject: [petsc-users] The ColPerm option for superlu In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 19:20, Xuefei (Rebecca) Yuan wrote: > Hello, > > I use ksp_view to get the information about the lu pc as follow: > > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=262144, cols=262144 > package used to perform factorization: superlu > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU run parameters: > Equil: NO > ColPerm: 3 > IterRefine: 0 > SymmetricMode: NO > DiagPivotThresh: 1 > PivotGrowth: NO > ConditionNumber: YES > RowPerm: 0 > ReplaceTinyPivot: NO > PrintStat: YES > lwork: 0 > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=262144, cols=262144 > total: nonzeros=25969216, allocated nonzeros=25969216 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 65536 nodes, limit used is 5 > > > In the SuperLU run parameters, how could I change ColPerm:3 to ColPerm has > the value of 2? > $ ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu -help |grep colperm -mat_superlu_colperm (choose one of) NATURAL MMD_ATA MMD_AT_PLUS_A COLAMD (None) > > Is there any option that I can use to make such a change through PETSc? > > Thanks very much! > > Best regards, > > Rebecca > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Thu Jan 19 21:54:33 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Thu, 19 Jan 2012 19:54:33 -0800 Subject: [petsc-users] hdf5 viewer and zero sized local vectors Message-ID: Hi, I am trying to use VecView() and hdf5 viewer. I am trying to create a "parallel" vector of size N which has N elements on only processor zero and 0 elements on all other processors. 
This is the code that I have (if it helps): PetscViewer H5viewer; // Create the HDF5 viewer int ierr = PetscViewerHDF5Open(PCW, "temp.h5", FILE_MODE_WRITE, &H5viewer); PETScErrAct(ierr); Vec x; ierr = VecCreate(PCW, &x); PETScErrAct(ierr); int n_local = 0; /* Rank of current processor */ if (params->rank == MASTER) { n_local = N; } ierr = VecSetSizes(x,n_local, N); PETScErrAct(ierr); ierr = VecSetFromOptions(x); PETScErrAct(ierr); double *vcoord; ierr = VecGetArray(x, &vcoord); PETScErrAct(ierr); int i; for (i=0; i From knepley at gmail.com Thu Jan 19 22:19:36 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 19 Jan 2012 22:19:36 -0600 Subject: [petsc-users] hdf5 viewer and zero sized local vectors In-Reply-To: References: Message-ID: On Thu, Jan 19, 2012 at 9:54 PM, Mohamad M. Nasr-Azadani wrote: > Hi, > > I am trying to use VecView() and hdf5 viewer. > I am trying to create a "parallel" vector of size N which has N elements > on only processor zero and 0 elements on all other processors. > > This is the code that I have (if it helps): > > PetscViewer H5viewer; > > // Create the HDF5 viewer > > int ierr = PetscViewerHDF5Open(PCW, "temp.h5", FILE_MODE_WRITE, &H5viewer); PETScErrAct(ierr); > > Vec x; > > ierr = VecCreate(PCW, &x); PETScErrAct(ierr); > > int n_local = 0; > > /* Rank of current processor */ > > if (params->rank == MASTER) { > > n_local = N; > > } > > ierr = VecSetSizes(x,n_local, N); PETScErrAct(ierr); > > ierr = VecSetFromOptions(x); PETScErrAct(ierr); > > double *vcoord; > > ierr = VecGetArray(x, &vcoord); PETScErrAct(ierr); > > int i; > > for (i=0; i > vcoord[i] = coord[i]; > > } > > ierr = VecRestoreArray(x, vcoord); PETScErrAct(ierr); > > ierr = PetscObjectSetName((PetscObject) x, gridname); PETScErrAct(ierr); > > // Write the H5 file > > ierr = VecView(x, H5viewer); PETScErrAct(ierr); > > ierr = VecDestroy(x); PETScErrAct(ierr); > > ierr = PetscViewerDestroy(H5viewer); PETScErrAct(ierr); > > > > When I create a viewer and use VecView() to write this vector to file, > this is the error I get. Apparently, it seems it does not like it that > other processors do not have any elements. > Any ideas how to fix that? > > HDF5-DIAG: Error detected in HDF5 (1.8.4) HDF5-DIAG: Error detected in > HDF5 (1.8.4) HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process > 2MPI-process 3: > #000: H5S.c line 1335 in H5Screate_simple(): zero sized dimension for > non-unlimited dimension > It appears that HDF5 has some limitations. Matt > : > #000: H5S.c line 1335 in H5Screate_simple(): zero sized dimension for > non-unlimited dimension > major: Invalid arguments to routine > minor: Bad value > major: Invalid arguments to routine > minor: Bad value > [2]PETSC ERROR: MPI-process 1--------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Error in external library! > [3]PETSC ERROR: Error in external library! > [2]PETSC ERROR: Cannot H5Screate_simple()! > [3]PETSC ERROR: Cannot H5Screate_simple()! 
> [2]PETSC ERROR: [3]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: > ------------------------------------------------------------------------ > : > #000: H5S.c line 1335 in H5Screate_simple(): zero sized dimension for > non-unlimited dimension > major: Invalid arguments to routine > minor: Bad value > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Error in external library! > [1]PETSC ERROR: Cannot H5Screate_simple()! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 > CDT 2011 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: ./gvg on a linux-gnu named mylaptop by mmnasr Thu Jan 19 > 19:47:49 2012 > [1]PETSC ERROR: Libraries linked from > /home/mmnasr/Softwares/petsc-3.1-p8/linux-gnuPetsc Release Version 3.1.0, > Patch 8, Thu Mar 17 13:37:48 CDT 2011 > -hdf5/lib > [1]PETSC ERROR: Configure run at Tue Sep 6 18:38:57 2011 > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu-hdf5 --with-cc=gcc > --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 > --with-debugging=0 COPTFLAGS=-O3 FOPTFLAGS=-O3 > --download-hypre=/home/mmnasr/Softwares/hypre-2.7.0b.tar.gz > --download-hdf5=1 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: VecView_MPI_HDF5() line 748 in > src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: VecView_MPI() line 840 in src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: VecView() line 710 in src/vec/vec/interface/vector.c > [1]PETSC ERROR: Output_grid_hdf5() line 2358 in "unknowndirectory/"Output.c > [2]PETSC ERROR: See docs/changes/index.html for recent updates. > [3]PETSC ERROR: [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 > [2]PETSC ERROR: See docs/index.html for manual pages. > [2]PETSC ERROR: [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [3]PETSC ERROR: See docs/index.html for manual pages. 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: ./gvg on a linux-gnu named mylaptop by mmnasr Thu Jan 19 > 19:47:49 2012 > [3]PETSC ERROR: Libraries linked from > /home/mmnasr/Softwares/petsc-3.1-p8/linux-gnu-hdf5/lib > [3]PETSC ERROR: Configure run at Tue Sep 6 18:38:57 2011 > [3]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu-hdf5 --with-cc=gcc > --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 > --with-debugging=0 COPTFLAGS=-O3 FOPTFLAGS=-O3 > --download-hypre=/home/mmnasr/Softwares/hypre-2.7.0b.tar.gz > --download-hdf5=1 > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: VecView_MPI_HDF5() line 748 in > src/vec/vec/impls/mpi/pdvec.c > [3]PETSC ERROR: VecView_MPI() line 840 in src/vec/vec/impls/mpi/pdvec.c > [3]PETSC > ------------------------------------------------------------------------ > ERROR: VecView() line 710 in src/vec/vec/interface/vector.c > [3]PETSC ERROR: Output_grid_hdf5() line 2358 in "unknowndirectory/"Output.c > [2]PETSC ERROR: ./gvg on a linux-gnu named mylaptop by mmnasr Thu Jan 19 > 19:47:49 2012 > [2]PETSC ERROR: Libraries linked from > /home/mmnasr/Softwares/petsc-3.1-p8/linux-gnu-hdf5/lib > [2]PETSC ERROR: Configure run at Tue Sep 6 18:38:57 2011 > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-gnu-hdf5 --with-cc=gcc > --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 > --with-debugging=0 COPTFLAGS=-O3 FOPTFLAGS=-O3 > --download-hypre=/home/mmnasr/Softwares/hypre-2.7.0b.tar.gz > --download-hdf5=1 > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: VecView_MPI_HDF5() line 748 in > src/vec/vec/impls/mpi/pdvec.c > [2]PETSC ERROR: VecView_MPI() line 840 in src/vec/vec/impls/mpi/pdvec.c > [2]PETSC ERROR: VecView() line 710 in src/vec/vec/interface/vector.c > [2]PETSC ERROR: Output_grid_hdf5() line 2358 in "unknowndirectory/"Output.c > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.H.Jongsma at utwente.nl Fri Jan 20 05:30:27 2012 From: S.H.Jongsma at utwente.nl (S.H.Jongsma at utwente.nl) Date: Fri, 20 Jan 2012 12:30:27 +0100 Subject: [petsc-users] Transpose MATMPIBAIJ matrix Message-ID: <143110A87BAAFB43BC0E619689E8CF06024FD1D8@ctwex4.ctw.utwente.nl> Hello fellow PETSc users, While using PETSc I came across some conspicuous behavior regarding the use of MatTranspose. When I use this function with a MATSEQBAIJ matrix it performs considerably better than when I use it with a MATMPIBAIJ matrix. I was wondering if anyone of you noticed the same behavior and if this is the case, what did you do about it to circumvent the 'problem'. In case no one came across this problem before, any suggestions that can help solving this performance issue are welcome. To make my problem more clear, I will give some details on my implementation and the test I performed to come to my conclusion. The PETSc version I use is 3.1-p8. I transpose the matrix in place using the following command: MatTranspose(myMatrix, MAT_REUSE_MATRIX, &myMatrix); The matrix I used to test the performance is square and consists of 24576 times 24576 blocks of size 5, which means that the matrix has 122880 rows (and columns, of course). 
The number of non-zero blocks in the matrix is: 291830. Timing the performance gives the following results: MATSEQBAIJ: 0.320 s MATMPIBAIJ: 1474 s (running on 1 processor) MATMPIBAIJ: 376 s (running on 2 processors) As I said, the difference in performance is quite considerable, so any suggestions that can help me solve this issue are greatly appreciated. Thank you in advance, Kind regards, Sietse Jongsma -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernardomartinsrocha at gmail.com Fri Jan 20 08:28:59 2012 From: bernardomartinsrocha at gmail.com (Bernardo Rocha) Date: Fri, 20 Jan 2012 12:28:59 -0200 Subject: [petsc-users] SNESSetUpdate function Message-ID: Hi everyone, I'm using SNES but I would like to update my data after each successful step computation within the nonlinear solver using a different function. I mean, after the solution of each linear system I would like to get \Delta x_k to update x_{k+1} which is stored somewhere in my application class (C++ code). Looking at the documentation I found this function: PetscErrorCode SNESSetUpdate(SNES snes, PetscErrorCode (*func)(SNES, PetscInt)) but func does not take an application context "object" like void *ctx, that I could use to get my data and finally update it. Is there another function that would allow me to do this? Thanks in advance, Best regards, Bernardo M. Rocha -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 08:32:39 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 08:32:39 -0600 Subject: [petsc-users] Transpose MATMPIBAIJ matrix In-Reply-To: <143110A87BAAFB43BC0E619689E8CF06024FD1D8@ctwex4.ctw.utwente.nl> References: <143110A87BAAFB43BC0E619689E8CF06024FD1D8@ctwex4.ctw.utwente.nl> Message-ID: On Fri, Jan 20, 2012 at 05:30, wrote: > Hello fellow PETSc users, > > While using PETSc I came across some conspicuous behavior regarding the > use of MatTranspose. When I use this function with a MATSEQBAIJ matrix it > performs considerably better than when I use it with a MATMPIBAIJ matrix. I > was wondering if anyone of you noticed the same behavior and if this is the > case, what did you do about it to circumvent the 'problem'. In case no one > came across this problem before, any suggestions that can help solving this > performance issue are welcome. > > To make my problem more clear, I will give some details on my > implementation and the test I performed to come to my conclusion. The PETSc > version I use is 3.1-p8. > Just a friendly reminder to upgrade to petsc-3.2. > I transpose the matrix in place using the following command: > > MatTranspose(myMatrix, MAT_REUSE_MATRIX, &myMatrix); > The code is not doing correct preallocation here. Can you try AIJ? It still uses a heuristic, but is more likely to be sufficient. Or, still with MPIBAIJ if you have a square matrix with symmetric nonzero pattern, this out-of-place transpose should perform well. MatDuplicate(myMatrix,MAT_DO_NOT_COPY_VALUES,&myMatrixTranspose); MatTranspose(myMatrix,MAT_REUSE_MATRIX,&myMatrixTranspose); Why do you need a transpose? Sparse transpose is a very expensive operation in parallel (even with correct preallocation) and it should always be avoidable (KSPSolveTranspose(), etc).
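For concreteness, a minimal sketch of the transpose-free route (an illustration only, assuming an assembled square Mat A, a KSP ksp already set up with A, and Vecs x, y, z, b with layouts compatible with A; the variable names are hypothetical):

    /* y = A^T x, computed without ever forming the transpose explicitly */
    ierr = MatMultTranspose(A, x, y); CHKERRQ(ierr);

    /* solve A^T z = b, again without forming A^T; ksp was configured with A */
    ierr = KSPSolveTranspose(ksp, b, z); CHKERRQ(ierr);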
> > The matrix I used to test the performance is square and consists of 24576 > times 24576 blocks of size 5, which means that the matrix has 122880 rows > (and columns, of course). The number of non-zero blocks in the matrix is: > 291830. Timing the performance gives the following results: > > MATSEQBAIJ: 0.320 s > > MATMPIBAIJ: 1474 s (running on 1 processor) > > MATMPIBAIJ: 376 s (running on 2 processors) > > As I said, the difference in performance is quite considerable, so any > suggestions that can help me solve this issue are greatly appreciated. > > Thank you in advance, > > Kind regards, > > Sietse Jongsma > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 08:33:32 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 08:33:32 -0600 Subject: [petsc-users] SNESSetUpdate function In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 8:28 AM, Bernardo Rocha < bernardomartinsrocha at gmail.com> wrote: > Hi everyone, > > I'm using SNES but I would like to update my data after each successful > step computation within the nonlinear solver using a different function. I > mean, after the solution of each linear system I would like to get \Delta > x_k to update x_{k+1} which is stored somewhere in my application class > (C++ code). Looking at the documentation I found this function: > > PetscErrorCode SNESSetUpdate(SNES snes, PetscErrorCode (*func)(SNES, > PetscInt)) > > but func does not take an application context "object" like void *ctx, > that I could use to get my data and finally update it. Is there another > function that would allow me to do this? > Can you use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/SNES/SNESGetApplicationContext.html Matt > Thanks in advance, > Best regards, > Bernardo M.
Rocha >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 08:55:53 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 08:55:53 -0600 Subject: [petsc-users] SNESSetUpdate function In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 08:28, Bernardo Rocha < bernardomartinsrocha at gmail.com> wrote: > I'm using SNES but I would like to update my data after each successful > step computation within the nonlinear solver using a different function. I > mean, after the solution of each linear system I would like to get \Delta > x_k to update x_{k+1} which is stored somewhere in my application class > (C++ code). Can you explain more about what you are trying to do? What you describe does not sound algorithmically correct. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernardomartinsrocha at gmail.com Fri Jan 20 09:21:14 2012 From: bernardomartinsrocha at gmail.com (Bernardo Rocha) Date: Fri, 20 Jan 2012 13:21:14 -0200 Subject: [petsc-users] SNESSetUpdate function In-Reply-To: References: Message-ID: I have a C++ class that solves a nonlinear problem using PETSC's KSP without using SNES. Now I'm trying to make use of SNES. I created some functions to form the Function and Jacobian, just like the example (snes/examples/tutorials/ex1.c). I'm solving a nonlinear mechanics problem. When I have some prescribed displacements (fixed dofs) I do not assemble the whole system, so when SNES starts solving the linear systems the solution vector *u* contains only some part of the displacement, the other were already imposed in my vector *x*. So I need to implement a different Update function that considers if a degree of freedom is fixed or not, in order to update *x *to continue the nonlinear iteration. That's why I think I need to do this. Is it clear? On Fri, Jan 20, 2012 at 12:55 PM, Jed Brown wrote: > On Fri, Jan 20, 2012 at 08:28, Bernardo Rocha < > bernardomartinsrocha at gmail.com> wrote: > >> I'm using SNES but I would like to update my data after each successful >> step computation within the nonlinear solver using a different function. I >> mean, after the solution of each linear system I would like to get \Delta >> x_k to update x_{k+1} which is stored somewhere in my application class >> (C++ code). > > > Can you explain more about what you are trying to do? What you describe > does not sound algorithmically correct. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 09:30:12 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 09:30:12 -0600 Subject: [petsc-users] SNESSetUpdate function In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 9:21 AM, Bernardo Rocha < bernardomartinsrocha at gmail.com> wrote: > I have a C++ class that solves a nonlinear problem using PETSC's KSP > without using SNES. > Now I'm trying to make use of SNES. > > I created some functions to form the Function and Jacobian, just like the > example (snes/examples/tutorials/ex1.c). I'm solving a nonlinear mechanics > problem. 
When I have some prescribed displacements (fixed dofs) I do not > assemble the whole system, so when SNES starts solving the linear systems > the solution vector *u* contains only some part of the displacement, the > other were already imposed in my vector *x*. So I need to implement a > different Update function that considers if a degree of freedom is fixed or > not, in order to update *x *to continue the nonlinear iteration. > > That's why I think I need to do this. Is it clear? > I may not understand what you are doing, but it sounds to me like you are imposing Dirichlet conditions on the displacement. You can do this two other ways: a) Eliminate those degrees of freedom from the system you give PETSc or b) For fixed dofs, make that row of the Jacobian the identity, and put a zero in the residual. Matt > On Fri, Jan 20, 2012 at 12:55 PM, Jed Brown wrote: > >> On Fri, Jan 20, 2012 at 08:28, Bernardo Rocha < >> bernardomartinsrocha at gmail.com> wrote: >> >>> I'm using SNES but I would like to update my data after each successful >>> step computation within the nonlinear solver using a different function. I >>> mean, after the solution of each linear system I would like to get \Delta >>> x_k to update x_{k+1} which is stored somewhere in my application class >>> (C++ code). >> >> >> Can you explain more about what you are trying to do? What you describe >> does not sound algorithmically correct. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Jan 20 09:43:09 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 16:43:09 +0100 Subject: [petsc-users] problems compiling Message-ID: I am building on a Cray system with crayftn fortran compiler. I specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am still getting the below cited error. Interestingly, the mod file IS THERE: > find /tmp/petsc-0Bnd1w -iname \*.mod /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod How do I go on from here? Thanks Dominik ================================================================================ TEST checkFortranModuleInclude from config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) TESTING: checkFortranModuleInclude from config.compilers(config/BuildSystem/config/compilers.py:1155) Figures out what flag is used to specify the include path for Fortran modules Pushing language FC sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o -I/tmp/petsc-0Bnd1w/config.compilers -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 /tmp/petsc-0Bnd1w/config.compilers/conftest.F Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o -I/tmp/petsc-0Bnd1w/config.compilers -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 /tmp/petsc-0Bnd1w/config.compilers/conftest.F sh: Successful compile: Source: module configtest integer testint parameter (testint = 42) end module configtest ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Fortran module was not created during the compile. 
configtest.mod/CONFIGTEST.mod not found ******************************************************************************* File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, in petsc_configure framework.configure(out = sys.stdout) File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", line 925, in configure child.configure() File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", line 1338, in configure self.executeTest(self.checkFortranModuleInclude) File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", line 115, in executeTest ret = apply(test, args,kargs) File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", line 1187, in checkFortranModuleInclude raise RuntimeError('Fortran module was not created during the compile. configtest.mod/CONFIGTEST.mod not found') From jiangwen84 at gmail.com Fri Jan 20 09:44:13 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Fri, 20 Jan 2012 10:44:13 -0500 Subject: [petsc-users] generate entries on 'wrong' process Message-ID: Hi Barry, Thanks for your suggestion. I just added MatSetOption(mat, MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE) to my code, but I did not get any error information regarding to bad allocation. And my code is stuck there. I attached the output file below. Thanks. [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2514194 used [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [7] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines [7] PetscCommDuplicate(): Using internal PETSc communicator 47582902893600 339106512 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2514537 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator 46968795675680 536030192 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2499938 used [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [6] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. 
Not using Inode routines [6] PetscCommDuplicate(): Using internal PETSc communicator 47399146302496 509504096 [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2525390 used [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines [5] PetscCommDuplicate(): Using internal PETSc communicator 47033309994016 520223440 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [1] PetscCommDuplicate(): Using internal PETSc communicator 47149241441312 163068544 [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2525733 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines [2] PetscCommDuplicate(): Using internal PETSc communicator 47674980494368 119371056 > > > Since my code never finishes, I cannot get the summary files by add > -log_summary. any other way to get summary file? > My guess is that you are running a larger problem on the this system and > your preallocation for the matrix is wrong. While in the small run you sent > the preallocation is correct. > > Usually the only thing that causes it to take forever is not the > parallel communication but is the preallocation. After you create the > matrix and set its preallocation call > MatSetOption(mat, NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE); then run. It > will stop with an error message if preallocation is wrong. > > Barry > > > > > > > BTW, my codes are running without any problem on shared-memory desktop > with any number of processes. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From petsc-maint at mcs.anl.gov Fri Jan 20 09:45:41 2012 From: petsc-maint at mcs.anl.gov (Matthew Knepley) Date: Fri, 20 Jan 2012 09:45:41 -0600 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 9:43 AM, Dominik Szczerba wrote: > I am building on a Cray system with crayftn fortran compiler. I > specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am > still getting the below cited error. Interestingly, the mod file IS > THERE: > > > find /tmp/petsc-0Bnd1w -iname \*.mod > /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod > > How do I go on from here? > The problem is not that is doesn't create a module file, but that it will not put it where we ask it to. Please send entire configure.log (don't Cc users). 
Matt > Thanks > Dominik > > > > > ================================================================================ > TEST checkFortranModuleInclude from > > config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) > TESTING: checkFortranModuleInclude from > config.compilers(config/BuildSystem/config/compilers.py:1155) > Figures out what flag is used to specify the include path for Fortran > modules > Pushing language FC > sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > sh: > Successful compile: > Source: > module configtest > integer testint > parameter (testint = 42) > end module configtest > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > > ------------------------------------------------------------------------------- > Fortran module was not created during the compile. > configtest.mod/CONFIGTEST.mod not found > > ******************************************************************************* > File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, > in petsc_configure > framework.configure(out = sys.stdout) > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", > line 925, in configure > child.configure() > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1338, in configure > self.executeTest(self.checkFortranModuleInclude) > File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", > line 115, in executeTest > ret = apply(test, args,kargs) > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1187, in checkFortranModuleInclude > raise RuntimeError('Fortran module was not created during the > compile. configtest.mod/CONFIGTEST.mod not found') > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 09:48:16 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 09:48:16 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 9:44 AM, Wen Jiang wrote: > Hi Barry, > > Thanks for your suggestion. I just added MatSetOption(mat, > MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE) to my code, but I did not get > any error information regarding to bad allocation. And my code is stuck > there. I attached the output file below. Thanks. > Run with -start_in_debugger and get a stack trace. Note that your stashes are enormous. You might consider MatAssemblyBegin/End(A, MAT_ASSEMBLY_FLUSH) during assembly. Matt > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. 
> [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 1 > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2514194 used > [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [7] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [7] PetscCommDuplicate(): Using internal PETSc communicator 47582902893600 > 339106512 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2514537 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [0] PetscCommDuplicate(): Using internal PETSc communicator 46968795675680 > 536030192 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2499938 used > [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [6] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [6] PetscCommDuplicate(): Using internal PETSc communicator 47399146302496 > 509504096 > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2525390 used > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [5] PetscCommDuplicate(): Using internal PETSc communicator 47033309994016 > 520223440 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2500281 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [1] PetscCommDuplicate(): Using internal PETSc communicator 47149241441312 > 163068544 > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2525733 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [2] PetscCommDuplicate(): Using internal PETSc communicator 47674980494368 > 119371056 > > > > > >> > Since my code never finishes, I cannot get the summary files by add >> -log_summary. any other way to get summary file? >> > > My guess is that you are running a larger problem on the this system and >> your preallocation for the matrix is wrong. While in the small run you sent >> the preallocation is correct. 
>> >> Usually the only thing that causes it to take forever is not the >> parallel communication but is the preallocation. After you create the >> matrix and set its preallocation call >> MatSetOption(mat, NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE); then run. It >> will stop with an error message if preallocation is wrong. >> >> Barry >> >> >> >> > >> > BTW, my codes are running without any problem on shared-memory desktop >> with any number of processes. >> > >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jan 20 09:49:13 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 20 Jan 2012 09:49:13 -0600 Subject: [petsc-users] problems compiling In-Reply-To: References: Message-ID: <6AB08E30-E045-40EB-995F-94DC85565CF1@mcs.anl.gov> Dominik, Come on, you know better. All installation problems got to petsc-maint at mcs.anl.gov ONLY and ALWAYS send the configure.log Barry On Jan 20, 2012, at 9:43 AM, Dominik Szczerba wrote: > I am building on a Cray system with crayftn fortran compiler. I > specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am > still getting the below cited error. Interestingly, the mod file IS > THERE: > >> find /tmp/petsc-0Bnd1w -iname \*.mod > /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod > > How do I go on from here? > > Thanks > Dominik > > > > ================================================================================ > TEST checkFortranModuleInclude from > config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) > TESTING: checkFortranModuleInclude from > config.compilers(config/BuildSystem/config/compilers.py:1155) > Figures out what flag is used to specify the include path for Fortran modules > Pushing language FC > sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > sh: > Successful compile: > Source: > module configtest > integer testint > parameter (testint = 42) > end module configtest > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > ------------------------------------------------------------------------------- > Fortran module was not created during the compile. 
> configtest.mod/CONFIGTEST.mod not found > ******************************************************************************* > File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, > in petsc_configure > framework.configure(out = sys.stdout) > File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", > line 925, in configure > child.configure() > File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1338, in configure > self.executeTest(self.checkFortranModuleInclude) > File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", > line 115, in executeTest > ret = apply(test, args,kargs) > File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1187, in checkFortranModuleInclude > raise RuntimeError('Fortran module was not created during the > compile. configtest.mod/CONFIGTEST.mod not found') From jiangwen84 at gmail.com Fri Jan 20 10:21:59 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Fri, 20 Jan 2012 11:21:59 -0500 Subject: [petsc-users] generate entries on 'wrong' process Message-ID: Hi, Matt Could you tell me some more details about how to get a stack trace there? I know little about it. The job is submitted on head node and running on compute nodes. Thanks. On Fri, Jan 20, 2012 at 9:44 AM, Wen Jiang wrote: > Hi Barry, > > Thanks for your suggestion. I just added MatSetOption(mat, > MAT_NEW_NONZERO_ALLOCATION_ ERR,PETSC_TRUE) to my code, but I did not get > any error information regarding to bad allocation. And my code is stuck > there. I attached the output file below. Thanks. > Run with -start_in_debugger and get a stack trace. Note that your stashes are enormous. You might consider MatAssemblyBegin/End(A, MAT_ASSEMBLY_FLUSH) during assembly. Matt > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 1 > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs. > [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2514194 used > [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [7] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [7] PetscCommDuplicate(): Using internal PETSc communicator 47582902893600 > 339106512 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2514537 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using Inode > routines > [0] PetscCommDuplicate(): Using internal PETSc communicator 46968795675680 > 536030192 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2499938 used > [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [6] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [6] PetscCommDuplicate(): Using internal PETSc communicator 47399146302496 > 509504096 > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > unneeded,2525390 used > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode > routines > [5] PetscCommDuplicate(): Using internal PETSc communicator 47033309994016 > 520223440 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2500281 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [1] PetscCommDuplicate(): Using internal PETSc communicator 47149241441312 > 163068544 > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > unneeded,2525733 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode > routines > [2] PetscCommDuplicate(): Using internal PETSc communicator 47674980494368 > 119371056 > > > > > >> > Since my code never finishes, I cannot get the summary files by add >> -log_summary. any other way to get summary file? >> > > My guess is that you are running a larger problem on the this system and >> your preallocation for the matrix is wrong. While in the small run you sent >> the preallocation is correct. >> >> Usually the only thing that causes it to take forever is not the >> parallel communication but is the preallocation. After you create the >> matrix and set its preallocation call >> MatSetOption(mat, NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE); then run. It >> will stop with an error message if preallocation is wrong. >> >> Barry >> >> >> >> > >> > BTW, my codes are running without any problem on shared-memory desktop >> with any number of processes. >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 10:25:00 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 10:25:00 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 10:21 AM, Wen Jiang wrote: > Hi, Matt > > Could you tell me some more details about how to get a stack trace there? > I know little about it. The job is submitted on head node and running on > compute nodes. > 1) Always run serial problems until you understand what is happening 2) Run with -start_in_debugger, and type 'cont' in the debugger (read about gdb) 3) When it stalls, Ctrl-C and then type 'where' Matt > Thanks. > > On Fri, Jan 20, 2012 at 9:44 AM, Wen Jiang wrote: > > > Hi Barry, > > > > Thanks for your suggestion. 
I just added MatSetOption(mat, > > MAT_NEW_NONZERO_ALLOCATION_ > ERR,PETSC_TRUE) to my code, but I did not get > > any error information regarding to bad allocation. And my code is stuck > > there. I attached the output file below. Thanks. > > > > Run with -start_in_debugger and get a stack trace. Note that your stashes > are enormous. You might consider > MatAssemblyBegin/End(A, MAT_ASSEMBLY_FLUSH) during assembly. > > Matt > > > > [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs. > > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > > [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs. > > [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs. > > [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs. > > [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs. > > [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs. > > [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 > mallocs. > > [0] MatStashScatterBegin_Private(): No of messages: 1 > > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648 > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 > mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 > mallocs. > > [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > > unneeded,2514194 used > > [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [7] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using > Inode > > routines > > [7] PetscCommDuplicate(): Using internal PETSc communicator > 47582902893600 > > 339106512 > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2514537 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [0] PetscCommDuplicate(): Using internal PETSc communicator > 46968795675680 > > 536030192 > > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > > [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > > unneeded,2499938 used > > [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [6] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using > Inode > > routines > > [6] PetscCommDuplicate(): Using internal PETSc communicator > 47399146302496 > > 509504096 > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 > > unneeded,2525390 used > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using > Inode > > routines > > [5] PetscCommDuplicate(): Using internal PETSc communicator > 47033309994016 > > 520223440 > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2500281 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. 
Not using > Inode > > routines > > [1] PetscCommDuplicate(): Using internal PETSc communicator > 47149241441312 > > 163068544 > > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 > > unneeded,2525733 used > > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294 > > [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using > Inode > > routines > > [2] PetscCommDuplicate(): Using internal PETSc communicator > 47674980494368 > > 119371056 > > > > > > > > > > >> > Since my code never finishes, I cannot get the summary files by add > >> -log_summary. any other way to get summary file? > >> > > > > My guess is that you are running a larger problem on the this system > and > >> your preallocation for the matrix is wrong. While in the small run you > sent > >> the preallocation is correct. > >> > >> Usually the only thing that causes it to take forever is not the > >> parallel communication but is the preallocation. After you create the > >> matrix and set its preallocation call > >> MatSetOption(mat, NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE); then run. It > >> will stop with an error message if preallocation is wrong. > >> > >> Barry > >> > >> > >> > >> > > >> > BTW, my codes are running without any problem on shared-memory desktop > >> with any number of processes. > >> > > >> > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Jan 20 11:08:12 2012 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 20 Jan 2012 18:08:12 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: Message-ID: <4F199F7C.6010809@uni-mainz.de> Hi Dominik, It really depends on which Cray machine you are installing. If you like I can describe my recent installation experience on Cray XE6 "Rosa" in Zurich (it uses Linux). First make sure which compiler you are using. For me, GNU compilers did the job well. I also prefer using ACML BLAS/LAPACK Library, because XE6 uses AMD processors. So just go to your .bashrc and add something like: module load PrgEnv-gnu module load acml # PETSC 3.X export PETSC_DIR="$HOME/LIB/petsc-3.X-pY" export PETSC_ARCH="PETSC_CRAY_XE6_GNU_OPT" export PATH=$PATH:$PETSC_DIR/bin/ Don't forget to logout/login for changes to take effect. Also, you should only have one PrgEnv module in your .bashrc. Depending on the packages you need, you can modify the configuration command below: ./config/configure.py \ --with-batch=1 \ --known-mpi-shared=0 \ --known-memcmp-ok \ --with-blas-lapack-lib="$ACML_DIR/gfortran64/lib/libacml.a" \ --COPTFLAGS="-O3" \ --FOPTFLAGS="-O3" \ --CXXOPTFLAGS="-O3" \ --with-x=0 \ --with-debugging=0 \ --with-clib-autodetect=0 \ --with-cxxlib-autodetect=0 \ --with-fortranlib-autodetect=0 \ --with-shared=0 \ --with-dynamic=0 \ --with-mpi-compilers=1 \ --with-cc=cc \ --with-cxx=CC \ --with-fc=ftn \ --download-blacs=1 \ --download-scalapack=1 \ --download-mumps=1 \ --download-superlu_dist=1 \ --download-parmetis=1 \ --download-ml=1 If you don't need ACML, just make PETSc download and install BLAS/LAPACK for you. 
What is important here is to always keep the following: --with-batch=1 \ --known-mpi-shared=0 \ --with-clib-autodetect=0 \ --with-cxxlib-autodetect=0 \ --with-fortranlib-autodetect=0 \ NOTE: on CRAY machine you have to submit "conftest" executable on single processor via batch system (Rosa uses slurm) and run "reconfigure.py" to finalize configuration. After you proceed with this, you have to MANUALLY DELETE the following keys in $PETSC_DIR/$PETSC_ARCH/include/petscconf.h PETSC_HAVE_SYS_PROCFS_H PETSC_HAVE_DLFCN_H PETSC_HAVE_SYS_SOCKET_H PETSC_HAVE_SYS_UTSNAME_H PETSC_HAVE_PWD_H PETSC_HAVE_GETWD PETSC_HAVE_UNAME PETSC_HAVE_GETHOSTNAME PETSC_HAVE_GETDOMAINNAME PETSC_HAVE_NETINET_IN_H PETSC_HAVE_NETDB_H PETSC_USE_SOCKET_VIEWER PETSC_HAVE_GETPWUID Otherwise nothing will run, and you'll get segfaults in the very beginning. The reason behind it is that Cray version of Linux only supports reduced set of system calls. NOW you can do "make all" Hope it'll be helpful, Anton ------------------------ On 1/20/12 4:45 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 9:43 AM, Dominik Szczerba > > wrote: > > I am building on a Cray system with crayftn fortran compiler. I > specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am > still getting the below cited error. Interestingly, the mod file IS > THERE: > > > find /tmp/petsc-0Bnd1w -iname \*.mod > /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod > > How do I go on from here? > > > The problem is not that is doesn't create a module file, but that it > will not put it > where we ask it to. Please send entire configure.log (don't Cc users). > > Matt > > Thanks > Dominik > > > > ================================================================================ > TEST checkFortranModuleInclude from > config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) > TESTING: checkFortranModuleInclude from > config.compilers(config/BuildSystem/config/compilers.py:1155) > Figures out what flag is used to specify the include path for > Fortran modules > Pushing language FC > sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o > -I/tmp/petsc-0Bnd1w/config.compilers > -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 > /tmp/petsc-0Bnd1w/config.compilers/conftest.F > sh: > Successful compile: > Source: > module configtest > integer testint > parameter (testint = 42) > end module configtest > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > ------------------------------------------------------------------------------- > Fortran module was not created during the compile. 
> configtest.mod/CONFIGTEST.mod not found > ******************************************************************************* > File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, > in petsc_configure > framework.configure(out = sys.stdout) > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", > line 925, in configure > child.configure() > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1338, in configure > self.executeTest(self.checkFortranModuleInclude) > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", > line 115, in executeTest > ret = apply(test, args,kargs) > File > "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", > line 1187, in checkFortranModuleInclude > raise RuntimeError('Fortran module was not created during the > compile. configtest.mod/CONFIGTEST.mod not found') > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Jan 20 11:12:12 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 18:12:12 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: <4F199F7C.6010809@uni-mainz.de> References: <4F199F7C.6010809@uni-mainz.de> Message-ID: Hi Anton, Many thanks for your hints. I am on the same machine. The key difference here is that I want to use the cray environment to compare performance with gnu. And I have no issues with gnu and my experience is very similar to yours. Many thanks and best regards, Dominik On Fri, Jan 20, 2012 at 6:08 PM, Anton Popov wrote: > Hi Dominik, > > It really depends on which Cray machine you are installing. If you like I > can describe my recent installation experience on Cray XE6 "Rosa" in Zurich > (it uses Linux). > First make sure which compiler you are using. For me, GNU compilers did the > job well. I also prefer using ACML BLAS/LAPACK Library, because XE6 uses AMD > processors. > > So just go to your .bashrc and add something like: > > module load PrgEnv-gnu > module load acml > > # PETSC 3.X > export PETSC_DIR="$HOME/LIB/petsc-3.X-pY" > export PETSC_ARCH="PETSC_CRAY_XE6_GNU_OPT" > export PATH=$PATH:$PETSC_DIR/bin/ > > Don't forget to logout/login for changes to take effect. Also, you should > only have one PrgEnv module in your .bashrc. > > Depending on the packages you need, you can modify the configuration command > below: > > ./config/configure.py \ > --with-batch=1 \ > --known-mpi-shared=0 \ > --known-memcmp-ok \ > --with-blas-lapack-lib="$ACML_DIR/gfortran64/lib/libacml.a" \ > --COPTFLAGS="-O3" \ > --FOPTFLAGS="-O3" \ > --CXXOPTFLAGS="-O3" \ > --with-x=0 \ > --with-debugging=0 \ > --with-clib-autodetect=0 \ > --with-cxxlib-autodetect=0 \ > --with-fortranlib-autodetect=0 \ > --with-shared=0 \ > --with-dynamic=0 \ > --with-mpi-compilers=1 \ > --with-cc=cc \ > --with-cxx=CC \ > --with-fc=ftn \ > --download-blacs=1 \ > --download-scalapack=1 \ > --download-mumps=1 \ > --download-superlu_dist=1 \ > --download-parmetis=1 \ > --download-ml=1 > > If you don't need ACML, just make PETSc download and install BLAS/LAPACK for > you. 
> > What is important here is to always keep the following: > --with-batch=1 \ > --known-mpi-shared=0 \ > --with-clib-autodetect=0 \ > --with-cxxlib-autodetect=0 \ > --with-fortranlib-autodetect=0 \ > > NOTE: on CRAY machine you have to submit "conftest" executable on single > processor via batch system (Rosa uses slurm) and run "reconfigure.py" to > finalize configuration. > > After you proceed with this, you have to MANUALLY DELETE the following keys > in $PETSC_DIR/$PETSC_ARCH/include/petscconf.h > > PETSC_HAVE_SYS_PROCFS_H > PETSC_HAVE_DLFCN_H > PETSC_HAVE_SYS_SOCKET_H > PETSC_HAVE_SYS_UTSNAME_H > PETSC_HAVE_PWD_H > PETSC_HAVE_GETWD > PETSC_HAVE_UNAME > PETSC_HAVE_GETHOSTNAME > PETSC_HAVE_GETDOMAINNAME > PETSC_HAVE_NETINET_IN_H > PETSC_HAVE_NETDB_H > PETSC_USE_SOCKET_VIEWER > PETSC_HAVE_GETPWUID > > Otherwise nothing will run, and you'll get segfaults in the very beginning. > The reason behind it is that Cray version of Linux only supports reduced set > of system calls. > > NOW you can do "make all" > > Hope it'll be helpful, > > Anton > > ------------------------ > > > On 1/20/12 4:45 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 9:43 AM, Dominik Szczerba > wrote: >> >> I am building on a Cray system with crayftn fortran compiler. I >> specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am >> still getting the below cited error. Interestingly, the mod file IS >> THERE: >> >> > find /tmp/petsc-0Bnd1w -iname \*.mod >> /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod >> >> How do I go on from here? > > > The problem is not that is doesn't create a module file, but that it will > not put it > where we ask it to. Please send entire configure.log (don't Cc users). > > ? ?Matt > >> >> Thanks >> Dominik >> >> >> >> >> ================================================================================ >> TEST checkFortranModuleInclude from >> >> config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) >> TESTING: checkFortranModuleInclude from >> config.compilers(config/BuildSystem/config/compilers.py:1155) >> ?Figures out what flag is used to specify the include path for Fortran >> modules >> ? ? ? ?Pushing language FC >> sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o >> -I/tmp/petsc-0Bnd1w/config.compilers >> -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 >> /tmp/petsc-0Bnd1w/config.compilers/conftest.F >> Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o >> -I/tmp/petsc-0Bnd1w/config.compilers >> -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 >> /tmp/petsc-0Bnd1w/config.compilers/conftest.F >> sh: >> Successful compile: >> Source: >> ? ? ?module configtest >> ? ? ?integer testint >> ? ? ?parameter (testint = 42) >> ? ? ?end module configtest >> >> ******************************************************************************* >> ? ? ? ? UNABLE to CONFIGURE with GIVEN OPTIONS ? ?(see configure.log >> for details): >> >> ------------------------------------------------------------------------------- >> Fortran module was not created during the compile. >> configtest.mod/CONFIGTEST.mod not found >> >> ******************************************************************************* >> ?File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, >> in petsc_configure >> ? ?framework.configure(out = sys.stdout) >> ?File >> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", >> line 925, in configure >> ? 
?child.configure() >> ?File >> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", >> line 1338, in configure >> ? ?self.executeTest(self.checkFortranModuleInclude) >> ?File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", >> line 115, in executeTest >> ? ?ret = apply(test, args,kargs) >> ?File >> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", >> line 1187, in checkFortranModuleInclude >> ? ?raise RuntimeError('Fortran module was not created during the >> compile. configtest.mod/CONFIGTEST.mod not found') >> > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > From popov at uni-mainz.de Fri Jan 20 11:21:11 2012 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 20 Jan 2012 18:21:11 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> Message-ID: <4F19A287.3010806@uni-mainz.de> Well, I simply gave up on Cray compilers, as well as on the other compilers installed on this machine (Intel, Portland, Pathscale). After many attempts only GNU compilers were functional with PETSc. On 1/20/12 6:12 PM, Dominik Szczerba wrote: > Hi Anton, > > Many thanks for your hints. I am on the same machine. The key > difference here is that I want to use the cray environment to compare > performance with gnu. And I have no issues with gnu and my experience > is very similar to yours. > > Many thanks and best regards, > Dominik > > On Fri, Jan 20, 2012 at 6:08 PM, Anton Popov wrote: >> Hi Dominik, >> >> It really depends on which Cray machine you are installing. If you like I >> can describe my recent installation experience on Cray XE6 "Rosa" in Zurich >> (it uses Linux). >> First make sure which compiler you are using. For me, GNU compilers did the >> job well. I also prefer using ACML BLAS/LAPACK Library, because XE6 uses AMD >> processors. >> >> So just go to your .bashrc and add something like: >> >> module load PrgEnv-gnu >> module load acml >> >> # PETSC 3.X >> export PETSC_DIR="$HOME/LIB/petsc-3.X-pY" >> export PETSC_ARCH="PETSC_CRAY_XE6_GNU_OPT" >> export PATH=$PATH:$PETSC_DIR/bin/ >> >> Don't forget to logout/login for changes to take effect. Also, you should >> only have one PrgEnv module in your .bashrc. >> >> Depending on the packages you need, you can modify the configuration command >> below: >> >> ./config/configure.py \ >> --with-batch=1 \ >> --known-mpi-shared=0 \ >> --known-memcmp-ok \ >> --with-blas-lapack-lib="$ACML_DIR/gfortran64/lib/libacml.a" \ >> --COPTFLAGS="-O3" \ >> --FOPTFLAGS="-O3" \ >> --CXXOPTFLAGS="-O3" \ >> --with-x=0 \ >> --with-debugging=0 \ >> --with-clib-autodetect=0 \ >> --with-cxxlib-autodetect=0 \ >> --with-fortranlib-autodetect=0 \ >> --with-shared=0 \ >> --with-dynamic=0 \ >> --with-mpi-compilers=1 \ >> --with-cc=cc \ >> --with-cxx=CC \ >> --with-fc=ftn \ >> --download-blacs=1 \ >> --download-scalapack=1 \ >> --download-mumps=1 \ >> --download-superlu_dist=1 \ >> --download-parmetis=1 \ >> --download-ml=1 >> >> If you don't need ACML, just make PETSc download and install BLAS/LAPACK for >> you. 
>> >> What is important here is to always keep the following: >> --with-batch=1 \ >> --known-mpi-shared=0 \ >> --with-clib-autodetect=0 \ >> --with-cxxlib-autodetect=0 \ >> --with-fortranlib-autodetect=0 \ >> >> NOTE: on CRAY machine you have to submit "conftest" executable on single >> processor via batch system (Rosa uses slurm) and run "reconfigure.py" to >> finalize configuration. >> >> After you proceed with this, you have to MANUALLY DELETE the following keys >> in $PETSC_DIR/$PETSC_ARCH/include/petscconf.h >> >> PETSC_HAVE_SYS_PROCFS_H >> PETSC_HAVE_DLFCN_H >> PETSC_HAVE_SYS_SOCKET_H >> PETSC_HAVE_SYS_UTSNAME_H >> PETSC_HAVE_PWD_H >> PETSC_HAVE_GETWD >> PETSC_HAVE_UNAME >> PETSC_HAVE_GETHOSTNAME >> PETSC_HAVE_GETDOMAINNAME >> PETSC_HAVE_NETINET_IN_H >> PETSC_HAVE_NETDB_H >> PETSC_USE_SOCKET_VIEWER >> PETSC_HAVE_GETPWUID >> >> Otherwise nothing will run, and you'll get segfaults in the very beginning. >> The reason behind it is that Cray version of Linux only supports reduced set >> of system calls. >> >> NOW you can do "make all" >> >> Hope it'll be helpful, >> >> Anton >> >> ------------------------ >> >> >> On 1/20/12 4:45 PM, Matthew Knepley wrote: >> >> On Fri, Jan 20, 2012 at 9:43 AM, Dominik Szczerba >> wrote: >>> I am building on a Cray system with crayftn fortran compiler. I >>> specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am >>> still getting the below cited error. Interestingly, the mod file IS >>> THERE: >>> >>>> find /tmp/petsc-0Bnd1w -iname \*.mod >>> /tmp/petsc-0Bnd1w/config.compilers/CONFIGTEST.mod >>> >>> How do I go on from here? >> >> The problem is not that is doesn't create a module file, but that it will >> not put it >> where we ask it to. Please send entire configure.log (don't Cc users). >> >> Matt >> >>> Thanks >>> Dominik >>> >>> >>> >>> >>> ================================================================================ >>> TEST checkFortranModuleInclude from >>> >>> config.compilers(/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py:1155) >>> TESTING: checkFortranModuleInclude from >>> config.compilers(config/BuildSystem/config/compilers.py:1155) >>> Figures out what flag is used to specify the include path for Fortran >>> modules >>> Pushing language FC >>> sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o >>> -I/tmp/petsc-0Bnd1w/config.compilers >>> -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 >>> /tmp/petsc-0Bnd1w/config.compilers/conftest.F >>> Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.compilers/conftest.o >>> -I/tmp/petsc-0Bnd1w/config.compilers >>> -I/tmp/petsc-0Bnd1w/config.setCompilers -em -O3 >>> /tmp/petsc-0Bnd1w/config.compilers/conftest.F >>> sh: >>> Successful compile: >>> Source: >>> module configtest >>> integer testint >>> parameter (testint = 42) >>> end module configtest >>> >>> ******************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for details): >>> >>> ------------------------------------------------------------------------------- >>> Fortran module was not created during the compile. 
>>> configtest.mod/CONFIGTEST.mod not found >>> >>> ******************************************************************************* >>> File "/users/dsz/pack/petsc-3.2-p5/config/configure.py", line 283, >>> in petsc_configure >>> framework.configure(out = sys.stdout) >>> File >>> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/framework.py", >>> line 925, in configure >>> child.configure() >>> File >>> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", >>> line 1338, in configure >>> self.executeTest(self.checkFortranModuleInclude) >>> File "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/base.py", >>> line 115, in executeTest >>> ret = apply(test, args,kargs) >>> File >>> "/users/dsz/pack/petsc-3.2-p5/config/BuildSystem/config/compilers.py", >>> line 1187, in checkFortranModuleInclude >>> raise RuntimeError('Fortran module was not created during the >>> compile. configtest.mod/CONFIGTEST.mod not found') >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> From balay at mcs.anl.gov Fri Jan 20 11:25:10 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 20 Jan 2012 11:25:10 -0600 (CST) Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: <4F19A287.3010806@uni-mainz.de> References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: the default PGI compilers should work. Thats what I usually test with on cray. Satish On Fri, 20 Jan 2012, Anton Popov wrote: > Well, I simply gave up on Cray compilers, as well as on the other compilers > installed on this machine (Intel, Portland, Pathscale). After many attempts > only GNU compilers were functional with PETSc. From jiangwen84 at gmail.com Fri Jan 20 11:31:59 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Fri, 20 Jan 2012 12:31:59 -0500 Subject: [petsc-users] generate entries on 'wrong' process Message-ID: Hi Matt, The serial job is running without any problems and never stalls. Actually the parallel jobs also running successfully on distributed-memory desktop or on single node of cluster. It will get stuck if it is running on more than one compute node(now it is running on two nodes). Both the serial job and parallel job (running on distributed or cluster) I mentioned before have the same size(dofs). But If I ran a smaller job on cluster with two nodes, it might not get stuck and work fine. As you said before, I add MAT_ASSEMBLY_FLUSH after every element stiffness matrix is inserted. I got the output like below, and it gets stuck too. [0] MatStashScatterBegin_Private() : No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 24584 [0] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 4096 entries, uses 0 mallocs. [7] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [5] MatAssemblyBegin_MPIAIJ(): Stash has 2048 entries, uses 0 mallocs. [4] MatAssemblyBegin_MPIAIJ(): Stash has 2048 entries, uses 0 mallocs. [6] MatAssemblyBegin_MPIAIJ(): Stash has 1024 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 24584 [0] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. 
[3] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 4096 entries, uses 0 mallocs. [7] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [4] MatAssemblyBegin_MPIAIJ(): Stash has 2048 entries, uses 0 mallocs. [5] MatAssemblyBegin_MPIAIJ(): Stash has 2048 entries, uses 0 mallocs. [6] MatAssemblyBegin_MPIAIJ(): Stash has 1024 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 1 [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 24584 [0] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 3072 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 4096 entries, uses 0 mallocs. On Fri, Jan 20, 2012 at 10:21 AM, Wen Jiang wrote: > Hi, Matt > > Could you tell me some more details about how to get a stack trace there? > I know little about it. The job is submitted on head node and running on > compute nodes. > 1) Always run serial problems until you understand what is happening 2) Run with -start_in_debugger, and type 'cont' in the debugger (read about gdb) 3) When it stalls, Ctrl-C and then type 'where' Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 11:36:17 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 11:36:17 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 11:31, Wen Jiang wrote: > The serial job is running without any problems and never stalls. Actually > the parallel jobs also running successfully on distributed-memory desktop > or on single node of cluster. It will get stuck if it is running on more > than one compute node(now it is running on two nodes). Both the serial job > and parallel job (running on distributed or cluster) I mentioned before > have the same size(dofs). But If I ran a smaller job on cluster with two > nodes, it might not get stuck and work fine. > > As you said before, I add MAT_ASSEMBLY_FLUSH after every element stiffness > matrix is inserted. > This will deadlock unless the number of elements is *exactly* the same on every process. > I got the output like below, and it gets stuck too. > When it "gets stuck", attach a debugger and get stack traces. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 12:05:54 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 12:05:54 -0600 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: <4F19A287.3010806@uni-mainz.de> References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: On Fri, Jan 20, 2012 at 11:21 AM, Anton Popov wrote: > Well, I simply gave up on Cray compilers, as well as on the other > compilers installed on this machine (Intel, Portland, Pathscale). After > many attempts only GNU compilers were functional with PETSc. 1) If you don't send in the error reports, we can't fix it. 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try running with that on this machine? Thanks, Matt > On 1/20/12 6:12 PM, Dominik Szczerba wrote: > >> Hi Anton, >> >> Many thanks for your hints. I am on the same machine. The key >> difference here is that I want to use the cray environment to compare >> performance with gnu. 
And I have no issues with gnu and my experience >> is very similar to yours. >> >> Many thanks and best regards, >> Dominik >> >> On Fri, Jan 20, 2012 at 6:08 PM, Anton Popov wrote: >> >>> Hi Dominik, >>> >>> It really depends on which Cray machine you are installing. If you like I >>> can describe my recent installation experience on Cray XE6 "Rosa" in >>> Zurich >>> (it uses Linux). >>> First make sure which compiler you are using. For me, GNU compilers did >>> the >>> job well. I also prefer using ACML BLAS/LAPACK Library, because XE6 uses >>> AMD >>> processors. >>> >>> So just go to your .bashrc and add something like: >>> >>> module load PrgEnv-gnu >>> module load acml >>> >>> # PETSC 3.X >>> export PETSC_DIR="$HOME/LIB/petsc-3.**X-pY" >>> export PETSC_ARCH="PETSC_CRAY_XE6_**GNU_OPT" >>> export PATH=$PATH:$PETSC_DIR/bin/ >>> >>> Don't forget to logout/login for changes to take effect. Also, you should >>> only have one PrgEnv module in your .bashrc. >>> >>> Depending on the packages you need, you can modify the configuration >>> command >>> below: >>> >>> ./config/configure.py \ >>> --with-batch=1 \ >>> --known-mpi-shared=0 \ >>> --known-memcmp-ok \ >>> --with-blas-lapack-lib="$ACML_**DIR/gfortran64/lib/libacml.a" \ >>> --COPTFLAGS="-O3" \ >>> --FOPTFLAGS="-O3" \ >>> --CXXOPTFLAGS="-O3" \ >>> --with-x=0 \ >>> --with-debugging=0 \ >>> --with-clib-autodetect=0 \ >>> --with-cxxlib-autodetect=0 \ >>> --with-fortranlib-autodetect=0 \ >>> --with-shared=0 \ >>> --with-dynamic=0 \ >>> --with-mpi-compilers=1 \ >>> --with-cc=cc \ >>> --with-cxx=CC \ >>> --with-fc=ftn \ >>> --download-blacs=1 \ >>> --download-scalapack=1 \ >>> --download-mumps=1 \ >>> --download-superlu_dist=1 \ >>> --download-parmetis=1 \ >>> --download-ml=1 >>> >>> If you don't need ACML, just make PETSc download and install BLAS/LAPACK >>> for >>> you. >>> >>> What is important here is to always keep the following: >>> --with-batch=1 \ >>> --known-mpi-shared=0 \ >>> --with-clib-autodetect=0 \ >>> --with-cxxlib-autodetect=0 \ >>> --with-fortranlib-autodetect=0 \ >>> >>> NOTE: on CRAY machine you have to submit "conftest" executable on single >>> processor via batch system (Rosa uses slurm) and run "reconfigure.py" to >>> finalize configuration. >>> >>> After you proceed with this, you have to MANUALLY DELETE the following >>> keys >>> in $PETSC_DIR/$PETSC_ARCH/**include/petscconf.h >>> >>> PETSC_HAVE_SYS_PROCFS_H >>> PETSC_HAVE_DLFCN_H >>> PETSC_HAVE_SYS_SOCKET_H >>> PETSC_HAVE_SYS_UTSNAME_H >>> PETSC_HAVE_PWD_H >>> PETSC_HAVE_GETWD >>> PETSC_HAVE_UNAME >>> PETSC_HAVE_GETHOSTNAME >>> PETSC_HAVE_GETDOMAINNAME >>> PETSC_HAVE_NETINET_IN_H >>> PETSC_HAVE_NETDB_H >>> PETSC_USE_SOCKET_VIEWER >>> PETSC_HAVE_GETPWUID >>> >>> Otherwise nothing will run, and you'll get segfaults in the very >>> beginning. >>> The reason behind it is that Cray version of Linux only supports reduced >>> set >>> of system calls. >>> >>> NOW you can do "make all" >>> >>> Hope it'll be helpful, >>> >>> Anton >>> >>> ------------------------ >>> >>> >>> On 1/20/12 4:45 PM, Matthew Knepley wrote: >>> >>> On Fri, Jan 20, 2012 at 9:43 AM, Dominik Szczerba >>> wrote: >>> >>>> I am building on a Cray system with crayftn fortran compiler. I >>>> specify --with-fc=crayftn and --FFLAGS=-em. Despite the latter, I am >>>> still getting the below cited error. Interestingly, the mod file IS >>>> THERE: >>>> >>>> find /tmp/petsc-0Bnd1w -iname \*.mod >>>>> >>>> /tmp/petsc-0Bnd1w/config.**compilers/CONFIGTEST.mod >>>> >>>> How do I go on from here? 
>>>> >>> >>> The problem is not that is doesn't create a module file, but that it will >>> not put it >>> where we ask it to. Please send entire configure.log (don't Cc users). >>> >>> Matt >>> >>> Thanks >>>> Dominik >>>> >>>> >>>> >>>> >>>> ==============================**==============================** >>>> ==================== >>>> TEST checkFortranModuleInclude from >>>> >>>> config.compilers(/users/dsz/**pack/petsc-3.2-p5/config/** >>>> BuildSystem/config/compilers.**py:1155) >>>> TESTING: checkFortranModuleInclude from >>>> config.compilers(config/**BuildSystem/config/compilers.**py:1155) >>>> Figures out what flag is used to specify the include path for Fortran >>>> modules >>>> Pushing language FC >>>> sh: crayftn -c -o /tmp/petsc-0Bnd1w/config.**compilers/conftest.o >>>> -I/tmp/petsc-0Bnd1w/config.**compilers >>>> -I/tmp/petsc-0Bnd1w/config.**setCompilers -em -O3 >>>> /tmp/petsc-0Bnd1w/config.**compilers/conftest.F >>>> Executing: crayftn -c -o /tmp/petsc-0Bnd1w/config.** >>>> compilers/conftest.o >>>> -I/tmp/petsc-0Bnd1w/config.**compilers >>>> -I/tmp/petsc-0Bnd1w/config.**setCompilers -em -O3 >>>> /tmp/petsc-0Bnd1w/config.**compilers/conftest.F >>>> sh: >>>> Successful compile: >>>> Source: >>>> module configtest >>>> integer testint >>>> parameter (testint = 42) >>>> end module configtest >>>> >>>> **************************************************************** >>>> ******************* >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>>> for details): >>>> >>>> ------------------------------**------------------------------** >>>> ------------------- >>>> Fortran module was not created during the compile. >>>> configtest.mod/CONFIGTEST.mod not found >>>> >>>> **************************************************************** >>>> ******************* >>>> File "/users/dsz/pack/petsc-3.2-p5/**config/configure.py", line 283, >>>> in petsc_configure >>>> framework.configure(out = sys.stdout) >>>> File >>>> "/users/dsz/pack/petsc-3.2-p5/**config/BuildSystem/config/** >>>> framework.py", >>>> line 925, in configure >>>> child.configure() >>>> File >>>> "/users/dsz/pack/petsc-3.2-p5/**config/BuildSystem/config/** >>>> compilers.py", >>>> line 1338, in configure >>>> self.executeTest(self.**checkFortranModuleInclude) >>>> File "/users/dsz/pack/petsc-3.2-p5/**config/BuildSystem/config/** >>>> base.py", >>>> line 115, in executeTest >>>> ret = apply(test, args,kargs) >>>> File >>>> "/users/dsz/pack/petsc-3.2-p5/**config/BuildSystem/config/** >>>> compilers.py", >>>> line 1187, in checkFortranModuleInclude >>>> raise RuntimeError('Fortran module was not created during the >>>> compile. configtest.mod/CONFIGTEST.mod not found') >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their >>> experiments >>> lead. >>> -- Norbert Wiener >>> >>> >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cbergstrom at pathscale.com Fri Jan 20 12:11:22 2012 From: cbergstrom at pathscale.com (=?ISO-8859-1?Q?=22C=2E_Bergstr=F6m=22?=) Date: Sat, 21 Jan 2012 01:11:22 +0700 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: <4F19AE4A.60401@pathscale.com> On 01/21/12 01:05 AM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 11:21 AM, Anton Popov > wrote: > > Well, I simply gave up on Cray compilers, as well as on the other > compilers installed on this machine (Intel, Portland, Pathscale). > After many attempts only GNU compilers were functional with PETSc. > > > 1) If you don't send in the error reports, we can't fix it. +1 from PathScale - We need bug reports. Please email me directly if you're serious about getting the issues resolved. thanks From dominik at itis.ethz.ch Fri Jan 20 12:49:44 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 19:49:44 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try > running with that on this machine? There is no mercurial on the system. Is your fix already in the tarball? From knepley at gmail.com Fri Jan 20 12:51:48 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 12:51:48 -0600 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: On Fri, Jan 20, 2012 at 12:49 PM, Dominik Szczerba wrote: > > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you > try > > running with that on this machine? > > There is no mercurial on the system. Is your fix already in the tarball? > http://petsc.cs.iit.edu/petsc/petsc-dev/archive/tip.tar.gz Mercurial is trivial to build in your own directory. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Jan 20 12:56:49 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 20 Jan 2012 12:56:49 -0600 (CST) Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: On Fri, 20 Jan 2012, Dominik Szczerba wrote: > > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try > > running with that on this machine? > > There is no mercurial on the system. Is your fix already in the tarball? you can download the patch from here - and apply as before with the patch command. [Note: get the 'raw' patch] http://petsc.cs.iit.edu/petsc/BuildSystem/rev/39eccbe256d3 satish From dominik at itis.ethz.ch Fri Jan 20 12:57:02 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 19:57:02 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: I am getting CONFIGURATION CRASH. Sending configure.log to maint in a sec. 
On Fri, Jan 20, 2012 at 7:51 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 12:49 PM, Dominik Szczerba > wrote: >> >> > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you >> > try >> > running with that on this machine? >> >> There is no mercurial on the system. Is your fix already in the tarball? > > > http://petsc.cs.iit.edu/petsc/petsc-dev/archive/tip.tar.gz > > Mercurial is trivial to build in your own directory. > > ? ?Matt > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From dominik at itis.ethz.ch Fri Jan 20 13:03:08 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 20:03:08 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: On Fri, Jan 20, 2012 at 7:56 PM, Satish Balay wrote: > On Fri, 20 Jan 2012, Dominik Szczerba wrote: > >> > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try >> > running with that on this machine? >> >> There is no mercurial on the system. Is your fix already in the tarball? > > you can download the patch from here - and apply as before with the patch > command. [Note: get the 'raw' patch] Please explain the 'raw' part. Thanks From balay at mcs.anl.gov Fri Jan 20 13:13:04 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 20 Jan 2012 13:13:04 -0600 (CST) Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: On Fri, 20 Jan 2012, Dominik Szczerba wrote: > On Fri, Jan 20, 2012 at 7:56 PM, Satish Balay wrote: > > On Fri, 20 Jan 2012, Dominik Szczerba wrote: > > > >> > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try > >> > running with that on this machine? > >> > >> There is no mercurial on the system. Is your fix already in the tarball? > > > > you can download the patch from here - and apply as before with the patch > > command. [Note: get the 'raw' patch] > > Please explain the 'raw' part. http://petsc.cs.iit.edu/petsc/BuildSystem/rev/39eccbe256d3 shows the patch in html. However there is a link on this page with the text 'raw' If you click on that - you get plain text patch - that can be applied with 'patch command' http://petsc.cs.iit.edu/petsc/BuildSystem/raw-rev/39eccbe256d3 Ig uess I should have just sent the 'raw' link - but figured the html link had more utility [as its more browsable] - so listed the html link to the patch. You can browse all petsc-dev changes at: http://petsc.cs.iit.edu/petsc/petsc-dev/ http://petsc.cs.iit.edu/petsc/BuildSystem/ Satish From jiangwen84 at gmail.com Fri Jan 20 13:52:12 2012 From: jiangwen84 at gmail.com (Wen Jiang) Date: Fri, 20 Jan 2012 14:52:12 -0500 Subject: [petsc-users] generate entries on 'wrong' process Message-ID: Hi Jed, Could you cover it a bit more details why it will get deadlock unless the number of elements is *exactly* the same on every process? Thanks. Regards, Wen Message: 5 Date: Fri, 20 Jan 2012 11:36:17 -0600 From: Jed Brown Subject: Re: [petsc-users] generate entries on 'wrong' process To: PETSc users list Message-ID: Content-Type: text/plain; charset="utf-8" On Fri, Jan 20, 2012 at 11:31, Wen Jiang wrote: > The serial job is running without any problems and never stalls. 
Actually > the parallel jobs also running successfully on distributed-memory desktop > or on single node of cluster. It will get stuck if it is running on more > than one compute node(now it is running on two nodes). Both the serial job > and parallel job (running on distributed or cluster) I mentioned before > have the same size(dofs). But If I ran a smaller job on cluster with two > nodes, it might not get stuck and work fine. > > As you said before, I add MAT_ASSEMBLY_FLUSH after every element stiffness > matrix is inserted. > This will deadlock unless the number of elements is *exactly* the same on every process. > I got the output like below, and it gets stuck too. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 13:54:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 13:54:45 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 1:52 PM, Wen Jiang wrote: > Hi Jed, > > Could you cover it a bit more details why it will get deadlock unless the > number of elements is *exactly* the same on every process? Thanks. > The flush call is collective. Everyone has to call it the same number of times. Matt > Regards, > Wen > > Message: 5 > Date: Fri, 20 Jan 2012 11:36:17 -0600 > From: Jed Brown > Subject: Re: [petsc-users] generate entries on 'wrong' process > To: PETSc users list > Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > On Fri, Jan 20, 2012 at 11:31, Wen Jiang wrote: > > > The serial job is running without any problems and never stalls. Actually > > the parallel jobs also running successfully on distributed-memory desktop > > or on single node of cluster. It will get stuck if it is running on more > > than one compute node(now it is running on two nodes). Both the serial > job > > and parallel job (running on distributed or cluster) I mentioned before > > have the same size(dofs). But If I ran a smaller job on cluster with two > > nodes, it might not get stuck and work fine. > > > > As you said before, I add MAT_ASSEMBLY_FLUSH after every element > stiffness > > matrix is inserted. > > > > This will deadlock unless the number of elements is *exactly* the same on > every process. > > > > I got the output like below, and it gets stuck too. > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 13:55:05 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 13:55:05 -0600 Subject: [petsc-users] generate entries on 'wrong' process In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 13:52, Wen Jiang wrote: > Could you cover it a bit more details why it will get deadlock unless the > number of elements is *exactly* the same on every process? MatAssemblyBegin/End are collective, so every process needs to call them together. -------------- next part -------------- An HTML attachment was scrubbed... 
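To make the collectivity requirement concrete, here is a minimal sketch of a
flush pattern that cannot deadlock. The matrix A, the local element range
estart..eend, and the nrow/rows/ncol/cols/vals arrays are placeholders for the
user's own assembly code, and A is assumed to be preallocated already; the one
essential point is that nbatch is identical on every rank, so every rank
reaches the flush the same number of times.

   PetscErrorCode ierr;
   PetscInt       b, e, nbatch = 10;   /* must be the same value on every rank */
   for (b = 0; b < nbatch; b++) {
     for (e = estart + b; e < eend; e += nbatch) {
       /* compute and insert the stiffness matrix of local element e */
       ierr = MatSetValues(A, nrow, rows, ncol, cols, vals, ADD_VALUES);CHKERRQ(ierr);
     }
     /* every rank executes this flush exactly nbatch times, regardless of
        how many elements it owns, so the collective calls always match up */
     ierr = MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
     ierr = MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
   }
   ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);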
URL: From bsmith at mcs.anl.gov Fri Jan 20 13:57:42 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 20 Jan 2012 13:57:42 -0600 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: Please move this thread off of petsc-users and to petsc-maint at mcs.anl.gov Barry On Jan 20, 2012, at 1:13 PM, Satish Balay wrote: > On Fri, 20 Jan 2012, Dominik Szczerba wrote: > >> On Fri, Jan 20, 2012 at 7:56 PM, Satish Balay wrote: >>> On Fri, 20 Jan 2012, Dominik Szczerba wrote: >>> >>>>> 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try >>>>> running with that on this machine? >>>> >>>> There is no mercurial on the system. Is your fix already in the tarball? >>> >>> you can download the patch from here - and apply as before with the patch >>> command. [Note: get the 'raw' patch] >> >> Please explain the 'raw' part. > > http://petsc.cs.iit.edu/petsc/BuildSystem/rev/39eccbe256d3 > > shows the patch in html. However there is a link on this page with the > text 'raw' > > If you click on that - you get plain text patch - that can be applied > with 'patch command' > > http://petsc.cs.iit.edu/petsc/BuildSystem/raw-rev/39eccbe256d3 > > > Ig uess I should have just sent the 'raw' link - but figured the html > link had more utility [as its more browsable] - so listed the html > link to the patch. > > You can browse all petsc-dev changes at: > http://petsc.cs.iit.edu/petsc/petsc-dev/ > http://petsc.cs.iit.edu/petsc/BuildSystem/ > > Satish > From dominik at itis.ethz.ch Fri Jan 20 13:57:58 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 20:57:58 +0100 Subject: [petsc-users] [petsc-maint #103068] problems compiling In-Reply-To: References: <4F199F7C.6010809@uni-mainz.de> <4F19A287.3010806@uni-mainz.de> Message-ID: Ah yes, thanks, no, I would not apply a html formatted patch with line numbers :) Thanks, Dominik On Fri, Jan 20, 2012 at 8:13 PM, Satish Balay wrote: > On Fri, 20 Jan 2012, Dominik Szczerba wrote: > >> On Fri, Jan 20, 2012 at 7:56 PM, Satish Balay wrote: >> > On Fri, 20 Jan 2012, Dominik Szczerba wrote: >> > >> >> > 2) Dominik, I have pushed a possible fix for this to petsc-dev. Can you try >> >> > running with that on this machine? >> >> >> >> There is no mercurial on the system. Is your fix already in the tarball? >> > >> > you can download the patch from here - and apply as before with the patch >> > command. [Note: get the 'raw' patch] >> >> Please explain the 'raw' part. > > http://petsc.cs.iit.edu/petsc/BuildSystem/rev/39eccbe256d3 > > shows the patch in html. However there is a link on this page with the > text 'raw' > > If you click on that - you get plain text patch - that can be applied > with 'patch command' > > http://petsc.cs.iit.edu/petsc/BuildSystem/raw-rev/39eccbe256d3 > > > Ig uess I should have just sent the 'raw' link - but figured the html > link had more utility [as its more browsable] - so listed the html > link to the patch. > > You can browse all petsc-dev changes at: > http://petsc.cs.iit.edu/petsc/petsc-dev/ > http://petsc.cs.iit.edu/petsc/BuildSystem/ > > Satish > From mmnasr at gmail.com Fri Jan 20 15:15:20 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 20 Jan 2012 13:15:20 -0800 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND Message-ID: Hi guys, Does *PetscViewerHDF5Open()*work with *FILE_MODE_APPEND*? 
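For concreteness, the reopen-for-appending call being attempted here is
roughly the following sketch (the file name and viewer variable are
placeholders):

   PetscViewer viewer;
   /* second pass: reopen an existing .h5 file to add more data to it */
   ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "out.h5", FILE_MODE_APPEND, &viewer);CHKERRQ(ierr);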
This is the error I get when I first open the file, write some data to it using VecView(), close it and then again reopen (with append mode) it to add data to the end of it. [0]PETSC ERROR: Operation done in wrong order! [0]PETSC ERROR: Must call PetscViewerFileSetMode() before PetscViewerFileSetName()! Thanks, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 15:19:39 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 15:19:39 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 3:15 PM, Mohamad M. Nasr-Azadani wrote: > Hi guys, > > Does *PetscViewerHDF5Open()*work with *FILE_MODE_APPEND*? > No, but HDF5 does not work like that. Matt > This is the error I get when I first open the file, write some data to it > using VecView(), close it and then again reopen (with append mode) it to > add data to the end of it. > > [0]PETSC ERROR: Operation done in wrong order! > [0]PETSC ERROR: Must call PetscViewerFileSetMode() before > PetscViewerFileSetName()! > > Thanks, > Mohamad > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Fri Jan 20 15:22:43 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 20 Jan 2012 13:22:43 -0800 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: Thanks Matt, So, this cannot be done this way, i.e. generate an hdf5 file at one instance and write a field data, close it then add another field data at later instance to the same file? On Fri, Jan 20, 2012 at 1:19 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 3:15 PM, Mohamad M. Nasr-Azadani > wrote: > >> Hi guys, >> >> Does *PetscViewerHDF5Open()*work with *FILE_MODE_APPEND*? >> > > No, but HDF5 does not work like that. > > Matt > > >> This is the error I get when I first open the file, write some data to it >> using VecView(), close it and then again reopen (with append mode) it to >> add data to the end of it. >> >> [0]PETSC ERROR: Operation done in wrong order! >> [0]PETSC ERROR: Must call PetscViewerFileSetMode() before >> PetscViewerFileSetName()! >> >> Thanks, >> Mohamad >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 15:27:25 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 15:27:25 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 3:22 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Matt, > So, this cannot be done this way, i.e. generate an hdf5 file at one > instance and write a field data, close it then add another field data at > later instance to the same file? Can't you just open up the file and write something with a different name? Matt > On Fri, Jan 20, 2012 at 1:19 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 3:15 PM, Mohamad M. Nasr-Azadani < >> mmnasr at gmail.com> wrote: >> >>> Hi guys, >>> >>> Does *PetscViewerHDF5Open()*work with *FILE_MODE_APPEND*? >>> >> >> No, but HDF5 does not work like that. 
>> >> Matt >> >> >>> This is the error I get when I first open the file, write some data to >>> it using VecView(), close it and then again reopen (with append mode) it to >>> add data to the end of it. >>> >>> [0]PETSC ERROR: Operation done in wrong order! >>> [0]PETSC ERROR: Must call PetscViewerFileSetMode() before >>> PetscViewerFileSetName()! >>> >>> Thanks, >>> Mohamad >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Jan 20 15:27:25 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 20 Jan 2012 22:27:25 +0100 Subject: [petsc-users] performance surprise Message-ID: I am running some performance tests on a distributed cluster each node 16 cores (Cray). I am very surprised to find that my benchmark jobs are about 3x slower when running on N nodes using all 16 cores than when running on N*16 nodes using only one core. I find this using 2 independent petsc builds and they both exibit the same behavior: my own gnu build and the system module petsc, both 3.2. I was so far unable to build my own petsc version with cray compilers to compare. The scheme is relatively complex with a shell matrix and block preconditioners, transient non-linear problem. I am using boomeramg from hypre. What do you think this unexpected performance may come from? Is it possible that the node interconnect is faster than the shared memory bus on the node? I was expecting the exact opposite. Thanks for any opinions. Dominik From mmnasr at gmail.com Fri Jan 20 15:33:38 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 20 Jan 2012 13:33:38 -0800 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: At this point, I can't since the first time the viewer is created via PETSC_COMM_WORLD and the second time, PETSC_COMM_SELF. (I am not even 100% sure if that could cause any troubles with the hdf5 file?) Mohamad On Fri, Jan 20, 2012 at 1:27 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 3:22 PM, Mohamad M. Nasr-Azadani > wrote: > >> Thanks Matt, >> So, this cannot be done this way, i.e. generate an hdf5 file at one >> instance and write a field data, close it then add another field data at >> later instance to the same file? > > > Can't you just open up the file and write something with a different name? > > Matt > > >> On Fri, Jan 20, 2012 at 1:19 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 3:15 PM, Mohamad M. Nasr-Azadani < >>> mmnasr at gmail.com> wrote: >>> >>>> Hi guys, >>>> >>>> Does *PetscViewerHDF5Open()*work with *FILE_MODE_APPEND*? >>>> >>> >>> No, but HDF5 does not work like that. >>> >>> Matt >>> >>> >>>> This is the error I get when I first open the file, write some data to >>>> it using VecView(), close it and then again reopen (with append mode) it to >>>> add data to the end of it. >>>> >>>> [0]PETSC ERROR: Operation done in wrong order! >>>> [0]PETSC ERROR: Must call PetscViewerFileSetMode() before >>>> PetscViewerFileSetName()! 
>>>> >>>> Thanks, >>>> Mohamad >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 15:36:33 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 15:36:33 -0600 Subject: [petsc-users] performance surprise In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 15:27, Dominik Szczerba wrote: > I am running some performance tests on a distributed cluster each node > 16 cores (Cray). > I am very surprised to find that my benchmark jobs are about 3x slower when > running on N nodes using all 16 cores than when running on N*16 nodes > using only one core. > Yes, this is normal. Memory bandwidth is the overwhelming bottleneck for most sparse linear algebra. One core can almost saturate the bandwidth of a socket, so you see little benefit from the extra cores. Pay attention to memory bandwidth when you buy computers and try to make your algorithms use a lot of flops per memory access if you want to utilize the floating point hardware you have lying around. > I find this using 2 independent petsc builds and > they both exibit the same behavior: my own gnu > build and the system module petsc, both 3.2. I was so far unable to > build my own petsc version with cray compilers to compare. > > The scheme is relatively complex with a shell matrix and block > preconditioners, transient non-linear problem. I am using boomeramg > from hypre. > > What do you think this unexpected performance may come from? Is it > possible that the node interconnect is faster than the shared memory > bus on the node? I was expecting the exact opposite. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 20 15:37:20 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 15:37:20 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 15:33, Mohamad M. Nasr-Azadani wrote: > At this point, I can't since the first time the viewer is created via > PETSC_COMM_WORLD and the second time, PETSC_COMM_SELF. > (I am not even 100% sure if that could cause any troubles with the hdf5 > file?) > Well this sounds pretty confusing. What are you actually trying to do? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Fri Jan 20 15:47:24 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 20 Jan 2012 13:47:24 -0800 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: I know, it is getting very confusing. But this sounds very simple but got this complicated so far. I think you have seen my emails yesterday. I want to write a vector (3D DMDA, structured) at each time to an hdf5 file. But, I want to also add the coodinates to the same *.h5 file as well. The grid is structured, so I only need 3 1-D arrays representing the coordinates (and not 1D*1D*1D coordinates cell coordinates). How would you go about it? 
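For reference, the per-time-step field write described above amounts to
something like the sketch below; the Vec v is assumed to come from the 3D
DMDA, and the file and dataset names are made up. The three small 1-D
coordinate arrays are the part still in question here.

   PetscViewer viewer;
   ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "fields.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
   ierr = PetscObjectSetName((PetscObject)v, "concentration");CHKERRQ(ierr); /* dataset name inside the .h5 file */
   ierr = VecView(v, viewer);CHKERRQ(ierr);
   ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);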
I thought about creating a "parallel" vector which has the size of 0 on all processors and N on processor zero and then dump the coordinates into that vector and into the same hdf5 file. No luck, since hdf5 writer does not like zero-sized local vectors (if I could have done this, then I could use the same hdf5 viewer and not close the file). But now, I am stuck. I think about closing the hdf5 viewer after I dumped the field data, create a vector using PETSC_COMM_SELF including the coordinates on all processor and then add that vector to the end of same hdf5 file (that's why I need to open it and use append mode). Of course, I would do the writing only on processor zero. I hope I did not confuse you. Thanks, Mohamad On Fri, Jan 20, 2012 at 1:37 PM, Jed Brown wrote: > On Fri, Jan 20, 2012 at 15:33, Mohamad M. Nasr-Azadani wrote: > >> At this point, I can't since the first time the viewer is created via >> PETSC_COMM_WORLD and the second time, PETSC_COMM_SELF. >> (I am not even 100% sure if that could cause any troubles with the hdf5 >> file?) >> > > Well this sounds pretty confusing. What are you actually trying to do? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 15:55:01 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 15:55:01 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 3:47 PM, Mohamad M. Nasr-Azadani wrote: > I know, it is getting very confusing. But this sounds very simple but got > this complicated so far. > I think you have seen my emails yesterday. > I want to write a vector (3D DMDA, structured) at each time to an hdf5 > file. > But, I want to also add the coodinates to the same *.h5 file as well. > The grid is structured, so I only need 3 1-D arrays representing the > coordinates (and not 1D*1D*1D coordinates cell coordinates). > How would you go about it? > > I thought about creating a "parallel" vector which has the size of 0 on > all processors and N on processor zero and then dump the coordinates into > that vector and into the same hdf5 file. No luck, since hdf5 writer does > not like zero-sized local vectors (if I could have done this, then I could > use the same hdf5 viewer and not close the file). But now, I am stuck. > I think about closing the hdf5 viewer after I dumped the field data, > create a vector using PETSC_COMM_SELF including the coordinates on all > processor and then add that vector to the end of same hdf5 file (that's why > I need to open it and use append mode). Of course, I would do the writing > only on processor zero. > There is no need for append mode. HDF5 does not work that way. Just open it up and write. Matt > I hope I did not confuse you. > Thanks, > Mohamad > > > > > > On Fri, Jan 20, 2012 at 1:37 PM, Jed Brown wrote: > >> On Fri, Jan 20, 2012 at 15:33, Mohamad M. Nasr-Azadani wrote: >> >>> At this point, I can't since the first time the viewer is created via >>> PETSC_COMM_WORLD and the second time, PETSC_COMM_SELF. >>> (I am not even 100% sure if that could cause any troubles with the hdf5 >>> file?) >>> >> >> Well this sounds pretty confusing. What are you actually trying to do? >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Fri Jan 20 16:00:13 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 20 Jan 2012 16:00:13 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 15:55, Matthew Knepley wrote: > There is no need for append mode. HDF5 does not work that way. Just open > it up and write. As written, that would truncate the old file. Making this work might be as simple as adding a case for FILE_MODE_APPEND that opened the file H5F_ACC_RDWR. /* Create or open the file collectively */ switch (hdf5->btype) { case FILE_MODE_READ: hdf5->file_id = H5Fopen(name, H5F_ACC_RDONLY, plist_id); break; case FILE_MODE_WRITE: hdf5->file_id = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id); break; -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 16:12:24 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 16:12:24 -0600 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 4:00 PM, Jed Brown wrote: > On Fri, Jan 20, 2012 at 15:55, Matthew Knepley wrote: > >> There is no need for append mode. HDF5 does not work that way. Just open >> it up and write. > > > As written, that would truncate the old file. Making this work might be as > simple as adding a case for FILE_MODE_APPEND that opened the file > H5F_ACC_RDWR. > > /* Create or open the file collectively */ > switch (hdf5->btype) { > case FILE_MODE_READ: > hdf5->file_id = H5Fopen(name, H5F_ACC_RDONLY, plist_id); > break; > case FILE_MODE_WRITE: > hdf5->file_id = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, > plist_id); > break; > I thought we have changed this for PyLith, but we did it differently. This has been pushed. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jan 20 16:33:49 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 20 Jan 2012 16:33:49 -0600 Subject: [petsc-users] performance surprise In-Reply-To: References: Message-ID: <89DA5FA7-F8C9-415E-8739-A0ECEF137358@mcs.anl.gov> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers Likely you will do best if you use perhaps 1/2 the cores per node. You should experiment by starting with 1 core per node, then use 2 etc until you see the performance peak and the will tell you the sweet spot. Barry On Jan 20, 2012, at 3:36 PM, Jed Brown wrote: > On Fri, Jan 20, 2012 at 15:27, Dominik Szczerba wrote: > I am running some performance tests on a distributed cluster each node > 16 cores (Cray). > I am very surprised to find that my benchmark jobs are about 3x slower when > running on N nodes using all 16 cores than when running on N*16 nodes > using only one core. > > Yes, this is normal. Memory bandwidth is the overwhelming bottleneck for most sparse linear algebra. One core can almost saturate the bandwidth of a socket, so you see little benefit from the extra cores. > > Pay attention to memory bandwidth when you buy computers and try to make your algorithms use a lot of flops per memory access if you want to utilize the floating point hardware you have lying around. > > I find this using 2 independent petsc builds and > they both exibit the same behavior: my own gnu > build and the system module petsc, both 3.2. 
I was so far unable to > build my own petsc version with cray compilers to compare. > > The scheme is relatively complex with a shell matrix and block > preconditioners, transient non-linear problem. I am using boomeramg > from hypre. > > What do you think this unexpected performance may come from? Is it > possible that the node interconnect is faster than the shared memory > bus on the node? I was expecting the exact opposite. > From xyuan at lbl.gov Fri Jan 20 18:02:09 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 16:02:09 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. Message-ID: Hello all, This is a test for np=1 case of the nonzero structure of the jacobian matrix. The jacobian matrix is created via ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); After creation of the jacobian matrix, ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); PetscViewer viewer; char fileName[120]; sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); FILE * fp; ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); ierr = MatView (jacobian, viewer); CHKERRQ (ierr); ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); PetscViewerDestroy(&viewer); I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) -------------- next part -------------- A non-text attachment was scrubbed... Name: jacobian_after_creation.eps Type: image/eps Size: 112025 bytes Desc: not available URL: -------------- next part -------------- Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); where col[0].i = column[4].i; col[1].i = column[5].i; col[2].i = column[6].i; col[3].i = column[9].i; col[4].i = column[10].i; col[5].i = column[12].i; col[0].j = column[4].j; col[1].j = column[5].j; col[2].j = column[6].j; col[3].j = column[9].j; col[4].j = column[10].j; col[5].j = column[12].j; col[0].c = column[4].c; col[1].c = column[5].c; col[2].c = column[6].c; col[3].c = column[9].c; col[4].c = column[10].c; col[5].c = column[12].c; v[0] = value[4]; v[1] = value[5]; v[2] = value[6]; v[3] = value[9]; v[4] = value[10]; v[5] = value[12]; and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives -------------- next part -------------- A non-text attachment was scrubbed... 
Name: jacobian_after_assembly.eps Type: image/eps Size: 28032 bytes Desc: not available URL: -------------- next part -------------- for the true nonzero structures. But the ksp_view will give the nonzeros number as 5776, instead of 1075: linear system matrix = precond matrix: Matrix Object: Mat_0x84000000_1 1 MPI processes type: seqaij rows=100, cols=100 total: nonzeros=5776, allocated nonzeros=5776 It is a waste of memory to have all those values of zeros been stored in the jacobian. Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. Thanks very much! Have a nice weekend! Cheers, Rebecca From mmnasr at gmail.com Fri Jan 20 18:05:00 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 20 Jan 2012 16:05:00 -0800 Subject: [petsc-users] hdf5 and FILE_MODE_APPEND In-Reply-To: References: Message-ID: Thanks to you both Matt and Jed. Mohamad On Fri, Jan 20, 2012 at 2:12 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 4:00 PM, Jed Brown wrote: > >> On Fri, Jan 20, 2012 at 15:55, Matthew Knepley wrote: >> >>> There is no need for append mode. HDF5 does not work that way. Just open >>> it up and write. >> >> >> As written, that would truncate the old file. Making this work might be >> as simple as adding a case for FILE_MODE_APPEND that opened the file >> H5F_ACC_RDWR. >> >> /* Create or open the file collectively */ >> switch (hdf5->btype) { >> case FILE_MODE_READ: >> hdf5->file_id = H5Fopen(name, H5F_ACC_RDONLY, plist_id); >> break; >> case FILE_MODE_WRITE: >> hdf5->file_id = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, >> plist_id); >> break; >> > > I thought we have changed this for PyLith, but we did it differently. This > has been pushed. > > Matt > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 18:09:02 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 18:09:02 -0600 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: > Hello all, > > This is a test for np=1 case of the nonzero structure of the jacobian > matrix. 
The jacobian matrix is created via > > ierr = > DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, > parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, > 0, 0, &da);CHKERRQ(ierr); > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > After creation of the jacobian matrix, > > ierr = MatAssemblyBegin(jacobian, > MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, > MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > PetscViewer viewer; > char fileName[120]; > sprintf(fileName, > "jacobian_after_creation.m");CHKERRQ(ierr); > > FILE * fp; > > ierr = > PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); > ierr = > PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); > ierr = MatView (jacobian, viewer); CHKERRQ (ierr); > ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); > CHKERRQ(ierr); > ierr = > PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); > ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); > PetscViewerDestroy(&viewer); > > I took a look at the structure of the jacobian by storing it in the matlab > format, the matrix has 5776 nonzeros entries, however, those values are all > zeros at the moment as I have not insert or add any values into it yet, the > structure shows: (the following figure shows a global replacement of 0.0 by > 1.0 for those 5776 numbers) > > > > > Inside the FormJacobianLocal() function, I have selected the index to pass > to the nonzero values to jacobian, for example, > > ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, > INSERT_VALUES);CHKERRQ(ierr); > > where > > col[0].i = column[4].i; > col[1].i = column[5].i; > col[2].i = column[6].i; > col[3].i = column[9].i; > col[4].i = column[10].i; > col[5].i = column[12].i; > > > col[0].j = column[4].j; > col[1].j = column[5].j; > col[2].j = column[6].j; > col[3].j = column[9].j; > col[4].j = column[10].j; > col[5].j = column[12].j; > > col[0].c = column[4].c; > col[1].c = column[5].c; > col[2].c = column[6].c; > col[3].c = column[9].c; > col[4].c = column[10].c; > col[5].c = column[12].c; > > v[0] = value[4]; > v[1] = value[5]; > v[2] = value[6]; > v[3] = value[9]; > v[4] = value[10]; > v[5] = value[12]; > > and did not pass the zero entries into the jacobian matrix. However, > after inserting or adding all values to the matrix, by the same routine > above to take a look at the jacobian matrix in matlab format, the matrix > still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other > 4701 numbers are all zeros. The spy() gives > > > > > for the true nonzero structures. > > But the ksp_view will give the nonzeros number as 5776, instead of 1075: > > linear system matrix = precond matrix: > Matrix Object: Mat_0x84000000_1 1 MPI processes > type: seqaij > rows=100, cols=100 > total: nonzeros=5776, allocated nonzeros=5776 > > It is a waste of memory to have all those values of zeros been stored in > the jacobian. > > Is there anyway to get rid of those zero values in jacobian and has the > only nonzero numbers stored in jacobian? In such a case, the ksp_view will > tell that total: nonzeros=1075. > MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); Matt > Thanks very much! > > Have a nice weekend! > > Cheers, > > Rebecca > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
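For reference, the full call takes the matrix as its first argument, and it has to be made before the first MatSetValuesStencil() so the zeros can be ignored as they are set; a sketch using the names from the code above:

    ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
    /* ... only afterwards insert the stencil entries in FormJacobianLocal() ... */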
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 18:28:28 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 16:28:28 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: Message-ID: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> Hello Matt, I have changed the code as ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); but still get the same result as before, the matrix still has 5776 nonzeros: % Size = 100 100 2 % Nonzeros = 5776 3 zzz = zeros(5776,3); Then I switch the order as ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); nothing changed. The version is 3.1-p8. Thanks very much! Best regards, Rebecca On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: > Hello all, > > This is a test for np=1 case of the nonzero structure of the jacobian matrix. The jacobian matrix is created via > > ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > After creation of the jacobian matrix, > > ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > PetscViewer viewer; > char fileName[120]; > sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); > > FILE * fp; > > ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); > ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); > ierr = MatView (jacobian, viewer); CHKERRQ (ierr); > ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); > ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); > ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); > PetscViewerDestroy(&viewer); > > I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) > > > > > Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, > > ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); > > where > > col[0].i = column[4].i; > col[1].i = column[5].i; > col[2].i = column[6].i; > col[3].i = column[9].i; > col[4].i = column[10].i; > col[5].i = column[12].i; > > > col[0].j = column[4].j; > col[1].j = column[5].j; > col[2].j = column[6].j; > col[3].j = column[9].j; > col[4].j = column[10].j; > col[5].j = column[12].j; > > col[0].c = column[4].c; > col[1].c = column[5].c; > col[2].c = column[6].c; > col[3].c = column[9].c; > col[4].c = column[10].c; > col[5].c = column[12].c; > > v[0] = value[4]; > v[1] = value[5]; > v[2] = value[6]; > v[3] = value[9]; > v[4] = value[10]; > v[5] = 
value[12]; > > and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives > > > > > for the true nonzero structures. > > But the ksp_view will give the nonzeros number as 5776, instead of 1075: > > linear system matrix = precond matrix: > Matrix Object: Mat_0x84000000_1 1 MPI processes > type: seqaij > rows=100, cols=100 > total: nonzeros=5776, allocated nonzeros=5776 > > It is a waste of memory to have all those values of zeros been stored in the jacobian. > > Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. > > MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); > > Matt > > Thanks very much! > > Have a nice weekend! > > Cheers, > > Rebecca > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 18:32:25 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 18:32:25 -0600 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> Message-ID: On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > I have changed the code as > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > ierr = MatAssemblyBegin(jacobian, > MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > You have to set it before you start setting values, so we know to ignore them. Matt > but still get the same result as before, the matrix still has 5776 > nonzeros: > > % Size = 100 100 > 2 % Nonzeros = 5776 > 3 zzz = zeros(5776,3); > > Then I switch the order as > > ierr = MatAssemblyBegin(jacobian, > MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > > nothing changed. > > The version is 3.1-p8. > > Thanks very much! > > Best regards, > > Rebecca > > > > > On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: > >> Hello all, >> >> This is a test for np=1 case of the nonzero structure of the jacobian >> matrix. 
The jacobian matrix is created via >> >> ierr = >> DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, >> parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, >> 0, 0, &da);CHKERRQ(ierr); >> >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> >> After creation of the jacobian matrix, >> >> ierr = MatAssemblyBegin(jacobian, >> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, >> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> PetscViewer viewer; >> char fileName[120]; >> sprintf(fileName, >> "jacobian_after_creation.m");CHKERRQ(ierr); >> >> FILE * fp; >> >> ierr = >> PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >> ierr = >> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); >> CHKERRQ(ierr); >> ierr = >> PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >> PetscViewerDestroy(&viewer); >> >> I took a look at the structure of the jacobian by storing it in the >> matlab format, the matrix has 5776 nonzeros entries, however, those values >> are all zeros at the moment as I have not insert or add any values into it >> yet, the structure shows: (the following figure shows a global replacement >> of 0.0 by 1.0 for those 5776 numbers) >> >> >> >> >> Inside the FormJacobianLocal() function, I have selected the index to >> pass to the nonzero values to jacobian, for example, >> >> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, >> INSERT_VALUES);CHKERRQ(ierr); >> >> where >> >> col[0].i = column[4].i; >> col[1].i = column[5].i; >> col[2].i = column[6].i; >> col[3].i = column[9].i; >> col[4].i = column[10].i; >> col[5].i = column[12].i; >> >> >> col[0].j = column[4].j; >> col[1].j = column[5].j; >> col[2].j = column[6].j; >> col[3].j = column[9].j; >> col[4].j = column[10].j; >> col[5].j = column[12].j; >> >> col[0].c = column[4].c; >> col[1].c = column[5].c; >> col[2].c = column[6].c; >> col[3].c = column[9].c; >> col[4].c = column[10].c; >> col[5].c = column[12].c; >> >> v[0] = value[4]; >> v[1] = value[5]; >> v[2] = value[6]; >> v[3] = value[9]; >> v[4] = value[10]; >> v[5] = value[12]; >> >> and did not pass the zero entries into the jacobian matrix. However, >> after inserting or adding all values to the matrix, by the same routine >> above to take a look at the jacobian matrix in matlab format, the matrix >> still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other >> 4701 numbers are all zeros. The spy() gives >> >> >> >> >> for the true nonzero structures. >> >> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >> >> linear system matrix = precond matrix: >> Matrix Object: Mat_0x84000000_1 1 MPI processes >> type: seqaij >> rows=100, cols=100 >> total: nonzeros=5776, allocated nonzeros=5776 >> >> It is a waste of memory to have all those values of zeros been stored in >> the jacobian. >> >> Is there anyway to get rid of those zero values in jacobian and has the >> only nonzero numbers stored in jacobian? In such a case, the ksp_view will >> tell that total: nonzeros=1075. >> > > MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); > > Matt > > >> Thanks very much! >> >> Have a nice weekend! 
>> >> Cheers, >> >> Rebecca >> >> >> >> >> >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 18:55:38 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 16:55:38 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> Message-ID: <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Hello Matt, I tried several times for 3.1-p8 and dev version by putting MatSetOption 1) right after creation of the matrix: #ifdef petscDev ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); #else ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); #endif ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); 2) at the beginning of the FormJacobianLocal() routine: PetscFunctionBegin; ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); None of those works. What is wrong here? Thanks, R On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > I have changed the code as > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > You have to set it before you start setting values, so we know to ignore them. > > Matt > > but still get the same result as before, the matrix still has 5776 nonzeros: > > % Size = 100 100 > 2 % Nonzeros = 5776 > 3 zzz = zeros(5776,3); > > Then I switch the order as > > ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > nothing changed. > > The version is 3.1-p8. > > Thanks very much! > > Best regards, > > Rebecca > > > > > On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >> Hello all, >> >> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >> >> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >> >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> >> After creation of the jacobian matrix, >> >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> PetscViewer viewer; >> char fileName[120]; >> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >> >> FILE * fp; >> >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >> PetscViewerDestroy(&viewer); >> >> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >> >> >> >> >> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >> >> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >> >> where >> >> col[0].i = column[4].i; >> col[1].i = column[5].i; >> col[2].i = column[6].i; >> col[3].i = column[9].i; >> col[4].i = column[10].i; >> col[5].i = column[12].i; >> >> >> col[0].j = column[4].j; >> col[1].j = column[5].j; >> col[2].j = column[6].j; >> col[3].j = column[9].j; >> col[4].j = column[10].j; >> col[5].j = column[12].j; >> >> col[0].c = column[4].c; >> col[1].c = column[5].c; >> col[2].c = column[6].c; >> col[3].c = column[9].c; >> col[4].c = column[10].c; >> col[5].c = column[12].c; >> >> v[0] = value[4]; >> v[1] = value[5]; >> v[2] = value[6]; >> v[3] = value[9]; >> v[4] = value[10]; >> v[5] = value[12]; >> >> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >> >> >> >> >> for the true nonzero structures. >> >> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >> >> linear system matrix = precond matrix: >> Matrix Object: Mat_0x84000000_1 1 MPI processes >> type: seqaij >> rows=100, cols=100 >> total: nonzeros=5776, allocated nonzeros=5776 >> >> It is a waste of memory to have all those values of zeros been stored in the jacobian. >> >> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >> >> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >> >> Matt >> >> Thanks very much! >> >> Have a nice weekend! 
>> >> Cheers, >> >> Rebecca >> >> >> >> >> >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 20 19:01:15 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 19:01:15 -0600 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > I tried several times for 3.1-p8 and dev version by putting MatSetOption > Are you sure your entries are exactly 0.0? Matt > 1) right after creation of the matrix: > > #ifdef petscDev > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, > &jacobian);CHKERRQ(ierr); > #else > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, > &jacobian);CHKERRQ(ierr); > #endif > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > > 2) at the beginning of the FormJacobianLocal() routine: > > PetscFunctionBegin; > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > > 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > ierr = MatAssemblyBegin(jacobian, > MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > None of those works. What is wrong here? > > Thanks, > > R > > > On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: > >> Hello Matt, >> >> I have changed the code as >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >> PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, >> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> > > You have to set it before you start setting values, so we know to ignore > them. > > Matt > > >> but still get the same result as before, the matrix still has 5776 >> nonzeros: >> >> % Size = 100 100 >> 2 % Nonzeros = 5776 >> 3 zzz = zeros(5776,3); >> >> Then I switch the order as >> >> ierr = MatAssemblyBegin(jacobian, >> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >> PETSC_TRUE);CHKERRQ(ierr); >> >> nothing changed. >> >> The version is 3.1-p8. >> >> Thanks very much! >> >> Best regards, >> >> Rebecca >> >> >> >> >> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >> >> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >> >>> Hello all, >>> >>> This is a test for np=1 case of the nonzero structure of the jacobian >>> matrix. 
The jacobian matrix is created via >>> >>> ierr = >>> DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, >>> parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, >>> 0, 0, &da);CHKERRQ(ierr); >>> >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> >>> After creation of the jacobian matrix, >>> >>> ierr = MatAssemblyBegin(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> PetscViewer viewer; >>> char fileName[120]; >>> sprintf(fileName, >>> "jacobian_after_creation.m");CHKERRQ(ierr); >>> >>> FILE * fp; >>> >>> ierr = >>> PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>> ierr = >>> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); >>> CHKERRQ(ierr); >>> ierr = >>> PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>> PetscViewerDestroy(&viewer); >>> >>> I took a look at the structure of the jacobian by storing it in the >>> matlab format, the matrix has 5776 nonzeros entries, however, those values >>> are all zeros at the moment as I have not insert or add any values into it >>> yet, the structure shows: (the following figure shows a global replacement >>> of 0.0 by 1.0 for those 5776 numbers) >>> >>> >>> >>> >>> Inside the FormJacobianLocal() function, I have selected the index to >>> pass to the nonzero values to jacobian, for example, >>> >>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, >>> INSERT_VALUES);CHKERRQ(ierr); >>> >>> where >>> >>> col[0].i = column[4].i; >>> col[1].i = column[5].i; >>> col[2].i = column[6].i; >>> col[3].i = column[9].i; >>> col[4].i = column[10].i; >>> col[5].i = column[12].i; >>> >>> >>> col[0].j = column[4].j; >>> col[1].j = column[5].j; >>> col[2].j = column[6].j; >>> col[3].j = column[9].j; >>> col[4].j = column[10].j; >>> col[5].j = column[12].j; >>> >>> col[0].c = column[4].c; >>> col[1].c = column[5].c; >>> col[2].c = column[6].c; >>> col[3].c = column[9].c; >>> col[4].c = column[10].c; >>> col[5].c = column[12].c; >>> >>> v[0] = value[4]; >>> v[1] = value[5]; >>> v[2] = value[6]; >>> v[3] = value[9]; >>> v[4] = value[10]; >>> v[5] = value[12]; >>> >>> and did not pass the zero entries into the jacobian matrix. However, >>> after inserting or adding all values to the matrix, by the same routine >>> above to take a look at the jacobian matrix in matlab format, the matrix >>> still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other >>> 4701 numbers are all zeros. The spy() gives >>> >>> >>> >>> >>> for the true nonzero structures. >>> >>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>> >>> linear system matrix = precond matrix: >>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>> type: seqaij >>> rows=100, cols=100 >>> total: nonzeros=5776, allocated nonzeros=5776 >>> >>> It is a waste of memory to have all those values of zeros been stored in >>> the jacobian. >>> >>> Is there anyway to get rid of those zero values in jacobian and has the >>> only nonzero numbers stored in jacobian? In such a case, the ksp_view will >>> tell that total: nonzeros=1075. >>> >> >> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >> >> Matt >> >> >>> Thanks very much! >>> >>> Have a nice weekend! 
>>> >>> Cheers, >>> >>> Rebecca >>> >>> >>> >>> >>> >>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 19:05:34 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 17:05:34 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: <534363CA-0962-4D5F-BD7D-AA54911AA20B@lbl.gov> > On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > I tried several times for 3.1-p8 and dev version by putting MatSetOption > > Are you sure your entries are exactly 0.0? > Yes, because I have looked at the output, they are 0.00000000000000e+00. R > Matt > > 1) right after creation of the matrix: > > #ifdef petscDev > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > #else > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > #endif > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > 2) at the beginning of the FormJacobianLocal() routine: > > PetscFunctionBegin; > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > None of those works. What is wrong here? > > Thanks, > > R > > > On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >> Hello Matt, >> >> I have changed the code as >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> You have to set it before you start setting values, so we know to ignore them. >> >> Matt >> >> but still get the same result as before, the matrix still has 5776 nonzeros: >> >> % Size = 100 100 >> 2 % Nonzeros = 5776 >> 3 zzz = zeros(5776,3); >> >> Then I switch the order as >> >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> nothing changed. >> >> The version is 3.1-p8. >> >> Thanks very much! >> >> Best regards, >> >> Rebecca >> >> >> >> >> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello all, >>> >>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >>> >>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>> >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> >>> After creation of the jacobian matrix, >>> >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> PetscViewer viewer; >>> char fileName[120]; >>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>> >>> FILE * fp; >>> >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>> PetscViewerDestroy(&viewer); >>> >>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>> >>> >>> >>> >>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>> >>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>> >>> where >>> >>> col[0].i = column[4].i; >>> col[1].i = column[5].i; >>> col[2].i = column[6].i; >>> col[3].i = column[9].i; >>> col[4].i = column[10].i; >>> col[5].i = column[12].i; >>> >>> >>> col[0].j = column[4].j; >>> col[1].j = column[5].j; >>> col[2].j = column[6].j; >>> col[3].j = column[9].j; >>> col[4].j = column[10].j; >>> col[5].j = column[12].j; >>> >>> col[0].c = column[4].c; >>> col[1].c = column[5].c; >>> col[2].c = column[6].c; >>> col[3].c = column[9].c; >>> col[4].c = column[10].c; >>> col[5].c = column[12].c; >>> >>> v[0] = value[4]; >>> v[1] = value[5]; >>> v[2] = value[6]; >>> v[3] = value[9]; >>> v[4] = value[10]; >>> v[5] = value[12]; >>> >>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>> >>> >>> >>> >>> for the true nonzero structures. >>> >>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>> >>> linear system matrix = precond matrix: >>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>> type: seqaij >>> rows=100, cols=100 >>> total: nonzeros=5776, allocated nonzeros=5776 >>> >>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>> >>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>> >>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>> >>> Matt >>> >>> Thanks very much! >>> >>> Have a nice weekend! 
>>> >>> Cheers, >>> >>> Rebecca >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 19:07:46 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 17:07:46 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: Here is the output: On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: > Hello Matt, > > I tried several times for 3.1-p8 and dev version by putting MatSetOption > > Are you sure your entries are exactly 0.0? > > Matt > > 1) right after creation of the matrix: > > #ifdef petscDev > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > #else > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > #endif > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > 2) at the beginning of the FormJacobianLocal() routine: > > PetscFunctionBegin; > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > None of those works. What is wrong here? > > Thanks, > > R > > > On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >> Hello Matt, >> >> I have changed the code as >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> You have to set it before you start setting values, so we know to ignore them. >> >> Matt >> >> but still get the same result as before, the matrix still has 5776 nonzeros: >> >> % Size = 100 100 >> 2 % Nonzeros = 5776 >> 3 zzz = zeros(5776,3); >> >> Then I switch the order as >> >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> nothing changed. >> >> The version is 3.1-p8. >> >> Thanks very much! >> >> Best regards, >> >> Rebecca >> >> >> >> >> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello all, >>> >>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >>> >>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>> >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> >>> After creation of the jacobian matrix, >>> >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> PetscViewer viewer; >>> char fileName[120]; >>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>> >>> FILE * fp; >>> >>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>> PetscViewerDestroy(&viewer); >>> >>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>> >>> >>> >>> >>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>> >>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>> >>> where >>> >>> col[0].i = column[4].i; >>> col[1].i = column[5].i; >>> col[2].i = column[6].i; >>> col[3].i = column[9].i; >>> col[4].i = column[10].i; >>> col[5].i = column[12].i; >>> >>> >>> col[0].j = column[4].j; >>> col[1].j = column[5].j; >>> col[2].j = column[6].j; >>> col[3].j = column[9].j; >>> col[4].j = column[10].j; >>> col[5].j = column[12].j; >>> >>> col[0].c = column[4].c; >>> col[1].c = column[5].c; >>> col[2].c = column[6].c; >>> col[3].c = column[9].c; >>> col[4].c = column[10].c; >>> col[5].c = column[12].c; >>> >>> v[0] = value[4]; >>> v[1] = value[5]; >>> v[2] = value[6]; >>> v[3] = value[9]; >>> v[4] = value[10]; >>> v[5] = value[12]; >>> >>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>> >>> >>> >>> >>> for the true nonzero structures. >>> >>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>> >>> linear system matrix = precond matrix: >>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>> type: seqaij >>> rows=100, cols=100 >>> total: nonzeros=5776, allocated nonzeros=5776 >>> >>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>> >>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>> >>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>> >>> Matt >>> >>> Thanks very much! >>> >>> Have a nice weekend! 
>>> >>> Cheers, >>> >>> Rebecca >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jacobian_tx5_ty5_x5_y5_nl1_gt100_ot2_di10.m Type: application/octet-stream Size: 173374 bytes Desc: not available URL: From knepley at gmail.com Fri Jan 20 19:23:43 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 20 Jan 2012 19:23:43 -0600 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > Here is the output: > > > > > > On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: > >> Hello Matt, >> >> I tried several times for 3.1-p8 and dev version by putting MatSetOption >> > > Are you sure your entries are exactly 0.0? > > Are you using ADD_VALUES? http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 Matt > Matt > > >> 1) right after creation of the matrix: >> >> #ifdef petscDev >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, >> &jacobian);CHKERRQ(ierr); >> #else >> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, >> &jacobian);CHKERRQ(ierr); >> #endif >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >> PETSC_TRUE);CHKERRQ(ierr); >> >> 2) at the beginning of the FormJacobianLocal() routine: >> >> PetscFunctionBegin; >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >> PETSC_TRUE);CHKERRQ(ierr); >> >> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >> PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, >> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> None of those works. What is wrong here? >> >> Thanks, >> >> R >> >> >> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >> >> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >> >>> Hello Matt, >>> >>> I have changed the code as >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >> >> You have to set it before you start setting values, so we know to ignore >> them. 
>> >> Matt >> >> >>> but still get the same result as before, the matrix still has 5776 >>> nonzeros: >>> >>> % Size = 100 100 >>> 2 % Nonzeros = 5776 >>> 3 zzz = zeros(5776,3); >>> >>> Then I switch the order as >>> >>> ierr = MatAssemblyBegin(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >>> nothing changed. >>> >>> The version is 3.1-p8. >>> >>> Thanks very much! >>> >>> Best regards, >>> >>> Rebecca >>> >>> >>> >>> >>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>> >>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>> >>>> Hello all, >>>> >>>> This is a test for np=1 case of the nonzero structure of the jacobian >>>> matrix. The jacobian matrix is created via >>>> >>>> ierr = >>>> DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, >>>> parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, >>>> 0, 0, &da);CHKERRQ(ierr); >>>> >>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>> >>>> After creation of the jacobian matrix, >>>> >>>> ierr = MatAssemblyBegin(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> PetscViewer viewer; >>>> char fileName[120]; >>>> sprintf(fileName, >>>> "jacobian_after_creation.m");CHKERRQ(ierr); >>>> >>>> FILE * fp; >>>> >>>> ierr = >>>> PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>> ierr = >>>> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); >>>> CHKERRQ(ierr); >>>> ierr = >>>> PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>> PetscViewerDestroy(&viewer); >>>> >>>> I took a look at the structure of the jacobian by storing it in the >>>> matlab format, the matrix has 5776 nonzeros entries, however, those values >>>> are all zeros at the moment as I have not insert or add any values into it >>>> yet, the structure shows: (the following figure shows a global replacement >>>> of 0.0 by 1.0 for those 5776 numbers) >>>> >>>> >>>> >>>> >>>> Inside the FormJacobianLocal() function, I have selected the index to >>>> pass to the nonzero values to jacobian, for example, >>>> >>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, >>>> INSERT_VALUES);CHKERRQ(ierr); >>>> >>>> where >>>> >>>> col[0].i = column[4].i; >>>> col[1].i = column[5].i; >>>> col[2].i = column[6].i; >>>> col[3].i = column[9].i; >>>> col[4].i = column[10].i; >>>> col[5].i = column[12].i; >>>> >>>> >>>> col[0].j = column[4].j; >>>> col[1].j = column[5].j; >>>> col[2].j = column[6].j; >>>> col[3].j = column[9].j; >>>> col[4].j = column[10].j; >>>> col[5].j = column[12].j; >>>> >>>> col[0].c = column[4].c; >>>> col[1].c = column[5].c; >>>> col[2].c = column[6].c; >>>> col[3].c = column[9].c; >>>> col[4].c = column[10].c; >>>> col[5].c = column[12].c; >>>> >>>> v[0] = value[4]; >>>> v[1] = value[5]; >>>> v[2] = value[6]; >>>> v[3] = value[9]; >>>> v[4] = value[10]; >>>> v[5] = value[12]; >>>> >>>> and did not pass the zero entries into the jacobian matrix. 
However, >>>> after inserting or adding all values to the matrix, by the same routine >>>> above to take a look at the jacobian matrix in matlab format, the matrix >>>> still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other >>>> 4701 numbers are all zeros. The spy() gives >>>> >>>> >>>> >>>> >>>> for the true nonzero structures. >>>> >>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>> >>>> linear system matrix = precond matrix: >>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>> type: seqaij >>>> rows=100, cols=100 >>>> total: nonzeros=5776, allocated nonzeros=5776 >>>> >>>> It is a waste of memory to have all those values of zeros been stored >>>> in the jacobian. >>>> >>>> Is there anyway to get rid of those zero values in jacobian and has the >>>> only nonzero numbers stored in jacobian? In such a case, the ksp_view will >>>> tell that total: nonzeros=1075. >>>> >>> >>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>> >>> Matt >>> >>> >>>> Thanks very much! >>>> >>>> Have a nice weekend! >>>> >>>> Cheers, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 19:28:35 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 17:28:35 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: I did use ierr = MatSetValuesStencil(jacobian, 1, &row, 13, column, value, INSERT_VALUES);CHKERRQ(ierr); so it is INSERT_VALUES. Let me try to use ADD_VALUES instead of INSERT_VALUES, and see if this will make any difference. Thanks, R On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > Here is the output: > > > > > > On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >> Hello Matt, >> >> I tried several times for 3.1-p8 and dev version by putting MatSetOption >> >> Are you sure your entries are exactly 0.0? > > > Are you using ADD_VALUES? 
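(For context on the ADD_VALUES question: as far as I recall, the guard at the linked line of MatSetValues_SeqAIJ has roughly the following shape; treat this as an approximation and check the linked source for the exact condition:

    if (ignorezeroentries && value == 0.0 && (is == ADD_VALUES)) continue;

i.e. a zero is only skipped when it is being added, which would mean a zero passed with INSERT_VALUES still gets stored.)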
> > http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 > > Matt > >> Matt >> >> 1) right after creation of the matrix: >> >> #ifdef petscDev >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #else >> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #endif >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 2) at the beginning of the FormJacobianLocal() routine: >> >> PetscFunctionBegin; >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> None of those works. What is wrong here? >> >> Thanks, >> >> R >> >> >> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I have changed the code as >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> You have to set it before you start setting values, so we know to ignore them. >>> >>> Matt >>> >>> but still get the same result as before, the matrix still has 5776 nonzeros: >>> >>> % Size = 100 100 >>> 2 % Nonzeros = 5776 >>> 3 zzz = zeros(5776,3); >>> >>> Then I switch the order as >>> >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> nothing changed. >>> >>> The version is 3.1-p8. >>> >>> Thanks very much! >>> >>> Best regards, >>> >>> Rebecca >>> >>> >>> >>> >>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello all, >>>> >>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >>>> >>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>> >>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>> >>>> After creation of the jacobian matrix, >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> PetscViewer viewer; >>>> char fileName[120]; >>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>> >>>> FILE * fp; >>>> >>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>> PetscViewerDestroy(&viewer); >>>> >>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>> >>>> >>>> >>>> >>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>> >>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>> >>>> where >>>> >>>> col[0].i = column[4].i; >>>> col[1].i = column[5].i; >>>> col[2].i = column[6].i; >>>> col[3].i = column[9].i; >>>> col[4].i = column[10].i; >>>> col[5].i = column[12].i; >>>> >>>> >>>> col[0].j = column[4].j; >>>> col[1].j = column[5].j; >>>> col[2].j = column[6].j; >>>> col[3].j = column[9].j; >>>> col[4].j = column[10].j; >>>> col[5].j = column[12].j; >>>> >>>> col[0].c = column[4].c; >>>> col[1].c = column[5].c; >>>> col[2].c = column[6].c; >>>> col[3].c = column[9].c; >>>> col[4].c = column[10].c; >>>> col[5].c = column[12].c; >>>> >>>> v[0] = value[4]; >>>> v[1] = value[5]; >>>> v[2] = value[6]; >>>> v[3] = value[9]; >>>> v[4] = value[10]; >>>> v[5] = value[12]; >>>> >>>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>> >>>> >>>> >>>> >>>> for the true nonzero structures. >>>> >>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>> >>>> linear system matrix = precond matrix: >>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>> type: seqaij >>>> rows=100, cols=100 >>>> total: nonzeros=5776, allocated nonzeros=5776 >>>> >>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>>> >>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>>> >>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>> >>>> Matt >>>> >>>> Thanks very much! >>>> >>>> Have a nice weekend! 
>>>> >>>> Cheers, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Fri Jan 20 19:32:27 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Fri, 20 Jan 2012 17:32:27 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: <9D401E15-131B-4A39-AB76-313D58223C55@lbl.gov> I have replace INSERT_VALUES by ADD_VALUES, ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, ADD_VALUES);CHKERRQ(ierr); but still cannot get rid of those zeros... Cheers, R On Jan 20, 2012, at 5:28 PM, Xuefei (Rebecca) Yuan wrote: > I did use > > ierr = MatSetValuesStencil(jacobian, 1, &row, 13, column, value, INSERT_VALUES);CHKERRQ(ierr); > > > so it is INSERT_VALUES. > > Let me try to use ADD_VALUES instead of INSERT_VALUES, and see if this will make any difference. > > Thanks, > > R > > > > On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: >> Here is the output: >> >> >> >> >> >> On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I tried several times for 3.1-p8 and dev version by putting MatSetOption >>> >>> Are you sure your entries are exactly 0.0? >> >> >> Are you using ADD_VALUES? >> >> http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 >> >> Matt >> >>> Matt >>> >>> 1) right after creation of the matrix: >>> >>> #ifdef petscDev >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #else >>> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #endif >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 2) at the beginning of the FormJacobianLocal() routine: >>> >>> PetscFunctionBegin; >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> None of those works. What is wrong here? 
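A short aside on what MAT_IGNORE_ZERO_ENTRIES can and cannot do may help at this point; the loop below is an illustrative sketch, not code from the thread, and reuses the names (row, column, value) of the Jacobian routine quoted above. The option only acts on values passed to MatSetValues()/MatSetValuesStencil() after it has been set: entries that are exactly 0.0 are then skipped. It cannot remove locations that are already stored, and a matrix obtained from DAGetMatrix()/DMGetMatrix() already has the complete box-stencil pattern stored at creation time -- the view quoted earlier reports 5776 stored entries, all equal to zero, before any user insertion -- which would explain why none of the three placements changes the count.

       /* set the option once, immediately after the matrix exists and before any insertion */
       ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr);
       ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);

       /* inside FormJacobianLocal(): hand over only the entries that are truly nonzero */
       PetscInt    k, nc = 0;
       MatStencil  col[13];
       PetscScalar v[13];
       for (k = 0; k < 13; k++) {
         if (value[k] != 0.0) { col[nc] = column[k]; v[nc] = value[k]; nc++; }
       }
       ierr = MatSetValuesStencil(jacobian, 1, &row, nc, col, v, INSERT_VALUES);CHKERRQ(ierr);

This keeps explicit zeros out of a matrix whose pattern has not been pre-filled; for the DA-created Jacobian the pattern itself would have to be narrowed at creation time before the stored-zero count can drop.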
>>> >>> Thanks, >>> >>> R >>> >>> >>> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello Matt, >>>> >>>> I have changed the code as >>>> >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> You have to set it before you start setting values, so we know to ignore them. >>>> >>>> Matt >>>> >>>> but still get the same result as before, the matrix still has 5776 nonzeros: >>>> >>>> % Size = 100 100 >>>> 2 % Nonzeros = 5776 >>>> 3 zzz = zeros(5776,3); >>>> >>>> Then I switch the order as >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> >>>> nothing changed. >>>> >>>> The version is 3.1-p8. >>>> >>>> Thanks very much! >>>> >>>> Best regards, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>>> >>>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>>> Hello all, >>>>> >>>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. The jacobian matrix is created via >>>>> >>>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>>> >>>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>>> >>>>> After creation of the jacobian matrix, >>>>> >>>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> >>>>> PetscViewer viewer; >>>>> char fileName[120]; >>>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>>> >>>>> FILE * fp; >>>>> >>>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>>> PetscViewerDestroy(&viewer); >>>>> >>>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>>> >>>>> >>>>> >>>>> >>>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>>> >>>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>>> >>>>> where >>>>> >>>>> col[0].i = column[4].i; >>>>> col[1].i = column[5].i; >>>>> col[2].i = column[6].i; >>>>> col[3].i = column[9].i; >>>>> col[4].i = column[10].i; >>>>> col[5].i = column[12].i; >>>>> >>>>> >>>>> col[0].j = column[4].j; >>>>> col[1].j = column[5].j; >>>>> col[2].j = column[6].j; >>>>> col[3].j = column[9].j; >>>>> col[4].j = column[10].j; >>>>> col[5].j = column[12].j; >>>>> >>>>> 
col[0].c = column[4].c; >>>>> col[1].c = column[5].c; >>>>> col[2].c = column[6].c; >>>>> col[3].c = column[9].c; >>>>> col[4].c = column[10].c; >>>>> col[5].c = column[12].c; >>>>> >>>>> v[0] = value[4]; >>>>> v[1] = value[5]; >>>>> v[2] = value[6]; >>>>> v[3] = value[9]; >>>>> v[4] = value[10]; >>>>> v[5] = value[12]; >>>>> >>>>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>>> >>>>> >>>>> >>>>> >>>>> for the true nonzero structures. >>>>> >>>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>>> >>>>> linear system matrix = precond matrix: >>>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>>> type: seqaij >>>>> rows=100, cols=100 >>>>> total: nonzeros=5776, allocated nonzeros=5776 >>>>> >>>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>>>> >>>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>>>> >>>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>>> >>>>> Matt >>>>> >>>>> Thanks very much! >>>>> >>>>> Have a nice weekend! >>>>> >>>>> Cheers, >>>>> >>>>> Rebecca >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Sat Jan 21 10:50:24 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 21 Jan 2012 17:50:24 +0100 Subject: [petsc-users] performance surprise In-Reply-To: <89DA5FA7-F8C9-415E-8739-A0ECEF137358@mcs.anl.gov> References: <89DA5FA7-F8C9-415E-8739-A0ECEF137358@mcs.anl.gov> Message-ID: Jed, Barry: thanks for the responses. I was aware of the related FAQ entry but somehow thought only standard desktop SMPs were concerned. I never suspected interconnect communication to be better than internal memory bus... Thanks Dominik From jedbrown at mcs.anl.gov Sat Jan 21 10:53:57 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 21 Jan 2012 10:53:57 -0600 Subject: [petsc-users] performance surprise In-Reply-To: References: <89DA5FA7-F8C9-415E-8739-A0ECEF137358@mcs.anl.gov> Message-ID: On Sat, Jan 21, 2012 at 10:50, Dominik Szczerba wrote: > Jed, Barry: thanks for the responses. I was aware of the related FAQ > entry but somehow thought only standard desktop SMPs were concerned. 
I > never suspected interconnect communication to be better than internal > memory bus... > You are conflating memory bandwidth with network bandwidth. You need memory bandwidth if you have a working set larger than cache, but each socket only has enough memory bandwidth for one or two cores, so adding more cores doesn't speed up your program. Network performance is a separate issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Sat Jan 21 12:19:54 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Sat, 21 Jan 2012 10:19:54 -0800 Subject: [petsc-users] The matrix allocated by calling MatMPIAIJSetProallocation() was flushed out after calling DMMGSetSNESLocal() In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: <00188A05-D954-4302-B32A-692CB276AD79@lbl.gov> Hello, After I was not able to get rid of the zeros in the matrix by >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); I would like to create the matrix by the following routine: ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)(info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, PETSC_NULL);CHKERRQ(ierr); instead of using ierr = DMGetMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); to save the memory. However, I found that the routine DMGetMatrix() was still called inside DMMGSetSNESLocal(), therefore, the matrix I created was useless: ********************************************* Breakpoint 2, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:255 255 ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); (gdb) n 256 ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); (gdb) 257 ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)(info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); (gdb) 258 ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, PETSC_NULL);CHKERRQ(ierr); (gdb) Breakpoint 3, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:260 260 ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); (gdb) s DMMGSetSNESLocal_Private (dmmg=0x102004480, function=0x100016ba9 , jacobian=0x100050e83 , ad_function=0, admf_function=0) at damgsnes.c:932 932 PetscInt i,nlevels = dmmg[0]->nlevels; (gdb) n 934 PetscErrorCode (*computejacobian)(SNES,Vec,Mat*,Mat*,MatStructure*,void*) = 0; (gdb) s 937 PetscFunctionBegin; (gdb) n 938 if (jacobian) computejacobian = DMMGComputeJacobian; (gdb) 942 CHKMEMQ; (gdb) 943 ierr = PetscObjectGetCookie((PetscObject) dmmg[0]->dm,&cookie);CHKERRQ(ierr); (gdb) 944 if (cookie == DM_COOKIE) { (gdb) 948 ierr = PetscOptionsHasName(PETSC_NULL, "-dmmg_form_function_ghost", &flag);CHKERRQ(ierr); (gdb) 949 if (flag) { (gdb) 952 ierr = DMMGSetSNES(dmmg,DMMGFormFunction,computejacobian);CHKERRQ(ierr); (gdb) Matrix Object: type=seqaij, rows=100, cols=100 total: nonzeros=5776, allocated nonzeros=5776 using I-node routines: found 25 nodes, limit used is 5 954 for (i=0; i On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > Here is the output: > > > > > > On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >> Hello Matt, >> >> I tried several times for 3.1-p8 and dev version by 
putting MatSetOption >> >> Are you sure your entries are exactly 0.0? > > > Are you using ADD_VALUES? > > http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 > > Matt > >> Matt >> >> 1) right after creation of the matrix: >> >> #ifdef petscDev >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #else >> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #endif >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 2) at the beginning of the FormJacobianLocal() routine: >> >> PetscFunctionBegin; >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> None of those works. What is wrong here? >> >> Thanks, >> >> R >> >> >> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I have changed the code as >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> You have to set it before you start setting values, so we know to ignore them. >>> >>> Matt >>> >>> but still get the same result as before, the matrix still has 5776 nonzeros: >>> >>> % Size = 100 100 >>> 2 % Nonzeros = 5776 >>> 3 zzz = zeros(5776,3); >>> >>> Then I switch the order as >>> >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> nothing changed. >>> >>> The version is 3.1-p8. >>> >>> Thanks very much! >>> >>> Best regards, >>> >>> Rebecca >>> >>> >>> >>> >>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello all, >>>> >>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >>>> >>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>> >>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>> >>>> After creation of the jacobian matrix, >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> PetscViewer viewer; >>>> char fileName[120]; >>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>> >>>> FILE * fp; >>>> >>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>> PetscViewerDestroy(&viewer); >>>> >>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>> >>>> >>>> >>>> >>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>> >>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>> >>>> where >>>> >>>> col[0].i = column[4].i; >>>> col[1].i = column[5].i; >>>> col[2].i = column[6].i; >>>> col[3].i = column[9].i; >>>> col[4].i = column[10].i; >>>> col[5].i = column[12].i; >>>> >>>> >>>> col[0].j = column[4].j; >>>> col[1].j = column[5].j; >>>> col[2].j = column[6].j; >>>> col[3].j = column[9].j; >>>> col[4].j = column[10].j; >>>> col[5].j = column[12].j; >>>> >>>> col[0].c = column[4].c; >>>> col[1].c = column[5].c; >>>> col[2].c = column[6].c; >>>> col[3].c = column[9].c; >>>> col[4].c = column[10].c; >>>> col[5].c = column[12].c; >>>> >>>> v[0] = value[4]; >>>> v[1] = value[5]; >>>> v[2] = value[6]; >>>> v[3] = value[9]; >>>> v[4] = value[10]; >>>> v[5] = value[12]; >>>> >>>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>> >>>> >>>> >>>> >>>> for the true nonzero structures. >>>> >>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>> >>>> linear system matrix = precond matrix: >>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>> type: seqaij >>>> rows=100, cols=100 >>>> total: nonzeros=5776, allocated nonzeros=5776 >>>> >>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>>> >>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>>> >>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>> >>>> Matt >>>> >>>> Thanks very much! >>>> >>>> Have a nice weekend! 
>>>> >>>> Cheers, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 21 12:27:30 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 21 Jan 2012 12:27:30 -0600 Subject: [petsc-users] The matrix allocated by calling MatMPIAIJSetProallocation() was flushed out after calling DMMGSetSNESLocal() In-Reply-To: <00188A05-D954-4302-B32A-692CB276AD79@lbl.gov> References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> <00188A05-D954-4302-B32A-692CB276AD79@lbl.gov> Message-ID: On Sat, Jan 21, 2012 at 12:19 PM, Xuefei (Rebecca) Yuan wrote: > Hello, > > After I was not able to get rid of the zeros in the matrix by > > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >> > > I would like to create the matrix by the following routine: > > ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); > ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); > ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)( > info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, > PETSC_NULL);CHKERRQ(ierr); > > instead of using > > ierr = DMGetMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > to save the memory. However, I found that the routine DMGetMatrix() was > still called inside DMMGSetSNESLocal(), therefore, the matrix I created was > useless: > What exactly are you getting out of using the DM? If it does not express the pattern of your problem, why use it? 
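For what it is worth, a rough sketch of one alternative this question points at -- driving SNES directly so that the hand-preallocated matrix is the one actually used -- is given below. It is not spelled out anywhere in this thread, it gives up DMMG's management of grid levels, and the names N, user, FormFunction and FormJacobian are illustrative only; FormFunction/FormJacobian here are the plain SNES callbacks, not the *Local variants used with DMMGSetSNESLocal().

       SNES snes;
       Mat  J;
       Vec  x, r;
       ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
       ierr = MatCreate(PETSC_COMM_WORLD, &J);CHKERRQ(ierr);
       ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
       ierr = MatSetType(J, MATMPIAIJ);CHKERRQ(ierr);
       ierr = MatMPIAIJSetPreallocation(J, 11, PETSC_NULL, 18, PETSC_NULL);CHKERRQ(ierr);
       ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &x);CHKERRQ(ierr);
       ierr = VecDuplicate(x, &r);CHKERRQ(ierr);
       /* 'user' can carry the DA, which FormFunction still needs for ghost updates;
          FormJacobian would then set entries by global index (MatSetValues)
          rather than through the stencil interface */
       ierr = SNESSetFunction(snes, r, FormFunction, &user);CHKERRQ(ierr);
       ierr = SNESSetJacobian(snes, J, J, FormJacobian, &user);CHKERRQ(ierr);
       ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
       ierr = SNESSolve(snes, PETSC_NULL, x);CHKERRQ(ierr);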
Matt > ********************************************* > > Breakpoint 2, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:255 > 255 ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); > (gdb) n > 256 ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); > (gdb) > 257 ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)( > info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); > (gdb) > 258 ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, > PETSC_NULL);CHKERRQ(ierr); > (gdb) > > Breakpoint 3, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:260 > 260 ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > (gdb) s > DMMGSetSNESLocal_Private (dmmg=0x102004480, function=0x100016ba9 > , jacobian=0x100050e83 , > ad_function=0, admf_function=0) at damgsnes.c:932 > 932 PetscInt i,nlevels = dmmg[0]->nlevels; > (gdb) n > 934 PetscErrorCode > (*computejacobian)(SNES,Vec,Mat*,Mat*,MatStructure*,void*) = 0; > (gdb) s > 937 PetscFunctionBegin; > (gdb) n > 938 if (jacobian) computejacobian = DMMGComputeJacobian; > (gdb) > 942 CHKMEMQ; > (gdb) > 943 ierr = PetscObjectGetCookie((PetscObject) > dmmg[0]->dm,&cookie);CHKERRQ(ierr); > (gdb) > 944 if (cookie == DM_COOKIE) { > (gdb) > 948 ierr = PetscOptionsHasName(PETSC_NULL, > "-dmmg_form_function_ghost", &flag);CHKERRQ(ierr); > (gdb) > 949 if (flag) { > (gdb) > 952 ierr = > DMMGSetSNES(dmmg,DMMGFormFunction,computejacobian);CHKERRQ(ierr); > (gdb) > Matrix Object: > type=seqaij, rows=100, cols=100 > total: nonzeros=5776, allocated nonzeros=5776 > using I-node routines: found 25 nodes, limit used is 5 > 954 for (i=0; i (gdb) b twcartffxmhd.c:280 > Breakpoint 4 at 0x100004032: file twcartffxmhd.c, line 282. > (gdb) c > Continuing. > Matrix Object: > type=mpiaij, rows=100, cols=100 > total: nonzeros=0, allocated nonzeros=2900 > using I-node (on process 0) routines: found 20 nodes, limit used is 5 > > ********************************************* > > The final matrix after FormJacobianlLocal() call still have 5776 nonzeros: > > % Size = 100 100 > 2 % Nonzeros = 5776 > 3 zzz = zeros(5776,3); > > Is there a way that I can pass the matrix created via line 255-258 to > FormJacobianLocal()? > > Thanks very much! > > Cheers, > > Rebecca > > > > > > > > On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > >> Here is the output: >> >> >> >> >> >> On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: >> >> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >> >>> Hello Matt, >>> >>> I tried several times for 3.1-p8 and dev version by putting MatSetOption >>> >> >> Are you sure your entries are exactly 0.0? >> >> > Are you using ADD_VALUES? 
> > > http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 > > Matt > > >> Matt >> >> >>> 1) right after creation of the matrix: >>> >>> #ifdef petscDev >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, >>> &jacobian);CHKERRQ(ierr); >>> #else >>> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, >>> &jacobian);CHKERRQ(ierr); >>> #endif >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >>> 2) at the beginning of the FormJacobianLocal() routine: >>> >>> PetscFunctionBegin; >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >>> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> None of those works. What is wrong here? >>> >>> Thanks, >>> >>> R >>> >>> >>> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >>> >>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>> >>>> Hello Matt, >>>> >>>> I have changed the code as >>>> >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>>> PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>> >>> You have to set it before you start setting values, so we know to ignore >>> them. >>> >>> Matt >>> >>> >>>> but still get the same result as before, the matrix still has 5776 >>>> nonzeros: >>>> >>>> % Size = 100 100 >>>> 2 % Nonzeros = 5776 >>>> 3 zzz = zeros(5776,3); >>>> >>>> Then I switch the order as >>>> >>>> ierr = MatAssemblyBegin(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>>> PETSC_TRUE);CHKERRQ(ierr); >>>> >>>> nothing changed. >>>> >>>> The version is 3.1-p8. >>>> >>>> Thanks very much! >>>> >>>> Best regards, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>>> >>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>> >>>>> Hello all, >>>>> >>>>> This is a test for np=1 case of the nonzero structure of the jacobian >>>>> matrix. 
The jacobian matrix is created via >>>>> >>>>> ierr = >>>>> DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, >>>>> parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, >>>>> 0, 0, &da);CHKERRQ(ierr); >>>>> >>>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, >>>>> &jacobian);CHKERRQ(ierr); >>>>> >>>>> After creation of the jacobian matrix, >>>>> >>>>> ierr = MatAssemblyBegin(jacobian, >>>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> ierr = MatAssemblyEnd(jacobian, >>>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> >>>>> PetscViewer viewer; >>>>> char fileName[120]; >>>>> sprintf(fileName, >>>>> "jacobian_after_creation.m");CHKERRQ(ierr); >>>>> >>>>> FILE * fp; >>>>> >>>>> ierr = >>>>> PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>>> ierr = >>>>> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); >>>>> CHKERRQ(ierr); >>>>> ierr = >>>>> PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>>> PetscViewerDestroy(&viewer); >>>>> >>>>> I took a look at the structure of the jacobian by storing it in the >>>>> matlab format, the matrix has 5776 nonzeros entries, however, those values >>>>> are all zeros at the moment as I have not insert or add any values into it >>>>> yet, the structure shows: (the following figure shows a global replacement >>>>> of 0.0 by 1.0 for those 5776 numbers) >>>>> >>>>> >>>>> >>>>> >>>>> Inside the FormJacobianLocal() function, I have selected the index to >>>>> pass to the nonzero values to jacobian, for example, >>>>> >>>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, >>>>> INSERT_VALUES);CHKERRQ(ierr); >>>>> >>>>> where >>>>> >>>>> col[0].i = column[4].i; >>>>> col[1].i = column[5].i; >>>>> col[2].i = column[6].i; >>>>> col[3].i = column[9].i; >>>>> col[4].i = column[10].i; >>>>> col[5].i = column[12].i; >>>>> >>>>> >>>>> col[0].j = column[4].j; >>>>> col[1].j = column[5].j; >>>>> col[2].j = column[6].j; >>>>> col[3].j = column[9].j; >>>>> col[4].j = column[10].j; >>>>> col[5].j = column[12].j; >>>>> >>>>> col[0].c = column[4].c; >>>>> col[1].c = column[5].c; >>>>> col[2].c = column[6].c; >>>>> col[3].c = column[9].c; >>>>> col[4].c = column[10].c; >>>>> col[5].c = column[12].c; >>>>> >>>>> v[0] = value[4]; >>>>> v[1] = value[5]; >>>>> v[2] = value[6]; >>>>> v[3] = value[9]; >>>>> v[4] = value[10]; >>>>> v[5] = value[12]; >>>>> >>>>> and did not pass the zero entries into the jacobian matrix. However, >>>>> after inserting or adding all values to the matrix, by the same routine >>>>> above to take a look at the jacobian matrix in matlab format, the matrix >>>>> still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other >>>>> 4701 numbers are all zeros. The spy() gives >>>>> >>>>> >>>>> >>>>> >>>>> for the true nonzero structures. >>>>> >>>>> But the ksp_view will give the nonzeros number as 5776, instead of >>>>> 1075: >>>>> >>>>> linear system matrix = precond matrix: >>>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>>> type: seqaij >>>>> rows=100, cols=100 >>>>> total: nonzeros=5776, allocated nonzeros=5776 >>>>> >>>>> It is a waste of memory to have all those values of zeros been stored >>>>> in the jacobian. >>>>> >>>>> Is there anyway to get rid of those zero values in jacobian and has >>>>> the only nonzero numbers stored in jacobian? 
In such a case, the ksp_view >>>>> will tell that total: nonzeros=1075. >>>>> >>>> >>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>> >>>> Matt >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Have a nice weekend! >>>>> >>>>> Cheers, >>>>> >>>>> Rebecca >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Sat Jan 21 12:41:48 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Sat, 21 Jan 2012 10:41:48 -0800 Subject: [petsc-users] The matrix allocated by calling MatMPIAIJSetProallocation() was flushed out after calling DMMGSetSNESLocal() In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> <00188A05-D954-4302-B32A-692CB276AD79@lbl.gov> Message-ID: <62AAE9E0-A13F-47D0-88CC-D3280B2915BF@lbl.gov> On Jan 21, 2012, at 10:27 AM, Matthew Knepley wrote: > On Sat, Jan 21, 2012 at 12:19 PM, Xuefei (Rebecca) Yuan wrote: > Hello, > > After I was not able to get rid of the zeros in the matrix by > >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > > I would like to create the matrix by the following routine: > > > ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); > ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); > ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)(info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, PETSC_NULL);CHKERRQ(ierr); > > instead of using > > ierr = DMGetMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > to save the memory. However, I found that the routine DMGetMatrix() was still called inside DMMGSetSNESLocal(), therefore, the matrix I created was useless: > > What exactly are you getting out of using the DM? If it does not express the pattern of your problem, why use it? I do not want to use DMGetMatrix, so I use the MatCreate() to get my own matrix. But inside the call DMMGSetSNESLocal(), this DMGetMatrix was called and the matrix is set to 5776 nonzeros... 
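One way to keep DMMG and still shrink the matrix it builds internally -- not tried in this thread -- is to tell the DA which of the 4 components actually couple, before DMMGSetSNESLocal() triggers DAGetMatrix(). In petsc-3.1 the call is DASetBlockFills() (DMDASetBlockFills() in petsc-dev); dfill and ofill are dof-by-dof 0/1 tables, dfill for coupling between components at the same grid point and ofill for coupling with components at neighboring stencil points. The 4x4 patterns below are made up for illustration and would have to match the actual discretization; also, block fills restrict only the component coupling, not which stencil neighbors appear, so the stored count will not necessarily reach the 1075 truly nonzero entries, and whether refined DMMG levels inherit the fill pattern would need to be checked.

       PetscInt dfill[16] = {1,1,0,0,
                             1,1,1,0,
                             0,1,1,1,
                             0,0,1,1};
       PetscInt ofill[16] = {1,0,0,0,
                             0,1,0,0,
                             0,0,1,0,
                             0,0,0,1};
       /* call on the DA before DMMG creates its matrices */
       ierr = DASetBlockFills(da, dfill, ofill);CHKERRQ(ierr);
       ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr);
       ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal, 0, 0);CHKERRQ(ierr);

The matrix DMMG then hands to FormJacobianLocal() should be preallocated and pre-filled only for the marked couplings, so both the memory use and the nonzero count reported by -ksp_view should drop accordingly.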
> > Matt > > ********************************************* > > Breakpoint 2, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:255 > 255 ierr = MatCreate(PETSC_COMM_WORLD, &jacobian);CHKERRQ(ierr); > (gdb) n > 256 ierr = MatSetType(jacobian, MATMPIAIJ);CHKERRQ(ierr); > (gdb) > 257 ierr = MatSetSizes(jacobian, PETSC_DECIDE, PETSC_DECIDE, (PetscInt)(info.mx*info.my*4), (PetscInt)(info.mx*info.my*4));CHKERRQ(ierr); > (gdb) > 258 ierr = MatMPIAIJSetPreallocation(jacobian, 11, PETSC_NULL, 18, PETSC_NULL);CHKERRQ(ierr); > (gdb) > > Breakpoint 3, main (argc=3, argv=0x7fff5fbff770) at twcartffxmhd.c:260 > 260 ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); > (gdb) s > DMMGSetSNESLocal_Private (dmmg=0x102004480, function=0x100016ba9 , jacobian=0x100050e83 , ad_function=0, admf_function=0) at damgsnes.c:932 > 932 PetscInt i,nlevels = dmmg[0]->nlevels; > (gdb) n > 934 PetscErrorCode (*computejacobian)(SNES,Vec,Mat*,Mat*,MatStructure*,void*) = 0; > (gdb) s > 937 PetscFunctionBegin; > (gdb) n > 938 if (jacobian) computejacobian = DMMGComputeJacobian; > (gdb) > 942 CHKMEMQ; > (gdb) > 943 ierr = PetscObjectGetCookie((PetscObject) dmmg[0]->dm,&cookie);CHKERRQ(ierr); > (gdb) > 944 if (cookie == DM_COOKIE) { > (gdb) > 948 ierr = PetscOptionsHasName(PETSC_NULL, "-dmmg_form_function_ghost", &flag);CHKERRQ(ierr); > (gdb) > 949 if (flag) { > (gdb) > 952 ierr = DMMGSetSNES(dmmg,DMMGFormFunction,computejacobian);CHKERRQ(ierr); > (gdb) > Matrix Object: > type=seqaij, rows=100, cols=100 > total: nonzeros=5776, allocated nonzeros=5776 > using I-node routines: found 25 nodes, limit used is 5 > 954 for (i=0; i (gdb) b twcartffxmhd.c:280 > Breakpoint 4 at 0x100004032: file twcartffxmhd.c, line 282. > (gdb) c > Continuing. > Matrix Object: > type=mpiaij, rows=100, cols=100 > total: nonzeros=0, allocated nonzeros=2900 > using I-node (on process 0) routines: found 20 nodes, limit used is 5 > > ********************************************* > > The final matrix after FormJacobianlLocal() call still have 5776 nonzeros: > > % Size = 100 100 > 2 % Nonzeros = 5776 > 3 zzz = zeros(5776,3); > > Is there a way that I can pass the matrix created via line 255-258 to FormJacobianLocal()? > > Thanks very much! > > Cheers, > > Rebecca > > > > > > > > On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: >> Here is the output: >> >> >> >> >> >> On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I tried several times for 3.1-p8 and dev version by putting MatSetOption >>> >>> Are you sure your entries are exactly 0.0? >> >> >> Are you using ADD_VALUES? 
>> >> http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 >> >> Matt >> >>> Matt >>> >>> 1) right after creation of the matrix: >>> >>> #ifdef petscDev >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #else >>> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #endif >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 2) at the beginning of the FormJacobianLocal() routine: >>> >>> PetscFunctionBegin; >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> None of those works. What is wrong here? >>> >>> Thanks, >>> >>> R >>> >>> >>> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello Matt, >>>> >>>> I have changed the code as >>>> >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> You have to set it before you start setting values, so we know to ignore them. >>>> >>>> Matt >>>> >>>> but still get the same result as before, the matrix still has 5776 nonzeros: >>>> >>>> % Size = 100 100 >>>> 2 % Nonzeros = 5776 >>>> 3 zzz = zeros(5776,3); >>>> >>>> Then I switch the order as >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> >>>> nothing changed. >>>> >>>> The version is 3.1-p8. >>>> >>>> Thanks very much! >>>> >>>> Best regards, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>>> >>>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>>> Hello all, >>>>> >>>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. 
The jacobian matrix is created via >>>>> >>>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>>> >>>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>>> >>>>> After creation of the jacobian matrix, >>>>> >>>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> >>>>> PetscViewer viewer; >>>>> char fileName[120]; >>>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>>> >>>>> FILE * fp; >>>>> >>>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>>> PetscViewerDestroy(&viewer); >>>>> >>>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>>> >>>>> >>>>> >>>>> >>>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>>> >>>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>>> >>>>> where >>>>> >>>>> col[0].i = column[4].i; >>>>> col[1].i = column[5].i; >>>>> col[2].i = column[6].i; >>>>> col[3].i = column[9].i; >>>>> col[4].i = column[10].i; >>>>> col[5].i = column[12].i; >>>>> >>>>> >>>>> col[0].j = column[4].j; >>>>> col[1].j = column[5].j; >>>>> col[2].j = column[6].j; >>>>> col[3].j = column[9].j; >>>>> col[4].j = column[10].j; >>>>> col[5].j = column[12].j; >>>>> >>>>> col[0].c = column[4].c; >>>>> col[1].c = column[5].c; >>>>> col[2].c = column[6].c; >>>>> col[3].c = column[9].c; >>>>> col[4].c = column[10].c; >>>>> col[5].c = column[12].c; >>>>> >>>>> v[0] = value[4]; >>>>> v[1] = value[5]; >>>>> v[2] = value[6]; >>>>> v[3] = value[9]; >>>>> v[4] = value[10]; >>>>> v[5] = value[12]; >>>>> >>>>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>>> >>>>> >>>>> >>>>> >>>>> for the true nonzero structures. >>>>> >>>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>>> >>>>> linear system matrix = precond matrix: >>>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>>> type: seqaij >>>>> rows=100, cols=100 >>>>> total: nonzeros=5776, allocated nonzeros=5776 >>>>> >>>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>>>> >>>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. 
>>>>> >>>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>>> >>>>> Matt >>>>> >>>>> Thanks very much! >>>>> >>>>> Have a nice weekend! >>>>> >>>>> Cheers, >>>>> >>>>> Rebecca >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jan 22 11:50:45 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 22 Jan 2012 11:50:45 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <26304804-B4D1-4357-AA9B-E3FFCCB7CA31@tigers.lsu.edu> References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> <26304804-B4D1-4357-AA9B-E3FFCCB7CA31@tigers.lsu.edu> Message-ID: Ata, Sorry for the delay in responding. It looks like you are using petsc-3.2, is that correct? You should switch to petsc-dev and rerun. The vi solvers are actively being developed and have some improvements since the release. Please send the same convergence information (just the text file of output) when running with -ksp_monitor_true_residual -snes_vi_monitor (not the binary files). Barry On Jan 18, 2012, at 9:40 AM, Ataollah Mesgarnejad wrote: > Dear all, > > Just realized that my email didn't go through because of my attachments, so here it is: > > Sorry if it took a bit long to do the runs, I wasn't feeling well yesterday. > > I attached the output I get from a small problem (90elements, 621 DOFs ) with different SNESVI types (exodusII and command line outputs). As you can see rsaug exits with an error but ss and rs run (and their results are similar). However, after V goes to zero at a cross section line searches for both of them (rs,ss) fail?! Also as you can see KSP converges for every step. > > These are the tolerances I pass to SNES: > user->KSP_default_rtol = 1e-12; > user->KSP_default_atol = 1e-12; > user->KSP_default_dtol = 1e3; > user->KSP_default_maxit = 50000; > user->psi_default_frtol = 1e-8; // snes_frtol > user->psi_default_fatol = 1e-8; //snes_fatol > user->psi_maxit = 500; //snes_maxit > user->psi_max_funcs = 1000; //snes_max_func_its > > Ps: files are here: http://cl.ly/0Z001Z3y1k0Q0g0s2F2R > > thanks, > Ata > > > On Jan 17, 2012, at 8:16 AM, Barry Smith wrote: > >> >> Blaise, >> >> Let's not solve the problem until we know what the problem is. -snes_vi_monitor first then think about the cure >> >> Barry >> >> On Jan 16, 2012, at 8:49 PM, Blaise Bourdin wrote: >> >>> Hi, >>> >>> Ata and I are working together on this. 
The problem he describes is 1/2 of the iteration of our variational fracture code. >>> In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. >>> >>> If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 >>> https://www.math.lsu.edu/~bourdin/377451-0000.png >>> >>> Blaise >>> >>> >>> >>>> It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, >>>> but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. >>>> Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. >>>> >>>> Dmitry. >>>> >>>> On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: >>>> >>>> What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. >>>> >>>> Barry >>>> >>>> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >>>> >>>>> Dear all, >>>>> >>>>> I'm trying to use SNESVI to solve a quadratic problem with box constraints. My problem in FE context reads: >>>>> >>>>> (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >>>>> >>>>> or: >>>>> >>>>> [A]{V}-{b}={0} >>>>> >>>>> here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. >>>>> I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. >>>>> >>>>> I would appreciate any suggestions or observations to increase the convergence speed? >>>>> >>>>> Best, >>>>> Ata >>>> >>>> >>> >>> -- >>> Department of Mathematics and Center for Computation & Technology >>> Louisiana State University, Baton Rouge, LA 70803, USA >>> Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin >>> >>> >>> >>> >>> >>> >>> >> > From amesga1 at tigers.lsu.edu Sun Jan 22 12:46:56 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Sun, 22 Jan 2012 12:46:56 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: <2B2259CF-97A3-47AE-BA86-29A97BE2B514@tigers.lsu.edu> <9018222C-E09A-4253-AB3E-745181517208@mcs.anl.gov> <26304804-B4D1-4357-AA9B-E3FFCCB7CA31@tigers.lsu.edu> Message-ID: <5D9E45BA-784A-455A-BE9C-0BEC6C2FA90A@tigers.lsu.edu> Thanks Barry, Yes I'm using 3.2. I will change to dev and get back to you. Ata On Jan 22, 2012, at 11:50 AM, Barry Smith wrote: > > > Ata, > > Sorry for the delay in responding. 
> > It looks like you are using petsc-3.2, is that correct? You should switch to petsc-dev and rerun. The vi solvers are actively being developed and have some improvements since the release. Please send the same convergence information (just the text file of output) when running with -ksp_monitor_true_residual -snes_vi_monitor (not the binary files). > > > Barry > > On Jan 18, 2012, at 9:40 AM, Ataollah Mesgarnejad wrote: > >> Dear all, >> >> Just realized that my email didn't go through because of my attachments, so here it is: >> >> Sorry if it took a bit long to do the runs, I wasn't feeling well yesterday. >> >> I attached the output I get from a small problem (90elements, 621 DOFs ) with different SNESVI types (exodusII and command line outputs). As you can see rsaug exits with an error but ss and rs run (and their results are similar). However, after V goes to zero at a cross section line searches for both of them (rs,ss) fail?! Also as you can see KSP converges for every step. >> >> These are the tolerances I pass to SNES: >> user->KSP_default_rtol = 1e-12; >> user->KSP_default_atol = 1e-12; >> user->KSP_default_dtol = 1e3; >> user->KSP_default_maxit = 50000; >> user->psi_default_frtol = 1e-8; // snes_frtol >> user->psi_default_fatol = 1e-8; //snes_fatol >> user->psi_maxit = 500; //snes_maxit >> user->psi_max_funcs = 1000; //snes_max_func_its >> >> Ps: files are here: http://cl.ly/0Z001Z3y1k0Q0g0s2F2R >> >> thanks, >> Ata >> >> >> On Jan 17, 2012, at 8:16 AM, Barry Smith wrote: >> >>> >>> Blaise, >>> >>> Let's not solve the problem until we know what the problem is. -snes_vi_monitor first then think about the cure >>> >>> Barry >>> >>> On Jan 16, 2012, at 8:49 PM, Blaise Bourdin wrote: >>> >>>> Hi, >>>> >>>> Ata and I are working together on this. The problem he describes is 1/2 of the iteration of our variational fracture code. >>>> In our application, E is position dependant, and typically becomes very large along very thin bands with width of the order of epsilon in the domain. Essentially, we expect that V will remain exactly equal to 1 almost everywhere, and will transition to 0 on these bands. Of course, we are interested in the limit as epsilon goes to 0. >>>> >>>> If the problem indeed is that it takes many steps to add the degrees of freedom. Is there any way to initialize manually the list of active constraints? To give you an idea, here is a link to a picture of the type of solution we expect. blue=1 >>>> https://www.math.lsu.edu/~bourdin/377451-0000.png >>>> >>>> Blaise >>>> >>>> >>>> >>>>> It seems to me that the problem is that ultimately ALL of the degrees of freedom are in the active set, >>>>> but they get added to it a few at a time -- and there may even be some "chatter" there -- necessitating many SNESVI steps. >>>>> Could it be that the regularization makes things worse? When \epsilon \ll 1, the unconstrained solution is highly oscillatory, possibly further exacerbating the problem. It's possible that it would be better if V just diverged uniformly. Then nearly all of the degrees of freedom would bump up against the upper obstacle all at once. >>>>> >>>>> Dmitry. >>>>> >>>>> On Mon, Jan 16, 2012 at 8:05 PM, Barry Smith wrote: >>>>> >>>>> What do you get with -snes_vi_monitor it could be it is taking a while to get the right active set. >>>>> >>>>> Barry >>>>> >>>>> On Jan 16, 2012, at 6:20 PM, Ataollah Mesgarnejad wrote: >>>>> >>>>>> Dear all, >>>>>> >>>>>> I'm trying to use SNESVI to solve a quadratic problem with box constraints. 
My problem in FE context reads: >>>>>> >>>>>> (\int_{Omega} E phi_i phi_j + \alpha \epsilon dphi_i dphi_j dx) V_i - (\int_{Omega} \alpha \frac{phi_j}{\epsilon} dx) = 0 , 0<= V <= 1 >>>>>> >>>>>> or: >>>>>> >>>>>> [A]{V}-{b}={0} >>>>>> >>>>>> here phi is the basis function, E and \alpha are positive constants, and \epsilon is a positive regularization parameter in order of mesh resolution. In this problem we expect V =1 a.e. and go to zero very fast at some places. >>>>>> I'm running this on a rather small problem (<500000 DOFS) on small number of processors (<72). I expected SNESVI to converge in couple of iterations (<10) since my A matrix doesn't change, however I'm experiencing a slow convergence (~50-70 iterations). I checked KSP solver for SNES and it converges with a few iterations. >>>>>> >>>>>> I would appreciate any suggestions or observations to increase the convergence speed? >>>>>> >>>>>> Best, >>>>>> Ata >>>>> >>>>> >>>> >>>> -- >>>> Department of Mathematics and Center for Computation & Technology >>>> Louisiana State University, Baton Rouge, LA 70803, USA >>>> Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>> >> > From dominik at itis.ethz.ch Mon Jan 23 05:53:10 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 23 Jan 2012 12:53:10 +0100 Subject: [petsc-users] AVX in Petsc? Message-ID: Does Petsc in any way leverages AVX? Would I need to explicitly enable it somehow or will be taken over by the system and the compiler? Compiling with GNU on SMP linux and Cray XT6. http://en.wikipedia.org/wiki/Advanced_Vector_Extensions Dominik From jedbrown at mcs.anl.gov Mon Jan 23 06:34:04 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 23 Jan 2012 06:34:04 -0600 Subject: [petsc-users] AVX in Petsc? In-Reply-To: References: Message-ID: On Mon, Jan 23, 2012 at 05:53, Dominik Szczerba wrote: > Does Petsc in any way leverages AVX? > Would I need to explicitly enable it somehow or will be taken over by > the system and the compiler? > Compiling with GNU on SMP linux and Cray XT6. > AVX should be used by vendor BLAS if you have a recent implementation (but Intel might still check for "GenuineIntel" and switch to the slow version, so perhaps try ACML instead of MKL). With any given compiler, just grep the disassembly to see if AVX instructions are used. It's possible, but I doubt it. It generally won't make a performance difference for sparse kernels or for BLAS level 1 because these operations are memory bound. If you use sparse direct solvers for large problems, most of the work will end up being in dense factorization, so you would benefit. If you use matrix-free methods for nontrivial physics, you can benefit from using AVX instructions, but of course you are responsible for doing that, usually by writing in assembly or using intrinsics (perhaps with the C++ overloaded wrappers), because compilers are quite bad at automatic vectorization. -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Jan 23 13:24:53 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 23 Jan 2012 13:24:53 -0600 Subject: [petsc-users] CUDA with complex number Message-ID: Dear PETSc Developers, I am compiling PETSc-dev using GPU in complex number mode. 
However, when I configure PETSc-dev, I get the following errors " ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Cannot use cuda with complex numbers it is not coded for this capability ******************************************************************************* " I have checked CUSP. The latest version can support complex number. What should I do for CUDA? Thank you very much. Best, Yujie From bsmith at mcs.anl.gov Mon Jan 23 15:23:19 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jan 2012 15:23:19 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: Message-ID: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log You can try editing config/PETSc/packages/cuda.py and removing the two lines if self.scalartypes.scalartype == 'complex': raise RuntimeError('Must use real numbers with CUDA') As it says we have never tested for complex so I do not know how far it is from working. Barry On Jan 23, 2012, at 1:24 PM, recrusader wrote: > Dear PETSc Developers, > > I am compiling PETSc-dev using GPU in complex number mode. > However, when I configure PETSc-dev, I get the following errors > " > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > ------------------------------------------------------------------------------- > Cannot use cuda with complex numbers it is not coded for this capability > ******************************************************************************* > " > I have checked CUSP. The latest version can support complex number. > What should I do for CUDA? > Thank you very much. > > Best, > Yujie From paulm at txcorp.com Mon Jan 23 15:28:44 2012 From: paulm at txcorp.com (Paul Mullowney) Date: Mon, 23 Jan 2012 14:28:44 -0700 Subject: [petsc-users] CUDA with complex number In-Reply-To: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> Message-ID: <4F1DD10C.4000702@txcorp.com> I would claim that Petsc does not support GPU capabilities for complex numbers right now. The CUSP library is templated over the scalar type (i.e. float or double), however I don't think it supports complex numbers. Although, I could be wrong on this. It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, these are different Nvidia libraries). I believe CUSPARSE supports SpMV for complex types. It also supports triangular solve for complex types. -Paul > This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > You can try editing config/PETSc/packages/cuda.py and removing the two lines > > if self.scalartypes.scalartype == 'complex': > raise RuntimeError('Must use real numbers with CUDA') > > As it says we have never tested for complex so I do not know how far it is from working. > > Barry > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > >> Dear PETSc Developers, >> >> I am compiling PETSc-dev using GPU in complex number mode. 
>> However, when I configure PETSc-dev, I get the following errors >> " >> ******************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> for details): >> ------------------------------------------------------------------------------- >> Cannot use cuda with complex numbers it is not coded for this capability >> ******************************************************************************* >> " >> I have checked CUSP. The latest version can support complex number. >> What should I do for CUDA? >> Thank you very much. >> >> Best, >> Yujie From recrusader at gmail.com Mon Jan 23 15:36:58 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 23 Jan 2012 15:36:58 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: <4F1DD10C.4000702@txcorp.com> References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> Message-ID: Dear Paul, Please find the changelog in the following link for CUSP 0.2.0 http://code.google.com/p/cusp-library/source/browse/CHANGELOG They has added cusp:complex class to support complex number-based operation. does it not work for PETSc? Thanks a lot, Yujie On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney wrote: > I would claim that Petsc does not support GPU capabilities for complex > numbers right now. > > The CUSP library is templated over the scalar type (i.e. float or double), > however I don't think it supports complex numbers. Although, I could be > wrong on this. > > It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, > these are different Nvidia libraries). I believe CUSPARSE supports SpMV for > complex types. It also supports triangular solve for complex types. > > -Paul > > > This is an installation issue, please send all installation issues to >> petsc-maint at mcs.anl.gov with the entire configure.log >> >> >> >> >> You can try editing config/PETSc/packages/cuda.py and removing the two >> lines >> >> if self.scalartypes.scalartype == 'complex': >> raise RuntimeError('Must use real numbers with CUDA') >> >> As it says we have never tested for complex so I do not know how far >> it is from working. >> >> Barry >> >> On Jan 23, 2012, at 1:24 PM, recrusader wrote: >> >> Dear PETSc Developers, >>> >>> I am compiling PETSc-dev using GPU in complex number mode. >>> However, when I configure PETSc-dev, I get the following errors >>> " >>> **************************************************************** >>> ******************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for details): >>> ------------------------------**------------------------------** >>> ------------------- >>> Cannot use cuda with complex numbers it is not coded for this capability >>> **************************************************************** >>> ******************* >>> " >>> I have checked CUSP. The latest version can support complex number. >>> What should I do for CUDA? >>> Thank you very much. >>> >>> Best, >>> Yujie >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danesh.daroui at ltu.se Mon Jan 23 15:40:55 2012 From: danesh.daroui at ltu.se (=?utf-8?B?RGFuZXNoIERhcm91aQ==?=) Date: Mon, 23 Jan 2012 22:40:55 +0100 Subject: [petsc-users] =?utf-8?q?CUDA_with_complex_number?= Message-ID: <201201232141.q0NLf5Cb007247@mxi.ltu.se> For sparse and even dense solvers for GPU you can use CULA a product from CULA tools. They even support double complex data type. 
But do not forget if you wish to get better performance over CPU with complex numbers you should use fermi based cards i.e. Tesla m2050-m2090. ----- Reply message ----- From: "Paul Mullowney" To: "PETSc users list" Subject: [petsc-users] CUDA with complex number Date: Mon, Jan 23, 2012 22:28 I would claim that Petsc does not support GPU capabilities for complex numbers right now. The CUSP library is templated over the scalar type (i.e. float or double), however I don't think it supports complex numbers. Although, I could be wrong on this. It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, these are different Nvidia libraries). I believe CUSPARSE supports SpMV for complex types. It also supports triangular solve for complex types. -Paul > This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > You can try editing config/PETSc/packages/cuda.py and removing the two lines > > if self.scalartypes.scalartype == 'complex': > raise RuntimeError('Must use real numbers with CUDA') > > As it says we have never tested for complex so I do not know how far it is from working. > > Barry > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > >> Dear PETSc Developers, >> >> I am compiling PETSc-dev using GPU in complex number mode. >> However, when I configure PETSc-dev, I get the following errors >> " >> ******************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> for details): >> ------------------------------------------------------------------------------- >> Cannot use cuda with complex numbers it is not coded for this capability >> ******************************************************************************* >> " >> I have checked CUSP. The latest version can support complex number. >> What should I do for CUDA? >> Thank you very much. >> >> Best, >> Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jan 23 15:46:50 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jan 2012 15:46:50 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> Message-ID: <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> Yujie, As I said we have never tried it. You are welcome to try it following the directions I gave. You must realize that we do not have an army of people to support GPUs so you are largely on your on in terms of getting things done and need to use your initiative in trying things yourself and working through the problems that may arise. Barry On Jan 23, 2012, at 3:36 PM, recrusader wrote: > Dear Paul, > > Please find the changelog in the following link for CUSP 0.2.0 > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > They has added cusp:complex class to support complex number-based operation. > > does it not work for PETSc? > > Thanks a lot, > Yujie > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney wrote: > I would claim that Petsc does not support GPU capabilities for complex numbers right now. > > The CUSP library is templated over the scalar type (i.e. float or double), however I don't think it supports complex numbers. Although, I could be wrong on this. > > It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, these are different Nvidia libraries). I believe CUSPARSE supports SpMV for complex types. 
It also supports triangular solve for complex types. > > -Paul > > > This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > You can try editing config/PETSc/packages/cuda.py and removing the two lines > > if self.scalartypes.scalartype == 'complex': > raise RuntimeError('Must use real numbers with CUDA') > > As it says we have never tested for complex so I do not know how far it is from working. > > Barry > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > Dear PETSc Developers, > > I am compiling PETSc-dev using GPU in complex number mode. > However, when I configure PETSc-dev, I get the following errors > " > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > ------------------------------------------------------------------------------- > Cannot use cuda with complex numbers it is not coded for this capability > ******************************************************************************* > " > I have checked CUSP. The latest version can support complex number. > What should I do for CUDA? > Thank you very much. > > Best, > Yujie > > From rlmackie862 at gmail.com Mon Jan 23 15:49:36 2012 From: rlmackie862 at gmail.com (Randall Mackie) Date: Mon, 23 Jan 2012 13:49:36 -0800 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> Message-ID: <26010469-5ABF-460C-A179-BB73258C6CD2@gmail.com> I looked into this last year and this was the response from the PETSc group and I don't know if this has changed: =============================>> Looks like it is not so trivial as I had made it out to be. Perhaps if you ask on petsc-dev at mcs.anl.gov there may be people who want this also and are willing to share the work? Barry On May 23, 2011, at 6:06 PM, Victor Minden wrote: > Barry, > > Currently there should be two things going on--one is that some of the > CUBLAS routines used for basic vector operations (which we've been moving > away from) have 4 different forms depending on single/double and > real/complex--currently those that we still use are hard-coded as real and > #ifdeffed to be single/double, and I haven't looked in depth but I think > those are the errors that Randy sent. > > Those should be pretty quick to change, but they won't make complex work, > because once you get past veccusp.cu there are a bunch more errors in the > cusp matrix routines which work with complex if you use the relatively new > (Oct? Dec?) cusp complex type that they've added to the cusp library. So, > we need to make the GPU routines work with the standard complex type or we > need to convert all the GPU code to use the cusp type. I don't know how > much work this is. > > As it stands, I'm just starting my internship this week so I probably won't > be able to look at this in-depth until the weekend, just FYI. > > Cheers, > > Victor > --- <<=============================================== On Jan 23, 2012, at 1:36 PM, recrusader wrote: > Dear Paul, > > Please find the changelog in the following link for CUSP 0.2.0 > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > They has added cusp:complex class to support complex number-based operation. > > does it not work for PETSc? 
> > Thanks a lot, > Yujie > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney wrote: > I would claim that Petsc does not support GPU capabilities for complex numbers right now. > > The CUSP library is templated over the scalar type (i.e. float or double), however I don't think it supports complex numbers. Although, I could be wrong on this. > > It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, these are different Nvidia libraries). I believe CUSPARSE supports SpMV for complex types. It also supports triangular solve for complex types. > > -Paul > > > This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > You can try editing config/PETSc/packages/cuda.py and removing the two lines > > if self.scalartypes.scalartype == 'complex': > raise RuntimeError('Must use real numbers with CUDA') > > As it says we have never tested for complex so I do not know how far it is from working. > > Barry > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > Dear PETSc Developers, > > I am compiling PETSc-dev using GPU in complex number mode. > However, when I configure PETSc-dev, I get the following errors > " > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > ------------------------------------------------------------------------------- > Cannot use cuda with complex numbers it is not coded for this capability > ******************************************************************************* > " > I have checked CUSP. The latest version can support complex number. > What should I do for CUDA? > Thank you very much. > > Best, > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 23 15:52:12 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 23 Jan 2012 15:52:12 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: <201201232141.q0NLf5Cb007247@mxi.ltu.se> References: <201201232141.q0NLf5Cb007247@mxi.ltu.se> Message-ID: On Mon, Jan 23, 2012 at 15:40, Danesh Daroui wrote: > For sparse and even dense solvers for GPU you can use CULA a product from > CULA tools. This commercial product smells like a scam. Their benchmark/comparisons are less than worthless and they don't even compare to the free/open source libraries. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Mon Jan 23 15:52:32 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Mon, 23 Jan 2012 13:52:32 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. 
In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: Hello all, There is one example that has used the ierr = MatSetOption(*B,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE);CHKERRQ(ierr); in /src/mat/examples/tutorials/ex12.c, and this option works for this case because the matrix B is created via ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n+1,n+1,0,cnt,B);CHKERRQ(ierr); and the values are inserted by ierr = MatSetValues(*B,1,&i,aij->i[i+1] - aij->i[i],aij->j + aij->i[i],aij->a + aij->i[i],INSERT_VALUES);CHKERRQ(ierr); However, the option MAT_IGNORE_ZERO_ENTRIES does not work for the matrix if it is generated via ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); no matter if MatSetValues() or MatSetValuesStencil() is used. In the other word, if the matrix is generated based on the stencil information, the zero entries cannot be ignored even if the option is called before inserting/adding values. Is there anyway that can get rid of those zeros in the matrix generated based on the stencil information? Thanks very much! Best regards, Rebecca On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > Here is the output: > > > > > > On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >> Hello Matt, >> >> I tried several times for 3.1-p8 and dev version by putting MatSetOption >> >> Are you sure your entries are exactly 0.0? > > > Are you using ADD_VALUES? > > http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 > > Matt > >> Matt >> >> 1) right after creation of the matrix: >> >> #ifdef petscDev >> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #else >> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >> #endif >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 2) at the beginning of the FormJacobianLocal() routine: >> >> PetscFunctionBegin; >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> >> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >> >> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> None of those works. What is wrong here? >> >> Thanks, >> >> R >> >> >> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I have changed the code as >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> You have to set it before you start setting values, so we know to ignore them. >>> >>> Matt >>> >>> but still get the same result as before, the matrix still has 5776 nonzeros: >>> >>> % Size = 100 100 >>> 2 % Nonzeros = 5776 >>> 3 zzz = zeros(5776,3); >>> >>> Then I switch the order as >>> >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> nothing changed. 
>>> >>> The version is 3.1-p8. >>> >>> Thanks very much! >>> >>> Best regards, >>> >>> Rebecca >>> >>> >>> >>> >>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello all, >>>> >>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. The jacobian matrix is created via >>>> >>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>> >>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>> >>>> After creation of the jacobian matrix, >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> PetscViewer viewer; >>>> char fileName[120]; >>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>> >>>> FILE * fp; >>>> >>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>> PetscViewerDestroy(&viewer); >>>> >>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>> >>>> >>>> >>>> >>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>> >>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>> >>>> where >>>> >>>> col[0].i = column[4].i; >>>> col[1].i = column[5].i; >>>> col[2].i = column[6].i; >>>> col[3].i = column[9].i; >>>> col[4].i = column[10].i; >>>> col[5].i = column[12].i; >>>> >>>> >>>> col[0].j = column[4].j; >>>> col[1].j = column[5].j; >>>> col[2].j = column[6].j; >>>> col[3].j = column[9].j; >>>> col[4].j = column[10].j; >>>> col[5].j = column[12].j; >>>> >>>> col[0].c = column[4].c; >>>> col[1].c = column[5].c; >>>> col[2].c = column[6].c; >>>> col[3].c = column[9].c; >>>> col[4].c = column[10].c; >>>> col[5].c = column[12].c; >>>> >>>> v[0] = value[4]; >>>> v[1] = value[5]; >>>> v[2] = value[6]; >>>> v[3] = value[9]; >>>> v[4] = value[10]; >>>> v[5] = value[12]; >>>> >>>> and did not pass the zero entries into the jacobian matrix. However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>> >>>> >>>> >>>> >>>> for the true nonzero structures. >>>> >>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>> >>>> linear system matrix = precond matrix: >>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>> type: seqaij >>>> rows=100, cols=100 >>>> total: nonzeros=5776, allocated nonzeros=5776 >>>> >>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. 
>>>> >>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>>> >>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>> >>>> Matt >>>> >>>> Thanks very much! >>>> >>>> Have a nice weekend! >>>> >>>> Cheers, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulm at txcorp.com Mon Jan 23 15:52:36 2012 From: paulm at txcorp.com (Paul Mullowney) Date: Mon, 23 Jan 2012 14:52:36 -0700 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> Message-ID: <4F1DD6A4.6080003@txcorp.com> Interesting. I didn't know this as I've never tried to do complex arithmetic with GPUs and PETSc. Maybe it is just a configuration issue. It may work provided Thrust has complex number support also (which it probably does). However, if you are using ILU (and at some pt, ICC) preconditioners, some additional code is needed to do MatSolve with complex numbers. This wouldn't be too difficult to implement though. -Paul > Dear Paul, > > Please find the changelog in the following link for CUSP 0.2.0 > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > They has added cusp:complex class to support complex number-based > operation. > > does it not work for PETSc? > > Thanks a lot, > Yujie > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney > wrote: > > I would claim that Petsc does not support GPU capabilities for > complex numbers right now. > > The CUSP library is templated over the scalar type (i.e. float or > double), however I don't think it supports complex numbers. > Although, I could be wrong on this. > > It could be somewhat straightforward to move from CUSP to CUSPARSE > (yes, these are different Nvidia libraries). I believe CUSPARSE > supports SpMV for complex types. It also supports triangular solve > for complex types. > > -Paul > > > This is an installation issue, please send all > installation issues to petsc-maint at mcs.anl.gov > with the entire configure.log > > > > > You can try editing config/PETSc/packages/cuda.py and > removing the two lines > > if self.scalartypes.scalartype == 'complex': > raise RuntimeError('Must use real numbers with CUDA') > > As it says we have never tested for complex so I do not > know how far it is from working. > > Barry > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > Dear PETSc Developers, > > I am compiling PETSc-dev using GPU in complex number mode. 
> However, when I configure PETSc-dev, I get the following > errors > " > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see > configure.log > for details): > ------------------------------------------------------------------------------- > Cannot use cuda with complex numbers it is not coded for > this capability > ******************************************************************************* > " > I have checked CUSP. The latest version can support > complex number. > What should I do for CUDA? > Thank you very much. > > Best, > Yujie > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Jan 23 15:55:38 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 23 Jan 2012 15:55:38 -0600 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: -dm_preallocate_only http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/DM/DMSetMatrixPreallocateOnly.html On Mon, Jan 23, 2012 at 15:52, Xuefei (Rebecca) Yuan wrote: > Hello all, > > There is one example that has used the > > ierr = MatSetOption(*B,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE);CHKERRQ(ierr); > > in /src/mat/examples/tutorials/ex12.c, > > and this option works for this case because the matrix B is created via > > ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n+1,n+1,0,cnt,B);CHKERRQ(ierr); > > and the values are inserted by > > ierr = MatSetValues(*B,1,&i,aij->i[i+1] - aij->i[i],aij->j + > aij->i[i],aij->a + aij->i[i],INSERT_VALUES);CHKERRQ(ierr); > > > However, the option MAT_IGNORE_ZERO_ENTRIES does not work for the matrix > if it is generated via > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > no matter if MatSetValues() or MatSetValuesStencil() is used. > > In the other word, if the matrix is generated based on the stencil > information, the zero entries cannot be ignored even if the option is > called before inserting/adding values. > > Is there anyway that can get rid of those zeros in the matrix generated > based on the stencil information? > > > Thanks very much! > > Best regards, > > Rebecca > > > > > > On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > > On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: > >> Here is the output: >> >> >> >> >> >> On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: >> >> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >> >>> Hello Matt, >>> >>> I tried several times for 3.1-p8 and dev version by putting MatSetOption >>> >> >> Are you sure your entries are exactly 0.0? >> >> > Are you using ADD_VALUES? 
> > > http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 > > Matt > > >> Matt >> >> >>> 1) right after creation of the matrix: >>> >>> #ifdef petscDev >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, >>> &jacobian);CHKERRQ(ierr); >>> #else >>> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, >>> &jacobian);CHKERRQ(ierr); >>> #endif >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >>> 2) at the beginning of the FormJacobianLocal() routine: >>> >>> PetscFunctionBegin; >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> >>> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>> PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, >>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> None of those works. What is wrong here? >>> >>> Thanks, >>> >>> R >>> >>> >>> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >>> >>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>> >>>> Hello Matt, >>>> >>>> I have changed the code as >>>> >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>>> PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>> >>> You have to set it before you start setting values, so we know to ignore >>> them. >>> >>> Matt >>> >>> >>>> but still get the same result as before, the matrix still has 5776 >>>> nonzeros: >>>> >>>> % Size = 100 100 >>>> 2 % Nonzeros = 5776 >>>> 3 zzz = zeros(5776,3); >>>> >>>> Then I switch the order as >>>> >>>> ierr = MatAssemblyBegin(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, >>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, >>>> PETSC_TRUE);CHKERRQ(ierr); >>>> >>>> nothing changed. >>>> >>>> The version is 3.1-p8. >>>> >>>> Thanks very much! >>>> >>>> Best regards, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>>> >>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>> >>>>> Hello all, >>>>> >>>>> This is a test for np=1 case of the nonzero structure of the jacobian >>>>> matrix. 
The jacobian matrix is created via >>>>> >>>>> ierr = >>>>> DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, >>>>> parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, >>>>> 0, 0, &da);CHKERRQ(ierr); >>>>> >>>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, >>>>> &jacobian);CHKERRQ(ierr); >>>>> >>>>> After creation of the jacobian matrix, >>>>> >>>>> ierr = MatAssemblyBegin(jacobian, >>>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> ierr = MatAssemblyEnd(jacobian, >>>>> MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> >>>>> PetscViewer viewer; >>>>> char fileName[120]; >>>>> sprintf(fileName, >>>>> "jacobian_after_creation.m");CHKERRQ(ierr); >>>>> >>>>> FILE * fp; >>>>> >>>>> ierr = >>>>> PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>>> ierr = >>>>> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); >>>>> CHKERRQ(ierr); >>>>> ierr = >>>>> PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>>> PetscViewerDestroy(&viewer); >>>>> >>>>> I took a look at the structure of the jacobian by storing it in the >>>>> matlab format, the matrix has 5776 nonzeros entries, however, those values >>>>> are all zeros at the moment as I have not insert or add any values into it >>>>> yet, the structure shows: (the following figure shows a global replacement >>>>> of 0.0 by 1.0 for those 5776 numbers) >>>>> >>>>> >>>>> >>>>> >>>>> Inside the FormJacobianLocal() function, I have selected the index to >>>>> pass to the nonzero values to jacobian, for example, >>>>> >>>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, >>>>> INSERT_VALUES);CHKERRQ(ierr); >>>>> >>>>> where >>>>> >>>>> col[0].i = column[4].i; >>>>> col[1].i = column[5].i; >>>>> col[2].i = column[6].i; >>>>> col[3].i = column[9].i; >>>>> col[4].i = column[10].i; >>>>> col[5].i = column[12].i; >>>>> >>>>> >>>>> col[0].j = column[4].j; >>>>> col[1].j = column[5].j; >>>>> col[2].j = column[6].j; >>>>> col[3].j = column[9].j; >>>>> col[4].j = column[10].j; >>>>> col[5].j = column[12].j; >>>>> >>>>> col[0].c = column[4].c; >>>>> col[1].c = column[5].c; >>>>> col[2].c = column[6].c; >>>>> col[3].c = column[9].c; >>>>> col[4].c = column[10].c; >>>>> col[5].c = column[12].c; >>>>> >>>>> v[0] = value[4]; >>>>> v[1] = value[5]; >>>>> v[2] = value[6]; >>>>> v[3] = value[9]; >>>>> v[4] = value[10]; >>>>> v[5] = value[12]; >>>>> >>>>> and did not pass the zero entries into the jacobian matrix. However, >>>>> after inserting or adding all values to the matrix, by the same routine >>>>> above to take a look at the jacobian matrix in matlab format, the matrix >>>>> still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other >>>>> 4701 numbers are all zeros. The spy() gives >>>>> >>>>> >>>>> >>>>> >>>>> for the true nonzero structures. >>>>> >>>>> But the ksp_view will give the nonzeros number as 5776, instead of >>>>> 1075: >>>>> >>>>> linear system matrix = precond matrix: >>>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>>> type: seqaij >>>>> rows=100, cols=100 >>>>> total: nonzeros=5776, allocated nonzeros=5776 >>>>> >>>>> It is a waste of memory to have all those values of zeros been stored >>>>> in the jacobian. >>>>> >>>>> Is there anyway to get rid of those zero values in jacobian and has >>>>> the only nonzero numbers stored in jacobian? 
In such a case, the ksp_view >>>>> will tell that total: nonzeros=1075. >>>>> >>>> >>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>> >>>> Matt >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Have a nice weekend! >>>>> >>>>> Cheers, >>>>> >>>>> Rebecca >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Jan 23 16:17:38 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 23 Jan 2012 16:17:38 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> Message-ID: Dear Barry, For complex number-based implementation, will the functions in Vec, Mat, KSP, and PC having been realized for real number work at least at the running level? Do I need to do more coding for them? To my knowledge, Victor Minden finished most of the coding work for GPU implementation in PETSc with your and Matt's help. However, it seems there is not people to further realize and optimize other GPU functions in PETSc. Do you think GPU-based computation is not important or has low efficiency compared to current CPU-based implementation in PETSc? Thank you very much, Yujie On Mon, Jan 23, 2012 at 3:46 PM, Barry Smith wrote: > > Yujie, > > As I said we have never tried it. You are welcome to try it following > the directions I gave. You must realize that we do not have an army of > people to support GPUs so you are largely on your on in terms of getting > things done and need to use your initiative in trying things yourself and > working through the problems that may arise. > > Barry > > On Jan 23, 2012, at 3:36 PM, recrusader wrote: > > > Dear Paul, > > > > Please find the changelog in the following link for CUSP 0.2.0 > > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > > > They has added cusp:complex class to support complex number-based > operation. > > > > does it not work for PETSc? > > > > Thanks a lot, > > Yujie > > > > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney > wrote: > > I would claim that Petsc does not support GPU capabilities for complex > numbers right now. > > > > The CUSP library is templated over the scalar type (i.e. float or > double), however I don't think it supports complex numbers. Although, I > could be wrong on this. > > > > It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, > these are different Nvidia libraries). I believe CUSPARSE supports SpMV for > complex types. 
It also supports triangular solve for complex types. > > > > -Paul > > > > > > This is an installation issue, please send all installation issues > to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > > > > > > You can try editing config/PETSc/packages/cuda.py and removing the > two lines > > > > if self.scalartypes.scalartype == 'complex': > > raise RuntimeError('Must use real numbers with CUDA') > > > > As it says we have never tested for complex so I do not know how far > it is from working. > > > > Barry > > > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > > > Dear PETSc Developers, > > > > I am compiling PETSc-dev using GPU in complex number mode. > > However, when I configure PETSc-dev, I get the following errors > > " > > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > > for details): > > > ------------------------------------------------------------------------------- > > Cannot use cuda with complex numbers it is not coded for this capability > > > ******************************************************************************* > > " > > I have checked CUSP. The latest version can support complex number. > > What should I do for CUDA? > > Thank you very much. > > > > Best, > > Yujie > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danesh.daroui at ltu.se Mon Jan 23 16:18:26 2012 From: danesh.daroui at ltu.se (=?utf-8?B?RGFuZXNoIERhcm91aQ==?=) Date: Mon, 23 Jan 2012 23:18:26 +0100 Subject: [petsc-users] =?utf-8?q?CUDA_with_complex_number?= Message-ID: <201201232218.q0NMIbu4008515@mxi.ltu.se> Well I don't have any share at that company! :-) That would be true but that is the only lapack and sparse lib fir gpu that I knew. ----- Reply message ----- From: "Jed Brown" To: "PETSc users list" Subject: [petsc-users] CUDA with complex number Date: Mon, Jan 23, 2012 22:52 On Mon, Jan 23, 2012 at 15:40, Danesh Daroui wrote: For sparse and even dense solvers for GPU you can use CULA a product from CULA tools. This commercial product smells like a scam. Their benchmark/comparisons are less than worthless and they don't even compare to the free/open source libraries. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xyuan at lbl.gov Mon Jan 23 16:45:34 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Mon, 23 Jan 2012 14:45:34 -0800 Subject: [petsc-users] How to simplify the nonzero structure of the jacobian matrix. In-Reply-To: References: <9366BD13-47A6-47DC-AA0B-1E25A9388B4F@lbl.gov> <24EA91F5-6441-4582-9B8F-8F510FA7126F@lbl.gov> Message-ID: <0FEDC3EB-319D-4CB2-876C-5CB2CC4E027A@lbl.gov> Hi, Jed, Thanks very much! Problem solved. 
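For the archive, a minimal sketch of the combination discussed in this thread, assuming the petsc-dev routine named in the man page linked above (DMSetMatrixPreallocateOnly()); the other calls and names simply mirror the earlier snippets and are illustrative:

    ierr = DMSetMatrixPreallocateOnly(DMMGGetDM(dmmg), PETSC_TRUE);CHKERRQ(ierr);     /* or run with -dm_preallocate_only */
    ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr);          /* preallocated, nonzero structure not pre-filled */
    ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); /* must be set before any values are inserted */
    /* ... MatSetValuesStencil() with only the couplings that are actually present ... */
    ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

With this, only the entries that are actually inserted should end up in the assembled matrix, so -ksp_view should report the smaller nonzero count.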
Cheers, Rebecca On Jan 23, 2012, at 1:55 PM, Jed Brown wrote: > -dm_preallocate_only > > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/DM/DMSetMatrixPreallocateOnly.html > > On Mon, Jan 23, 2012 at 15:52, Xuefei (Rebecca) Yuan wrote: > Hello all, > > There is one example that has used the > > ierr = MatSetOption(*B,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE);CHKERRQ(ierr); > > in /src/mat/examples/tutorials/ex12.c, > > and this option works for this case because the matrix B is created via > > ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n+1,n+1,0,cnt,B);CHKERRQ(ierr); > > and the values are inserted by > > ierr = MatSetValues(*B,1,&i,aij->i[i+1] - aij->i[i],aij->j + aij->i[i],aij->a + aij->i[i],INSERT_VALUES);CHKERRQ(ierr); > > > However, the option MAT_IGNORE_ZERO_ENTRIES does not work for the matrix if it is generated via > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > no matter if MatSetValues() or MatSetValuesStencil() is used. > > In the other word, if the matrix is generated based on the stencil information, the zero entries cannot be ignored even if the option is called before inserting/adding values. > > Is there anyway that can get rid of those zeros in the matrix generated based on the stencil information? > > > Thanks very much! > > Best regards, > > Rebecca > > > > > > On Jan 20, 2012, at 5:23 PM, Matthew Knepley wrote: > >> On Fri, Jan 20, 2012 at 7:07 PM, Xuefei (Rebecca) Yuan wrote: >> Here is the output: >> >> >> >> >> >> On Jan 20, 2012, at 5:01 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 20, 2012 at 6:55 PM, Xuefei (Rebecca) Yuan wrote: >>> Hello Matt, >>> >>> I tried several times for 3.1-p8 and dev version by putting MatSetOption >>> >>> Are you sure your entries are exactly 0.0? >> >> >> Are you using ADD_VALUES? >> >> http://petsc.cs.iit.edu/petsc/petsc-dev/file/783e93230143/src/mat/impls/aij/seq/aij.c#l310 >> >> Matt >> >>> Matt >>> >>> 1) right after creation of the matrix: >>> >>> #ifdef petscDev >>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #else >>> ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>> #endif >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 2) at the beginning of the FormJacobianLocal() routine: >>> >>> PetscFunctionBegin; >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> >>> 3) before calling MatAssemblyBegin() in FormJacobianLocal() routine: >>> >>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>> >>> None of those works. What is wrong here? >>> >>> Thanks, >>> >>> R >>> >>> >>> On Jan 20, 2012, at 4:32 PM, Matthew Knepley wrote: >>> >>>> On Fri, Jan 20, 2012 at 6:28 PM, Xuefei (Rebecca) Yuan wrote: >>>> Hello Matt, >>>> >>>> I have changed the code as >>>> >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> >>>> You have to set it before you start setting values, so we know to ignore them. 
>>>> >>>> Matt >>>> >>>> but still get the same result as before, the matrix still has 5776 nonzeros: >>>> >>>> % Size = 100 100 >>>> 2 % Nonzeros = 5776 >>>> 3 zzz = zeros(5776,3); >>>> >>>> Then I switch the order as >>>> >>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); >>>> >>>> nothing changed. >>>> >>>> The version is 3.1-p8. >>>> >>>> Thanks very much! >>>> >>>> Best regards, >>>> >>>> Rebecca >>>> >>>> >>>> >>>> >>>> On Jan 20, 2012, at 4:09 PM, Matthew Knepley wrote: >>>> >>>>> On Fri, Jan 20, 2012 at 6:02 PM, Xuefei (Rebecca) Yuan wrote: >>>>> Hello all, >>>>> >>>>> This is a test for np=1 case of the nonzero structure of the jacobian matrix. The jacobian matrix is created via >>>>> >>>>> ierr = DMDACreate2d(comm,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_BOX, parameters.mxfield, parameters.myfield, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); >>>>> >>>>> ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); >>>>> >>>>> After creation of the jacobian matrix, >>>>> >>>>> ierr = MatAssemblyBegin(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> ierr = MatAssemblyEnd(jacobian, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>>> >>>>> PetscViewer viewer; >>>>> char fileName[120]; >>>>> sprintf(fileName, "jacobian_after_creation.m");CHKERRQ(ierr); >>>>> >>>>> FILE * fp; >>>>> >>>>> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,fileName,&viewer);CHKERRQ(ierr); >>>>> ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr); >>>>> ierr = MatView (jacobian, viewer); CHKERRQ (ierr); >>>>> ierr = PetscFOpen(PETSC_COMM_WORLD,fileName,"a",&fp); CHKERRQ(ierr); >>>>> ierr = PetscViewerASCIIPrintf(viewer,"spy((spconvert(zzz)));\n");CHKERRQ(ierr); >>>>> ierr = PetscFClose(PETSC_COMM_WORLD,fp);CHKERRQ(ierr); >>>>> PetscViewerDestroy(&viewer); >>>>> >>>>> I took a look at the structure of the jacobian by storing it in the matlab format, the matrix has 5776 nonzeros entries, however, those values are all zeros at the moment as I have not insert or add any values into it yet, the structure shows: (the following figure shows a global replacement of 0.0 by 1.0 for those 5776 numbers) >>>>> >>>>> >>>>> >>>>> >>>>> Inside the FormJacobianLocal() function, I have selected the index to pass to the nonzero values to jacobian, for example, >>>>> >>>>> ierr = MatSetValuesStencil(jacobian, 1, &row, 6, col, v, INSERT_VALUES);CHKERRQ(ierr); >>>>> >>>>> where >>>>> >>>>> col[0].i = column[4].i; >>>>> col[1].i = column[5].i; >>>>> col[2].i = column[6].i; >>>>> col[3].i = column[9].i; >>>>> col[4].i = column[10].i; >>>>> col[5].i = column[12].i; >>>>> >>>>> >>>>> col[0].j = column[4].j; >>>>> col[1].j = column[5].j; >>>>> col[2].j = column[6].j; >>>>> col[3].j = column[9].j; >>>>> col[4].j = column[10].j; >>>>> col[5].j = column[12].j; >>>>> >>>>> col[0].c = column[4].c; >>>>> col[1].c = column[5].c; >>>>> col[2].c = column[6].c; >>>>> col[3].c = column[9].c; >>>>> col[4].c = column[10].c; >>>>> col[5].c = column[12].c; >>>>> >>>>> v[0] = value[4]; >>>>> v[1] = value[5]; >>>>> v[2] = value[6]; >>>>> v[3] = value[9]; >>>>> v[4] = value[10]; >>>>> v[5] = value[12]; >>>>> >>>>> and did not pass the zero entries into the jacobian matrix. 
However, after inserting or adding all values to the matrix, by the same routine above to take a look at the jacobian matrix in matlab format, the matrix still has 5776 nonzeros, in which 1075 numbers are nonzeros, and the other 4701 numbers are all zeros. The spy() gives >>>>> >>>>> >>>>> >>>>> >>>>> for the true nonzero structures. >>>>> >>>>> But the ksp_view will give the nonzeros number as 5776, instead of 1075: >>>>> >>>>> linear system matrix = precond matrix: >>>>> Matrix Object: Mat_0x84000000_1 1 MPI processes >>>>> type: seqaij >>>>> rows=100, cols=100 >>>>> total: nonzeros=5776, allocated nonzeros=5776 >>>>> >>>>> It is a waste of memory to have all those values of zeros been stored in the jacobian. >>>>> >>>>> Is there anyway to get rid of those zero values in jacobian and has the only nonzero numbers stored in jacobian? In such a case, the ksp_view will tell that total: nonzeros=1075. >>>>> >>>>> MatSetOption(MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); >>>>> >>>>> Matt >>>>> >>>>> Thanks very much! >>>>> >>>>> Have a nice weekend! >>>>> >>>>> Cheers, >>>>> >>>>> Rebecca >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulm at txcorp.com Mon Jan 23 17:09:37 2012 From: paulm at txcorp.com (Paul Mullowney) Date: Mon, 23 Jan 2012 16:09:37 -0700 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> Message-ID: <4F1DE8B1.70000@txcorp.com> Yujie, I'm developing GPU code for PETSc that will, at some point, support complex numbers. However, I cannot give you a time frame as to when that will be complete as I have several other tasks that are higher priority right now. -Paul > Dear Barry, > > For complex number-based implementation, will the functions in Vec, > Mat, KSP, and PC having been realized for real number work at least at > the running level? > Do I need to do more coding for them? > > To my knowledge, Victor Minden finished most of the coding work for > GPU implementation in PETSc with your and Matt's help. > However, it seems there is not people to further realize and optimize > other GPU functions in PETSc. > Do you think GPU-based computation is not important or has low > efficiency compared to current CPU-based implementation in PETSc? > > > Thank you very much, > Yujie > > > On Mon, Jan 23, 2012 at 3:46 PM, Barry Smith > wrote: > > > Yujie, > > As I said we have never tried it. You are welcome to try it > following the directions I gave. 
You must realize that we do not > have an army of people to support GPUs so you are largely on your > on in terms of getting things done and need to use your initiative > in trying things yourself and working through the problems that > may arise. > > Barry > > On Jan 23, 2012, at 3:36 PM, recrusader wrote: > > > Dear Paul, > > > > Please find the changelog in the following link for CUSP 0.2.0 > > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > > > They has added cusp:complex class to support complex > number-based operation. > > > > does it not work for PETSc? > > > > Thanks a lot, > > Yujie > > > > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney > > wrote: > > I would claim that Petsc does not support GPU capabilities for > complex numbers right now. > > > > The CUSP library is templated over the scalar type (i.e. float > or double), however I don't think it supports complex numbers. > Although, I could be wrong on this. > > > > It could be somewhat straightforward to move from CUSP to > CUSPARSE (yes, these are different Nvidia libraries). I believe > CUSPARSE supports SpMV for complex types. It also supports > triangular solve for complex types. > > > > -Paul > > > > > > This is an installation issue, please send all installation > issues to petsc-maint at mcs.anl.gov > with the entire configure.log > > > > > > > > > > You can try editing config/PETSc/packages/cuda.py and > removing the two lines > > > > if self.scalartypes.scalartype == 'complex': > > raise RuntimeError('Must use real numbers with CUDA') > > > > As it says we have never tested for complex so I do not know > how far it is from working. > > > > Barry > > > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > > > Dear PETSc Developers, > > > > I am compiling PETSc-dev using GPU in complex number mode. > > However, when I configure PETSc-dev, I get the following errors > > " > > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > > for details): > > > ------------------------------------------------------------------------------- > > Cannot use cuda with complex numbers it is not coded for this > capability > > > ******************************************************************************* > > " > > I have checked CUSP. The latest version can support complex number. > > What should I do for CUDA? > > Thank you very much. > > > > Best, > > Yujie > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Jan 23 19:54:54 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 23 Jan 2012 19:54:54 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: <4F1DE8B1.70000@txcorp.com> References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> <4F1DE8B1.70000@txcorp.com> Message-ID: Dear Paul, Thank you very much for your consideration. Actually, if the codes have been finished for complex number-based computation in PETSc, I can do some testings. It will be ok for me. If lots of codes need to be written regarding what I said, I am not sure whether I have enough time. After all, I am not familiar with PETSc and CUDA. Best, Yujie On Mon, Jan 23, 2012 at 5:09 PM, Paul Mullowney wrote: > ** > Yujie, > > I'm developing GPU code for PETSc that will, at some point, support > complex numbers. 
However, I cannot give you a time frame as to when that > will be complete as I have several other tasks that are higher priority > right now. > > -Paul > > Dear Barry, > > For complex number-based implementation, will the functions in Vec, Mat, > KSP, and PC having been realized for real number work at least at the > running level? > Do I need to do more coding for them? > > To my knowledge, Victor Minden finished most of the coding work for GPU > implementation in PETSc with your and Matt's help. > However, it seems there is not people to further realize and optimize > other GPU functions in PETSc. > Do you think GPU-based computation is not important or has low efficiency > compared to current CPU-based implementation in PETSc? > > > Thank you very much, > Yujie > > > On Mon, Jan 23, 2012 at 3:46 PM, Barry Smith wrote: > >> >> Yujie, >> >> As I said we have never tried it. You are welcome to try it following >> the directions I gave. You must realize that we do not have an army of >> people to support GPUs so you are largely on your on in terms of getting >> things done and need to use your initiative in trying things yourself and >> working through the problems that may arise. >> >> Barry >> >> On Jan 23, 2012, at 3:36 PM, recrusader wrote: >> >> > Dear Paul, >> > >> > Please find the changelog in the following link for CUSP 0.2.0 >> > http://code.google.com/p/cusp-library/source/browse/CHANGELOG >> > >> > They has added cusp:complex class to support complex number-based >> operation. >> > >> > does it not work for PETSc? >> > >> > Thanks a lot, >> > Yujie >> > >> > >> > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney >> wrote: >> > I would claim that Petsc does not support GPU capabilities for complex >> numbers right now. >> > >> > The CUSP library is templated over the scalar type (i.e. float or >> double), however I don't think it supports complex numbers. Although, I >> could be wrong on this. >> > >> > It could be somewhat straightforward to move from CUSP to CUSPARSE >> (yes, these are different Nvidia libraries). I believe CUSPARSE supports >> SpMV for complex types. It also supports triangular solve for complex types. >> > >> > -Paul >> > >> > >> > This is an installation issue, please send all installation issues >> to petsc-maint at mcs.anl.gov with the entire configure.log >> > >> > >> > >> > >> > You can try editing config/PETSc/packages/cuda.py and removing the >> two lines >> > >> > if self.scalartypes.scalartype == 'complex': >> > raise RuntimeError('Must use real numbers with CUDA') >> > >> > As it says we have never tested for complex so I do not know how far >> it is from working. >> > >> > Barry >> > >> > On Jan 23, 2012, at 1:24 PM, recrusader wrote: >> > >> > Dear PETSc Developers, >> > >> > I am compiling PETSc-dev using GPU in complex number mode. >> > However, when I configure PETSc-dev, I get the following errors >> > " >> > >> ******************************************************************************* >> > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> > for details): >> > >> ------------------------------------------------------------------------------- >> > Cannot use cuda with complex numbers it is not coded for this capability >> > >> ******************************************************************************* >> > " >> > I have checked CUSP. The latest version can support complex number. >> > What should I do for CUDA? >> > Thank you very much. 
>> > >> > Best, >> > Yujie >> > >> > >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jan 23 21:27:29 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jan 2012 21:27:29 -0600 Subject: [petsc-users] CUDA with complex number In-Reply-To: References: <5595A735-5BA6-4F96-B665-B19F4773ACBF@mcs.anl.gov> <4F1DD10C.4000702@txcorp.com> <4DD2945F-616A-4B1A-8E6A-743323B2195E@mcs.anl.gov> Message-ID: <53235B98-1DD0-4465-BC3E-5D7994B8E7E3@mcs.anl.gov> On Jan 23, 2012, at 4:17 PM, recrusader wrote: > Dear Barry, > > For complex number-based implementation, will the functions in Vec, Mat, KSP, and PC having been realized for real number work at least at the running level? > Do I need to do more coding for them? I don't think it is a matter of much coding. It would be a matter of fixing little uses of templates to get it all working with complex. No way to know what it involves until you try. > > To my knowledge, Victor Minden finished most of the coding work for GPU implementation in PETSc with your and Matt's help. > However, it seems there is not people to further realize and optimize other GPU functions in PETSc. > Do you think GPU-based computation is not important or has low efficiency compared to current CPU-based implementation in PETSc? There is only a certain amount a finite number of people can do. Our priorities are more adding more functionality to PETSc, better solvers etc. Honestly it would take the entire PETSc team working full time on GPUs to make it top notch, but would that be the best use of our time? Basically Paul is the only one developing the GPU code at the moment. Barry > > > Thank you very much, > Yujie > > > On Mon, Jan 23, 2012 at 3:46 PM, Barry Smith wrote: > > Yujie, > > As I said we have never tried it. You are welcome to try it following the directions I gave. You must realize that we do not have an army of people to support GPUs so you are largely on your on in terms of getting things done and need to use your initiative in trying things yourself and working through the problems that may arise. > > Barry > > On Jan 23, 2012, at 3:36 PM, recrusader wrote: > > > Dear Paul, > > > > Please find the changelog in the following link for CUSP 0.2.0 > > http://code.google.com/p/cusp-library/source/browse/CHANGELOG > > > > They has added cusp:complex class to support complex number-based operation. > > > > does it not work for PETSc? > > > > Thanks a lot, > > Yujie > > > > > > On Mon, Jan 23, 2012 at 3:28 PM, Paul Mullowney wrote: > > I would claim that Petsc does not support GPU capabilities for complex numbers right now. > > > > The CUSP library is templated over the scalar type (i.e. float or double), however I don't think it supports complex numbers. Although, I could be wrong on this. > > > > It could be somewhat straightforward to move from CUSP to CUSPARSE (yes, these are different Nvidia libraries). I believe CUSPARSE supports SpMV for complex types. It also supports triangular solve for complex types. > > > > -Paul > > > > > > This is an installation issue, please send all installation issues to petsc-maint at mcs.anl.gov with the entire configure.log > > > > > > > > > > You can try editing config/PETSc/packages/cuda.py and removing the two lines > > > > if self.scalartypes.scalartype == 'complex': > > raise RuntimeError('Must use real numbers with CUDA') > > > > As it says we have never tested for complex so I do not know how far it is from working. 
> > > > Barry > > > > On Jan 23, 2012, at 1:24 PM, recrusader wrote: > > > > Dear PETSc Developers, > > > > I am compiling PETSc-dev using GPU in complex number mode. > > However, when I configure PETSc-dev, I get the following errors > > " > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > > for details): > > ------------------------------------------------------------------------------- > > Cannot use cuda with complex numbers it is not coded for this capability > > ******************************************************************************* > > " > > I have checked CUSP. The latest version can support complex number. > > What should I do for CUDA? > > Thank you very much. > > > > Best, > > Yujie > > > > > > From dominik at itis.ethz.ch Tue Jan 24 06:27:17 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 24 Jan 2012 13:27:17 +0100 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut Message-ID: I am getting the above error unexpectedly when setting/assembling my MPIAIJ matrices. The input unstrtuctured mesh was partitioned with parmetis and no errors were reported. The error only happens for some number of requested partitions. The code is clean according to valgrind. What else can I do about it? Thanks for any suggestions. Dominik From jedbrown at mcs.anl.gov Tue Jan 24 06:37:55 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 24 Jan 2012 06:37:55 -0600 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: The error is inside of METIS and this is just a bare assert, so there isn't information regarding whether they think this is the result of invalid input or a METIS bug. Get the full stack trace so we can determine where this is happening, METIS is a partitioner, so it's not normally called as part of assembling a matrix. On Tue, Jan 24, 2012 at 06:27, Dominik Szczerba wrote: > I am getting the above error unexpectedly when setting/assembling my > MPIAIJ matrices. The input unstrtuctured mesh was partitioned with > parmetis and no errors were reported. The error only happens for some > number of requested partitions. > > The code is clean according to valgrind. What else can I do about it? > Thanks for any suggestions. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Tue Jan 24 08:37:40 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 24 Jan 2012 15:37:40 +0100 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: On Tue, Jan 24, 2012 at 1:37 PM, Jed Brown wrote: > The error is inside of METIS and this is just a bare assert, so there isn't > information regarding whether they think this is the result of invalid input > or a METIS bug. Get the full stack trace so we can determine where this is > happening, METIS is a partitioner, so it's not normally called as part of > assembling a matrix. This will be difficult because it happens only with 64 processes and I am on a cray where I can not run gdb. I will try to run the same on my quadcore, but am not sure what I get. Is it possible to replace calls to parmetis with chaco or party for "internal" usage (and NOT explicit partitioning that I do myself)? Is it supported/good idea? 
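For the explicit MatPartitioning call the package can be switched per object or at run time; as a later exchange in this archive notes, a distributed (MPIAIJ/MPIADJ) graph needs a parallel package such as parmetis or ptscotch. A minimal sketch, assuming an assembled MATMPIADJ adjacency matrix named adj and the usual ierr/CHKERRQ error handling:

    MatPartitioning part;
    IS              is;
    ierr = MatPartitioningCreate(PETSC_COMM_WORLD, &part); CHKERRQ(ierr);
    ierr = MatPartitioningSetAdjacency(part, adj); CHKERRQ(ierr);
    ierr = MatPartitioningSetType(part, MATPARTITIONINGPARMETIS); CHKERRQ(ierr);
    // or choose the package at run time: -mat_partitioning_type parmetis|ptscotch|chaco|party
    ierr = MatPartitioningSetFromOptions(part); CHKERRQ(ierr);
    ierr = MatPartitioningApply(part, &is); CHKERRQ(ierr);
    // ... use the IS, then clean up
    ierr = ISDestroy(&is); CHKERRQ(ierr);
    ierr = MatPartitioningDestroy(&part); CHKERRQ(ierr);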
Many thanks Dominik From jedbrown at mcs.anl.gov Tue Jan 24 09:03:43 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 24 Jan 2012 09:03:43 -0600 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: At least send the PETSc stack trace. PETSc always uses MatPartitioning, so you can use any partitioner. On Jan 24, 2012 8:51 AM, "Dominik Szczerba" wrote: > On Tue, Jan 24, 2012 at 1:37 PM, Jed Brown wrote: > > The error is inside of METIS and this is just a bare assert, so there > isn't > > information regarding whether they think this is the result of invalid > input > > or a METIS bug. Get the full stack trace so we can determine where this > is > > happening, METIS is a partitioner, so it's not normally called as part of > > assembling a matrix. > > This will be difficult because it happens only with 64 processes and I > am on a cray where I can not run gdb. > I will try to run the same on my quadcore, but am not sure what I get. > > Is it possible to replace calls to parmetis with chaco or party for > "internal" usage (and NOT explicit partitioning that I do myself)? Is > it supported/good idea? > > Many thanks > Dominik > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 06:52:06 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 13:52:06 +0100 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: > The error is inside of METIS and this is just a bare assert, so there isn't > information regarding whether they think this is the result of invalid input > or a METIS bug. Get the full stack trace so we can determine where this is > happening, METIS is a partitioner, so it's not normally called as part > of assembling a matrix. I have built both petsc and my app in debug mode, unfortunately, I do not get any trace, only the below message, followed by assertions from parmetis. Is there any other way to increase verbosity / find out where it is triggered? 
Dominik _pmiu_daemon(SIGCHLD): [NID 00098] [c1-0c2s1n2] [Wed Jan 25 13:35:59 2012] PE RANK 36 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00062] [c0-0c1s0n0] [Wed Jan 25 13:35:58 2012] PE RANK 20 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00130] [c1-0c1s1n0] [Wed Jan 25 13:35:58 2012] PE RANK 48 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00099] [c1-0c2s1n3] [Wed Jan 25 13:35:59 2012] PE RANK 37 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00055] [c0-0c1s4n1] [Wed Jan 25 13:35:58 2012] PE RANK 15 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00026] [c0-0c0s2n2] [Wed Jan 25 13:35:58 2012] PE RANK 4 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00058] [c0-0c1s2n0] [Wed Jan 25 13:35:58 2012] PE RANK 16 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00073] [c0-0c2s4n1] [Wed Jan 25 13:35:58 2012] PE RANK 27 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00159] [c1-0c1s0n3] [Wed Jan 25 13:35:58 2012] PE RANK 55 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00066] [c0-0c2s1n0] [Wed Jan 25 13:35:59 2012] PE RANK 24 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00156] [c1-0c1s1n2] [Wed Jan 25 13:35:58 2012] PE RANK 52 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00029] [c0-0c0s1n3] [Wed Jan 25 13:35:58 2012] PE RANK 7 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00040] [c0-0c1s4n2] [Wed Jan 25 13:35:58 2012] PE RANK 12 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00105] [c1-0c2s4n3] [Wed Jan 25 13:35:58 2012] PE RANK 39 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00129] [c1-0c1s0n1] [Wed Jan 25 13:35:58 2012] PE RANK 47 exit signal Aborted [NID 00055] 2012-01-25 13:35:58 Apid 478353: initiated application termination _pmiu_daemon(SIGCHLD): [NID 00067] [c0-0c2s1n1] [Wed Jan 25 13:35:58 2012] PE RANK 25 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00041] [c0-0c1s4n3] [Wed Jan 25 13:35:58 2012] PE RANK 13 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00157] [c1-0c1s1n3] [Wed Jan 25 13:35:58 2012] PE RANK 53 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00127] [c1-0c2s0n1] [Wed Jan 25 13:35:58 2012] PE RANK 45 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00093] [c0-0c2s1n3] [Wed Jan 25 13:35:58 2012] PE RANK 31 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00065] [c0-0c2s0n1] [Wed Jan 25 13:35:58 2012] PE RANK 23 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00063] [c0-0c1s0n1] [Wed Jan 25 13:35:58 2012] PE RANK 21 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00033] [c0-0c1s0n3] [Wed Jan 25 13:35:58 2012] PE RANK 9 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00032] [c0-0c1s0n2] [Wed Jan 25 13:35:58 2012] PE RANK 8 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00161] [c1-0c0s0n3] [Wed Jan 25 13:35:58 2012] PE RANK 57 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00086] [c0-0c2s4n2] [Wed Jan 25 13:35:58 2012] PE RANK 28 exit signal Aborted _pmiu_daemon(SIGCHLD): [NID 00160] [c1-0c0s0n2] [Wed Jan 25 13:35:58 2012] PE RANK 56 exit signal Aborted Command exited with non-zero status 137 From knepley at gmail.com Wed Jan 25 07:55:16 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 07:55:16 -0600 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 6:52 AM, Dominik Szczerba wrote: > > The error is inside of METIS and this is just a bare assert, so there > isn't > > information regarding whether they think this is the result of invalid > input > > or a METIS bug. 
Get the full stack trace so we can determine where this > is > > happening, METIS is a partitioner, so it's not normally called as part > > of assembling a matrix. > > I have built both petsc and my app in debug mode, unfortunately, I do > not get any trace, only the below message, followed by assertions from > parmetis. Is there any other way to increase verbosity / find out > where it is triggered? > asserts are a terrible debugging tool. You need to either use a debugger, or output the matrix in a form that the ParMetis people can use and debug with. Matt > Dominik > > > _pmiu_daemon(SIGCHLD): [NID 00098] [c1-0c2s1n2] [Wed Jan 25 13:35:59 > 2012] PE RANK 36 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00062] [c0-0c1s0n0] [Wed Jan 25 13:35:58 > 2012] PE RANK 20 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00130] [c1-0c1s1n0] [Wed Jan 25 13:35:58 > 2012] PE RANK 48 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00099] [c1-0c2s1n3] [Wed Jan 25 13:35:59 > 2012] PE RANK 37 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00055] [c0-0c1s4n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 15 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00026] [c0-0c0s2n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 4 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00058] [c0-0c1s2n0] [Wed Jan 25 13:35:58 > 2012] PE RANK 16 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00073] [c0-0c2s4n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 27 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00159] [c1-0c1s0n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 55 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00066] [c0-0c2s1n0] [Wed Jan 25 13:35:59 > 2012] PE RANK 24 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00156] [c1-0c1s1n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 52 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00029] [c0-0c0s1n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 7 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00040] [c0-0c1s4n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 12 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00105] [c1-0c2s4n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 39 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00129] [c1-0c1s0n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 47 exit signal Aborted > [NID 00055] 2012-01-25 13:35:58 Apid 478353: initiated application > termination > _pmiu_daemon(SIGCHLD): [NID 00067] [c0-0c2s1n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 25 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00041] [c0-0c1s4n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 13 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00157] [c1-0c1s1n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 53 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00127] [c1-0c2s0n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 45 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00093] [c0-0c2s1n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 31 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00065] [c0-0c2s0n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 23 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00063] [c0-0c1s0n1] [Wed Jan 25 13:35:58 > 2012] PE RANK 21 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00033] [c0-0c1s0n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 9 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00032] [c0-0c1s0n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 8 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00161] [c1-0c0s0n3] [Wed Jan 25 13:35:58 > 2012] PE RANK 57 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00086] [c0-0c2s4n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 28 exit signal Aborted > _pmiu_daemon(SIGCHLD): [NID 00160] 
[c1-0c0s0n2] [Wed Jan 25 13:35:58 > 2012] PE RANK 56 exit signal Aborted > Command exited with non-zero status 137 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 08:36:08 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 15:36:08 +0100 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: > asserts are a terrible debugging tool. You need to either use a debugger, or > output > the matrix in a form that the ParMetis people can use and debug with. After a lot of fun running the program on a quadcore with 64 processes and as many gdb windows, typing 'c' into all of them without closing them accidentally, then finding the ones that have exitted, I found the below pasted trace. Does it help to locate the problem? Many thanks Dominik #0 0x00007fd4232433a5 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007fd423246b0b in __GI_abort () at abort.c:92 #2 0x000000000109baff in __FM_2WayEdgeRefine (ctrl=0x7fff8f7d7b30, graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, npasses=4) at fm.c:65 #3 0x000000000109d483 in __GrowBisection (ctrl=0x7fff8f7d7b30, graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:188 #4 0x000000000109ccd4 in __Init2WayPartition (ctrl=0x7fff8f7d7b30, graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:36 #5 0x0000000001084dc2 in __MlevelEdgeBisection (ctrl=0x7fff8f7d7b30, graph=0x7fff8f7d7c20, tpwgts=0x7fff8f7d7a70, ubfactor=1) at pmetis.c:173 #6 0x0000000001084a30 in __MlevelRecursiveBisection (ctrl=0x7fff8f7d7b30, graph=0x7fff8f7d7c20, nparts=2, part=0xe0c4bc8, tpwgts=0x73871e0, ubfactor=1, fpart=0) at pmetis.c:120 #7 0x000000000108488f in METIS_WPartGraphRecursive (nvtxs=0x341b030, xadj=0x7f60ef0, adjncy=0x7f61124, vwgt=0x7f60f80, adjwgt=0x7f621c4, wgtflag=0x7fff8f7d7dd4, numflag=0x7fff8f7d7dd8, nparts=0x7fff8f7d7d7c, tpwgts=0x7fff8f7d8110, options=0x7fff8f7d7d90, edgecut=0x7fff8f7d7ddc, part=0xe0c4bc8) at pmetis.c:85 #8 0x000000000105a520 in __MlevelKWayPartitioning (ctrl=0x7fff8f7d7e60, graph=0x7fff8f7d7f50, nparts=2, part=0x7c7b9e0, tpwgts=0x7fff8f7d8110, ubfactor=1) at kmetis.c:110 #9 0x000000000105fcb3 in METIS_WPartGraphKway2 (nvtxs=0x33f6174, xadj=0x8bf81b0, adjncy=0x7c41850, vwgt=0x8bf1b40, adjwgt=0x7b3b450, wgtflag=0x7fff8f7d81c8, numflag=0x7fff8f7d81c4, nparts=0x7fff8f7d81c0, tpwgts=0x7fff8f7d8110, options=0x7fff8f7d80e0, edgecut=0x7fff8f7d81cc, part=0x7c7b9e0) at parmetis.c:79 #10 0x0000000001031d20 in Mc_InitPartition_RB__ (ctrl=0x7fff8f7d8860, graph=0x3a5a060, wspace=0x7fff8f8109f0) at initpart.c:95 #11 0x0000000001031348 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x3a5a060, wspace=0x7fff8f8109f0) at kmetis.c:219 #12 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x3a00cc0, wspace=0x7fff8f8109f0) at kmetis.c:238 #13 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0xe145c20, wspace=0x7fff8f8109f0) at kmetis.c:238 #14 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x3abdb60, wspace=0x7fff8f8109f0) at kmetis.c:238 #15 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x3a016e0, wspace=0x7fff8f8109f0) at kmetis.c:238 #16 0x0000000001031475 in 
Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x7377a70, wspace=0x7fff8f8109f0) at kmetis.c:238 #17 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x736d510, wspace=0x7fff8f8109f0) at kmetis.c:238 #18 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, graph=0x738b670, wspace=0x7fff8f8109f0) at kmetis.c:238 #19 0x0000000001030d5f in ParMETIS_V3_PartKway (vtxdist=0x9f15090, xadj=0x7cf4e90, adjncy=0x7a57b60, vwgt=0x0, adjwgt=0xb26ebc0, wgtflag=0x7fff8f810c24, numflag=0x7fff8f810c28, ncon=0x7fff8f810c2c, nparts=0x7fff8f810c30, tpwgts=0x91b5c70, ubvec=0x91b4d50, options=0x7fff8f810bb0, edgecut=0x91b54f0, part=0x3349f20, comm=0x91b5504) at kmetis.c:146 #20 0x0000000000a9e6c5 in MatPartitioningApply_Parmetis (part=0x91b34d0, partitioning=0x7fff8f811008) at /home/dsz/pack/petsc-3.2-p5/src/mat/partition/impls/pmetis/pmetis.c:96 #21 0x0000000000695ecd in MatPartitioningApply (matp=0x91b34d0, partitioning=0x7fff8f811008) at /home/dsz/pack/petsc-3.2-p5/src/mat/partition/partition.c:226 #22 0x00000000004d31d6 in FluidSolver::CreateSolverContexts (this=0x30eb400) at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:3104 #23 0x00000000004c697f in FluidSolver::Solve (this=0x30eb400) at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:1925 #24 0x00000000005177f9 in main (argc=3, argv=0x7fff8f812c78) at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolverMain.cxx:319 From knepley at gmail.com Wed Jan 25 08:53:30 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 08:53:30 -0600 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 8:36 AM, Dominik Szczerba wrote: > > asserts are a terrible debugging tool. You need to either use a > debugger, or > > output > > the matrix in a form that the ParMetis people can use and debug with. > > After a lot of fun running the program on a quadcore with 64 processes > and as many gdb windows, typing 'c' into all of them without closing > them accidentally, then finding the ones that have exitted, I found > the below pasted trace. Does it help to locate the problem? > That should definitely be sent to the ParMetis team. 
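One way to give them a self-contained test case is to dump the matrix that feeds the partitioner to a binary file just before the failing call; a minimal sketch, where the matrix name A and the file name graph.bin are only placeholders:

    PetscViewer viewer;
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "graph.bin", FILE_MODE_WRITE, &viewer); CHKERRQ(ierr);
    ierr = MatView(A, viewer); CHKERRQ(ierr);  // A: the assembled MPIAIJ matrix later converted to MATMPIADJ
    ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr);

The file can then be reloaded with MatLoad() in a few-line driver, so the failure can be reproduced outside the full application.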
Matt > Many thanks > Dominik > > > #0 0x00007fd4232433a5 in __GI_raise (sig=6) > at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #1 0x00007fd423246b0b in __GI_abort () at abort.c:92 > #2 0x000000000109baff in __FM_2WayEdgeRefine (ctrl=0x7fff8f7d7b30, > graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, npasses=4) at fm.c:65 > #3 0x000000000109d483 in __GrowBisection (ctrl=0x7fff8f7d7b30, > graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:188 > #4 0x000000000109ccd4 in __Init2WayPartition (ctrl=0x7fff8f7d7b30, > graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:36 > #5 0x0000000001084dc2 in __MlevelEdgeBisection (ctrl=0x7fff8f7d7b30, > graph=0x7fff8f7d7c20, tpwgts=0x7fff8f7d7a70, ubfactor=1) at pmetis.c:173 > #6 0x0000000001084a30 in __MlevelRecursiveBisection (ctrl=0x7fff8f7d7b30, > graph=0x7fff8f7d7c20, nparts=2, part=0xe0c4bc8, tpwgts=0x73871e0, > ubfactor=1, fpart=0) at pmetis.c:120 > #7 0x000000000108488f in METIS_WPartGraphRecursive (nvtxs=0x341b030, > xadj=0x7f60ef0, adjncy=0x7f61124, vwgt=0x7f60f80, adjwgt=0x7f621c4, > wgtflag=0x7fff8f7d7dd4, numflag=0x7fff8f7d7dd8, nparts=0x7fff8f7d7d7c, > tpwgts=0x7fff8f7d8110, options=0x7fff8f7d7d90, edgecut=0x7fff8f7d7ddc, > part=0xe0c4bc8) at pmetis.c:85 > #8 0x000000000105a520 in __MlevelKWayPartitioning (ctrl=0x7fff8f7d7e60, > graph=0x7fff8f7d7f50, nparts=2, part=0x7c7b9e0, tpwgts=0x7fff8f7d8110, > ubfactor=1) at kmetis.c:110 > #9 0x000000000105fcb3 in METIS_WPartGraphKway2 (nvtxs=0x33f6174, > xadj=0x8bf81b0, adjncy=0x7c41850, vwgt=0x8bf1b40, adjwgt=0x7b3b450, > wgtflag=0x7fff8f7d81c8, numflag=0x7fff8f7d81c4, nparts=0x7fff8f7d81c0, > tpwgts=0x7fff8f7d8110, options=0x7fff8f7d80e0, edgecut=0x7fff8f7d81cc, > part=0x7c7b9e0) at parmetis.c:79 > #10 0x0000000001031d20 in Mc_InitPartition_RB__ (ctrl=0x7fff8f7d8860, > graph=0x3a5a060, wspace=0x7fff8f8109f0) at initpart.c:95 > #11 0x0000000001031348 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x3a5a060, wspace=0x7fff8f8109f0) at kmetis.c:219 > #12 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x3a00cc0, wspace=0x7fff8f8109f0) at kmetis.c:238 > #13 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0xe145c20, wspace=0x7fff8f8109f0) at kmetis.c:238 > #14 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x3abdb60, wspace=0x7fff8f8109f0) at kmetis.c:238 > #15 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x3a016e0, wspace=0x7fff8f8109f0) at kmetis.c:238 > #16 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x7377a70, wspace=0x7fff8f8109f0) at kmetis.c:238 > #17 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x736d510, wspace=0x7fff8f8109f0) at kmetis.c:238 > #18 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860, > graph=0x738b670, wspace=0x7fff8f8109f0) at kmetis.c:238 > #19 0x0000000001030d5f in ParMETIS_V3_PartKway (vtxdist=0x9f15090, > xadj=0x7cf4e90, adjncy=0x7a57b60, vwgt=0x0, adjwgt=0xb26ebc0, > wgtflag=0x7fff8f810c24, numflag=0x7fff8f810c28, ncon=0x7fff8f810c2c, > nparts=0x7fff8f810c30, tpwgts=0x91b5c70, ubvec=0x91b4d50, > options=0x7fff8f810bb0, edgecut=0x91b54f0, part=0x3349f20, > comm=0x91b5504) > at kmetis.c:146 > #20 0x0000000000a9e6c5 in MatPartitioningApply_Parmetis (part=0x91b34d0, > partitioning=0x7fff8f811008) > at > /home/dsz/pack/petsc-3.2-p5/src/mat/partition/impls/pmetis/pmetis.c:96 > #21 0x0000000000695ecd in MatPartitioningApply (matp=0x91b34d0, > partitioning=0x7fff8f811008) > at 
/home/dsz/pack/petsc-3.2-p5/src/mat/partition/partition.c:226 > #22 0x00000000004d31d6 in FluidSolver::CreateSolverContexts > (this=0x30eb400) > at > /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:3104 > #23 0x00000000004c697f in FluidSolver::Solve (this=0x30eb400) > at > /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:1925 > #24 0x00000000005177f9 in main (argc=3, argv=0x7fff8f812c78) > at > /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolverMain.cxx:319 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From amesga1 at tigers.lsu.edu Wed Jan 25 08:55:39 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Wed, 25 Jan 2012 08:55:39 -0600 Subject: [petsc-users] SNESVI convergence spped Message-ID: Barry, Here is my output from my program with petsc-dev (txt files attached). Even with petsc-dev (revision 21929), SNES (I tried VISS and VIRS) takes a long time to converge (>50 iterations) from a small problem (99 HEX20 elements, 681 DOFs for SNES). The exodusII output files are uploaded in the link; I attached pictures of last time step for Psi field which we solve with SNESVI and u field which the displacement in the x-direction and is my loading. As it was the case with petsc-3.2 line search diverges as soon as we hit the lower bound in a neighborhood of nodes. I ran with KSPPREONLY and PCLU to exclude possibility of LS failure because of the KSP tol. Best, Ata link to exodusII files: http://cl.ly/0g431o443v240x1h2h1B -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: output-viss.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: axial-tension-virs-u.png Type: image/png Size: 10081 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: axial-tension-virs-psi.png Type: image/png Size: 10230 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: axial-tension-viss-psi.png Type: image/png Size: 10230 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: axial-tension-viss-u.png Type: image/png Size: 10081 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: output-virs.txt URL: From dominik at itis.ethz.ch Wed Jan 25 12:10:51 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 19:10:51 +0100 Subject: [petsc-users] question about partitioning usage Message-ID: I recently realized that, independently of my explicitly partitioning the input mesh, Petsc also employs partitioning somewhere internally. Is my understanding correct then: * parmetis is the default even when other partitioners were configured * for MPIAIJ matrices the employed partitioner must be parallel, e.g., parmetis or ptscotch and not sequential, e.g. chaco or party. Thanks for any clarifications. PS. I can not configure petsc with --download-ptscotch on any of my systems, will send configure.log's soon. 
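On the first point, an explicitly created MatPartitioning object can simply be asked to report itself; a small sketch, assuming an object named part that already has its adjacency matrix set:

    ierr = MatPartitioningSetFromOptions(part); CHKERRQ(ierr);                  // picks up -mat_partitioning_type if given
    ierr = MatPartitioningView(part, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); // prints which package was selected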
Dominik From knepley at gmail.com Wed Jan 25 12:19:06 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 12:19:06 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 12:10 PM, Dominik Szczerba wrote: > I recently realized that, independently of my explicitly partitioning > the input mesh, Petsc also employs partitioning somewhere internally. > We don't unless you tell us to. > Is my understanding correct then: > > * parmetis is the default even when other partitioners were configured > Yes. > * for MPIAIJ matrices the employed partitioner must be parallel, e.g., > parmetis or ptscotch and not sequential, e.g. chaco or party. > Yes. Matt > Thanks for any clarifications. > > PS. I can not configure petsc with --download-ptscotch on any of my > systems, will send configure.log's soon. > > Dominik -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 12:29:29 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 19:29:29 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: >> I recently realized that, independently of my explicitly partitioning >> the input mesh, Petsc also employs partitioning somewhere internally. > > > We don't unless you tell us to. Can you please expand? I am partitioning my input unstructured mesh to distribute dofs throughout the processes, and then I am setting up my MPI matricess "as usual", not intending any calls to parmetis or other partitioning functions... So where can I tell or not tell Petsc to use partitioning here or not? Many thanks Dominik From knepley at gmail.com Wed Jan 25 12:37:48 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 12:37:48 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 12:29 PM, Dominik Szczerba wrote: > >> I recently realized that, independently of my explicitly partitioning > >> the input mesh, Petsc also employs partitioning somewhere internally. > > > > > > We don't unless you tell us to. > > Can you please expand? I am partitioning my input unstructured mesh to > distribute dofs throughout the processes, and then I am setting up my > MPI matricess "as usual", not intending any calls to parmetis or other > partitioning functions... So where can I tell or not tell Petsc to use > partitioning here or not? > Unless you create a MatPartitioning object, we do not call ParMetis. Matt > Many thanks > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dominik at itis.ethz.ch Wed Jan 25 12:41:24 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 19:41:24 +0100 Subject: [petsc-users] ***ASSERTION failed on line 65 of file fm.c: ComputeCut(graph, where) == graph->mincut In-Reply-To: References: Message-ID: >> After a lot of fun running the program on a quadcore with 64 processes >> and as many gdb windows, typing 'c' into all of them without closing >> them accidentally, then finding the ones that have exitted, I found >> the below pasted trace. Does it help to locate the problem? > > > That should definitely be sent to the ParMetis team. Just reported in Parmetis bug tracker. Dominik From jedbrown at mcs.anl.gov Wed Jan 25 13:08:51 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 25 Jan 2012 13:08:51 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: PCASM with multiple subdomains per process. PCGAMG coarse levels. On Jan 25, 2012 12:37 PM, "Matthew Knepley" wrote: > On Wed, Jan 25, 2012 at 12:29 PM, Dominik Szczerba wrote: > >> >> I recently realized that, independently of my explicitly partitioning >> >> the input mesh, Petsc also employs partitioning somewhere internally. >> > >> > >> > We don't unless you tell us to. >> >> Can you please expand? I am partitioning my input unstructured mesh to >> distribute dofs throughout the processes, and then I am setting up my >> MPI matricess "as usual", not intending any calls to parmetis or other >> partitioning functions... So where can I tell or not tell Petsc to use >> partitioning here or not? >> > > Unless you create a MatPartitioning object, we do not call ParMetis. > > Matt > > >> Many thanks >> Dominik >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jan 25 13:15:47 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 13:15:47 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 1:08 PM, Jed Brown wrote: > PCASM with multiple subdomains per process. > > PCGAMG coarse levels. > GAMG makes sense, but ASM has to be activated with an option. I don't think it makes sense to do this by default. Matt > On Jan 25, 2012 12:37 PM, "Matthew Knepley" wrote: > >> On Wed, Jan 25, 2012 at 12:29 PM, Dominik Szczerba wrote: >> >>> >> I recently realized that, independently of my explicitly partitioning >>> >> the input mesh, Petsc also employs partitioning somewhere internally. >>> > >>> > >>> > We don't unless you tell us to. >>> >>> Can you please expand? I am partitioning my input unstructured mesh to >>> distribute dofs throughout the processes, and then I am setting up my >>> MPI matricess "as usual", not intending any calls to parmetis or other >>> partitioning functions... So where can I tell or not tell Petsc to use >>> partitioning here or not? >>> >> >> Unless you create a MatPartitioning object, we do not call ParMetis. >> >> Matt >> >> >>> Many thanks >>> Dominik >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 13:20:09 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 20:20:09 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: > Unless you create a MatPartitioning object, we do not call ParMetis. I create it to partition my input unstructured mesh, then I destroy it. It is so separated from the rest of the code, that I could even delegate it to a separate program. But then, as shown in the other thread, I somehow get an assertion fail from parmetis much later in the code, when creating solver contexts. Why is this happening if the MatPartitioning object was destroyed? Dominik > > ? ?Matt > >> >> Many thanks >> Dominik > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From jedbrown at mcs.anl.gov Wed Jan 25 14:41:47 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 25 Jan 2012 14:41:47 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 13:20, Dominik Szczerba wrote: > I create it to partition my input unstructured mesh, then I destroy > it. It is so separated from the rest of the code, that I could even > delegate it to a separate program. > But then, as shown in the other thread, I somehow get an assertion > fail from parmetis much later in the code, when creating solver > contexts. Why is this happening if the MatPartitioning object was > destroyed? > Dominik, you've been at this game for a while now, so you should be familiar with your tools. In any software project, when you encounter an error, you should ask the question "how did I get here?" With languages that have a call stack, the first step to answering that question is to look at the stack trace. There is a well-known tool for producing stack traces and related tasks, it's called a debugger. Now PETSc noticed early on that teaching every user to use a debugger (and dealing with the issues on clusters) is too much work, consequently, PETSc manages its own call stack so that it can produce stack traces automatically. Third-party packages usually don't do this, so they usually just return error codes. Asserts are useful as a developer because it's very little typing and the debugger catches them, but they are horrible in production because the program just exits without telling you why there was an error. If the programmer was nice, they would have placed a comment at that line of code explaining what might cause the assertion to fail, but you still have to dig up the source code and go to that line, and many programmers aren't that careful. Of course failing asserts don't usually forcefully exit, they call abort() which raises SIGABRT and, as it turns out, you can catch SIGABRT (though you cannot block it). However, SIGABRT is the same signal that is usually used to instruct a program to dump core, so in order to not clutter the ability to get core, PETSc does not catch SIGABRT. (At least I think this is the rationale, perhaps it should be reconsidered.) 
Consequently, PETSc doesn't automatically give stack traces when third-party libraries call abort().

Note that the only thing worse than calling assert()/abort() from a library is to call exit() in an error condition, and unfortunately, this is more common than you might think. Again, PETSc could register a callback with atexit(), but this would interfere with users' ability to exit intentionally (before calling PetscFinalize()) without seeing a PETSc error message. In any case, the current decision was not to use atexit() either.

What this all adds up to for you is that you should use a debugger to get a stack trace if you want to find out how you reached a failed assert() in some third-party package. You know that PETSc doesn't require that package, so it's not using it by default for anything. Presumably, you also know that "somehow get an assertion fail from parmetis much later in the code, when creating solver contexts" is pretty vague. How did you determine that it was when creating solver contexts? Even if that claim is correct, are we expected to guess what kind of solvers you have and how they are configured, such that a partitioner might be called? You have to help us help you.

So don't be afraid to use the debugger. Read enough of the documentation that you can fluently work with conditional breakpoints and with watchpoints (both values and locations). Other times, use valgrind with --db-attach=yes to get the debugger to the correct place. I can understand that it looks overwhelming if you are starting out, but if you have made it through your first year of serious development and aren't familiar with these tools yet, you have already lost more time than it takes to learn the tools.

When developing new code, have a hard rule that there always be a run-time parameter to change the problem size, and always make the smallest problem size something that runs in less than 10 seconds without optimization. I have occasionally ended up in circumstances where these rules were not followed and I have regretted it every time. Similarly, always develop code in a friendly development environment. To me, that means that debuggers and valgrind must work, disk latency must be low enough that I don't notice it in Emacs, and the source code has to be indexed so that I can move around quickly. It also means that the build system has to be fast; waiting more than a few seconds for compilation when you make a simple change to a C file is unacceptable.

If you follow these guidelines, I think you will end up answering your own questions in less time than it takes to ask them, and when you find that you need to ask, you will have plenty of relevant information that hopefully we won't need several rounds of email ping-pong to get oriented.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From dominik at itis.ethz.ch Wed Jan 25 14:55:40 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 21:55:40 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: Jed, you are generally right, although I do not capture the context of your remarks in this very case. Maybe the context you are missing is that on my own linux box things run smoothly - I start getting problems on platforms like Windows or Cray, where I can not run a debugger (easily). Regarding parmetis assert fail, I have run the failing code in the debugger and submitted the stack trace, as reported previously.
The question here was where is Petsc using parmetis internally elsewhere than on my explicit call, and why still after having destroyed the MatPartitioning object. I can of course find it all out myself in a few days by studying the code or using a debugger - I thought about such internal issue I'd just ask. Thanks for your understanding and patience, Dominik On Wed, Jan 25, 2012 at 9:41 PM, Jed Brown wrote: > On Wed, Jan 25, 2012 at 13:20, Dominik Szczerba > wrote: >> >> I create it to partition my input unstructured mesh, then I destroy >> it. It is so separated from the rest of the code, that I could even >> delegate it to a separate program. >> But then, as shown in the other thread, I somehow get an assertion >> fail from parmetis much later in the code, when creating solver >> contexts. Why is this happening if the MatPartitioning object was >> destroyed? > > > Dominik, you've been at this game for a while now, so you should be familiar > with your tools. In any software project, when you encounter an error, you > should ask the question "how did I get here?" With languages that have a > call stack, the first step to answering that question is to look at the > stack trace. There is a well-known tool for producing stack traces and > related tasks, it's called a debugger. > > Now PETSc noticed early on that teaching every user to use a debugger (and > dealing with the issues on clusters) is too much work, consequently, PETSc > manages its own call stack so that it can produce stack traces > automatically. Third-party packages usually don't do this, so they usually > just return error codes. Asserts are useful as a developer because it's very > little typing and the debugger catches them, but they are horrible in > production because the program just exits without telling you why there was > an error. If the programmer was nice, they would have placed a comment at > that line of code explaining what might cause the assertion to fail, but you > still have to dig up the source code and go to that line, and many > programmers aren't that careful. > > Of course failing asserts don't usually forcefully exit, they call abort() > which raises SIGABRT and, as it turns out, you can catch SIGABRT (though you > cannot block it). However, SIGABRT is the same signal that is usually used > to instruct a program to dump core, so in order to not clutter the ability > to get core, PETSc does not catch SIGABRT. (At least I think this is the > rationale, perhaps it should be reconsidered.) Consequently, PETSc doesn't > automatically give stack traces when third-party libraries call abort(). > > Note that the only thing worse than calling assert()/abort() from a library > is to call exit() in an error condition, and unfortunately, this is more > common than you might thing. Again, PETSc could register a callback with > atexit(), but this would interfere with users' ability to exit intentionally > (before calling PetscFinalize()) without seeing a PETSc error message. In > any case, the current decision was not to use atexit() either. > > What this all adds up to for you is that you should use a debugger to get a > stack trace if you want to find out how you reached a failed assert() in > some third-party package. You know that PETSc doesn't require that package, > so it's not using it by default for anything. Presumably, you also know that > "somehow get an assertion?fail from parmetis much later in the code, when > creating solver?contexts" is pretty vague. 
How did you determine that it was > when creating solver contexts? Even if that claim is correct, are we > expected to guess what kind of solvers you have and how they are configured, > such that a partitioner might be called? You have to help us help you. > > So don't be afraid to use the debugger. Read enough of the documentation > that you can fluently work with conditional breakpoints and with watchpoints > (both values and locations). Other times, use valgrind with --db-attach=yes > to get the debugger to the correct place. I can understand that it looks > overwhelming if you are starting out, but if you have made it through your > first year of serious development and aren't familiar with these tools yet, > you have already lost more time than it takes to learn the tools. > > When developing new code, have a hard rule that there always be a run-time > parameter to change the problem size, and always make the smallest problem > size something that runs in less than 10 seconds without optimization. I > have occasionally ended up in circumstances where these rules were not > followed and I have regretted it every time. Similarly, always develop code > in a friendly development environment. To me, that means that debuggers and > valgrind must work, disk latency must be low enough that I don't notice it > in Emacs, and the source code has to be indexed so that I can move around > quickly. It also means that the build system has to be fast, waiting more > than a few seconds for compilation when you make a simple change to a C file > is unacceptable. > > If you follow these guidelines, I think you will end up answering your own > questions in less time than it takes to ask them, and when find that you > need to ask, you will have plenty of relevant information that hopefully we > won't need several rounds of email ping-pong to get oriented. From jedbrown at mcs.anl.gov Wed Jan 25 15:07:01 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 25 Jan 2012 15:07:01 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 14:55, Dominik Szczerba wrote: > Jed, you are generally right, although I do not capture the context of > your remarks in this very case. > Maybe the context you are missing is that on my own linux box things > run smoothly - I start getting problems on platforms like Windows or > Cray, where I can not run a debugger (easily). > So ParMETIS crashes on the hostile platform where it is slow to debug, fine. But your code is calling it in cases where it doesn't crash, so why not run the same code in your friendly environment and set a breakpoint in MatPartitioningApply() (or even the ParMETIS routine where the assert is failing) so that you can find out where else it is being called from. This should take about 10 seconds. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 15:28:29 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 22:28:29 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: > So ParMETIS crashes on the hostile platform where it is slow to debug, fine. > > But your code is calling it in cases where it doesn't crash, so why not run > the same code in your friendly environment and set a breakpoint in > MatPartitioningApply() (or even the ParMETIS routine where the assert is > failing) so that you can find out where else it is being called from. 
This > should take about 10 seconds. It will take several minutes till xterm gdb windows will pop up, till I will manage to type "c+ENTER" into 64 windows on my 1024x768 quadcore, and then till I find the right window where the executions stopped - but yes, it is definitely doable, and I already did that, as posted separately. The point I was trying to clarify here was if one at all expects calls to parmetis (or equivalent) other than MatPartitioning after it was destroyed (e.g. for some hidden internal matrix partitioning later). I just wanted to hear a hard yes or no. Thanks, Dominik From knepley at gmail.com Wed Jan 25 15:41:00 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 25 Jan 2012 15:41:00 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 3:28 PM, Dominik Szczerba wrote: > > So ParMETIS crashes on the hostile platform where it is slow to debug, > fine. > > > > But your code is calling it in cases where it doesn't crash, so why not > run > > the same code in your friendly environment and set a breakpoint in > > MatPartitioningApply() (or even the ParMETIS routine where the assert is > > failing) so that you can find out where else it is being called from. > This > > should take about 10 seconds. > > It will take several minutes till xterm gdb windows will pop up, till > I will manage to type "c+ENTER" into 64 windows on my 1024x768 > quadcore, and then till I find the right window where the executions > stopped - but yes, it is definitely doable, and I already did that, as > posted separately. The point I was trying to clarify here was if one > at all expects calls to parmetis (or equivalent) other than > MatPartitioning after it was destroyed (e.g. for some hidden internal > matrix partitioning later). I just wanted to hear a hard yes or no. > No, only MatPartitioning. Also, why not use --debugger_nodes 0? Matt > Thanks, > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Jan 25 15:59:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 25 Jan 2012 15:59:35 -0600 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: On Wed, Jan 25, 2012 at 15:28, Dominik Szczerba wrote: > It will take several minutes till xterm gdb windows will pop up, till > I will manage to type "c+ENTER" into 64 windows on my 1024x768 > quadcore, and then till I find the right window where the executions > stopped - but yes, it is definitely doable, and I already did that, as > posted separately. > Why do you have to run on 64 processes to find out if ParMETIS is called? Are you worried that there is code in PETSc that says if (comm_size >= 64) do_crazy_things(); So run on a small number of processes (like 2 or 4) to see who calls ParMETIS. You can batch up setting of breakpoints mpiexec -n 4 xterm -e gdb -ex 'b file.c:42' -ex r --args ./app -options -for_petsc These will run until they hit the breakpoint, no need to press "c". If a partitioner is called to partition a global problem, then all ranks must call it. In the case of ParMETIS errors, the output you showed told you the rank. Suppose you want to look at ranks 13 and 59 of a 64-process job. 
run="./app -options -for_petsc" dbg="xterm -e gdb -ex 'b file.c:42' -ex r --args $run" mpiexec -n 13 $run : -n 1 $dbg : -n 45 $run : -n 1 $dbg : -n 4 $run -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Jan 25 16:08:01 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 23:08:01 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: > No, only MatPartitioning. Also, why not use --debugger_nodes 0? Thank you, this is very useful. Dominik From dominik at itis.ethz.ch Wed Jan 25 16:11:59 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 25 Jan 2012 23:11:59 +0100 Subject: [petsc-users] question about partitioning usage In-Reply-To: References: Message-ID: > You can batch up setting of breakpoints > > mpiexec -n 4 xterm -e gdb -ex 'b file.c:42' -ex r --args ./app -options > -for_petsc > > run="./app -options -for_petsc" > dbg="xterm -e gdb -ex 'b file.c:42' -ex r --args $run" > mpiexec -n 13 $run : -n 1 $dbg : -n 45 $run : -n 1 $dbg : -n 4 $run These look like very useful debugging hints to try out in the nearest future. Thanks a lot. Dominik From dominik at itis.ethz.ch Thu Jan 26 12:24:09 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 26 Jan 2012 19:24:09 +0100 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 Message-ID: I am having problems with the following piece of code in petsc 3.2 but not 3.1 (with syntax changes where appropriate): Mat XBI; Mat adj = 0; MatPartitioning part = 0; IS is = 0; // Now declare XBI as MPIAIJ and fill it. MatConvert(XBI, MATMPIADJ, MAT_INITIAL_MATRIX, &adj); MatPartitioningCreate(PETSC_COMM_WORLD, &part); MatPartitioningSetAdjacency(part, adj); MatPartitioningSetNParts(part, npMax); MatPartitioningSetFromOptions(part); MatPartitioningApply(part, &is); It crashes in parmetis at the last line above and I have already reported that in their bug system. I want to eliminate the changes between Petsc 3.1/3.2 API as a possible cause - would there be any changes possibly affecting behavior this code? Thanks -- Dominik From jedbrown at mcs.anl.gov Thu Jan 26 12:52:25 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 26 Jan 2012 12:52:25 -0600 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: On Thu, Jan 26, 2012 at 12:24, Dominik Szczerba wrote: > I am having problems with the following piece of code in petsc 3.2 but > not 3.1 (with syntax changes where appropriate): > > Mat XBI; > Mat adj = 0; > MatPartitioning part = 0; > IS is = 0; > > // Now declare XBI as MPIAIJ and fill it. > > MatConvert(XBI, MATMPIADJ, MAT_INITIAL_MATRIX, &adj); > MatPartitioningCreate(PETSC_COMM_WORLD, &part); > MatPartitioningSetAdjacency(part, adj); > MatPartitioningSetNParts(part, npMax); > MatPartitioningSetFromOptions(part); > MatPartitioningApply(part, &is); > > It crashes in parmetis at the last line above and I have already > reported that in their bug system. > I want to eliminate the changes between Petsc 3.1/3.2 API as a > possible cause - would there be any changes possibly affecting > behavior this code? > PETSc-3.1 with --download-parmetis uses ParMetis-3.1.1 with some custom patches. PETSc-3.2 uses ParMetis-3.2 with other custom patches. PETSc-dev uses ParMetis-4.0.2 with other custom patches. I suggest that you try building your code with petsc-dev to see if ParMetis-4.0.2 fixed the bug you are hitting. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jan 26 21:07:49 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 26 Jan 2012 21:07:49 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: References: Message-ID: <7D18AECC-905D-4298-85E6-A10DDD85547A@mcs.anl.gov> Don't even touch VISS, focus on RS. What solver did you use in the run you sent the output for? * If SS please rerun (with -snes_monitor) with RS and send the output again. * If RS then it is very puzzling that you get that slow convergence since the number of active constraints does not change so it should converge just like plain old Newton. Barry On Jan 25, 2012, at 8:55 AM, Ataollah Mesgarnejad wrote: > Barry, > > Here is my output from my program with petsc-dev (txt files attached). Even with petsc-dev (revision 21929), SNES (I tried VISS and VIRS) takes a long time to converge (>50 iterations) from a small problem (99 HEX20 elements, 681 DOFs for SNES). The exodusII output files are uploaded in the link; I attached pictures of last time step for Psi field which we solve with SNESVI and u field which the displacement in the x-direction and is my loading. As it was the case with petsc-3.2 line search diverges as soon as we hit the lower bound in a neighborhood of nodes. I ran with KSPPREONLY and PCLU to exclude possibility of LS failure because of the KSP tol. > > Best, > Ata > > link to exodusII files: http://cl.ly/0g431o443v240x1h2h1B > > From amesga1 at tigers.lsu.edu Thu Jan 26 21:13:02 2012 From: amesga1 at tigers.lsu.edu (Ataollah Mesgarnejad) Date: Thu, 26 Jan 2012 21:13:02 -0600 Subject: [petsc-users] SNESVI convergence spped In-Reply-To: <7D18AECC-905D-4298-85E6-A10DDD85547A@mcs.anl.gov> References: <7D18AECC-905D-4298-85E6-A10DDD85547A@mcs.anl.gov> Message-ID: Barry, There are two sets of output for both virs and viss. Slow convergence aside I can't see why line search fails after we hit lower bound at some nodes? I tried quadratic and cubic line search both and got the same result. Best, Ata On Jan 26, 2012, at 9:07 PM, Barry Smith wrote: > > Don't even touch VISS, focus on RS. What solver did you use in the run you sent the output for? > > * If SS please rerun (with -snes_monitor) with RS and send the output again. > > * If RS then it is very puzzling that you get that slow convergence since the number of active constraints does not change so it should converge just like plain old Newton. > > Barry > > > > On Jan 25, 2012, at 8:55 AM, Ataollah Mesgarnejad wrote: > >> Barry, >> >> Here is my output from my program with petsc-dev (txt files attached). Even with petsc-dev (revision 21929), SNES (I tried VISS and VIRS) takes a long time to converge (>50 iterations) from a small problem (99 HEX20 elements, 681 DOFs for SNES). The exodusII output files are uploaded in the link; I attached pictures of last time step for Psi field which we solve with SNESVI and u field which the displacement in the x-direction and is my loading. As it was the case with petsc-3.2 line search diverges as soon as we hit the lower bound in a neighborhood of nodes. I ran with KSPPREONLY and PCLU to exclude possibility of LS failure because of the KSP tol. >> >> Best, >> Ata >> >> link to exodusII files: http://cl.ly/0g431o443v240x1h2h1B >> >> > From xyuan at lbl.gov Thu Jan 26 23:13:57 2012 From: xyuan at lbl.gov (Xuefei (Rebecca) Yuan) Date: Thu, 26 Jan 2012 21:13:57 -0800 Subject: [petsc-users] [petsc-dev] does petsc filter out zeros in MatSetValues? 
In-Reply-To: References: <6DE791D6-182D-404D-B5EC-D9C9B94F50C4@columbia.edu> Message-ID: Here is another error message if running on local mac: *************petsc-Dev = yes***************** ********************************************* ******* start solving for time = 0.10000 at time step = 1****** ******* start solving for time = 0.10000 at time step = 1****** 0 SNES Function norm 2.452320964164e-02 0 SNES Function norm 2.452320964164e-02 Matrix Object: 1 MPI processes type: seqaij rows=16384, cols=16384 total: nonzeros=831552, allocated nonzeros=1577536 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 4096 nodes, limit used is 5 Matrix Object: 1 MPI processes type: seqaij rows=16384, cols=16384 total: nonzeros=831552, allocated nonzeros=1577536 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 4096 nodes, limit used is 5 Runtime parameters: Objective type: Unknown! Coarsening type: Unknown! Initial partitioning type: Unknown! Refinement type: Unknown! Number of balancing constraints: 1 Number of refinement iterations: 1606408608 Random number seed: 1606408644 Number of separators: 48992256 Compress graph prior to ordering: Yes Detect & order connected components separately: Yes Prunning factor for high degree vertices: 0.100000 Allowed maximum load imbalance: 1.001 Input Error: Incorrect objective type. nbrpool statistics nbrpoolsize: 0 nbrpoolcpos: 0 nbrpoolreallocs: 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c [0]PETSC ERROR: [0] PCSetUp_LU line 108 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: [0] PCSetUp line 810 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSetUp line 184 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 334 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] SNES_KSPSolve line 3874 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_LS line 593 /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c [0]PETSC ERROR: [0] SNESSolve line 3061 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c [0]PETSC ERROR: [0] DMMGSolveSNES line 538 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c [0]PETSC ERROR: [0] DMMGSolve line 303 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c [0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Development HG revision: 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 -0700 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local by xyuan Thu Jan 26 21:09:47 2012 [0]PETSC ERROR: Libraries linked from /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib [0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012 [0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 --download-mpich=1 --download-plapack=1 --download-parmetis=1 --download-metis=1 --download-triangle=1 --download-spooles=1 --download-superlu=1 --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-hdf5=1 --download-sundials=1 --download-prometheus=1 --download-umfpack=1 --download-chaco=1 --download-spai=1 --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 Runtime parameters: Objective type: Unknown! Coarsening type: Unknown! Initial partitioning type: Unknown! Refinement type: Unknown! 
Number of balancing constraints: 1 Number of refinement iterations: 1606408608 Random number seed: 1606408644 Number of separators: 48992256 Compress graph prior to ordering: Yes Detect & order connected components separately: Yes Prunning factor for high degree vertices: 0.100000 Allowed maximum load imbalance: 1.001 Input Error: Incorrect objective type. nbrpool statistics nbrpoolsize: 0 nbrpoolcpos: 0 nbrpoolreallocs: 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c [0]PETSC ERROR: [0] PCSetUp_LU line 108 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: [0] PCSetUp line 810 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSetUp line 184 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 334 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] SNES_KSPSolve line 3874 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_LS line 593 /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c [0]PETSC ERROR: [0] SNESSolve line 3061 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c [0]PETSC ERROR: [0] DMMGSolveSNES line 538 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c [0]PETSC ERROR: [0] DMMGSolve line 303 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c [0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Development HG revision: 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 -0700 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local by xyuan Thu Jan 26 21:09:47 2012 [0]PETSC ERROR: Libraries linked from /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib [0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012 [0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 --download-mpich=1 --download-plapack=1 --download-parmetis=1 --download-metis=1 --download-triangle=1 --download-spooles=1 --download-superlu=1 --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-hdf5=1 --download-sundials=1 --download-prometheus=1 --download-umfpack=1 --download-chaco=1 --download-spai=1 --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 R On Jan 26, 2012, at 9:05 PM, Xuefei (Rebecca) Yuan wrote: > Hello Mark, > > Actually I have tried those options for a sequential run where I need to use superlu to get the condition number of some matrix. > > In the dev version, > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr); > > And in the options file, > > -dm_preallocate_only > > is added. > > This is totally fine when np=1, however, when I use multiple processors, there are some memory corruption happened. > > For example, the number of true nonzeros for 65536 size matrix is 1,470,802. the output (&) is for np=1 with the following PETSc related options: > > -dm_preallocate_only > -snes_ksp_ew true > -snes_monitor > -snes_max_it 1 > -ksp_view > -mat_view_info > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package superlu > -mat_superlu_conditionnumber > -mat_superlu_printstat > > > However, when np=2, the number of nonzeros changes to 3,366,976 with the following PETSc related options. (*) is the output file. > > -dm_preallocate_only > -snes_ksp_ew true > -snes_monitor > -ksp_view > -mat_view_info > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package superlu_dist > > ----------------------------- > (&) > > *************petsc-Dev = yes***************** > ********************************************* > ******* start solving for time = 1.00000 at time step = 1****** > 0 SNES Function norm 1.242539468950e-02 > Matrix Object: 1 MPI processes > type: seqaij > rows=65536, cols=65536 > total: nonzeros=1470802, allocated nonzeros=2334720 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Recip. 
condition number = 4.345658e-07 > MatLUFactorNumeric_SuperLU(): > Factor time = 42.45 > Factor flops = 7.374620e+10 Mflops = 1737.25 > Solve time = 0.00 > Number of memory expansions: 3 > No of nonzeros in factor L = 32491856 > No of nonzeros in factor U = 39390974 > No of nonzeros in L+U = 71817294 > L\U MB 741.397 total MB needed 756.339 > Matrix Object: 1 MPI processes > type: seqaij > rows=65536, cols=65536 > package used to perform factorization: superlu > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU run parameters: > Equil: NO > ColPerm: 3 > IterRefine: 0 > SymmetricMode: NO > DiagPivotThresh: 1 > PivotGrowth: NO > ConditionNumber: YES > RowPerm: 0 > ReplaceTinyPivot: NO > PrintStat: YES > lwork: 0 > MatSolve__SuperLU(): > Factor time = 42.45 > Factor flops = 7.374620e+10 Mflops = 1737.25 > Solve time = 0.59 > Solve flops = 1.436365e+08 Mflops = 243.45 > Number of memory expansions: 3 > 1 SNES Function norm 2.645145585949e-04 > > > ----------------------------------------------- > (*) > *************petsc-Dev = yes***************** > ********************************************* > ******* start solving for time = 1.00000 at time step = 1****** > 0 SNES Function norm 1.242539468950e-02 > Matrix Object: 2 MPI processes > type: mpiaij > rows=65536, cols=65536 > total: nonzeros=3366976, allocated nonzeros=6431296 > total number of mallocs used during MatSetValues calls =0 > Matrix Object: 2 MPI processes > type: mpiaij > rows=65536, cols=65536 > total: nonzeros=3366976, allocated nonzeros=3366976 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 8192 nodes, limit used is 5 > Input Error: Incorrect objective type. > Input Error: Incorrect objective type. > At column 0, pivotL() encounters zero diagonal at line 708 in file symbfact.c > At column 0, pivotL() encounters zero diagonal at line 708 in file symbfact.c > > Moreover, When I use valgrind with --leak-check=yes --track-origins=yes, there are 441 errors from 219 contexts in PetscInitialize() before calling SNESSolve(). Is this normal for dev? > > Thanks very much! > > Best regards, > > Rebecca > > > > > > > On Jan 26, 2012, at 5:06 PM, Jed Brown wrote: > >> On Thu, Jan 26, 2012 at 19:00, Mark F. Adams wrote: >> I'm guessing that PETSc recently changed and now filters out 0.0 in MatSetValues ... is this true? >> >> Did the option MAT_IGNORE_ZERO_ENTRIES get set somehow? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 26 23:20:11 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 26 Jan 2012 23:20:11 -0600 Subject: [petsc-users] [petsc-dev] does petsc filter out zeros in MatSetValues? 
In-Reply-To: References: <6DE791D6-182D-404D-B5EC-D9C9B94F50C4@columbia.edu> Message-ID: On Thu, Jan 26, 2012 at 11:13 PM, Xuefei (Rebecca) Yuan wrote: > Here is another error message if running on local mac: > > *************petsc-Dev = yes***************** > ********************************************* > ******* start solving for time = 0.10000 at time step = 1****** > ******* start solving for time = 0.10000 at time step = 1****** > 0 SNES Function norm 2.452320964164e-02 > 0 SNES Function norm 2.452320964164e-02 > Matrix Object: 1 MPI processes > type: seqaij > rows=16384, cols=16384 > total: nonzeros=831552, allocated nonzeros=1577536 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 4096 nodes, limit used is 5 > Matrix Object: 1 MPI processes > type: seqaij > rows=16384, cols=16384 > total: nonzeros=831552, allocated nonzeros=1577536 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 4096 nodes, limit used is 5 > Runtime parameters: > Objective type: Unknown! > Coarsening type: Unknown! > Initial partitioning type: Unknown! > Refinement type: Unknown! > Number of balancing constraints: 1 > Number of refinement iterations: 1606408608 > Random number seed: 1606408644 > Number of separators: 48992256 > Compress graph prior to ordering: Yes > Detect & order connected components separately: Yes > Prunning factor for high degree vertices: 0.100000 > Allowed maximum load imbalance: 1.001 > > Input Error: Incorrect objective type. > nbrpool statistics > nbrpoolsize: 0 nbrpoolcpos: 0 > nbrpoolreallocs: 0 > Can you run that through valgrind? It looks like it might be prior memory corruption. Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 > /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 > /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] PCSetUp_LU line 108 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: [0] PCSetUp line 810 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSPSetUp line 184 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] KSPSolve line 334 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] SNES_KSPSolve line 3874 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_LS line 593 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3061 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c > [0]PETSC ERROR: [0] DMMGSolveSNES line 538 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c > [0]PETSC ERROR: [0] DMMGSolve line 303 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c > [0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Development HG revision: > 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 > -0700 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local > by xyuan Thu Jan 26 21:09:47 2012 > [0]PETSC ERROR: Libraries linked from > /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib > [0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012 > [0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran > -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 > --download-mpich=1 --download-plapack=1 --download-parmetis=1 > --download-metis=1 --download-triangle=1 --download-spooles=1 > --download-superlu=1 > --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz > --download-blacs=1 --download-scalapack=1 --download-mumps=1 > --download-hdf5=1 --download-sundials=1 --download-prometheus=1 > --download-umfpack=1 --download-chaco=1 --download-spai=1 > --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 > --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > Runtime parameters: > Objective type: Unknown! > Coarsening type: Unknown! > Initial partitioning type: Unknown! > Refinement type: Unknown! 
> Number of balancing constraints: 1 > Number of refinement iterations: 1606408608 > Random number seed: 1606408644 > Number of separators: 48992256 > Compress graph prior to ordering: Yes > Detect & order connected components separately: Yes > Prunning factor for high degree vertices: 0.100000 > Allowed maximum load imbalance: 1.001 > > Input Error: Incorrect objective type. > nbrpool statistics > nbrpoolsize: 0 nbrpoolcpos: 0 > nbrpoolreallocs: 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 > /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 > /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] PCSetUp_LU line 108 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: [0] PCSetUp line 810 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSPSetUp line 184 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] KSPSolve line 334 > /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: [0] SNES_KSPSolve line 3874 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_LS line 593 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3061 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c > [0]PETSC ERROR: [0] DMMGSolveSNES line 538 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c > [0]PETSC ERROR: [0] DMMGSolve line 303 > /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c > [0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Development HG revision: > 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 > -0700 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local > by xyuan Thu Jan 26 21:09:47 2012 > [0]PETSC ERROR: Libraries linked from > /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib > [0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012 > [0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran > -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 > --download-mpich=1 --download-plapack=1 --download-parmetis=1 > --download-metis=1 --download-triangle=1 --download-spooles=1 > --download-superlu=1 > --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz > --download-blacs=1 --download-scalapack=1 --download-mumps=1 > --download-hdf5=1 --download-sundials=1 --download-prometheus=1 > --download-umfpack=1 --download-chaco=1 --download-spai=1 > --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 > --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > R > > > > > > On Jan 26, 2012, at 9:05 PM, Xuefei (Rebecca) Yuan wrote: > > Hello Mark, > > Actually I have tried those options for a sequential run where I need to > use superlu to get the condition number of some matrix. > > In the dev version, > > ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, > PETSC_TRUE);CHKERRQ(ierr); > > And in the options file, > > -dm_preallocate_only > > is added. > > This is totally fine when np=1, however, when I use multiple processors, > there are some memory corruption happened. > > For example, the number of true nonzeros for 65536 size matrix is > 1,470,802. the output (&) is for np=1 with the following PETSc related > options: > > -dm_preallocate_only > -snes_ksp_ew true > -snes_monitor > -snes_max_it 1 > -ksp_view > -mat_view_info > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package superlu > -mat_superlu_conditionnumber > -mat_superlu_printstat > > > However, when np=2, the number of nonzeros changes to 3,366,976 with the > following PETSc related options. (*) is the output file. > > -dm_preallocate_only > -snes_ksp_ew true > -snes_monitor > -ksp_view > -mat_view_info > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package superlu_dist > > ----------------------------- > (&) > > *************petsc-Dev = yes***************** > ********************************************* > ******* start solving for time = 1.00000 at time step = 1****** > 0 SNES Function norm 1.242539468950e-02 > Matrix Object: 1 MPI processes > type: seqaij > rows=65536, cols=65536 > total: nonzeros=1470802, allocated nonzeros=2334720 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Recip. 
condition number = 4.345658e-07 > MatLUFactorNumeric_SuperLU(): > Factor time = 42.45 > Factor flops = 7.374620e+10 Mflops = 1737.25 > Solve time = 0.00 > Number of memory expansions: 3 > No of nonzeros in factor L = 32491856 > No of nonzeros in factor U = 39390974 > No of nonzeros in L+U = 71817294 > L\U MB 741.397 total MB needed 756.339 > Matrix Object: 1 MPI processes > type: seqaij > rows=65536, cols=65536 > package used to perform factorization: superlu > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU run parameters: > Equil: NO > ColPerm: 3 > IterRefine: 0 > SymmetricMode: NO > DiagPivotThresh: 1 > PivotGrowth: NO > ConditionNumber: YES > RowPerm: 0 > ReplaceTinyPivot: NO > PrintStat: YES > lwork: 0 > MatSolve__SuperLU(): > Factor time = 42.45 > Factor flops = 7.374620e+10 Mflops = 1737.25 > Solve time = 0.59 > Solve flops = 1.436365e+08 Mflops = 243.45 > Number of memory expansions: 3 > 1 SNES Function norm 2.645145585949e-04 > > > ----------------------------------------------- > (*) > *************petsc-Dev = yes***************** > ********************************************* > ******* start solving for time = 1.00000 at time step = 1****** > 0 SNES Function norm 1.242539468950e-02 > Matrix Object: 2 MPI processes > type: mpiaij > rows=65536, cols=65536 > total: nonzeros=3366976, allocated nonzeros=6431296 > total number of mallocs used during MatSetValues calls =0 > Matrix Object: 2 MPI processes > type: mpiaij > rows=65536, cols=65536 > total: nonzeros=3366976, allocated nonzeros=3366976 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 8192 nodes, limit used > is 5 > Input Error: Incorrect objective type. > Input Error: Incorrect objective type. > At column 0, pivotL() encounters zero diagonal at line 708 in file > symbfact.c > At column 0, pivotL() encounters zero diagonal at line 708 in file > symbfact.c > > Moreover, When I use valgrind with --leak-check=yes --track-origins=yes, > there are 441 errors from 219 contexts in PetscInitialize() before calling > SNESSolve(). Is this normal for dev? > > Thanks very much! > > Best regards, > > Rebecca > > > > > > > On Jan 26, 2012, at 5:06 PM, Jed Brown wrote: > > On Thu, Jan 26, 2012 at 19:00, Mark F. Adams wrote: > >> I'm guessing that PETSc recently changed and now filters out 0.0 in >> MatSetValues ... is this true? >> > > Did the option MAT_IGNORE_ZERO_ENTRIES get set somehow? > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Fri Jan 27 02:43:44 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 27 Jan 2012 08:43:44 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: >> >> What about a less general (but important) case: saddle point >> problems arising from incompressible Stokes, Oseen and >> Navier-Stokes eqs. with Schur type preconditioning. In 2D with N >> cells and co-located variables arranged >> as (u1,...,uN,v1,...,vN,p1,...pN) the matrix would have the form >> [Q G, D 0] with Q a 2N-by-2N matrix, G a 2N-by-N matrix and D a >> N-by-2N matrix. Since the variables are co-located, they share >> the same partitioning but could have different stencils. How to use >> the "split local space", DMComposite and MATNEST in this case? 
>> > >If you order this way, then you don't need DMComposite or MatNest (although >you can still make a MatNest that operates in this ordering, we just don't >have a way to make it automatically). > So I made 4 matrices corresponding to the four blocks above and assembled them in a nested matrix. Then I tried to solve it using Schur preconditioning (see below). Apparently the matrix is still treated as a single block. I must be misunderstanding the concept... $ mpiexec -n 1 ./matnest-try -ksp_view -pc_type fieldsplit -pc_fieldsplit_type schur Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="a00_", type=mpiaij, rows=24, cols=24 (0,1) : prefix="a01_", type=mpiaij, rows=24, cols=12 (1,0) : prefix="a10_", type=mpiaij, rows=12, cols=24 (1,1) : prefix="a11_", type=mpiaij, rows=12, cols=12 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Petsc has generated inconsistent data! [0]PETSC ERROR: Unhandled case, must have at least two fields! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./matnest-try on a linux_64b named lin0133 by cklaij Fri Jan 27 09:25:01 2012 [0]PETSC ERROR: Libraries linked from /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib [0]PETSC ERROR: Configure run at Thu Jan 26 13:44:12 2012 [0]PETSC ERROR: Configure options --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 --with-x=1 --with-mpe=0 --with-debugging=1 --with-clanguage=c++ --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a --with-blas-lapack-dir=/opt/intel/mkl [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PCFieldSplitSetDefaults() line 319 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: PCSetUp_FieldSplit() line 335 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: PCSetUp() line 819 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 260 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 379 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Fri Jan 27 06:14:51 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 27 Jan 2012 06:14:51 -0600 Subject: [petsc-users] access to matnest block (0,1) ? 
In-Reply-To: References: Message-ID: On Fri, Jan 27, 2012 at 02:43, Klaij, Christiaan wrote: > [0]PETSC ERROR: Unhandled case, must have at least two fields! You can use PCFieldSplitSetIS(). The implementation could check whether a MatNest is being used and automatically set the splits if it is, but you would have to call PCFieldSplitSetIS() later when you wanted to assemble into an AIJ format, so I'm hesitant to pick it up automatically. Just call the function for now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Jan 27 08:16:07 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 27 Jan 2012 15:16:07 +0100 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: > PETSc-dev uses ParMetis-4.0.2 with other custom patches. > > > I suggest that you try building your code with petsc-dev to see if > ParMetis-4.0.2 fixed the bug you are hitting. I am getting this error. Looks simple but it comes from BLAS routines, so I did not attempt to resolve it: /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(gklib.c.o): In function `libmetis__inorm2': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/gklib.c:18: undefined reference to `sqrt' /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(gklib.c.o): In function `libmetis__rnorm2': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/gklib.c:19: undefined reference to `sqrt' /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kmetis.c.o): In function `libmetis__InitKWayPartitioning': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kmetis.c:187: undefined reference to `log' /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kmetis.c:187: undefined reference to `pow' /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(util.c.o): In function `gk_flog2': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/GKlib/util.c:106: undefined reference to `log' /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kwayfm.c.o): In function `libmetis__Greedy_KWayCutOptimize': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:181: undefined reference to `sqrt' /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:321: undefined reference to `sqrt' /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kwayfm.c.o): In function `libmetis__Greedy_McKWayCutOptimize': /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:823: undefined reference to `sqrt' /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:977: undefined reference to `sqrt' collect2: ld returned 1 exit status make[3]: *** [programs/mtest] Error 1 make[2]: *** [programs/CMakeFiles/mtest.dir/all] Error 2 make[1]: *** [all] Error 2 make: *** [all] Error 2 -- Dominik From jedbrown at mcs.anl.gov Fri Jan 27 08:18:40 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 27 Jan 2012 08:18:40 -0600 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: I have no idea how you are missing linking libm, but that's the problem. On Jan 27, 2012 8:16 AM, "Dominik Szczerba" wrote: > > PETSc-dev uses ParMetis-4.0.2 with other custom patches. > > > > > > I suggest that you try building your code with petsc-dev to see if > > ParMetis-4.0.2 fixed the bug you are hitting. > > I am getting this error. 
Looks simple but it comes from BLAS routines, > so I did not attempt to resolve it: > > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(gklib.c.o): In > function `libmetis__inorm2': > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/gklib.c:18: > undefined reference to `sqrt' > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(gklib.c.o): In > function `libmetis__rnorm2': > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/gklib.c:19: > undefined reference to `sqrt' > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kmetis.c.o): In > function `libmetis__InitKWayPartitioning': > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kmetis.c:187: > undefined reference to `log' > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kmetis.c:187: > undefined reference to `pow' > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(util.c.o): In > function `gk_flog2': > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/GKlib/util.c:106: > undefined reference to `log' > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kwayfm.c.o): In > function `libmetis__Greedy_KWayCutOptimize': > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:181: > undefined reference to `sqrt' > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:321: > undefined reference to `sqrt' > /home/dsz/pack/petsc-dev/gnu-debug/lib/libmetis.a(kwayfm.c.o): In > function `libmetis__Greedy_McKWayCutOptimize': > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:823: > undefined reference to `sqrt' > > /home/dsz/pack/petsc-dev/externalpackages/metis-5.0.2/libmetis/kwayfm.c:977: > undefined reference to `sqrt' > collect2: ld returned 1 exit status > make[3]: *** [programs/mtest] Error 1 > make[2]: *** [programs/CMakeFiles/mtest.dir/all] Error 2 > make[1]: *** [all] Error 2 > make: *** [all] Error 2 > > > > -- > Dominik > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Jan 27 08:26:46 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 27 Jan 2012 15:26:46 +0100 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: On Fri, Jan 27, 2012 at 3:18 PM, Jed Brown wrote: > I have no idea how you are missing linking libm, but that's the problem. Because for some reason on the linking line there is -lm -lmetis while it should be -lmetis -lm Dominik From dominik at itis.ethz.ch Fri Jan 27 08:34:57 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 27 Jan 2012 15:34:57 +0100 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: > On Fri, Jan 27, 2012 at 3:18 PM, Jed Brown wrote: >> I have no idea how you are missing linking libm, but that's the problem. > > Because for some reason on the linking line there is > > -lm -lmetis > > while it should be > > -lmetis -lm I was able to compile it by hand myself, and then replacing --download-xxx with --with-xxx pointing to external packages. Clearly, parmetis makefile, as pulled by --download-parmetis, has an issue. Immediately one suggestion: it would be nice to pull and build the external packages at option separately from the rest of petsc. 
-- Dominik From balay at mcs.anl.gov Fri Jan 27 08:39:31 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 27 Jan 2012 08:39:31 -0600 (CST) Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: please send the corresponding configure.log to petsc-maint satish On Fri, 27 Jan 2012, Dominik Szczerba wrote: > On Fri, Jan 27, 2012 at 3:18 PM, Jed Brown wrote: > > I have no idea how you are missing linking libm, but that's the problem. > > Because for some reason on the linking line there is > > -lm -lmetis > > while it should be > > -lmetis -lm > > Dominik > From balay at mcs.anl.gov Fri Jan 27 08:42:00 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 27 Jan 2012 08:42:00 -0600 (CST) Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: On Fri, 27 Jan 2012, Dominik Szczerba wrote: > > On Fri, Jan 27, 2012 at 3:18 PM, Jed Brown wrote: > >> I have no idea how you are missing linking libm, but that's the problem. > > > > Because for some reason on the linking line there is > > > > -lm -lmetis > > > > while it should be > > > > -lmetis -lm > > I was able to compile it by hand myself, and then replacing > --download-xxx with --with-xxx pointing to external packages. Clearly, > parmetis makefile, as pulled by --download-parmetis, has an issue. > > Immediately one suggestion: it would be nice to pull and build the > external packages at option separately from the rest of petsc. ??? petsc configure already does this for you.. [you don't have to build PETSc - after externalpacakges are built...] satish From dominik at itis.ethz.ch Fri Jan 27 08:51:36 2012 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 27 Jan 2012 15:51:36 +0100 Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: >> Immediately one suggestion: it would be nice to pull and build the >> external packages at option separately from the rest of petsc. > > ??? petsc configure already does this for you.. > > [you don't have to build PETSc - after externalpacakges are built...] But it breaks... As I wrote above, I can fix it, but then re-running configure rebuilds my built version. Then I tried --with instead of --download, that resulted in a correct build, unfortunately, seems like parmetis is somehow not registered with petsc: [0]PETSC ERROR: Unknown partitioning type parmetis! Would you please advise how to go on? Thanks Dominik From balay at mcs.anl.gov Fri Jan 27 08:54:59 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 27 Jan 2012 08:54:59 -0600 (CST) Subject: [petsc-users] MatConvert behavior 3.1 vs 3.2 In-Reply-To: References: Message-ID: On Fri, 27 Jan 2012, Dominik Szczerba wrote: > >> Immediately one suggestion: it would be nice to pull and build the > >> external packages at option separately from the rest of petsc. > > > > ??? petsc configure already does this for you.. > > > > [you don't have to build PETSc - after externalpacakges are built...] > > But it breaks... what breaks? Send relavent logs to petsc-maint. > As I wrote above, I can fix it, don't understand what the problme is - so don't understand what your fix is. > but then re-running configure rebuilds my built version. thats fine. > Then I tried --with instead of --download, that resulted in a correct > build, again - don't know what the problem is - so don't understand this workarround. > unfortunately, seems like parmetis is somehow not registered > with petsc: > > [0]PETSC ERROR: Unknown partitioning type parmetis! 
> > Would you please advise how to go on? start with the original problem - and send logs [not logs with workarrounds] satish > > Thanks > Dominik > From stali at geology.wisc.edu Fri Jan 27 10:32:20 2012 From: stali at geology.wisc.edu (Tabrez Ali) Date: Fri, 27 Jan 2012 10:32:20 -0600 Subject: [petsc-users] Preallocation woes again Message-ID: <4F22D194.3040103@geology.wisc.edu> PETSc Gurus First I want to thank you for patiently answering questions that I have asked in past regarding preallocation. Unfortunately I am still having problems. I have a small unstructured FE (elasticity) code that uses PETSc. Unfortunately I am not yet able to find an _efficient_ way of calculating the non-zero structure so I simply overestimate the o_nz and d_nz values (= dimension x max_number_of nodes _a_node_contributes_to) which means that my stiffness matrix consumes at least 2X more memory. For example, for a perfectly structured mesh of linear hexes this number is 3*27=81=o_nz=d_nz. In complicated 3D linear tet/hex meshes that I have generated I rarely need to set a value greater than 150. In general I have found that as long as I keep DOF/core between 100-200K (assuming 1GB/per core) there is enough local RAM left even when memory for stiffness matrix is overestimated by 2X-6X. In any case now I do want to preallocate exactly for better memory performance and the ambiguity involved in choosing a reasonable o_nz/d_nz. The way I am trying to do it involves loops like ... do i=1, num_local_elements do j=1, num_total_nodes ... end do end do or using lists that involve searches. This unfortunately takes much time (much more than assembly and solve) due to the second loop. I am aware of many posts/slides by PETSc authors that mention that the non-zero structure can be found simply by looping once through elements but what to do next is not quite clear. I am also aware of this post by Barry [ http://lists.mcs.anl.gov/pipermail/petsc-users/2008-May/003020.html ] but I cannot get it to work for a simple 4 element problem. Can some one please expand on the strategy being suggested by Barry specially where the loops are discussed (I understand the arrays and the vecscatter part). I am also not sure how does using a t of 0.5 and 1 prevents double counting. Shouldn't it be 0 and 1? I am also aware of some capabilities in DMMESH but right now I dont understand it well enough to utilize it. Alternatively does PETSc support MATMPIAIJ of size=integer(1)/logical instead of real(8) which I can use to put zeros/ones (when I loop over elements the first time) to get the non-zero structure? Thanks in advance. Tabrez From knepley at gmail.com Fri Jan 27 10:26:58 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 27 Jan 2012 10:26:58 -0600 Subject: [petsc-users] Preallocation woes again In-Reply-To: <4F22D194.3040103@geology.wisc.edu> References: <4F22D194.3040103@geology.wisc.edu> Message-ID: On Fri, Jan 27, 2012 at 10:32 AM, Tabrez Ali wrote: > PETSc Gurus > > First I want to thank you for patiently answering questions that I have > asked in past regarding preallocation. Unfortunately I am still having > problems. > > I have a small unstructured FE (elasticity) code that uses PETSc. > > Unfortunately I am not yet able to find an _efficient_ way of calculating > the non-zero structure so I simply overestimate the o_nz and d_nz values (= > dimension x max_number_of nodes _a_node_contributes_to) which means that my > stiffness matrix consumes at least 2X more memory. 
For example, for a > perfectly structured mesh of linear hexes this number is 3*27=81=o_nz=d_nz. > In complicated 3D linear tet/hex meshes that I have generated I rarely need > to set a value greater than 150. In general I have found that as long as I > keep DOF/core between 100-200K (assuming 1GB/per core) there is enough > local RAM left even when memory for stiffness matrix is overestimated by > 2X-6X. > > In any case now I do want to preallocate exactly for better memory > performance and the ambiguity involved in choosing a reasonable o_nz/d_nz. > The way I am trying to do it involves loops like ... > > do i=1, num_local_elements > do j=1, num_total_nodes > ... > end do > end do > > or using lists that involve searches. > > This unfortunately takes much time (much more than assembly and solve) due > to the second loop. I am aware of many posts/slides by PETSc authors that > mention that the non-zero structure can be found simply by looping once > through elements but what to do next is not quite clear. > > I am also aware of this post by Barry [ http://lists.mcs.anl.gov/** > pipermail/petsc-users/2008-**May/003020.html] but I cannot get it to work for a simple 4 element problem. Can some one > please expand on the strategy being suggested by Barry specially where the > loops are discussed (I understand the arrays and the vecscatter part). I am > also not sure how does using a t of 0.5 and 1 prevents double counting. > Shouldn't it be 0 and 1? > > I am also aware of some capabilities in DMMESH but right now I dont > understand it well enough to utilize it. > I agree with you here. Its not easy enough to use. However, I have rewritten the basics of it completely in C, which will all have easy Fortran bindings. You can specify your mesh with adjacency lists using Fortran arrays. The last part I am working on is matrix preallocation. I would not normally ask you to wait, but I think I am pretty close. Mail me directly if you want more info. I think it should only take me a few weeks to finish. Thanks, Matt Alternatively does PETSc support MATMPIAIJ of size=integer(1)/logical > instead of real(8) which I can use to put zeros/ones (when I loop over > elements the first time) to get the non-zero structure? > > Thanks in advance. > > Tabrez > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Jan 27 11:16:41 2012 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 27 Jan 2012 18:16:41 +0100 Subject: [petsc-users] multigrid_repartitioning Message-ID: <4F22DBF9.7090607@uni-mainz.de> Dear PETSc developers/users, Does PETSc support repartitioning of coarse grid operators onto a smaller subset of processors during coarsening? Or the coarsest grid must be represented on the same amount of processors as the finest? It becomes a problem when a lot of processors are employed (say 4096 or more). If it is supported, how can I use it? 
Thank you, Anton Popov From jedbrown at mcs.anl.gov Fri Jan 27 11:18:14 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 27 Jan 2012 11:18:14 -0600 Subject: [petsc-users] multigrid_repartitioning In-Reply-To: <4F22DBF9.7090607@uni-mainz.de> References: <4F22DBF9.7090607@uni-mainz.de> Message-ID: On Fri, Jan 27, 2012 at 11:16, Anton Popov wrote: > Does PETSc support repartitioning of coarse grid operators onto a smaller > subset of processors during coarsening? Or the coarsest grid must be > represented on the same amount of processors as the finest? It becomes a > problem when a lot of processors are employed (say 4096 or more). If it is > supported, how can I use it? PCGAMG does this automatically. There are some technical challenges to doing this with geometric multigrid, but we are thinking about how to do it best. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.adams at columbia.edu Fri Jan 27 11:29:23 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Fri, 27 Jan 2012 12:29:23 -0500 Subject: [petsc-users] multigrid_repartitioning In-Reply-To: References: <4F22DBF9.7090607@uni-mainz.de> Message-ID: <3A48520A-E9F8-40C4-A8F0-8E081827D9EA@columbia.edu> On Jan 27, 2012, at 12:18 PM, Jed Brown wrote: > On Fri, Jan 27, 2012 at 11:16, Anton Popov wrote: > Does PETSc support repartitioning of coarse grid operators onto a smaller subset of processors during coarsening? Or the coarsest grid must be represented on the same amount of processors as the finest? It becomes a problem when a lot of processors are employed (say 4096 or more). If it is supported, how can I use it? > > PCGAMG does this automatically. There are some technical challenges to doing this with geometric multigrid, but we are thinking about how to do it best. GAMG does not repartition by default anymore -- it is very expensive. GAMG does now do simple process aggregation on coarser grids if repartitioning is not specified. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Fri Jan 27 12:04:44 2012 From: stali at geology.wisc.edu (Tabrez Ali) Date: Fri, 27 Jan 2012 12:04:44 -0600 Subject: [petsc-users] Preallocation woes again In-Reply-To: References: <4F22D194.3040103@geology.wisc.edu> Message-ID: Matt Thanks. Yes I can wait for a few weeks/months (its only a local memory footprint issue and in no way hampers my ability to solve problems). Btw just for curiosity are you using the same technique discussed by Barry? Maybe I am overlooking something trivial in his post. Tabrez On Jan 27, 2012, at 10:26 AM, Matthew Knepley wrote: > On Fri, Jan 27, 2012 at 10:32 AM, Tabrez Ali > wrote: > PETSc Gurus > > First I want to thank you for patiently answering questions that I > have asked in past regarding preallocation. Unfortunately I am still > having problems. > > I have a small unstructured FE (elasticity) code that uses PETSc. > > Unfortunately I am not yet able to find an _efficient_ way of > calculating the non-zero structure so I simply overestimate the o_nz > and d_nz values (= dimension x max_number_of nodes > _a_node_contributes_to) which means that my stiffness matrix > consumes at least 2X more memory. For example, for a perfectly > structured mesh of linear hexes this number is 3*27=81=o_nz=d_nz. In > complicated 3D linear tet/hex meshes that I have generated I rarely > need to set a value greater than 150. 
In general I have found that > as long as I keep DOF/core between 100-200K (assuming 1GB/per core) > there is enough local RAM left even when memory for stiffness matrix > is overestimated by 2X-6X. > > In any case now I do want to preallocate exactly for better memory > performance and the ambiguity involved in choosing a reasonable o_nz/ > d_nz. The way I am trying to do it involves loops like ... > > do i=1, num_local_elements > do j=1, num_total_nodes > ... > end do > end do > > or using lists that involve searches. > > This unfortunately takes much time (much more than assembly and > solve) due to the second loop. I am aware of many posts/slides by > PETSc authors that mention that the non-zero structure can be found > simply by looping once through elements but what to do next is not > quite clear. > > I am also aware of this post by Barry [ http://lists.mcs.anl.gov/pipermail/petsc-users/2008-May/003020.html > ] but I cannot get it to work for a simple 4 element problem. Can > some one please expand on the strategy being suggested by Barry > specially where the loops are discussed (I understand the arrays and > the vecscatter part). I am also not sure how does using a t of 0.5 > and 1 prevents double counting. Shouldn't it be 0 and 1? > > I am also aware of some capabilities in DMMESH but right now I dont > understand it well enough to utilize it. > > I agree with you here. Its not easy enough to use. However, I have > rewritten the basics of it completely in C, which > will all have easy Fortran bindings. You can specify your mesh with > adjacency lists using Fortran arrays. The last part > I am working on is matrix preallocation. I would not normally ask > you to wait, but I think I am pretty close. Mail me > directly if you want more info. I think it should only take me a few > weeks to finish. > > Thanks, > > Matt > > Alternatively does PETSc support MATMPIAIJ of size=integer(1)/ > logical instead of real(8) which I can use to put zeros/ones (when I > loop over elements the first time) to get the non-zero structure? > > Thanks in advance. > > Tabrez > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 27 12:14:14 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 27 Jan 2012 12:14:14 -0600 Subject: [petsc-users] Preallocation woes again In-Reply-To: References: <4F22D194.3040103@geology.wisc.edu> Message-ID: On Fri, Jan 27, 2012 at 12:04 PM, Tabrez Ali wrote: > Matt > > Thanks. Yes I can wait for a few weeks/months (its only a local memory > footprint issue and in no way hampers my ability to solve problems). > > Btw just for curiosity are you using the same technique discussed by Barry? > Basically. You have to store indices while counting if you don't want to overcount. This is the part he glosses over. Matt > Maybe I am overlooking something trivial in his post. > > Tabrez > > > On Jan 27, 2012, at 10:26 AM, Matthew Knepley wrote: > > On Fri, Jan 27, 2012 at 10:32 AM, Tabrez Ali wrote: > >> PETSc Gurus >> >> First I want to thank you for patiently answering questions that I have >> asked in past regarding preallocation. Unfortunately I am still having >> problems. >> >> I have a small unstructured FE (elasticity) code that uses PETSc. 
>> >> Unfortunately I am not yet able to find an _efficient_ way of calculating >> the non-zero structure so I simply overestimate the o_nz and d_nz values (= >> dimension x max_number_of nodes _a_node_contributes_to) which means that my >> stiffness matrix consumes at least 2X more memory. For example, for a >> perfectly structured mesh of linear hexes this number is 3*27=81=o_nz=d_nz. >> In complicated 3D linear tet/hex meshes that I have generated I rarely need >> to set a value greater than 150. In general I have found that as long as I >> keep DOF/core between 100-200K (assuming 1GB/per core) there is enough >> local RAM left even when memory for stiffness matrix is overestimated by >> 2X-6X. >> >> In any case now I do want to preallocate exactly for better memory >> performance and the ambiguity involved in choosing a reasonable o_nz/d_nz. >> The way I am trying to do it involves loops like ... >> >> do i=1, num_local_elements >> do j=1, num_total_nodes >> ... >> end do >> end do >> >> or using lists that involve searches. >> >> This unfortunately takes much time (much more than assembly and solve) >> due to the second loop. I am aware of many posts/slides by PETSc authors >> that mention that the non-zero structure can be found simply by looping >> once through elements but what to do next is not quite clear. >> >> I am also aware of this post by Barry [ http://lists.mcs.anl.gov/** >> pipermail/petsc-users/2008-**May/003020.html] but I cannot get it to work for a simple 4 element problem. Can some one >> please expand on the strategy being suggested by Barry specially where the >> loops are discussed (I understand the arrays and the vecscatter part). I am >> also not sure how does using a t of 0.5 and 1 prevents double counting. >> Shouldn't it be 0 and 1? >> >> I am also aware of some capabilities in DMMESH but right now I dont >> understand it well enough to utilize it. >> > > I agree with you here. Its not easy enough to use. However, I have > rewritten the basics of it completely in C, which > will all have easy Fortran bindings. You can specify your mesh with > adjacency lists using Fortran arrays. The last part > I am working on is matrix preallocation. I would not normally ask you to > wait, but I think I am pretty close. Mail me > directly if you want more info. I think it should only take me a few weeks > to finish. > > Thanks, > > Matt > > Alternatively does PETSc support MATMPIAIJ of size=integer(1)/logical >> instead of real(8) which I can use to put zeros/ones (when I loop over >> elements the first time) to get the non-zero structure? >> >> Thanks in advance. >> >> Tabrez >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Jan 27 13:12:47 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 27 Jan 2012 13:12:47 -0600 Subject: [petsc-users] Preallocation woes again In-Reply-To: References: <4F22D194.3040103@geology.wisc.edu> Message-ID: On Fri, Jan 27, 2012 at 12:14, Matthew Knepley wrote: > Basically. You have to store indices while counting if you don't want to > overcount. This is the part he glosses over. 
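To make the counting step concrete: what is being described is a single pass over the local elements that records, for every locally owned row, each coupled column exactly once, and then splits the per-row counts into diagonal-block and off-diagonal-block entries for MatMPIAIJSetPreallocation. The sketch below (in C) is only an illustration, not the method advocated in the thread: it assumes one scalar unknown per node and assumes each process visits every element that touches one of its owned nodes (a ghosted element list); without that overlap the off-process contributions still have to be communicated, which is what the VecScatter construction in Barry's post handles. The names nelem, nen, conn and maxcol are illustrative, not PETSc symbols; the PETSc calls used are MatGetOwnershipRange, MatMPIAIJSetPreallocation and the usual memory routines.

#include <petscmat.h>

/* nelem  - number of local elements
   nen    - nodes per element
   conn   - conn[e*nen+a] = global node number of vertex a of element e
   maxcol - any safe bound on couplings per row (e.g. the 81 or 150 above) */
static PetscErrorCode PreallocateExact(Mat A,PetscInt nelem,PetscInt nen,const PetscInt *conn,PetscInt maxcol)
{
  PetscErrorCode ierr;
  PetscInt       rstart,rend,nrows,e,a,b,k,row,col,*cols,*ncols,*d_nnz,*o_nnz;

  PetscFunctionBegin;
  ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
  nrows = rend - rstart;
  ierr = PetscMalloc(nrows*maxcol*sizeof(PetscInt),&cols);CHKERRQ(ierr);
  ierr = PetscMalloc(nrows*sizeof(PetscInt),&ncols);CHKERRQ(ierr);
  ierr = PetscMalloc(nrows*sizeof(PetscInt),&d_nnz);CHKERRQ(ierr);
  ierr = PetscMalloc(nrows*sizeof(PetscInt),&o_nnz);CHKERRQ(ierr);
  ierr = PetscMemzero(ncols,nrows*sizeof(PetscInt));CHKERRQ(ierr);
  /* one pass over the elements: store each coupled column of an owned row once */
  for (e=0; e<nelem; e++) {
    for (a=0; a<nen; a++) {
      row = conn[e*nen+a] - rstart;
      if (row < 0 || row >= nrows) continue;              /* row owned by another process */
      for (b=0; b<nen; b++) {
        col = conn[e*nen+b];
        for (k=0; k<ncols[row]; k++) if (cols[row*maxcol+k] == col) break;
        if (k < ncols[row]) continue;                     /* column already recorded */
        if (ncols[row] == maxcol) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SUP,"maxcol too small");
        cols[row*maxcol + ncols[row]++] = col;
      }
    }
  }
  /* split each row's distinct columns into diagonal and off-diagonal blocks */
  for (row=0; row<nrows; row++) {
    d_nnz[row] = o_nnz[row] = 0;
    for (k=0; k<ncols[row]; k++) {
      col = cols[row*maxcol+k];
      if (col >= rstart && col < rend) d_nnz[row]++;
      else                             o_nnz[row]++;
    }
  }
  ierr = MatMPIAIJSetPreallocation(A,0,d_nnz,0,o_nnz);CHKERRQ(ierr);
  ierr = PetscFree(cols);CHKERRQ(ierr);
  ierr = PetscFree(ncols);CHKERRQ(ierr);
  ierr = PetscFree(d_nnz);CHKERRQ(ierr);
  ierr = PetscFree(o_nnz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

For a vector problem such as elasticity, each of a node's dof rows couples to dof times the node count computed here, or the per-node counts can be passed to a blocked format through MatMPIBAIJSetPreallocation.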
No, he has structure implicitly through the mesh. The mesh overcounts some points, but if you can easily determine how much it overcounts, then you have an efficient way to compute a non-redundant count. It's actually straightforward for low-order simplices, but not for more general bases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 27 13:15:12 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 27 Jan 2012 13:15:12 -0600 Subject: [petsc-users] Preallocation woes again In-Reply-To: References: <4F22D194.3040103@geology.wisc.edu> Message-ID: On Fri, Jan 27, 2012 at 1:12 PM, Jed Brown wrote: > On Fri, Jan 27, 2012 at 12:14, Matthew Knepley wrote: > >> Basically. You have to store indices while counting if you don't want to >> overcount. This is the part he glosses over. > > > No, he has structure implicitly through the mesh. The mesh overcounts some > points, but if you can easily determine how much it overcounts, then you > have an efficient way to compute a non-redundant count. It's actually > straightforward for low-order simplices, but not for more general bases. > I am doing completely general, in parallel. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Jan 28 15:25:19 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 28 Jan 2012 15:25:19 -0600 Subject: [petsc-users] multigrid_repartitioning In-Reply-To: <3A48520A-E9F8-40C4-A8F0-8E081827D9EA@columbia.edu> References: <4F22DBF9.7090607@uni-mainz.de> <3A48520A-E9F8-40C4-A8F0-8E081827D9EA@columbia.edu> Message-ID: On Fri, Jan 27, 2012 at 11:29, Mark F. Adams wrote: > GAMG does not repartition by default anymore -- it is very expensive. > GAMG does now do simple process aggregation on coarser grids if > repartitioning is not specified. Couldn't we at least do some cheap (e.g. greedy) repartitioning? Or at least squish out empty ranks so that the coarser levels tend to be nearby on the network? Would it make more sense to do this or to work on a "real" partitioner. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.adams at columbia.edu Sun Jan 29 08:08:33 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Sun, 29 Jan 2012 09:08:33 -0500 Subject: [petsc-users] multigrid_repartitioning In-Reply-To: References: <4F22DBF9.7090607@uni-mainz.de> <3A48520A-E9F8-40C4-A8F0-8E081827D9EA@columbia.edu> Message-ID: On Jan 28, 2012, at 4:25 PM, Jed Brown wrote: > On Fri, Jan 27, 2012 at 11:29, Mark F. Adams wrote: > GAMG does not repartition by default anymore -- it is very expensive. GAMG does now do simple process aggregation on coarser grids if repartitioning is not specified. > > Couldn't we at least do some cheap (e.g. greedy) repartitioning? Yes > Or at least squish out empty ranks so that the coarser levels tend to be nearby on the network? We do that now. > Would it make more sense to do this or to work on a "real" partitioner. I prefer to fold this into the new aggregation stuff. -------------- next part -------------- An HTML attachment was scrubbed... 
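For reference, the behaviour discussed in this exchange is driven by PCGAMG options. A minimal sketch, using the option and routine names as they appear in later PETSc releases (the petsc-dev snapshot of this thread may spell them differently, so the PCGAMG manual page is the authority):

#include <petscksp.h>

int main(int argc,char **argv)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCGAMG);CHKERRQ(ierr);       /* algebraic multigrid */
  /* equivalent runtime options:
       -pc_type gamg
       -pc_gamg_process_eq_limit 200   aggregate coarse grids onto fewer ranks (cheap)
       -pc_gamg_repartition true       fully repartition coarse grids (more setup cost) */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  /* ... KSPSetOperators() and KSPSolve() as usual ... */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

The process-equation limit corresponds to the simple coarse-grid process aggregation Mark describes as the current default behaviour; the repartition flag turns on the more expensive true repartitioning.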
URL: From recrusader at gmail.com Sun Jan 29 12:53:23 2012 From: recrusader at gmail.com (recrusader) Date: Sun, 29 Jan 2012 12:53:23 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number Message-ID: Dear PETSc developers, With your help, I can successfully PETSc-deve with enabling GPU and complex number. However, when I compiled the codes, I met some errors. I also tried to use simple codes to realize the same function. However, the errors disappear. One example is as follows: for the function "VecScale_SeqCUSP" "#undef __FUNCT__ #define __FUNCT__ "VecScale_SeqCUSP" PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) { CUSPARRAY *xarray; PetscErrorCode ierr; PetscFunctionBegin; if (alpha == 0.0) { ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); } else if (alpha != 1.0) { ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); try { cusp::blas::scal(*xarray,alpha); } catch(char* ex) { SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); } ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); } ierr = WaitForGPU();CHKERRCUSP(ierr); ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); PetscFunctionReturn(0); } " When I compiled PETSc-dev, I met the following errors: " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: calling a __host__ function from a __host__ __device__ function is not allowed detected during: instantiation of "void cusp::blas::detail::SCAL::operator()(T2 &) [with T=std::complex, T2=PetscScalar]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): here instantiation of "void thrust::detail::device::cuda::for_each_n_closure::operator()() [with RandomAccessIterator=thrust::detail::normal_iterator>, Size=long, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): here instantiation of "void thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) [with NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, long, cusp::blas::detail::SCAL>>]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): here instantiation of "size_t thrust::detail::device::cuda::detail::closure_launcher_base::block_size_with_maximal_occupancy(size_t) [with NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, long, cusp::blas::detail::SCAL>>, launch_by_value=true]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): here instantiation of "thrust::pair thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) [with NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, long, cusp::blas::detail::SCAL>>, Size=long]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): here [ 6 instantiation contexts not shown ] instantiation of "InputIterator thrust::detail::dispatch::for_each(InputIterator, InputIterator, UnaryFunction, thrust::device_space_tag) [with InputIterator=thrust::detail::normal_iterator>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here instantiation of "InputIterator thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here instantiation of "void 
thrust::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator>, UnaryFunction=cusp::blas::detail::SCAL>]" (367): here instantiation of "void cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) [with ForwardIterator=thrust::detail::normal_iterator>, ScalarType=std::complex]" (748): here instantiation of "void cusp::blas::scal(Array &, ScalarType) [with Array=cusp::array1d, ScalarType=std::complex]" veccusp.cu(1185): here /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): error: a value of type "int" cannot be assigned to an entity of type "_ZNSt7complexIdE9_ComplexTE" " However, I further realize simiar codes as " #include #include #include #include #include #include int main(void) { cusp::array1d, cusp::host_memory> *x; x=new cusp::array1d, cusp::host_memory>(2,0.0); std::complex alpha(1,2.0); cusp::blas::scal(*x,alpha); return 0; } " When I complied it using "nvcc gputest.cu -o gputest", I only meet warning information as follows: " /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): warning: calling a __host__ function from a __host__ __device__ function is not allowed detected during: instantiation of "void cusp::blas::detail::SCAL::operator()(T2 &) [with T=std::complex, T2=std::complex]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): here instantiation of "InputIterator thrust::detail::host::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): here instantiation of "InputIterator thrust::detail::dispatch::for_each(InputIterator, InputIterator, UnaryFunction, thrust::host_space_tag) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): here instantiation of "InputIterator thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): here instantiation of "void thrust::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" (367): here instantiation of "void cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) [with ForwardIterator=thrust::detail::normal_iterator *>, ScalarType=std::complex]" (748): here instantiation of "void cusp::blas::scal(Array &, ScalarType) [with Array=cusp::array1d, cusp::host_memory>, ScalarType=std::complex]" gputest.cu(25): here " There are not errors like "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): error: a value of type "int" cannot be assigned to an entity of type "_ZNSt7complexIdE9_ComplexTE" " Furthermore, the warning information is also different between PETSc-dev and simple codes. Could you give me some suggestion for this errors? Thank you very much. 
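A note on the error itself: the instantiation that hard-fails is the device path of cusp::blas::scal over std::complex (the mangled name _ZNSt7complexIdE9_ComplexTE is the internal _ComplexT member of std::complex<double>). std::complex has no __device__ operators, which is why the host_memory test only produces the "calling a __host__ function" warning while the device_memory case, which must generate a CUDA kernel closure, fails outright. A device-side variant would normally be written with CUSP's own complex type instead; a minimal sketch, assuming the cusp/complex.h header shipped with this CUSP release:

#include <cusp/complex.h>
#include <cusp/array1d.h>
#include <cusp/blas.h>

int main(void)
{
  /* device-memory analogue of the host test above, with cusp::complex,
     whose operators are __host__ __device__, in place of std::complex */
  cusp::array1d<cusp::complex<double>, cusp::device_memory> x(2, cusp::complex<double>(0.0, 0.0));
  cusp::complex<double> alpha(1.0, 2.0);
  cusp::blas::scal(x, alpha);
  return 0;
}

If this compiles cleanly while the std::complex version does not, the problem is in how PetscScalar is mapped to the CUSP array element type rather than in the build configuration.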
Best, Yujie From knepley at gmail.com Sun Jan 29 13:00:43 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 29 Jan 2012 13:00:43 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Sun, Jan 29, 2012 at 12:53 PM, recrusader wrote: > Dear PETSc developers, > > With your help, I can successfully PETSc-deve with enabling GPU and > complex number. > However, when I compiled the codes, I met some errors. I also tried to > use simple codes to realize the same function. However, the errors > disappear. One example is as follows: > > for the function "VecScale_SeqCUSP" > "#undef __FUNCT__ > #define __FUNCT__ "VecScale_SeqCUSP" > PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) > { > CUSPARRAY *xarray; > PetscErrorCode ierr; > > PetscFunctionBegin; > if (alpha == 0.0) { > ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); > } else if (alpha != 1.0) { > ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); > try { > cusp::blas::scal(*xarray,alpha); > } catch(char* ex) { > SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); > } > ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); > } > ierr = WaitForGPU();CHKERRCUSP(ierr); > ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); > PetscFunctionReturn(0); > } " > > When I compiled PETSc-dev, I met the following errors: > " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: > calling a __host__ function from a __host__ __device__ function is not > allowed > detected during: > instantiation of "void > cusp::blas::detail::SCAL::operator()(T2 &) [with > T=std::complex, T2=PetscScalar]" > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): > here > instantiation of "void > thrust::detail::device::cuda::for_each_n_closure Size, UnaryFunction>::operator()() [with > > RandomAccessIterator=thrust::detail::normal_iterator>, > Size=long, UnaryFunction=cusp::blas::detail::SCAL>]" > > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > here > instantiation of "void > > thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) > [with > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > long, cusp::blas::detail::SCAL>>]" > > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): > here > instantiation of "size_t > > thrust::detail::device::cuda::detail::closure_launcher_base launch_by_value>::block_size_with_maximal_occupancy(size_t) [with > > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > long, cusp::blas::detail::SCAL>>, > launch_by_value=true]" > > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): > here > instantiation of "thrust::pair > > thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) > [with > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > long, cusp::blas::detail::SCAL>>, Size=long]" > > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): > here > [ 6 instantiation contexts not shown ] > instantiation of "InputIterator > thrust::detail::dispatch::for_each(InputIterator, InputIterator, > UnaryFunction, thrust::device_space_tag) [with > > InputIterator=thrust::detail::normal_iterator>, > UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here > 
instantiation of "InputIterator > thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) > [with > InputIterator=thrust::detail::normal_iterator>, > UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here > instantiation of "void thrust::for_each(InputIterator, > InputIterator, UnaryFunction) [with > > InputIterator=thrust::detail::normal_iterator>, > UnaryFunction=cusp::blas::detail::SCAL>]" > (367): here > instantiation of "void > cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) > [with > ForwardIterator=thrust::detail::normal_iterator>, > ScalarType=std::complex]" > (748): here > instantiation of "void cusp::blas::scal(Array &, > ScalarType) [with Array=cusp::array1d cusp::device_memory>, ScalarType=std::complex]" > veccusp.cu(1185): here > > > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > error: a value of type "int" cannot be assigned to an entity of type > "_ZNSt7complexIdE9_ComplexTE" > > " > However, I further realize simiar codes as > " > #include > #include > #include > #include > #include > #include > > int main(void) > { > cusp::array1d, cusp::host_memory> *x; > > x=new cusp::array1d, cusp::host_memory>(2,0.0); > > std::complex alpha(1,2.0); > cusp::blas::scal(*x,alpha); > > return 0; > } > " > > When I complied it using "nvcc gputest.cu -o gputest", I only meet > warning information as follows: > " > /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): > warning: calling a __host__ function from a __host__ __device__ > function is not allowed > detected during: > instantiation of "void > cusp::blas::detail::SCAL::operator()(T2 &) [with > T=std::complex, T2=std::complex]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): > here > instantiation of "InputIterator > thrust::detail::host::for_each(InputIterator, InputIterator, > UnaryFunction) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): > here > instantiation of "InputIterator > thrust::detail::dispatch::for_each(InputIterator, InputIterator, > UnaryFunction, thrust::host_space_tag) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): here > instantiation of "InputIterator > thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) > [with InputIterator=thrust::detail::normal_iterator > *>, UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): here > instantiation of "void thrust::for_each(InputIterator, > InputIterator, UnaryFunction) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > (367): here > instantiation of "void > cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) > [with ForwardIterator=thrust::detail::normal_iterator > *>, ScalarType=std::complex]" > (748): here > instantiation of "void cusp::blas::scal(Array &, > ScalarType) [with Array=cusp::array1d, > cusp::host_memory>, ScalarType=std::complex]" > gputest.cu(25): here > > " > There are not errors like > > "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > error: a value of type "int" cannot be assigned to an entity of type > 
"_ZNSt7complexIdE9_ComplexTE" " > > Furthermore, the warning information is also different between > PETSc-dev and simple codes. > > Could you give me some suggestion for this errors? Thank you very much. > The headers are complicated to get right. The whole point of what we did is to give a way to use GPU simply through the existing PETSc linear algebra interface. Matt > Best, > Yujie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sun Jan 29 13:05:22 2012 From: recrusader at gmail.com (recrusader) Date: Sun, 29 Jan 2012 13:05:22 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: Thank you very much, Matt, You mean the headers of the simple codes, I further simply the codes as " #include #include int main(void) { cusp::array1d, cusp::host_memory> *x; x=new cusp::array1d, cusp::host_memory>(2,0.0); std::complex alpha(1,2.0); cusp::blas::scal(*x,alpha); return 0; }" I got the same compilation results " login1$ nvcc gputest.cu -o gputest /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): warning: calling a __host__ function from a __host__ __device__ function is not allowed detected during: instantiation of "void cusp::blas::detail::SCAL::operator()(T2 &) [with T=std::complex, T2=std::complex]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): here instantiation of "InputIterator thrust::detail::host::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): here instantiation of "InputIterator thrust::detail::dispatch::for_each(InputIterator, InputIterator, UnaryFunction, thrust::host_space_tag) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): here instantiation of "InputIterator thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): here instantiation of "void thrust::for_each(InputIterator, InputIterator, UnaryFunction) [with InputIterator=thrust::detail::normal_iterator *>, UnaryFunction=cusp::blas::detail::SCAL>]" (367): here instantiation of "void cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) [with ForwardIterator=thrust::detail::normal_iterator *>, ScalarType=std::complex]" (748): here instantiation of "void cusp::blas::scal(Array &, ScalarType) [with Array=cusp::array1d, cusp::host_memory>, ScalarType=std::complex]" gputest.cu(25): here " Thanks a lot. Best, Yujie On 1/29/12, Matthew Knepley wrote: > On Sun, Jan 29, 2012 at 12:53 PM, recrusader wrote: > >> Dear PETSc developers, >> >> With your help, I can successfully PETSc-deve with enabling GPU and >> complex number. >> However, when I compiled the codes, I met some errors. I also tried to >> use simple codes to realize the same function. However, the errors >> disappear. 
One example is as follows: >> >> for the function "VecScale_SeqCUSP" >> "#undef __FUNCT__ >> #define __FUNCT__ "VecScale_SeqCUSP" >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >> { >> CUSPARRAY *xarray; >> PetscErrorCode ierr; >> >> PetscFunctionBegin; >> if (alpha == 0.0) { >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >> } else if (alpha != 1.0) { >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> try { >> cusp::blas::scal(*xarray,alpha); >> } catch(char* ex) { >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >> } >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> } >> ierr = WaitForGPU();CHKERRCUSP(ierr); >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >> PetscFunctionReturn(0); >> } " >> >> When I compiled PETSc-dev, I met the following errors: >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >> calling a __host__ function from a __host__ __device__ function is not >> allowed >> detected during: >> instantiation of "void >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> T=std::complex, T2=PetscScalar]" >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >> here >> instantiation of "void >> thrust::detail::device::cuda::for_each_n_closure> Size, UnaryFunction>::operator()() [with >> >> RandomAccessIterator=thrust::detail::normal_iterator>, >> Size=long, UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> here >> instantiation of "void >> >> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >> [with >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> long, cusp::blas::detail::SCAL>>]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >> here >> instantiation of "size_t >> >> thrust::detail::device::cuda::detail::closure_launcher_base> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> long, cusp::blas::detail::SCAL>>, >> launch_by_value=true]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >> here >> instantiation of "thrust::pair >> >> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >> [with >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> long, cusp::blas::detail::SCAL>>, Size=long]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >> here >> [ 6 instantiation contexts not shown ] >> instantiation of "InputIterator >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> UnaryFunction, thrust::device_space_tag) [with >> >> InputIterator=thrust::detail::normal_iterator>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >> instantiation of "InputIterator >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> [with >> InputIterator=thrust::detail::normal_iterator>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >> instantiation of "void thrust::for_each(InputIterator, >> InputIterator, UnaryFunction) [with >> >> InputIterator=thrust::detail::normal_iterator>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> (367): here >> instantiation of "void >> 
cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> [with >> ForwardIterator=thrust::detail::normal_iterator>, >> ScalarType=std::complex]" >> (748): here >> instantiation of "void cusp::blas::scal(Array &, >> ScalarType) [with Array=cusp::array1d> cusp::device_memory>, ScalarType=std::complex]" >> veccusp.cu(1185): here >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> error: a value of type "int" cannot be assigned to an entity of type >> "_ZNSt7complexIdE9_ComplexTE" >> >> " >> However, I further realize simiar codes as >> " >> #include >> #include >> #include >> #include >> #include >> #include >> >> int main(void) >> { >> cusp::array1d, cusp::host_memory> *x; >> >> x=new cusp::array1d, cusp::host_memory>(2,0.0); >> >> std::complex alpha(1,2.0); >> cusp::blas::scal(*x,alpha); >> >> return 0; >> } >> " >> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >> warning information as follows: >> " >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >> warning: calling a __host__ function from a __host__ __device__ >> function is not allowed >> detected during: >> instantiation of "void >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> T=std::complex, T2=std::complex]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >> here >> instantiation of "InputIterator >> thrust::detail::host::for_each(InputIterator, InputIterator, >> UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >> here >> instantiation of "InputIterator >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> UnaryFunction, thrust::host_space_tag) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >> here >> instantiation of "InputIterator >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> [with InputIterator=thrust::detail::normal_iterator >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >> here >> instantiation of "void thrust::for_each(InputIterator, >> InputIterator, UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> (367): here >> instantiation of "void >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> [with ForwardIterator=thrust::detail::normal_iterator >> *>, ScalarType=std::complex]" >> (748): here >> instantiation of "void cusp::blas::scal(Array &, >> ScalarType) [with Array=cusp::array1d, >> cusp::host_memory>, ScalarType=std::complex]" >> gputest.cu(25): here >> >> " >> There are not errors like >> >> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> error: a value of type "int" cannot be assigned to an entity of type >> "_ZNSt7complexIdE9_ComplexTE" " >> >> Furthermore, the warning information is also different between >> PETSc-dev and simple codes. >> >> Could you give me some suggestion for this errors? Thank you very much. >> > > The headers are complicated to get right. The whole point of what we did is > to give a way to use GPU > simply through the existing PETSc linear algebra interface. 
> > Matt > > >> Best, >> Yujie >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > From knepley at gmail.com Sun Jan 29 13:20:56 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 29 Jan 2012 13:20:56 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: > Thank you very much, Matt, > > You mean the headers of the simple codes, I further simply the codes as This is a question for the CUSP mailing list. Thanks, Matt > " > #include > #include > > int main(void) > { > cusp::array1d, cusp::host_memory> *x; > > x=new cusp::array1d, cusp::host_memory>(2,0.0); > > std::complex alpha(1,2.0); > cusp::blas::scal(*x,alpha); > > return 0; > }" > > I got the same compilation results " > login1$ nvcc gputest.cu -o gputest > /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): > warning: calling a __host__ function from a __host__ __device__ > function is not allowed > detected during: > instantiation of "void > cusp::blas::detail::SCAL::operator()(T2 &) [with > T=std::complex, T2=std::complex]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): > here > instantiation of "InputIterator > thrust::detail::host::for_each(InputIterator, InputIterator, > UnaryFunction) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): > here > instantiation of "InputIterator > thrust::detail::dispatch::for_each(InputIterator, InputIterator, > UnaryFunction, thrust::host_space_tag) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): here > instantiation of "InputIterator > thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) > [with InputIterator=thrust::detail::normal_iterator > *>, UnaryFunction=cusp::blas::detail::SCAL>]" > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): here > instantiation of "void thrust::for_each(InputIterator, > InputIterator, UnaryFunction) [with > InputIterator=thrust::detail::normal_iterator *>, > UnaryFunction=cusp::blas::detail::SCAL>]" > (367): here > instantiation of "void > cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) > [with ForwardIterator=thrust::detail::normal_iterator > *>, ScalarType=std::complex]" > (748): here > instantiation of "void cusp::blas::scal(Array &, > ScalarType) [with Array=cusp::array1d, > cusp::host_memory>, ScalarType=std::complex]" > gputest.cu(25): here > > " > > Thanks a lot. > > Best, > Yujie > > On 1/29/12, Matthew Knepley wrote: > > On Sun, Jan 29, 2012 at 12:53 PM, recrusader > wrote: > > > >> Dear PETSc developers, > >> > >> With your help, I can successfully PETSc-deve with enabling GPU and > >> complex number. > >> However, when I compiled the codes, I met some errors. I also tried to > >> use simple codes to realize the same function. However, the errors > >> disappear. 
One example is as follows: > >> > >> for the function "VecScale_SeqCUSP" > >> "#undef __FUNCT__ > >> #define __FUNCT__ "VecScale_SeqCUSP" > >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) > >> { > >> CUSPARRAY *xarray; > >> PetscErrorCode ierr; > >> > >> PetscFunctionBegin; > >> if (alpha == 0.0) { > >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); > >> } else if (alpha != 1.0) { > >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); > >> try { > >> cusp::blas::scal(*xarray,alpha); > >> } catch(char* ex) { > >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); > >> } > >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); > >> } > >> ierr = WaitForGPU();CHKERRCUSP(ierr); > >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); > >> PetscFunctionReturn(0); > >> } " > >> > >> When I compiled PETSc-dev, I met the following errors: > >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: > >> calling a __host__ function from a __host__ __device__ function is not > >> allowed > >> detected during: > >> instantiation of "void > >> cusp::blas::detail::SCAL::operator()(T2 &) [with > >> T=std::complex, T2=PetscScalar]" > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): > >> here > >> instantiation of "void > >> thrust::detail::device::cuda::for_each_n_closure >> Size, UnaryFunction>::operator()() [with > >> > >> > RandomAccessIterator=thrust::detail::normal_iterator>, > >> Size=long, > UnaryFunction=cusp::blas::detail::SCAL>]" > >> > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > >> here > >> instantiation of "void > >> > >> > thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) > >> [with > >> > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > >> long, cusp::blas::detail::SCAL>>]" > >> > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): > >> here > >> instantiation of "size_t > >> > >> > thrust::detail::device::cuda::detail::closure_launcher_base >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with > >> > >> > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > >> long, cusp::blas::detail::SCAL>>, > >> launch_by_value=true]" > >> > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): > >> here > >> instantiation of "thrust::pair > >> > >> > thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) > >> [with > >> > NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, > >> long, cusp::blas::detail::SCAL>>, Size=long]" > >> > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): > >> here > >> [ 6 instantiation contexts not shown ] > >> instantiation of "InputIterator > >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, > >> UnaryFunction, thrust::device_space_tag) [with > >> > >> > InputIterator=thrust::detail::normal_iterator>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here > >> instantiation of "InputIterator > >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) > >> [with > >> > InputIterator=thrust::detail::normal_iterator>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here > >> instantiation of "void 
thrust::for_each(InputIterator, > >> InputIterator, UnaryFunction) [with > >> > >> > InputIterator=thrust::detail::normal_iterator>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> (367): here > >> instantiation of "void > >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) > >> [with > >> > ForwardIterator=thrust::detail::normal_iterator>, > >> ScalarType=std::complex]" > >> (748): here > >> instantiation of "void cusp::blas::scal(Array &, > >> ScalarType) [with Array=cusp::array1d >> cusp::device_memory>, ScalarType=std::complex]" > >> veccusp.cu(1185): here > >> > >> > >> > /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > >> error: a value of type "int" cannot be assigned to an entity of type > >> "_ZNSt7complexIdE9_ComplexTE" > >> > >> " > >> However, I further realize simiar codes as > >> " > >> #include > >> #include > >> #include > >> #include > >> #include > >> #include > >> > >> int main(void) > >> { > >> cusp::array1d, cusp::host_memory> *x; > >> > >> x=new cusp::array1d, cusp::host_memory>(2,0.0); > >> > >> std::complex alpha(1,2.0); > >> cusp::blas::scal(*x,alpha); > >> > >> return 0; > >> } > >> " > >> > >> When I complied it using "nvcc gputest.cu -o gputest", I only meet > >> warning information as follows: > >> " > >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): > >> warning: calling a __host__ function from a __host__ __device__ > >> function is not allowed > >> detected during: > >> instantiation of "void > >> cusp::blas::detail::SCAL::operator()(T2 &) [with > >> T=std::complex, T2=std::complex]" > >> > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): > >> here > >> instantiation of "InputIterator > >> thrust::detail::host::for_each(InputIterator, InputIterator, > >> UnaryFunction) [with > >> InputIterator=thrust::detail::normal_iterator *>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> > >> > /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): > >> here > >> instantiation of "InputIterator > >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, > >> UnaryFunction, thrust::host_space_tag) [with > >> InputIterator=thrust::detail::normal_iterator *>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): > >> here > >> instantiation of "InputIterator > >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) > >> [with InputIterator=thrust::detail::normal_iterator > >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" > >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): > >> here > >> instantiation of "void thrust::for_each(InputIterator, > >> InputIterator, UnaryFunction) [with > >> InputIterator=thrust::detail::normal_iterator *>, > >> UnaryFunction=cusp::blas::detail::SCAL>]" > >> (367): here > >> instantiation of "void > >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) > >> [with > ForwardIterator=thrust::detail::normal_iterator > >> *>, ScalarType=std::complex]" > >> (748): here > >> instantiation of "void cusp::blas::scal(Array &, > >> ScalarType) [with Array=cusp::array1d, > >> cusp::host_memory>, ScalarType=std::complex]" > >> gputest.cu(25): here > >> > >> " > >> There are not errors like > >> > >> > "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): > >> error: a value of type "int" cannot be assigned to an entity of type > >> 
"_ZNSt7complexIdE9_ComplexTE" " > >> > >> Furthermore, the warning information is also different between > >> PETSc-dev and simple codes. > >> > >> Could you give me some suggestion for this errors? Thank you very much. > >> > > > > The headers are complicated to get right. The whole point of what we did > is > > to give a way to use GPU > > simply through the existing PETSc linear algebra interface. > > > > Matt > > > > > >> Best, > >> Yujie > >> > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > their > > experiments lead. > > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sun Jan 29 13:24:30 2012 From: recrusader at gmail.com (recrusader) Date: Sun, 29 Jan 2012 13:24:30 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: I have thought I send this question to CUSP mailing list. Therefore, I wish I could get the same errors using the simple codes. However, the errors disappear. Is it possible to provide simple codes with PETSc to show the errors? Thanks again. Best, Yujie On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: > On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: > >> Thank you very much, Matt, >> >> You mean the headers of the simple codes, I further simply the codes as > > > This is a question for the CUSP mailing list. > > Thanks, > > Matt > > >> " >> #include >> #include >> >> int main(void) >> { >> cusp::array1d, cusp::host_memory> *x; >> >> x=new cusp::array1d, cusp::host_memory>(2,0.0); >> >> std::complex alpha(1,2.0); >> cusp::blas::scal(*x,alpha); >> >> return 0; >> }" >> >> I got the same compilation results " >> login1$ nvcc gputest.cu -o gputest >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >> warning: calling a __host__ function from a __host__ __device__ >> function is not allowed >> detected during: >> instantiation of "void >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> T=std::complex, T2=std::complex]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >> here >> instantiation of "InputIterator >> thrust::detail::host::for_each(InputIterator, InputIterator, >> UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >> here >> instantiation of "InputIterator >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> UnaryFunction, thrust::host_space_tag) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >> here >> instantiation of "InputIterator >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> [with InputIterator=thrust::detail::normal_iterator >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >> here >> instantiation of "void thrust::for_each(InputIterator, >> InputIterator, UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> 
UnaryFunction=cusp::blas::detail::SCAL>]" >> (367): here >> instantiation of "void >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> [with ForwardIterator=thrust::detail::normal_iterator >> *>, ScalarType=std::complex]" >> (748): here >> instantiation of "void cusp::blas::scal(Array &, >> ScalarType) [with Array=cusp::array1d, >> cusp::host_memory>, ScalarType=std::complex]" >> gputest.cu(25): here >> >> " >> >> Thanks a lot. >> >> Best, >> Yujie >> >> On 1/29/12, Matthew Knepley wrote: >> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >> wrote: >> > >> >> Dear PETSc developers, >> >> >> >> With your help, I can successfully PETSc-deve with enabling GPU and >> >> complex number. >> >> However, when I compiled the codes, I met some errors. I also tried to >> >> use simple codes to realize the same function. However, the errors >> >> disappear. One example is as follows: >> >> >> >> for the function "VecScale_SeqCUSP" >> >> "#undef __FUNCT__ >> >> #define __FUNCT__ "VecScale_SeqCUSP" >> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >> >> { >> >> CUSPARRAY *xarray; >> >> PetscErrorCode ierr; >> >> >> >> PetscFunctionBegin; >> >> if (alpha == 0.0) { >> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >> >> } else if (alpha != 1.0) { >> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> >> try { >> >> cusp::blas::scal(*xarray,alpha); >> >> } catch(char* ex) { >> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >> >> } >> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> >> } >> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >> >> PetscFunctionReturn(0); >> >> } " >> >> >> >> When I compiled PETSc-dev, I met the following errors: >> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >> >> calling a __host__ function from a __host__ __device__ function is not >> >> allowed >> >> detected during: >> >> instantiation of "void >> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> >> T=std::complex, T2=PetscScalar]" >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >> >> here >> >> instantiation of "void >> >> thrust::detail::device::cuda::for_each_n_closure> >> Size, UnaryFunction>::operator()() [with >> >> >> >> >> RandomAccessIterator=thrust::detail::normal_iterator>, >> >> Size=long, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> here >> >> instantiation of "void >> >> >> >> >> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >> >> [with >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >> >> here >> >> instantiation of "size_t >> >> >> >> >> thrust::detail::device::cuda::detail::closure_launcher_base> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >> >> >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>, >> >> launch_by_value=true]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >> >> here >> >> instantiation of "thrust::pair >> >> >> >> >> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >> >> 
[with >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>, Size=long]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >> >> here >> >> [ 6 instantiation contexts not shown ] >> >> instantiation of "InputIterator >> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> >> UnaryFunction, thrust::device_space_tag) [with >> >> >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >> >> instantiation of "InputIterator >> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> >> [with >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >> >> instantiation of "void thrust::for_each(InputIterator, >> >> InputIterator, UnaryFunction) [with >> >> >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> (367): here >> >> instantiation of "void >> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> >> [with >> >> >> ForwardIterator=thrust::detail::normal_iterator>, >> >> ScalarType=std::complex]" >> >> (748): here >> >> instantiation of "void cusp::blas::scal(Array &, >> >> ScalarType) [with Array=cusp::array1d> >> cusp::device_memory>, ScalarType=std::complex]" >> >> veccusp.cu(1185): here >> >> >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> error: a value of type "int" cannot be assigned to an entity of type >> >> "_ZNSt7complexIdE9_ComplexTE" >> >> >> >> " >> >> However, I further realize simiar codes as >> >> " >> >> #include >> >> #include >> >> #include >> >> #include >> >> #include >> >> #include >> >> >> >> int main(void) >> >> { >> >> cusp::array1d, cusp::host_memory> *x; >> >> >> >> x=new cusp::array1d, cusp::host_memory>(2,0.0); >> >> >> >> std::complex alpha(1,2.0); >> >> cusp::blas::scal(*x,alpha); >> >> >> >> return 0; >> >> } >> >> " >> >> >> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >> >> warning information as follows: >> >> " >> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >> >> warning: calling a __host__ function from a __host__ __device__ >> >> function is not allowed >> >> detected during: >> >> instantiation of "void >> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> >> T=std::complex, T2=std::complex]" >> >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::host::for_each(InputIterator, InputIterator, >> >> UnaryFunction) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> >> UnaryFunction, thrust::host_space_tag) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::for_each(InputIterator, InputIterator, 
UnaryFunction) >> >> [with >> InputIterator=thrust::detail::normal_iterator >> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >> >> here >> >> instantiation of "void thrust::for_each(InputIterator, >> >> InputIterator, UnaryFunction) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> (367): here >> >> instantiation of "void >> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> >> [with >> ForwardIterator=thrust::detail::normal_iterator >> >> *>, ScalarType=std::complex]" >> >> (748): here >> >> instantiation of "void cusp::blas::scal(Array &, >> >> ScalarType) [with Array=cusp::array1d, >> >> cusp::host_memory>, ScalarType=std::complex]" >> >> gputest.cu(25): here >> >> >> >> " >> >> There are not errors like >> >> >> >> >> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> error: a value of type "int" cannot be assigned to an entity of type >> >> "_ZNSt7complexIdE9_ComplexTE" " >> >> >> >> Furthermore, the warning information is also different between >> >> PETSc-dev and simple codes. >> >> >> >> Could you give me some suggestion for this errors? Thank you very much. >> >> >> > >> > The headers are complicated to get right. The whole point of what we >> did is >> > to give a way to use GPU >> > simply through the existing PETSc linear algebra interface. >> > >> > Matt >> > >> > >> >> Best, >> >> Yujie >> >> >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments is infinitely more interesting than any results to which >> their >> > experiments lead. >> > -- Norbert Wiener >> > >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Jan 29 13:29:47 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 29 Jan 2012 13:29:47 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Sun, Jan 29, 2012 at 1:24 PM, recrusader wrote: > I have thought I send this question to CUSP mailing list. Therefore, I > wish I could get the same errors using the simple codes. However, the > errors disappear. > The code below has no PETSc library calls. I really do not understand what you are asking. Can you run a simple PETSc example with the GPU? cd src/snes/examples/tutorials make ex5 ./ex5 -vec_type cusp -log_summary Matt > Is it possible to provide simple codes with PETSc to show the errors? > Thanks again. > > Best, > Yujie > > On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: > >> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >> >>> Thank you very much, Matt, >>> >>> You mean the headers of the simple codes, I further simply the codes as >> >> >> This is a question for the CUSP mailing list. 
>> >> Thanks, >> >> Matt >> >> >>> " >>> #include >>> #include >>> >>> int main(void) >>> { >>> cusp::array1d, cusp::host_memory> *x; >>> >>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>> >>> std::complex alpha(1,2.0); >>> cusp::blas::scal(*x,alpha); >>> >>> return 0; >>> }" >>> >>> I got the same compilation results " >>> login1$ nvcc gputest.cu -o gputest >>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>> warning: calling a __host__ function from a __host__ __device__ >>> function is not allowed >>> detected during: >>> instantiation of "void >>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> T=std::complex, T2=std::complex]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>> here >>> instantiation of "InputIterator >>> thrust::detail::host::for_each(InputIterator, InputIterator, >>> UnaryFunction) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>> here >>> instantiation of "InputIterator >>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> UnaryFunction, thrust::host_space_tag) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>> here >>> instantiation of "InputIterator >>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> [with InputIterator=thrust::detail::normal_iterator >>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>> here >>> instantiation of "void thrust::for_each(InputIterator, >>> InputIterator, UnaryFunction) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> (367): here >>> instantiation of "void >>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> [with >>> ForwardIterator=thrust::detail::normal_iterator >>> *>, ScalarType=std::complex]" >>> (748): here >>> instantiation of "void cusp::blas::scal(Array &, >>> ScalarType) [with Array=cusp::array1d, >>> cusp::host_memory>, ScalarType=std::complex]" >>> gputest.cu(25): here >>> >>> " >>> >>> Thanks a lot. >>> >>> Best, >>> Yujie >>> >>> On 1/29/12, Matthew Knepley wrote: >>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>> wrote: >>> > >>> >> Dear PETSc developers, >>> >> >>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>> >> complex number. >>> >> However, when I compiled the codes, I met some errors. I also tried to >>> >> use simple codes to realize the same function. However, the errors >>> >> disappear. 
One example is as follows: >>> >> >>> >> for the function "VecScale_SeqCUSP" >>> >> "#undef __FUNCT__ >>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>> >> { >>> >> CUSPARRAY *xarray; >>> >> PetscErrorCode ierr; >>> >> >>> >> PetscFunctionBegin; >>> >> if (alpha == 0.0) { >>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>> >> } else if (alpha != 1.0) { >>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>> >> try { >>> >> cusp::blas::scal(*xarray,alpha); >>> >> } catch(char* ex) { >>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>> >> } >>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>> >> } >>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>> >> PetscFunctionReturn(0); >>> >> } " >>> >> >>> >> When I compiled PETSc-dev, I met the following errors: >>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >>> >> calling a __host__ function from a __host__ __device__ function is not >>> >> allowed >>> >> detected during: >>> >> instantiation of "void >>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> >> T=std::complex, T2=PetscScalar]" >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>> >> here >>> >> instantiation of "void >>> >> thrust::detail::device::cuda::for_each_n_closure>> >> Size, UnaryFunction>::operator()() [with >>> >> >>> >> >>> RandomAccessIterator=thrust::detail::normal_iterator>, >>> >> Size=long, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> here >>> >> instantiation of "void >>> >> >>> >> >>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>> >> [with >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>> >> here >>> >> instantiation of "size_t >>> >> >>> >> >>> thrust::detail::device::cuda::detail::closure_launcher_base>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>> >> >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>, >>> >> launch_by_value=true]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>> >> here >>> >> instantiation of "thrust::pair >>> >> >>> >> >>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>> >> [with >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>> >> here >>> >> [ 6 instantiation contexts not shown ] >>> >> instantiation of "InputIterator >>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> >> UnaryFunction, thrust::device_space_tag) [with >>> >> >>> >> >>> InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>> >> instantiation of "InputIterator >>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> >> [with >>> >> >>> 
InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>> >> instantiation of "void thrust::for_each(InputIterator, >>> >> InputIterator, UnaryFunction) [with >>> >> >>> >> >>> InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> (367): here >>> >> instantiation of "void >>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> >> [with >>> >> >>> ForwardIterator=thrust::detail::normal_iterator>, >>> >> ScalarType=std::complex]" >>> >> (748): here >>> >> instantiation of "void cusp::blas::scal(Array &, >>> >> ScalarType) [with Array=cusp::array1d>> >> cusp::device_memory>, ScalarType=std::complex]" >>> >> veccusp.cu(1185): here >>> >> >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> error: a value of type "int" cannot be assigned to an entity of type >>> >> "_ZNSt7complexIdE9_ComplexTE" >>> >> >>> >> " >>> >> However, I further realize simiar codes as >>> >> " >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> >>> >> int main(void) >>> >> { >>> >> cusp::array1d, cusp::host_memory> *x; >>> >> >>> >> x=new cusp::array1d, >>> cusp::host_memory>(2,0.0); >>> >> >>> >> std::complex alpha(1,2.0); >>> >> cusp::blas::scal(*x,alpha); >>> >> >>> >> return 0; >>> >> } >>> >> " >>> >> >>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>> >> warning information as follows: >>> >> " >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>> >> warning: calling a __host__ function from a __host__ __device__ >>> >> function is not allowed >>> >> detected during: >>> >> instantiation of "void >>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> >> T=std::complex, T2=std::complex]" >>> >> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>> >> UnaryFunction) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> >> UnaryFunction, thrust::host_space_tag) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> >> [with >>> InputIterator=thrust::detail::normal_iterator >>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>> >> here >>> >> instantiation of "void thrust::for_each(InputIterator, >>> >> InputIterator, UnaryFunction) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> (367): here >>> >> instantiation of "void >>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> >> [with >>> ForwardIterator=thrust::detail::normal_iterator >>> >> *>, ScalarType=std::complex]" >>> >> (748): here >>> >> 
instantiation of "void cusp::blas::scal(Array &, >>> >> ScalarType) [with Array=cusp::array1d, >>> >> cusp::host_memory>, ScalarType=std::complex]" >>> >> gputest.cu(25): here >>> >> >>> >> " >>> >> There are not errors like >>> >> >>> >> >>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> error: a value of type "int" cannot be assigned to an entity of type >>> >> "_ZNSt7complexIdE9_ComplexTE" " >>> >> >>> >> Furthermore, the warning information is also different between >>> >> PETSc-dev and simple codes. >>> >> >>> >> Could you give me some suggestion for this errors? Thank you very >>> much. >>> >> >>> > >>> > The headers are complicated to get right. The whole point of what we >>> did is >>> > to give a way to use GPU >>> > simply through the existing PETSc linear algebra interface. >>> > >>> > Matt >>> > >>> > >>> >> Best, >>> >> Yujie >>> >> >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> > experiments is infinitely more interesting than any results to which >>> their >>> > experiments lead. >>> > -- Norbert Wiener >>> > >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sun Jan 29 13:35:58 2012 From: recrusader at gmail.com (recrusader) Date: Sun, 29 Jan 2012 13:35:58 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: Dear Matt, Without PETSc library calls, the errors, that is "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51):>> error: a value of type "int" cannot be assigned to an entity of type "_ZNSt7complexIdE9_ComplexTE", disappear. Since I meet the errors when compiling PETSc-dev, is it possible to run PETSc example? Thank you very much. Best, Yujie On Sun, Jan 29, 2012 at 1:29 PM, Matthew Knepley wrote: > On Sun, Jan 29, 2012 at 1:24 PM, recrusader wrote: > >> I have thought I send this question to CUSP mailing list. Therefore, I >> wish I could get the same errors using the simple codes. However, the >> errors disappear. >> > > The code below has no PETSc library calls. I really do not understand what > you are asking. > > Can you run a simple PETSc example with the GPU? > > cd src/snes/examples/tutorials > make ex5 > ./ex5 -vec_type cusp -log_summary > > Matt > > >> Is it possible to provide simple codes with PETSc to show the errors? >> Thanks again. >> >> Best, >> Yujie >> >> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >> >>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>> >>>> Thank you very much, Matt, >>>> >>>> You mean the headers of the simple codes, I further simply the codes as >>> >>> >>> This is a question for the CUSP mailing list. 
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> " >>>> #include >>>> #include >>>> >>>> int main(void) >>>> { >>>> cusp::array1d, cusp::host_memory> *x; >>>> >>>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>>> >>>> std::complex alpha(1,2.0); >>>> cusp::blas::scal(*x,alpha); >>>> >>>> return 0; >>>> }" >>>> >>>> I got the same compilation results " >>>> login1$ nvcc gputest.cu -o gputest >>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>> warning: calling a __host__ function from a __host__ __device__ >>>> function is not allowed >>>> detected during: >>>> instantiation of "void >>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> T=std::complex, T2=std::complex]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>> UnaryFunction) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> UnaryFunction, thrust::host_space_tag) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> [with InputIterator=thrust::detail::normal_iterator >>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>> here >>>> instantiation of "void thrust::for_each(InputIterator, >>>> InputIterator, UnaryFunction) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> (367): here >>>> instantiation of "void >>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>>> [with >>>> ForwardIterator=thrust::detail::normal_iterator >>>> *>, ScalarType=std::complex]" >>>> (748): here >>>> instantiation of "void cusp::blas::scal(Array &, >>>> ScalarType) [with Array=cusp::array1d, >>>> cusp::host_memory>, ScalarType=std::complex]" >>>> gputest.cu(25): here >>>> >>>> " >>>> >>>> Thanks a lot. >>>> >>>> Best, >>>> Yujie >>>> >>>> On 1/29/12, Matthew Knepley wrote: >>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>> wrote: >>>> > >>>> >> Dear PETSc developers, >>>> >> >>>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>>> >> complex number. >>>> >> However, when I compiled the codes, I met some errors. I also tried >>>> to >>>> >> use simple codes to realize the same function. However, the errors >>>> >> disappear. 
One example is as follows: >>>> >> >>>> >> for the function "VecScale_SeqCUSP" >>>> >> "#undef __FUNCT__ >>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>> >> { >>>> >> CUSPARRAY *xarray; >>>> >> PetscErrorCode ierr; >>>> >> >>>> >> PetscFunctionBegin; >>>> >> if (alpha == 0.0) { >>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>> >> } else if (alpha != 1.0) { >>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>> >> try { >>>> >> cusp::blas::scal(*xarray,alpha); >>>> >> } catch(char* ex) { >>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>> >> } >>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>> >> } >>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>> >> PetscFunctionReturn(0); >>>> >> } " >>>> >> >>>> >> When I compiled PETSc-dev, I met the following errors: >>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >>>> >> calling a __host__ function from a __host__ __device__ function is >>>> not >>>> >> allowed >>>> >> detected during: >>>> >> instantiation of "void >>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> >> T=std::complex, T2=PetscScalar]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>> >> here >>>> >> instantiation of "void >>>> >> >>>> thrust::detail::device::cuda::for_each_n_closure>>> >> Size, UnaryFunction>::operator()() [with >>>> >> >>>> >> >>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>> >> Size=long, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> here >>>> >> instantiation of "void >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>> >> [with >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>> >> here >>>> >> instantiation of "size_t >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::closure_launcher_base>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>> >> >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>, >>>> >> launch_by_value=true]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>> >> here >>>> >> instantiation of "thrust::pair >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>> >> [with >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>> >> here >>>> >> [ 6 instantiation contexts not shown ] >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction, thrust::device_space_tag) [with >>>> >> >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>>> >> instantiation of "InputIterator 
>>>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> >> [with >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>>> >> instantiation of "void thrust::for_each(InputIterator, >>>> >> InputIterator, UnaryFunction) [with >>>> >> >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> (367): here >>>> >> instantiation of "void >>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>> ScalarType) >>>> >> [with >>>> >> >>>> ForwardIterator=thrust::detail::normal_iterator>, >>>> >> ScalarType=std::complex]" >>>> >> (748): here >>>> >> instantiation of "void cusp::blas::scal(Array &, >>>> >> ScalarType) [with Array=cusp::array1d>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>> >> veccusp.cu(1185): here >>>> >> >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>> >> >>>> >> " >>>> >> However, I further realize simiar codes as >>>> >> " >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> >>>> >> int main(void) >>>> >> { >>>> >> cusp::array1d, cusp::host_memory> *x; >>>> >> >>>> >> x=new cusp::array1d, >>>> cusp::host_memory>(2,0.0); >>>> >> >>>> >> std::complex alpha(1,2.0); >>>> >> cusp::blas::scal(*x,alpha); >>>> >> >>>> >> return 0; >>>> >> } >>>> >> " >>>> >> >>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>>> >> warning information as follows: >>>> >> " >>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>> >> warning: calling a __host__ function from a __host__ __device__ >>>> >> function is not allowed >>>> >> detected during: >>>> >> instantiation of "void >>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> >> T=std::complex, T2=std::complex]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction, thrust::host_space_tag) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> >> [with >>>> InputIterator=thrust::detail::normal_iterator >>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>> >> here >>>> >> instantiation of "void thrust::for_each(InputIterator, >>>> >> InputIterator, UnaryFunction) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> (367): here >>>> >> 
instantiation of "void >>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>> ScalarType) >>>> >> [with >>>> ForwardIterator=thrust::detail::normal_iterator >>>> >> *>, ScalarType=std::complex]" >>>> >> (748): here >>>> >> instantiation of "void cusp::blas::scal(Array &, >>>> >> ScalarType) [with Array=cusp::array1d, >>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>> >> gputest.cu(25): here >>>> >> >>>> >> " >>>> >> There are not errors like >>>> >> >>>> >> >>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>> >> >>>> >> Furthermore, the warning information is also different between >>>> >> PETSc-dev and simple codes. >>>> >> >>>> >> Could you give me some suggestion for this errors? Thank you very >>>> much. >>>> >> >>>> > >>>> > The headers are complicated to get right. The whole point of what we >>>> did is >>>> > to give a way to use GPU >>>> > simply through the existing PETSc linear algebra interface. >>>> > >>>> > Matt >>>> > >>>> > >>>> >> Best, >>>> >> Yujie >>>> >> >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> > experiments is infinitely more interesting than any results to which >>>> their >>>> > experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Jan 29 13:43:42 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 29 Jan 2012 13:43:42 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Sun, Jan 29, 2012 at 1:35 PM, recrusader wrote: > Dear Matt, > > Without PETSc library calls, the errors, that is > "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51):>> > error: a value of type "int" cannot be assigned to an entity of type > "_ZNSt7complexIdE9_ComplexTE", > disappear. > > Since I meet the errors when compiling PETSc-dev, is it possible to run > PETSc example? > 1) If you have an error compiling petsc-dev ALWAYS send make.log to petsc-maint at mcs.anl.gov 2) petsc-dev only works with the latest cusp-dev from the repository Thanks Matt > Thank you very much. > > Best, > Yujie > > > > > On Sun, Jan 29, 2012 at 1:29 PM, Matthew Knepley wrote: > >> On Sun, Jan 29, 2012 at 1:24 PM, recrusader wrote: >> >>> I have thought I send this question to CUSP mailing list. Therefore, I >>> wish I could get the same errors using the simple codes. However, the >>> errors disappear. >>> >> >> The code below has no PETSc library calls. I really do not understand >> what you are asking. >> >> Can you run a simple PETSc example with the GPU? >> >> cd src/snes/examples/tutorials >> make ex5 >> ./ex5 -vec_type cusp -log_summary >> >> Matt >> >> >>> Is it possible to provide simple codes with PETSc to show the errors? >>> Thanks again. 
>>> >>> Best, >>> Yujie >>> >>> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >>> >>>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>>> >>>>> Thank you very much, Matt, >>>>> >>>>> You mean the headers of the simple codes, I further simply the codes as >>>> >>>> >>>> This is a question for the CUSP mailing list. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> " >>>>> #include >>>>> #include >>>>> >>>>> int main(void) >>>>> { >>>>> cusp::array1d, cusp::host_memory> *x; >>>>> >>>>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>>>> >>>>> std::complex alpha(1,2.0); >>>>> cusp::blas::scal(*x,alpha); >>>>> >>>>> return 0; >>>>> }" >>>>> >>>>> I got the same compilation results " >>>>> login1$ nvcc gputest.cu -o gputest >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>> warning: calling a __host__ function from a __host__ __device__ >>>>> function is not allowed >>>>> detected during: >>>>> instantiation of "void >>>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> T=std::complex, T2=std::complex]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> UnaryFunction, thrust::host_space_tag) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>>> [with >>>>> InputIterator=thrust::detail::normal_iterator >>>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>> here >>>>> instantiation of "void thrust::for_each(InputIterator, >>>>> InputIterator, UnaryFunction) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> (367): here >>>>> instantiation of "void >>>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>>>> [with >>>>> ForwardIterator=thrust::detail::normal_iterator >>>>> *>, ScalarType=std::complex]" >>>>> (748): here >>>>> instantiation of "void cusp::blas::scal(Array &, >>>>> ScalarType) [with Array=cusp::array1d, >>>>> cusp::host_memory>, ScalarType=std::complex]" >>>>> gputest.cu(25): here >>>>> >>>>> " >>>>> >>>>> Thanks a lot. >>>>> >>>>> Best, >>>>> Yujie >>>>> >>>>> On 1/29/12, Matthew Knepley wrote: >>>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>>> wrote: >>>>> > >>>>> >> Dear PETSc developers, >>>>> >> >>>>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>>>> >> complex number. >>>>> >> However, when I compiled the codes, I met some errors. I also tried >>>>> to >>>>> >> use simple codes to realize the same function. However, the errors >>>>> >> disappear. 
One example is as follows: >>>>> >> >>>>> >> for the function "VecScale_SeqCUSP" >>>>> >> "#undef __FUNCT__ >>>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>>> >> { >>>>> >> CUSPARRAY *xarray; >>>>> >> PetscErrorCode ierr; >>>>> >> >>>>> >> PetscFunctionBegin; >>>>> >> if (alpha == 0.0) { >>>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>>> >> } else if (alpha != 1.0) { >>>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>> >> try { >>>>> >> cusp::blas::scal(*xarray,alpha); >>>>> >> } catch(char* ex) { >>>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>>> >> } >>>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>> >> } >>>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>>> >> PetscFunctionReturn(0); >>>>> >> } " >>>>> >> >>>>> >> When I compiled PETSc-dev, I met the following errors: >>>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): >>>>> warning: >>>>> >> calling a __host__ function from a __host__ __device__ function is >>>>> not >>>>> >> allowed >>>>> >> detected during: >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> >> T=std::complex, T2=PetscScalar]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>>> >> here >>>>> >> instantiation of "void >>>>> >> >>>>> thrust::detail::device::cuda::for_each_n_closure>>>> >> Size, UnaryFunction>::operator()() [with >>>>> >> >>>>> >> >>>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>>> >> Size=long, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> here >>>>> >> instantiation of "void >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>>> >> [with >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>>> >> here >>>>> >> instantiation of "size_t >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::closure_launcher_base>>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>>> >> >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>, >>>>> >> launch_by_value=true]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>>> >> here >>>>> >> instantiation of "thrust::pair >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>>> >> [with >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>>> >> here >>>>> >> [ 6 instantiation contexts not shown ] >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction, thrust::device_space_tag) [with >>>>> >> >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> 
/opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) >>>>> >> [with >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>> >> InputIterator, UnaryFunction) [with >>>>> >> >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> (367): here >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>> ScalarType) >>>>> >> [with >>>>> >> >>>>> ForwardIterator=thrust::detail::normal_iterator>, >>>>> >> ScalarType=std::complex]" >>>>> >> (748): here >>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>> >> ScalarType) [with Array=cusp::array1d>>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>>> >> veccusp.cu(1185): here >>>>> >> >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>>> >> >>>>> >> " >>>>> >> However, I further realize simiar codes as >>>>> >> " >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> >>>>> >> int main(void) >>>>> >> { >>>>> >> cusp::array1d, cusp::host_memory> *x; >>>>> >> >>>>> >> x=new cusp::array1d, >>>>> cusp::host_memory>(2,0.0); >>>>> >> >>>>> >> std::complex alpha(1,2.0); >>>>> >> cusp::blas::scal(*x,alpha); >>>>> >> >>>>> >> return 0; >>>>> >> } >>>>> >> " >>>>> >> >>>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>>>> >> warning information as follows: >>>>> >> " >>>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>> >> warning: calling a __host__ function from a __host__ __device__ >>>>> >> function is not allowed >>>>> >> detected during: >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> >> T=std::complex, T2=std::complex]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction, thrust::host_space_tag) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) >>>>> >> [with >>>>> InputIterator=thrust::detail::normal_iterator >>>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>> >> here >>>>> >> instantiation of "void 
thrust::for_each(InputIterator, >>>>> >> InputIterator, UnaryFunction) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> (367): here >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>> ScalarType) >>>>> >> [with >>>>> ForwardIterator=thrust::detail::normal_iterator >>>>> >> *>, ScalarType=std::complex]" >>>>> >> (748): here >>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>> >> ScalarType) [with Array=cusp::array1d, >>>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>>> >> gputest.cu(25): here >>>>> >> >>>>> >> " >>>>> >> There are not errors like >>>>> >> >>>>> >> >>>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>>> >> >>>>> >> Furthermore, the warning information is also different between >>>>> >> PETSc-dev and simple codes. >>>>> >> >>>>> >> Could you give me some suggestion for this errors? Thank you very >>>>> much. >>>>> >> >>>>> > >>>>> > The headers are complicated to get right. The whole point of what we >>>>> did is >>>>> > to give a way to use GPU >>>>> > simply through the existing PETSc linear algebra interface. >>>>> > >>>>> > Matt >>>>> > >>>>> > >>>>> >> Best, >>>>> >> Yujie >>>>> >> >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > What most experimenters take for granted before they begin their >>>>> > experiments is infinitely more interesting than any results to which >>>>> their >>>>> > experiments lead. >>>>> > -- Norbert Wiener >>>>> > >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Mon Jan 30 06:47:27 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 30 Jan 2012 12:47:27 +0000 Subject: [petsc-users] (no subject) Message-ID: dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From C.Klaij at marin.nl Mon Jan 30 06:51:19 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 30 Jan 2012 12:51:19 +0000 Subject: [petsc-users] (no subject) Message-ID: >> [0]PETSC ERROR: Unhandled case, must have at least two fields! > > >You can use PCFieldSplitSetIS(). The implementation could check whether a >MatNest is being used and automatically set the splits if it is, but you >would have to call PCFieldSplitSetIS() later when you wanted to assemble >into an AIJ format, so I'm hesitant to pick it up automatically. Just call >the function for now. Then what would be the best way to create the IS in this case? Can it somehow be deduced from the separate blocks? dr. ir. 
Christiaan Klaij
CFD Researcher
Research & Development

E mailto:C.Klaij at marin.nl
T +31 317 49 33 44

MARIN
2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl


From jedbrown at mcs.anl.gov Mon Jan 30 06:54:44 2012
From: jedbrown at mcs.anl.gov (Jed Brown)
Date: Mon, 30 Jan 2012 06:54:44 -0600
Subject: [petsc-users] (no subject)
In-Reply-To:
References:
Message-ID:

On Mon, Jan 30, 2012 at 06:51, Klaij, Christiaan wrote:

> Then what would be the best way to create the IS in this case?
> Can it somehow be deduced from the separate blocks?
>

The ISs define the row space of the blocks inside the global matrix.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
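For reference, a minimal sketch of the PCFieldSplitSetIS() usage suggested in the exchange above. It assumes a 2x2 MatNest built from pre-assembled velocity/pressure blocks A00..A11 and a petsc-dev that provides MatNestGetISs(); the block matrices and the split names "u"/"p" are illustrative assumptions, not code from the thread:

    Mat            subs[4] = {A00, A01, A10, A11};   /* assumed pre-assembled blocks */
    Mat            A;
    IS             rows[2], cols[2];
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = MatCreateNest(PETSC_COMM_WORLD,2,PETSC_NULL,2,PETSC_NULL,subs,&A);CHKERRQ(ierr);
    /* ISs describing where each block's rows/columns sit in the global matrix */
    ierr = MatNestGetISs(A,rows,cols);CHKERRQ(ierr);
    ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
    ierr = PCFieldSplitSetIS(pc,"u",rows[0]);CHKERRQ(ierr);
    ierr = PCFieldSplitSetIS(pc,"p",rows[1]);CHKERRQ(ierr);

The same PCFieldSplitSetIS() calls also work if the matrix is later assembled in AIJ format instead of MatNest, as long as the index sets still describe the rows of each field in the global ordering.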
I want to make sure P_1 and P_2 > are > the same so arises my question. > > Thanks a lot! > Hui > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Tue Jan 31 16:34:56 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Tue, 31 Jan 2012 14:34:56 -0800 Subject: [petsc-users] VecView() comparison: BINARY vs HDF5 Message-ID: Hi all, I was using VecView() to write the data to file (a vector of total size 50*20*10 (3D DMDA)). I compared the times for two cases: PETSc's binary and also HDF5. I get an enormous difference between the times I get for these two cases (this test is done using only one processor) HDF5: 16.2 (sec) Binary: 0.33 (sec). I am using HDF5 VecView() as a magic black box writer to dump the field quantities. And I am not an expert on it but this order of magnitude seems a bit strange to me. Any inputs are appreciated! Best, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 31 16:39:41 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 31 Jan 2012 16:39:41 -0600 Subject: [petsc-users] VecView() comparison: BINARY vs HDF5 In-Reply-To: References: Message-ID: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov> On Jan 31, 2012, at 4:34 PM, Mohamad M. Nasr-Azadani wrote: > Hi all, > > I was using VecView() to write the data to file (a vector of total size 50*20*10 (3D DMDA)). > I compared the times for two cases: PETSc's binary and also HDF5. > I get an enormous difference between the times I get for these two cases (this test is done using only one processor) > > HDF5: 16.2 (sec) > Binary: 0.33 (sec). > > I am using HDF5 VecView() as a magic black box writer to dump the field quantities. And I am not an expert on it but this order of magnitude seems a bit strange to me. I am not surprised at all. Just because HDF5 is a "defacto standard" and "supposedly" a good thing to use doesn't mean it will be faster than something else. I would only use HDF5 when the resulting file needs to be HDF5 for some other software, like visualization. If you are just using the file with PETSc then use PETSc's binary. Barry > > Any inputs are appreciated! > Best, > Mohamad > > > From mmnasr at gmail.com Tue Jan 31 16:48:43 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Tue, 31 Jan 2012 14:48:43 -0800 Subject: [petsc-users] VecView() comparison: BINARY vs HDF5 In-Reply-To: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov> References: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov> Message-ID: Thanks Barry. Well, I guess it all comes from the fact that I want it all. I wanted to have everything done only once. Data that can be also read in a visualization package. Then I guess I have to stick with my old strategy of having everything stored in binary and using a postprocessor to make hdft or vtk format data files for visualization. Cheers, Mohamad On Tue, Jan 31, 2012 at 2:39 PM, Barry Smith wrote: > > On Jan 31, 2012, at 4:34 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi all, > > > > I was using VecView() to write the data to file (a vector of total size > 50*20*10 (3D DMDA)). > > I compared the times for two cases: PETSc's binary and also HDF5. > > I get an enormous difference between the times I get for these two cases > (this test is done using only one processor) > > > > HDF5: 16.2 (sec) > > Binary: 0.33 (sec). > > > > I am using HDF5 VecView() as a magic black box writer to dump the field > quantities. 
From mmnasr at gmail.com Tue Jan 31 16:34:56 2012
From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani)
Date: Tue, 31 Jan 2012 14:34:56 -0800
Subject: [petsc-users] VecView() comparison: BINARY vs HDF5
Message-ID:

Hi all,

I was using VecView() to write the data to file (a vector of total size 50*20*10, from a 3D DMDA). I compared the times for two cases: PETSc's binary and also HDF5. I get an enormous difference between the times for these two cases (this test is done using only one processor):

HDF5: 16.2 (sec)
Binary: 0.33 (sec).

I am using HDF5 VecView() as a magic black box writer to dump the field quantities. And I am not an expert on it, but this order of magnitude seems a bit strange to me.

Any inputs are appreciated!
Best,
Mohamad
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bsmith at mcs.anl.gov Tue Jan 31 16:39:41 2012
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 31 Jan 2012 16:39:41 -0600
Subject: [petsc-users] VecView() comparison: BINARY vs HDF5
In-Reply-To:
References:
Message-ID: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov>

On Jan 31, 2012, at 4:34 PM, Mohamad M. Nasr-Azadani wrote:

> Hi all,
>
> I was using VecView() to write the data to file (a vector of total size 50*20*10, from a 3D DMDA).
> I compared the times for two cases: PETSc's binary and also HDF5.
> I get an enormous difference between the times for these two cases (this test is done using only one processor):
>
> HDF5: 16.2 (sec)
> Binary: 0.33 (sec).
>
> I am using HDF5 VecView() as a magic black box writer to dump the field quantities. And I am not an expert on it, but this order of magnitude seems a bit strange to me.

   I am not surprised at all. Just because HDF5 is a "de facto standard" and "supposedly" a good thing to use doesn't mean it will be faster than something else.

   I would only use HDF5 when the resulting file needs to be HDF5 for some other software, like visualization. If you are just using the file with PETSc then use PETSc's binary.

   Barry

>
> Any inputs are appreciated!
> Best,
> Mohamad
>
>

From mmnasr at gmail.com Tue Jan 31 16:48:43 2012
From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani)
Date: Tue, 31 Jan 2012 14:48:43 -0800
Subject: [petsc-users] VecView() comparison: BINARY vs HDF5
In-Reply-To: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov>
References: <7D0FB4FE-FB3D-4E49-B24A-2407EB85F7BC@mcs.anl.gov>
Message-ID:

Thanks Barry.
Well, I guess it all comes from the fact that I want it all: I wanted to have everything done only once, with data that can also be read in a visualization package. Then I guess I have to stick with my old strategy of having everything stored in binary and using a postprocessor to make HDF5 or VTK format data files for visualization.

Cheers,
Mohamad

On Tue, Jan 31, 2012 at 2:39 PM, Barry Smith wrote:

>
> On Jan 31, 2012, at 4:34 PM, Mohamad M. Nasr-Azadani wrote:
>
> > Hi all,
> >
> > I was using VecView() to write the data to file (a vector of total size 50*20*10, from a 3D DMDA).
> > I compared the times for two cases: PETSc's binary and also HDF5.
> > I get an enormous difference between the times for these two cases (this test is done using only one processor):
> >
> > HDF5: 16.2 (sec)
> > Binary: 0.33 (sec).
> >
> > I am using HDF5 VecView() as a magic black box writer to dump the field quantities. And I am not an expert on it, but this order of magnitude seems a bit strange to me.
>
>    I am not surprised at all. Just because HDF5 is a "de facto standard" and "supposedly" a good thing to use doesn't mean it will be faster than something else.
>
>    I would only use HDF5 when the resulting file needs to be HDF5 for some other software, like visualization. If you are just using the file with PETSc then use PETSc's binary.
>
>    Barry
>
> >
> > Any inputs are appreciated!
> > Best,
> > Mohamad
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
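A minimal sketch of the workflow settled on above: write with PETSc's fast binary viewer during the run, then read the file back in a separate postprocessor that produces the visualization formats. The vector x and the file name "field.bin" are illustrative assumptions.

    /* in the simulation: dump the field with the binary viewer */
    PetscViewer    viewer;
    PetscErrorCode ierr;

    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"field.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
    ierr = VecView(x,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

    /* in the postprocessor: read it back before converting for visualization */
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"field.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
    ierr = VecLoad(x,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

If an HDF5 file is still wanted for the visualization package, the postprocessor can open a viewer with PetscViewerHDF5Open() and call VecView() on the loaded vector, so the slower HDF5 write happens offline rather than inside the simulation.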