From behzad.baghapour at gmail.com Sat Oct 1 00:39:59 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 09:09:59 +0330 Subject: [petsc-users] How to get Matrix from MatGetArray() Message-ID: Dear All, I just be able to get right answer from MatGetArray( A, &a ) only if I define matrix in Dense fromat. How could I get the right Answer from this function when I define SparseAIJ matrix? Thanks a lot, Behzad -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From abraham.zamudio at gmail.com Sat Oct 1 00:52:52 2011 From: abraham.zamudio at gmail.com (Abraham Zamudio) Date: Sat, 1 Oct 2011 00:52:52 -0500 Subject: [petsc-users] Problem with petsc-3.2-p3 and Open64 In-Reply-To: References: Message-ID: My configure : *./configure --with-blas-lapack-dir=/opt/acml-4-4-0-open64-64bit/open64_64 --with-cc=opencc --with-cxx=openCC -with-fc=openf95 --with-mpi=0* *The output of configure : =============================================================================== Configuring PETSc to compile on your system =============================================================================== =============================================================================== It appears you do not have valgrind installed on your system. We HIGHLY recommend you install it from www.valgrind.org Or install valgrind-devel or equivalent using your package manager. Then rerun ./configure =============================================================================== TESTING: alternateConfigureLibrary from PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:65) Compilers: C Compiler: opencc -g Fortran Compiler: openf95 -g Linkers: Static linker: /usr/bin/ar cr X11: Includes: Library: -lX11 BLAS/LAPACK: -Wl,-rpath,/opt/acml-4-4-0-open64-64bit/open64_64/lib -L/opt/acml-4-4-0-open64-64bit/open64_64/lib -lacml pthread: Library: -lpthread PETSc: PETSC_ARCH: arch-linux2-c-debug PETSC_DIR: /usr/local/petsc-3.2-p3_opencc Clanguage: C Scalar type: real Precision: double shared libraries: disabled dynamic loading: disabled Memory alignment: 16 xxx=========================================================================xxx Configure stage complete. Now build PETSc libraries with (cmake build): make PETSC_DIR=/usr/local/petsc-3.2-p3_opencc PETSC_ARCH=arch-linux2-c-debug all or (experimental with python): PETSC_DIR=/usr/local/petsc-3.2-p3_opencc PETSC_ARCH=arch-linux2-c-debug ./config/builder.py xxx=========================================================================xxx * The next step : *make PETSC_DIR=/usr/local/petsc-3.2-p3_opencc PETSC_ARCH=arch-linux2-c-debug all* The problem : *========================================== Building PETSc using CMake with 5 build threads Using PETSC_DIR=/usr/local/petsc-3.2-p3_opencc and PETSC_ARCH=arch-linux2-c-debug ========================================== Scanning dependencies of target petsc [ 1%] Building Fortran object CMakeFiles/petsc.dir/src/sys/mpiuni/f90-mod/mpiunimod.F.o openf95 WARNING: unknown flag: -Jinclude Error copying Fortran module "include/mpi". Tried "include/MPI.mod" and "include/mpi.mod". 
make[5]: *** [CMakeFiles/petsc.dir/src/sys/mpiuni/f90-mod/mpiunimod.F.o.provides.build] Error 1 make[4]: *** [CMakeFiles/petsc.dir/src/sys/mpiuni/f90-mod/mpiunimod.F.o.provides] Error 2 make[3]: *** [CMakeFiles/petsc.dir/all] Error 2 make[2]: *** [all] Error 2 ******************************************************************** Error during compile, check arch-linux2-c-debug/conf/make.log Send it and arch-linux2-c-debug/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** make: [all] Error 1 (ignored)* -- Abraham Zamudio Ch. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sat Oct 1 02:33:13 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 11:03:13 +0330 Subject: [petsc-users] How Efficiently copy matrices Message-ID: Dear all, I want to copy some elements of a calculated matrix (A) into another matrix (B) with DIFFERENT_NONZERO_PATTERN. How could I efficiently do this? Is this should be done after or before matrix assembly of (A)? Then, what is the sparse pattern of the matrix (A) after assembling and how is the correct access into its elements? Thanks, Behzad -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Sat Oct 1 03:04:04 2011 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Sat, 1 Oct 2011 10:04:04 +0200 Subject: [petsc-users] MatMult a MatNest and a Vec In-Reply-To: References: Message-ID: Thank you, Jed! I have more questions: if I do C=A*B with A(or B) of MatNest type and B(or A, resp.) of usual type, what type will C be of? And is there any function not supporting MatNest, for example, does LU decomposition supports MatNest? Does one sometimes need to convert MatNest to the usual type and how to do it? The MatNest is useful to me because my matrix is in the form [K1*K2 K3*K4]. I think to construct it as MatNest is easier than setting up Aij. Best wishes, Hui On Sep 30, 2011, at 7:08 PM, Jed Brown wrote: > On Fri, Sep 30, 2011 at 10:51, Hui Zhang wrote: > Do you know what type of the resulting vector y by MatMult of a MatNest A and a Vec x, i.e. y = Ax? > > MatMult does not change the types of vectors. This function works with any types. > > Note, however, that many Vec operations require all arguments to have the same type, so you can't freely mix native Vec with VecNest. I don't recommend using VecNest unless you have a very specific use case and can explain to me why you should use it. (I do recommend using MatNest, but generally with the MatGetLocalSubMatrix() so you don't depend on it---I always try to avoid depending on implementations.) > > Because after that I want to modify a SubVec of the result y. If it is a VectNest, I think > VecNestGetSubVec should be used but not VecGetSubVector. > > This is why VecNest implements VecGetSubVector (see VecGetSubVector_Nest). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Sat Oct 1 07:51:40 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 07:51:40 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 00:39, behzad baghapour wrote: > I just be able to get right answer from MatGetArray( A, &a ) only if I > define matrix in Dense fromat. > > How could I get the right Answer from this function when I define SparseAIJ > matrix? > Depends what you mean by "right answer". This function accesses private storage, so it's different for each matrix format. You might want to look at MatView() or MatConvert(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 1 07:53:22 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 07:53:22 -0500 Subject: [petsc-users] How Efficiently copy matrices In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 02:33, behzad baghapour wrote: > I want to copy some elements of a calculated matrix (A) into another matrix > (B) with DIFFERENT_NONZERO_PATTERN. > Which "some elements"? > How could I efficiently do this? > Is this should be done after or before matrix assembly of (A)? > You can only copy after assembling A. > Then, what is the sparse pattern of the matrix (A) after assembling and how > is the correct access into its elements? > You probably want MatGetSubMatrix(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Oct 1 07:59:25 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 1 Oct 2011 07:59:25 -0500 Subject: [petsc-users] MatMult a MatNest and a Vec In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 3:04 AM, Hui Zhang wrote: > Thank you, Jed! I have more questions: > > if I do C=A*B with A(or B) of MatNest type and B(or A, resp.) of usual > type, > what type will C be of? > There is an understanding problem here. PETSc does not create C, you do. You determine its type. > And is there any function not supporting MatNest, for example, does LU > decomposition > supports MatNest? > No, you would have to convert it, or pull out the block that you want LU of. > Does one sometimes need to convert MatNest to the usual type and how to do > it? > MatConvert. Matt > The MatNest is useful to me because my matrix is in the form [K1*K2 > K3*K4]. > I think to construct it as MatNest is easier than setting up Aij. > > Best wishes, > Hui > > On Sep 30, 2011, at 7:08 PM, Jed Brown wrote: > > On Fri, Sep 30, 2011 at 10:51, Hui Zhang wrote: > >> Do you know what type of the resulting vector y by MatMult of a MatNest A >> and a Vec x, i.e. y = Ax? >> > > MatMult does not change the types of vectors. This function works with any > types. > > Note, however, that many Vec operations require all arguments to have the > same type, so you can't freely mix native Vec with VecNest. I don't > recommend using VecNest unless you have a very specific use case and can > explain to me why you should use it. (I do recommend using MatNest, but > generally with the MatGetLocalSubMatrix() so you don't depend on it---I > always try to avoid depending on implementations.) > > >> Because after that I want to modify a SubVec of the result y. If it is a >> VectNest, I think >> VecNestGetSubVec should be used but not VecGetSubVector. >> > > This is why VecNest implements VecGetSubVector (see VecGetSubVector_Nest). 
> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 1 08:06:26 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 08:06:26 -0500 Subject: [petsc-users] MatMult a MatNest and a Vec In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 03:04, Hui Zhang wrote: > Thank you, Jed! I have more questions: > > if I do C=A*B with A(or B) of MatNest type and B(or A, resp.) of usual > type, > what type will C be of? > MatMatMult() is not currently supported for any type combination involving MatNest. > And is there any function not supporting MatNest, for example, does LU > decomposition > supports MatNest? > No, and the only practical way to do so would be to convert it to an AIJ format. > Does one sometimes need to convert MatNest to the usual type and how to do > it? > MatConvert() except that case is not written so it won't work yet. I will write it, it's not difficult, but it hasn't been needed yet. > The MatNest is useful to me because my matrix is in the form [K1*K2 > K3*K4]. > I think to construct it as MatNest is easier than setting up Aij. > The difficulty to construct them should be about the same. In particular, you should be able to use identical code to assemble into either format by building [K1, K3] and [K2; K4]. You can create these matrices (set dimensions and types, but no values), call MatGetLocalSubMatrix() to address the "blocks", and call the assembly routines for each block. Then do the MatMatMult. The assembly part will work for any format, the MatMatMult() will currently require AIJ (not Nest), but will work as soon as MatMatMult_Nest() is implemented. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sat Oct 1 10:23:50 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 18:53:50 +0330 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: I meant by MatAssembly, the global indices of matrix elements (related to a dense format) which is changed into sparse one. So what would be right pattern to read the content of the Matrix from MatGetArray if I know that the original matrix is created by AIJ? On Sat, Oct 1, 2011 at 4:21 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 00:39, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I just be able to get right answer from MatGetArray( A, &a ) only if I >> define matrix in Dense fromat. >> >> How could I get the right Answer from this function when I define >> SparseAIJ matrix? >> > > Depends what you mean by "right answer". This function accesses private > storage, so it's different for each matrix format. You might want to look at > MatView() or MatConvert(). > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From behzad.baghapour at gmail.com Sat Oct 1 10:24:56 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 18:54:56 +0330 Subject: [petsc-users] How Efficiently copy matrices In-Reply-To: References: Message-ID: Thanks, I will try MatGetSubMatrix for this matter... On Sat, Oct 1, 2011 at 4:23 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 02:33, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I want to copy some elements of a calculated matrix (A) into another >> matrix (B) with DIFFERENT_NONZERO_PATTERN. >> > > Which "some elements"? > > >> How could I efficiently do this? >> Is this should be done after or before matrix assembly of (A)? >> > > You can only copy after assembling A. > > >> Then, what is the sparse pattern of the matrix (A) after assembling and >> how is the correct access into its elements? >> > > You probably want MatGetSubMatrix(). > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Oct 1 10:26:30 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 1 Oct 2011 10:26:30 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 10:23 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > I meant by MatAssembly, the global indices of matrix elements (related to a > dense format) which is changed into sparse one. So what would be right > pattern to read the content of the Matrix from MatGetArray if I know that > the original matrix is created by AIJ? > MatGetArray() on an AIJ matrix will give you 'A'. What do you need this for? There are usually much better ways to do things, and keep your code data-structure neutral. Matt > > On Sat, Oct 1, 2011 at 4:21 PM, Jed Brown wrote: > >> On Sat, Oct 1, 2011 at 00:39, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> I just be able to get right answer from MatGetArray( A, &a ) only if I >>> define matrix in Dense fromat. >>> >>> How could I get the right Answer from this function when I define >>> SparseAIJ matrix? >>> >> >> Depends what you mean by "right answer". This function accesses private >> storage, so it's different for each matrix format. You might want to look at >> MatView() or MatConvert(). >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 1 10:26:45 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 10:26:45 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 10:23, behzad baghapour wrote: > I meant by MatAssembly, the global indices of matrix elements (related to a > dense format) which is changed into sparse one. I don't understand what you mean. 
It doesn't make sense to convert "dense" indices into sparse ones. > So what would be right pattern to read the content of the Matrix from > MatGetArray if I know that the original matrix is created by AIJ? Use MatGetRow() if you need access to this. (The internal data structures are different in parallel and serial, so accessing them directly is bad.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sat Oct 1 10:44:35 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 19:14:35 +0330 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: OK. All I am concering is: 1- define matrix A in sparse format AIJ. 2- Get the content of A. 3- change the content of A for some elements 4- Restore the changed matrix back into origin. I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the correct content of matrix by using a[i*n+j]. I need to know how to access the elements of an assembled sparse matrix. Am I clear now? Thanks Behzad On Sat, Oct 1, 2011 at 6:56 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 10:23, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I meant by MatAssembly, the global indices of matrix elements (related to >> a dense format) which is changed into sparse one. > > > I don't understand what you mean. It doesn't make sense to convert "dense" > indices into sparse ones. > > >> So what would be right pattern to read the content of the Matrix from >> MatGetArray if I know that the original matrix is created by AIJ? > > > Use MatGetRow() if you need access to this. (The internal data structures > are different in parallel and serial, so accessing them directly is bad.) > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Oct 1 10:56:06 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 1 Oct 2011 10:56:06 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 10:44 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > OK. All I am concering is: > > 1- define matrix A in sparse format AIJ. > 2- Get the content of A. > 3- change the content of A for some elements > 4- Restore the changed matrix back into origin. > > I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the correct > content of matrix by using a[i*n+j]. I need to know how to access the > elements of an assembled sparse matrix. > > Am I clear now? > No, definitely do not do that. The whole point of PETSc matrices is that you do not do this. This is what is wrong with 90% of numerical software. You want *interfaces*, not data structures. That allows you to seamlessly - work in parallel - work on GPUs - optimize for different architectures What you want to do is use MatSetValues() (or MatZeroRows, etc.) to change your entries. Matt > > Thanks > Behzad > > On Sat, Oct 1, 2011 at 6:56 PM, Jed Brown wrote: > >> On Sat, Oct 1, 2011 at 10:23, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> I meant by MatAssembly, the global indices of matrix elements (related to >>> a dense format) which is changed into sparse one. >> >> >> I don't understand what you mean. 
It doesn't make sense to convert "dense" >> indices into sparse ones. >> >> >>> So what would be right pattern to read the content of the Matrix from >>> MatGetArray if I know that the original matrix is created by AIJ? >> >> >> Use MatGetRow() if you need access to this. (The internal data structures >> are different in parallel and serial, so accessing them directly is bad.) >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 1 11:06:53 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 11:06:53 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 10:44, behzad baghapour wrote: > I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the correct > content of matrix by using a[i*n+j]. The moment you ask for such a grotesque thing, you have thrown in the towel and given up any chance of a moderately scalable method (in memory or time). -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sat Oct 1 11:07:44 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sat, 1 Oct 2011 19:37:44 +0330 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: Thanks, then i should change my idea abut matrix evaluation in Petsc. I will try to keep matrix in the same structure during successive iterations. On Sat, Oct 1, 2011 at 7:26 PM, Matthew Knepley wrote: > On Sat, Oct 1, 2011 at 10:44 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> OK. All I am concering is: >> >> 1- define matrix A in sparse format AIJ. >> 2- Get the content of A. >> 3- change the content of A for some elements >> 4- Restore the changed matrix back into origin. >> >> I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the correct >> content of matrix by using a[i*n+j]. I need to know how to access the >> elements of an assembled sparse matrix. >> >> Am I clear now? >> > > No, definitely do not do that. The whole point of PETSc matrices is that > you do not do this. This is what > is wrong with 90% of numerical software. You want *interfaces*, not data > structures. That allows you > to seamlessly > > - work in parallel > - work on GPUs > - optimize for different architectures > > What you want to do is use MatSetValues() (or MatZeroRows, etc.) to change > your entries. > > Matt > > >> >> Thanks >> Behzad >> >> On Sat, Oct 1, 2011 at 6:56 PM, Jed Brown wrote: >> >>> On Sat, Oct 1, 2011 at 10:23, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> I meant by MatAssembly, the global indices of matrix elements (related >>>> to a dense format) which is changed into sparse one. >>> >>> >>> I don't understand what you mean. It doesn't make sense to convert >>> "dense" indices into sparse ones. >>> >>> >>>> So what would be right pattern to read the content of the Matrix from >>>> MatGetArray if I know that the original matrix is created by AIJ? 
>>> >>> >>> Use MatGetRow() if you need access to this. (The internal data structures >>> are different in parallel and serial, so accessing them directly is bad.) >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Oct 1 11:12:02 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 1 Oct 2011 11:12:02 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 11:07 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Thanks, then i should change my idea abut matrix evaluation in Petsc. I > will try to keep matrix in the same structure during successive iterations. > No, that is not the points at all. You should use the interface to alter the matrix as you want. More directly, Give up on a[i[row]+offset[col]] = value; and use instead MatSetValues(A, 1, &row, 1, &col, &value, INSERT_VALUES); I strongly suggest you read http://dl.acm.org/citation.cfm?id=245883 Thanks, Matt > On Sat, Oct 1, 2011 at 7:26 PM, Matthew Knepley wrote: > >> On Sat, Oct 1, 2011 at 10:44 AM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> OK. All I am concering is: >>> >>> 1- define matrix A in sparse format AIJ. >>> 2- Get the content of A. >>> 3- change the content of A for some elements >>> 4- Restore the changed matrix back into origin. >>> >>> I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the >>> correct content of matrix by using a[i*n+j]. I need to know how to access >>> the elements of an assembled sparse matrix. >>> >>> Am I clear now? >>> >> >> No, definitely do not do that. The whole point of PETSc matrices is that >> you do not do this. This is what >> is wrong with 90% of numerical software. You want *interfaces*, not data >> structures. That allows you >> to seamlessly >> >> - work in parallel >> - work on GPUs >> - optimize for different architectures >> >> What you want to do is use MatSetValues() (or MatZeroRows, etc.) to change >> your entries. >> >> Matt >> >> >>> >>> Thanks >>> Behzad >>> >>> On Sat, Oct 1, 2011 at 6:56 PM, Jed Brown wrote: >>> >>>> On Sat, Oct 1, 2011 at 10:23, behzad baghapour < >>>> behzad.baghapour at gmail.com> wrote: >>>> >>>>> I meant by MatAssembly, the global indices of matrix elements (related >>>>> to a dense format) which is changed into sparse one. >>>> >>>> >>>> I don't understand what you mean. It doesn't make sense to convert >>>> "dense" indices into sparse ones. >>>> >>>> >>>>> So what would be right pattern to read the content of the Matrix from >>>>> MatGetArray if I know that the original matrix is created by AIJ? >>>> >>>> >>>> Use MatGetRow() if you need access to this. (The internal data >>>> structures are different in parallel and serial, so accessing them directly >>>> is bad.) 
>>>> >>> >>> >>> >>> -- >>> ================================== >>> Behzad Baghapour >>> Ph.D. Candidate, Mechecanical Engineering >>> University of Tehran, Tehran, Iran >>> https://sites.google.com/site/behzadbaghapour >>> Fax: 0098-21-88020741 >>> ================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Sat Oct 1 12:28:34 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 1 Oct 2011 19:28:34 +0200 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: Jed and Vijay, Yes, thanks, this works. I somehow did not realize I can pass 'this' as the context. However, it has a serious drawback, that all the members the out-of-class function needs to access must be public. I guess I will have to live with this for now, but if you have some ideas to somehow make the function a "friend" of my class, I am all ears... Many thanks, Dominik On Fri, Sep 30, 2011 at 7:33 PM, Vijay S. Mahadevan wrote: > You should be able to do this: > > #undef __FUNCT__ > #define __FUNCT__ "StiffnessMatrixMultiplication" > PetscErrorCode StiffnessMatrixMultiplication(Mat A, Vec x, Vec y) > { > ? ? ? MyClass* ctx ; > ? ? ? // ... > ? ? ? ierr = MatShellGetContext(A, (void**) &ctx);CHKERRQ(ierr); > ? ? ? Mat foo; > ? ? ? foo = ctx->foo; // non-static public member of MyClass that is > of type Mat > ? ? ? // ... > ? ? ? Mat bar; > ? ? ? bar = MyClass::bar; // static member of MyClass that is of type Mat > ? ? ? // ... > ? ? ? Mat foobar; > ? ? ? foobar = MyClass::foobar(); // static member function of > MyClass that returns a Mat > ? ? ? // ... > } > > And similarly for the PCShellApply, you can get the context MyClass > from the PC instance. I don't see what problems you have with this > use-case. > > Look at SampleShellPCApply in > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/ksp/ksp/examples/tutorials/ex15.c.html > for a similar usage. > > On Fri, Sep 30, 2011 at 12:28 PM, Dominik Szczerba wrote: >>> But the method is like a "friend" in the sense that it can access members through an explicit pointer. >> >> Can you please expand a bit on this one? What is an explicit pointer? >> >> Dominik >> > > From jedbrown at mcs.anl.gov Sat Oct 1 12:30:18 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 12:30:18 -0500 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 12:28, Dominik Szczerba wrote: > Yes, thanks, this works. I somehow did not realize I can pass 'this' > as the context. > However, it has a serious drawback, that all the members the > out-of-class function needs to access must be public. I guess I will > have to live with this for now, but if you have some ideas to somehow > make the function a "friend" of my class, I am all ears... 
> As mentioned above, make the callback function a static member of the class. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Sat Oct 1 12:36:23 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 1 Oct 2011 19:36:23 +0200 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: Yes, I was trying to, but always get the previously mentioned error when using PCShellSetApply error: argument of type ?PetscErrorCode (FluidSolver::)(_p_PC*, _p_Vec*, _p_Vec*)? does not match ?PetscErrorCode (*)(_p_PC*, _p_Vec*, _p_Vec*)? Is it still possible somehow? Would be great! Dominik On Sat, Oct 1, 2011 at 7:30 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 12:28, Dominik Szczerba wrote: >> >> Yes, thanks, this works. I somehow did not realize I can pass 'this' >> as the context. >> However, it has a serious drawback, that all the members the >> out-of-class function needs to access must be public. I guess I will >> have to live with this for now, but if you have some ideas to somehow >> make the function a "friend" of my class, I am all ears... > > As mentioned above, make the callback function a static member of the class. From jedbrown at mcs.anl.gov Sat Oct 1 12:53:07 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 12:53:07 -0500 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 12:36, Dominik Szczerba wrote: > Yes, I was trying to, but always get the previously mentioned error > when using PCShellSetApply > > error: argument of type ?PetscErrorCode > (FluidSolver::)(_p_PC*, _p_Vec*, _p_Vec*)? does not match > ?PetscErrorCode (*)(_p_PC*, _p_Vec*, _p_Vec*)? > > Is it still possible somehow? Would be great! > Are you sure the function you are passing in is static? Show us the code snippets. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 1 13:03:20 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 1 Oct 2011 13:03:20 -0500 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: On Sat, Oct 1, 2011 at 12:53, Jed Brown wrote: > Are you sure the function you are passing in is static? Show us the code > snippets. Attached is a simple example of setting callbacks into class methods using standard C function pointers. You can use this approach for any callbacks in PETSc. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: callback.cc Type: text/x-c++src Size: 1021 bytes Desc: not available URL: From dominik at itis.ethz.ch Sat Oct 1 14:22:48 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 1 Oct 2011 21:22:48 +0200 Subject: [petsc-users] question about contexts In-Reply-To: References: Message-ID: You are right it compiles now, I must have screwed something up before, not enough sleep and coffee... Thanks, it finally works the way I want. Regards, Dominik On Sat, Oct 1, 2011 at 7:53 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 12:36, Dominik Szczerba wrote: >> >> Yes, I was trying to, but always get the previously mentioned error >> when using PCShellSetApply >> >> error: argument of type ?PetscErrorCode >> (FluidSolver::)(_p_PC*, _p_Vec*, _p_Vec*)? does not match >> ?PetscErrorCode (*)(_p_PC*, _p_Vec*, _p_Vec*)? >> >> Is it still possible somehow? Would be great! 
> > Are you sure the function you are passing in is static? Show us the code > snippets. From behzad.baghapour at gmail.com Sun Oct 2 01:29:11 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 2 Oct 2011 09:59:11 +0330 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: I know about the efficiency and scalability of modern linear methods and I don't really want to pass memory in a inefficient manner. But here I need to test my designed matrix and know what PETSc brings out in a dense format to check my procedure. Thanks, Behzad On Sat, Oct 1, 2011 at 7:36 PM, Jed Brown wrote: > On Sat, Oct 1, 2011 at 10:44, behzad baghapour > wrote: > >> I test MatGetArray( Mat A, PetscScalar* a ) but I couldnt find the correct >> content of matrix by using a[i*n+j]. > > > The moment you ask for such a grotesque thing, you have thrown in the towel > and given up any chance of a moderately scalable method (in memory or time). > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron.ahmadia at kaust.edu.sa Sun Oct 2 01:33:34 2011 From: aron.ahmadia at kaust.edu.sa (Aron Ahmadia) Date: Sun, 2 Oct 2011 09:33:34 +0300 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: | But here I need to test my designed matrix and know what PETSc brings out in a dense format to check my procedure. This sentence doesn't make any sense. The matrix is the same whether it is stored in a dense or sparse format, you can query individual values of the matrix and find out whether they are nonzero very easily. The only thing that changes from a user perspective is the efficiency of the code. If you insist on interactively playing with a dense version of your matrix, I might suggest that you look at storing the Matrix using the MATLAB viewer or connecting to a MATLAB session. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Sun Oct 2 04:46:09 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sun, 2 Oct 2011 11:46:09 +0200 Subject: [petsc-users] check if object was created Message-ID: Is it OK to do: KSP/Mat/Vec/etc object; if(object) KSP/Mat/Vec/etcDestroy(object) It other words, are default constructors initializing the object to NULL, or is there an other dedicated way to check if an object was actually allocated? I could not easily deduce it from the code... Many thanks, Dominik From jedbrown at mcs.anl.gov Sun Oct 2 07:03:21 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 2 Oct 2011 07:03:21 -0500 Subject: [petsc-users] How to get Matrix from MatGetArray() In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 01:29, behzad baghapour wrote: > I know about the efficiency and scalability of modern linear methods and I > don't really want to pass memory in a inefficient manner. > But here I need to test my designed matrix and know what PETSc brings out > in a dense format to check my procedure. > You can use MatConvert() to create a dense matrix with the same format, but I would usually just call MatView() with a very small problem size. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Sun Oct 2 08:30:38 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 2 Oct 2011 08:30:38 -0500 (CDT) Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: On Sun, 2 Oct 2011, Dominik Szczerba wrote: > Is it OK to do: > > KSP/Mat/Vec/etc object; try KSP/Mat/Vec/etc object=0; Satish > > if(object) > KSP/Mat/Vec/etcDestroy(object) > > It other words, are default constructors initializing the object to > NULL, or is there an other dedicated way to check if an object was > actually allocated? I could not easily deduce it from the code... > > Many thanks, > Dominik > From jedbrown at mcs.anl.gov Sun Oct 2 10:35:41 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 2 Oct 2011 10:35:41 -0500 Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 04:46, Dominik Szczerba wrote: > if(object) > KSP/Mat/Vec/etcDestroy(object) > Just call KSPDestroy(&ksp); It doesn't matter if KSP is NULL or a valid object because this function checks. But note that if you just write KSP ksp; or allocate similarly in part of malloc() --- with no memzero() or similar --- then the value of ksp is undefined. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 2 10:39:57 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 2 Oct 2011 19:09:57 +0330 Subject: [petsc-users] How to use pARMS in Petsc Message-ID: I am searching a multi-level ILU preconditoiner. I saw that Petsc has interface command to pARMS. How I can use these commands with Petsc? Should I download pARMS separately and link it with Petsc? Then how I can do this? Thanks a lot, Behzad -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Oct 2 10:43:08 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 2 Oct 2011 10:43:08 -0500 Subject: [petsc-users] How to use pARMS in Petsc In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 10:39 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > I am searching a multi-level ILU preconditoiner. I saw that Petsc has > interface command to pARMS. How I can use these commands with Petsc? Should > I download pARMS separately and link it with Petsc? Then how I can do this? > --download-parms Matt > Thanks a lot, > Behzad > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sun Oct 2 11:07:26 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 2 Oct 2011 11:07:26 -0500 Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: On Oct 2, 2011, at 10:35 AM, Jed Brown wrote: > On Sun, Oct 2, 2011 at 04:46, Dominik Szczerba wrote: > if(object) > KSP/Mat/Vec/etcDestroy(object) > > Just call KSPDestroy(&ksp); > > It doesn't matter if KSP is NULL or a valid object because this function checks. But note that if you just write > > KSP ksp; > > or allocate similarly in part of malloc() --- with no memzero() or similar --- then the value of ksp is undefined. In other words, if you are sometimes not sure if you will ever create a certain object make sure you initialize it to 0 with Vec vec = 0; ....... VecDestroy(&vec); ..... Note that we pass &vec as the argument and all the destroys set the value to 0 so it cannot be accidently used again. Barry From behzad.baghapour at gmail.com Sun Oct 2 11:12:28 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 2 Oct 2011 19:42:28 +0330 Subject: [petsc-users] How to use pARMS in Petsc In-Reply-To: References: Message-ID: Thanks...it works. Also, please let me know about "valgrind" which is prompt when configuring the Petsc package? On Sun, Oct 2, 2011 at 7:13 PM, Matthew Knepley wrote: > On Sun, Oct 2, 2011 at 10:39 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I am searching a multi-level ILU preconditoiner. I saw that Petsc has >> interface command to pARMS. How I can use these commands with Petsc? Should >> I download pARMS separately and link it with Petsc? Then how I can do this? >> > > --download-parms > > Matt > > >> Thanks a lot, >> Behzad >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Oct 2 11:14:53 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 2 Oct 2011 11:14:53 -0500 Subject: [petsc-users] How to use pARMS in Petsc In-Reply-To: References: Message-ID: http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind On Oct 2, 2011, at 11:12 AM, behzad baghapour wrote: > Thanks...it works. > Also, please let me know about "valgrind" which is prompt when configuring the Petsc package? > > > On Sun, Oct 2, 2011 at 7:13 PM, Matthew Knepley wrote: > On Sun, Oct 2, 2011 at 10:39 AM, behzad baghapour wrote: > I am searching a multi-level ILU preconditoiner. I saw that Petsc has interface command to pARMS. How I can use these commands with Petsc? Should I download pARMS separately and link it with Petsc? Then how I can do this? > > --download-parms > > Matt > > Thanks a lot, > Behzad > > -- > ================================== > Behzad Baghapour > Ph.D. 
Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > From zhenglun.wei at gmail.com Sun Oct 2 12:04:33 2011 From: zhenglun.wei at gmail.com (Alan Wei) Date: Sun, 2 Oct 2011 12:04:33 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice Message-ID: Dear all, I hope you're having a nice weekend. I'm still using this program to solve Poisson Equations. After several discussions before, I know how to use it in general. However, I met a problem when I want to use it twice to solve a same Poisson Equations with two different RHS.(same boundary conditions) 1) after I've done my first computation of a Poisson equation, I use DMMGSetKSP() again setting up another (*rhs) . However, I found that this DMMGSetKSP does not call the function that makes the new (*rhs), i.e. ComputeNewRHS(). 2) In the first tempt to solve the Poisson equation with old RHS, DMMGSetKSP called the function making the old (*rhs). If I comment out the first DMMGSetKSP(), which creates the old (*rhs), then this DMMGSetKSP, which create the new (*rhs), can call the function making the new (*rhs) 2) I read the manual of "DMMGSetKSP" saying *For linear problems may be called more than once, reevaluates the matrices if it is called more than once. Call DMMGSolve() directly several times to solve with the same matrix but different right hand sides.* Does this mean that I do not need to call DMMGSetKSP if I need to solve the same Poisson equation with another RHS? If so, how can I pass my new RHS to dmmg? thanks in advance, Alan -------------- next part -------------- An HTML attachment was scrubbed... URL: From domel07 at gmail.com Sun Oct 2 12:04:09 2011 From: domel07 at gmail.com (Dominik Szczerba) Date: Sun, 2 Oct 2011 19:04:09 +0200 Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: Ok these answer my question. Many thanks! On Oct 2, 2011 6:07 PM, "Barry Smith" wrote: > > On Oct 2, 2011, at 10:35 AM, Jed Brown wrote: > >> On Sun, Oct 2, 2011 at 04:46, Dominik Szczerba wrote: >> if(object) >> KSP/Mat/Vec/etcDestroy(object) >> >> Just call KSPDestroy(&ksp); >> >> It doesn't matter if KSP is NULL or a valid object because this function checks. But note that if you just write >> >> KSP ksp; >> >> or allocate similarly in part of malloc() --- with no memzero() or similar --- then the value of ksp is undefined. > > > In other words, if you are sometimes not sure if you will ever create a certain object make sure you initialize it to 0 with > > Vec vec = 0; > > ....... > > > VecDestroy(&vec); > > ..... > > > Note that we pass &vec as the argument and all the destroys set the value to 0 so it cannot be accidently used again. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dominik at itis.ethz.ch Sun Oct 2 14:01:15 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sun, 2 Oct 2011 21:01:15 +0200 Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: One little clarification: > Vec vec = 0; > VecCreate(..., &vec); > VecDestroy(&vec); This causes a compilation error for (only) VecDestroy, the only way to avoid is to say: VecDestroy(vec); This confuses me, be cause the docu says like you that a pointer is expected. I see that e.g. Mat expands to _p_Mat* but that is not the same... ??? Dominik From jedbrown at mcs.anl.gov Sun Oct 2 14:03:28 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 2 Oct 2011 14:03:28 -0500 Subject: [petsc-users] check if object was created In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 14:01, Dominik Szczerba wrote: > One little clarification: > > > Vec vec = 0; > > VecCreate(..., &vec); > > VecDestroy(&vec); > > This causes a compilation error for (only) VecDestroy, the only way to > avoid is to say: > > VecDestroy(vec); > Please upgrade to petsc-3.2. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Oct 2 15:56:16 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 2 Oct 2011 15:56:16 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: On Oct 2, 2011, at 12:04 PM, Alan Wei wrote: > Dear all, > I hope you're having a nice weekend. I'm still using this program to solve Poisson Equations. After several discussions before, I know how to use it in general. However, I met a problem when I want to use it twice to solve a same Poisson Equations with two different RHS.(same boundary conditions) > 1) after I've done my first computation of a Poisson equation, I use DMMGSetKSP() again setting up another (*rhs) . > However, I found that this DMMGSetKSP does not call the function that makes the new (*rhs), i.e. ComputeNewRHS(). > 2) In the first tempt to solve the Poisson equation with old RHS, DMMGSetKSP called the function making the old (*rhs). If I comment out the first DMMGSetKSP(), which creates the old (*rhs), then this DMMGSetKSP, which create the new (*rhs), can call the function making the new (*rhs) > 2) I read the manual of "DMMGSetKSP" saying For linear problems may be called more than once, reevaluates the matrices if it is called more than once. Call DMMGSolve() directly several times to solve with the same matrix but different right hand sides. Does this mean that I do not need to call DMMGSetKSP if I need to solve the same Poisson equation with another RHS? If so, how can I pass my new RHS to dmmg? With the DMMG interface you provide a __function__ to compute the right hand side, if you want to solve with several right hand sides you just have the rhs function compute a different right hand side each time it is called corresponding to the right hand sides you want to compute By the way the DMMG is being deprecated and will not be in the next PETSc release so we recommend moving away from it and just using the KSP linear solver interface (see the manual page for KSPSetDM() and several examples that us KSPSetDM() in the src/ksp/ksp/examples/tutorials directory. 
Barry > > thanks in advance, > Alan > From zhenglun.wei at gmail.com Sun Oct 2 16:31:41 2011 From: zhenglun.wei at gmail.com (Alan Wei) Date: Sun, 2 Oct 2011 16:31:41 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: Thanks Dr. Smith, I will work on DM later after I know how to do this. BTW, what is the DMMG interface you are talking about? is it DMMGSetKSP()? if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in the main function, it does not call the __function__ to compute the right hand side. >_< any suggestions on that? best, Alan On Sun, Oct 2, 2011 at 3:56 PM, Barry Smith wrote: > > On Oct 2, 2011, at 12:04 PM, Alan Wei wrote: > > > Dear all, > > I hope you're having a nice weekend. I'm still using this program to > solve Poisson Equations. After several discussions before, I know how to use > it in general. However, I met a problem when I want to use it twice to solve > a same Poisson Equations with two different RHS.(same boundary conditions) > > 1) after I've done my first computation of a Poisson equation, I use > DMMGSetKSP() again setting up another (*rhs) . > > However, I found that this DMMGSetKSP does not call the function that > makes the new (*rhs), i.e. ComputeNewRHS(). > > 2) In the first tempt to solve the Poisson equation with old RHS, > DMMGSetKSP called the function making the old (*rhs). If I comment out the > first DMMGSetKSP(), which creates the old (*rhs), then this DMMGSetKSP, > which create the new (*rhs), can call the function making the new (*rhs) > > 2) I read the manual of "DMMGSetKSP" saying For linear problems may be > called more than once, reevaluates the matrices if it is called more than > once. Call DMMGSolve() directly several times to solve with the same matrix > but different right hand sides. Does this mean that I do not need to call > DMMGSetKSP if I need to solve the same Poisson equation with another RHS? If > so, how can I pass my new RHS to dmmg? > > With the DMMG interface you provide a __function__ to compute the right > hand side, if you want to solve with several right hand sides you just have > the rhs function compute a different right hand side each time it is called > corresponding to the right hand sides you want to compute > > By the way the DMMG is being deprecated and will not be in the next > PETSc release so we recommend moving away from it and just using the KSP > linear solver interface (see the manual page for KSPSetDM() and several > examples that us KSPSetDM() in the src/ksp/ksp/examples/tutorials directory. > > Barry > > > > > thanks in advance, > > Alan > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sun Oct 2 17:22:53 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 2 Oct 2011 17:22:53 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 16:31, Alan Wei wrote: > BTW, what is the DMMG interface you are talking about? is it DMMGSetKSP()? Every object and function containing "DMMG" will be removed. (I don't think we've decided on a precise timeline, but don't write new code that uses DMMG.) > if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in the > main function, it does not call the __function__ to compute the right hand > side. >_< any suggestions on that? What function is being called? 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From zhenglun.wei at gmail.com Sun Oct 2 17:52:11 2011 From: zhenglun.wei at gmail.com (Alan Wei) Date: Sun, 2 Oct 2011 17:52:11 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: In this example, i recalled the ComputeRHS. Alan On Sun, Oct 2, 2011 at 5:22 PM, Jed Brown wrote: > On Sun, Oct 2, 2011 at 16:31, Alan Wei wrote: > >> BTW, what is the DMMG interface you are talking about? is it DMMGSetKSP()? > > > Every object and function containing "DMMG" will be removed. (I don't think > we've decided on a precise timeline, but don't write new code that uses > DMMG.) > > >> if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in the >> main function, it does not call the __function__ to compute the right hand >> side. >_< any suggestions on that? > > > What function is being called? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Sun Oct 2 23:13:23 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Sun, 2 Oct 2011 23:13:23 -0500 Subject: [petsc-users] direct access of seqcusp Vec In-Reply-To: References: Message-ID: On Thu, Sep 29, 2011 at 3:22 PM, Barry Smith wrote: > > On Sep 29, 2011, at 2:56 PM, Shiyuan wrote: > > > What about a direct access to seqaijcusp Mat? Is there a function call > which return the pointed to the cusp::csr_matrix ? > > No. But look at the source code to MatMult_SeqAIJCusp and you will see > how it may be accessed. > > In MatMult_SeqAIJCUSP, if the mat is CSR, the result first is stored in tmpvec, and then is permuted before stored to the final destination yy. What's the reason for doing that? Does petsc also support other format than csr? The part of the code is attached: if (usecprow){ /* use compressed row format */ try { cusp::multiply(*cuspstruct->mat,*xarray,*cuspstruct->tempvec); ierr = VecSet_SeqCUSP(yy,0.0);CHKERRQ(ierr); thrust::copy(cuspstruct->tempvec->begin(),cuspstruct->tempvec->end(),thrust::make_permutation_iterator(yarray->begin(),cuspstruct->indices->begin())); } catch (char* ex) { SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); } } else { /* do not use compressed row format */ try { cusp::multiply(*cuspstruct->mat,*xarray,*yarray); } catch(char* ex) { SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); } } -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Mon Oct 3 03:55:59 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 3 Oct 2011 12:25:59 +0330 Subject: [petsc-users] about ILUT preconditioner Message-ID: Is it necessary to download and link SPARSEKIT with Petsc to use ILUT (with threshold) as a preconditioner? If so, is it correct to configure Petsc with --download-sparsekit in this matter? -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Mon Oct 3 04:47:09 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 3 Oct 2011 17:47:09 +0800 Subject: [petsc-users] DGMRES Right preconditioner Message-ID: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda> Hi, I had installed petsc-3.2, which has a new deflated GMRES solver. 
It seems the deflation procedure provides additional preconditioning on top of GMRES. However, it seems the deflation was implemented as the first right preconditioner:

if (ksp->pc_side == PC_LEFT) {
  /* Apply the first preconditioner */
  ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP);
  CHKERRQ (ierr);
  /* Then apply Deflation as a preconditioner */
  ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it));
  CHKERRQ (ierr);
} else if (ksp->pc_side == PC_RIGHT) {
  ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP);
  CHKERRQ (ierr);
  ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP);
  CHKERRQ (ierr);
}

Since a "fixed" external preconditioner M such as ILU or my own PC will be used, I want the deflation applied to AM^-1. As a result, the update of the deflation preconditioner will not affect the fixed preconditioner M. How should I modify the source code? Gong Ding From bsmith at mcs.anl.gov Mon Oct 3 07:54:09 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Oct 2011 07:54:09 -0500 Subject: [petsc-users] about ILUT preconditioner In-Reply-To: References: Message-ID: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> On Oct 3, 2011, at 3:55 AM, behzad baghapour wrote: > Is it necessary to download and link SPARSEKIT with Petsc to use ILUT (with threshold) as a preconditioner? If so, is it correct to configure Petsc with > --download-sparsekit in this matter? > We removed the interface to sparsekit a long time ago, because it wasn't very good. If you want to use ILUT you can use --download-hypre then run with -pc_type boomeramg -pc_hypre_type pilut See http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/PC/PCHYPRE.html for more details and run the PETSc program with the above options and -help to see the various pilut options. Barry > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > From jedbrown at mcs.anl.gov Mon Oct 3 08:02:05 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 3 Oct 2011 08:02:05 -0500 Subject: [petsc-users] about ILUT preconditioner In-Reply-To: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> References: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> Message-ID: On Mon, Oct 3, 2011 at 07:54, Barry Smith wrote: > -pc_type boomeramg -pc_hypre_type pilut Barry meant to type "-pc_type hypre -pc_hypre_type pilut". The Hypre developers recommend using Euclid for parallel ILU instead: "-pc_type hypre -pc_hypre_type euclid". -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 3 08:03:45 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Oct 2011 08:03:45 -0500 Subject: [petsc-users] DGMRES Right preconditioner In-Reply-To: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda> References: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda> Message-ID: <45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> Desire, Any suggestions? Does this make sense? Thanks Barry On Oct 3, 2011, at 4:47 AM, Gong Ding wrote: > Hi, > I have installed petsc-3.2, which has a new deflated GMRES solver. > It seems the deflation procedure provides additional preconditioning on top of GMRES.
> > However, it seems the deflation was implemented as the first right preconditioner: > > if (ksp->pc_side == PC_LEFT) { > /* Apply the first preconditioner */ > ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP); > CHKERRQ (ierr); > /* Then apply Deflation as a preconditioner */ > ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it)); > CHKERRQ (ierr); > } else if (ksp->pc_side == PC_RIGHT) { > ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP); > CHKERRQ (ierr); > ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP); > CHKERRQ (ierr); > } > > Since a "fixed" external preconditioner M such as ILU or my own PC will be used, > I want the deflation applied to AM^-1. As a result, the update of the deflation preconditioner will not > affect the fixed preconditioner M. How should I modify the source code? > > > Gong Ding > > > > > > > From bsmith at mcs.anl.gov Mon Oct 3 08:10:22 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Oct 2011 08:10:22 -0500 Subject: [petsc-users] direct access of seqcusp Vec In-Reply-To: References: Message-ID: <88EFCC8F-DE09-40E8-A854-69C323C7241F@mcs.anl.gov> On Oct 2, 2011, at 11:13 PM, Shiyuan wrote: > > > On Thu, Sep 29, 2011 at 3:22 PM, Barry Smith wrote: > > On Sep 29, 2011, at 2:56 PM, Shiyuan wrote: > > > What about direct access to a seqaijcusp Mat? Is there a function call which returns the pointer to the cusp::csr_matrix? > > No. But look at the source code to MatMult_SeqAIJCusp and you will see how it may be accessed. > > > In MatMult_SeqAIJCUSP, if the mat is CSR, the result is first stored in tempvec and then permuted before being stored to the final destination yy. What is the reason for doing that? You ignored the if (usecprow) flag. This flag is set when many of the matrix rows are identically zero; in that case tempvec computes the resulting vector entries for the nonzero rows and the "permute" sticks the result back into the larger vector. Normally just the second case of the if statement is used and it is just a cusp multiply. > Does PETSc also support formats other than CSR? If it did it would be in the code somewhere; we don't have secret code you cannot see. Some user tried another format and got good performance but hasn't given us the code for that. Barry > The part of the code is attached: > > > if (usecprow){ /* use compressed row format */ > try { > cusp::multiply(*cuspstruct->mat,*xarray,*cuspstruct->tempvec); > ierr = VecSet_SeqCUSP(yy,0.0);CHKERRQ(ierr); > thrust::copy(cuspstruct->tempvec->begin(),cuspstruct->tempvec->end(),thrust::make_permutation_iterator(yarray->begin(),cuspstruct->indices->begin())); > } catch (char* ex) { > SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); > } > } else { /* do not use compressed row format */ > try { > cusp::multiply(*cuspstruct->mat,*xarray,*yarray); > } catch(char* ex) { > SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); > } > } > From behzad.baghapour at gmail.com Mon Oct 3 08:25:04 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 3 Oct 2011 16:55:04 +0330 Subject: [petsc-users] about ILUT preconditioner In-Reply-To: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> References: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> Message-ID: Many thanks. I have a basic (dummy) question here... Does downloading external packages like blas, lapack, and hypre with the Petsc command make them usable for other codes as linked libraries, or is it just for use of Petsc?
On Mon, Oct 3, 2011 at 4:24 PM, Barry Smith wrote: > > On Oct 3, 2011, at 3:55 AM, behzad baghapour wrote: > > > Is it necessary to download and link SPARSEKIT with Petsc to use ILUT > (with threshold) as a preconditioner? If so, is it correct to configure > Petsc with > > --download-sparsekit in this matter? > > > > We removed the interface to sparsekit a long time ago, because it > wasn't very good. > > If you want to use ILUT you can use --download-hypre then run with > -pc_type boomeramg -pc_hypre_type pilut > > See > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/PC/PCHYPRE.htmlfor more details and run the PETSc program with the above options and -help > to see the various pilut options. > > Barry > > > > -- > > ================================== > > Behzad Baghapour > > Ph.D. Candidate, Mechecanical Engineering > > University of Tehran, Tehran, Iran > > https://sites.google.com/site/behzadbaghapour > > Fax: 0098-21-88020741 > > ================================== > > > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 3 08:26:31 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Oct 2011 08:26:31 -0500 Subject: [petsc-users] about ILUT preconditioner In-Reply-To: References: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> Message-ID: On Mon, Oct 3, 2011 at 8:25 AM, behzad baghapour wrote: > Many thanks. > > I have a basic (dummy) question here...Does downloading external packages > like blas,lapack, hypre with Petsc command make them usable for other codes > as linked libraries or is just for use of Petsc? > They are installed into $PETSC_DIR/$PETSC_ARCH so you could use them from there. Matt > On Mon, Oct 3, 2011 at 4:24 PM, Barry Smith wrote: > >> >> On Oct 3, 2011, at 3:55 AM, behzad baghapour wrote: >> >> > Is it necessary to download and link SPARSEKIT with Petsc to use ILUT >> (with threshold) as a preconditioner? If so, is it correct to configure >> Petsc with >> > --download-sparsekit in this matter? >> > >> >> We removed the interface to sparsekit a long time ago, because it >> wasn't very good. >> >> If you want to use ILUT you can use --download-hypre then run with >> -pc_type boomeramg -pc_hypre_type pilut >> >> See >> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/PC/PCHYPRE.htmlfor more details and run the PETSc program with the above options and -help >> to see the various pilut options. >> >> Barry >> >> >> > -- >> > ================================== >> > Behzad Baghapour >> > Ph.D. Candidate, Mechecanical Engineering >> > University of Tehran, Tehran, Iran >> > https://sites.google.com/site/behzadbaghapour >> > Fax: 0098-21-88020741 >> > ================================== >> > >> >> > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Mon Oct 3 08:26:57 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 3 Oct 2011 08:26:57 -0500 Subject: [petsc-users] about ILUT preconditioner In-Reply-To: References: <9B4C556E-F870-460F-A38E-6BA82CA57081@mcs.anl.gov> Message-ID: On Mon, Oct 3, 2011 at 08:25, behzad baghapour wrote: > I have a basic (dummy) question here...Does downloading external packages > like blas,lapack, hypre with Petsc command make them usable for other codes > as linked libraries or is just for use of Petsc? Yes, they are installed to $PETSC_DIR/$PETSC_ARCH/{include,lib}. -------------- next part -------------- An HTML attachment was scrubbed... URL: From desire.nuentsa_wakam at inria.fr Mon Oct 3 08:33:32 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Mon, 03 Oct 2011 15:33:32 +0200 Subject: [petsc-users] DGMRES Right preconditioner In-Reply-To: <45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> References: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda> <45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> Message-ID: <4E89B9AC.4090309@inria.fr> Hi Gong, You do not need to modify the source code for the right preconditioning. The additional preconditioner (say M_D^{-1}) is built from the approximate eigenvalues of M^{-1}A or AM^{-1}. Hence you are solving either M_D^{-1}M^{-1}Ax = M_D^{-1}M^{-1}b or AM^{-1}M_D^{-1}y = b with x = M^{-1}M_D^{-1}y Thanks Desire On 10/03/2011 03:03 PM, Barry Smith wrote: > Desire, > > Any suggestions? Does this make sense? > > Thanks > > Barry > > On Oct 3, 2011, at 4:47 AM, Gong Ding wrote: > > >> Hi, >> I have installed petsc-3.2, which has a new deflated GMRES solver. >> It seems the deflation procedure provides additional preconditioning on top of GMRES. >> >> However, it seems the deflation was implemented as the first right preconditioner: >> >> if (ksp->pc_side == PC_LEFT) { >> /* Apply the first preconditioner */ >> ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP); >> CHKERRQ (ierr); >> /* Then apply Deflation as a preconditioner */ >> ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it)); >> CHKERRQ (ierr); >> } else if (ksp->pc_side == PC_RIGHT) { >> ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP); >> CHKERRQ (ierr); >> ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP); >> CHKERRQ (ierr); >> } >> >> Since a "fixed" external preconditioner M such as ILU or my own PC will be used, >> I want the deflation applied to AM^-1. As a result, the update of the deflation preconditioner will not >> affect the fixed preconditioner M. How should I modify the source code? >> >> >> Gong Ding >> >> >> >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sistek at math.cas.cz Mon Oct 3 10:13:20 2011 From: sistek at math.cas.cz (Jakub Sistek) Date: Mon, 03 Oct 2011 17:13:20 +0200 Subject: [petsc-users] [SPAM] user experience with PCNN Message-ID: <4E89D110.6050609@math.cas.cz> Dear PETSc developers and users, we are using PETSc to solve systems arising from mixed-hybrid FEM applied to Darcy flow. First, the saddle point system is reduced to the Schur complement problem for Lagrange multipliers on element interfaces, which is then symmetric positive definite. Currently, we are using the PCASM preconditioner to solve it. We have very positive experience (also from other projects) with PCASM, but we have observed some worsening of convergence and scalability with going to larger number of processors (up to 64) here.
As far as we understand, the increasing number of iterations may be caused by the lack of coarse correction in the implementation of the preconditioner. On the other hand, PCNN should contain such a coarse solve. I have modified our FEM code to support MATIS matrices besides MPIAIJ, but so far have a mixed experience with PCNN. It seems to work on 2 CPUs, but complains about singular local problems (solved by MUMPS) on more. After some time spent debugging (though there are probably still many bugs left in my code) and unsuccessful playing with some of the related options (-pc_is_damp_fixed, -pc_is_set_damping_factor_floating, etc.) I have decided to ask a couple of questions before I will continue in further investigation why PCNN does not work for me for general case: 1) Am I right that PCNN is the only domain decomposition method exploiting coarse correction readily available in PETSc? 2) As PCNN seems much less documented (I have found no example or so) than other preconditioners, I would simply like to know if someone else uses it and has positive experience with this implementation? 3) What may be proper options for stabilizing solutions of the local problems? 4) Are there limitations to the method with respect to nullspace type of subdomain problems, i.e. equation? 5) Do these answers depend on version of PETSc? (I have played with 3.0, would things be different with 3.2?) In the long run, I would like to connect the FEM code to our own solver based on BDDC domain decomposition, for which the MATIS matrix seems as a natural format and connection should be straightforward. However, I would like to make it work with PCNN as well. Thank you very much for your help and suggestions. Best regards, Jakub -- Jakub Sistek, Ph.D. postdoctoral researcher Institute of Mathematics of the AS CR http://www.math.cas.cz/~sistek/ tel: (+420) 222 090 710 From jedbrown at mcs.anl.gov Mon Oct 3 10:44:12 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 3 Oct 2011 10:44:12 -0500 Subject: [petsc-users] [SPAM] user experience with PCNN In-Reply-To: <4E89D110.6050609@math.cas.cz> References: <4E89D110.6050609@math.cas.cz> Message-ID: On Mon, Oct 3, 2011 at 10:13, Jakub Sistek wrote: > Dear PETSc developers and users, > > we are using PETSc to solve systems arising from mixed-hybrid FEM applied > to Darcy flow. First, the saddle point system is reduced to the Schur > complement problem for Lagrange multipliers on element interfaces, which is > then symmetric positive definite. Currently, we are using the PCASM > preconditioner to solve it. We have very positive experience (also from > other projects) with PCASM, but we have observed some worsening of > convergence and scalability with going to larger number of processors (up to > 64) here. As far as we understand, the increasing number of iterations > may be caused by the lack of coarse correction in the > implementation of the preconditioner. On the other hand, PCNN > should contain such a coarse solve. I have modified our FEM code > to support MATIS matrices besides MPIAIJ, but so far have a mixed > experience with PCNN. It seems to work on 2 CPUs, but complains > about singular local problems (solved by MUMPS) on more. After > some time spent debugging (though there are probably still > many bugs left in my code) and unsuccessful playing with some of > the related options (-pc_is_damp_fixed, > -pc_is_set_damping_factor_floating, etc.)
I have > decided to ask a couple of questions before I will continue in further > investigation why PCNN does not work for me for general case: > > 1) Am I right that PCNN is the only domain decomposition method exploiting > coarse correction readily available in PETSc? > Multilevel Schwarz methods can be used through PCMG. A two-level example, based on the so-called wirebasket coarse spaces, is PCEXOTIC. This code was basically an experiment that only works for scalar problems on structured grids (using DMDA). It could certainly be generalized. The new PCGAMG is not a DD method per se, but it is developing a lot of hooks with many ideas that overlap with modern DD methods. > 2) As PCNN seems much less documented (I have found no example or so) than > other preconditioners, I would simply like to know if someone else uses it > and has positive experience with this implementation? > Unfortunately, PCNN was more of a research project than robust or widely used code. The coarse space used by the current code is only suitable for scalar problems. The method uses coarse spaces that are more awkward to generalize than the more recently developed balancing and dual methods like BDDC/FETI-DP. > 3) What may be proper options for stabilizing solutions of the local > problems? > 4) Are there limitations to the method with respect to nullspace type of > subdomain problems, i.e. equation? > 5) Do these answers depend on version of PETSc? (I have played with 3.0, > would things be different with 3.2?) > PCNN specifically is no different in 3.2, but there are many other improvements, so I recommend upgrading. One notable improvement is that now you can use MatGetLocalSubMatrix() as part of assembly, such that exactly the same code can assemble into MATIS, MATAIJ, MATNEST, etc. > > In the long run, I would like to connect the FEM code to our own solver > based on BDDC domain decomposition, for which the MATIS matrix seems as a > natural format and connection should be straightforward. However, I would > like to make it work with PCNN as well. > I have been meaning to add BDDC to PETSc for a long time, but it hasn't worked its way up to top priority yet. I have read some of your papers on the subject and would be happy to work with you on a native PETSc implementation. (My interest is largely on use with mixed discretizations for incompressible problems, so it's somewhat more delicate than non-mixed elasticity.) I would not worry about PCNN. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhenglun.wei at gmail.com Mon Oct 3 12:00:56 2011 From: zhenglun.wei at gmail.com (Alan Wei) Date: Mon, 3 Oct 2011 12:00:56 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: Dear all, I still have the problem. >_< I cannot find the DMMG interface provided to change the RHS, mentioned by Dr. Smith. The only thing I have is DMMGSetKSP(). It is DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix). When I use it the first time in this example, it calls both ComputeRHS and ComputeMatrix (I inserted some print-out probes inside these two functions). However, if I use it again, it just does not call the new one; rather, it calls the old one. My code is attached. Could you please help me to fix this problem ^_^. best, Alan On Sun, Oct 2, 2011 at 5:52 PM, Alan Wei wrote: > In this example, I recalled the ComputeRHS.
> > Alan > > > On Sun, Oct 2, 2011 at 5:22 PM, Jed Brown wrote: > >> On Sun, Oct 2, 2011 at 16:31, Alan Wei wrote: >> >>> BTW, what is the DMMG interface you are talking about? is it >>> DMMGSetKSP()? >> >> >> Every object and function containing "DMMG" will be removed. (I don't >> think we've decided on a precise timeline, but don't write new code that >> uses DMMG.) >> >> >>> if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in the >>> main function, it does not call the __function__ to compute the right hand >>> side. >_< any suggestions on that? >> >> >> What function is being called? >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Ex29.zip Type: application/zip Size: 12206 bytes Desc: not available URL: From knepley at gmail.com Mon Oct 3 12:06:20 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Oct 2011 12:06:20 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: On Mon, Oct 3, 2011 at 12:00 PM, Alan Wei wrote: > Dear all, > I still have the problem. >_< > I can not find the DMMG interface provided to change the RHS, mentioned > by Dr. Smith. The only thing I have is DMMGSetKSP(). It is > DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix).When I use it first time in this > example, it calls both ComputeRHS and ComputeMatrix (I assert some print-out > probes inside these two functions). However, If I use it again, it just does > not call the new one; rather, it calls the old one. > My code is attached. Could you please help me to fix this problem ^_^. > Yes, this is the way DMMG works. It can only be setup once. This is a severe limitation, which is why we have discontinued its use. If you look at ex50.c, we show how to transform ex19 into a code using SetDM instead of DMMG. You can then reset the DMDA rhs function at each iteration. Thanks, Matt > best, > Alan > > On Sun, Oct 2, 2011 at 5:52 PM, Alan Wei wrote: > >> In this example, i recalled the ComputeRHS. >> >> Alan >> >> >> On Sun, Oct 2, 2011 at 5:22 PM, Jed Brown wrote: >> >>> On Sun, Oct 2, 2011 at 16:31, Alan Wei wrote: >>> >>>> BTW, what is the DMMG interface you are talking about? is it >>>> DMMGSetKSP()? >>> >>> >>> Every object and function containing "DMMG" will be removed. (I don't >>> think we've decided on a precise timeline, but don't write new code that >>> uses DMMG.) >>> >>> >>>> if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in >>>> the main function, it does not call the __function__ to compute the right >>>> hand side. >_< any suggestions on that? >>> >>> >>> What function is being called? >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhenglun.wei at gmail.com Mon Oct 3 12:12:57 2011 From: zhenglun.wei at gmail.com (Alan Wei) Date: Mon, 3 Oct 2011 12:12:57 -0500 Subject: [petsc-users] Using /src/ksp/ksp/example/tutorial/ex29.c to solve Poisson Equation Twice In-Reply-To: References: Message-ID: That's great, Matt. Thanks to point out the example for me. best, Alan On Mon, Oct 3, 2011 at 12:06 PM, Matthew Knepley wrote: > On Mon, Oct 3, 2011 at 12:00 PM, Alan Wei wrote: > >> Dear all, >> I still have the problem. 
>_< >> I can not find the DMMG interface provided to change the RHS, >> mentioned by Dr. Smith. The only thing I have is DMMGSetKSP(). It is >> DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix).When I use it first time in >> this example, it calls both ComputeRHS and ComputeMatrix (I assert some >> print-out probes inside these two functions). However, If I use it again, it >> just does not call the new one; rather, it calls the old one. >> My code is attached. Could you please help me to fix this problem >> ^_^. >> > > Yes, this is the way DMMG works. It can only be setup once. This is a > severe limitation, which is why we > have discontinued its use. If you look at ex50.c, we show how to transform > ex19 into a code using SetDM > instead of DMMG. You can then reset the DMDA rhs function at each > iteration. > > Thanks, > > Matt > > >> best, >> Alan >> >> On Sun, Oct 2, 2011 at 5:52 PM, Alan Wei wrote: >> >>> In this example, i recalled the ComputeRHS. >>> >>> Alan >>> >>> >>> On Sun, Oct 2, 2011 at 5:22 PM, Jed Brown wrote: >>> >>>> On Sun, Oct 2, 2011 at 16:31, Alan Wei wrote: >>>> >>>>> BTW, what is the DMMG interface you are talking about? is it >>>>> DMMGSetKSP()? >>>> >>>> >>>> Every object and function containing "DMMG" will be removed. (I don't >>>> think we've decided on a precise timeline, but don't write new code that >>>> uses DMMG.) >>>> >>>> >>>>> if so, that's is my problem. At the 2nd time of using DMMGSetKSP() in >>>>> the main function, it does not call the __function__ to compute the right >>>>> hand side. >_< any suggestions on that? >>>> >>>> >>>> What function is being called? >>>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From milan.v.mitrovic at gmail.com Mon Oct 3 12:38:56 2011 From: milan.v.mitrovic at gmail.com (Milan Mitrovic) Date: Mon, 3 Oct 2011 19:38:56 +0200 Subject: [petsc-users] SNES question Message-ID: Hello everyone, I have a problem trying to use snes. I'm using it in a simulation to compute an implicit time stepping scheme. I'm trying to validate my implementation by testing everything on the heat equation. When I use ksp with the analytic jacobian I get great results, but when I try to switch back to snes I notice that when I try to increase the resolution there is a lower limit on dt below which snes won't converge... It works fine with ksp, but snes fails with LS_FAILURE. I also wrote an fd approximation and tested the jacobian and the two are equal up to machine precision. And the fact that it works with ksp tells me that it is correct... What could be the problem with snes so that it fails to solve a linear equation with a very small dt? Thanks in advance, Milan From bsmith at mcs.anl.gov Mon Oct 3 12:54:24 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Oct 2011 12:54:24 -0500 Subject: [petsc-users] SNES question In-Reply-To: References: Message-ID: <379C5682-D361-4366-8D60-389A252BA5B7@mcs.anl.gov> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#newton On Oct 3, 2011, at 12:38 PM, Milan Mitrovic wrote: > Hello everyone, > > I have a problem trying to use snes. I'm using it in a simulation to > compute an implicit time stepping scheme. > > I'm trying to validate my implementation by testing everything on the > heat equation. 
When I use ksp with the analytic jacobian I get great > results, but when I try to switch back to snes I notice that when I > try to increase the resolution there is a lower limit on dt below > which snes won't converge... It works fine with ksp, but snes fails > with LS_FAILURE. > > I also wrote an fd approximation and tested the jacobian and the two > are equal up to machine precision. And the fact that it works with ksp > tells me that it is correct... > > What could be the problem with snes so that it fails to solve a linear > equation with a very small dt? > > Thanks in advance, > Milan From sistek at math.cas.cz Mon Oct 3 14:40:41 2011 From: sistek at math.cas.cz (Jakub Sistek) Date: Mon, 03 Oct 2011 21:40:41 +0200 Subject: [petsc-users] [SPAM] Re: [SPAM] user experience with PCNN In-Reply-To: References: <4E89D110.6050609@math.cas.cz> Message-ID: <4E8A0FB9.6080602@math.cas.cz> Dear Jed, thank you for your quick response and answers. And I am very pleased by your interest in BDDC :-) I would be also very happy if we could make a native implementation of BDDC into PETSc together and have been thinking of something like that several times. However, I can see some issues with this, which might be caused only by my limited knowledge of PETSc. I would love to know your comments to them. One thing I particularly enjoy on PETSc is the quick interchangeability of preconditioners and Krylov methods within the KSP object. But I can see this possible through strictly algebraic nature of the approach, where only matrix object is passed. On the other hand, all of the FETI-DP and BDDC implementations I have heard of are related to FEM computations and make the mesh somewhat accessible to the solver. Although I do not like this, also my third generation of implementation of the BDDC method still needs some limited information on geometry. Not really for construction of the coarse basis functions (this is algebraic in BDDC), but rather indirectly for the selection of coarse degrees of freedom. I am not aware of any existing approach to selection of coarse DOFs at the moment, that would not require some information on geometry for robust selection on unstructured 3D meshes. I could imagine that the required information could be limited to positions of unknowns and some information of the problem which is solved (the nullspace size), the topology of the mesh is not really necessary. For this difficulty, I do not see it simple to write something like PCBDDC preconditioner that would simply interchange with PCASM and others. The situation would be simpler for BDDC if the preconditioner could use also some kind of mesh description. The other issue I can see to be a bit conflicting with the KSP approach of PETSc might be the fact, that BDDC implementations introduce some coupling between preconditioner and Krylov method, which is in fact run only for the Schur complement problem at the interface among subdomains. Multiplication by the system matrix in Krylov method is performed by Dirichlet solves on each subdomain, which corresponds to passing a special matrix-vector multiplying routine to the Krylov method - at least, this is the approach I follow in my last implementation of BDDC, in the BDDCML code, where essentially the preconditioner provides the A*x function to the Krylov method. 
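(In PETSc terms, the native way to hand the Krylov method a user-defined multiply of this kind is a shell matrix. A minimal sketch follows; the routine name MyInterfaceMult and the context ctx are invented for illustration and this is not BDDCML's actual interface:

Mat S;
/* shell matrix of the interface problem's dimensions; ctx carries the subdomain data */
MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,N,N,ctx,&S);
/* MyInterfaceMult(S,x,y) applies y = S*x, e.g. via subdomain Dirichlet solves */
MatShellSetOperation(S,MATOP_MULT,(void(*)(void))MyInterfaceMult);
KSPSetOperators(ksp,S,S,SAME_NONZERO_PATTERN);

With this, any KSP iterates on the Schur complement problem while the preconditioner supplies the subdomain machinery through the shell context.)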
I have seen this circumvented in PCNN by resolving the vectors to the original size after each application of the preconditioner, but in my opinion, this approach then loses some of the efficiency of running Krylov method on the Schur complement problem instead of the original problem, which usually has a great effect on convergence by itself. Regarding problem types, I have little experience with using BDDC beyond Poisson problems and elasticity. Recently, I have done some tests with Stokes problems and incompressible Navier-Stokes problems, using "brute force" rather than any delicacy you may have in mind. The initial experience with Stokes problem using Taylor-Hood elements is quite good, things get worse for Navier-Stokes where the convergence, with the current simple coarse problem, deteriorates quickly with increasing Reynolds number. However, all these things should be better tested and, as you probably know, are rather recent topic of research and no clear conclusions have been really achieved. I am looking forward to your comments. It would certainly be great if we could make BDDC work within PETSc. Let us see if we can overcome these issues... Jakub On 10/03/2011 05:44 PM, Jed Brown wrote: > On Mon, Oct 3, 2011 at 10:13, Jakub Sistek > wrote: > > Dear PETSc developers and users, > > we are using PETSc to solve systems arising from mixed-hybrid FEM > applied to Darcy flow. First, the saddle point system is reduced > to the Schur complement problem for Lagrange multipliers on > element interfaces, which is then symmetric positive definite. > Currently, we are using the PCASM preconditioner to solve it. We > have very positive experience (also from other projects) with > PCASM, but we have observed some worsening of convergence and > scalability with going to larger number of processors (up to 64) > here. As far as we understand, the increasing number of iterations > may be caused by the lack of coarse correction in the > implementation of the preconditioner. On the other hand, PCNN > should contain such a coarse solve. I have modified our FEM code > to support MATIS matrices besides MPIAIJ, but so far have a mixed > experience with PCNN. It seems to work on 2 CPUs, but complains > about singular local problems (solved by MUMPS) on more. After > some time spent by debugging ( though there are probably still > many bugs left in my code ) and unsuccessful playing with some of > the related options ( -pc_is_damp_fixed, > -pc_is_set_damping_factor_floating, etc.) I have decided to ask > couple of questions before I will continue in further > investigation why PCNN does not work for me for general case: > > 1) Am I right that PCNN is the only domain decomposition method > exploiting coarse correction readily available in PETSc? > > > Multilevel Schwarz methods can be used through PCMG. A two-level > example, based on the so-called wirebasket coarse spaces, is PCEXOTIC. > This code was basically an experiment that only works for scalar > problems on structured grids (using DMDA). It could certainly be > generalized. > > The new PCGAMG is not a DD method per-se, but it is developing a lot > of hooks with many ideas that overlap with modern DD methods. > > 2) As PCNN seems much less documented (I have found no example or > so) than other preconditioners, I would simply like to know if > someone else uses it and have positive experience with this > implementation? > > > Unfortunately, PCNN was more of a research project than robust or > widely used code. 
The coarse space used by the current code is only > suitable for scalar problems. The method uses coarse spaces that are > more awkward to generalize than the more recently developed balancing > and dual methods like BDDC/FETI-DP. > > 3) What may be proper options for stabilizing solutions of the > local problems? > 4) Are there limitations to the method with respect to nullspace > type of subdomain problems, i.e. equation? > 5) Do these answers depend on version of PETSc? (I have played > with 3.0, would things be different with 3.2 ?) > > > PCNN specifically is no different in 3.2, but there are many other > improvements, so I recommend upgrading. One notable improvement is > that now you can use MatGetLocalSubMatrix() as part of assembly, such > that exactly the same code can assemble into MATIS, MATAIJ, MATNEST, etc. > > > In the long run, I would like to connect the FEM code to an own > solver based on BDDC domain decomposition, for which the MATIS > matrix seems as a natural format and connection should be > straightforward. However, I would like to make it work with PCNN > as well. > > > I have been meaning to add BDDC to PETSc for a long time, but it > hasn't worked its way up to top priority yet. I have read some of your > papers on the subject and would be happy to work with you on a native > PETSc implementation. (My interest is largely on use with mixed > discretizations for incompressible problems, so it's somewhat more > delicate than non-mixed elasticity.) > > I would not worry about PCNN. -- Jakub Sistek, Ph.D. postdoctoral researcher Institute of Mathematics of the AS CR http://www.math.cas.cz/~sistek/ tel: (+420) 222 090 710 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Oct 4 10:22:47 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 4 Oct 2011 10:22:47 -0500 Subject: [petsc-users] user experience with PCNN In-Reply-To: <4E8A0FB9.6080602@math.cas.cz> References: <4E89D110.6050609@math.cas.cz> <4E8A0FB9.6080602@math.cas.cz> Message-ID: On Mon, Oct 3, 2011 at 14:40, Jakub Sistek wrote: > ** > > One thing I particularly enjoy on PETSc is the quick interchangeability of > preconditioners and Krylov methods within the KSP object. But I can see this > possible through strictly algebraic nature of the approach, where only > matrix object is passed. > The KSP and PC objects have two "slots", the Krylov operator A and the "preconditioning matrix" B. I take a very liberal view of what B is. I consider it to be a container into which any problem/state-dependent information needed by the preconditioner should be placed. Topological and geometric information needed by the preconditioner does not change between nonlinear iterations/time steps/etc, so it can be given to the PC directly (e.g. PCBDDCSetCoarseSpaceCandidates() or something like that, though this could also be attached to the Mat). On the other hand, all of the FETI-DP and BDDC implementations I have heard > of are related to FEM computations and make the mesh somewhat accessible to > the solver. Although I do not like this, also my third generation of > implementation of the BDDC method still needs some limited information on > geometry. Not really for construction of the coarse basis functions (this is > algebraic in BDDC), but rather indirectly for the selection of coarse > degrees of freedom. 
I am not aware of any existing approach to selection of > coarse DOFs at the moment, that would not require some information on > geometry for robust selection on unstructured 3D meshes. I could imagine > that the required information could be limited to positions of unknowns and > some information of the problem which is solved (the nullspace size), the > topology of the mesh is not really necessary. > We recently introduced MatSetNearNullSpace() which is also needed by smoothed aggregation algebraic multigrid. (We decided that this belonged on the Mat because there are problems for which the near null space could change depending on the nonlinear regime, thus needing updating within a nonlinear iteration. For multiphysics problems, it is fragile to depend on access to the PC used for a particular "block" (if it exists), so I prefer to put information that may eventually need to be composed with or interact with other "blocks" into the Mat.) > For this difficulty, I do not see it simple to write something like PCBDDC > preconditioner that would simply interchange with PCASM and others. The > situation would be simpler for BDDC if the preconditioner could use also > some kind of mesh description. > I agree that it may always be necessary to provide extra information in order to use PCBDDC. The goal would not be to have a solver that only needs a (partially) assembled sparse matrix, but rather to have a purely algebraic interface by which that information can be provided. Another way for the PC to access grid information is through PCSetDM(). From the perspective of the solver, the DM is just an interface for providing grid- and discretization-dependent algebraic ingredients to the solver. This enables users of DM to have preconditioners automatically set up. > > The other issue I can see to be a bit conflicting with the KSP approach of > PETSc might be the fact, that BDDC implementations introduce some coupling > between preconditioner and Krylov method, which is in fact run only for the > Schur complement problem at the interface among subdomains. Multiplication > by the system matrix in Krylov method is performed by Dirichlet solves on > each subdomain, which corresponds to passing a special matrix-vector > multiplying routine to the Krylov method - at least, this is the approach I > follow in my last implementation of BDDC, in the BDDCML code, where > essentially the preconditioner provides the A*x function to the Krylov > method. > I have seen this circumvented in PCNN by resolving the vectors to the > original size after each application of the preconditioner, but in my > opinion, this approach then loses some of the efficiency of running Krylov > method on the Schur complement problem instead of the original problem, > which usually has a great effect on convergence by itself. > There are tradeoffs both ways because iterating in the full space can accommodate inexact subdomain solves. 
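(A minimal sketch of the MatSetNearNullSpace() usage mentioned above; here modes is assumed to be a user-computed array of six Vecs spanning, e.g., the rigid body modes of an elasticity problem, and the snippet is illustrative only:

MatNullSpace nearnull;
/* modes[0..5]: user-computed near-null space vectors */
MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,6,modes,&nearnull);
MatSetNearNullSpace(A,nearnull);   /* attached to the Mat, so any PC can pick it up */
MatNullSpaceDestroy(&nearnull);    /* the Mat keeps its own reference */

Attaching the information to the Mat, rather than to a particular PC, is what makes it usable by whichever preconditioner ends up seeing that block.)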
There are a bunch of algorithms that use the same ingredients and are essentially equivalent when direct solvers are used, but different when inexact solvers are used:

BDDC: iterate in interface space, needs exact subdomain and coarse solves
BDDC/primal: iterate in interface space plus coarse primal dofs, tolerant of inexact coarse level solve
BDDC/full: iterate in full space, tolerant of inexact subdomain and coarse solves
FETI-DP: iterate in space of Lagrange multipliers, much like BDDC above
iFETI-DP: iterate in space of subdomains with duplicate interface dofs, coarse primal dofs, and Lagrange multipliers; tolerant of inexact subdomain and coarse level solves
irFETI-DP: iterate in space of Lagrange multipliers and coarse dofs, tolerant of inexact coarse solves

One advantage of iterating in the full space is that the method can naturally be used to precondition a somewhat different matrix (e.g. a higher order discretization of the same physics on the same mesh) which can be applied matrix-free. Any method that iterates in a reduced space simply contains another KSP for that purpose. > > Regarding problem types, I have little experience with using BDDC beyond > Poisson problems and elasticity. Recently, I have done some tests with > Stokes problems and incompressible Navier-Stokes problems, using "brute > force" rather than any delicacy you may have in mind. The initial experience > with Stokes problem using Taylor-Hood elements is quite good, things get > worse for Navier-Stokes where the convergence, with the current simple > coarse problem, deteriorates quickly with increasing Reynolds number. > However, all these things should be better tested and, as you probably know, > are rather recent topic of research and no clear conclusions have been > really achieved. > I'm curious about what sort of problems you tested with Stokes. In particular, I'm interested in problems containing thin structures with large jumps in coefficients (e.g. 10^8). (In this application, I'm only interested in the Stokes problem, Re=1e-20 for these problems.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Tue Oct 4 14:07:40 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Tue, 4 Oct 2011 14:07:40 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers Message-ID: Hi, I cannot compile a source file with petsc.h and cusp headers like the following:

#include"petsc.h"
#include
#include "cusp/csr_matrix.h"
#include
int main(void){
std::cout<<"Hello World!"<
return 0;
}

I use the following to compile it:

nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -I/home/guest/sgu1/softwares/petsc-dev/include -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ -I/home/guest/sgu1/softwares/thrust/ -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o

and it gives me the error:

/usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): error: expected an identifier

I think I did not do it in a correct way. I took a look at some of petsc's .cu files; they usually include a long list of headers. Can petsc.h be directly included in a source file with cusp headers? What's the correct way to do it? Does the makefile system in petsc provide the command or flags to do that? How can I obtain those flags? And after I compile it and obtain .o files, can I use g++ to link the .o files with petsc libs and other .o files as usual, i.e., g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) $(OTHERS), or do I need some extra steps and specific linking flags to do that? Thanks. Shiyuan From bsmith at mcs.anl.gov Tue Oct 4 15:15:09 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Oct 2011 15:15:09 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: Are you using the PETSc makefiles to compile it? Or did you make up that compile line yourself? Why not just use the PETSc makefile to compile it? Presumably cusp/thrust has some pattern for what include files need to be included in what order. If you do not know the pattern then why not start by copying what PETSc does, since that compiles? For example cuspvecimpl.h has

#include
#include
#include
#include
#include
#include

while cuspmatimpl.h has

#include <../src/vec/vec/impls/seq/seqcusp/cuspvecimpl.h>
#include
#include
/*for MatCreateSeqAIJCUSPFromTriple*/
#include

so start with at least those. Generally just picking a couple of random include files from some C++ package (like cusp/thrust) and including them won't work.
Barry On Oct 4, 2011, at 2:07 PM, Shiyuan wrote: > Hi, > I cannot compile a source file with petsc.h and cusp headers like the following: > > #include"petsc.h" > #include > #include "cusp/csr_matrix.h" > #include > int main(void){ > std::cout<<"Hello World!"< > return 0; > } > > I use the following to compile it: > > nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -I/home/guest/sgu1/softwares/petsc-dev/include -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ -I/home/guest/sgu1/softwares/thrust/ -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o > > and it gives me the error: > /usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): error: expected an identifier > > I think I did not do it in a correct way. I took a look at some of petsc's .cu files; they usually include a long list of headers. Can petsc.h be directly included in a source file with cusp headers? What's the correct way to do it? Does the makefile system in petsc provide the command or flags to do that? How can I obtain those flags? > > And after I compile it and obtain .o files, can I use g++ to link the .o files with petsc libs and other .o files as usual, i.e., > g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) $(OTHERS) > or do I need some extra steps and specific linking flags to do that? > Thanks. > > Shiyuan > > > I believe that your problem is the clash of VecType between CUSP and PETSc.
In aijcusp.cu, we have

#undef VecType
#include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h"

Matt > > Barry > > On Oct 4, 2011, at 2:07 PM, Shiyuan wrote: > > Hi, > > I cannot compile a source file with petsc.h and cusp headers like > the following: > > > > #include"petsc.h" > > #include > > #include "cusp/csr_matrix.h" > > #include > > int main(void){ > > std::cout<<"Hello World!"< > return 0; > > } > > > > I use the following to compile it: > > > > nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -I/home/guest/sgu1/softwares/petsc-dev/include -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ -I/home/guest/sgu1/softwares/thrust/ -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o > > > > and it gives me the error: > > /usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): error: expected an identifier > > > > I think I did not do it in a correct way. I took a look at some of petsc's .cu files; they usually include a long list of headers. Can petsc.h be directly included in a source file with cusp headers? What's the correct way to do it? Does the makefile system in petsc provide the command or flags to do that? How can I obtain those flags? > > > > And after I compile it and obtain .o files, can I use g++ to link the .o files with petsc libs and other .o files as usual, i.e., > > g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) $(OTHERS) > > or do I need some extra steps and specific linking flags to do that? > > Thanks. > > > > Shiyuan > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Tue Oct 4 18:59:03 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Tue, 4 Oct 2011 18:59:03 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers Message-ID: * *>* Are you using the PETSc makefiles to compile it? Or did you make up that *>* compile line yourself? Why not just use the PETSc makefile to compile it? *Actually, the compiling command I used is generated by PETSc's makefiles.* *But I am looking for something like "make getincludedirs; make getpetscflags; make getlinklibs" which can give me the compiling flags Petsc uses to feed nvcc. Is there anything like that? * *>* *>* Presumably cusp/thrust has some pattern for what include files need to be *>* included in what order. If you do not know the pattern then why not start by *>* copying what PETSc does, since that compiles?
For example *>* cuspvecimpl.h has *>* *>* #include *>* #include *>* #include *>* #include *>* #include *>* #include *>* *>* while cuspmatimpl.h has *>* *>* #include <../src/vec/vec/impls/seq/seqcusp/cuspvecimpl.h> *>* #include *>* #include *>* /*for MatCreateSeqAIJCUSPFromTriple*/ *>* #include *>* *>* so start with at least those. Generally just picking a couple of random *>* include files from some C++ package (like cusp/thrust) and including them *>* won't work. I did check the documentation of cusp, and I don't see a requirement about the order for including headers. It doesn't sound reasonable to require programmers to enforce an order when a programmer uses only functions from one library, does it? (I think there is an issue I don't see, please excuse me for my ignorance.) And I did try a few experiments where I picked some cusp examples and changed the order of the included headers, and they do work. *>I believe that your problem is the clash of VecType between CUSP and PETSc. >In aijcusp.cu, we have >#undef VecType >#include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > Matt That explains it. But I still don't get the solution. If I want to define a function where I call some functions provided by Petsc and some functions provided by cusp and thrust, in what order should I include the headers? For example, if I want Mat, Vec and cusp::krylov::cg. If I look at the aijcusp.cu where MatMult_SeqAIJCusp is defined, it has a long list of headers,

#include "petscconf.h"
PETSC_CUDA_EXTERN_C_BEGIN
#include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/
#include "petscbt.h"
#include "../src/vec/vec/impls/dvecimpl.h"
#include "private/vecimpl.h"
PETSC_CUDA_EXTERN_C_END
#undef VecType
#include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h"

#ifdef PETSC_HAVE_TXPETSCGPU
#include "csr_matrix_data.h"
#include "csr_matrix_data_gpu.h"
#include "csr_tri_solve_gpu.h"
#include "csr_tri_solve_gpu_level_scheduler.h"
#include "csr_spmv_inode.h"
#include
#include
#include
#include
#include

But looking at this list of headers, I cannot see which files I really need and what else is missing if I only want Mat, Vec and cusp::krylov::cg. Is there a less painful way? Thanks.
Shiyuan **** --------------------------------------------------------------------------------------------------------------------------------------------------------------------- **>* On Oct 4, 2011, at 2:07 PM, Shiyuan wrote: *>* *>* > Hi, *>* > I cannot compiling a source file with petsc.h and cusp headers like *>* the following: *>* > *>* > #include"petsc.h" *>* > #include *>* > #include "cusp/csr_matrix.h" *>* > #include *>* > int main(void){ *>* > std::cout<<"Hello World!"<* > return 0; *>* > } *>* > *>* > I use the following to compile it: *>* > *>* > nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings *>* -Wno-strict-aliasing -Wno-unknown-pragmas -O *>* -I/home/guest/sgu1/softwares/petsc-dev/include *>* -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include *>* -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ *>* -I/home/guest/sgu1/softwares/thrust/ *>* -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 *>* -I/home/guest/sgu1/softwares/slepc-dev *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include *>* -I/home/guest/sgu1/softwares/slepc-dev/include *>* -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick *>* -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include *>* -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o *>* > *>* > and it gives me error: *>* > *>* /usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): *>* error: expected an identifier *>* > *>* > I think I did not do it in a correct way. I took a a look at some of *>* petsc's .cu files, they usually include a long list of headers. Can petsc.h *>* directly included in a source code with cusp headers? What's the correct way *>* to do it? Does the makefile system in petsc provide the command or flags to *>* do that? How can I obtain those flags? *>* > *>* > And after I compile it and obtain .o files, can I use g++ to link the .o *>* files with petcs libs and other .o files as usual, i.e., *>* > g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) *>* $(OTHERS) *>* > or I need to do some extra steps and some specific linking flags to do *>* that? *>* > Thanks. *>* > *>* > Shiyuan *>* > *>* > *>* > *>* *>* * -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Oct 4 21:30:55 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Oct 2011 21:30:55 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: On Tue, Oct 4, 2011 at 6:59 PM, Shiyuan wrote: > * > *>* Are you using the PETSc makefiles to compile it? Or did you make up that > *>* compile line yourself? Why not just use the PETSc makefile to compile it? > > > > * > Actually, the compiling command I usedd is generated by > > PETSs's makefiles.* *But I am looking for something like "make getincludedirs; make getpetscflags; > make getlinklibs" which can give me the compiling flags Petsc uses to feed nvcc. Are there anything > > like that? > > * > > *>* > *>* Presumably cusp/thrust has some pattern for what include files need to be > *>* included in what order. If you do not know the pattern then why not start by > *>* copying what PETSc does, since that compiles? 
For example > *>* cuspvecimpl.h has > *>* > *>* #include > *>* #include > *>* #include > *>* #include > *>* #include > *>* #include > *>* > *>* while cuspmatimpl.h has > *>* > *>* #include <../src/vec/vec/impls/seq/seqcusp/cuspvecimpl.h> > *>* #include > *>* #include > *>* /*for MatCreateSeqAIJCUSPFromTriple*/ > *>* #include > *>* > *>* so start with at least those. Generally just picking a couple of random > *>* include files from some C++ package (like cusp/thrust) and including them > *> > * won't work. > > > I did check the documentation of cusp, and I don't see a requirement about > **the order for including headers. It doesn't sound reasonable to require programmers to enforce the order when the programmers > > uses only functions from one library. Does it?( I think there is an issue I don't see, please excuse me for my ignorance). And I did > try a few experiment where I picked some cusp examples and changed the orders of including headers, and they do work. > > > > > * > >I believe that your problem is the clash of VecType between CUSP and PETSc. > >In aijcusp.cu, we have > > >#undef VecType > >#include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > > Matt > > That explains it. But I still don't get the solution. If I want to define a function where I call some functions provided by Petsc and some functions provided by cusp and thrust, > In what order should I include the headers? For example, if I want Mat, Vec and cusp::krylov::cg. > > If I look at the aijcusp.cu where MatMult_SeqAIJCusp is defined, it has a long list of headers, > > > #include "petscconf.h" > PETSC_CUDA_EXTERN_C_BEGIN > #include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/ > #include "petscbt.h" > #include "../src/vec/vec/impls/dvecimpl.h" > #include "private/vecimpl.h" > PETSC_CUDA_EXTERN_C_END > #undef VecType > #include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > > #ifdef PETSC_HAVE_TXPETSCGPU > > #include "csr_matrix_data.h" > #include "csr_matrix_data_gpu.h" > #include "csr_tri_solve_gpu.h" > #include "csr_tri_solve_gpu_level_scheduler.h" > #include "csr_spmv_inode.h" > #include > #include > #include > #include > #include > > > But looking at this list of headers, I cannot see what files do I really need and what else is missing if I only want Mat, Vec and cusp::krylov::cg. > > Is there a less painful way? Thanks. 
> > #undef VecType Matt > > Shiyuan > **** > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > **>* On Oct 4, 2011, at 2:07 PM, Shiyuan wrote: > *>* > *>* > Hi, > *>* > I cannot compiling a source file with petsc.h and cusp headers like > *>* the following: > *>* > > *>* > #include"petsc.h" > *>* > #include > *>* > #include "cusp/csr_matrix.h" > *>* > #include > *>* > int main(void){ > *>* > std::cout<<"Hello World!"< *>* > return 0; > *>* > } > *>* > > *>* > I use the following to compile it: > *>* > > *>* > nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings > *>* -Wno-strict-aliasing -Wno-unknown-pragmas -O > *>* -I/home/guest/sgu1/softwares/petsc-dev/include > *>* -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include > *>* -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ > *>* -I/home/guest/sgu1/softwares/thrust/ > *>* -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 > *>* -I/home/guest/sgu1/softwares/slepc-dev > *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include > *>* -I/home/guest/sgu1/softwares/slepc-dev/include > *>* -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick > *>* -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev > *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include > *>* -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o > *>* > > *>* > and it gives me error: > *>* > > *>* /usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): > *>* error: expected an identifier > *>* > > *>* > I think I did not do it in a correct way. I took a a look at some of > *>* petsc's .cu files, they usually include a long list of headers. Can petsc.h > *>* directly included in a source code with cusp headers? What's the correct way > *>* to do it? Does the makefile system in petsc provide the command or flags to > *>* do that? How can I obtain those flags? > *>* > > *>* > And after I compile it and obtain .o files, can I use g++ to link the .o > *>* files with petcs libs and other .o files as usual, i.e., > *>* > g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) > *>* $(OTHERS) > *>* > or I need to do some extra steps and some specific linking flags to do > *>* that? > *>* > Thanks. > *>* > > *>* > Shiyuan > *>* > > *>* > > *>* > > *>* > *>* > * > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Oct 4 21:32:47 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 4 Oct 2011 21:32:47 -0500 (CDT) Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: How about the following? 
Satish ------------ balay at bb30:~/ex>cat makefile CFLAGS = FFLAGS = CPPFLAGS = FPPFLAGS = CLEANFILES = include ${PETSC_DIR}/conf/variables include ${PETSC_DIR}/conf/rules ex1: ex1.o chkopts -${CLINKER} -o ex1 ex1.o ${PETSC_KSP_LIB} ${RM} ex1.o balay at bb30:~/ex>cat ex1.cu #include "petsc.h" #undef VecType #include #include int main(int argc,char **argv) { PetscErrorCode ierr; ierr = PetscInitialize(&argc,&argv,(char *)0,(char*)0);CHKERRQ(ierr); std::cout<<"Hello World!"<make ex1 nvcc -g -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -I/home/balay/petsc-dev/include -I/home/balay/petsc-dev/arch-cuda-double/include -I/usr/local/cuda/include -D__INSDIR__=" ex1.cu /home/balay/petsc-dev/arch-cuda-double/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -o ex1 ex1.o -L/home/balay/petsc-dev/arch-cuda-double/lib -lpetsc -lX11 -lpthread -Wl,-rpath,/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcufft -lcublas -lcudart -llapack -lblas -lm -lmpichcxx -lstdc++ -ldl /bin/rm -f ex1.o balay at bb30:~/ex>./ex1 Hello World! balay at bb30:~/ex> On Tue, 4 Oct 2011, Shiyuan wrote: > * > *>* Are you using the PETSc makefiles to compile it? Or did you make up that > *>* compile line yourself? Why not just use the PETSc makefile to compile it? > > > > *Actually, the compiling command I usedd is generated by > PETSs's makefiles.* *But I am looking for something like "make > getincludedirs; make getpetscflags; > make getlinklibs" which can give me the compiling flags Petsc uses to > feed nvcc. Are there anything > like that? > * > > *>* > *>* Presumably cusp/thrust has some pattern for what include files need to be > *>* included in what order. If you do not know the pattern then why not start by > *>* copying what PETSc does, since that compiles? For example > *>* cuspvecimpl.h has > *>* > *>* #include > *>* #include > *>* #include > *>* #include > *>* #include > *>* #include > *>* > *>* while cuspmatimpl.h has > *>* > *>* #include <../src/vec/vec/impls/seq/seqcusp/cuspvecimpl.h> > *>* #include > *>* #include > *>* /*for MatCreateSeqAIJCUSPFromTriple*/ > *>* #include > *>* > *>* so start with at least those. Generally just picking a couple of random > *>* include files from some C++ package (like cusp/thrust) and including them > *>* won't work. > > > I did check the documentation of cusp, and I don't see a requirement about > **the order for including headers. It doesn't sound reasonable to > require programmers to enforce the order when the programmers > uses only functions from one library. Does it?( I think there is an > issue I don't see, please excuse me for my ignorance). And I did > try a few experiment where I picked some cusp examples and changed the > orders of including headers, and they do work. > > > > > *>I believe that your problem is the clash of VecType between CUSP and PETSc. > >In aijcusp.cu, we have > > >#undef VecType > >#include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > > Matt > > That explains it. But I still don't get the solution. If I want to > define a function where I call some functions provided by Petsc and > some functions provided by cusp and thrust, > In what order should I include the headers? For example, if I want > Mat, Vec and cusp::krylov::cg. 
> If I look at the aijcusp.cu where MatMult_SeqAIJCusp is defined, it > has a long list of headers, > > > #include "petscconf.h" > PETSC_CUDA_EXTERN_C_BEGIN > #include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/ > #include "petscbt.h" > #include "../src/vec/vec/impls/dvecimpl.h" > #include "private/vecimpl.h" > PETSC_CUDA_EXTERN_C_END > #undef VecType > #include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > > #ifdef PETSC_HAVE_TXPETSCGPU > > #include "csr_matrix_data.h" > #include "csr_matrix_data_gpu.h" > #include "csr_tri_solve_gpu.h" > #include "csr_tri_solve_gpu_level_scheduler.h" > #include "csr_spmv_inode.h" > #include > #include > #include > #include > #include > > > But looking at this list of headers, I cannot see what files do I > really need and what else is missing if I only want Mat, Vec and > cusp::krylov::cg. > Is there a less painful way? Thanks. > > > Shiyuan > **** > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > **>* On Oct 4, 2011, at 2:07 PM, Shiyuan wrote: > *>* > *>* > Hi, > *>* > I cannot compiling a source file with petsc.h and cusp headers like > *>* the following: > *>* > > *>* > #include"petsc.h" > *>* > #include > *>* > #include "cusp/csr_matrix.h" > *>* > #include > *>* > int main(void){ > *>* > std::cout<<"Hello World!"< *>* > return 0; > *>* > } > *>* > > *>* > I use the following to compile it: > *>* > > *>* > nvcc -m64 -O -arch=sm_13 -c --compiler-options="-Wall -Wwrite-strings > *>* -Wno-strict-aliasing -Wno-unknown-pragmas -O > *>* -I/home/guest/sgu1/softwares/petsc-dev/include > *>* -I/home/guest/sgu1/softwares/petsc-dev/helena-cxx-nompi-os64-release/include > *>* -I/usr/local/cuda/include -I/home/guest/sgu1/softwares/cusp-library/ > *>* -I/home/guest/sgu1/softwares/thrust/ > *>* -I/home/guest/sgu1/softwares/petsc-dev/include/mpiuni -DCUDA=1 > *>* -I/home/guest/sgu1/softwares/slepc-dev > *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include > *>* -I/home/guest/sgu1/softwares/slepc-dev/include > *>* -I/home/guest/sgu1/softwares/ImageMagick-6.7.2/include/ImageMagick > *>* -D__INSDIR__= -I/home/guest/sgu1/softwares/slepc-dev > *>* -I/home/guest/sgu1/softwares/slepc-dev/helena-cxx-nompi-os64-release/include > *>* -I/home/guest/sgu1/softwares/slepc-dev/include" MyKspTmp4.cu -o MyKspTmp4.o > *>* > > *>* > and it gives me error: > *>* > > *>* /usr/local/cuda/include/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): > *>* error: expected an identifier > *>* > > *>* > I think I did not do it in a correct way. I took a a look at some of > *>* petsc's .cu files, they usually include a long list of headers. Can petsc.h > *>* directly included in a source code with cusp headers? What's the correct way > *>* to do it? Does the makefile system in petsc provide the command or flags to > *>* do that? How can I obtain those flags? > *>* > > *>* > And after I compile it and obtain .o files, can I use g++ to link the .o > *>* files with petcs libs and other .o files as usual, i.e., > *>* > g++ -o $(BIN_DIR)/$@ $(CPPFLAGS) $@.s $(objects) $(PETSC_LIB) > *>* $(OTHERS) > *>* > or I need to do some extra steps and some specific linking flags to do > *>* that? > *>* > Thanks. 
> *>* > > *>* > Shiyuan > *>* > > *>* > > *>* > > *>* > *>* > * > From gdiso at ustc.edu Tue Oct 4 21:47:30 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 5 Oct 2011 10:47:30 +0800 Subject: [petsc-users] DGMRES Right preconditioner References: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda><45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> <4E89B9AC.4090309@inria.fr> Message-ID: The order of the preconditioner is the problem. I would like to use my own preconditioner i.e. M_D first, then the deflation preconditioner M. That is solving (A*M_D^-1*M^-1) (M M_D) x = b M_D is a static preconditioner build from A (i.e ILU or some preconditioner based on domain decomposition). And M is the deflation preconditioner build from AM_D^-1. Anyway, since M is a dynamic preconditioner built from GMRES Arnoldi process, it is easy to be constructed from AM_D^-1. However build preconditioner such as ILU from AM^-1 is more difficult. I think it is reasonable to change the order of preconditioner in DGMRES code. Or at least gives an option to do this. > Hi Gong, > You do not need to modify the source code for the right preconditioning. > The additional preconditioner (say M_D^{-1}) is built from the > approximate eigenvalues of M^{-1}A or AM^{-1}. > Hence you are solving either M_D^{-1}M_{-1}Ax = M_D^{-1}M_{-1}b or > AM^{-1}M^{-1}_Dy = b for x = M^{-1}M^{-1}_Dy > > Thanks > Desire > From bsmith at mcs.anl.gov Tue Oct 4 22:09:14 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Oct 2011 22:09:14 -0500 Subject: [petsc-users] DGMRES Right preconditioner In-Reply-To: References: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda><45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> <4E89B9AC.4090309@inria.fr> Message-ID: <778E956F-2DF6-4C5A-917D-5F94411D71CE@mcs.anl.gov> 1) It seems this is what the code is already doing? if (ksp->pc_side == PC_LEFT) { /* Apply the first preconditioner */ ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP); CHKERRQ (ierr); /* Then apply Deflation as a preconditioner */ ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it)); CHKERRQ (ierr); } else if (ksp->pc_side == PC_RIGHT) { ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP); CHKERRQ (ierr); ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP); CHKERRQ (ierr); For example, with right preconditioning it is applying M^-1(the deflation thingy) to VEC_VV(it) putting the result in VEC_TEMP then it is applying the static preconditioner (for example ILU) M_D^-1 then applying the operator A. 2) you are certainly free to edit dgmres.c and do a different algorithm and see how it goes; if you show that it works well or better than the implemented code then people are likely to accept your suggestion. But you need to take the initiative with your ideas to try them out and demonstrate they are good ideas. Barry On Oct 4, 2011, at 9:47 PM, Gong Ding wrote: > The order of the preconditioner is the problem. > I would like to use my own preconditioner i.e. M_D first, then the deflation preconditioner M. > That is solving > (A*M_D^-1*M^-1) (M M_D) x = b > > M_D is a static preconditioner build from A (i.e ILU or some preconditioner based on domain decomposition). And M is the deflation preconditioner build from AM_D^-1. > Anyway, since M is a dynamic preconditioner built from GMRES Arnoldi process, it is easy to be constructed from AM_D^-1. > However build preconditioner such as ILU from AM^-1 is more difficult. > > I think it is reasonable to change the order of preconditioner in DGMRES code. 
Or at least gives an option to do this. > > >> Hi Gong, >> You do not need to modify the source code for the right preconditioning. >> The additional preconditioner (say M_D^{-1}) is built from the >> approximate eigenvalues of M^{-1}A or AM^{-1}. >> Hence you are solving either M_D^{-1}M_{-1}Ax = M_D^{-1}M_{-1}b or >> AM^{-1}M^{-1}_Dy = b for x = M^{-1}M^{-1}_Dy >> >> Thanks >> Desire >> From gdiso at ustc.edu Wed Oct 5 00:22:24 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 5 Oct 2011 13:22:24 +0800 (CST) Subject: [petsc-users] DGMRES Right preconditioner In-Reply-To: <778E956F-2DF6-4C5A-917D-5F94411D71CE@mcs.anl.gov> References: <778E956F-2DF6-4C5A-917D-5F94411D71CE@mcs.anl.gov> <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda><45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> <4E89B9AC.4090309@inria.fr> Message-ID: <9262131.256761317792144775.JavaMail.coremail@mail.ustc.edu> Yes, the code has already do as I want. Sorry, I misunderstanding the code in the PCApply sequence. Thanks, Barry. > 1) It seems this is what the code is already doing? > > > > if (ksp->pc_side == PC_LEFT) { > > /* Apply the first preconditioner */ > > ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP); > > CHKERRQ (ierr); > > /* Then apply Deflation as a preconditioner */ > > ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it)); > > CHKERRQ (ierr); > > } else if (ksp->pc_side == PC_RIGHT) { > > ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP); > > CHKERRQ (ierr); > > ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP); > > CHKERRQ (ierr); > > > > For example, with right preconditioning it is applying M^-1(the deflation thingy) to VEC_VV(it) putting the result in VEC_TEMP then it is applying the static preconditioner (for example ILU) M_D^-1 then applying the operator A. > > > > 2) you are certainly free to edit dgmres.c and do a different algorithm and see how it goes; if you show that it works well or better than the implemented code then people are likely to accept your suggestion. But you need to take the initiative with your ideas to try them out and demonstrate they are good ideas. > > > > > > Barry > > > > On Oct 4, 2011, at 9:47 PM, Gong Ding wrote: > > > > > The order of the preconditioner is the problem. > > > I would like to use my own preconditioner i.e. M_D first, then the deflation preconditioner M. > > > That is solving > > > (A*M_D^-1*M^-1) (M M_D) x = b > > > > > > M_D is a static preconditioner build from A (i.e ILU or some preconditioner based on domain decomposition). And M is the deflation preconditioner build from AM_D^-1. > > > Anyway, since M is a dynamic preconditioner built from GMRES Arnoldi process, it is easy to be constructed from AM_D^-1. > > > However build preconditioner such as ILU from AM^-1 is more difficult. > > > > > > I think it is reasonable to change the order of preconditioner in DGMRES code. Or at least gives an option to do this. > > > > > > > > >> Hi Gong, > > >> You do not need to modify the source code for the right preconditioning. > > >> The additional preconditioner (say M_D^{-1}) is built from the > > >> approximate eigenvalues of M^{-1}A or AM^{-1}. 
> > >> Hence you are solving either M_D^{-1}M_{-1}Ax = M_D^{-1}M_{-1}b or > > >> AM^{-1}M^{-1}_Dy = b for x = M^{-1}M^{-1}_Dy > > >> > > >> Thanks > > >> Desire > > >> > > > > From desire.nuentsa_wakam at inria.fr Wed Oct 5 03:18:41 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Wed, 05 Oct 2011 10:18:41 +0200 Subject: [petsc-users] DGMRES Right preconditioner In-Reply-To: <778E956F-2DF6-4C5A-917D-5F94411D71CE@mcs.anl.gov> References: <079FB6F512FC4D1FAA325A66B156ED0D@cogendaeda><45548A70-8582-4169-92A1-E05EB44CCDBE@mcs.anl.gov> <4E89B9AC.4090309@inria.fr> <778E956F-2DF6-4C5A-917D-5F94411D71CE@mcs.anl.gov> Message-ID: <4E8C12E1.9090809@inria.fr> Yes Barry, that is what the code is doing. Thanks Desire On 10/05/2011 05:09 AM, Barry Smith wrote: > 1) It seems this is what the code is already doing? > > if (ksp->pc_side == PC_LEFT) { > /* Apply the first preconditioner */ > ierr = KSP_PCApplyBAorAB (ksp,VEC_VV (it), VEC_TEMP,VEC_TEMP_MATOP); > CHKERRQ (ierr); > /* Then apply Deflation as a preconditioner */ > ierr=KSPDGMRESApplyDeflation (ksp, VEC_TEMP, VEC_VV (1+it)); > CHKERRQ (ierr); > } else if (ksp->pc_side == PC_RIGHT) { > ierr=KSPDGMRESApplyDeflation (ksp, VEC_VV (it), VEC_TEMP); > CHKERRQ (ierr); > ierr=KSP_PCApplyBAorAB (ksp, VEC_TEMP, VEC_VV (1+it), VEC_TEMP_MATOP); > CHKERRQ (ierr); > > For example, with right preconditioning it is applying M^-1(the deflation thingy) to VEC_VV(it) putting the result in VEC_TEMP then it is applying the static preconditioner (for example ILU) M_D^-1 then applying the operator A. > > 2) you are certainly free to edit dgmres.c and do a different algorithm and see how it goes; if you show that it works well or better than the implemented code then people are likely to accept your suggestion. But you need to take the initiative with your ideas to try them out and demonstrate they are good ideas. > > > Barry > > On Oct 4, 2011, at 9:47 PM, Gong Ding wrote: > > >> The order of the preconditioner is the problem. >> I would like to use my own preconditioner i.e. M_D first, then the deflation preconditioner M. >> That is solving >> (A*M_D^-1*M^-1) (M M_D) x = b >> >> M_D is a static preconditioner build from A (i.e ILU or some preconditioner based on domain decomposition). And M is the deflation preconditioner build from AM_D^-1. >> Anyway, since M is a dynamic preconditioner built from GMRES Arnoldi process, it is easy to be constructed from AM_D^-1. >> However build preconditioner such as ILU from AM^-1 is more difficult. >> >> I think it is reasonable to change the order of preconditioner in DGMRES code. Or at least gives an option to do this. >> >> >> >>> Hi Gong, >>> You do not need to modify the source code for the right preconditioning. >>> The additional preconditioner (say M_D^{-1}) is built from the >>> approximate eigenvalues of M^{-1}A or AM^{-1}. >>> Hence you are solving either M_D^{-1}M_{-1}Ax = M_D^{-1}M_{-1}b or >>> AM^{-1}M^{-1}_Dy = b for x = M^{-1}M^{-1}_Dy >>> >>> Thanks >>> Desire >>> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.richard.green at gmail.com Wed Oct 5 12:46:22 2011 From: kevin.richard.green at gmail.com (Kevin Green) Date: Wed, 5 Oct 2011 13:46:22 -0400 Subject: [petsc-users] Appending to vector / numerical continuation / slepc Message-ID: Greetings, I was just wondering what the simplest way to create a new N+k dim where the first N come from a DA. 
It seems to me that I would have to go the round about way of getting the array, then writing that to the first N components of the new vector... I think there would be a bit of a pain for the parallel case when doing this though, like in managing the change in the local sizes when going from N to N+k... perhaps it's not that tricky. Also, with DAs I don't have to worry about orderings, correct? Essentially I want to get pseudo-arclength continuation working using the SNES solver. Another option I'm thinking is that rather than using an extended vector, I could use a MatShell where the added components are located within its context, and updated upon matmult...since k is typically small, this seems reasonable. Do you know of any code/projects that make use of the SNES module for continuation? Any thoughts on what would be the better or simpler way of doing this? I'm using petsc-3.1 right now, as I also need slepc...which hasn't been updated to work with 3.2 yet, as far as I know. I'm fairly new to petsc/slepc... so I have to ask, what is the timescale like between the release of a new petsc, and update of slepc? Or is there a way to get slepc working with the new release? Cheers, Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Wed Oct 5 13:30:13 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 5 Oct 2011 20:30:13 +0200 Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: References: Message-ID: El 05/10/2011, a las 19:46, Kevin Green escribi?: > Greetings, > > I was just wondering what the simplest way to create a new N+k dim where the first N come from a DA. It seems to me that I would have to go the round about way of getting the array, then writing that to the first N components of the new vector... I think there would be a bit of a pain for the parallel case when doing this though, like in managing the change in the local sizes when going from N to N+k... perhaps it's not that tricky. Also, with DAs I don't have to worry about orderings, correct? > > Essentially I want to get pseudo-arclength continuation working using the SNES solver. Another option I'm thinking is that rather than using an extended vector, I could use a MatShell where the added components are located within its context, and updated upon matmult...since k is typically small, this seems reasonable. Do you know of any code/projects that make use of the SNES module for continuation? Any thoughts on what would be the better or simpler way of doing this? > > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been updated to work with 3.2 yet, as far as I know. I'm fairly new to petsc/slepc... so I have to ask, what is the timescale like between the release of a new petsc, and update of slepc? Or is there a way to get slepc working with the new release? slepc-3.2 will be out in a few weeks. Meanwhile, you can use slepc-dev (see instructions at http://www.grycap.upv.es/slepc/download/ ). 
Jose > > Cheers, > Kevin From gshy2014 at gmail.com Wed Oct 5 16:15:20 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Wed, 5 Oct 2011 16:15:20 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: balay at bb30:~/ex>cat ex1.cu > #include "petsc.h" > #undef VecType > > #include > #include > > int main(int argc,char **argv) > { > PetscErrorCode ierr; > ierr = PetscInitialize(&argc,&argv,(char *)0,(char*)0);CHKERRQ(ierr); > std::cout<<"Hello World!"< ierr = PetscFinalize(); > return 0; > } > > That almost works except that petsc.h doesn't include all definitions/declarations. I have a function which cannot be compiled by just adding petsc.h with errors like: error: identifier "CUSPARRAY" is undefined error: identifier "Mat_SeqAIJ" is undefined error: identifier "Mat_SeqAIJCUSP" is undefined ect. But it can be compiled if I add the following in addition to petsc.h PETSC_CUDA_EXTERN_C_BEGIN #include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/ #include "petscbt.h" #include "../src/vec/vec/impls/dvecimpl.h" #include "private/vecimpl.h" PETSC_CUDA_EXTERN_C_END #undef VecType #include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" What's the minimal set of headers I can add to petsc.h such that I can call alll petsc functions and use all petsc type? Thanks. Shiyuan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Oct 5 16:23:12 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Oct 2011 16:23:12 -0500 Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: The PETSc headers are divided into two parts, the public headers with all definitions etc needed by PETSc users (this you just need petsc.h) and private headers that define all the specific data structures for various parts of PETSc. There really is no "minimal set of headers" that allows access to all private parts, the minimal set would simply be all the headers everywhere. But this is not how it is done in PETSc, each piece of code includes the headers it needs for its computations. Since you wish to access the private parts of the CUSP Vec and AIJ Mat classes you need to include what you have below. As soon as you start using the private data structures of PETSc you move into a world that is going to require more initiative on your part in understand the pieces of code you are changing and how to work with them. Barry On Oct 5, 2011, at 4:15 PM, Shiyuan wrote: > > > > balay at bb30:~/ex>cat ex1.cu > #include "petsc.h" > #undef VecType > > #include > #include > > int main(int argc,char **argv) > { > PetscErrorCode ierr; > ierr = PetscInitialize(&argc,&argv,(char *)0,(char*)0);CHKERRQ(ierr); > std::cout<<"Hello World!"< ierr = PetscFinalize(); > return 0; > } > > > That almost works except that petsc.h doesn't include all definitions/declarations. I have a function which cannot be compiled by just adding petsc.h with errors like: > error: identifier "CUSPARRAY" is undefined > error: identifier "Mat_SeqAIJ" is undefined > error: identifier "Mat_SeqAIJCUSP" is undefined > ect. 
> But it can be compiled if I add the following in addition to petsc.h > > PETSC_CUDA_EXTERN_C_BEGIN > #include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/ > #include "petscbt.h" > #include "../src/vec/vec/impls/dvecimpl.h" > #include "private/vecimpl.h" > PETSC_CUDA_EXTERN_C_END > #undef VecType > #include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > What's the minimal set of headers I can add to petsc.h such that I can call alll petsc functions and use all petsc type? > Thanks. > > > Shiyuan > > > > > > > > > > From balay at mcs.anl.gov Wed Oct 5 16:23:55 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 5 Oct 2011 16:23:55 -0500 (CDT) Subject: [petsc-users] compiling petsc.h and cusp's headers In-Reply-To: References: Message-ID: On Wed, 5 Oct 2011, Shiyuan wrote: > balay at bb30:~/ex>cat ex1.cu > > #include "petsc.h" > > #undef VecType > > > > #include > > #include > > > > int main(int argc,char **argv) > > { > > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc,&argv,(char *)0,(char*)0);CHKERRQ(ierr); > > std::cout<<"Hello World!"< > ierr = PetscFinalize(); > > return 0; > > } > > > > > That almost works except that petsc.h doesn't include all > definitions/declarations. I have a function which cannot be compiled by > just adding petsc.h with errors like: > error: identifier "CUSPARRAY" is undefined > error: identifier "Mat_SeqAIJ" is undefined > error: identifier "Mat_SeqAIJCUSP" is undefined > ect. > But it can be compiled if I add the following in addition to petsc.h > > PETSC_CUDA_EXTERN_C_BEGIN > #include "../src/mat/impls/aij/seq/aij.h" /*I "petscmat.h" I*/ > #include "petscbt.h" > #include "../src/vec/vec/impls/dvecimpl.h" > #include "private/vecimpl.h" > PETSC_CUDA_EXTERN_C_END > #undef VecType > #include "../src/mat/impls/aij/seq/seqcusp/cuspmatimpl.h" > > What's the minimal set of headers I can add to petsc.h such that I can call > alll petsc functions and use all petsc type? > Thanks. petsc.h will give you all the public interface functions. [and datatypes]. If you need to access the private datatypes and functions - then yes - you have to include the needed private include files. Since the private includes are not really meant for regular users - there is no single include file that covers them all. You keep listing all the ones you need. Satish From bsmith at mcs.anl.gov Wed Oct 5 16:29:14 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Oct 2011 16:29:14 -0500 Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: References: Message-ID: <2303DE53-E457-4168-9C56-B7B590676927@mcs.anl.gov> Kevin, The DMCOMPOSITE is designed exactly for this purpose. See the manual page http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-3.0.0/docs/manualpages/DA/DMCompositeCreate.html#DMCompositeCreate and examples it links to. Essentially you create a DMCOMPOSITE and then use DMCompositeAddDM() to put in the DM which will be parallel and DMCompositeAddArray() for the "k" extra things (like continuation parameters). After you read the manual pages and look at the examples and start your code using the DMCOMPOSITE, feel free to ask specific questions about its usage. You definitely should switch to PETSc 3.2 before working because the DM has been markedly improved in places for this type of thing, Barry On Oct 5, 2011, at 12:46 PM, Kevin Green wrote: > Greetings, > > I was just wondering what the simplest way to create a new N+k dim where the first N come from a DA. 
It seems to me that I would have to go the round about way of getting the array, then writing that to the first N components of the new vector... I think there would be a bit of a pain for the parallel case when doing this though, like in managing the change in the local sizes when going from N to N+k... perhaps it's not that tricky. Also, with DAs I don't have to worry about orderings, correct? > > Essentially I want to get pseudo-arclength continuation working using the SNES solver. Another option I'm thinking is that rather than using an extended vector, I could use a MatShell where the added components are located within its context, and updated upon matmult...since k is typically small, this seems reasonable. Do you know of any code/projects that make use of the SNES module for continuation? Any thoughts on what would be the better or simpler way of doing this? > > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been updated to work with 3.2 yet, as far as I know. I'm fairly new to petsc/slepc... so I have to ask, what is the timescale like between the release of a new petsc, and update of slepc? Or is there a way to get slepc working with the new release? > > Cheers, > Kevin From mccomic at mcs.anl.gov Wed Oct 5 16:35:10 2011 From: mccomic at mcs.anl.gov (Mike McCourt) Date: Wed, 5 Oct 2011 16:35:10 -0500 (CDT) Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: <2303DE53-E457-4168-9C56-B7B590676927@mcs.anl.gov> Message-ID: <1805540796.70710.1317850510287.JavaMail.root@zimbra.anl.gov> If you're gonna use PETSc 3.2, make sure to check out the updated documentation: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/DM/DMCompositeCreate.html It has a more accurate list of examples. -Mike ----- Original Message ----- From: "Barry Smith" To: "PETSc users list" Sent: Wednesday, October 5, 2011 4:29:14 PM Subject: Re: [petsc-users] Appending to vector / numerical continuation / slepc Kevin, The DMCOMPOSITE is designed exactly for this purpose. See the manual page http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-3.0.0/docs/manualpages/DA/DMCompositeCreate.html#DMCompositeCreate and examples it links to. Essentially you create a DMCOMPOSITE and then use DMCompositeAddDM() to put in the DM which will be parallel and DMCompositeAddArray() for the "k" extra things (like continuation parameters). After you read the manual pages and look at the examples and start your code using the DMCOMPOSITE, feel free to ask specific questions about its usage. You definitely should switch to PETSc 3.2 before working because the DM has been markedly improved in places for this type of thing, Barry On Oct 5, 2011, at 12:46 PM, Kevin Green wrote: > Greetings, > > I was just wondering what the simplest way to create a new N+k dim where the first N come from a DA. It seems to me that I would have to go the round about way of getting the array, then writing that to the first N components of the new vector... I think there would be a bit of a pain for the parallel case when doing this though, like in managing the change in the local sizes when going from N to N+k... perhaps it's not that tricky. Also, with DAs I don't have to worry about orderings, correct? > > Essentially I want to get pseudo-arclength continuation working using the SNES solver. 
Another option I'm thinking is that rather than using an extended vector, I could use a MatShell where the added components are located within its context, and updated upon matmult...since k is typically small, this seems reasonable. Do you know of any code/projects that make use of the SNES module for continuation? Any thoughts on what would be the better or simpler way of doing this? > > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been updated to work with 3.2 yet, as far as I know. I'm fairly new to petsc/slepc... so I have to ask, what is the timescale like between the release of a new petsc, and update of slepc? Or is there a way to get slepc working with the new release? > > Cheers, > Kevin From rongliang.chan at gmail.com Wed Oct 5 17:56:53 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 5 Oct 2011 16:56:53 -0600 Subject: [petsc-users] Memory problem Message-ID: Hello Everyone, I am testing a non-linear problem using the snessolve(). The degree of freedoms of my test case is about 1 Million, which means the Jacobian matrix in the snessolve() is an 1 million by 1 million matrix and it should be a sparse matrix. And my question is that in the "-log_summary" output file I find a strange massage: "Matrix 39 39 18446744074642894848 0". Does this message mean that the matrix's memory usage is 1.8x10^20? I have no idea why an one million by one million matrix use so much memory. Is this possible? The output of the "-log_summary" followed. Thanks. Best, Rongliang ------------------------------------------------------------------------------------------------------------------------- Starting to load grid... Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. Setupping Interpolation matrix...... Interpolation matrix done......Time spent: 0.405431 finished. Grid has 32000 elements, 1096658 degrees of freedom. Coarse grid has 2000 elements, 70170 degrees of freedom. [0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom (including shared points). [0] coarse grid has 2194 degrees of freedom (matrix), 2194 degrees of freedom (including shared points). [31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom (including shared points). [31] coarse grid has 2250 degrees of freedom (matrix), 2826 degrees of freedom (including shared points). Time spend on the load grid and create matrix etc.: 3.577862. Solving fixed mesh (steady-state problem) Solving coarse problem...... 0 SNES norm 3.1224989992e+01, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 1.3987219837e+00, 25 KSP its last norm 2.4915963656e-01. 2 SNES norm 5.1898321541e-01, 59 KSP its last norm 1.3451744761e-02. 3 SNES norm 4.0024228221e-02, 56 KSP its last norm 4.9036146089e-03. 4 SNES norm 6.7641787439e-04, 59 KSP its last norm 3.6925683196e-04. Coarse solver done...... Initial value of object function (Energy dissipation) (Coarse): 38.9341108701 0 SNES norm 7.4575110699e+00, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 6.4497565921e-02, 51 KSP its last norm 7.4277453141e-03. 2 SNES norm 9.2093642958e-04, 90 KSP its last norm 5.4331380112e-05. 3 SNES norm 8.1283574549e-07, 103 KSP its last norm 7.5974191049e-07. Initial value of object function (Energy dissipation) (Fine): 42.5134271399 Solution time of 17.180358 sec. Fixed mesh (Steady-state) solver done. Total number of nonlinear iterations = 3 Total number of linear iterations = 244 Average number of linear iterations = 81.333336 Time computing: 17.180358 sec, Time outputting: 0.000000 sec. 
Time spent in coarse nonlinear solve: 0.793436 sec, 0.046183 fraction of total compute time. Solving Shape Optimization problem (steady-state problem) Solving coarse problem...... 0 SNES norm 4.1963166116e+01, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 3.2749386875e+01, 132 KSP its last norm 4.0966334477e-01. 2 SNES norm 2.2874504408e+01, 130 KSP its last norm 3.2526355310e-01. 3 SNES norm 1.4327187891e+01, 132 KSP its last norm 2.1213029400e-01. 4 SNES norm 1.7283643754e+00, 81 KSP its last norm 1.4233338128e-01. 5 SNES norm 3.6703566918e-01, 133 KSP its last norm 1.6069896349e-02. 6 SNES norm 3.6554528686e-03, 77 KSP its last norm 3.5379167356e-03. Coarse solver done...... Optimized value of object function (Energy dissipation) (Coarse): 29.9743062939 The reduction of the energy dissipation (Coarse): 23.012737% The optimized curve (Coarse): a = (4.500000, -0.042893, -0.002030, 0.043721, -0.018798, 0.001824) Solving moving mesh equation...... KSP norm 2.3040219081e-07, KSP its. 741. Time spent 8.481956 Moving mesh solver done. 0 SNES norm 4.7843968670e+02, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 1.0148854085e+02, 49 KSP its last norm 4.7373180511e-01. 2 SNES norm 1.8312214030e+00, 46 KSP its last norm 1.0133332840e-01. 3 SNES norm 3.3101970861e-03, 212 KSP its last norm 1.7753271069e-03. 4 SNES norm 4.9552614008e-06, 249 KSP its last norm 3.2293284103e-06. Optimized value of object function (Energy dissipation) (Fine): 33.2754372645 Solution time of 4053.227456 sec. Number of unknowns = 1096658 Parameters: kinematic viscosity = 0.01 inlet velocity: u = 5, v = 0 Total number of nonlinear iterations = 4 Total number of linear iterations = 556 Average number of linear iterations = 139.000000 Time computing: 4053.227456 sec, Time outputting: 0.000001 sec. Time spent in coarse nonlinear solve: 24.239526 sec, 0.005980 fraction of total compute time. The optimized curve (fine): a = (4.500000, -0.046468, -0.001963, 0.045736, -0.019141, 0.001789) The reduction of the energy dissipation (Fine): 21.729582% Time spend on fixed mesh solving: 17.296872 Time spend on shape opt. solving: 4053.250126 Latex command line: np Newton GMRES Time(Total) Time(Coarse) Ratio 32 & 4 & 139.00 & 4053.23 & 24.24 & 0.6\% Running finished on: Wed Oct 5 11:32:04 2011 Total running time: 4070.644329 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./joab on a Janus-nod named node1751 with 32 processors, by ronglian Wed Oct 5 11:32:04 2011 Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 Max Max/Min Avg Total Time (sec): 4.074e+03 1.00000 4.074e+03 Objects: 1.011e+03 1.00000 1.011e+03 Flops: 2.255e+11 2.27275 1.471e+11 4.706e+12 Flops/sec: 5.535e+07 2.27275 3.609e+07 1.155e+09 MPI Messages: 1.103e+05 5.41392 3.665e+04 1.173e+06 MPI Message Lengths: 1.326e+09 2.60531 2.416e+04 2.833e+10 MPI Reductions: 5.969e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 4.0743e+03 100.0% 4.7058e+12 100.0% 1.173e+06 100.0% 2.416e+04 100.0% 5.968e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 2493 1.0 1.2225e+0218.4 4.37e+09 1.1 3.9e+05 2.2e+03 0.0e+00 2 3 33 3 0 2 3 33 3 0 1084 MatMultTranspose 6 1.0 3.3590e-02 2.2 7.38e+06 1.1 8.0e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 6727 MatSolve 2467 1.0 1.1270e+02 1.7 5.95e+10 1.7 0.0e+00 0.0e+00 0.0e+00 2 33 0 0 0 2 33 0 0 0 13775 MatLUFactorSym 4 1.0 3.4774e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 18 1.0 2.0832e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 0.0e+00 2 56 0 0 0 2 56 0 0 0 12746 MatILUFactorSym 1 1.0 8.3280e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 103 1.0 7.6879e+0215.4 0.00e+00 0.0 1.6e+04 6.2e+04 1.7e+02 7 0 1 4 3 7 0 1 4 3 0 MatAssemblyEnd 103 1.0 3.7818e+01 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 1.6e+02 1 0 0 0 3 1 0 0 0 3 0 MatGetRowIJ 5 1.0 4.8716e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 18 1.0 4.3095e+00 2.5 0.00e+00 0.0 1.6e+04 3.5e+05 7.4e+01 0 0 1 20 1 0 0 1 20 1 0 MatGetOrdering 5 1.0 1.4656e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 1 1.0 1.4356e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 42 1.0 2.0939e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 17 1.0 1.2719e-02 6.8 5.47e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 0 0 0 0 0 0 0 0 0 1317 VecMDot 2425 1.0 1.7196e+01 2.2 5.82e+09 1.1 0.0e+00 0.0e+00 2.4e+03 0 4 0 0 41 0 4 0 0 41 10353 VecNorm 2503 1.0 2.7923e+00 3.4 1.18e+08 1.1 0.0e+00 0.0e+00 2.5e+03 0 0 0 0 42 0 0 0 0 42 1293 VecScale 2467 1.0 7.3112e-02 1.7 5.84e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 24453 VecCopy 153 1.0 1.1636e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 5031 1.0 6.0423e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 137 1.0 1.1462e-02 1.5 6.33e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 16902 VecWAXPY 19 1.0 1.7784e-03 1.4 2.83e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4869 VecMAXPY 2467 1.0 8.5820e+00 1.3 5.93e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 21153 VecAssemblyBegin 69 1.0 1.0341e+0018.2 0.00e+00 0.0 4.9e+03 5.4e+02 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 VecAssemblyEnd 69 1.0 2.4939e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 7491 1.0 1.3734e+00 1.7 0.00e+00 0.0 1.1e+06 1.9e+04 0.0e+00 0 0 96 76 0 0 0 96 76 0 0 VecScatterEnd 7491 1.0 2.0055e+02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecReduceArith 8 1.0 1.4977e-03 2.0 3.05e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6232 VecReduceComm 4 1.0 8.9908e-0412.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2467 1.0 2.8067e+00 3.4 1.75e+08 1.1 0.0e+00 0.0e+00 2.4e+03 0 0 0 0 41 0 0 0 0 41 1905 SNESSolve 4 1.0 4.0619e+03 1.0 2.23e+11 2.3 9.4e+05 2.3e+04 
4.1e+03100 98 80 77 68 100 98 80 77 68 1136 SNESLineSearch 17 1.0 1.1423e+01 1.0 5.23e+07 1.1 1.8e+04 1.7e+04 3.3e+02 0 0 2 1 6 0 0 2 1 6 140 SNESFunctionEval 23 1.0 2.9742e+01 1.0 2.60e+07 1.1 1.9e+04 1.9e+04 3.5e+02 1 0 2 1 6 1 0 2 1 6 27 SNESJacobianEval 17 1.0 3.6786e+03 1.0 0.00e+00 0.0 9.8e+03 6.4e+04 1.4e+02 90 0 1 2 2 90 0 1 2 2 0 KSPGMRESOrthog 2425 1.0 2.5150e+01 1.6 1.16e+10 1.1 0.0e+00 0.0e+00 2.4e+03 0 8 0 0 41 0 8 0 0 41 14157 KSPSetup 36 1.0 2.5388e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 18 1.0 3.6141e+02 1.0 2.25e+11 2.3 1.1e+06 2.4e+04 5.0e+03 9100 97 96 84 9100 97 96 84 13015 PCSetUp 36 1.0 2.1635e+02 3.6 1.55e+11 3.2 1.8e+04 3.2e+05 1.5e+02 3 56 2 20 3 3 56 2 20 3 12274 PCSetUpOnBlocks 18 1.0 2.1293e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 2.7e+01 2 56 0 0 0 2 56 0 0 0 12471 PCApply 2467 1.0 2.5616e+02 2.5 5.95e+10 1.7 7.3e+05 2.8e+04 0.0e+00 4 33 62 73 0 4 33 62 73 0 6060 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 39 39 18446744074642894848 0 Matrix Partitioning 1 1 640 0 Index Set 184 184 2589512 0 IS L to G Mapping 2 2 301720 0 Vector 729 729 133662888 0 Vector Scatter 29 29 30508 0 Application Order 2 2 9335968 0 SNES 4 4 5088 0 Krylov Solver 10 10 32264320 0 Preconditioner 10 10 9088 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.19209e-07 Average time for MPI_Barrier(): 1.20163e-05 Average time for zero size MPI_Send(): 2.49594e-06 ...................................... ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Oct 5 18:00:44 2011 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Oct 2011 18:00:44 -0500 Subject: [petsc-users] Memory problem In-Reply-To: References: Message-ID: On Wed, Oct 5, 2011 at 5:56 PM, Rongliang Chen wrote: > Hello Everyone, > > I am testing a non-linear problem using the snessolve(). The degree of > freedoms of my test case is about 1 Million, which means the Jacobian matrix > in the snessolve() is an 1 million by 1 million matrix and it should be a > sparse matrix. > And my question is that in the "-log_summary" output file I find a strange > massage: "Matrix 39 39 18446744074642894848 0". Does > this message mean that the matrix's memory usage is 1.8x10^20? I have no > idea why an one million by one million matrix use so much memory. Is this > possible? The output of the "-log_summary" followed. Thanks. > That is an overflow somewhere. You can probably get the right answer by using -snes_view. I will try and track down this overflow. Matt > Best, > Rongliang > > > ------------------------------------------------------------------------------------------------------------------------- > Starting to load grid... > Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. > Setupping Interpolation matrix...... > Interpolation matrix done......Time spent: 0.405431 > finished. > Grid has 32000 elements, 1096658 degrees of freedom. > Coarse grid has 2000 elements, 70170 degrees of freedom. > [0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom > (including shared points). 
> [0] coarse grid has 2194 degrees of freedom (matrix), 2194 degrees of > freedom (including shared points). > [31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom > (including shared points). > [31] coarse grid has 2250 degrees of freedom (matrix), 2826 degrees of > freedom (including shared points). > Time spend on the load grid and create matrix etc.: 3.577862. > Solving fixed mesh (steady-state problem) > Solving coarse problem...... > 0 SNES norm 3.1224989992e+01, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 1.3987219837e+00, 25 KSP its last norm 2.4915963656e-01. > 2 SNES norm 5.1898321541e-01, 59 KSP its last norm 1.3451744761e-02. > 3 SNES norm 4.0024228221e-02, 56 KSP its last norm 4.9036146089e-03. > 4 SNES norm 6.7641787439e-04, 59 KSP its last norm 3.6925683196e-04. > Coarse solver done...... > Initial value of object function (Energy dissipation) (Coarse): > 38.9341108701 > 0 SNES norm 7.4575110699e+00, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 6.4497565921e-02, 51 KSP its last norm 7.4277453141e-03. > 2 SNES norm 9.2093642958e-04, 90 KSP its last norm 5.4331380112e-05. > 3 SNES norm 8.1283574549e-07, 103 KSP its last norm 7.5974191049e-07. > Initial value of object function (Energy dissipation) (Fine): 42.5134271399 > Solution time of 17.180358 sec. > Fixed mesh (Steady-state) solver done. > Total number of nonlinear iterations = 3 > Total number of linear iterations = 244 > Average number of linear iterations = 81.333336 > Time computing: 17.180358 sec, Time outputting: 0.000000 sec. > Time spent in coarse nonlinear solve: 0.793436 sec, 0.046183 fraction of > total compute time. > Solving Shape Optimization problem (steady-state problem) > Solving coarse problem...... > 0 SNES norm 4.1963166116e+01, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 3.2749386875e+01, 132 KSP its last norm 4.0966334477e-01. > 2 SNES norm 2.2874504408e+01, 130 KSP its last norm 3.2526355310e-01. > 3 SNES norm 1.4327187891e+01, 132 KSP its last norm 2.1213029400e-01. > 4 SNES norm 1.7283643754e+00, 81 KSP its last norm 1.4233338128e-01. > 5 SNES norm 3.6703566918e-01, 133 KSP its last norm 1.6069896349e-02. > 6 SNES norm 3.6554528686e-03, 77 KSP its last norm 3.5379167356e-03. > Coarse solver done...... > Optimized value of object function (Energy dissipation) (Coarse): > 29.9743062939 > The reduction of the energy dissipation (Coarse): 23.012737% > The optimized curve (Coarse): > a = (4.500000, -0.042893, -0.002030, 0.043721, -0.018798, 0.001824) > Solving moving mesh equation...... > KSP norm 2.3040219081e-07, KSP its. 741. Time spent 8.481956 > Moving mesh solver done. > 0 SNES norm 4.7843968670e+02, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 1.0148854085e+02, 49 KSP its last norm 4.7373180511e-01. > 2 SNES norm 1.8312214030e+00, 46 KSP its last norm 1.0133332840e-01. > 3 SNES norm 3.3101970861e-03, 212 KSP its last norm 1.7753271069e-03. > 4 SNES norm 4.9552614008e-06, 249 KSP its last norm 3.2293284103e-06. > Optimized value of object function (Energy dissipation) (Fine): > 33.2754372645 > Solution time of 4053.227456 sec. > Number of unknowns = 1096658 > Parameters: kinematic viscosity = 0.01 > inlet velocity: u = 5, v = 0 > Total number of nonlinear iterations = 4 > Total number of linear iterations = 556 > Average number of linear iterations = 139.000000 > Time computing: 4053.227456 sec, Time outputting: 0.000001 sec. > Time spent in coarse nonlinear solve: 24.239526 sec, 0.005980 fraction of > total compute time. 
> The optimized curve (fine): > a = (4.500000, -0.046468, -0.001963, 0.045736, -0.019141, 0.001789) > The reduction of the energy dissipation (Fine): 21.729582% > Time spend on fixed mesh solving: 17.296872 > Time spend on shape opt. solving: 4053.250126 > Latex command line: > np Newton GMRES Time(Total) Time(Coarse) Ratio > 32 & 4 & 139.00 & 4053.23 & 24.24 & 0.6\% > > Running finished on: Wed Oct 5 11:32:04 2011 > Total running time: 4070.644329 > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r > -fCourier9' to print this document *** > > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > ./joab on a Janus-nod named node1751 with 32 processors, by ronglian Wed > Oct 5 11:32:04 2011 > Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 > > Max Max/Min Avg Total > Time (sec): 4.074e+03 1.00000 4.074e+03 > Objects: 1.011e+03 1.00000 1.011e+03 > Flops: 2.255e+11 2.27275 1.471e+11 4.706e+12 > Flops/sec: 5.535e+07 2.27275 3.609e+07 1.155e+09 > MPI Messages: 1.103e+05 5.41392 3.665e+04 1.173e+06 > MPI Message Lengths: 1.326e+09 2.60531 2.416e+04 2.833e+10 > MPI Reductions: 5.969e+03 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 4.0743e+03 100.0% 4.7058e+12 100.0% 1.173e+06 > 100.0% 2.416e+04 100.0% 5.968e+03 100.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over > all processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) > Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 2493 1.0 1.2225e+0218.4 4.37e+09 1.1 3.9e+05 2.2e+03 > 0.0e+00 2 3 33 3 0 2 3 33 3 0 1084 > MatMultTranspose 6 1.0 3.3590e-02 2.2 7.38e+06 1.1 8.0e+02 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 6727 > MatSolve 2467 1.0 1.1270e+02 1.7 5.95e+10 1.7 0.0e+00 0.0e+00 > 0.0e+00 2 33 0 0 0 2 33 0 0 0 13775 > MatLUFactorSym 4 1.0 3.4774e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 18 1.0 2.0832e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 > 0.0e+00 2 56 0 0 0 2 56 0 0 0 12746 > MatILUFactorSym 1 1.0 8.3280e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 103 1.0 7.6879e+0215.4 0.00e+00 0.0 1.6e+04 6.2e+04 > 1.7e+02 7 0 1 4 3 7 0 1 4 3 0 > MatAssemblyEnd 103 1.0 3.7818e+01 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 > 1.6e+02 1 0 0 0 3 1 0 0 0 3 0 > MatGetRowIJ 5 1.0 4.8716e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 18 1.0 4.3095e+00 2.5 0.00e+00 0.0 1.6e+04 3.5e+05 > 7.4e+01 0 0 1 20 1 0 0 1 20 1 0 > MatGetOrdering 5 1.0 1.4656e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 1 1.0 1.4356e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 42 1.0 2.0939e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 17 1.0 1.2719e-02 6.8 5.47e+05 1.1 0.0e+00 0.0e+00 > 1.7e+01 0 0 0 0 0 0 0 0 0 0 1317 > VecMDot 2425 1.0 1.7196e+01 2.2 5.82e+09 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 4 0 0 41 0 4 0 0 41 10353 > VecNorm 2503 1.0 2.7923e+00 3.4 1.18e+08 1.1 0.0e+00 0.0e+00 > 2.5e+03 0 0 0 0 42 0 0 0 0 42 1293 > VecScale 2467 1.0 7.3112e-02 1.7 5.84e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 24453 > VecCopy 153 1.0 1.1636e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5031 1.0 6.0423e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 137 1.0 1.1462e-02 1.5 6.33e+06 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 16902 > VecWAXPY 19 1.0 1.7784e-03 1.4 2.83e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 4869 > VecMAXPY 2467 1.0 8.5820e+00 1.3 5.93e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 4 0 0 0 0 4 0 0 0 21153 > VecAssemblyBegin 69 1.0 1.0341e+0018.2 0.00e+00 0.0 4.9e+03 5.4e+02 > 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 > VecAssemblyEnd 69 1.0 2.4939e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 7491 1.0 1.3734e+00 1.7 0.00e+00 0.0 1.1e+06 1.9e+04 > 0.0e+00 0 0 96 76 0 0 0 96 76 0 0 > VecScatterEnd 7491 1.0 2.0055e+02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 > VecReduceArith 8 1.0 1.4977e-03 2.0 3.05e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 6232 > VecReduceComm 4 1.0 8.9908e-0412.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2467 
1.0 2.8067e+00 3.4 1.75e+08 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 0 0 0 41 0 0 0 0 41 1905 > SNESSolve 4 1.0 4.0619e+03 1.0 2.23e+11 2.3 9.4e+05 2.3e+04 > 4.1e+03100 98 80 77 68 100 98 80 77 68 1136 > SNESLineSearch 17 1.0 1.1423e+01 1.0 5.23e+07 1.1 1.8e+04 1.7e+04 > 3.3e+02 0 0 2 1 6 0 0 2 1 6 140 > SNESFunctionEval 23 1.0 2.9742e+01 1.0 2.60e+07 1.1 1.9e+04 1.9e+04 > 3.5e+02 1 0 2 1 6 1 0 2 1 6 27 > SNESJacobianEval 17 1.0 3.6786e+03 1.0 0.00e+00 0.0 9.8e+03 6.4e+04 > 1.4e+02 90 0 1 2 2 90 0 1 2 2 0 > KSPGMRESOrthog 2425 1.0 2.5150e+01 1.6 1.16e+10 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 8 0 0 41 0 8 0 0 41 14157 > KSPSetup 36 1.0 2.5388e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 18 1.0 3.6141e+02 1.0 2.25e+11 2.3 1.1e+06 2.4e+04 > 5.0e+03 9100 97 96 84 9100 97 96 84 13015 > PCSetUp 36 1.0 2.1635e+02 3.6 1.55e+11 3.2 1.8e+04 3.2e+05 > 1.5e+02 3 56 2 20 3 3 56 2 20 3 12274 > PCSetUpOnBlocks 18 1.0 2.1293e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 > 2.7e+01 2 56 0 0 0 2 56 0 0 0 12471 > PCApply 2467 1.0 2.5616e+02 2.5 5.95e+10 1.7 7.3e+05 2.8e+04 > 0.0e+00 4 33 62 73 0 4 33 62 73 0 6060 > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 39 39 18446744074642894848 0 > Matrix Partitioning 1 1 640 0 > Index Set 184 184 2589512 0 > IS L to G Mapping 2 2 301720 0 > Vector 729 729 133662888 0 > Vector Scatter 29 29 30508 0 > Application Order 2 2 9335968 0 > SNES 4 4 5088 0 > Krylov Solver 10 10 32264320 0 > Preconditioner 10 10 9088 0 > Viewer 1 0 0 0 > > ======================================================================================================================== > Average time to get PetscTime(): 1.19209e-07 > Average time for MPI_Barrier(): 1.20163e-05 > Average time for zero size MPI_Send(): 2.49594e-06 > ...................................... > ----------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.richard.green at gmail.com Wed Oct 5 21:59:37 2011 From: kevin.richard.green at gmail.com (Kevin Green) Date: Wed, 5 Oct 2011 22:59:37 -0400 Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: <1805540796.70710.1317850510287.JavaMail.root@zimbra.anl.gov> References: <2303DE53-E457-4168-9C56-B7B590676927@mcs.anl.gov> <1805540796.70710.1317850510287.JavaMail.root@zimbra.anl.gov> Message-ID: Jose - Thank you, I'll see if I can get this working. Barry - This seems to be exactly what I'm looking for. Glancing at the documentation briefly, some questions do spring to mind, but I will not ask until I look at some of the examples! Mike - Thanks for the updated link, I didn't even notice that Barry's was for 3.0.0. In the meantime, I'll update to petsc 3.2, and slepc-dev, and get looking at these examples. This isn't at the immediate top of my todo list, but I expect I'll have some detailed questions on DMCOMPOSITE in a week or so. 
Kevin On Wed, Oct 5, 2011 at 5:35 PM, Mike McCourt wrote: > If you're gonna use PETSc 3.2, make sure to check out the updated > documentation: > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/DM/DMCompositeCreate.html > > It has a more accurate list of examples. > > -Mike > > ----- Original Message ----- > From: "Barry Smith" > To: "PETSc users list" > Sent: Wednesday, October 5, 2011 4:29:14 PM > Subject: Re: [petsc-users] Appending to vector / numerical continuation / > slepc > > > Kevin, > > The DMCOMPOSITE is designed exactly for this purpose. See the manual > page > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-3.0.0/docs/manualpages/DA/DMCompositeCreate.html#DMCompositeCreateand examples it links to. Essentially you create a DMCOMPOSITE and then use > DMCompositeAddDM() to put in the DM which will be parallel and > DMCompositeAddArray() for the "k" extra things (like continuation > parameters). After you read the manual pages and look at the examples and > start your code using the DMCOMPOSITE, feel free to ask specific questions > about its usage. > > You definitely should switch to PETSc 3.2 before working because the DM > has been markedly improved in places for this type of thing, > > Barry > > > > On Oct 5, 2011, at 12:46 PM, Kevin Green wrote: > > > Greetings, > > > > I was just wondering what the simplest way to create a new N+k dim where > the first N come from a DA. It seems to me that I would have to go the > round about way of getting the array, then writing that to the first N > components of the new vector... I think there would be a bit of a pain for > the parallel case when doing this though, like in managing the change in the > local sizes when going from N to N+k... perhaps it's not that tricky. Also, > with DAs I don't have to worry about orderings, correct? > > > > Essentially I want to get pseudo-arclength continuation working using the > SNES solver. Another option I'm thinking is that rather than using an > extended vector, I could use a MatShell where the added components are > located within its context, and updated upon matmult...since k is typically > small, this seems reasonable. Do you know of any code/projects that make > use of the SNES module for continuation? Any thoughts on what would be the > better or simpler way of doing this? > > > > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been > updated to work with 3.2 yet, as far as I know. I'm fairly new to > petsc/slepc... so I have to ask, what is the timescale like between the > release of a new petsc, and update of slepc? Or is there a way to get slepc > working with the new release? > > > > Cheers, > > Kevin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Oct 6 10:27:59 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 6 Oct 2011 09:27:59 -0600 Subject: [petsc-users] Memory problem Message-ID: That is an overflow somewhere. You can probably get the right answer by > using -snes_view. I will try and track down this overflow. > > Matt > > > Hi Matt, Thank you for your reply. The -snes_view and -log_summary output is followed. But I did not find any unusual results in the -snes_view output. I have another question. what the number 23 mean in the following -log_summary output: " Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Matrix 23 23 18446744073922240512 0 " Does it mean that I created 23 matrices in my code? But I think I have not created so many matrices. Thanks. Best, Rongliang ------------------------------------------------------------------------------------------------------------ Starting to load grid... Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. Setupping Interpolation matrix...... Interpolation matrix done......Time spent: 0.207468 finished. Grid has 32000 elements, 1096658 degrees of freedom. Coarse grid has 2000 elements, 70170 degrees of freedom. [0] has 9234 degrees of freedom (matrix), 9234 degrees of freedom (including shared points). [0] coarse grid has 484 degrees of freedom (matrix), 484 degrees of freedom (including shared points). [127] has 7876 degrees of freedom (matrix), 9100 degrees of freedom (including shared points). [127] coarse grid has 588 degrees of freedom (matrix), 912 degrees of freedom (including shared points). Time spend on the load grid and create matrix etc.: 3.719866. Solving Shape Optimization problem (steady-state problem) Solving coarse problem...... 0 SNES norm 3.4998054301e+01, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 3.1501179205e+01, 34 KSP its last norm 3.3927102450e-01. 2 SNES norm 2.1246874435e+01, 57 KSP its last norm 3.1177722630e-01. 3 SNES norm 1.7390263296e+01, 141 KSP its last norm 1.9452289323e-01. 4 SNES norm 1.1644760718e+01, 160 KSP its last norm 1.6835316856e-01. 5 SNES norm 1.0601030093e+01, 181 KSP its last norm 1.1003156828e-01. 6 SNES norm 1.0145938759e+00, 126 KSP its last norm 1.0556059459e-01. 7 SNES norm 1.9267547420e-01, 203 KSP its last norm 9.9489004947e-03. 8 SNES norm 1.8901340973e-03, 195 KSP its last norm 1.8359299380e-03. Coarse solver done...... Optimized value of object function (Energy dissipation) (Coarse): 29.9743671231 The reduction of the energy dissipation (Coarse): -inf% The optimized curve (Coarse): a = (4.500000, -0.042961, -0.002068, 0.043750, -0.018783, 0.001816) Solving moving mesh equation...... KSP norm 2.2906632201e-07, KSP its. 741. Time spent 2.772948 Moving mesh solver done. 0 SNES norm 4.7914118974e+02, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 1.0150289152e+02, 63 KSP its last norm 4.6576374323e-01. 2 SNES norm 1.8326417396e+00, 90 KSP its last norm 9.9707541310e-02. 3 SNES norm 3.7711809663e-03, 348 KSP its last norm 1.8059473115e-03. 4 SNES norm 9.7342448527e-06, 484 KSP its last norm 3.6343704479e-06. 
SNES Object: 128 MPI processes type: ls line search variant: SNESLineSearchCubic alpha=1.000000000000e-04, maxstep=1.000000000000e+08, minlambda=1.000000000000e-12 maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-06, absolute=1e-10, solution=1e-08 total number of linear solver iterations=985 total number of function evaluations=5 KSP Object: 128 MPI processes type: gmres GMRES: restart=600, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=3000, initial guess is zero tolerances: relative=0.001, absolute=1e-08, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 128 MPI processes type: asm Additive Schwarz: total subdomain blocks = 128, user-defined overlap Additive Schwarz: restriction/interpolation type - BASIC Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 1e-12 using diagonal shift to prevent zero pivot matrix ordering: qmd factor fill ratio given 5, needed 5.26731 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=25170, cols=25170 package used to perform factorization: petsc total: nonzeros=11090981, allocated nonzeros=11090981 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12872 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=25170, cols=25170 total: nonzeros=2105626, allocated nonzeros=2105626 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 13453 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 128 MPI processes type: mpiaij rows=1096658, cols=1096658 total: nonzeros=94170314, allocated nonzeros=223806957 total number of mallocs used during MatSetValues calls =6185057 not using I-node (on process 0) routines Optimized value of object function (Energy dissipation) (Fine): 33.2754475059 Solution time of 395.289169 sec. Number of unknowns = 1096658 Parameters: kinematic viscosity = 0.01 inlet velocity: u = 5, v = 0 Total number of nonlinear iterations = 4 Total number of linear iterations = 985 Average number of linear iterations = 246.250000 Time computing: 395.289169 sec, Time outputting: 0.000000 sec. Time spent in coarse nonlinear solve: 13.134366 sec, 0.033227 fraction of total compute time. The optimized curve (fine): a = (4.500000, -0.046466, -0.001962, 0.045734, -0.019141, 0.001789) The reduction of the energy dissipation (Fine): -inf% Time spend on fixed mesh solving: 0.013564 Time spend on shape opt. solving: 395.324390 Latex command line: np Newton GMRES Time(Total) Time(Coarse) Ratio 128 & 4 & 246.25 & 395.29 & 13.13 & 3.3\% Running finished on: Wed Oct 5 19:02:01 2011 Total running time: 395.376442 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./joab on a Janus-nod named node0844 with 128 processors, by ronglian Wed Oct 5 19:02:01 2011 Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 Max Max/Min Avg Total Time (sec): 3.991e+02 1.00013 3.991e+02 Objects: 1.066e+03 1.00000 1.066e+03 Flops: 7.938e+10 2.52133 5.615e+10 7.187e+12 Flops/sec: 1.989e+08 2.52100 1.407e+08 1.801e+10 MPI Messages: 2.724e+05 8.91400 6.158e+04 7.883e+06 MPI Message Lengths: 8.340e+08 2.63753 1.025e+04 8.083e+10 MPI Reductions: 6.537e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.9910e+02 100.0% 7.1875e+12 100.0% 7.883e+06 100.0% 1.025e+04 100.0% 6.536e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 2879 1.0 4.3698e+0116.7 1.86e+09 1.3 2.1e+06 1.4e+03 0.0e+00 4 3 27 4 0 4 3 27 4 0 4839 MatMultTranspose 3 1.0 3.0989e-0226.2 9.81e+05 1.2 2.0e+03 7.3e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 3646 MatSolve 2860 1.0 7.2956e+01 2.3 3.95e+10 2.5 0.0e+00 0.0e+00 0.0e+00 14 52 0 0 0 14 52 0 0 0 50895 MatLUFactorSym 2 1.0 1.3847e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 13 1.0 4.9187e+01 5.8 3.56e+10 4.8 0.0e+00 0.0e+00 0.0e+00 5 33 0 0 0 5 33 0 0 0 48174 MatILUFactorSym 1 1.0 3.9380e-03 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 78 1.0 8.7529e+0129.0 0.00e+00 0.0 4.3e+04 5.2e+04 1.3e+02 11 0 1 3 2 11 0 1 3 2 0 MatAssemblyEnd 78 1.0 7.2215e+00 1.0 0.00e+00 0.0 8.5e+03 3.6e+02 1.1e+02 2 0 0 0 2 2 0 0 0 2 0 MatGetRowIJ 3 1.0 4.3902e-02 8.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 13 1.0 1.9239e+00 2.9 0.00e+00 0.0 7.9e+04 1.8e+05 5.1e+01 0 0 1 17 1 0 0 1 17 1 0 MatGetOrdering 3 1.0 4.1121e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 1 1.0 2.2540e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 32 1.0 3.8607e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 3 3.0 1.6980e-0323.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 12 1.0 6.8195e-0317.5 8.36e+04 1.2 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 1451 VecMDot 2823 1.0 4.1334e+01 7.2 3.82e+09 1.2 0.0e+00 0.0e+00 2.8e+03 4 6 0 0 43 4 6 0 0 43 10682 VecNorm 2888 1.0 2.5551e+00 3.1 3.47e+07 1.2 0.0e+00 0.0e+00 2.9e+03 0 0 0 0 44 0 0 0 0 44 1575 VecScale 2860 1.0 2.0850e-02 1.9 1.73e+07 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 96028 VecCopy 117 1.0 2.1448e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 5795 1.0 2.2957e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 116 1.0 1.9181e-03 1.6 1.36e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82341 VecWAXPY 16 1.0 2.6107e-04 1.5 4.61e+04 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 21104 VecMAXPY 2860 1.0 5.2077e+00 1.5 3.85e+09 1.2 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 85546 VecAssemblyBegin 60 1.0 3.4554e-0110.6 0.00e+00 0.0 1.8e+04 3.4e+02 1.8e+02 0 0 0 0 3 0 0 0 0 3 0 VecAssemblyEnd 60 1.0 1.9860e-04 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 8648 1.0 1.1008e+00 3.0 0.00e+00 0.0 7.7e+06 8.4e+03 0.0e+00 0 0 98 80 0 0 0 98 80 0 0 VecScatterEnd 8648 1.0 8.4154e+0135.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 VecReduceArith 4 1.0 2.6989e-04 2.3 4.00e+04 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 17292 VecReduceComm 2 1.0 2.7108e-04 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2860 1.0 2.5307e+00 3.1 5.17e+07 1.2 0.0e+00 0.0e+00 2.8e+03 
0 0 0 0 44 0 0 0 0 44 2370 SNESSolve 2 1.0 3.9251e+02 1.0 7.85e+10 2.6 6.7e+06 9.7e+03 4.7e+03 98 98 84 80 72 98 98 84 80 72 18034 SNESLineSearch 12 1.0 3.0610e+00 1.0 1.26e+07 1.2 6.1e+04 1.1e+04 2.9e+02 1 0 1 1 4 1 0 1 1 4 473 SNESFunctionEval 18 1.0 6.5305e+00 1.0 6.01e+06 1.2 6.2e+04 1.3e+04 2.9e+02 2 0 1 1 4 2 0 1 1 4 106 SNESJacobianEval 12 1.0 2.4492e+02 1.0 0.00e+00 0.0 2.5e+04 5.9e+04 9.0e+01 61 0 0 2 1 61 0 0 2 1 0 KSPGMRESOrthog 2823 1.0 4.6476e+01 4.3 7.64e+09 1.2 0.0e+00 0.0e+00 2.8e+03 5 12 0 0 43 5 12 0 0 43 19001 KSPSetup 26 1.0 1.3622e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 13 1.0 1.4371e+02 1.0 7.94e+10 2.5 7.8e+06 1.0e+04 5.8e+03 36100 98 97 88 36100 98 97 88 50001 PCSetUp 26 1.0 5.2544e+01 5.3 3.56e+10 4.8 8.5e+04 1.7e+05 9.8e+01 6 33 1 17 1 6 33 1 17 1 45097 PCSetUpOnBlocks 13 1.0 5.0695e+01 5.7 3.56e+10 4.8 0.0e+00 0.0e+00 1.7e+01 6 33 0 0 0 6 33 0 0 0 46742 PCApply 2860 1.0 1.1268e+02 1.8 3.95e+10 2.5 5.6e+06 1.1e+04 0.0e+00 20 52 71 76 0 20 52 71 76 0 32953 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 23 23 18446744073922240512 0 Matrix Partitioning 1 1 640 0 Index Set 168 168 922496 0 IS L to G Mapping 2 2 78872 0 Vector 828 828 44121632 0 Vector Scatter 23 23 24196 0 Application Order 2 2 9335968 0 SNES 3 2 2544 0 Krylov Solver 7 6 16141840 0 Preconditioner 7 6 5456 0 Viewer 2 1 712 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 2.36034e-05 Average time for zero size MPI_Send(): 2.78279e-06 #PETSc Option Table entries: -coarse_ksp_rtol 1.0e-1 -coarsegrid /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 -ksp_gmres_restart 600 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-3 -ksp_type gmres -log_summary -mat_partitioning_type parmetis -nest_geometric_asm_overlap 4 -nest_ksp_atol 1e-8 -nest_ksp_gmres_restart 800 -nest_ksp_max_it 1000 -nest_ksp_pc_side right -nest_ksp_rtol 1.e-2 -nest_ksp_type gmres -nest_pc_asm_type basic -nest_pc_type asm -nest_snes_atol 1.e-10 -nest_snes_max_it 20 -nest_snes_rtol 1.e-4 -nest_sub_pc_factor_mat_ordering_type qmd -nest_sub_pc_factor_shift_amount 1e-8 -nest_sub_pc_factor_shift_type nonzero -nest_sub_pc_type lu -nested -noboundaryreduce -pc_asm_type basic -pc_type asm -shapebeta 10.0 -snes_atol 1.e-10 -snes_max_it 20 -snes_rtol 1.e-6 -snes_view -sub_pc_factor_mat_ordering_type qmd -sub_pc_factor_shift_amount 1e-8 -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Tue Sep 13 13:28:48 2011 Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 
--known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --with-debugging=0 ----------------------------------------- Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 Machine characteristics: Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 Using PETSc arch: Janus-nodebug ----------------------------------------- Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 6 14:55:46 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Oct 2011 14:55:46 -0500 Subject: [petsc-users] Memory problem In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 10:27 AM, Rongliang Chen wrote: > That is an overflow somewhere. You can probably get the right answer by > >> using -snes_view. I will try and track down this overflow. >> >> Matt >> >> >> > Hi Matt, > > Thank you for your reply. > The -snes_view and -log_summary output is followed. But I did not find any > unusual results in the -snes_view output. > Yes, there was no overflow for individual matrices, so this output is correct. The -log_summary output is for all matrices, and that is the problem. > I have another question. what the number 23 mean in the following > -log_summary output: > > " Object Type Creations Destructions Memory Descendants' > Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 23 23 18446744073922240512 0 " > > Does it mean that I created 23 matrices in my code? But I think I have not > created so many > matrices. Thanks. > You might have created them, but we are counting all matrices, for instance those created for subdomain preconditioning. 
Matt > Best, > Rongliang > > > > ------------------------------------------------------------------------------------------------------------ > Starting to load grid... > Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. > Setupping Interpolation matrix...... > Interpolation matrix done......Time spent: 0.207468 > finished. > Grid has 32000 elements, 1096658 degrees of freedom. > Coarse grid has 2000 elements, 70170 degrees of freedom. > [0] has 9234 degrees of freedom (matrix), 9234 degrees of freedom > (including shared points). > [0] coarse grid has 484 degrees of freedom (matrix), 484 degrees of > freedom (including shared points). > [127] has 7876 degrees of freedom (matrix), 9100 degrees of freedom > (including shared points). > [127] coarse grid has 588 degrees of freedom (matrix), 912 degrees of > freedom (including shared points). > Time spend on the load grid and create matrix etc.: 3.719866. > > Solving Shape Optimization problem (steady-state problem) > Solving coarse problem...... > 0 SNES norm 3.4998054301e+01, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 3.1501179205e+01, 34 KSP its last norm 3.3927102450e-01. > 2 SNES norm 2.1246874435e+01, 57 KSP its last norm 3.1177722630e-01. > 3 SNES norm 1.7390263296e+01, 141 KSP its last norm 1.9452289323e-01. > 4 SNES norm 1.1644760718e+01, 160 KSP its last norm 1.6835316856e-01. > 5 SNES norm 1.0601030093e+01, 181 KSP its last norm 1.1003156828e-01. > 6 SNES norm 1.0145938759e+00, 126 KSP its last norm 1.0556059459e-01. > 7 SNES norm 1.9267547420e-01, 203 KSP its last norm 9.9489004947e-03. > 8 SNES norm 1.8901340973e-03, 195 KSP its last norm 1.8359299380e-03. > Coarse solver done...... > Optimized value of object function (Energy dissipation) (Coarse): > 29.9743671231 > The reduction of the energy dissipation (Coarse): -inf% > The optimized curve (Coarse): > a = (4.500000, -0.042961, -0.002068, 0.043750, -0.018783, 0.001816) > Solving moving mesh equation...... > KSP norm 2.2906632201e-07, KSP its. 741. Time spent 2.772948 > Moving mesh solver done. > 0 SNES norm 4.7914118974e+02, 0 KSP its last norm 0.0000000000e+00. > 1 SNES norm 1.0150289152e+02, 63 KSP its last norm 4.6576374323e-01. > 2 SNES norm 1.8326417396e+00, 90 KSP its last norm 9.9707541310e-02. > 3 SNES norm 3.7711809663e-03, 348 KSP its last norm 1.8059473115e-03. > 4 SNES norm 9.7342448527e-06, 484 KSP its last norm 3.6343704479e-06. 
> SNES Object: 128 MPI processes > type: ls > line search variant: SNESLineSearchCubic > alpha=1.000000000000e-04, maxstep=1.000000000000e+08, > minlambda=1.000000000000e-12 > maximum iterations=20, maximum function evaluations=10000 > tolerances: relative=1e-06, absolute=1e-10, solution=1e-08 > total number of linear solver iterations=985 > total number of function evaluations=5 > KSP Object: 128 MPI processes > type: gmres > GMRES: restart=600, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=3000, initial guess is zero > tolerances: relative=0.001, absolute=1e-08, divergence=10000 > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 128 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 128, user-defined overlap > Additive Schwarz: restriction/interpolation type - BASIC > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 1e-12 > using diagonal shift to prevent zero pivot > matrix ordering: qmd > factor fill ratio given 5, needed 5.26731 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=25170, cols=25170 > package used to perform factorization: petsc > total: nonzeros=11090981, allocated nonzeros=11090981 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 12872 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=25170, cols=25170 > total: nonzeros=2105626, allocated nonzeros=2105626 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 13453 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 128 MPI processes > type: mpiaij > rows=1096658, cols=1096658 > total: nonzeros=94170314, allocated nonzeros=223806957 > total number of mallocs used during MatSetValues calls =6185057 > not using I-node (on process 0) routines > Optimized value of object function (Energy dissipation) (Fine): > 33.2754475059 > Solution time of 395.289169 sec. > > Number of unknowns = 1096658 > Parameters: kinematic viscosity = 0.01 > inlet velocity: u = 5, v = 0 > Total number of nonlinear iterations = 4 > Total number of linear iterations = 985 > Average number of linear iterations = 246.250000 > Time computing: 395.289169 sec, Time outputting: 0.000000 sec. > Time spent in coarse nonlinear solve: 13.134366 sec, 0.033227 fraction of > total compute time. > The optimized curve (fine): > a = (4.500000, -0.046466, -0.001962, 0.045734, -0.019141, 0.001789) > The reduction of the energy dissipation (Fine): -inf% > Time spend on fixed mesh solving: 0.013564 > Time spend on shape opt. solving: 395.324390 > Latex command line: > np Newton GMRES Time(Total) Time(Coarse) Ratio > 128 & 4 & 246.25 & 395.29 & 13.13 & 3.3\% > > Running finished on: Wed Oct 5 19:02:01 2011 > Total running time: 395.376442 > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > ./joab on a Janus-nod named node0844 with 128 processors, by ronglian Wed > Oct 5 19:02:01 2011 > > Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 > > Max Max/Min Avg Total > Time (sec): 3.991e+02 1.00013 3.991e+02 > Objects: 1.066e+03 1.00000 1.066e+03 > Flops: 7.938e+10 2.52133 5.615e+10 7.187e+12 > Flops/sec: 1.989e+08 2.52100 1.407e+08 1.801e+10 > MPI Messages: 2.724e+05 8.91400 6.158e+04 7.883e+06 > MPI Message Lengths: 8.340e+08 2.63753 1.025e+04 8.083e+10 > MPI Reductions: 6.537e+03 1.00000 > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 3.9910e+02 100.0% 7.1875e+12 100.0% 7.883e+06 > 100.0% 1.025e+04 100.0% 6.536e+03 100.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over > all processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) > Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 2879 1.0 4.3698e+0116.7 1.86e+09 1.3 2.1e+06 1.4e+03 > 0.0e+00 4 3 27 4 0 4 3 27 4 0 4839 > MatMultTranspose 3 1.0 3.0989e-0226.2 9.81e+05 1.2 2.0e+03 7.3e+02 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 3646 > MatSolve 2860 1.0 7.2956e+01 2.3 3.95e+10 2.5 0.0e+00 0.0e+00 > 0.0e+00 14 52 0 0 0 14 52 0 0 0 50895 > MatLUFactorSym 2 1.0 1.3847e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 13 1.0 4.9187e+01 5.8 3.56e+10 4.8 0.0e+00 0.0e+00 > 0.0e+00 5 33 0 0 0 5 33 0 0 0 48174 > MatILUFactorSym 1 1.0 3.9380e-03 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 78 1.0 8.7529e+0129.0 0.00e+00 0.0 4.3e+04 5.2e+04 > 1.3e+02 11 0 1 3 2 11 0 1 3 2 0 > MatAssemblyEnd 78 1.0 7.2215e+00 1.0 0.00e+00 0.0 8.5e+03 3.6e+02 > 1.1e+02 2 0 0 0 2 2 0 0 0 2 0 > MatGetRowIJ 3 1.0 4.3902e-02 8.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 13 1.0 1.9239e+00 2.9 0.00e+00 0.0 7.9e+04 1.8e+05 > 5.1e+01 0 0 1 17 1 0 0 1 17 1 0 > MatGetOrdering 3 1.0 4.1121e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 1 1.0 2.2540e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 32 1.0 3.8607e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatView 3 3.0 1.6980e-0323.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 12 1.0 6.8195e-0317.5 8.36e+04 1.2 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 1451 > VecMDot 2823 1.0 4.1334e+01 7.2 3.82e+09 1.2 0.0e+00 0.0e+00 > 2.8e+03 4 6 0 0 43 4 6 0 0 43 10682 > VecNorm 2888 1.0 2.5551e+00 3.1 3.47e+07 1.2 0.0e+00 0.0e+00 > 2.9e+03 0 0 0 0 44 0 0 0 0 44 1575 > VecScale 2860 1.0 2.0850e-02 1.9 1.73e+07 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 96028 > VecCopy 117 1.0 2.1448e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5795 1.0 2.2957e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 116 1.0 1.9181e-03 1.6 1.36e+06 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 82341 > VecWAXPY 16 1.0 2.6107e-04 1.5 4.61e+04 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 21104 > VecMAXPY 2860 1.0 5.2077e+00 1.5 3.85e+09 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 6 0 0 0 1 6 0 0 0 85546 > VecAssemblyBegin 60 1.0 3.4554e-0110.6 0.00e+00 0.0 1.8e+04 3.4e+02 > 1.8e+02 0 0 0 0 3 0 0 0 0 3 0 > VecAssemblyEnd 60 1.0 1.9860e-04 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 8648 1.0 1.1008e+00 3.0 0.00e+00 0.0 7.7e+06 8.4e+03 > 0.0e+00 0 0 98 80 0 0 0 98 80 0 0 > VecScatterEnd 8648 1.0 8.4154e+0135.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 > VecReduceArith 4 1.0 2.6989e-04 2.3 4.00e+04 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 17292 > VecReduceComm 2 
1.0 2.7108e-04 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2860 1.0 2.5307e+00 3.1 5.17e+07 1.2 0.0e+00 0.0e+00 > 2.8e+03 0 0 0 0 44 0 0 0 0 44 2370 > SNESSolve 2 1.0 3.9251e+02 1.0 7.85e+10 2.6 6.7e+06 9.7e+03 > 4.7e+03 98 98 84 80 72 98 98 84 80 72 18034 > SNESLineSearch 12 1.0 3.0610e+00 1.0 1.26e+07 1.2 6.1e+04 1.1e+04 > 2.9e+02 1 0 1 1 4 1 0 1 1 4 473 > SNESFunctionEval 18 1.0 6.5305e+00 1.0 6.01e+06 1.2 6.2e+04 1.3e+04 > 2.9e+02 2 0 1 1 4 2 0 1 1 4 106 > SNESJacobianEval 12 1.0 2.4492e+02 1.0 0.00e+00 0.0 2.5e+04 5.9e+04 > 9.0e+01 61 0 0 2 1 61 0 0 2 1 0 > KSPGMRESOrthog 2823 1.0 4.6476e+01 4.3 7.64e+09 1.2 0.0e+00 0.0e+00 > 2.8e+03 5 12 0 0 43 5 12 0 0 43 19001 > KSPSetup 26 1.0 1.3622e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 13 1.0 1.4371e+02 1.0 7.94e+10 2.5 7.8e+06 1.0e+04 > 5.8e+03 36100 98 97 88 36100 98 97 88 50001 > PCSetUp 26 1.0 5.2544e+01 5.3 3.56e+10 4.8 8.5e+04 1.7e+05 > 9.8e+01 6 33 1 17 1 6 33 1 17 1 45097 > PCSetUpOnBlocks 13 1.0 5.0695e+01 5.7 3.56e+10 4.8 0.0e+00 0.0e+00 > 1.7e+01 6 33 0 0 0 6 33 0 0 0 46742 > PCApply 2860 1.0 1.1268e+02 1.8 3.95e+10 2.5 5.6e+06 1.1e+04 > 0.0e+00 20 52 71 76 0 20 52 71 76 0 32953 > > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 23 23 18446744073922240512 0 > > Matrix Partitioning 1 1 640 0 > Index Set 168 168 922496 0 > IS L to G Mapping 2 2 78872 0 > Vector 828 828 44121632 0 > Vector Scatter 23 23 24196 0 > > Application Order 2 2 9335968 0 > SNES 3 2 2544 0 > Krylov Solver 7 6 16141840 0 > Preconditioner 7 6 5456 0 > Viewer 2 1 712 0 > > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 2.36034e-05 > Average time for zero size MPI_Send(): 2.78279e-06 > #PETSc Option Table entries: > -coarse_ksp_rtol 1.0e-1 > -coarsegrid > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi > -f > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi > -geometric_asm > -geometric_asm_overlap 8 > -inletu 5.0 > -ksp_atol 1e-8 > -ksp_gmres_restart 600 > -ksp_max_it 3000 > -ksp_pc_side right > -ksp_rtol 1.e-3 > -ksp_type gmres > -log_summary > -mat_partitioning_type parmetis > -nest_geometric_asm_overlap 4 > -nest_ksp_atol 1e-8 > -nest_ksp_gmres_restart 800 > -nest_ksp_max_it 1000 > -nest_ksp_pc_side right > -nest_ksp_rtol 1.e-2 > -nest_ksp_type gmres > -nest_pc_asm_type basic > -nest_pc_type asm > -nest_snes_atol 1.e-10 > -nest_snes_max_it 20 > -nest_snes_rtol 1.e-4 > -nest_sub_pc_factor_mat_ordering_type qmd > -nest_sub_pc_factor_shift_amount 1e-8 > -nest_sub_pc_factor_shift_type nonzero > -nest_sub_pc_type lu > -nested > -noboundaryreduce > -pc_asm_type basic > -pc_type asm > -shapebeta 10.0 > -snes_atol 1.e-10 > -snes_max_it 20 > -snes_rtol 1.e-6 > -snes_view > -sub_pc_factor_mat_ordering_type qmd > -sub_pc_factor_shift_amount 1e-8 > -sub_pc_factor_shift_type nonzero > -sub_pc_type lu > -viscosity 0.01 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > 
Configure run at: Tue Sep 13 13:28:48 2011 > Configure options: --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 > --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 > --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 > --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 > --download-scalapack=1 --download-mumps=1 --with-debugging=0 > ----------------------------------------- > Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 > Machine characteristics: > Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga > Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 > Using PETSc arch: Janus-nodebug > ----------------------------------------- > > Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O > ${FOPTFLAGS} ${FFLAGS} > ----------------------------------------- > > Using include paths: > -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include > -I/home/ronglian/soft/petsc-3.2-p1/include > -I/home/ronglian/soft/petsc-3.2-p1/include > -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include > -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include > ----------------------------------------- > > Using C linker: mpicc > Using Fortran linker: mpif90 > Using libraries: > -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib > -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 > -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib > -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 > -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis > -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack > -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib > -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal > -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm > -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal > -lnsl -lutil -lgcc_s -lpthread -ldl > ----------------------------------------- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Oct 6 15:23:55 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 6 Oct 2011 14:23:55 -0600 Subject: [petsc-users] Memory problem Message-ID: > > Yes, there was no overflow for individual matrices, so this output is > correct. The -log_summary output is for all > matrices, and that is the problem. > > > You might have created them, but we are counting all matrices, for instance > those created for > subdomain preconditioning. > > Matt > > > Hi Matt, Do you mean that this is just a bug in the -log_summary output and it will not influence the performance? But it does influence the performance in my test cases. Can you tell me how to debug this? Thank you. 
Best, Rongliang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 6 15:26:18 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Oct 2011 15:26:18 -0500 Subject: [petsc-users] Memory problem In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 3:23 PM, Rongliang Chen wrote: > > >> Yes, there was no overflow for individual matrices, so this output is >> correct. The -log_summary output is for all >> matrices, and that is the problem. >> >> >> You might have created them, but we are counting all matrices, for >> instance >> those created for >> subdomain preconditioning. >> >> Matt >> >> >> > Hi Matt, > > Do you mean that this is just a bug in the -log_summary output and it will > not influence the performance? But it does influence the performance in my > test cases. Can you tell me how to debug this? Thank you. > Yes, it has no influence on performance. If you think it does, send -log_summary output to petsc-maint at mcs.anl.gov Matt > Best, > Rongliang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaonanavril at gmail.com Thu Oct 6 15:36:54 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Thu, 6 Oct 2011 14:36:54 -0600 Subject: [petsc-users] get row or column vector from a matrix Message-ID: Dear all, I created a matrix (MATMPISBAIJ). I want to get the column or row vector after finishing assemble the matrix. Is there a way to do that in the exist PETSC routine? Thanks, Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 6 15:41:33 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Oct 2011 15:41:33 -0500 Subject: [petsc-users] get row or column vector from a matrix In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 3:36 PM, NAN ZHAO wrote: > Dear all, > > I created a matrix (MATMPISBAIJ). I want to get the column or row vector > after finishing assemble the matrix. Is there a way to do that in the exist > PETSC routine? > > Thanks, > Nan > What column vector? Please use matrix notation. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Oct 6 15:42:50 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 6 Oct 2011 15:42:50 -0500 Subject: [petsc-users] get row or column vector from a matrix In-Reply-To: References: Message-ID: NAN : > Dear all, > I created a matrix ( > > MATMPISBAIJ). I want to get the column or row vector after finishing > assemble the matrix. Is there a way to do that in the exist PETSC routine? You can call MatGetRow(). See http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatGetRow.html Hong From lizs at mail.uc.edu Fri Oct 7 01:29:02 2011 From: lizs at mail.uc.edu (Li, Zhisong (lizs)) Date: Fri, 7 Oct 2011 06:29:02 +0000 Subject: [petsc-users] How to use ghost points to store updated data? Message-ID: Hi, Petsc Team, From my knowledge of Petsc, in parallel data structure DA, the ghost points are only used for providing info from the existing stored data. 
They are not used for storing updated data as this should be done by the neighboring local vectors. In my current case, the computation is performed solely in one local vector but the updated data points may be beyond the bound of this local vector (still within its ghost points' range). So how to store these data from ghost points to global vector? "DALocaltoGlobal" seems to ignore all values in ghost points for each local vector. Thank you very much. Regards, Zhisong Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Fri Oct 7 02:58:32 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Fri, 7 Oct 2011 00:58:32 -0700 Subject: [petsc-users] Domain partitioning using DAs Message-ID: Hi guys, I have a simple question. Forgive me if it has been discussed before. How does DACreate3d() partition the domain among processors? Is it based on the minimizing the surface between the processors? Or it is just a simple x-y-z order? Thanks in advance, Best Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 7 04:45:28 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 7 Oct 2011 11:45:28 +0200 Subject: [petsc-users] passing an object to function Message-ID: I need to do something with - say - a Vec in a function but obviously do not want to copy by value. Should the signature be foo(Vec) or foo(Vec&)? I see Vec is a pointer in itself, so Vec should suffice, but in many Petsc functions I see Vec& signatures, so I am a bit confused. Thanks for any clarifications. Dominik From jedbrown at mcs.anl.gov Fri Oct 7 07:17:50 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 7 Oct 2011 07:17:50 -0500 Subject: [petsc-users] passing an object to function In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 04:45, Dominik Szczerba wrote: > I need to do something with - say - a Vec in a function but obviously > do not want to copy by value. Should the signature be foo(Vec) or > foo(Vec&)? > PETSc does not use C++ internally. > I see Vec is a pointer in itself, so Vec should suffice, > but in many Petsc functions I see Vec& signatures, > No, you see some Vec* signatures, not Vec&. Vec* is used when a new Vec is created, if the function needs to modify the pointer, or if the argument is an array of Vecs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Oct 7 07:22:58 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 7 Oct 2011 07:22:58 -0500 Subject: [petsc-users] How to use ghost points to store updated data? In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 01:29, Li, Zhisong (lizs) wrote: > In my current case, the computation is performed solely in one local vector > but the updated data points may be beyond the bound of this local vector > (still within its ghost points' range). So how to store these data from > ghost points to global vector? "DALocaltoGlobal" seems to ignore all values > in ghost points for each local vector. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/DM/DMLocalToGlobalBegin.html with ADD_VALUES -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Fri Oct 7 07:34:39 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 7 Oct 2011 07:34:39 -0500 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 02:58, Mohamad M. Nasr-Azadani wrote: > How does DACreate3d() partition the domain among processors? > Is it based on the minimizing the surface between the processors? Or it is > just a simple x-y-z order? > It tries to produce a "squarish" partition based on divisibility of the number of processes. It will only partition in one direction if you use a prime number of processes, for example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 7 11:09:06 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 7 Oct 2011 18:09:06 +0200 Subject: [petsc-users] passing an object to function In-Reply-To: References: Message-ID: Sorry, I could not conclude: so what signature to pass an object is best/cleanest/most optimal for my C++ code? Or are they all fine, e.g. Vec would pass address by value, Vec& would pass address by reference, Vec* would pass the pointer, none of them copying the heavy data? On Fri, Oct 7, 2011 at 2:17 PM, Jed Brown wrote: > On Fri, Oct 7, 2011 at 04:45, Dominik Szczerba wrote: >> >> I need to do something with - say - a Vec in a function but obviously >> do not want to copy by value. Should the signature be foo(Vec) or >> foo(Vec&)? > > PETSc does not use C++ internally. > >> >> I see Vec is a pointer in itself, so Vec should suffice, >> but in many Petsc functions I see Vec& signatures, > > No, you see some Vec* signatures, not Vec&. Vec* is used when a new Vec is > created, if the function needs to modify the pointer, or if the argument is > an array of Vecs. From jedbrown at mcs.anl.gov Fri Oct 7 11:10:51 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 7 Oct 2011 11:10:51 -0500 Subject: [petsc-users] passing an object to function In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 11:09, Dominik Szczerba wrote: > Sorry, I could not conclude: so what signature to pass an object is > best/cleanest/most optimal for my C++ code? > Prefer Vec > Or are they all fine, e.g. > Vec would pass address by value, Vec& would pass address by reference, > Vec* would pass the pointer, none of them copying the heavy data? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Fri Oct 7 11:58:08 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Fri, 7 Oct 2011 10:58:08 -0600 Subject: [petsc-users] [petsc-maint #89695] Re: Memory problem In-Reply-To: References: Message-ID: Hi Barry, Thank you for your reply. I don't think this problem comes from the matrix assemble. Because the result I showed you in the last email is from a two-level Newton method which means I first solve a coarse problem and use the coarse solution as the fine level problem's initial guess. If I just use the one-level method, there is no such problem. The memory usage in the -log_summary output is correct and time spend on the SNESJacobianEval is also normal I think (see attached) for the one-level method. The strange memory usage just appear in the two-level method. 
The reason that I claim the two-level's computing time is not correct is that I solve the same problem with the same number of processors and the two-level's iteration number of SNES and GMRES is much smaller than the one-level method, but the compute time is opposite (the time spend on the coarse problem is just 25s). From the -log_summary outputs of the two methods I found that the matrix's memory usage is total different. So I think there must be some bugs in my two-level code. But I have no idea how to debug this problem. Best, Rongliang On Fri, Oct 7, 2011 at 10:24 AM, Barry Smith wrote: > > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly > > > On Oct 7, 2011, at 11:22 AM, Rongliang Chen wrote: > > > ------------------------------------------------- > > Joab > > > > Shape Optimization solver > > by Rongliang Chen > > compiled on 15:54:32, Oct 3 2011 > > Running on: Wed Oct 5 10:24:10 2011 > > > > revision $Rev: 157 $ > > ------------------------------------------------- > > Command-line options: -coarse_ksp_rtol 1.0e-1 -coarsegrid > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi > -computeinitialguess -f > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi > -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 > -ksp_gmres_restart 600 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-3 > -ksp_type gmres -log_summary -mat_partitioning_type parmetis > -nest_geometric_asm_overlap 4 -nest_ksp_atol 1e-8 -nest_ksp_gmres_restart > 800 -nest_ksp_max_it 1000 -nest_ksp_pc_side right -nest_ksp_rtol 1.e-2 > -nest_ksp_type gmres -nest_pc_asm_type basic -nest_pc_type asm > -nest_snes_atol 1.e-10 -nest_snes_max_it 20 -nest_snes_rtol 1.e-4 > -nest_sub_pc_factor_mat_ordering_type qmd -nest_sub_pc_factor_shift_amount > 1e-8 -nest_sub_pc_factor_shift_type nonzero -nest_sub_pc_type lu -nested > -noboundaryreduce -pc_asm_type basic -pc_type asm -shapebeta 10.0 -snes_atol > 1.e-10 -snes_max_it 20 -snes_rtol 1.e-6 -sub_pc_f > > actor_mat_ordering_type qmd -sub_pc_factor_shift_amount 1e-8 > -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 > > ------------------------------------------------- > > > > Starting to load grid... > > Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. > > Setupping Interpolation matrix...... > > Interpolation matrix done......Time spent: 0.405431 > > finished. > > Grid has 32000 elements, 1096658 degrees of freedom. > > Coarse grid has 2000 elements, 70170 degrees of freedom. > > [0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom > (including shared points). > > [0] coarse grid has 2194 degrees of freedom (matrix), 2194 degrees of > freedom (including shared points). > > [31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom > (including shared points). > > [31] coarse grid has 2250 degrees of freedom (matrix), 2826 degrees of > freedom (including shared points). > > Time spend on the load grid and create matrix etc.: 3.577862. > > Solving fixed mesh (steady-state problem) > > Solving coarse problem...... > > 0 SNES norm 3.1224989992e+01, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 1.3987219837e+00, 25 KSP its last norm 2.4915963656e-01. > > 2 SNES norm 5.1898321541e-01, 59 KSP its last norm 1.3451744761e-02. > > 3 SNES norm 4.0024228221e-02, 56 KSP its last norm 4.9036146089e-03. > > 4 SNES norm 6.7641787439e-04, 59 KSP its last norm 3.6925683196e-04. > > Coarse solver done...... 
> > Initial value of object function (Energy dissipation) (Coarse): > 38.9341108701 > > 0 SNES norm 7.4575110699e+00, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 6.4497565921e-02, 51 KSP its last norm 7.4277453141e-03. > > 2 SNES norm 9.2093642958e-04, 90 KSP its last norm 5.4331380112e-05. > > 3 SNES norm 8.1283574549e-07, 103 KSP its last norm 7.5974191049e-07. > > Initial value of object function (Energy dissipation) (Fine): > 42.5134271399 > > Solution time of 17.180358 sec. > > Fixed mesh (Steady-state) solver done. > > Total number of nonlinear iterations = 3 > > Total number of linear iterations = 244 > > Average number of linear iterations = 81.333336 > > Time computing: 17.180358 sec, Time outputting: 0.000000 sec. > > Time spent in coarse nonlinear solve: 0.793436 sec, 0.046183 fraction of > total compute time. > > Solving Shape Optimization problem (steady-state problem) > > Solving coarse problem...... > > 0 SNES norm 4.1963166116e+01, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 3.2749386875e+01, 132 KSP its last norm 4.0966334477e-01. > > 2 SNES norm 2.2874504408e+01, 130 KSP its last norm 3.2526355310e-01. > > 3 SNES norm 1.4327187891e+01, 132 KSP its last norm 2.1213029400e-01. > > 4 SNES norm 1.7283643754e+00, 81 KSP its last norm 1.4233338128e-01. > > 5 SNES norm 3.6703566918e-01, 133 KSP its last norm 1.6069896349e-02. > > 6 SNES norm 3.6554528686e-03, 77 KSP its last norm 3.5379167356e-03. > > Coarse solver done...... > > Optimized value of object function (Energy dissipation) (Coarse): > 29.9743062939 > > The reduction of the energy dissipation (Coarse): 23.012737% > > The optimized curve (Coarse): > > a = (4.500000, -0.042893, -0.002030, 0.043721, -0.018798, 0.001824) > > Solving moving mesh equation...... > > KSP norm 2.3040219081e-07, KSP its. 741. Time spent 8.481956 > > Moving mesh solver done. > > 0 SNES norm 4.7843968670e+02, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 1.0148854085e+02, 49 KSP its last norm 4.7373180511e-01. > > 2 SNES norm 1.8312214030e+00, 46 KSP its last norm 1.0133332840e-01. > > 3 SNES norm 3.3101970861e-03, 212 KSP its last norm 1.7753271069e-03. > > 4 SNES norm 4.9552614008e-06, 249 KSP its last norm 3.2293284103e-06. > > Optimized value of object function (Energy dissipation) (Fine): > 33.2754372645 > > Solution time of 4053.227456 sec. > > Number of unknowns = 1096658 > > Parameters: kinematic viscosity = 0.01 > > inlet velocity: u = 5, v = 0 > > Total number of nonlinear iterations = 4 > > Total number of linear iterations = 556 > > Average number of linear iterations = 139.000000 > > Time computing: 4053.227456 sec, Time outputting: 0.000001 sec. > > Time spent in coarse nonlinear solve: 24.239526 sec, 0.005980 fraction of > total compute time. > > The optimized curve (fine): > > a = (4.500000, -0.046468, -0.001963, 0.045736, -0.019141, 0.001789) > > The reduction of the energy dissipation (Fine): 21.729582% > > Time spend on fixed mesh solving: 17.296872 > > Time spend on shape opt. solving: 4053.250126 > > Latex command line: > > np Newton GMRES Time(Total) Time(Coarse) Ratio > > 32 & 4 & 139.00 & 4053.23 & 24.24 & 0.6\% > > > > Running finished on: Wed Oct 5 11:32:04 2011 > > Total running time: 4070.644329 > > > ************************************************************************************************************************ > > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > > > ************************************************************************************************************************ > > > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > > > ./joab on a Janus-nod named node1751 with 32 processors, by ronglian Wed > Oct 5 11:32:04 2011 > > Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 > > > > Max Max/Min Avg Total > > Time (sec): 4.074e+03 1.00000 4.074e+03 > > Objects: 1.011e+03 1.00000 1.011e+03 > > Flops: 2.255e+11 2.27275 1.471e+11 4.706e+12 > > Flops/sec: 5.535e+07 2.27275 3.609e+07 1.155e+09 > > MPI Messages: 1.103e+05 5.41392 3.665e+04 1.173e+06 > > MPI Message Lengths: 1.326e+09 2.60531 2.416e+04 2.833e+10 > > MPI Reductions: 5.969e+03 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length N > --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 4.0743e+03 100.0% 4.7058e+12 100.0% 1.173e+06 > 100.0% 2.416e+04 100.0% 5.968e+03 100.0% > > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message lengths > in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 2493 1.0 1.2225e+0218.4 4.37e+09 1.1 3.9e+05 2.2e+03 > 0.0e+00 2 3 33 3 0 2 3 33 3 0 1084 > > MatMultTranspose 6 1.0 3.3590e-02 2.2 7.38e+06 1.1 8.0e+02 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 6727 > > MatSolve 2467 1.0 1.1270e+02 1.7 5.95e+10 1.7 0.0e+00 0.0e+00 > 0.0e+00 2 33 0 0 0 2 33 0 0 0 13775 > > MatLUFactorSym 4 1.0 3.4774e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 18 1.0 2.0832e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 > 0.0e+00 2 56 0 0 0 2 56 0 0 0 12746 > > MatILUFactorSym 1 1.0 8.3280e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyBegin 103 1.0 7.6879e+0215.4 0.00e+00 0.0 1.6e+04 6.2e+04 > 1.7e+02 7 0 1 4 3 7 0 1 4 3 0 > > MatAssemblyEnd 103 1.0 3.7818e+01 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 > 1.6e+02 1 0 0 0 3 1 0 0 0 3 0 > > MatGetRowIJ 5 1.0 4.8716e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrice 18 1.0 4.3095e+00 2.5 0.00e+00 0.0 1.6e+04 3.5e+05 > 7.4e+01 0 0 1 20 1 0 0 1 20 1 0 > > MatGetOrdering 5 1.0 1.4656e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 1 1.0 1.4356e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatZeroEntries 42 1.0 2.0939e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 17 1.0 1.2719e-02 6.8 5.47e+05 1.1 0.0e+00 0.0e+00 > 1.7e+01 0 0 0 0 0 0 0 0 0 0 1317 > > VecMDot 2425 1.0 1.7196e+01 2.2 5.82e+09 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 4 0 0 41 0 4 0 0 41 10353 > > VecNorm 2503 1.0 2.7923e+00 3.4 1.18e+08 1.1 0.0e+00 0.0e+00 > 2.5e+03 0 0 0 0 42 0 0 0 0 42 1293 > > VecScale 2467 1.0 7.3112e-02 1.7 5.84e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 24453 > > VecCopy 153 1.0 1.1636e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 5031 1.0 6.0423e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 137 1.0 1.1462e-02 1.5 6.33e+06 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 16902 > > VecWAXPY 19 1.0 1.7784e-03 1.4 2.83e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 4869 > > VecMAXPY 2467 1.0 8.5820e+00 1.3 5.93e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 4 0 0 0 0 4 0 0 0 21153 > > VecAssemblyBegin 69 1.0 1.0341e+0018.2 0.00e+00 0.0 4.9e+03 5.4e+02 > 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 > > VecAssemblyEnd 69 1.0 2.4939e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 7491 1.0 1.3734e+00 1.7 0.00e+00 0.0 1.1e+06 1.9e+04 > 0.0e+00 0 0 96 76 0 0 0 96 76 0 0 > > VecScatterEnd 7491 1.0 2.0055e+02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 > > VecReduceArith 8 1.0 1.4977e-03 2.0 3.05e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 6232 > > VecReduceComm 4 1.0 8.9908e-0412.2 
0.00e+00 0.0 0.0e+00 0.0e+00 > 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecNormalize 2467 1.0 2.8067e+00 3.4 1.75e+08 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 0 0 0 41 0 0 0 0 41 1905 > > SNESSolve 4 1.0 4.0619e+03 1.0 2.23e+11 2.3 9.4e+05 2.3e+04 > 4.1e+03100 98 80 77 68 100 98 80 77 68 1136 > > SNESLineSearch 17 1.0 1.1423e+01 1.0 5.23e+07 1.1 1.8e+04 1.7e+04 > 3.3e+02 0 0 2 1 6 0 0 2 1 6 140 > > SNESFunctionEval 23 1.0 2.9742e+01 1.0 2.60e+07 1.1 1.9e+04 1.9e+04 > 3.5e+02 1 0 2 1 6 1 0 2 1 6 27 > > SNESJacobianEval 17 1.0 3.6786e+03 1.0 0.00e+00 0.0 9.8e+03 6.4e+04 > 1.4e+02 90 0 1 2 2 90 0 1 2 2 0 > > KSPGMRESOrthog 2425 1.0 2.5150e+01 1.6 1.16e+10 1.1 0.0e+00 0.0e+00 > 2.4e+03 0 8 0 0 41 0 8 0 0 41 14157 > > KSPSetup 36 1.0 2.5388e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 18 1.0 3.6141e+02 1.0 2.25e+11 2.3 1.1e+06 2.4e+04 > 5.0e+03 9100 97 96 84 9100 97 96 84 13015 > > PCSetUp 36 1.0 2.1635e+02 3.6 1.55e+11 3.2 1.8e+04 3.2e+05 > 1.5e+02 3 56 2 20 3 3 56 2 20 3 12274 > > PCSetUpOnBlocks 18 1.0 2.1293e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 > 2.7e+01 2 56 0 0 0 2 56 0 0 0 12471 > > PCApply 2467 1.0 2.5616e+02 2.5 5.95e+10 1.7 7.3e+05 2.8e+04 > 0.0e+00 4 33 62 73 0 4 33 62 73 0 6060 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 39 39 18446744074642894848 0 > > Matrix Partitioning 1 1 640 0 > > Index Set 184 184 2589512 0 > > IS L to G Mapping 2 2 301720 0 > > Vector 729 729 133662888 0 > > Vector Scatter 29 29 30508 0 > > Application Order 2 2 9335968 0 > > SNES 4 4 5088 0 > > Krylov Solver 10 10 32264320 0 > > Preconditioner 10 10 9088 0 > > Viewer 1 0 0 0 > > > ======================================================================================================================== > > Average time to get PetscTime(): 1.19209e-07 > > Average time for MPI_Barrier(): 1.20163e-05 > > Average time for zero size MPI_Send(): 2.49594e-06 > > #PETSc Option Table entries: > > -coarse_ksp_rtol 1.0e-1 > > -coarsegrid > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi > > -computeinitialguess > > -f > /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi > > -geometric_asm > > -geometric_asm_overlap 8 > > -inletu 5.0 > > -ksp_atol 1e-8 > > -ksp_gmres_restart 600 > > -ksp_max_it 3000 > > -ksp_pc_side right > > -ksp_rtol 1.e-3 > > -ksp_type gmres > > -log_summary > > -mat_partitioning_type parmetis > > -nest_geometric_asm_overlap 4 > > -nest_ksp_atol 1e-8 > > -nest_ksp_gmres_restart 800 > > -nest_ksp_max_it 1000 > > -nest_ksp_pc_side right > > -nest_ksp_rtol 1.e-2 > > -nest_ksp_type gmres > > -nest_pc_asm_type basic > > -nest_pc_type asm > > -nest_snes_atol 1.e-10 > > -nest_snes_max_it 20 > > -nest_snes_rtol 1.e-4 > > -nest_sub_pc_factor_mat_ordering_type qmd > > -nest_sub_pc_factor_shift_amount 1e-8 > > -nest_sub_pc_factor_shift_type nonzero > > -nest_sub_pc_type lu > > -nested > > -noboundaryreduce > > -pc_asm_type basic > > -pc_type asm > > -shapebeta 10.0 > > -snes_atol 1.e-10 > > -snes_max_it 20 > > -snes_rtol 1.e-6 > > -sub_pc_factor_mat_ordering_type qmd > > -sub_pc_factor_shift_amount 1e-8 > > -sub_pc_factor_shift_type nonzero > > -sub_pc_type lu > > -viscosity 0.01 > > #End of PETSc Option Table entries > > Compiled without FORTRAN 
kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > > Configure run at: Tue Sep 13 13:28:48 2011 > > Configure options: --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 > --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 > --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 > --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 > --download-scalapack=1 --download-mumps=1 --with-debugging=0 > > ----------------------------------------- > > Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 > > Machine characteristics: > Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga > > Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 > > Using PETSc arch: Janus-nodebug > > ----------------------------------------- > > > > Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} > > Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O > ${FOPTFLAGS} ${FFLAGS} > > ----------------------------------------- > > > > Using include paths: > -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include > -I/home/ronglian/soft/petsc-3.2-p1/include > -I/home/ronglian/soft/petsc-3.2-p1/include > -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include > -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include > > ----------------------------------------- > > > > Using C linker: mpicc > > Using Fortran linker: mpif90 > > Using libraries: > -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib > -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 > -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib > -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 > -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis > -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack > -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib > -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal > -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm > -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal > -lnsl -lutil -lgcc_s -lpthread -ldl > > ----------------------------------------- > > > >> > >> Yes, it has no influence on performance. If you think it does, send > >> -log_summary output to petsc-maint at mcs.anl.gov > >> > >> Matt > >> > >> > > Hi Matt, > > > > The -log_summary output is attached. I found that the SNESJacobianEval() > > takes 90% of the total time. I think this is abnormal because I use a > hand > > coded Jacobian matrix. The reason, I think, for the 90% of the total time > is > > that the matrix takes too much memory (over 1.8x10^19 bytes) which maybe > > have used the swap. But I do not know why 23 one million by one million > > matrices will use so much memory. Can you tell me how to debug this > problem? > > Thank you. > > > > Best, > > Rongliang > > > > > > Yes, it has no influence on performance. 
If you think it does, send > > -log_summary output to petsc-maint at mcs.anl.gov > > > > Matt > > > > > > Hi Matt, > > > > The -log_summary output is attached. I found that the SNESJacobianEval() > takes 90% of the total time. I think this is abnormal because I use a hand > coded Jacobian matrix. The reason, I think, for the 90% of the total time is > that the matrix takes too much memory (over 1.8x10^19 bytes) which maybe > have used the swap. But I do not know why 23 one million by one million > matrices will use so much memory. Can you tell me how to debug this problem? > Thank you. > > > > Best, > > Rongliang > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ------------------------------------------------- Joab Shape Optimization solver by Rongliang Chen compiled on 15:54:32, Oct 3 2011 Running on: Wed Oct 5 11:23:25 2011 revision $Rev: 157 $ ------------------------------------------------- Command-line options: -computeinitialguess -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 -ksp_gmres_restart 400 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-2 -ksp_type gmres -log_summary -mat_partitioning_type parmetis -pc_asm_type basic -pc_type asm -shapebeta 10.0 -snes_atol 1.e-10 -snes_max_it 20 -snes_rtol 1.e-6 -sub_pc_factor_mat_ordering_type qmd -sub_pc_factor_shift_amount 1e-8 -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 ------------------------------------------------- Starting to load grid... finished. Grid has 32000 elements, 1096658 degrees of freedom. [0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom (including shared points). [31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom (including shared points). Time spend on the load grid and create matrix etc.: 2.524287. Solving fixed mesh (steady-state problem) 0 SNES norm 6.3047601065e+01, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 6.8497829311e-01, 34 KSP its last norm 5.0668501287e-01. 2 SNES norm 1.8277997453e-01, 104 KSP its last norm 5.3649449690e-03. 3 SNES norm 1.2494936037e-02, 93 KSP its last norm 1.4890024153e-03. 4 SNES norm 2.4161921075e-04, 98 KSP its last norm 1.0705394154e-04. 5 SNES norm 2.3507660310e-06, 85 KSP its last norm 2.3033769563e-06. Initial value of object function (Energy dissipation) (Fine): 42.5134315176 Solution time of 22.377218 sec. Fixed mesh (Steady-state) solver done. Total number of nonlinear iterations = 5 Total number of linear iterations = 414 Average number of linear iterations = 82.800003 Time computing: 22.377218 sec, Time outputting: 0.000000 sec. Solving Shape Optimization problem (steady-state problem) 0 SNES norm 1.7510864453e+02, 0 KSP its last norm 0.0000000000e+00. 1 SNES norm 2.3546363669e+01, 188 KSP its last norm 1.6824374180e+00. 2 SNES norm 1.4710489481e+01, 227 KSP its last norm 2.1654000566e-01. 3 SNES norm 5.1619747492e+00, 216 KSP its last norm 1.4024550925e-01. 4 SNES norm 1.6085957859e+00, 228 KSP its last norm 5.0298508735e-02. 5 SNES norm 3.2115315727e-02, 200 KSP its last norm 1.5654715717e-02. 6 SNES norm 2.3318868200e-03, 224 KSP its last norm 3.0282269758e-04. 7 SNES norm 2.2884883607e-05, 139 KSP its last norm 2.2886088166e-05. Optimized value of object function (Energy dissipation) (Fine): 33.2754517498 Solution time of 1769.098900 sec. 
Number of unknowns = 1096658 Parameters: kinematic viscosity = 0.01 inlet velocity: u = 5, v = 0 Total number of nonlinear iterations = 7 Total number of linear iterations = 1422 Average number of linear iterations = 203.142853 Time computing: 1769.098900 sec, Time outputting: 0.000000 sec. The optimized curve (fine): a = (4.500000, -0.046461, -0.001963, 0.045736, -0.019145, 0.001790) The reduction of the energy dissipation (Fine): 21.729556% Time spend on fixed mesh solving: 22.471772 Time spend on shape opt. solving: 1769.109364 Latex command line: np Newton GMRES Time Reduction 32 & 7 & 203.14 & 1769.10 & 21.7\% Running finished on: Wed Oct 5 11:53:19 2011 Total running time: 1791.677926 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./joab on a Janus-nod named node1777 with 32 processors, by ronglian Wed Oct 5 11:53:19 2011 Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 Max Max/Min Avg Total Time (sec): 1.794e+03 1.00000 1.794e+03 Objects: 5.370e+02 1.00000 5.370e+02 Flops: 4.316e+11 2.18377 2.876e+11 9.203e+12 Flops/sec: 2.405e+08 2.18376 1.603e+08 5.129e+09 MPI Messages: 1.024e+05 6.85190 2.949e+04 9.436e+05 MPI Message Lengths: 1.867e+09 2.60703 4.210e+04 3.973e+10 MPI Reductions: 4.270e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.7943e+03 100.0% 9.2028e+12 100.0% 9.436e+05 100.0% 4.210e+04 100.0% 4.269e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 1862 1.0 1.0334e+02 8.2 9.20e+09 1.1 3.2e+05 4.3e+03 0.0e+00 3 3 33 3 0 3 3 33 3 0 2693 MatSolve 1848 1.0 2.6237e+02 1.7 1.41e+11 1.7 0.0e+00 0.0e+00 0.0e+00 12 40 0 0 0 12 40 0 0 0 13897 MatLUFactorSym 2 1.0 1.9804e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 12 1.0 2.4884e+02 2.6 2.69e+11 3.2 0.0e+00 0.0e+00 0.0e+00 9 49 0 0 0 9 49 0 0 0 18224 MatAssemblyBegin 65 1.0 4.1394e+0248.0 0.00e+00 0.0 1.0e+04 1.2e+05 1.0e+02 7 0 1 3 2 7 0 1 3 2 0 MatAssemblyEnd 65 1.0 5.3107e+01 1.0 0.00e+00 0.0 1.2e+03 9.1e+02 8.4e+01 3 0 0 0 2 3 0 0 0 2 0 MatGetRowIJ 2 1.0 2.9041e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 12 1.0 3.6227e+00 1.4 0.00e+00 0.0 9.7e+03 8.2e+05 4.4e+01 0 0 1 20 1 0 0 1 20 1 0 MatGetOrdering 2 1.0 8.1808e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 1 1.0 8.6379e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 26 1.0 2.6638e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 12 1.0 2.6891e-0220.3 8.60e+05 1.1 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 979 VecMDot 1836 1.0 3.4383e+01 2.3 1.20e+10 1.1 0.0e+00 0.0e+00 1.8e+03 1 4 0 0 43 1 4 0 0 43 10662 VecNorm 1872 1.0 5.5265e+00 8.7 1.34e+08 1.1 0.0e+00 0.0e+00 1.9e+03 0 0 0 0 44 0 0 0 0 44 743 VecScale 1848 1.0 6.1965e-02 1.5 6.62e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 32706 VecCopy 86 1.0 1.0849e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3734 1.0 6.4959e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 54 1.0 5.5003e-03 1.6 3.87e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 21533 VecWAXPY 12 1.0 2.8160e-03 1.5 4.30e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4673 VecMAXPY 1848 1.0 1.8090e+01 1.4 1.21e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 20488 VecAssemblyBegin 38 1.0 1.4457e-01 3.2 0.00e+00 0.0 2.6e+03 8.4e+02 1.1e+02 0 0 0 0 3 0 0 0 0 3 0 VecAssemblyEnd 38 1.0 1.3518e-04 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 5592 1.0 1.5577e+00 1.8 0.00e+00 0.0 9.2e+05 3.3e+04 0.0e+00 0 0 97 77 0 0 0 97 77 0 0 VecScatterEnd 5592 1.0 2.6118e+0212.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 VecReduceArith 4 1.0 4.2009e-04 1.1 2.87e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20884 VecReduceComm 2 1.0 3.0112e-04 9.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 1848 1.0 5.2645e+00 9.2 1.98e+08 1.1 0.0e+00 0.0e+00 1.8e+03 0 0 0 0 43 0 0 0 0 43 1150 SNESSolve 2 1.0 1.7915e+03 1.0 4.32e+11 2.2 9.4e+05 4.2e+04 4.1e+03100100100100 96 100100100100 96 5137 SNESLineSearch 12 1.0 1.1234e+01 1.0 8.30e+07 1.1 1.1e+04 3.2e+04 2.1e+02 1 0 1 1 5 1 0 1 1 5 224 SNESFunctionEval 14 1.0 2.7475e+01 1.0 3.73e+07 1.1 1.2e+04 
3.6e+04 2.1e+02 2 0 1 1 5 2 0 1 1 5 41 SNESJacobianEval 12 1.0 1.1870e+03 1.0 0.00e+00 0.0 6.6e+03 1.2e+05 8.8e+01 66 0 1 2 2 66 0 1 2 2 0 KSPGMRESOrthog 1836 1.0 5.0183e+01 1.7 2.40e+10 1.1 0.0e+00 0.0e+00 1.8e+03 2 8 0 0 43 2 8 0 0 43 14611 KSPSetup 24 1.0 4.9849e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 12 1.0 5.7689e+02 1.0 4.31e+11 2.2 9.2e+05 4.2e+04 3.7e+03 32100 98 97 88 32100 98 97 88 15948 PCSetUp 24 1.0 2.5429e+02 2.5 2.69e+11 3.2 1.0e+04 7.7e+05 7.6e+01 9 49 1 20 2 9 49 1 20 2 17833 PCSetUpOnBlocks 12 1.0 2.5128e+02 2.6 2.69e+11 3.2 0.0e+00 0.0e+00 1.2e+01 9 49 0 0 0 9 49 0 0 0 18047 PCApply 1848 1.0 4.1893e+02 1.9 1.41e+11 1.7 6.0e+05 4.9e+04 0.0e+00 18 40 63 73 0 18 40 63 73 0 8703 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 17 17 857526496 0 Matrix Partitioning 1 1 640 0 Index Set 91 91 2079184 0 IS L to G Mapping 1 1 283604 0 Vector 402 402 111977984 0 Vector Scatter 13 13 13676 0 Application Order 1 1 8773904 0 SNES 2 2 2544 0 Krylov Solver 4 4 5192080 0 Preconditioner 4 4 3632 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 9.58443e-06 Average time for zero size MPI_Send(): 2.94298e-06 #PETSc Option Table entries: -computeinitialguess -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 -ksp_gmres_restart 400 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-2 -ksp_type gmres -log_summary -mat_partitioning_type parmetis -pc_asm_type basic -pc_type asm -shapebeta 10.0 -snes_atol 1.e-10 -snes_max_it 20 -snes_rtol 1.e-6 -sub_pc_factor_mat_ordering_type qmd -sub_pc_factor_shift_amount 1e-8 -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Tue Sep 13 13:28:48 2011 Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --with-debugging=0 ----------------------------------------- Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 Machine characteristics: Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 Using PETSc arch: Janus-nodebug ----------------------------------------- Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: 
mpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl ----------------------------------------- From bsmith at mcs.anl.gov Fri Oct 7 12:07:59 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 7 Oct 2011 12:07:59 -0500 Subject: [petsc-users] [petsc-maint #89695] Re: Memory problem In-Reply-To: References: Message-ID: <5F5DF18A-6971-4206-BD42-7A7AEB6C39E3@mcs.anl.gov> On Oct 7, 2011, at 11:58 AM, Rongliang Chen wrote: > Hi Barry, > > Thank you for your reply. > I don't think this problem comes from the matrix assemble. Because the result I showed you in the last email is from a two-level Newton method which means I first solve a coarse problem and use the coarse solution as the fine level problem's initial guess. If I just use the one-level method, there is no such problem. The memory usage in the -log_summary output is correct and time spend on the SNESJacobianEval is also normal I think (see attached) for the one-level method. The strange memory usage just appear in the two-level method. The reason that I claim the two-level's computing time is not correct is that I solve the same problem with the same number of processors and the two-level's iteration number of SNES and GMRES is much smaller than the one-level method, but the compute time is opposite (the time spend on the coarse problem is just 25s). From the -log_summary outputs of the two methods I found that the matrix's memory usage is total different. So I think there must be some bugs in my two-level code. But I have no idea how to debug this problem. It is not normal that 90% of the time is spent in computing the Jacobian. Usually (always) this is a sign that the memory is not suitably preallocated. Perhaps it is not suitable preallocated for the coarse problem (or the fine problem). 
barry > > Best, > Rongliang > > On Fri, Oct 7, 2011 at 10:24 AM, Barry Smith wrote: > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly > > > On Oct 7, 2011, at 11:22 AM, Rongliang Chen wrote: > > > ------------------------------------------------- > > Joab > > > > Shape Optimization solver > > by Rongliang Chen > > compiled on 15:54:32, Oct 3 2011 > > Running on: Wed Oct 5 10:24:10 2011 > > > > revision $Rev: 157 $ > > ------------------------------------------------- > > Command-line options: -coarse_ksp_rtol 1.0e-1 -coarsegrid /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi -computeinitialguess -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 -ksp_gmres_restart 600 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-3 -ksp_type gmres -log_summary -mat_partitioning_type parmetis -nest_geometric_asm_overlap 4 -nest_ksp_atol 1e-8 -nest_ksp_gmres_restart 800 -nest_ksp_max_it 1000 -nest_ksp_pc_side right -nest_ksp_rtol 1.e-2 -nest_ksp_type gmres -nest_pc_asm_type basic -nest_pc_type asm -nest_snes_atol 1.e-10 -nest_snes_max_it 20 -nest_snes_rtol 1.e-4 -nest_sub_pc_factor_mat_ordering_type qmd -nest_sub_pc_factor_shift_amount 1e-8 -nest_sub_pc_factor_shift_type nonzero -nest_sub_pc_type lu -nested -noboundaryreduce -pc_asm_type basic -pc_type asm -shapebeta 10.0 -snes_atol 1.e-10 -snes_max_it 20 -snes_rtol 1.e-6 -sub_pc_f > > actor_mat_ordering_type qmd -sub_pc_factor_shift_amount 1e-8 -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 > > ------------------------------------------------- > > > > Starting to load grid... > > Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000. > > Setupping Interpolation matrix...... > > Interpolation matrix done......Time spent: 0.405431 > > finished. > > Grid has 32000 elements, 1096658 degrees of freedom. > > Coarse grid has 2000 elements, 70170 degrees of freedom. > > [0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom (including shared points). > > [0] coarse grid has 2194 degrees of freedom (matrix), 2194 degrees of freedom (including shared points). > > [31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom (including shared points). > > [31] coarse grid has 2250 degrees of freedom (matrix), 2826 degrees of freedom (including shared points). > > Time spend on the load grid and create matrix etc.: 3.577862. > > Solving fixed mesh (steady-state problem) > > Solving coarse problem...... > > 0 SNES norm 3.1224989992e+01, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 1.3987219837e+00, 25 KSP its last norm 2.4915963656e-01. > > 2 SNES norm 5.1898321541e-01, 59 KSP its last norm 1.3451744761e-02. > > 3 SNES norm 4.0024228221e-02, 56 KSP its last norm 4.9036146089e-03. > > 4 SNES norm 6.7641787439e-04, 59 KSP its last norm 3.6925683196e-04. > > Coarse solver done...... > > Initial value of object function (Energy dissipation) (Coarse): 38.9341108701 > > 0 SNES norm 7.4575110699e+00, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 6.4497565921e-02, 51 KSP its last norm 7.4277453141e-03. > > 2 SNES norm 9.2093642958e-04, 90 KSP its last norm 5.4331380112e-05. > > 3 SNES norm 8.1283574549e-07, 103 KSP its last norm 7.5974191049e-07. > > Initial value of object function (Energy dissipation) (Fine): 42.5134271399 > > Solution time of 17.180358 sec. > > Fixed mesh (Steady-state) solver done. 
> > Total number of nonlinear iterations = 3 > > Total number of linear iterations = 244 > > Average number of linear iterations = 81.333336 > > Time computing: 17.180358 sec, Time outputting: 0.000000 sec. > > Time spent in coarse nonlinear solve: 0.793436 sec, 0.046183 fraction of total compute time. > > Solving Shape Optimization problem (steady-state problem) > > Solving coarse problem...... > > 0 SNES norm 4.1963166116e+01, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 3.2749386875e+01, 132 KSP its last norm 4.0966334477e-01. > > 2 SNES norm 2.2874504408e+01, 130 KSP its last norm 3.2526355310e-01. > > 3 SNES norm 1.4327187891e+01, 132 KSP its last norm 2.1213029400e-01. > > 4 SNES norm 1.7283643754e+00, 81 KSP its last norm 1.4233338128e-01. > > 5 SNES norm 3.6703566918e-01, 133 KSP its last norm 1.6069896349e-02. > > 6 SNES norm 3.6554528686e-03, 77 KSP its last norm 3.5379167356e-03. > > Coarse solver done...... > > Optimized value of object function (Energy dissipation) (Coarse): 29.9743062939 > > The reduction of the energy dissipation (Coarse): 23.012737% > > The optimized curve (Coarse): > > a = (4.500000, -0.042893, -0.002030, 0.043721, -0.018798, 0.001824) > > Solving moving mesh equation...... > > KSP norm 2.3040219081e-07, KSP its. 741. Time spent 8.481956 > > Moving mesh solver done. > > 0 SNES norm 4.7843968670e+02, 0 KSP its last norm 0.0000000000e+00. > > 1 SNES norm 1.0148854085e+02, 49 KSP its last norm 4.7373180511e-01. > > 2 SNES norm 1.8312214030e+00, 46 KSP its last norm 1.0133332840e-01. > > 3 SNES norm 3.3101970861e-03, 212 KSP its last norm 1.7753271069e-03. > > 4 SNES norm 4.9552614008e-06, 249 KSP its last norm 3.2293284103e-06. > > Optimized value of object function (Energy dissipation) (Fine): 33.2754372645 > > Solution time of 4053.227456 sec. > > Number of unknowns = 1096658 > > Parameters: kinematic viscosity = 0.01 > > inlet velocity: u = 5, v = 0 > > Total number of nonlinear iterations = 4 > > Total number of linear iterations = 556 > > Average number of linear iterations = 139.000000 > > Time computing: 4053.227456 sec, Time outputting: 0.000001 sec. > > Time spent in coarse nonlinear solve: 24.239526 sec, 0.005980 fraction of total compute time. > > The optimized curve (fine): > > a = (4.500000, -0.046468, -0.001963, 0.045736, -0.019141, 0.001789) > > The reduction of the energy dissipation (Fine): 21.729582% > > Time spend on fixed mesh solving: 17.296872 > > Time spend on shape opt. solving: 4053.250126 > > Latex command line: > > np Newton GMRES Time(Total) Time(Coarse) Ratio > > 32 & 4 & 139.00 & 4053.23 & 24.24 & 0.6\% > > > > Running finished on: Wed Oct 5 11:32:04 2011 > > Total running time: 4070.644329 > > ************************************************************************************************************************ > > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** > > ************************************************************************************************************************ > > > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > > > ./joab on a Janus-nod named node1751 with 32 processors, by ronglian Wed Oct 5 11:32:04 2011 > > Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 > > > > Max Max/Min Avg Total > > Time (sec): 4.074e+03 1.00000 4.074e+03 > > Objects: 1.011e+03 1.00000 1.011e+03 > > Flops: 2.255e+11 2.27275 1.471e+11 4.706e+12 > > Flops/sec: 5.535e+07 2.27275 3.609e+07 1.155e+09 > > MPI Messages: 1.103e+05 5.41392 3.665e+04 1.173e+06 > > MPI Message Lengths: 1.326e+09 2.60531 2.416e+04 2.833e+10 > > MPI Reductions: 5.969e+03 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N --> 2N flops > > and VecAXPY() for complex vectors of length N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > > 0: Main Stage: 4.0743e+03 100.0% 4.7058e+12 100.0% 1.173e+06 100.0% 2.416e+04 100.0% 5.968e+03 100.0% > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this phase > > %M - percent messages in this phase %L - percent message lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 2493 1.0 1.2225e+0218.4 4.37e+09 1.1 3.9e+05 2.2e+03 0.0e+00 2 3 33 3 0 2 3 33 3 0 1084 > > MatMultTranspose 6 1.0 3.3590e-02 2.2 7.38e+06 1.1 8.0e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 6727 > > MatSolve 2467 1.0 1.1270e+02 1.7 5.95e+10 1.7 0.0e+00 0.0e+00 0.0e+00 2 33 0 0 0 2 33 0 0 0 13775 > > MatLUFactorSym 4 1.0 3.4774e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 18 1.0 2.0832e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 0.0e+00 2 56 0 0 0 2 56 0 0 0 12746 > > MatILUFactorSym 1 1.0 8.3280e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyBegin 103 1.0 7.6879e+0215.4 0.00e+00 0.0 1.6e+04 6.2e+04 1.7e+02 7 0 1 4 3 7 0 1 4 3 0 > > MatAssemblyEnd 103 1.0 3.7818e+01 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 1.6e+02 1 0 0 0 3 1 0 0 0 3 0 > > MatGetRowIJ 5 1.0 4.8716e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrice 18 1.0 4.3095e+00 2.5 0.00e+00 0.0 1.6e+04 3.5e+05 7.4e+01 0 0 1 20 1 0 0 1 20 1 0 > > MatGetOrdering 5 1.0 1.4656e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 1 1.0 1.4356e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatZeroEntries 42 1.0 2.0939e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 17 1.0 1.2719e-02 6.8 5.47e+05 1.1 0.0e+00 0.0e+00 1.7e+01 0 0 0 0 0 0 0 0 0 0 1317 > > VecMDot 2425 1.0 1.7196e+01 2.2 5.82e+09 1.1 0.0e+00 0.0e+00 2.4e+03 0 4 0 0 41 0 4 0 0 41 10353 > > VecNorm 2503 1.0 2.7923e+00 3.4 1.18e+08 1.1 0.0e+00 0.0e+00 2.5e+03 0 0 0 0 42 0 0 0 0 42 1293 > > VecScale 2467 1.0 7.3112e-02 1.7 5.84e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 24453 > > VecCopy 153 1.0 1.1636e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 5031 1.0 6.0423e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 137 1.0 1.1462e-02 1.5 6.33e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 16902 > > VecWAXPY 19 1.0 1.7784e-03 1.4 2.83e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4869 > > VecMAXPY 2467 1.0 8.5820e+00 1.3 5.93e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 21153 > > VecAssemblyBegin 69 1.0 1.0341e+0018.2 0.00e+00 0.0 4.9e+03 5.4e+02 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 > > VecAssemblyEnd 69 1.0 2.4939e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 7491 1.0 1.3734e+00 1.7 0.00e+00 0.0 1.1e+06 1.9e+04 0.0e+00 0 0 96 76 0 0 0 96 76 0 0 > > VecScatterEnd 7491 1.0 2.0055e+02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 > > VecReduceArith 8 1.0 1.4977e-03 2.0 3.05e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6232 > > VecReduceComm 4 1.0 8.9908e-0412.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > 
VecNormalize 2467 1.0 2.8067e+00 3.4 1.75e+08 1.1 0.0e+00 0.0e+00 2.4e+03 0 0 0 0 41 0 0 0 0 41 1905 > > SNESSolve 4 1.0 4.0619e+03 1.0 2.23e+11 2.3 9.4e+05 2.3e+04 4.1e+03100 98 80 77 68 100 98 80 77 68 1136 > > SNESLineSearch 17 1.0 1.1423e+01 1.0 5.23e+07 1.1 1.8e+04 1.7e+04 3.3e+02 0 0 2 1 6 0 0 2 1 6 140 > > SNESFunctionEval 23 1.0 2.9742e+01 1.0 2.60e+07 1.1 1.9e+04 1.9e+04 3.5e+02 1 0 2 1 6 1 0 2 1 6 27 > > SNESJacobianEval 17 1.0 3.6786e+03 1.0 0.00e+00 0.0 9.8e+03 6.4e+04 1.4e+02 90 0 1 2 2 90 0 1 2 2 0 > > KSPGMRESOrthog 2425 1.0 2.5150e+01 1.6 1.16e+10 1.1 0.0e+00 0.0e+00 2.4e+03 0 8 0 0 41 0 8 0 0 41 14157 > > KSPSetup 36 1.0 2.5388e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 18 1.0 3.6141e+02 1.0 2.25e+11 2.3 1.1e+06 2.4e+04 5.0e+03 9100 97 96 84 9100 97 96 84 13015 > > PCSetUp 36 1.0 2.1635e+02 3.6 1.55e+11 3.2 1.8e+04 3.2e+05 1.5e+02 3 56 2 20 3 3 56 2 20 3 12274 > > PCSetUpOnBlocks 18 1.0 2.1293e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00 2.7e+01 2 56 0 0 0 2 56 0 0 0 12471 > > PCApply 2467 1.0 2.5616e+02 2.5 5.95e+10 1.7 7.3e+05 2.8e+04 0.0e+00 4 33 62 73 0 4 33 62 73 0 6060 > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 39 39 18446744074642894848 0 > > Matrix Partitioning 1 1 640 0 > > Index Set 184 184 2589512 0 > > IS L to G Mapping 2 2 301720 0 > > Vector 729 729 133662888 0 > > Vector Scatter 29 29 30508 0 > > Application Order 2 2 9335968 0 > > SNES 4 4 5088 0 > > Krylov Solver 10 10 32264320 0 > > Preconditioner 10 10 9088 0 > > Viewer 1 0 0 0 > > ======================================================================================================================== > > Average time to get PetscTime(): 1.19209e-07 > > Average time for MPI_Barrier(): 1.20163e-05 > > Average time for zero size MPI_Send(): 2.49594e-06 > > #PETSc Option Table entries: > > -coarse_ksp_rtol 1.0e-1 > > -coarsegrid /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi > > -computeinitialguess > > -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi > > -geometric_asm > > -geometric_asm_overlap 8 > > -inletu 5.0 > > -ksp_atol 1e-8 > > -ksp_gmres_restart 600 > > -ksp_max_it 3000 > > -ksp_pc_side right > > -ksp_rtol 1.e-3 > > -ksp_type gmres > > -log_summary > > -mat_partitioning_type parmetis > > -nest_geometric_asm_overlap 4 > > -nest_ksp_atol 1e-8 > > -nest_ksp_gmres_restart 800 > > -nest_ksp_max_it 1000 > > -nest_ksp_pc_side right > > -nest_ksp_rtol 1.e-2 > > -nest_ksp_type gmres > > -nest_pc_asm_type basic > > -nest_pc_type asm > > -nest_snes_atol 1.e-10 > > -nest_snes_max_it 20 > > -nest_snes_rtol 1.e-4 > > -nest_sub_pc_factor_mat_ordering_type qmd > > -nest_sub_pc_factor_shift_amount 1e-8 > > -nest_sub_pc_factor_shift_type nonzero > > -nest_sub_pc_type lu > > -nested > > -noboundaryreduce > > -pc_asm_type basic > > -pc_type asm > > -shapebeta 10.0 > > -snes_atol 1.e-10 > > -snes_max_it 20 > > -snes_rtol 1.e-6 > > -sub_pc_factor_mat_ordering_type qmd > > -sub_pc_factor_shift_amount 1e-8 > > -sub_pc_factor_shift_type nonzero > > -sub_pc_type lu > > -viscosity 0.01 > > #End of PETSc Option Table entries > > Compiled without FORTRAN kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 
sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 > > Configure run at: Tue Sep 13 13:28:48 2011 > > Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --with-debugging=0 > > ----------------------------------------- > > Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 > > Machine characteristics: Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga > > Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 > > Using PETSc arch: Janus-nodebug > > ----------------------------------------- > > > > Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} > > Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} > > ----------------------------------------- > > > > Using include paths: -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include > > ----------------------------------------- > > > > Using C linker: mpicc > > Using Fortran linker: mpif90 > > Using libraries: -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl > > ----------------------------------------- > > > >> > >> Yes, it has no influence on performance. If you think it does, send > >> -log_summary output to petsc-maint at mcs.anl.gov > >> > >> Matt > >> > >> > > Hi Matt, > > > > The -log_summary output is attached. I found that the SNESJacobianEval() > > takes 90% of the total time. I think this is abnormal because I use a hand > > coded Jacobian matrix. The reason, I think, for the 90% of the total time is > > that the matrix takes too much memory (over 1.8x10^19 bytes) which maybe > > have used the swap. But I do not know why 23 one million by one million > > matrices will use so much memory. Can you tell me how to debug this problem? > > Thank you. > > > > Best, > > Rongliang > > > > > > Yes, it has no influence on performance. If you think it does, send > > -log_summary output to petsc-maint at mcs.anl.gov > > > > Matt > > > > > > Hi Matt, > > > > The -log_summary output is attached. 
I found that the SNESJacobianEval() takes 90% of the total time. I think this is abnormal because I use a hand coded Jacobian matrix. The reason, I think, for the 90% of the total time is that the matrix takes too much memory (over 1.8x10^19 bytes) which maybe have used the swap. But I do not know why 23 one million by one million matrices will use so much memory. Can you tell me how to debug this problem? Thank you. > > > > Best, > > Rongliang > > > From gshy2014 at gmail.com Fri Oct 7 19:48:11 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Fri, 7 Oct 2011 19:48:11 -0500 Subject: [petsc-users] KSP and VecDestroy Message-ID: Hi, I want to track down an memory problem and have something I don't understand. In the following part, I create four vectors and destroy them. If I don't call KSPSolve, log tells me four Vec are created and four Vec are destroyed which I expect. But if I call KSPSolve, log tells me four Vec are created but only two are destroyed in that Stage? Does KSP refer the two vectors somewhere inside which make VecDestroy cannot destroy them? Does it create memory leak becase of that? Thanks. ierr=PetscLogStagePush(memoryWatch);CHKERRQ(ierr); MatGetVecs(*(pCSolverNeu->pK),&x0,&rhs); VecDuplicate(x0,&(workVec[0]));VecDuplicate(x0,&(workVec[1])); VecSetRandom(x0,randomctx); VecCopy(x0,workVec[0]); MatMult(*(pMGDataVec[0]->pA),x0,rhs); ierr=KSPSolve(ksp,rhs, workVec[0]);CHKERRQ(ierr); ierr=VecDestroy(&x0);CHKERRQ(ierr); ierr=VecDestroy(&rhs);CHKERRQ(ierr); ierr=VecDestroy(&workVec[0]);CHKERRQ(ierr); ierr=VecDestroy(&workVec[1]);CHKERRQ(ierr); ierr=PetscLogStagePop();CHKERRQ(ierr); -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Oct 7 19:49:35 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 7 Oct 2011 19:49:35 -0500 Subject: [petsc-users] KSP and VecDestroy In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 19:48, Shiyuan wrote: > Hi, > I want to track down an memory problem and have something I don't > understand. > In the following part, I create four vectors and destroy them. If I don't > call KSPSolve, log tells me four Vec are created and four Vec are destroyed > which I expect. But if I call KSPSolve, log tells me four Vec are created > but only two are destroyed in that Stage? Does KSP refer the two vectors > somewhere inside which make VecDestroy cannot destroy them? Yes > Does it create memory leak becase of that? No, they are destroyed when you call KSPDestroy(). > Thanks. > > ierr=PetscLogStagePush(memoryWatch);CHKERRQ(ierr); > MatGetVecs(*(pCSolverNeu->pK),&x0,&rhs); > VecDuplicate(x0,&(workVec[0]));VecDuplicate(x0,&(workVec[1])); > VecSetRandom(x0,randomctx); > VecCopy(x0,workVec[0]); > MatMult(*(pMGDataVec[0]->pA),x0,rhs); > > ierr=KSPSolve(ksp,rhs, workVec[0]);CHKERRQ(ierr); > > ierr=VecDestroy(&x0);CHKERRQ(ierr); > ierr=VecDestroy(&rhs);CHKERRQ(ierr); > ierr=VecDestroy(&workVec[0]);CHKERRQ(ierr); > ierr=VecDestroy(&workVec[1]);CHKERRQ(ierr); > ierr=PetscLogStagePop();CHKERRQ(ierr); > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Fri Oct 7 20:01:12 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Fri, 7 Oct 2011 20:01:12 -0500 Subject: [petsc-users] KSP and VecDestroy In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 7:49 PM, Jed Brown wrote: > On Fri, Oct 7, 2011 at 19:48, Shiyuan wrote: > >> Hi, >> I want to track down an memory problem and have something I don't >> understand. 
>> In the following part, I create four vectors and destroy them. If I don't >> call KSPSolve, log tells me four Vec are created and four Vec are destroyed >> which I expect. But if I call KSPSolve, log tells me four Vec are created >> but only two are destroyed in that Stage? Does KSP refer the two vectors >> somewhere inside which make VecDestroy cannot destroy them? > > > Yes > > >> Does it create memory leak becase of that? > > > No, they are destroyed when you call KSPDestroy(). > > >> Thanks. >> >> ierr=PetscLogStagePush(memoryWatch);CHKERRQ(ierr); >> MatGetVecs(*(pCSolverNeu->pK),&x0,&rhs); >> VecDuplicate(x0,&(workVec[0]));VecDuplicate(x0,&(workVec[1])); >> VecSetRandom(x0,randomctx); >> VecCopy(x0,workVec[0]); >> MatMult(*(pMGDataVec[0]->pA),x0,rhs); >> >> ierr=KSPSolve(ksp,rhs, workVec[0]);CHKERRQ(ierr); >> >> ierr=VecDestroy(&x0);CHKERRQ(ierr); >> ierr=VecDestroy(&rhs);CHKERRQ(ierr); >> ierr=VecDestroy(&workVec[0]);CHKERRQ(ierr); >> ierr=VecDestroy(&workVec[1]);CHKERRQ(ierr); >> ierr=PetscLogStagePop();CHKERRQ(ierr); >> > > But in the -log_summary, these two Vecs seems not be counted as destroyed, which makes the number of created Vec is two more than the number of destroyed Vec. This makes me think that I forget to free memory. Is it supposed to work that way? Is the memory part in the log meant to be a way to detect the memory leak problem? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 7 21:10:12 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 7 Oct 2011 21:10:12 -0500 Subject: [petsc-users] KSP and VecDestroy In-Reply-To: References: Message-ID: <7FE94F04-5667-438C-962A-1149AF450808@mcs.anl.gov> Did you call KSPDestroy()? References to two of the vectors are retained by the KSP object, only when the KSP object is destroyed are those references released and hence those two vectors destroyed. You can use -malloc -malloc_dump to see all the memory that was allocated by PETSc and not freed. On Oct 7, 2011, at 8:01 PM, Shiyuan wrote: > > > On Fri, Oct 7, 2011 at 7:49 PM, Jed Brown wrote: > On Fri, Oct 7, 2011 at 19:48, Shiyuan wrote: > Hi, > I want to track down an memory problem and have something I don't understand. > In the following part, I create four vectors and destroy them. If I don't call KSPSolve, log tells me four Vec are created and four Vec are destroyed which I expect. But if I call KSPSolve, log tells me four Vec are created but only two are destroyed in that Stage? Does KSP refer the two vectors somewhere inside which make VecDestroy cannot destroy them? > > Yes > > Does it create memory leak becase of that? > > No, they are destroyed when you call KSPDestroy(). > > Thanks. > > ierr=PetscLogStagePush(memoryWatch);CHKERRQ(ierr); > MatGetVecs(*(pCSolverNeu->pK),&x0,&rhs); VecDuplicate(x0,&(workVec[0]));VecDuplicate(x0,&(workVec[1])); > VecSetRandom(x0,randomctx); > VecCopy(x0,workVec[0]); > MatMult(*(pMGDataVec[0]->pA),x0,rhs); > > ierr=KSPSolve(ksp,rhs, workVec[0]);CHKERRQ(ierr); > > ierr=VecDestroy(&x0);CHKERRQ(ierr); > ierr=VecDestroy(&rhs);CHKERRQ(ierr); > ierr=VecDestroy(&workVec[0]);CHKERRQ(ierr); > ierr=VecDestroy(&workVec[1]);CHKERRQ(ierr); > ierr=PetscLogStagePop();CHKERRQ(ierr); > > But in the -log_summary, these two Vecs seems not be counted as destroyed, which makes the number of created Vec is two more than the number of destroyed Vec. This makes me think that I forget to free memory. Is it supposed to work that way? 
Is the memory part in the log meant to be a way to detect the memory leak problem? Thanks. From dominik at itis.ethz.ch Sat Oct 8 05:04:58 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 8 Oct 2011 12:04:58 +0200 Subject: [petsc-users] DIstribute a global vector Message-ID: I have my parallel layout and I can easily gather my MPI Vec's components into one sequential Vec on root process using VecScatterCreateToZero. Works perfectly. But now I want the opposite: "VecScatterCreateFromZero". There is no such function, but it well depicts my intention: I know my parallel layout, I have a sequential Vec on root process that I want to be partitioned and distributed to all processes. After studying all VecScatter* functions I remain unsure how to best accomplish it, e.g. VecScatterCreateToAll sounds promising but seems to scatter the whole vector while I need to scatter only relevant processor chunks. I think along these lines: // arrayGlobal is a sequential vector on root with known size and in application ordering. // arrayLocal is a MPI vector with known global and local sizes. IS ix; // Fill ix on each process with global ID's this process owns ierr = VecScatterCreate(arrayGlobal, ix, arrayLocal, PETSC_NULL, &scatter); Is this right or is there a better/more elegant way? Many thanks for any hints, Dominik From dominik at itis.ethz.ch Sat Oct 8 06:48:38 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 8 Oct 2011 13:48:38 +0200 Subject: [petsc-users] passing an object to function In-Reply-To: References: Message-ID: > On Fri, Oct 7, 2011 at 11:09, Dominik Szczerba wrote: >> >> Sorry, I could not conclude: so what signature to pass an object is >> best/cleanest/most optimal for my C++ code? > > Prefer Vec If I do so (instead of Vec& that I started with), I receive segfaults. An example function: foo(const Vec& v1, Vec& v2) { // allocate v2 and fill it it depending on v1) } Calling sequence; Vec v1; // create and fill v1; Vec v2; // creation expected in foo() foo(v1,v2); // Now do VecGetArray on v2 This will work as expected only if the signature for foo is Vec&. When replaced with Vec I get segfaults on VecGetArray trying to access v2 after calling foo, suggesting an attempt to access invalid memory location. Did I miss anything? Many thanks, Dominik From mmnasr at gmail.com Sat Oct 8 07:27:28 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sat, 8 Oct 2011 05:27:28 -0700 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: Thanks Jed. I guess the whole point I am missing is in that "squarish". I, for instance, use two processors for a domain of the global size: nx=50 ny=40 nz=4 To my surprise, the way it was decomposed was in y-direction, i.e. processor 0: 0-49 0-19 0-3 processor 1: 0-49 20-39 0-3 and not x-direction which minimizes the surface area between the two processors. Obviously "2" is a prime number. In that case, is PETSc biased to choose any of the directions? Thanks again Best, Mohamad On Fri, Oct 7, 2011 at 5:34 AM, Jed Brown wrote: > On Fri, Oct 7, 2011 at 02:58, Mohamad M. Nasr-Azadani wrote: > >> How does DACreate3d() partition the domain among processors? >> Is it based on the minimizing the surface between the processors? Or it is >> just a simple x-y-z order? >> > > It tries to produce a "squarish" partition based on divisibility of the > number of processes. It will only partition in one direction if you use a > prime number of processes, for example. 
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 8 08:15:21 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 8 Oct 2011 08:15:21 -0500 Subject: [petsc-users] DIstribute a global vector In-Reply-To: References: Message-ID: On Sat, Oct 8, 2011 at 05:04, Dominik Szczerba wrote: > I have my parallel layout and I can easily gather my MPI Vec's > components into one sequential Vec on root process using > VecScatterCreateToZero. Works perfectly. > > But now I want the opposite: "VecScatterCreateFromZero". There is no > such function, but it well depicts my intention: I know my parallel > layout, I have a sequential Vec on root process that I want to be > partitioned and distributed to all processes. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html then http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Vec/SCATTER_REVERSE.html > After studying all > VecScatter* functions I remain unsure how to best accomplish it, e.g. > VecScatterCreateToAll sounds promising but seems to scatter the whole > vector while I need to scatter only relevant processor chunks. I think > along these lines: > > // arrayGlobal is a sequential vector on root with known size and in > application ordering. > // arrayLocal is a MPI vector with known global and local sizes. > IS ix; > // Fill ix on each process with global ID's this process owns > ierr = VecScatterCreate(arrayGlobal, ix, arrayLocal, PETSC_NULL, &scatter); > > Is this right or is there a better/more elegant way? > > Many thanks for any hints, > Dominik > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Oct 8 08:24:55 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 8 Oct 2011 08:24:55 -0500 Subject: [petsc-users] passing an object to function In-Reply-To: References: Message-ID: On Sat, Oct 8, 2011 at 06:48, Dominik Szczerba wrote: > If I do so (instead of Vec& that I started with), I receive segfaults. > Well now you're doing allocation, so you have to modify the caller's Vec. This is just pass-by-value semantics. > An example function: > > foo(const Vec& v1, Vec& v2) > { > // allocate v2 and fill it it depending on v1) > It would be consistent with PETSc usage to write foo(Vec v1, Vec *v2) but in C++, you can also write foo(Vec v1, Vec &v2) There is no advantage to passing v1 as Vec& (it's slightly more indirection and totally unnecessary). > } > > Calling sequence; > > Vec v1; > // create and fill v1; > Vec v2; // creation expected in foo() > foo(v1,v2); > // Now do VecGetArray on v2 > > This will work as expected only if the signature for foo is Vec&. When > replaced with Vec I get segfaults on VecGetArray trying to access v2 > after calling foo, suggesting an attempt to access invalid memory > location. > That is because C++ has call-by-value semantics, so v2 in the scope above is still undefined when you pass it to VecGetArray. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sat Oct 8 08:32:10 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Oct 2011 08:32:10 -0500 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: The "dividing up" is done in DMSetUp_DA_3D() in src/dm/impls/da/da3.c If you don't like the algorithm PETSc uses to "divide up" you can pass in the partitioning you want with the final arguments to DMDACreate3d() or DMDASetOwnershipRanges(). Barry On Oct 8, 2011, at 7:27 AM, Mohamad M. Nasr-Azadani wrote: > Thanks Jed. I guess the whole point I am missing is in that "squarish". > I, for instance, use two processors for a domain of the global size: > nx=50 > ny=40 > nz=4 > > To my surprise, the way it was decomposed was in y-direction, i.e. > processor 0: > 0-49 > 0-19 > 0-3 > > processor 1: > 0-49 > 20-39 > 0-3 > > and not x-direction which minimizes the surface area between the two processors. > Obviously "2" is a prime number. In that case, is PETSc biased to choose any of the directions? > > Thanks again > Best, > Mohamad > > > On Fri, Oct 7, 2011 at 5:34 AM, Jed Brown wrote: > On Fri, Oct 7, 2011 at 02:58, Mohamad M. Nasr-Azadani wrote: > How does DACreate3d() partition the domain among processors? > Is it based on the minimizing the surface between the processors? Or it is just a simple x-y-z order? > > It tries to produce a "squarish" partition based on divisibility of the number of processes. It will only partition in one direction if you use a prime number of processes, for example. > From jedbrown at mcs.anl.gov Sat Oct 8 08:32:32 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 8 Oct 2011 08:32:32 -0500 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: On Sat, Oct 8, 2011 at 07:27, Mohamad M. Nasr-Azadani wrote: > Thanks Jed. I guess the whole point I am missing is in that "squarish". > I, for instance, use two processors for a domain of the global size: > nx=50 > ny=40 > nz=4 > > To my surprise, the way it was decomposed was in y-direction, i.e. > processor 0: > 0-49 > 0-19 > 0-3 > > processor 1: > 0-49 > 20-39 > 0-3 > > and not x-direction which minimizes the surface area between the two > processors. > Obviously "2" is a prime number. In that case, is PETSc biased to choose > any of the directions? > There is an order to which directions get tried, but it will never pick a split that is "too bad". If you run on four processes, it will split in both the X and Y directions. The complete code is here. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/dm/impls/da/da3.c.html#line231 I'm a little surprised it doesn't find the "best" split in this simple case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Sat Oct 8 12:11:48 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Sat, 8 Oct 2011 12:11:48 -0500 Subject: [petsc-users] KSP and VecDestroy In-Reply-To: <7FE94F04-5667-438C-962A-1149AF450808@mcs.anl.gov> References: <7FE94F04-5667-438C-962A-1149AF450808@mcs.anl.gov> Message-ID: On Fri, Oct 7, 2011 at 9:10 PM, Barry Smith wrote: > > Did you call KSPDestroy()? References to two of the vectors are retained > by the KSP object, only when the KSP object is destroyed are those > references released and hence those two vectors destroyed. > > I did. log_summary shows ksp is destroyed. But I called KSPDestroy() after VecDestroy(). That might cause the problem. Thanks. Shiyuan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Sat Oct 8 12:13:40 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 8 Oct 2011 12:13:40 -0500 Subject: [petsc-users] KSP and VecDestroy In-Reply-To: References: <7FE94F04-5667-438C-962A-1149AF450808@mcs.anl.gov> Message-ID: On Sat, Oct 8, 2011 at 12:11, Shiyuan wrote: > I did. log_summary shows ksp is destroyed. But I called KSPDestroy() after > VecDestroy(). That might cause the problem. Calling KSPDestroy() after VecDestroy() is not a problem. Run with -malloc_dump to see if there is actually a memory leak. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Sun Oct 9 03:19:59 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Sun, 9 Oct 2011 01:19:59 -0700 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: Thanks Barry and Jed. That definitely helps. Best, Mohamad On Sat, Oct 8, 2011 at 6:32 AM, Jed Brown wrote: > On Sat, Oct 8, 2011 at 07:27, Mohamad M. Nasr-Azadani wrote: > >> Thanks Jed. I guess the whole point I am missing is in that "squarish". >> I, for instance, use two processors for a domain of the global size: >> nx=50 >> ny=40 >> nz=4 >> >> To my surprise, the way it was decomposed was in y-direction, i.e. >> processor 0: >> 0-49 >> 0-19 >> 0-3 >> >> processor 1: >> 0-49 >> 20-39 >> 0-3 >> >> and not x-direction which minimizes the surface area between the two >> processors. >> Obviously "2" is a prime number. In that case, is PETSc biased to choose >> any of the directions? >> > > There is an order to which directions get tried, but it will never pick a > split that is "too bad". If you run on four processes, it will split in both > the X and Y directions. The complete code is here. > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/dm/impls/da/da3.c.html#line231 > > I'm a little surprised it doesn't find the "best" split in this simple > case. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Mon Oct 10 08:16:32 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 10 Oct 2011 15:16:32 +0200 Subject: [petsc-users] Do I need to VecRestoreArray after VecGetArray if nothing's modified? Message-ID: If only read from a pointer obtained via VecGetArray, but not modify any values, is VecRestoreArray still mandatory? Thanks, Dominik From jedbrown at mcs.anl.gov Mon Oct 10 08:18:36 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 10 Oct 2011 08:18:36 -0500 Subject: [petsc-users] Do I need to VecRestoreArray after VecGetArray if nothing's modified? In-Reply-To: References: Message-ID: On Mon, Oct 10, 2011 at 08:16, Dominik Szczerba wrote: > If only read from a pointer obtained via VecGetArray, but not modify > any values, is VecRestoreArray still mandatory? > Yes. Note that you can use VecGetArrayRead() if you only want const access. (You still restore, but cached norms remain valid.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Ellis at geosoft.com Mon Oct 10 15:33:12 2011 From: Robert.Ellis at geosoft.com (Robert Ellis) Date: Mon, 10 Oct 2011 20:33:12 +0000 Subject: [petsc-users] VecSetValues Message-ID: <18205E5ECD2A1A4584F2BFC0BCBDE95526E7F8C6@exchange.geosoft.com> Hello All, I sometimes use VecSetValues ... 
PetscErrorCode VecSetValues(Vec x,PetscInt ni,const PetscInt ix[],const PetscScalar y[],InsertMode iora) with PetscInt ix[] increasing uniformly, in steps of 1, from 0 to ni-1, which is also the full index range of Vec x. I do this on every rank, then use ADD_VALUES to generate the full Vec x. Is there a PetSc method to set values without specifying PetscInt ix[] explicitly in this simple case. I'm trying to save the memory overhead of specifying ix[]on every rank. Any guidance will be appreciated. Regards, Rob -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 10 18:24:17 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 10 Oct 2011 18:24:17 -0500 Subject: [petsc-users] VecSetValues In-Reply-To: <18205E5ECD2A1A4584F2BFC0BCBDE95526E7F8C6@exchange.geosoft.com> References: <18205E5ECD2A1A4584F2BFC0BCBDE95526E7F8C6@exchange.geosoft.com> Message-ID: Rob, You can use VecGetArray() and then compute the values directly into the underlying array, this is the fastest way to change the values. If you use VecGetArray() you can only write values for THAT process into the vector and you do not need to call VecAssemblyBegin/End(). Barry On Oct 10, 2011, at 3:33 PM, Robert Ellis wrote: > Hello All, > > I sometimes use VecSetValues ... > > PetscErrorCode VecSetValues(Vec x,PetscInt ni,const PetscInt ix[],const PetscScalar y[],InsertMode iora) > > with PetscInt ix[] increasing uniformly, in steps of 1, from 0 to ni-1, which is also the full index range of Vec x. I do this on every rank, then use ADD_VALUES to generate the full Vec x. > > Is there a PetSc method to set values without specifying PetscInt ix[] explicitly in this simple case. I'm trying to save the memory overhead of specifying ix[]on every rank. > > Any guidance will be appreciated. > > Regards, > Rob From mmnasr at gmail.com Mon Oct 10 19:31:49 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Mon, 10 Oct 2011 17:31:49 -0700 Subject: [petsc-users] Vec I/O using different parallel layout Message-ID: Hi, I was wondering if it would be possible if I write a global vector (associated with a certain 3D distributed array) to file via: ierr = PetscViewerBinaryOpen(PCW1, filename, FILE_MODE_WRITE, &writer); ierr = VecView(vec_data, writer); ierr = PetscViewerDestroy(writer); And then load the data into a global vector which is not created using the same parallel layout? A simple example for this case would be to write the runtime data (parallel vector) to file and then just load the saved vector to do some simple SERIAL post processing. Thanks in advance, Best Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 10 19:36:47 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 10 Oct 2011 19:36:47 -0500 Subject: [petsc-users] Vec I/O using different parallel layout In-Reply-To: References: Message-ID: <63924F0C-2EB9-444A-8DE1-74E1912701E2@mcs.anl.gov> On Oct 10, 2011, at 7:31 PM, Mohamad M. Nasr-Azadani wrote: > Hi, > > I was wondering if it would be possible if I write a global vector (associated with a certain 3D distributed array) to file > via: > > ierr = PetscViewerBinaryOpen(PCW1, filename, FILE_MODE_WRITE, &writer); > ierr = VecView(vec_data, writer); > ierr = PetscViewerDestroy(writer); > > And then load the data into a global vector which is not created using the same parallel layout? 
> A simple example for this case would be to write the runtime data (parallel vector) to file and then just load the saved vector to do some simple SERIAL post processing. YES. The vector is saved to the file in the "natural ordering" that is starting with the logically 0,0,0 coordinate then increasing through the x axis, then the y axis then the z axis. To load back in in parallel you need to pass to VecLoad() a vector obtained with the appropriate DMCreateGlobalVector(). Barry > > Thanks in advance, > Best > Mohamad > > > > From mmnasr at gmail.com Mon Oct 10 19:41:54 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Mon, 10 Oct 2011 17:41:54 -0700 Subject: [petsc-users] Vec I/O using different parallel layout In-Reply-To: <63924F0C-2EB9-444A-8DE1-74E1912701E2@mcs.anl.gov> References: <63924F0C-2EB9-444A-8DE1-74E1912701E2@mcs.anl.gov> Message-ID: Thanks Barry, I am still not 100% sure if I can do this. Say I have save the global vector obtained by a DA (3D) that is shared amongst 16 processors. Can I load that data into a vector obtained from a DA (3D, same size obviously) that is shared on 1 processor? Thanks, Best, Mohamad On Mon, Oct 10, 2011 at 5:36 PM, Barry Smith wrote: > > On Oct 10, 2011, at 7:31 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi > > I was wondering if it would be possible if I write a global vector > (associated with a certain 3D distributed array) to file > > via: > > > > ierr = PetscViewerBinaryOpen(PCW1, filename, FILE_MODE_WRITE, > &writer); > > ierr = VecView(vec_data, writer); > > ierr = PetscViewerDestroy(writer); > > > > And then load the data into a global vector which is not created using > the same parallel layout? > > A simple example for this case would be to write the runtime data > (parallel vector) to file and then just load the saved vector to do some > simple SERIAL post processing. > > YES. > > The vector is saved to the file in the "natural ordering" that is > starting with the logically 0,0,0 coordinate then increasing through the x > axis, then the y axis then the z axis. To load back in in parallel you need > to pass to VecLoad() a vector obtained with the appropriate > DMCreateGlobalVector(). > > > Barry > > > > > Thanks in advance, > > Best > > Mohamad > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 10 19:45:06 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 10 Oct 2011 19:45:06 -0500 Subject: [petsc-users] Vec I/O using different parallel layout In-Reply-To: References: <63924F0C-2EB9-444A-8DE1-74E1912701E2@mcs.anl.gov> Message-ID: <1D221F65-FBC3-4709-B577-AB007A201C38@mcs.anl.gov> On Oct 10, 2011, at 7:41 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Barry, > > I am still not 100% sure if I can do this. > Say I have save the global vector obtained by a DA (3D) that is shared amongst 16 processors. > Can I load that data into a vector obtained from a DA (3D, same size obviously) that is shared on 1 processor? ABSOLUTELY. Or a DA on 2 processes etc. Barry > > Thanks, > Best, > Mohamad > > > On Mon, Oct 10, 2011 at 5:36 PM, Barry Smith wrote: > > On Oct 10, 2011, at 7:31 PM, Mohamad M. 
Nasr-Azadani wrote: > > > Hi > > I was wondering if it would be possible if I write a global vector (associated with a certain 3D distributed array) to file > > via: > > > > ierr = PetscViewerBinaryOpen(PCW1, filename, FILE_MODE_WRITE, &writer); > > ierr = VecView(vec_data, writer); > > ierr = PetscViewerDestroy(writer); > > > > And then load the data into a global vector which is not created using the same parallel layout? > > A simple example for this case would be to write the runtime data (parallel vector) to file and then just load the saved vector to do some simple SERIAL post processing. > > YES. > > The vector is saved to the file in the "natural ordering" that is starting with the logically 0,0,0 coordinate then increasing through the x axis, then the y axis then the z axis. To load back in in parallel you need to pass to VecLoad() a vector obtained with the appropriate DMCreateGlobalVector(). > > > Barry > > > > > Thanks in advance, > > Best > > Mohamad > > > > > > > > > > From mmnasr at gmail.com Mon Oct 10 19:47:05 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Mon, 10 Oct 2011 17:47:05 -0700 Subject: [petsc-users] Vec I/O using different parallel layout In-Reply-To: <1D221F65-FBC3-4709-B577-AB007A201C38@mcs.anl.gov> References: <63924F0C-2EB9-444A-8DE1-74E1912701E2@mcs.anl.gov> <1D221F65-FBC3-4709-B577-AB007A201C38@mcs.anl.gov> Message-ID: Awesome! That is fantastic. :-) Thanks Barry for your prompt response. Mohamad On Mon, Oct 10, 2011 at 5:45 PM, Barry Smith wrote: > > On Oct 10, 2011, at 7:41 PM, Mohamad M. Nasr-Azadani wrote: > > > Thanks Barry, > > > > I am still not 100% sure if I can do this. > > Say I have save the global vector obtained by a DA (3D) that is shared > amongst 16 processors. :. > > Can I load that data into a vector obtained from a DA (3D, same size > obviously) that is shared on 1 processor? > > ABSOLUTELY. Or a DA on 2 processes etc. > > Barry > > > > > Thanks, > > Best, > > Mohamad > > > > > > On Mon, Oct 10, 2011 at 5:36 PM, Barry Smith wrote: > > > > On Oct 10, 2011, at 7:31 PM, Mohamad M. Nasr-Azadani wrote: > > > > > Hi > > > I was wondering if it would be possible if I write a global vector > (associated with a certain 3D distributed array) to file > > > via: > > > > > > ierr = PetscViewerBinaryOpen(PCW1, filename, > FILE_MODE_WRITE, &writer); > > > ierr = VecView(vec_data, writer); > > > ierr = PetscViewerDestroy(writer); > > > > > > And then load the data into a global vector which is not created using > the same parallel layout? > > > A simple example for this case would be to write the runtime data > (parallel vector) to file and then just load the saved vector to do some > simple SERIAL post processing. > > > > YES. > > > > The vector is saved to the file in the "natural ordering" that is > starting with the logically 0,0,0 coordinate then increasing through the x > axis, then the y axis then the z axis. To load back in in parallel you need > to pass to VecLoad() a vector obtained with the appropriate > DMCreateGlobalVector(). > > > > > > Barry > > > > > > > > Thanks in advance, > > > Best > > > Mohamad > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Tue Oct 11 00:16:15 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Tue, 11 Oct 2011 00:16:15 -0500 Subject: [petsc-users] Some questions regarding VecNest type. 
Message-ID: I understand that the Vec/Mat Nest types in the current release of petsc are designed to be similar in philosophy to Dave May's petsc-ext "block" vectors and matrices. I have some existing code based on petsc-ext and am trying to figure out the amount of work involved to make a transition from petsc-ext to pure petsc (>=3.2). Here are a list of my questions after looking at the VecNest type: 1) Does VecDuplicate create a shallow or deep copy ? i.e., are all leaf vectors/matrices recreated based on a recursive duplicate or does it only replicate the higher level structure ? 2) Do I need to restore the references after I get the vectors ? Say VecNestGetSubVecs and something along the lines of VecNestRestoreSubVecs (I dont see such a routine though) ? Or is this unnecessary ? 3) Is there a way to lazily set the reference to one of the vectors that is part of the nest structure ? Or does it always have to be specified during creation of VECNEST ? I see that this is not the case for matrices since there is a MatNestSetSubMats but I don't see something similar for Vec. Can this be handled via VecNestGetSubVecs ? In petsc-ext, there was a function that handled VecBlockSetValue and I guess I am looking for a suitable replacement, if that shows better light on the context. 4) Are there any examples that make use of the VecNest type ? Even a trivial one might help clarify some trouble I'm having in understanding calling sequence. Most of these questions might also be applicable for Mat/PC/.. but I haven't started looking at these objects in detail yet. If there is an overlap in the understanding, hopefully I'll have lot less questions the next time and they should be more specific. Thanks, Vijay From dominik at itis.ethz.ch Tue Oct 11 09:06:15 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 11 Oct 2011 16:06:15 +0200 Subject: [petsc-users] elimination of trivial entries in a matrix Message-ID: My matrix has many trivial entries (A_{ii}=1, A_{i!=j}=0, b_i = b0) that I want to eliminate to reduce the solving time, re-inserting them at the end into the solution vector. Of course I can do it all by creating a second matrix and doing related book-keeping on my own, but was wondering if there is some functionality in Petsc to facilitate it. If so, I would be grateful for just a list of relevant functions to look at. Many thanks Dominik From bsmith at mcs.anl.gov Tue Oct 11 09:31:38 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 09:31:38 -0500 Subject: [petsc-users] elimination of trivial entries in a matrix In-Reply-To: References: Message-ID: Dominik, This is handled by PCREDISTRIBUTE Barry On Oct 11, 2011, at 9:06 AM, Dominik Szczerba wrote: > My matrix has many trivial entries (A_{ii}=1, A_{i!=j}=0, b_i = b0) > that I want to eliminate to reduce the solving time, re-inserting them > at the end into the solution vector. > Of course I can do it all by creating a second matrix and doing > related book-keeping on my own, but was wondering if there is some > functionality in Petsc to facilitate it. > If so, I would be grateful for just a list of relevant functions to look at. > > Many thanks > Dominik From dominik at itis.ethz.ch Tue Oct 11 09:43:19 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 11 Oct 2011 16:43:19 +0200 Subject: [petsc-users] elimination of trivial entries in a matrix In-Reply-To: References: Message-ID: That sounds really great, but I do not seem to find any examples how to use it. 
I would be grateful for a few more pointers. Regards, Dominik On Tue, Oct 11, 2011 at 4:31 PM, Barry Smith wrote: > > ?Dominik, > > ? ? ?This is handled by PCREDISTRIBUTE > > ? Barry > > On Oct 11, 2011, at 9:06 AM, Dominik Szczerba wrote: > >> My matrix has many trivial entries (A_{ii}=1, A_{i!=j}=0, b_i = b0) >> that I want to eliminate to reduce the solving time, re-inserting them >> at the end into the solution vector. >> Of course I can do it all by creating a second matrix and doing >> related book-keeping on my own, but was wondering if there is some >> functionality in Petsc to facilitate it. >> If so, I would be grateful for just a list of relevant functions to look at. >> >> Many thanks >> Dominik > > From behzad.baghapour at gmail.com Tue Oct 11 09:45:56 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 18:15:56 +0330 Subject: [petsc-users] about linear solver convergence Message-ID: Dear all, I set my Jacobian matrix in petsc and test it to ensure for correctness for Jacobian and RHS function. ( also precondition = jacobian ) I have solved my problem before with some other solvers (gmers/ILU(2)) and the convergence has been achieved within 15 - 16 iterations. But in petsc the linear solver (ksp=GMRES) is not converged within this number ( or may be converged in more than 700 iterations!!! ). I know that I should have a mistake but I can't predict it? Please help me what parts of petsc solver I should check? Regards, Behzad -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 11 09:47:12 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 09:47:12 -0500 Subject: [petsc-users] elimination of trivial entries in a matrix In-Reply-To: References: Message-ID: <3315835D-68E8-4ECF-B081-E388992553D2@mcs.anl.gov> -pc_type redistribute -ksp_type preonly -redistribute_ksp_type gmres -redistribute_ksp_monitor -redistribute_pc_type mycoolpreconditioner Barry On Oct 11, 2011, at 9:43 AM, Dominik Szczerba wrote: > That sounds really great, but I do not seem to find any examples how > to use it. I would be grateful for a few more pointers. > > Regards, > Dominik > > On Tue, Oct 11, 2011 at 4:31 PM, Barry Smith wrote: >> >> Dominik, >> >> This is handled by PCREDISTRIBUTE >> >> Barry >> >> On Oct 11, 2011, at 9:06 AM, Dominik Szczerba wrote: >> >>> My matrix has many trivial entries (A_{ii}=1, A_{i!=j}=0, b_i = b0) >>> that I want to eliminate to reduce the solving time, re-inserting them >>> at the end into the solution vector. >>> Of course I can do it all by creating a second matrix and doing >>> related book-keeping on my own, but was wondering if there is some >>> functionality in Petsc to facilitate it. >>> If so, I would be grateful for just a list of relevant functions to look at. 
>>> >>> Many thanks >>> Dominik >> >> From bsmith at mcs.anl.gov Tue Oct 11 09:48:22 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 09:48:22 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: Message-ID: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#kspdiverged On Oct 11, 2011, at 9:45 AM, behzad baghapour wrote: > Dear all, > > I set my Jacobian matrix in petsc and test it to ensure for correctness for Jacobian and RHS function. ( also precondition = jacobian ) > I have solved my problem before with some other solvers (gmers/ILU(2)) and the convergence has been achieved within 15 - 16 iterations. > > But in petsc the linear solver (ksp=GMRES) is not converged within this number ( or may be converged in more than 700 iterations!!! ). I know that I should have a mistake but I can't predict it? > > Please help me what parts of petsc solver I should check? > > Regards, > Behzad > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > From behzad.baghapour at gmail.com Tue Oct 11 10:20:01 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 18:50:01 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: Thanks, I monitored the ksp but I found that " unpreconditioned resid norm" and "true resid norm" are exactly the same. I can't understand while I set Jacobian = Precondition matrix ?!!! In addition, I am solving true Newton iterations and I saw that first iteration is good (as I got from previous linear solvers) and the norm of the matrix is what I expected BUT for the next nonlinear iteration, the linear iterations can not converge to desired tolerance!!! On Tue, Oct 11, 2011 at 6:18 PM, Barry Smith wrote: > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#kspdiverged > > > On Oct 11, 2011, at 9:45 AM, behzad baghapour wrote: > > > Dear all, > > > > I set my Jacobian matrix in petsc and test it to ensure for correctness > for Jacobian and RHS function. ( also precondition = jacobian ) > > I have solved my problem before with some other solvers (gmers/ILU(2)) > and the convergence has been achieved within 15 - 16 iterations. > > > > But in petsc the linear solver (ksp=GMRES) is not converged within this > number ( or may be converged in more than 700 iterations!!! ). I know that I > should have a mistake but I can't predict it? > > > > Please help me what parts of petsc solver I should check? > > > > Regards, > > Behzad > > > > -- > > ================================== > > Behzad Baghapour > > Ph.D. Candidate, Mechecanical Engineering > > University of Tehran, Tehran, Iran > > https://sites.google.com/site/behzadbaghapour > > Fax: 0098-21-88020741 > > ================================== > > > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From behzad.baghapour at gmail.com Tue Oct 11 10:33:42 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 19:03:42 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: What is the difference between "unpreconditioned resid norm" and "true resid norm" in -ksp_monitor? On Tue, Oct 11, 2011 at 6:50 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Thanks, I monitored the ksp but I found that " unpreconditioned resid norm" > and "true resid norm" are exactly the same. I can't understand while I set > Jacobian = Precondition matrix ?!!! > > In addition, I am solving true Newton iterations and I saw that first > iteration is good (as I got from previous linear solvers) and the norm of > the matrix is what I expected BUT for the next nonlinear iteration, the > linear iterations can not converge to desired tolerance!!! > > > > On Tue, Oct 11, 2011 at 6:18 PM, Barry Smith wrote: > >> >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#kspdiverged >> >> >> On Oct 11, 2011, at 9:45 AM, behzad baghapour wrote: >> >> > Dear all, >> > >> > I set my Jacobian matrix in petsc and test it to ensure for correctness >> for Jacobian and RHS function. ( also precondition = jacobian ) >> > I have solved my problem before with some other solvers (gmers/ILU(2)) >> and the convergence has been achieved within 15 - 16 iterations. >> > >> > But in petsc the linear solver (ksp=GMRES) is not converged within this >> number ( or may be converged in more than 700 iterations!!! ). I know that I >> should have a mistake but I can't predict it? >> > >> > Please help me what parts of petsc solver I should check? >> > >> > Regards, >> > Behzad >> > >> > -- >> > ================================== >> > Behzad Baghapour >> > Ph.D. Candidate, Mechecanical Engineering >> > University of Tehran, Tehran, Iran >> > https://sites.google.com/site/behzadbaghapour >> > Fax: 0098-21-88020741 >> > ================================== >> > >> >> > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Oct 11 10:38:53 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 11 Oct 2011 10:38:53 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Tue, Oct 11, 2011 at 10:33, behzad baghapour wrote: > What is the difference between "unpreconditioned resid norm" This is estimated implicitly using the Krylov iteration. It can become inaccurate if the Krylov basis loses orthogonality (an issue of numerical stability). > and "true resid norm" in -ksp_monitor? This builds the residual explicitly. It is expensive with some methods, e.g. GMRES. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From behzad.baghapour at gmail.com Tue Oct 11 10:43:52 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 19:13:52 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: Thanks and what means if they are the same in ksp iterations? On Tue, Oct 11, 2011 at 7:08 PM, Jed Brown wrote: > On Tue, Oct 11, 2011 at 10:33, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> What is the difference between "unpreconditioned resid norm" > > > This is estimated implicitly using the Krylov iteration. It can become > inaccurate if the Krylov basis loses orthogonality (an issue of numerical > stability). > > >> and "true resid norm" in -ksp_monitor? > > > This builds the residual explicitly. It is expensive with some methods, > e.g. GMRES. > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Oct 11 10:52:38 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 11 Oct 2011 10:52:38 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Tue, Oct 11, 2011 at 10:43, behzad baghapour wrote: > Thanks and what means if they are the same in ksp iterations? That the orthogonalization process is stable. That's good. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Tue Oct 11 10:55:16 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 19:25:16 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: OK. Then how I can calculate and monitor the "condition number" in ksp? Thanks again. On Tue, Oct 11, 2011 at 7:22 PM, Jed Brown wrote: > On Tue, Oct 11, 2011 at 10:43, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Thanks and what means if they are the same in ksp iterations? > > > That the orthogonalization process is stable. That's good. > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecoon at lanl.gov Tue Oct 11 10:56:55 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Tue, 11 Oct 2011 09:56:55 -0600 Subject: [petsc-users] Vec I/O using different parallel layout In-Reply-To: References: Message-ID: <1318348615.17763.1.camel@echo.lanl.gov> On Mon, 2011-10-10 at 17:31 -0700, Mohamad M. Nasr-Azadani wrote: > > A simple example for this case would be to write the runtime data > (parallel vector) to file and then just load the saved vector to do > some simple SERIAL post processing. > If this is your actual goal and the post-processing is not too expensive, you may find it easier to use the matlab or numpy/python scripts in $PETSC_DIR/bin/matlab and $PETSC_DIR/bin/pythonscripts respectively. 
Ethan > > Thanks in advance, > Best > Mohamad > > > > > > > > -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From jedbrown at mcs.anl.gov Tue Oct 11 10:58:25 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 11 Oct 2011 10:58:25 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Tue, Oct 11, 2011 at 10:55, behzad baghapour wrote: > OK. Then how I can calculate and monitor the "condition number" in ksp? -ksp_monitor_singular_value -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Tue Oct 11 11:13:09 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 19:43:09 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: Okey. I checked the condition number "max/min". It increases significantly in each ksp iterations!!! Is it correct when the Jacobian matrix is fixed for each ksp iterations or it is due to my fault? On Tue, Oct 11, 2011 at 7:28 PM, Jed Brown wrote: > On Tue, Oct 11, 2011 at 10:55, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> OK. Then how I can calculate and monitor the "condition number" in ksp? > > > -ksp_monitor_singular_value > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Oct 11 11:15:27 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Oct 2011 11:15:27 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Tue, Oct 11, 2011 at 11:13 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Okey. I checked the condition number "max/min". > It increases significantly in each ksp iterations!!! > The approximation is becoming better. The condition number is harder to actually compute than the solution. Bottom line: You have a hard problem. Iterative methods do not work on hard problems without specialized preconditioners. I recommend a literature search on solvers for your equations. Matt > Is it correct when the Jacobian matrix is fixed for each ksp iterations or > it is due to my fault? > > On Tue, Oct 11, 2011 at 7:28 PM, Jed Brown wrote: > >> On Tue, Oct 11, 2011 at 10:55, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> OK. Then how I can calculate and monitor the "condition number" in ksp? >> >> >> -ksp_monitor_singular_value >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ecoon at lanl.gov Tue Oct 11 11:19:45 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Tue, 11 Oct 2011 10:19:45 -0600 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: References: Message-ID: <1318349985.17763.14.camel@echo.lanl.gov> On Sat, 2011-10-08 at 08:32 -0500, Jed Brown wrote: > > I'm a little surprised it doesn't find the "best" split in this simple > case. I swore at this for a while as well back when I implemented the "degenerate" 3D case of P=1. It actually does quite poorly for a few common sizes (8, for example) when there is one "small" dimension that is not the Y-direction. (If the Y-direction is the small one, it does fine as n is calculated first, see line 273 in da3.c.) For instance, if M=N=256, P=1, and size <= 16, it ends up with (1,size,1) for the decomposition. I think it's worth reorganizing that calculation to pick p first, since it would do so much better for when m=n=p=PETSC_DECIDE and P << (M,N), which seems to be a much more common case. It wouldn't help in Mohamad's case, but if he chose to use, say, 4 processors, he'd find that his Y-direction would get split in 4 instead of splitting both X and Y in 2. Ethan -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From bsmith at mcs.anl.gov Tue Oct 11 11:22:51 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 11:22:51 -0500 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: <1318349985.17763.14.camel@echo.lanl.gov> References: <1318349985.17763.14.camel@echo.lanl.gov> Message-ID: <6C39C534-8424-4D0E-ADCD-6C7042A54508@mcs.anl.gov> On Oct 11, 2011, at 11:19 AM, Ethan Coon wrote: > > > On Sat, 2011-10-08 at 08:32 -0500, Jed Brown wrote: > >> >> I'm a little surprised it doesn't find the "best" split in this simple >> case. > > > I swore at this for a while as well back when I implemented the > "degenerate" 3D case of P=1. It actually does quite poorly for a few > common sizes (8, for example) when there is one "small" dimension that > is not the Y-direction. (If the Y-direction is the small one, it does > fine as n is calculated first, see line 273 in da3.c.) For instance, if > M=N=256, P=1, and size <= 16, it ends up with (1,size,1) for the > decomposition. > > I think it's worth reorganizing that calculation to pick p first, since > it would do so much better for when m=n=p=PETSC_DECIDE and P << (M,N), > which seems to be a much more common case. It wouldn't help in > Mohamad's case, but if he chose to use, say, 4 processors, he'd find > that his Y-direction would get split in 4 instead of splitting both X > and Y in 2. > Should the logic be changed to first select the direction to be decided based relative sizes and then decide the second direction instead of having the order of directions figured out hardwired? What would that logic look like? 
Barry > Ethan > > > > > -- > ------------------------------------ > Ethan Coon > Post-Doctoral Researcher > Applied Mathematics - T-5 > Los Alamos National Laboratory > 505-665-8289 > > http://www.ldeo.columbia.edu/~ecoon/ > ------------------------------------ > From ecoon at lanl.gov Tue Oct 11 11:30:25 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Tue, 11 Oct 2011 10:30:25 -0600 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: <6C39C534-8424-4D0E-ADCD-6C7042A54508@mcs.anl.gov> References: <1318349985.17763.14.camel@echo.lanl.gov> <6C39C534-8424-4D0E-ADCD-6C7042A54508@mcs.anl.gov> Message-ID: <1318350625.17763.20.camel@echo.lanl.gov> On Tue, 2011-10-11 at 11:22 -0500, Barry Smith wrote: > On Oct 11, 2011, at 11:19 AM, Ethan Coon wrote: > > > > > > > On Sat, 2011-10-08 at 08:32 -0500, Jed Brown wrote: > > > >> > >> I'm a little surprised it doesn't find the "best" split in this simple > >> case. > > > > > > I swore at this for a while as well back when I implemented the > > "degenerate" 3D case of P=1. It actually does quite poorly for a few > > common sizes (8, for example) when there is one "small" dimension that > > is not the Y-direction. (If the Y-direction is the small one, it does > > fine as n is calculated first, see line 273 in da3.c.) For instance, if > > M=N=256, P=1, and size <= 16, it ends up with (1,size,1) for the > > decomposition. > > > > I think it's worth reorganizing that calculation to pick p first, since > > it would do so much better for when m=n=p=PETSC_DECIDE and P << (M,N), > > which seems to be a much more common case. It wouldn't help in > > Mohamad's case, but if he chose to use, say, 4 processors, he'd find > > that his Y-direction would get split in 4 instead of splitting both X > > and Y in 2. > > > > Should the logic be changed to first select the direction to be decided based relative sizes and then decide the second direction instead of having the order of directions figured out hardwired? > That seems reasonable. In the case where M ~ N ~ P, order doesn't matter. In the case where M << N ~ P, m should be done first (and permutations), which would suggest just doing them from smallest to largest. I haven't considered the case where M ~ N << P. Will do so and submit a patch for this (after I fix the communication bug...) Ethan > What would that logic look like? > > Barry > > > Ethan > > > > > > > > > > -- > > ------------------------------------ > > Ethan Coon > > Post-Doctoral Researcher > > Applied Mathematics - T-5 > > Los Alamos National Laboratory > > 505-665-8289 > > > > http://www.ldeo.columbia.edu/~ecoon/ > > ------------------------------------ > > > -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From jack.poulson at gmail.com Tue Oct 11 11:32:36 2011 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 11 Oct 2011 11:32:36 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Tue, Oct 11, 2011 at 11:15 AM, Matthew Knepley wrote: > On Tue, Oct 11, 2011 at 11:13 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Okey. I checked the condition number "max/min". >> It increases significantly in each ksp iterations!!! >> > > The approximation is becoming better. The condition number is harder to > actually compute than the solution. 
> > Bottom line: You have a hard problem. Iterative methods do not work on hard > problems without specialized > preconditioners. I recommend a literature search on solvers for your > equations. > > >From what he said earlier, > I have solved my problem before with some other solvers (gmers/ILU(2)) and the > convergence has been achieved within 15 - 16 iterations. it sounds like he should try using one of Hypre's ILUs. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/PC/PCHYPRE.html Bezhad, why start with Jacobi if you already knew that GMRES+ILU(2) works? Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 11 11:33:03 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 11:33:03 -0500 Subject: [petsc-users] Domain partitioning using DAs In-Reply-To: <1318350625.17763.20.camel@echo.lanl.gov> References: <1318349985.17763.14.camel@echo.lanl.gov> <6C39C534-8424-4D0E-ADCD-6C7042A54508@mcs.anl.gov> <1318350625.17763.20.camel@echo.lanl.gov> Message-ID: <29E3A958-4F7B-4941-94A8-D85E0F65E30C@mcs.anl.gov> On Oct 11, 2011, at 11:30 AM, Ethan Coon wrote: > On Tue, 2011-10-11 at 11:22 -0500, Barry Smith wrote: >> On Oct 11, 2011, at 11:19 AM, Ethan Coon wrote: >> >>> >>> >>> On Sat, 2011-10-08 at 08:32 -0500, Jed Brown wrote: >>> >>>> >>>> I'm a little surprised it doesn't find the "best" split in this simple >>>> case. >>> >>> >>> I swore at this for a while as well back when I implemented the >>> "degenerate" 3D case of P=1. It actually does quite poorly for a few >>> common sizes (8, for example) when there is one "small" dimension that >>> is not the Y-direction. (If the Y-direction is the small one, it does >>> fine as n is calculated first, see line 273 in da3.c.) For instance, if >>> M=N=256, P=1, and size <= 16, it ends up with (1,size,1) for the >>> decomposition. >>> >>> I think it's worth reorganizing that calculation to pick p first, since >>> it would do so much better for when m=n=p=PETSC_DECIDE and P << (M,N), >>> which seems to be a much more common case. It wouldn't help in >>> Mohamad's case, but if he chose to use, say, 4 processors, he'd find >>> that his Y-direction would get split in 4 instead of splitting both X >>> and Y in 2. >>> >> >> Should the logic be changed to first select the direction to be decided based relative sizes and then decide the second direction instead of having the order of directions figured out hardwired? >> > > That seems reasonable. In the case where M ~ N ~ P, order doesn't > matter. > > In the case where M << N ~ P, m should be done first (and permutations), > which would suggest just doing them from smallest to largest. > > I haven't considered the case where M ~ N << P. Will do so and submit a > patch for this (after I fix the communication bug...) Thanks, remember patches should be done via 3.2 repository, not petsc-dev ssh://petsc at petsc.cs.iit.edu//hg/petsc/releases/petsc-3.2 Barrt > > Ethan > >> What would that logic look like? 
>> >> Barry >> >>> Ethan >>> >>> >>> >>> >>> -- >>> ------------------------------------ >>> Ethan Coon >>> Post-Doctoral Researcher >>> Applied Mathematics - T-5 >>> Los Alamos National Laboratory >>> 505-665-8289 >>> >>> http://www.ldeo.columbia.edu/~ecoon/ >>> ------------------------------------ >>> >> > > -- > ------------------------------------ > Ethan Coon > Post-Doctoral Researcher > Applied Mathematics - T-5 > Los Alamos National Laboratory > 505-665-8289 > > http://www.ldeo.columbia.edu/~ecoon/ > ------------------------------------ > From behzad.baghapour at gmail.com Tue Oct 11 11:37:36 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Tue, 11 Oct 2011 20:07:36 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: Thanks. I should study more. On Tue, Oct 11, 2011 at 7:45 PM, Matthew Knepley wrote: > On Tue, Oct 11, 2011 at 11:13 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Okey. I checked the condition number "max/min". >> It increases significantly in each ksp iterations!!! >> > > The approximation is becoming better. The condition number is harder to > actually compute than the solution. > > Bottom line: You have a hard problem. Iterative methods do not work on hard > problems without specialized > preconditioners. I recommend a literature search on solvers for your > equations. > > Matt > > >> Is it correct when the Jacobian matrix is fixed for each ksp iterations or >> it is due to my fault? >> >> On Tue, Oct 11, 2011 at 7:28 PM, Jed Brown wrote: >> >>> On Tue, Oct 11, 2011 at 10:55, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> OK. Then how I can calculate and monitor the "condition number" in ksp? >>> >>> >>> -ksp_monitor_singular_value >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Oct 11 13:19:00 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 11 Oct 2011 13:19:00 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Tue, Oct 11, 2011 at 00:16, Vijay S. Mahadevan wrote: > I understand that the Vec/Mat Nest types in the current release of > petsc are designed to be similar in philosophy to Dave May's petsc-ext > "block" vectors and matrices. They have a similar data structure, but have an interface that interoperates better, so they should replace petsc-ext. > I have some existing code based on > petsc-ext and am trying to figure out the amount of work involved to > make a transition from petsc-ext to pure petsc (>=3.2). > > Here are a list of my questions after looking at the VecNest type: > > 1) Does VecDuplicate create a shallow or deep copy ? 
i.e., are all > leaf vectors/matrices recreated based on a recursive duplicate or does > it only replicate the higher level structure ? > Deep copy. A shallow copy would not be semantically correct. > 2) Do I need to restore the references after I get the vectors ? Say > VecNestGetSubVecs and something along the lines of > VecNestRestoreSubVecs (I dont see such a routine though) ? Or is this > unnecessary ? > VecNestGetSubVecs() gives you direct access to the data structure. It does not increment a reference and does not need to be restored. You should prefer to use VecGetSubVector() which works for any Vec type and still gives you a cheap (no-copy) reference for VecNest. > 3) Is there a way to lazily set the reference to one of the vectors > that is part of the nest structure ? Or does it always have to be > specified during creation of VECNEST ? I see that this is not the case > for matrices since there is a MatNestSetSubMats but I don't see > something similar for Vec. Can this be handled via VecNestGetSubVecs ? > In petsc-ext, there was a function that handled VecBlockSetValue and I > guess I am looking for a suitable replacement, if that shows better > light on the context. > Even with Mat, you are only supposed to call this once (at setup time). If it doesn't check for this, it should. It could be modified to be safe to call more than once, but is not currently. The single-Mat function MatNestSetSubMat() does exist and is safe to call multiple times. It would make sense to have VecNestSetSubVec(), but it is not implemented yet. 4) Are there any examples that make use of the VecNest type ? Even a > trivial one might help clarify some trouble I'm having in > understanding calling sequence. > In general, I don't recommend using VecNest unless it's very, very important to you that fields with some interlaced ordering actually be stored separately (or you have very specific requirements about in-place ghost updates for individual fields). You get a lot more consistent support if you use a standard contiguous Vec type and call VecGetSubVector() or a VecScatter when you need to get part of it. If you write code this way, you'll be able to use -dm_vec_type nest sometime and see if it helps performance. If after the above, you are still dedicated to needless pain, check the tests for usage of the internal interfaces. $ grep -l VecNest src/**/examples/**/*.c src/ksp/ksp/examples/tests/ex22.c src/snes/examples/tests/ex17.c src/vec/vec/examples/tests/ex37.c Note: MatNest is far more useful because the data structure tradeoffs are more significant when using PCFieldSplit. Again, you should try to avoid ever using the MatNest() interface directly. Instead, you can use MatGetLocalSubMatrix(). See src/snes/examples/tutorials/ex28.c, make runex28_3 for usage of MatNest without ever referencing it in the code. Note that there is still a preallocation interface challenge for off-diagonal blocks when using MatNest in a multiphysics application. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Tue Oct 11 15:53:29 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Tue, 11 Oct 2011 15:53:29 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > It would make sense to have VecNestSetSubVec(), but it is not implemented > yet. In src/vec/vec/examples/tests/ex37.c, I see a call to this function but it is part of dead code. 
I don't see a specialization VecGetSubVector/VecRestoreSubVector in VecNest and so my related question is if I call VecGetSubVector with multiple indices, do I expect a nested vector back ? And in the case when the index set has only one entry, perhaps the returned vector is in the parallel layout of the corresponding block entry. Or does the returned Vector have a different parallel layout in these cases ? > You get a lot more consistent support if you > use a standard contiguous Vec type and call VecGetSubVector() or a > VecScatter when you need to get part of it. I used to take this route until I found the block extensions. I feel it is elegant to handle different physics blocks using the block or Nest interface specifically since dofs for each physics can be distributed independently based on its own mesh and other requirements. IMO, the blocking also makes the solve/preconditioning for multi-physics problems natural (yes the sparsity pattern for off-diagonal blocks are tricky). I will also investigate the Nest independent interface to generalize this access. I will look at the tests and look at the implementation again to understand the details. The VecNestSetSubVec() would be nice to have though. I will see if I can do this today and send a patch if it succeeds. Anyway, thanks for the pointers ! Oh and an unrelated question: I couldn't find what replaced the Vec/Mat-Valid routines that existed in 3.0. Currently is there a routine to check if a Vec/Mat has called its setup yet ? Vijay On Tue, Oct 11, 2011 at 1:19 PM, Jed Brown wrote: > On Tue, Oct 11, 2011 at 00:16, Vijay S. Mahadevan wrote: >> >> I understand that the Vec/Mat Nest types in the current release of >> petsc are designed to be similar in philosophy to Dave May's petsc-ext >> "block" vectors and matrices. > > They have a similar data structure, but have an interface that interoperates > better, so they should replace petsc-ext. > >> >> I have some existing code based on >> petsc-ext and am trying to figure out the amount of work involved to >> make a transition from petsc-ext to pure petsc (>=3.2). >> >> Here are a list of my questions after looking at the VecNest type: >> >> 1) Does VecDuplicate create a shallow or deep copy ? i.e., are all >> leaf vectors/matrices recreated based on a recursive duplicate or does >> it only replicate the higher level structure ? > > Deep copy. A shallow copy would not be semantically correct. > >> >> 2) Do I need to restore the references after I get the vectors ? Say >> VecNestGetSubVecs and something along the lines of >> VecNestRestoreSubVecs (I dont see such a routine though) ? Or is this >> unnecessary ? > > VecNestGetSubVecs() gives you direct access to the data structure. It does > not increment a reference and does not need to be restored. You should > prefer to use VecGetSubVector() which works for any Vec type and still gives > you a cheap (no-copy) reference for VecNest. > >> >> 3) Is there a way to lazily set the reference to one of the vectors >> that is part of the nest structure ? Or does it always have to be >> specified during creation of VECNEST ? I see that this is not the case >> for matrices since there is a MatNestSetSubMats but I don't see >> something similar for Vec. Can this be handled via VecNestGetSubVecs ? >> In petsc-ext, there was a function that handled VecBlockSetValue and I >> guess I am looking for a suitable replacement, if that shows better >> light on the context. > > Even with Mat, you are only supposed to call this once (at setup time). 
If > it doesn't check for this, it should. It could be modified to be safe to > call more than once, but is not currently. The single-Mat function > MatNestSetSubMat() does exist and is safe to call multiple times. > It would make sense to have VecNestSetSubVec(), but it is not implemented > yet. >> >> 4) Are there any examples that make use of the VecNest type ? Even a >> trivial one might help clarify some trouble I'm having in >> understanding calling sequence. > > In general, I don't recommend using VecNest unless it's very, very important > to you that fields with some interlaced ordering actually be stored > separately (or you have very specific requirements about in-place ghost > updates for individual fields). You get a lot more consistent support if you > use a standard contiguous Vec type and call VecGetSubVector() or a > VecScatter when you need to get part of it. If you write code this way, > you'll be able to use -dm_vec_type nest sometime and see if it helps > performance. > If after the above, you are still dedicated to needless pain, check the > tests for usage of the internal interfaces. > $ grep -l VecNest src/**/examples/**/*.c > src/ksp/ksp/examples/tests/ex22.c > src/snes/examples/tests/ex17.c > src/vec/vec/examples/tests/ex37.c > > Note: MatNest is far more useful because the data structure tradeoffs are > more significant when using PCFieldSplit. Again, you should try to avoid > ever using the MatNest() interface directly. Instead, you can use > MatGetLocalSubMatrix(). See src/snes/examples/tutorials/ex28.c, make > runex28_3 for usage of MatNest without ever referencing it in the code. > Note that there is still a preallocation interface challenge for > off-diagonal blocks when using MatNest in a multiphysics application. From kevin.richard.green at gmail.com Tue Oct 11 16:18:57 2011 From: kevin.richard.green at gmail.com (Kevin Green) Date: Tue, 11 Oct 2011 17:18:57 -0400 Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: References: <2303DE53-E457-4168-9C56-B7B590676927@mcs.anl.gov> <1805540796.70710.1317850510287.JavaMail.root@zimbra.anl.gov> Message-ID: Successfully updated to petsc-3.2 and slepc-dev. A note about slepc-dev: configure failed, with the error Traceback (most recent call last): File "./configure", line 10, in execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) File "./config/configure.py", line 344, in generatefortranstubs.main(petscconf.BFORT) AttributeError: 'module' object has no attribute 'BFORT' when I commented out the offending if statement in configure.py, the build worked, but the tests for fortran examples failed... Not really an issue for me, but thought I should point this out. DMcomposite examples: src/snes/examples/tutorials/ex21.c (the example that seems most useful for my purpose) fails to run, as the jacobian isn't set within the code... I thought maybe running it with the -snes_fd option would work, but that results in a null object somewhere within snes.c. My guess is that an mffd needs to be in place for this. Or am I missing something? Up until this point, I have been setting the jacobian on a 2D (periodic in x and y) DM analytically. To preallocate the nonzero structure of the jacobian, I use DMDASetBlockFills(...). Now that I wish to use a DMComposite with the original DM and an array, I presume I'll have to use DMCompositeSetCoupling(...) 
to preallocate the jacobian structure, but the only example quoted in that documentation points this out, but doesn't actually use it. So... 1. Can I still manage to use DMDASetBlockFills(...) for the DM part of the composite? 2. How would the array coupling then get "wiggled in" to the structure? 3. It seems like this process could get ugly if the above is not possible... so if I move away from setting the Jacobian explicitly, do things become simpler? That's all for now, it seems fairly intuitive on how to make use of the composite once it gets set up. Cheers, Kevin On Wed, Oct 5, 2011 at 10:59 PM, Kevin Green wrote: > Jose - Thank you, I'll see if I can get this working. > > Barry - This seems to be exactly what I'm looking for. Glancing at the > documentation briefly, some questions do spring to mind, but I will not ask > until I look at some of the examples! > > Mike - Thanks for the updated link, I didn't even notice that Barry's was > for 3.0.0. > > In the meantime, I'll update to petsc 3.2, and slepc-dev, and get looking > at these examples. This isn't at the immediate top of my todo list, but I > expect I'll have some detailed questions on DMCOMPOSITE in a week or so. > > Kevin > > > > On Wed, Oct 5, 2011 at 5:35 PM, Mike McCourt wrote: > >> If you're gonna use PETSc 3.2, make sure to check out the updated >> documentation: >> >> >> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/DM/DMCompositeCreate.html >> >> It has a more accurate list of examples. >> >> -Mike >> >> ----- Original Message ----- >> From: "Barry Smith" >> To: "PETSc users list" >> Sent: Wednesday, October 5, 2011 4:29:14 PM >> Subject: Re: [petsc-users] Appending to vector / numerical continuation / >> slepc >> >> >> Kevin, >> >> The DMCOMPOSITE is designed exactly for this purpose. See the manual >> page >> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-3.0.0/docs/manualpages/DA/DMCompositeCreate.html#DMCompositeCreateand examples it links to. Essentially you create a DMCOMPOSITE and then use >> DMCompositeAddDM() to put in the DM which will be parallel and >> DMCompositeAddArray() for the "k" extra things (like continuation >> parameters). After you read the manual pages and look at the examples and >> start your code using the DMCOMPOSITE, feel free to ask specific questions >> about its usage. >> >> You definitely should switch to PETSc 3.2 before working because the DM >> has been markedly improved in places for this type of thing, >> >> Barry >> >> >> >> On Oct 5, 2011, at 12:46 PM, Kevin Green wrote: >> >> > Greetings, >> > >> > I was just wondering what the simplest way to create a new N+k dim where >> the first N come from a DA. It seems to me that I would have to go the >> round about way of getting the array, then writing that to the first N >> components of the new vector... I think there would be a bit of a pain for >> the parallel case when doing this though, like in managing the change in the >> local sizes when going from N to N+k... perhaps it's not that tricky. Also, >> with DAs I don't have to worry about orderings, correct? >> > >> > Essentially I want to get pseudo-arclength continuation working using >> the SNES solver. Another option I'm thinking is that rather than using an >> extended vector, I could use a MatShell where the added components are >> located within its context, and updated upon matmult...since k is typically >> small, this seems reasonable. Do you know of any code/projects that make >> use of the SNES module for continuation? 
Any thoughts on what would be the >> better or simpler way of doing this? >> > >> > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been >> updated to work with 3.2 yet, as far as I know. I'm fairly new to >> petsc/slepc... so I have to ask, what is the timescale like between the >> release of a new petsc, and update of slepc? Or is there a way to get slepc >> working with the new release? >> > >> > Cheers, >> > Kevin >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rxk at cfdrc.com Tue Oct 11 16:24:56 2011 From: rxk at cfdrc.com (Ravi Kannan) Date: Tue, 11 Oct 2011 16:24:56 -0500 Subject: [petsc-users] question on MPI usage for PETSC + petsc for multicores Message-ID: <008c01cc885c$3912e490$ab38adb0$@com> Dear All, This is Ravi Kannan from CFD Research Corporation. We have been using PETSc as the main driver for our computational suites for the last decade. Recently there has been a surge in the multicore type architectures for scientific computing. I have a few questions in this regard: 1. Does PETSc's communication use the MPI which is installed on the host machine? In other words, do the transfers performed by PETSC use exactly the MPI installed on the host machine? 2. How does PETSc handle the data transfer for inside the processor (between cores) for multicore architectures? Thanks, Ravi. _________________________________________ Dr. Ravi Kannan Associate Editor of Scientific Journals International Editorial Board, Journal of Aerospace Engineering and Technology Who's Who in Thermal Fluids ( https://www.thermalfluidscentral.org/who/browse-entry.php?e=9560) Research Engineer, CFD Research Corporation 256.726.4851 rxk at cfdrc.com http://ravikannan.jimdo.com/ , https://www.msu.edu/~kannanra/homepage.html _________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 11 16:35:47 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Oct 2011 16:35:47 -0500 Subject: [petsc-users] question on MPI usage for PETSC + petsc for multicores In-Reply-To: <008c01cc885c$3912e490$ab38adb0$@com> References: <008c01cc885c$3912e490$ab38adb0$@com> Message-ID: On Oct 11, 2011, at 4:24 PM, Ravi Kannan wrote: > Dear All, > > This is Ravi Kannan from CFD Research Corporation. We have been using PETSc as the main driver for our computational suites for the last decade. Recently there has been a surge in the multicore type architectures for scientific computing. I have a few questions in this regard: > > 1. Does PETSc?s communication use the MPI which is installed on the host machine? In other words, do the transfers performed by PETSC use exactly the MPI installed on the host machine? That depends on the MPI you indicated when ./configure is run for PETSc. If you use --with-mpi-dir=/directoryofyourmachinesmpi then it will use that MPI > > 2. How does PETSc handle the data transfer for inside the processor (between cores) for multicore architectures? By default it uses all MPI. We have started to add support for using pthreads/shared memory for communication within the node. We are currently working with early users on this feature, it is not really ready for prime time yet. Likely the next release of PETSc will have a strong support for this. Barry > > Thanks, > Ravi. > > _________________________________________ > > Dr. 
Ravi Kannan > Associate Editor of Scientific Journals International > Editorial Board, Journal of Aerospace Engineering and Technology > Who?s Who in Thermal Fluids(https://www.thermalfluidscentral.org/who/browse-entry.php?e=9560) > Research Engineer, CFD Research Corporation > 256.726.4851 > rxk at cfdrc.com > http://ravikannan.jimdo.com/ , > https://www.msu.edu/~kannanra/homepage.html > _________________________________________ > From dominik at itis.ethz.ch Wed Oct 12 06:38:04 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 12 Oct 2011 13:38:04 +0200 Subject: [petsc-users] Using ghosted vector as 'x' in KSPSolve Message-ID: Is it legal to call KSPSolve with the solution vector being a ghost-aware vector created e.g. with VecCreateGhost? Underlying motivation: when setting up my element stiffness matrix I have a term depending on previous value of x, say in a transient scheme. But it has to be a ghosted vector, because local size of x is smaller than the amount of all nodes referenced by the local cells. For the moment I maintain a ghost-aware copy of 'x' that I update manually but am thinking if there are more efficient solutions possible. Regards, Dominik From jedbrown at mcs.anl.gov Wed Oct 12 06:43:23 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 06:43:23 -0500 Subject: [petsc-users] Using ghosted vector as 'x' in KSPSolve In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 06:38, Dominik Szczerba wrote: > Is it legal to call KSPSolve with the solution vector being a > ghost-aware vector created e.g. with VecCreateGhost? > Yes > > Underlying motivation: when setting up my element stiffness matrix I > have a term depending on previous value of x, say in a transient > scheme. But it has to be a ghosted vector, because local size of x is > smaller than the amount of all nodes referenced by the local cells. > For the moment I maintain a ghost-aware copy of 'x' that I update > manually but am thinking if there are more efficient solutions > possible. > It's unlikely that this will make a performance difference in your application. (I can design a real problem where it's measurable, but it usually isn't.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 12 07:58:52 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 07:58:52 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Tue, Oct 11, 2011 at 15:53, Vijay S. Mahadevan wrote: > > It would make sense to have VecNestSetSubVec(), but it is not implemented > > yet. > > In src/vec/vec/examples/tests/ex37.c, I see a call to this function > but it is part of dead code. > > I don't see a specialization VecGetSubVector/VecRestoreSubVector in > VecNest vecnest.c:639 ? > and so my related question is if I call VecGetSubVector with > multiple indices, What do you mean by multiple indices? The union of multiple constituent index sets? You could make a hierarchical VecNest if you really want that layout, otherwise we need to find an inexpensive way to handle exotic index sets without losing structure. > do I expect a nested vector back ? And in the case > when the index set has only one entry, perhaps the returned vector is > in the parallel layout of the corresponding block entry. Or does the > returned Vector have a different parallel layout in these cases ? > Same parallel layout, the returned vector has the distribution specified by the IS. 
The main difference between using VecNest and standard Vec is that the former can have a non-contiguous index set, but store it contiguously in memory, and provide no-copy access to that piece. With a normal Vec, you only get no-copy access if the index space is contiguous. (But usually you choose the ordering of the index space so that related things are nearby, so it should be a good ordering.) > > > You get a lot more consistent support if you > > use a standard contiguous Vec type and call VecGetSubVector() or a > > VecScatter when you need to get part of it. > > I used to take this route until I found the block extensions. I feel > it is elegant to handle different physics blocks using the block or > Nest interface specifically since dofs for each physics can be > distributed independently based on its own mesh and other > requirements. IMO, the blocking also makes the solve/preconditioning > for multi-physics problems natural (yes the sparsity pattern for > off-diagonal blocks are tricky). I will also investigate the Nest > independent interface to generalize this access. > With the Nest-independent interface, you access pieces using an IS instead of an integer. I find it quite valuable when comparing algorithms to switch between monolithic formats with Schwarz/coupled multigrid/direct solvers and the Nest format with FieldSplit preconditioners. Since you can write your software so that this choice is a run-time option, I don't see a good rationale for choosing an interface that only does one thing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Wed Oct 12 09:56:48 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Wed, 12 Oct 2011 18:26:48 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: I found that I should set up KSP in each Newton iteration ( if I don't want to use SNES I think!). Then KSP linear solver converged and my problem is solved. By the way, Is it correct to re-setup KSP in each Newton iteration or I can use a better way? On Tue, Oct 11, 2011 at 7:45 PM, Matthew Knepley wrote: > On Tue, Oct 11, 2011 at 11:13 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Okey. I checked the condition number "max/min". >> It increases significantly in each ksp iterations!!! >> > > The approximation is becoming better. The condition number is harder to > actually compute than the solution. > > Bottom line: You have a hard problem. Iterative methods do not work on hard > problems without specialized > preconditioners. I recommend a literature search on solvers for your > equations. > > Matt > > >> Is it correct when the Jacobian matrix is fixed for each ksp iterations or >> it is due to my fault? >> >> On Tue, Oct 11, 2011 at 7:28 PM, Jed Brown wrote: >> >>> On Tue, Oct 11, 2011 at 10:55, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> OK. Then how I can calculate and monitor the "condition number" in ksp? >>> >>> >>> -ksp_monitor_singular_value >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. 
Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 12 09:58:35 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 09:58:35 -0500 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: On Wed, Oct 12, 2011 at 09:56, behzad baghapour wrote: > I found that I should set up KSP in each Newton iteration ( if I don't want > to use SNES I think!). > You probably should use SNES. :-) > Then KSP linear solver converged and my problem is solved. > By the way, Is it correct to re-setup KSP in each Newton iteration or I can > use a better way? > You can just call KSPSetOperators() before KSPSolve() in each Newton iteration. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Wed Oct 12 10:04:00 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Wed, 12 Oct 2011 18:34:00 +0330 Subject: [petsc-users] about linear solver convergence In-Reply-To: References: <9940E0F1-91A0-4F94-A61C-C212388657D1@mcs.anl.gov> Message-ID: Thanks. On Wed, Oct 12, 2011 at 6:28 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 09:56, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I found that I should set up KSP in each Newton iteration ( if I don't >> want to use SNES I think!). >> > > You probably should use SNES. :-) > > >> Then KSP linear solver converged and my problem is solved. >> By the way, Is it correct to re-setup KSP in each Newton iteration or I >> can use a better way? >> > > You can just call KSPSetOperators() before KSPSolve() in each Newton > iteration. > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From rxk at cfdrc.com Wed Oct 12 11:17:24 2011 From: rxk at cfdrc.com (Ravi Kannan) Date: Wed, 12 Oct 2011 11:17:24 -0500 Subject: [petsc-users] petsc argument to get statistics Message-ID: <00c901cc88fa$6d8de8c0$48a9ba40$@com> Dear all, Is there a PETSc argument to get the statistics like percentage of time spent for different PETSC calls, MPI-SENDS, communication times etc. Thanks Ravi. _________________________________________ Dr. Ravi Kannan Associate Editor of Scientific Journals International Editorial Board, Journal of Aerospace Engineering and Technology Research Engineer, CFD Research Corporation 256.726.4851 rxk at cfdrc.com http://ravikannan.jimdo.com/ , https://www.msu.edu/~kannanra/homepage.html _________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Oct 12 11:16:40 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 12 Oct 2011 11:16:40 -0500 Subject: [petsc-users] petsc argument to get statistics In-Reply-To: <00c901cc88fa$6d8de8c0$48a9ba40$@com> References: <00c901cc88fa$6d8de8c0$48a9ba40$@com> Message-ID: <40A911EA-7957-48E0-8FE5-20B34E4693C1@mcs.anl.gov> -log_summary On Oct 12, 2011, at 11:17 AM, Ravi Kannan wrote: > Dear all, > > Is there a PETSc argument to get the statistics like percentage of time spent for different PETSC calls, MPI-SENDS, communication times etc. > > Thanks > Ravi. > _________________________________________ > > Dr. Ravi Kannan > Associate Editor of Scientific Journals International > Editorial Board, Journal of Aerospace Engineering and Technology > Research Engineer, CFD Research Corporation > 256.726.4851 > rxk at cfdrc.com > http://ravikannan.jimdo.com/ , > https://www.msu.edu/~kannanra/homepage.html > _________________________________________ > From vijay.m at gmail.com Wed Oct 12 14:08:47 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 12 Oct 2011 14:08:47 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > vecnest.c:639 ? Oops. Must have had a typo in my search. > What do you mean by multiple indices? The union of multiple constituent > index sets? You could make a hierarchical VecNest if you really want that > layout, otherwise we need to find an inexpensive way to handle exotic index > sets without losing structure. Yes. I meant the union of indexes. i.e, if parent_vec = {child1, child2, .. childn}, then VecGetSubVector([1,2]) would then return subparent_vec={child1, child2} ? Not sure if I would call that exotic but I feel this is natural since I think of the vector as truly nested. If you are expecting the IS to contain [1..len(child1), len(child1)+1..len(child1)+len(child2)], then the parent_vec isn't truly nested but rather a regular parallel vector. Aren't these the distinctions between -vec_type nest and otherwise ? This is also what I gather from your explanation below. Please correct me if I am wrong. > Since you can write your > software so that this choice is a run-time option, I don't see a good > rationale for choosing an interface that only does one thing. Agreed. As I mentioned earlier, I originally did have both the options but only compile time since I was using petsc-ext. Perhaps this is the most optimal way forward. I have just started playing around with the examples and hopefully the tests will clarify some of my issues. Thanks. On Wed, Oct 12, 2011 at 7:58 AM, Jed Brown wrote: > On Tue, Oct 11, 2011 at 15:53, Vijay S. Mahadevan wrote: >> >> > It would make sense to have VecNestSetSubVec(), but it is not >> > implemented >> > yet. >> >> In src/vec/vec/examples/tests/ex37.c, I see a call to this function >> but it is part of dead code. >> >> I don't see a specialization VecGetSubVector/VecRestoreSubVector in >> VecNest > > vecnest.c:639 ? > >> >> and so my related question is if I call VecGetSubVector with >> multiple indices, > > What do you mean by multiple indices? The union of multiple constituent > index sets? You could make a hierarchical VecNest if you really want that > layout, otherwise we need to find an inexpensive way to handle exotic index > sets without losing structure. > >> >> do I expect a nested vector back ? 
And in the case >> when the index set has only one entry, perhaps the returned vector is >> in the parallel layout of the corresponding block entry. Or does the >> returned Vector have a different parallel layout in these cases ? > > Same parallel layout, the returned vector has the distribution specified by > the IS. The main difference between using VecNest and standard Vec is that > the former can have a non-contiguous index set, but store it contiguously in > memory, and provide no-copy access to that piece. With a normal Vec, you > only get no-copy access if the index space is contiguous. (But usually you > choose the ordering of the index space so that related things are nearby, so > it should be a good ordering.) > >> >> > You get a lot more consistent support if you >> > use a standard contiguous Vec type and call VecGetSubVector() or a >> > VecScatter when you need to get part of it. >> >> I used to take this route until I found the block extensions. I feel >> it is elegant to handle different physics blocks using the block or >> Nest interface specifically since dofs for each physics can be >> distributed independently based on its own mesh and other >> requirements. IMO, the blocking also makes the solve/preconditioning >> for multi-physics problems natural (yes the sparsity pattern for >> off-diagonal blocks are tricky). I will also investigate the Nest >> independent interface to generalize this access. > > With the Nest-independent interface, you access pieces using an IS instead > of an integer. I find it quite valuable when comparing algorithms to switch > between monolithic formats with Schwarz/coupled multigrid/direct solvers and > the Nest format with FieldSplit preconditioners. Since you can write your > software so that this choice is a run-time option, I don't see a good > rationale for choosing an interface that only does one thing. From jedbrown at mcs.anl.gov Wed Oct 12 14:34:40 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 14:34:40 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 14:08, Vijay S. Mahadevan wrote: > > vecnest.c:639 ? > > Oops. Must have had a typo in my search. > > > What do you mean by multiple indices? The union of multiple constituent > > index sets? You could make a hierarchical VecNest if you really want that > > layout, otherwise we need to find an inexpensive way to handle exotic > index > > sets without losing structure. > > Yes. I meant the union of indexes. i.e, if parent_vec = {child1, > child2, .. childn}, then VecGetSubVector([1,2]) would then return > subparent_vec={child1, child2} ? Not sure if I would call that exotic > but I feel this is natural since I think of the vector as truly > nested. Deciding when to have a deep hierarchy and when to flatten or reorder for locality is something that would be hard for the library to do automatically. You can always put a VecNest inside if you want to walk the tree that way. Solving the subset matching problem efficiently is tricky, but if someone wants to write support for that, we can do what you want. A different approach is to make an ISNest which would make the matching problem easier. > If you are expecting the IS to contain [1..len(child1), > len(child1)+1..len(child1)+len(child2)], then the parent_vec isn't > truly nested but rather a regular parallel vector. Aren't these the > distinctions between -vec_type nest and otherwise ? 
You can do VecNest that way, in which case you can do somewhat more flexible ghost updates and such. But VecNest will also let you have the two "blocks" be interlaced in the global ordering, but actually stored contiguously. I think this is rarely a good way to implement things, but you can do it. > This is also what > I gather from your explanation below. Please correct me if I am wrong. > > > Since you can write your > > software so that this choice is a run-time option, I don't see a good > > rationale for choosing an interface that only does one thing. > > Agreed. As I mentioned earlier, I originally did have both the options > but only compile time since I was using petsc-ext. How did you get a monolithic matrix that you could hit with a direct solver when you were using petsc-ext? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Oct 12 14:53:35 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 12 Oct 2011 14:53:35 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > Deciding when to have a deep hierarchy and when to flatten or reorder for > locality is something that would be hard for the library to do > automatically. You can always put a VecNest inside if you want to walk the > tree that way. Yes but petsc-ext did handle the "blocks" hierarchically only, if I remember right. And parallel vectors in petsc were always "flat". Are you suggesting that there is a way to interchangeably use these in the current Nest implementation ? > different approach is to make an ISNest which would make the matching > problem easier. I like this. And the interface is also consistent in the sense that a Nest type propagates the entire hierarchy of IS/Vec/Mat/(KSP?). > You can do VecNest that way, in which case you can do somewhat more flexible > ghost updates and such. But VecNest will also let you have the two "blocks" > be interlaced in the global ordering, but actually stored contiguously. I > think this is rarely a good way to implement things, but you can do it. I can think of cases where that might be a good thing to have. Of course these are problem/discretization dependent and switching the ordering/nesting at runtime will be a huge advantage to optimize efficiency. > How did you get a monolithic matrix that you could hit with a direct solver > when you were using petsc-ext? Ah. I didn't. Once I was sure enough that I was converging to the right solution, I switched to petsc-ext and used only Krylov solves with block diagonal lu/ilu preconditioners. And those worked very nicely for my problems. On Wed, Oct 12, 2011 at 2:34 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 14:08, Vijay S. Mahadevan wrote: >> >> > vecnest.c:639 ? >> >> Oops. Must have had a typo in my search. >> >> > What do you mean by multiple indices? The union of multiple constituent >> > index sets? You could make a hierarchical VecNest if you really want >> > that >> > layout, otherwise we need to find an inexpensive way to handle exotic >> > index >> > sets without losing structure. >> >> Yes. I meant the union of indexes. i.e, if parent_vec = {child1, >> child2, .. childn}, then VecGetSubVector([1,2]) would then return >> subparent_vec={child1, child2} ? Not sure if I would call that exotic >> but I feel this is natural since I think of the vector as truly >> nested. 
> > Deciding when to have a deep hierarchy and when to flatten or reorder for > locality is something that would be hard for the library to do > automatically. You can always put a VecNest inside if you want to walk the > tree that way. Solving the subset matching problem efficiently is tricky, > but if someone wants to write support for that, we can do what you want. A > different approach is to make an ISNest which would make the matching > problem easier. > >> >> If you are expecting the IS to contain [1..len(child1), >> len(child1)+1..len(child1)+len(child2)], then the parent_vec isn't >> truly nested but rather a regular parallel vector. Aren't these the >> distinctions between -vec_type nest and otherwise ? > > You can do VecNest that way, in which case you can do somewhat more flexible > ghost updates and such. But VecNest will also let you have the two "blocks" > be interlaced in the global ordering, but actually stored contiguously. I > think this is rarely a good way to implement things, but you can do it. > >> >> This is also what >> I gather from your explanation below. Please correct me if I am wrong. >> >> > Since you can write your >> > software so that this choice is a run-time option, I don't see a good >> > rationale for choosing an interface that only does one thing. >> >> Agreed. As I mentioned earlier, I originally did have both the options >> but only compile time since I was using petsc-ext. > > How did you get a monolithic matrix that you could hit with a direct solver > when you were using petsc-ext? > From jedbrown at mcs.anl.gov Wed Oct 12 15:03:23 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 15:03:23 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 14:53, Vijay S. Mahadevan wrote: > Yes but petsc-ext did handle the "blocks" hierarchically only, if I > remember right. And parallel vectors in petsc were always "flat". Are > you suggesting that there is a way to interchangeably use these in the > current Nest implementation ? > If I remember correctly, you could only get access to single blocks, not to arbitrary subsets of the blocks. > > > different approach is to make an ISNest which would make the matching > > problem easier. > > I like this. And the interface is also consistent in the sense that a > Nest type propagates the entire hierarchy of IS/Vec/Mat/(KSP?). > You are welcome to write ISNest. I might get to it eventually, but it's not a high priority for me right now. > > > You can do VecNest that way, in which case you can do somewhat more > flexible > > ghost updates and such. But VecNest will also let you have the two > "blocks" > > be interlaced in the global ordering, but actually stored contiguously. I > > think this is rarely a good way to implement things, but you can do it. > > I can think of cases where that might be a good thing to have. Of > course these are problem/discretization dependent and switching the > ordering/nesting at runtime will be a huge advantage to optimize > efficiency. > I'm just saying that I think there are rather few problems where deep nesting for the Vec is good for performance. Also, if you commit to VecNest, there are lots of "cool" features in PETSc that either don't work or are much less efficient (anything that calls VecGetArray(), for a start). > > > How did you get a monolithic matrix that you could hit with a direct > solver > > when you were using petsc-ext? > > Ah. I didn't. 
Once I was sure enough that I was converging to the > right solution, I switched to petsc-ext and used only Krylov solves > with block diagonal lu/ilu preconditioners. And those worked very > nicely for my problems. > Sounds like a lot of work and what happens when you change the physics such that it's also time to change the preconditioner? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Oct 12 15:25:19 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 12 Oct 2011 15:25:19 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > You are welcome to write ISNest. I might get to it eventually, but it's not > a high priority for me right now. This definitely seems like a nice option to have and will start writing this soon. > I'm just saying that I think there are rather few problems where deep > nesting for the Vec is good for performance. Also, if you commit to VecNest, > there are lots of "cool" features in PETSc that either don't work or are > much less efficient (anything that calls VecGetArray(), for a start). I will keep that in mind. I dont want to lose the flexibility or give up efficiency. Perhaps there might be a better design in my end that can take advantage of both worlds. > Sounds like a lot of work and what happens when you change the physics such > that it's also time to change the preconditioner? What do you mean ? The blocks are formed in memory and handed over to the algebraic preconditioner (controlled at runtime). Why would the user have to change the preconditioner manually ? This made use of the block PC object in petsc-ext which I assume is replaced by FieldSplit ? I eventually do have to handle this elegantly and will need some input when I get there. If you are talking about the compile time switch, then yes, that was a little painful. But hopefully pure petsc will give me some peace ! On Wed, Oct 12, 2011 at 3:03 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 14:53, Vijay S. Mahadevan wrote: >> >> Yes but petsc-ext did handle the "blocks" hierarchically only, if I >> remember right. And parallel vectors in petsc were always "flat". Are >> you suggesting that there is a way to interchangeably use these in the >> current Nest implementation ? > > If I remember correctly, you could only get access to single blocks, not to > arbitrary subsets of the blocks. > >> >> > different approach is to make an ISNest which would make the matching >> > problem easier. >> >> I like this. And the interface is also consistent in the sense that a >> Nest type propagates the entire hierarchy of IS/Vec/Mat/(KSP?). > > You are welcome to write ISNest. I might get to it eventually, but it's not > a high priority for me right now. > >> >> > You can do VecNest that way, in which case you can do somewhat more >> > flexible >> > ghost updates and such. But VecNest will also let you have the two >> > "blocks" >> > be interlaced in the global ordering, but actually stored contiguously. >> > I >> > think this is rarely a good way to implement things, but you can do it. >> >> I can think of cases where that might be a good thing to have. Of >> course these are problem/discretization dependent and switching the >> ordering/nesting at runtime will be a huge advantage to optimize >> efficiency. > > I'm just saying that I think there are rather few problems where deep > nesting for the Vec is good for performance. 
Also, if you commit to VecNest, > there are lots of "cool" features in PETSc that either don't work or are > much less efficient (anything that calls VecGetArray(), for a start). > >> >> > How did you get a monolithic matrix that you could hit with a direct >> > solver >> > when you were using petsc-ext? >> >> Ah. I didn't. Once I was sure enough that I was converging to the >> right solution, I switched to petsc-ext and used only Krylov solves >> with block diagonal lu/ilu preconditioners. And those worked very >> nicely for my problems. > > Sounds like a lot of work and what happens when you change the physics such > that it's also time to change the preconditioner? From jedbrown at mcs.anl.gov Wed Oct 12 15:28:34 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 15:28:34 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 15:25, Vijay S. Mahadevan wrote: > What do you mean ? The blocks are formed in memory and handed over to > the algebraic preconditioner (controlled at runtime). Why would the > user have to change the preconditioner manually ? This made use of the > block PC object in petsc-ext which I assume is replaced by FieldSplit > ? I eventually do have to handle this elegantly and will need some > input when I get there. > > If you are talking about the compile time switch, then yes, that was a > little painful. But hopefully pure petsc will give me some peace ! > I meant the two pieces of code. It's more to maintain and test. Needing to choose at compile time is lame, but even if you could do it at run-time, it would not be fun to maintain. In my opinion, we want one code where we can do -mat_type X -pc_type fieldsplit and -mat_type Y -pc_type lu and have efficient data structures used in both cases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Oct 12 16:26:27 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 12 Oct 2011 16:26:27 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > I meant the two pieces of code. It's more to maintain and test. Needing to > choose at compile time is lame, but even if you could do it at run-time, it > would not be fun to maintain. Not fun, yes, but two different code paths are inevitable. You can choose to do block solves or monolithic inversions but both do need a different data structure representation. In your example mat X vs Y. Anyway, I was trying to create a simple test case and was stopped immediately in my progress. I found the hard way that VecCreateNest is the only way to create a Nest vector ? The usual VecCreate+SetFromOptions doesn't seem to do the trick. Or am I missing some call here ? (Code attached) Petsc doesn't seem to like VecSetType(x, VECNEST) either. I get an error "Unknown vector type: nest!". On Wed, Oct 12, 2011 at 3:28 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 15:25, Vijay S. Mahadevan wrote: >> >> What do you mean ? The blocks are formed in memory and handed over to >> the algebraic preconditioner (controlled at runtime). Why would the >> user have to change the preconditioner manually ? This made use of the >> block PC object in petsc-ext which I assume is replaced by FieldSplit >> ? I eventually do have to handle this elegantly and will need some >> input when I get there. >> >> If you are talking about the compile time switch, then yes, that was a >> little painful. 
But hopefully pure petsc will give me some peace ! > > I meant the two pieces of code. It's more to maintain and test. Needing to > choose at compile time is lame, but even if you could do it at run-time, it > would not be fun to maintain. In my opinion, we want one code where we can > do > -mat_type X -pc_type fieldsplit > and > -mat_type Y -pc_type lu > and have efficient data structures used in both cases. -------------- next part -------------- A non-text attachment was scrubbed... Name: testnest2.c Type: text/x-csrc Size: 567 bytes Desc: not available URL: From jedbrown at mcs.anl.gov Wed Oct 12 16:41:42 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 12 Oct 2011 16:41:42 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 16:26, Vijay S. Mahadevan wrote: > Not fun, yes, but two different code paths are inevitable. > Nonsense! That's the point of the MatGetLocalSubMatrix() assembly interface. > You can > choose to do block solves or monolithic inversions but both do need a > different data structure representation. In your example mat X vs Y. > > Anyway, I was trying to create a simple test case and was stopped > immediately in my progress. I found the hard way that VecCreateNest is > the only way to create a Nest vector ? The usual > VecCreate+SetFromOptions doesn't seem to do the trick. > So we can (and should) change this to be like MatNest where you can MatCreate(), MatSetType(), MatNestSetSubMats(). But even with the update, you will still have to call VecNestSetSubVecs(), so creation is still different. The point of the rest of the interface (e.g. VecGetSubVector() instead of VecNestGetSubVecs()) is that *the rest* of your code never depends on the implementation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Oct 12 16:53:19 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 12 Oct 2011 16:53:19 -0500 Subject: [petsc-users] Some questions regarding VecNest type. In-Reply-To: References: Message-ID: > Nonsense! That's the point of the MatGetLocalSubMatrix() assembly interface. Didn't realize that VecGetSubVector/MatGetLocalSubMatrix are the pervasive implementation agnostic interfaces. Good to know ! > So we can (and should) change this to be like MatNest where you can > MatCreate(), MatSetType(), MatNestSetSubMats(). But even with the update, > you will still have to call VecNestSetSubVecs(), so creation is still > different. I can live with creation alone being different if the user wants it so. As long as the access via the above interface is still transparent, and it provides the ability to update/change the block references after creation, I can't ask for more. On Wed, Oct 12, 2011 at 4:41 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 16:26, Vijay S. Mahadevan wrote: >> >> Not fun, yes, but two different code paths are inevitable. > > Nonsense! That's the point of the MatGetLocalSubMatrix() assembly interface. > >> >> You can >> choose to do block solves or monolithic inversions but both do need a >> different data structure representation. In your example mat X vs Y. >> >> Anyway, I was trying to create a simple test case and was stopped >> immediately in my progress. I found the hard way that VecCreateNest is >> the only way to create a Nest vector ? The usual >> VecCreate+SetFromOptions doesn't seem to do the trick. 
> > So we can (and should) change this to be like MatNest where you can > MatCreate(), MatSetType(), MatNestSetSubMats(). But even with the update, > you will still have to call VecNestSetSubVecs(), so creation is still > different. The point of the rest of the interface (e.g. VecGetSubVector() > instead of VecNestGetSubVecs()) is that *the rest* of your code never > depends on the implementation. > From mmnasr at gmail.com Wed Oct 12 17:42:54 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Wed, 12 Oct 2011 15:42:54 -0700 Subject: [petsc-users] Binary I/O Message-ID: Hi everyone, I think I know the answer to my question, but I was double checking. When using PetscViewerBinaryOpen(); It is mentioned that "For writing files it only opens the file on processor 0 in the communicator." Does that mean when writing a parallel vector to file using VecView(), all the data from other processors is first sent to processor zero and then dumped into the file? If so, that would be a very slow processor for big datasets and large number of processor? Any suggestions to speed that process up? Best, Mohamad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Oct 12 17:50:08 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 12 Oct 2011 17:50:08 -0500 Subject: [petsc-users] Binary I/O In-Reply-To: References: Message-ID: <3D695C1B-7D06-4CE7-A775-0A7FF52F2C4B@mcs.anl.gov> On Oct 12, 2011, at 5:42 PM, Mohamad M. Nasr-Azadani wrote: > Hi everyone, > > I think I know the answer to my question, but I was double checking. > When using > PetscViewerBinaryOpen(); > > It is mentioned that > "For writing files it only opens the file on processor 0 in the communicator." > > Does that mean when writing a parallel vector to file using VecView(), all the data from other processors is first sent to processor zero and then dumped into the file? No all the data is not sent to process zero before writing. That is process 0 does not need enough memory to store all the data before writing. Instead the processes take turns sending data to process 0 who immediately writes it out out to disk. > If so, that would be a very slow processor for big datasets and large number of processor? For less than a few thousand processes this is completely fine and nothing else would be much faster > Any suggestions to speed that process up? We have the various MPI IO options that uses MPI IO to have several processes writing to disks at the same time that is useful for very large numbers of processes. Barry > > Best, > Mohamad > From mmnasr at gmail.com Wed Oct 12 18:17:11 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Wed, 12 Oct 2011 16:17:11 -0700 Subject: [petsc-users] Binary I/O In-Reply-To: <3D695C1B-7D06-4CE7-A775-0A7FF52F2C4B@mcs.anl.gov> References: <3D695C1B-7D06-4CE7-A775-0A7FF52F2C4B@mcs.anl.gov> Message-ID: Thanks Barry. That makes perfect sense. Best, Mohamad On Wed, Oct 12, 2011 at 3:50 PM, Barry Smith wrote: > > On Oct 12, 2011, at 5:42 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi everyone, > > > > I think I know the answer to my question, but I was double checking. > > When using > > PetscViewerBinaryOpen(); > > > > It is mentioned that > > "For writing files it only opens the file on processor 0 in the > communicator." > > > > Does that mean when writing a parallel vector to file using VecView(), > all the data from other processors is first sent to processor zero and then > dumped into the file? 
> > No all the data is not sent to process zero before writing. That is > process 0 does not need enough memory to store all the data before writing. > > Instead the processes take turns sending data to process 0 who > immediately writes it out out to disk. > > > If so, that would be a very slow processor for big datasets and large > number of processor? > > For less than a few thousand processes this is completely fine and > nothing else would be much faster > > > Any suggestions to speed that process up? > > We have the various MPI IO options that uses MPI IO to have several > processes writing to disks at the same time that is useful for very large > numbers of processes. > > Barry > > > > > Best, > > Mohamad > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Wed Oct 12 19:40:02 2011 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Wed, 12 Oct 2011 17:40:02 -0700 Subject: [petsc-users] Binary I/O In-Reply-To: References: <3D695C1B-7D06-4CE7-A775-0A7FF52F2C4B@mcs.anl.gov> Message-ID: Hi again, On the similar topic, how hard would it be to write a function similar to the PETSc's VecView() associated with Binary writer to do exactly the same thing, i.e. write a parallel vector into one single file but when writing all the processors are performing the task simultaneously? Best, Mohamad On Wed, Oct 12, 2011 at 4:17 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Barry. That makes perfect sense. > > Best, > Mohamad > > > On Wed, Oct 12, 2011 at 3:50 PM, Barry Smith wrote: > >> >> On Oct 12, 2011, at 5:42 PM, Mohamad M. Nasr-Azadani wrote: >> >> > Hi everyone, >> > >> > I think I know the answer to my question, but I was double checking. >> > When using >> > PetscViewerBinaryOpen(); >> > >> > It is mentioned that >> > "For writing files it only opens the file on processor 0 in the >> communicator." >> > >> > Does that mean when writing a parallel vector to file using VecView(), >> all the data from other processors is first sent to processor zero and then >> dumped into the file? >> >> No all the data is not sent to process zero before writing. That is >> process 0 does not need enough memory to store all the data before writing. >> >> Instead the processes take turns sending data to process 0 who >> immediately writes it out out to disk. >> >> > If so, that would be a very slow processor for big datasets and large >> number of processor? >> >> For less than a few thousand processes this is completely fine and >> nothing else would be much faster >> >> > Any suggestions to speed that process up? >> >> We have the various MPI IO options that uses MPI IO to have several >> processes writing to disks at the same time that is useful for very large >> numbers of processes. >> >> Barry >> >> > >> > Best, >> > Mohamad >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Oct 12 21:05:15 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 12 Oct 2011 21:05:15 -0500 Subject: [petsc-users] Binary I/O In-Reply-To: References: <3D695C1B-7D06-4CE7-A775-0A7FF52F2C4B@mcs.anl.gov> Message-ID: <56B1417B-B315-45D5-A9CD-69483EB84C25@mcs.anl.gov> On Oct 12, 2011, at 7:40 PM, Mohamad M. Nasr-Azadani wrote: > Hi again, > > On the similar topic, how hard would it be to write a function similar to the PETSc's VecView() associated with Binary writer to do exactly the same thing, i.e. 
write a parallel vector into one single file but when writing all the processors are performing the task simultaneously? Impossible. In the end in a standard filesystem the physical hard disk is connected to a single CPU and everything that gets written to that hard disk has to go through that one CPU; there is no way physically for a bunch of CPUs to write together onto a single physical disk. Now in high end parallel file systems each file may be spread over several hard disks (this is sometimes called stripping) (say 8). In that case there is some parallelism in writing the data since eight different parts of the vector can be sent through 8 different CPUs to 8 disks. But note that in general the number of these disks that file is spread over is small, like 8, it is never 10,000. When you have a parallel file system and use the option -viewer_binary_mpiio then the PETSc VecView() uses MPI IO to do the writing and you do get this level of parallelism so you may get slightly better performance than not using MPI IO. If you are seeing long wait times in VecView() with the binary file it is likely it is because the file server is connected via some pretty slow network to the actual compute nodes on parallel machine and nothing to do with the details of VecView(). You need to make sure you are writing directly to a disk that is on a compute node of the parallel machine, not over some network using NFS (Network File System), this can make a huge difference in time. Barry > > Best, > Mohamad > > > On Wed, Oct 12, 2011 at 4:17 PM, Mohamad M. Nasr-Azadani wrote: > Thanks Barry. That makes perfect sense. > > Best, > Mohamad > > > On Wed, Oct 12, 2011 at 3:50 PM, Barry Smith wrote: > > On Oct 12, 2011, at 5:42 PM, Mohamad M. Nasr-Azadani wrote: > > > Hi everyone, > > > > I think I know the answer to my question, but I was double checking. > > When using > > PetscViewerBinaryOpen(); > > > > It is mentioned that > > "For writing files it only opens the file on processor 0 in the communicator." > > > > Does that mean when writing a parallel vector to file using VecView(), all the data from other processors is first sent to processor zero and then dumped into the file? > > No all the data is not sent to process zero before writing. That is process 0 does not need enough memory to store all the data before writing. > > Instead the processes take turns sending data to process 0 who immediately writes it out out to disk. > > > If so, that would be a very slow processor for big datasets and large number of processor? > > For less than a few thousand processes this is completely fine and nothing else would be much faster > > > Any suggestions to speed that process up? > > We have the various MPI IO options that uses MPI IO to have several processes writing to disks at the same time that is useful for very large numbers of processes. > > Barry > > > > > Best, > > Mohamad > > > > > From marco.cisternino at polito.it Thu Oct 13 03:04:03 2011 From: marco.cisternino at polito.it (Marco Cisternino) Date: Thu, 13 Oct 2011 10:04:03 +0200 Subject: [petsc-users] KSP_DIVERGED_NAN Message-ID: <4E969B73.2090909@polito.it> Hi to everybody, I wrote an interface elliptic problem solver in 2D and 3D. The linear system I assemble depends on the geometrical characteristics of a level set function. I tested them with several and more and more complex interfaces, but now I would try it with a real 3D interface. 
But I probably have some problem with the computation of a good level set function, maybe the cause is the low resolution of the original interface. The evidence of my problem is in the subject. No matter what iterative petsc solver I try to use the run stops at the first iteration with a Nan residual. I don't want to ask you to understand the problem, but to give me some information about how to debug I situation like this. The interface is too complex to be deeply checked, then I hope that looking how the ksp yields a nan residual could help me. Thank you so much. Best regards, Marco -- Marco Cisternino PhD Student Politecnico di Torino Mobile:+393281189696 Email:marco.cisternino at polito.it From bsmith at mcs.anl.gov Thu Oct 13 07:52:14 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Oct 2011 07:52:14 -0500 Subject: [petsc-users] KSP_DIVERGED_NAN In-Reply-To: <4E969B73.2090909@polito.it> References: <4E969B73.2090909@polito.it> Message-ID: First run with -pc_type lu what happens? Then run small problems with -mat_view and see if the matrix entries look reasonable, also check that the right hand side vector entries look reasonable. Barry On Oct 13, 2011, at 3:04 AM, Marco Cisternino wrote: > Hi to everybody, > I wrote an interface elliptic problem solver in 2D and 3D. The linear system I assemble depends on the geometrical characteristics of a level set function. I tested them with several and more and more complex interfaces, but now I would try it with a real 3D interface. But I probably have some problem with the computation of a good level set function, maybe the cause is the low resolution of the original interface. The evidence of my problem is in the subject. No matter what iterative petsc solver I try to use the run stops at the first iteration with a Nan residual. I don't want to ask you to understand the problem, but to give me some information about how to debug I situation like this. The interface is too complex to be deeply checked, then I hope that looking how the ksp yields a nan residual could help me. > Thank you so much. > > Best regards, > > Marco > > -- > Marco Cisternino > PhD Student > Politecnico di Torino > Mobile:+393281189696 > Email:marco.cisternino at polito.it > From behzad.baghapour at gmail.com Thu Oct 13 08:21:29 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 13 Oct 2011 16:51:29 +0330 Subject: [petsc-users] about make code with Petsc Message-ID: Dear all, I want to create code objects in different folder like "object" but I received "no rule to make target". The makefile is in the source files folder. Please tell me a hint for it. Here is my makefile: SHELL=/bin/bash OBJDIR = object/ MYOBJ = # my object files # include ${PETSC_DIR}/conf/variables include ${PETSC_DIR}/conf/rules OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) code: ${OBJLIST} chkopts -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} Thanks, -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Oct 13 08:35:30 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Oct 2011 08:35:30 -0500 Subject: [petsc-users] about make code with Petsc In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 8:21 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear all, > > I want to create code objects in different folder like "object" but I > received "no rule to make target". The makefile is in the source files > folder. > Please tell me a hint for it. > The PETSc make rules always put the *.o files in the same directory as the source. If you want then in a different place, you can put in an explicit 'mv' in your rule for the executable. There is also an experimental Python system, but I would not use it unless you have a lot of experience. Thanks, Matt > Here is my makefile: > > SHELL=/bin/bash > > OBJDIR = object/ > MYOBJ = # my object files # > > include ${PETSC_DIR}/conf/variables > include ${PETSC_DIR}/conf/rules > > OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) > > code: ${OBJLIST} chkopts > -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} > > Thanks, > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 13 09:00:07 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 13 Oct 2011 17:30:07 +0330 Subject: [petsc-users] about make code with Petsc In-Reply-To: References: Message-ID: Thanks. On Thu, Oct 13, 2011 at 5:05 PM, Matthew Knepley wrote: > On Thu, Oct 13, 2011 at 8:21 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Dear all, >> >> I want to create code objects in different folder like "object" but I >> received "no rule to make target". The makefile is in the source files >> folder. >> Please tell me a hint for it. >> > > The PETSc make rules always put the *.o files in the same directory as the > source. If you want then in a different place, > you can put in an explicit 'mv' in your rule for the executable. There is > also an experimental Python system, but I would > not use it unless you have a lot of experience. > > Thanks, > > Matt > > >> Here is my makefile: >> >> SHELL=/bin/bash >> >> OBJDIR = object/ >> MYOBJ = # my object files # >> >> include ${PETSC_DIR}/conf/variables >> include ${PETSC_DIR}/conf/rules >> >> OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) >> >> code: ${OBJLIST} chkopts >> -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} >> >> Thanks, >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. 
Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 13 09:04:43 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 13 Oct 2011 17:34:43 +0330 Subject: [petsc-users] about make code with Petsc In-Reply-To: References: Message-ID: By the way. I can move the objects BUT when I remake the code all objects are rebuilt again!!. Why when there is no need when objects are already exist? On Thu, Oct 13, 2011 at 5:30 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Thanks. > > > On Thu, Oct 13, 2011 at 5:05 PM, Matthew Knepley wrote: > >> On Thu, Oct 13, 2011 at 8:21 AM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> Dear all, >>> >>> I want to create code objects in different folder like "object" but I >>> received "no rule to make target". The makefile is in the source files >>> folder. >>> Please tell me a hint for it. >>> >> >> The PETSc make rules always put the *.o files in the same directory as the >> source. If you want then in a different place, >> you can put in an explicit 'mv' in your rule for the executable. There is >> also an experimental Python system, but I would >> not use it unless you have a lot of experience. >> >> Thanks, >> >> Matt >> >> >>> Here is my makefile: >>> >>> SHELL=/bin/bash >>> >>> OBJDIR = object/ >>> MYOBJ = # my object files # >>> >>> include ${PETSC_DIR}/conf/variables >>> include ${PETSC_DIR}/conf/rules >>> >>> OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) >>> >>> code: ${OBJLIST} chkopts >>> -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} >>> >>> Thanks, >>> >>> -- >>> ================================== >>> Behzad Baghapour >>> Ph.D. Candidate, Mechecanical Engineering >>> University of Tehran, Tehran, Iran >>> https://sites.google.com/site/behzadbaghapour >>> Fax: 0098-21-88020741 >>> ================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 13 09:18:11 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 13 Oct 2011 17:48:11 +0330 Subject: [petsc-users] about make code with Petsc In-Reply-To: References: Message-ID: Oh. I should chang it a bit. Thanks any way. On Thu, Oct 13, 2011 at 5:34 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > By the way. I can move the objects BUT when I remake the code all objects > are rebuilt again!!. Why when there is no need when objects are already > exist? > > > On Thu, Oct 13, 2011 at 5:30 PM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Thanks. 
>> >> >> On Thu, Oct 13, 2011 at 5:05 PM, Matthew Knepley wrote: >> >>> On Thu, Oct 13, 2011 at 8:21 AM, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> I want to create code objects in different folder like "object" but I >>>> received "no rule to make target". The makefile is in the source files >>>> folder. >>>> Please tell me a hint for it. >>>> >>> >>> The PETSc make rules always put the *.o files in the same directory as >>> the source. If you want then in a different place, >>> you can put in an explicit 'mv' in your rule for the executable. There is >>> also an experimental Python system, but I would >>> not use it unless you have a lot of experience. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Here is my makefile: >>>> >>>> SHELL=/bin/bash >>>> >>>> OBJDIR = object/ >>>> MYOBJ = # my object files # >>>> >>>> include ${PETSC_DIR}/conf/variables >>>> include ${PETSC_DIR}/conf/rules >>>> >>>> OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) >>>> >>>> code: ${OBJLIST} chkopts >>>> -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} >>>> >>>> Thanks, >>>> >>>> -- >>>> ================================== >>>> Behzad Baghapour >>>> Ph.D. Candidate, Mechecanical Engineering >>>> University of Tehran, Tehran, Iran >>>> https://sites.google.com/site/behzadbaghapour >>>> Fax: 0098-21-88020741 >>>> ================================== >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Oct 13 09:45:45 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 13 Oct 2011 09:45:45 -0500 (CDT) Subject: [petsc-users] about make code with Petsc In-Reply-To: References: Message-ID: You can commit to gnumake [or equivalent] - and then use VPATH feature to do this. [however - in this case - the makefile would be in the location of the objfiles - and then the source file location is listed in VPATH variable - in the makefile] Satish On Thu, 13 Oct 2011, behzad baghapour wrote: > Dear all, > > I want to create code objects in different folder like "object" but I > received "no rule to make target". The makefile is in the source files > folder. > Please tell me a hint for it. 
> > Here is my makefile: > > SHELL=/bin/bash > > OBJDIR = object/ > MYOBJ = # my object files # > > include ${PETSC_DIR}/conf/variables > include ${PETSC_DIR}/conf/rules > > OBJLIST = $(addprefix $(OBJDIR), $(MYOBJ)) > > code: ${OBJLIST} chkopts > -${CLINKER} -o code ${OBJLIST} ${INCDIR} ${PETSC_LIB} > > Thanks, > > From hafedh.ben-hadj-ali at total.com Fri Oct 14 01:57:05 2011 From: hafedh.ben-hadj-ali at total.com (Hafedh BEN-HADJ-ALI) Date: Fri, 14 Oct 2011 08:57:05 +0200 Subject: [petsc-users] Distributed sparse matrix in in COO format Message-ID: Hi, I'm a new user of PETSC and I would like to test different linear solvers on some normal equations derived from a least squares problem. I have already a routine that build a sparse distributed (parallel ) matrix in coordinate COO format and would like to not change this part of matrix formation since it is a little bit complicated. What is the simplest way to link with PETSC routines ? Is it possible to use the distributed matrix blocks in the COO format ? Is there any conversion routine to put the COO format in CSR format ? Is it possible to use "MatCreateMPIAIJWithArrays" in that case or is there any more appropriate format ? Regards, HB -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.leissing at cstb.fr Fri Oct 14 02:42:58 2011 From: thomas.leissing at cstb.fr (Thomas Leissing) Date: Fri, 14 Oct 2011 09:42:58 +0200 Subject: [petsc-users] KSPGMRES preconditioned with a MATSHELL Message-ID: <1318578178.5336.46.camel@gre-019882> Dear all, I need to solve a system of equation Ax = b in which A is a MatShell object for which I defined a matrix-vector multiplication routine with MatShellSetOperation. Let's call this routine MyMatMult. The MyMatMult routine gives me an approximate matrix vector product, and I'm able to tune the parameters of MyMatMult so that I can choose a trade-off between calculation time and accuracy of the product. I successfully solved this problem with a KSPGMRES solver. So far so good... Now I'd like to precondition the system to accelerate the solving stage. To do this I'd like to use a lower-order (less accurate but faster) solution of Ax=b. I tried to do this with a PCKSP type preconditioner, but it doesn't seem to accept MatShell objects as preconditioning matrix. I also tried to use a PCSHELL preconditioner for which the PCApply routine consists in solving the lower order Ax=b system. I didn't manage to get this working properly: the outer solver doesn't converge to the expected rate. Indeed if I use for the inner loop the same accuracy than for the outer loop, the outer loop should converge in one iteration, which is not the case... Is there another way of doing this ? Any hint ? Thanks for your help, Thomas Leissing -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Fri Oct 14 03:05:48 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Fri, 14 Oct 2011 11:35:48 +0330 Subject: [petsc-users] Block-ILU in Petsc Message-ID: Dear all, How should I efficiently use block-ILU preconditioner in Petsc? I saw in man-page of PCILU: "For BAIJ matrices this implements a point block ILU". What exactly it means? Regards, Behzad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Oct 14 05:20:04 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Oct 2011 05:20:04 -0500 Subject: [petsc-users] Distributed sparse matrix in in COO format In-Reply-To: References: Message-ID: On Fri, Oct 14, 2011 at 1:57 AM, Hafedh BEN-HADJ-ALI < hafedh.ben-hadj-ali at total.com> wrote: > Hi,**** > > ** ** > > I?m a new user of PETSC and I would like to test different linear solvers > on some normal equations derived from a least squares problem.**** > > ** ** > > I have already a routine that build a sparse distributed (parallel ) matrix > in coordinate COO format and would like to not change this part of matrix > formation since it is a little bit complicated. **** > > What is the simplest way to link with PETSC routines ? Is it possible to > use the distributed matrix blocks in the COO format ? Is there any > conversion routine to put the COO format in CSR format ? Is it possible to > use ?MatCreateMPIAIJWithArrays? in that case or is there any more > appropriate format ? > The easiest, although not most efficient way to do this is to call MatSetValues() for each (i,j,v) in the COO structure. Try this first and time it. Thanks, Matt > > > Regards,**** > > HB**** > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Oct 14 05:20:39 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Oct 2011 05:20:39 -0500 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: On Fri, Oct 14, 2011 at 3:05 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear all, > > How should I efficiently use block-ILU preconditioner in Petsc? > > I saw in man-page of PCILU: > "For BAIJ matrices this implements a point block ILU". What exactly it > means? > That the block size is the same the matrix block size. Matt > Regards, > Behzad > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Oct 14 05:26:44 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Oct 2011 05:26:44 -0500 Subject: [petsc-users] KSPGMRES preconditioned with a MATSHELL In-Reply-To: <1318578178.5336.46.camel@gre-019882> References: <1318578178.5336.46.camel@gre-019882> Message-ID: On Fri, Oct 14, 2011 at 2:42 AM, Thomas Leissing wrote: > ** > Dear all, > > I need to solve a system of equation Ax = b in which A is a MatShell object > for which I defined a matrix-vector multiplication routine with > MatShellSetOperation. Let's call this routine MyMatMult. The MyMatMult > routine gives me an approximate matrix vector product, and I'm able to tune > the parameters of MyMatMult so that I can choose a trade-off between > calculation time and accuracy of the product. I successfully solved this > problem with a KSPGMRES solver. > So far so good... > > Now I'd like to precondition the system to accelerate the solving stage. To > do this I'd like to use a lower-order (less accurate but faster) solution of > Ax=b. > > I tried to do this with a PCKSP type preconditioner, but it doesn't seem to > accept MatShell objects as preconditioning matrix. > Could you send the error message? 
> I also tried to use a PCSHELL preconditioner for which the PCApply routine > consists in solving the lower order Ax=b system. > > I didn't manage to get this working properly: the outer solver doesn't > converge to the expected rate. Indeed if I use for the inner loop the same > accuracy than for the outer loop, the outer loop should converge in one > iteration, which is not the case... > > Is there another way of doing this ? > What we normally do is explicitly construct the low-order matrix. Then you can just pass it as the second Mat argument to KSPSetOperators(). The PC is built using that Mat, instead of the system Mat which is a MATSHELL. Thanks, Matt > Any hint ? > > Thanks for your help, > Thomas Leissing > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Oct 14 05:59:13 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 14 Oct 2011 12:59:13 +0200 Subject: [petsc-users] Appending to vector / numerical continuation / slepc In-Reply-To: References: <2303DE53-E457-4168-9C56-B7B590676927@mcs.anl.gov> <1805540796.70710.1317850510287.JavaMail.root@zimbra.anl.gov> Message-ID: On 11/10/2011, Kevin Green wrote: > Successfully updated to petsc-3.2 and slepc-dev. > > > A note about slepc-dev: > configure failed, with the error > > Traceback (most recent call last): > File "./configure", line 10, in > execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) > File "./config/configure.py", line 344, in > generatefortranstubs.main(petscconf.BFORT) > AttributeError: 'module' object has no attribute 'BFORT' > > when I commented out the offending if statement in configure.py, the build worked, but the tests for fortran examples failed... Not really an issue for me, but thought I should point this out. Yes, this is because sowing is not available (can be fixed with --download-sowing in PETSc or using a mercurial version of PETSc). Now I have added a test in slepc-dev to avoid this error. Jose > > > DMcomposite examples: > src/snes/examples/tutorials/ex21.c (the example that seems most useful for my purpose) fails to run, as the jacobian isn't set within the code... I thought maybe running it with the -snes_fd option would work, but that results in a null object somewhere within snes.c. My guess is that an mffd needs to be in place for this. Or am I missing something? > > > Up until this point, I have been setting the jacobian on a 2D (periodic in x and y) DM analytically. To preallocate the nonzero structure of the jacobian, I use DMDASetBlockFills(...). Now that I wish to use a DMComposite with the original DM and an array, I presume I'll have to use DMCompositeSetCoupling(...) to preallocate the jacobian structure, but the only example quoted in that documentation points this out, but doesn't actually use it. So... > 1. Can I still manage to use DMDASetBlockFills(...) for the DM part of the composite? > 2. How would the array coupling then get "wiggled in" to the structure? > 3. It seems like this process could get ugly if the above is not possible... so if I move away from setting the Jacobian explicitly, do things become simpler? > > > That's all for now, it seems fairly intuitive on how to make use of the composite once it gets set up. 
> > Cheers, > Kevin > > > > > On Wed, Oct 5, 2011 at 10:59 PM, Kevin Green wrote: > Jose - Thank you, I'll see if I can get this working. > > Barry - This seems to be exactly what I'm looking for. Glancing at the documentation briefly, some questions do spring to mind, but I will not ask until I look at some of the examples! > > Mike - Thanks for the updated link, I didn't even notice that Barry's was for 3.0.0. > > In the meantime, I'll update to petsc 3.2, and slepc-dev, and get looking at these examples. This isn't at the immediate top of my todo list, but I expect I'll have some detailed questions on DMCOMPOSITE in a week or so. > > Kevin > > > > On Wed, Oct 5, 2011 at 5:35 PM, Mike McCourt wrote: > If you're gonna use PETSc 3.2, make sure to check out the updated documentation: > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/DM/DMCompositeCreate.html > > It has a more accurate list of examples. > > -Mike > > ----- Original Message ----- > From: "Barry Smith" > To: "PETSc users list" > Sent: Wednesday, October 5, 2011 4:29:14 PM > Subject: Re: [petsc-users] Appending to vector / numerical continuation / slepc > > > Kevin, > > The DMCOMPOSITE is designed exactly for this purpose. See the manual page http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-3.0.0/docs/manualpages/DA/DMCompositeCreate.html#DMCompositeCreate and examples it links to. Essentially you create a DMCOMPOSITE and then use DMCompositeAddDM() to put in the DM which will be parallel and DMCompositeAddArray() for the "k" extra things (like continuation parameters). After you read the manual pages and look at the examples and start your code using the DMCOMPOSITE, feel free to ask specific questions about its usage. > > You definitely should switch to PETSc 3.2 before working because the DM has been markedly improved in places for this type of thing, > > Barry > > > > On Oct 5, 2011, at 12:46 PM, Kevin Green wrote: > > > Greetings, > > > > I was just wondering what the simplest way to create a new N+k dim where the first N come from a DA. It seems to me that I would have to go the round about way of getting the array, then writing that to the first N components of the new vector... I think there would be a bit of a pain for the parallel case when doing this though, like in managing the change in the local sizes when going from N to N+k... perhaps it's not that tricky. Also, with DAs I don't have to worry about orderings, correct? > > > > Essentially I want to get pseudo-arclength continuation working using the SNES solver. Another option I'm thinking is that rather than using an extended vector, I could use a MatShell where the added components are located within its context, and updated upon matmult...since k is typically small, this seems reasonable. Do you know of any code/projects that make use of the SNES module for continuation? Any thoughts on what would be the better or simpler way of doing this? > > > > I'm using petsc-3.1 right now, as I also need slepc...which hasn't been updated to work with 3.2 yet, as far as I know. I'm fairly new to petsc/slepc... so I have to ask, what is the timescale like between the release of a new petsc, and update of slepc? Or is there a way to get slepc working with the new release? 
> > > > Cheers, > > Kevin > > > From thomas.leissing at cstb.fr Fri Oct 14 06:23:07 2011 From: thomas.leissing at cstb.fr (Thomas Leissing) Date: Fri, 14 Oct 2011 13:23:07 +0200 Subject: [petsc-users] KSPGMRES preconditioned with a MATSHELL In-Reply-To: References: <1318578178.5336.46.camel@gre-019882> Message-ID: <1318591387.5336.68.camel@gre-019882> Matt, Could you send the error message? Here is the relevant part of the code: Variables: solver is my outer solver (KSPFGMRES) A is the matrix associated with the outer solver (MATSHELL) b is my RHS vector x is my solution vector precond is the outer solver preconditioner (PCKSP) pcSolver is the inner solver (KSPGMRES) pA is the matrix associated with the inner solver (preconditioner, MATSHELL too) [...] call KSPCreate(PETSC_COMM_WORLD, solver, pCode) call KSPGetPC(solver, precond, pCode) call PCSetType(precond, PCKSP, pCode) call PCKSPGetKSP(precond, pcSolver, pCode) call KSPCreate(PETSC_COMM_WORLD, pcSolver, pCode) call KSPSetType(pcSolver, KSPGMRES, pCode) call KSPSetOperators(pcSolver, pA, pA, SAME_NONZERO_PATTERN, pcode) call KSPSetOperators(solver, A, pA, SAME_NONZERO_PATTERN, pcode) call KSPSetType(solver, KSPFGMRES, pCode) call KSPSolve(solver, b, x, pCode) [...] and the error message is: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format shell does not have a built-in PETSc ILU! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 10:28:33 CDT 2011 What we normally do is explicitly construct the low-order matrix. Then you can just pass it as the second Mat argument to KSPSetOperators(). The PC is built using that Mat, instead of the system Mat which is a MATSHELL. Yes, but in my case, and if I understood what you meant, I cannot do that since my PC Mat is necessarily a MATSHELL... Thanks, Thomas -------- Message initial -------- De: Matthew Knepley Reply-to: PETSc users list ?: PETSc users list Sujet: Re: [petsc-users] KSPGMRES preconditioned with a MATSHELL Date: Fri, 14 Oct 2011 05:26:44 -0500 On Fri, Oct 14, 2011 at 2:42 AM, Thomas Leissing wrote: Dear all, I need to solve a system of equation Ax = b in which A is a MatShell object for which I defined a matrix-vector multiplication routine with MatShellSetOperation. Let's call this routine MyMatMult. The MyMatMult routine gives me an approximate matrix vector product, and I'm able to tune the parameters of MyMatMult so that I can choose a trade-off between calculation time and accuracy of the product. I successfully solved this problem with a KSPGMRES solver. So far so good... Now I'd like to precondition the system to accelerate the solving stage. To do this I'd like to use a lower-order (less accurate but faster) solution of Ax=b. I tried to do this with a PCKSP type preconditioner, but it doesn't seem to accept MatShell objects as preconditioning matrix. Could you send the error message? I also tried to use a PCSHELL preconditioner for which the PCApply routine consists in solving the lower order Ax=b system. I didn't manage to get this working properly: the outer solver doesn't converge to the expected rate. Indeed if I use for the inner loop the same accuracy than for the outer loop, the outer loop should converge in one iteration, which is not the case... Is there another way of doing this ? 
What we normally do is explicitly construct the low-order matrix. Then you can just pass it as the second Mat argument to KSPSetOperators(). The PC is built using that Mat, instead of the system Mat which is a MATSHELL. Thanks, Matt Any hint ? Thanks for your help, Thomas Leissing -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From behzad.baghapour at gmail.com Fri Oct 14 09:09:45 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Fri, 14 Oct 2011 17:39:45 +0330 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: So, Am I right with 2 things below? 1- If I set BAIJ for Mat, then the ILU preconditoner automatically recognize to use Block-version of ILU preconditioners? (like VBILUM developed by Saad and coworkers) or I should do more for Block preconditioning? 2- "the block size is the same the matrix block size" means that there is no way to combine the matrix elements to build blocks of possible maximum size like Hash method or something like this? Thanks a lot. On Fri, Oct 14, 2011 at 1:50 PM, Matthew Knepley wrote: > On Fri, Oct 14, 2011 at 3:05 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Dear all, >> >> How should I efficiently use block-ILU preconditioner in Petsc? >> >> I saw in man-page of PCILU: >> "For BAIJ matrices this implements a point block ILU". What exactly it >> means? >> > > That the block size is the same the matrix block size. > > Matt > > >> Regards, >> Behzad >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Oct 14 09:19:37 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 14 Oct 2011 09:19:37 -0500 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: On Fri, Oct 14, 2011 at 09:09, behzad baghapour wrote: > 1- If I set BAIJ for Mat, then the ILU preconditoner automatically > recognize to use Block-version of ILU preconditioners? (like VBILUM > developed by Saad and coworkers) or I should do more for Block > preconditioning? > Yes, block ILU, block SOR, etc. > > 2- "the block size is the same the matrix block size" means that there is > no way to combine the matrix elements to build blocks of possible maximum > size like Hash method or something like this? > For AIJ, there are "Inodes" which automatically detect identical rows where blocking can be used. This helps more or less, depending on the hardware. There is no such thing for BAIJ, but it wouldn't offer much improvement in memory performance anyway. BAIJ performance approaches dense mat-vec as long as the ordering is such that the vector can be reused. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Oct 14 09:21:26 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 14 Oct 2011 09:21:26 -0500 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: <16698A46-DB80-4479-BCE9-2AA92C5CA56C@mcs.anl.gov> On Oct 14, 2011, at 9:09 AM, behzad baghapour wrote: > So, Am I right with 2 things below? > > 1- If I set BAIJ for Mat, then the ILU preconditoner automatically recognize to use Block-version of ILU preconditioners? (like VBILUM developed by Saad and coworkers) or I should do more for Block preconditioning? > > 2- "the block size is the same the matrix block size" means that there is no way to combine the matrix elements to build blocks of possible maximum size like Hash method or something like this? No. Though we would love to accept a contribution of a robust dynamic variable block size ILU from users. Barry > > Thanks a lot. > > > > On Fri, Oct 14, 2011 at 1:50 PM, Matthew Knepley wrote: > On Fri, Oct 14, 2011 at 3:05 AM, behzad baghapour wrote: > Dear all, > > How should I efficiently use block-ILU preconditioner in Petsc? > > I saw in man-page of PCILU: > "For BAIJ matrices this implements a point block ILU". What exactly it means? > > That the block size is the same the matrix block size. > > Matt > > Regards, > Behzad > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > From behzad.baghapour at gmail.com Fri Oct 14 09:41:01 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Fri, 14 Oct 2011 18:11:01 +0330 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: <16698A46-DB80-4479-BCE9-2AA92C5CA56C@mcs.anl.gov> References: <16698A46-DB80-4479-BCE9-2AA92C5CA56C@mcs.anl.gov> Message-ID: Thanks. Having a robust VBILU is a rather hard work while the new regenerated local blocks may or may not have proper spectral properties and may cause instabilities in iteration process but it would be a problem-dependent issue I think... On Fri, Oct 14, 2011 at 5:51 PM, Barry Smith wrote: > > On Oct 14, 2011, at 9:09 AM, behzad baghapour wrote: > > > So, Am I right with 2 things below? > > > > 1- If I set BAIJ for Mat, then the ILU preconditoner automatically > recognize to use Block-version of ILU preconditioners? (like VBILUM > developed by Saad and coworkers) or I should do more for Block > preconditioning? > > > > 2- "the block size is the same the matrix block size" means that there is > no way to combine the matrix elements to build blocks of possible maximum > size like Hash method or something like this? > > No. Though we would love to accept a contribution of a robust dynamic > variable block size ILU from users. > > Barry > > > > > Thanks a lot. > > > > > > > > On Fri, Oct 14, 2011 at 1:50 PM, Matthew Knepley > wrote: > > On Fri, Oct 14, 2011 at 3:05 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > > Dear all, > > > > How should I efficiently use block-ILU preconditioner in Petsc? > > > > I saw in man-page of PCILU: > > "For BAIJ matrices this implements a point block ILU". What exactly it > means? > > > > That the block size is the same the matrix block size. 
> > > > Matt > > > > Regards, > > Behzad > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > -- > > ================================== > > Behzad Baghapour > > Ph.D. Candidate, Mechecanical Engineering > > University of Tehran, Tehran, Iran > > https://sites.google.com/site/behzadbaghapour > > Fax: 0098-21-88020741 > > ================================== > > > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Oct 14 09:47:22 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 14 Oct 2011 09:47:22 -0500 (CDT) Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: On Fri, 14 Oct 2011, Jed Brown wrote: > On Fri, Oct 14, 2011 at 09:09, behzad baghapour > wrote: > > 2- "the block size is the same the matrix block size" means that there is > > no way to combine the matrix elements to build blocks of possible maximum > > size like Hash method or something like this? > > > > For AIJ, there are "Inodes" which automatically detect identical rows where > blocking can be used. This helps more or less, depending on the hardware. > There is no such thing for BAIJ, but it wouldn't offer much improvement in > memory performance anyway. BAIJ performance approaches dense mat-vec as long > as the ordering is such that the vector can be reused. I view AIJ+inode as variable-blocking in 1 dimenstion. Doing variable blocking in 2 dimensions is perhaps very difficult - and not worth the exta cost of keeping track of these blocks during each arithmetic operation [and parallel partitioning of such variable blocks has its own additional issues] Satish From behzad.baghapour at gmail.com Fri Oct 14 09:52:55 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Fri, 14 Oct 2011 18:22:55 +0330 Subject: [petsc-users] Block-ILU in Petsc In-Reply-To: References: Message-ID: I think so. Keeping the size of blocks as what is in the original matrix would be a safe choice. Here maximum gain is belonged to naturally block-wize matrices. On Fri, Oct 14, 2011 at 6:17 PM, Satish Balay wrote: > On Fri, 14 Oct 2011, Jed Brown wrote: > > > On Fri, Oct 14, 2011 at 09:09, behzad baghapour > > wrote: > > > > 2- "the block size is the same the matrix block size" means that there > is > > > no way to combine the matrix elements to build blocks of possible > maximum > > > size like Hash method or something like this? > > > > > > > For AIJ, there are "Inodes" which automatically detect identical rows > where > > blocking can be used. This helps more or less, depending on the hardware. > > There is no such thing for BAIJ, but it wouldn't offer much improvement > in > > memory performance anyway. BAIJ performance approaches dense mat-vec as > long > > as the ordering is such that the vector can be reused. > > I view AIJ+inode as variable-blocking in 1 dimenstion. 
Doing variable > blocking in 2 dimensions is perhaps very difficult - and not worth the > exta cost of keeping track of these blocks during each arithmetic > operation [and parallel partitioning of such variable blocks has its > own additional issues] > > Satish > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 14 10:16:22 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 14 Oct 2011 10:16:22 -0500 Subject: [petsc-users] KSPGMRES preconditioned with a MATSHELL In-Reply-To: <1318591387.5336.68.camel@gre-019882> References: <1318578178.5336.46.camel@gre-019882> <1318591387.5336.68.camel@gre-019882> Message-ID: <698C572F-04C4-4FD2-9BB2-7387DA9A4B82@mcs.anl.gov> You never set the PC or PCType for your inner KSP hence it is trying to use the default of ILU. If pA is a MATSHELL then it cannot use ILU. Do you plan to use a preconditioner in the inner solver or PCNONE? Barry On Oct 14, 2011, at 6:23 AM, Thomas Leissing wrote: > Matt, > > Could you send the error message? > > Here is the relevant part of the code: > > Variables: > solver is my outer solver (KSPFGMRES) > A is the matrix associated with the outer solver (MATSHELL) > b is my RHS vector > x is my solution vector > precond is the outer solver preconditioner (PCKSP) > pcSolver is the inner solver (KSPGMRES) > pA is the matrix associated with the inner solver (preconditioner, > MATSHELL too) > > > [...] > call KSPCreate(PETSC_COMM_WORLD, solver, pCode) > call KSPGetPC(solver, precond, pCode) > call PCSetType(precond, PCKSP, pCode) > call PCKSPGetKSP(precond, pcSolver, pCode) > call KSPCreate(PETSC_COMM_WORLD, pcSolver, pCode) > call KSPSetType(pcSolver, KSPGMRES, pCode) > call KSPSetOperators(pcSolver, pA, pA, SAME_NONZERO_PATTERN, pcode) > call KSPSetOperators(solver, A, pA, SAME_NONZERO_PATTERN, pcode) > call KSPSetType(solver, KSPFGMRES, pCode) > call KSPSolve(solver, b, x, pCode) > [...] > > and the error message is: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format shell does not have a built-in PETSc ILU! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 > 10:28:33 CDT 2011 > > > What we normally do is explicitly construct the low-order matrix. Then > you can just pass it as the second Mat argument to > KSPSetOperators(). The PC is built using that Mat, instead of the system > Mat which is a MATSHELL. > > Yes, but in my case, and if I understood what you meant, I cannot do > that since my PC Mat is necessarily a MATSHELL... > > > Thanks, > Thomas > > > > -------- Message initial -------- > De: Matthew Knepley > Reply-to: PETSc users list > ?: PETSc users list > Sujet: Re: [petsc-users] KSPGMRES preconditioned with a MATSHELL > Date: Fri, 14 Oct 2011 05:26:44 -0500 > > On Fri, Oct 14, 2011 at 2:42 AM, Thomas Leissing > wrote: > Dear all, > > I need to solve a system of equation Ax = b in which A is a > MatShell object for which I defined a matrix-vector > multiplication routine with MatShellSetOperation. Let's call > this routine MyMatMult. 
The MyMatMult routine gives me an > approximate matrix vector product, and I'm able to tune the > parameters of MyMatMult so that I can choose a trade-off between > calculation time and accuracy of the product. I successfully > solved this problem with a KSPGMRES solver. > So far so good... > > Now I'd like to precondition the system to accelerate the > solving stage. To do this I'd like to use a lower-order (less > accurate but faster) solution of Ax=b. > > I tried to do this with a PCKSP type preconditioner, but it > doesn't seem to accept MatShell objects as preconditioning > matrix. > > > > Could you send the error message? > > I also tried to use a PCSHELL preconditioner for which the > PCApply routine consists in solving the lower order Ax=b system. > > I didn't manage to get this working properly: the outer solver > doesn't converge to the expected rate. Indeed if I use for the > inner loop the same accuracy than for the outer loop, the outer > loop should converge in one iteration, which is not the case... > > Is there another way of doing this ? > > > > What we normally do is explicitly construct the low-order matrix. Then > you can just pass it as the second Mat argument to > KSPSetOperators(). The PC is built using that Mat, instead of the system > Mat which is a MATSHELL. > > > Thanks, > > > Matt > > Any hint ? > > Thanks for your help, > Thomas Leissing > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From w.drenth at gmail.com Mon Oct 17 15:17:21 2011 From: w.drenth at gmail.com (Wienand Drenth) Date: Mon, 17 Oct 2011 22:17:21 +0200 Subject: [petsc-users] scattering strategies Message-ID: Dear all, I have a question regarding the utilization of the VecScatter routines to collect and distribute data from all processors to one, and vice versa. This in relation to file I/O. The setting is roughly as follows: At a certain stage in my computation, I have computed in parallel some results. Let these results be in an array X (X is a native C or Fortran array, not a Petsc vector. X might be multidimensional as well). The Xs of all processors together constitute my global result, and I would like to write it to disk. However, X itself is of course only part of the total. So I need to grab from all processors the pieces of X into one single structure. Furthermore, the X's are in a Petsc ordering (1 ... n for processor 1, n+1 .... n2 for processor 2, etc) which does not reflect the ordering defined by the user. So before writing I need to permute the values of X accordingly. My first thought is to use the VecScatter routines: define a parallel Petsc vector XVec, and see that the values of X are transferred to XVec (with VecGetArray and VecRestoreArray for example). I define a sequential vector XSeq as well. With VecScatterCreateToZero I define a scatter context, and I am able to get the distributed data into my vector XSeq. The data of XSeq is then written to disk by the zeroth processor. (Again using, for example, VecGetArray and VecRestoreArray to access the data.) Though this is working, my second thought is if this is not too much overkill. And if this collecting and distributing can be done smarter or more elegantly. With two auxiliary vectors it requires quite some code to get some distributed data to disk. Any thoughts and suggestions are much appreciated. 
kind regards, Wienand Drenth -- Wienand Drenth PhD Eindhoven, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Oct 17 15:22:06 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 17 Oct 2011 15:22:06 -0500 Subject: [petsc-users] scattering strategies In-Reply-To: References: Message-ID: On Mon, Oct 17, 2011 at 15:17, Wienand Drenth wrote: > I have a question regarding the utilization of the VecScatter routines to > collect and distribute data from all processors to one, and vice versa. This > in relation to file I/O. The setting is roughly as follows: > > At a certain stage in my computation, I have computed in parallel some > results. Let these results be in an array X (X is a native C or Fortran > array, not a Petsc vector. X might be multidimensional as well). The Xs of > all processors together constitute my global result, and I would like to > write it to disk. However, X itself is of course only part of the total. So > I need to grab from all processors the pieces of X into one single > structure. > Furthermore, the X's are in a Petsc ordering (1 ... n for processor 1, n+1 > .... n2 for processor 2, etc) which does not reflect the ordering defined by > the user. So before writing I need to permute the values of X accordingly. > Simple solution: Make a DMDA to represent your multi-dimensional layout, put your array values into the Vec you get from the DM, and call VecView(). It will do a parallel write and your vector will end up in the natural ordering. You can VecLoad() it later or read it with other software. -------------- next part -------------- An HTML attachment was scrubbed... URL: From w.drenth at gmail.com Mon Oct 17 15:44:33 2011 From: w.drenth at gmail.com (Wienand Drenth) Date: Mon, 17 Oct 2011 22:44:33 +0200 Subject: [petsc-users] scattering strategies In-Reply-To: References: Message-ID: On Mon, Oct 17, 2011 at 10:22 PM, Jed Brown wrote: > On Mon, Oct 17, 2011 at 15:17, Wienand Drenth wrote: > >> >> > Simple solution: > > Make a DMDA to represent your multi-dimensional layout, put your array > values into the Vec you get from the DM, and call VecView(). It will do a > parallel write and your vector will end up in the natural ordering. You can > VecLoad() it later or read it with other software. > Hello Jed, Thanks for the quick response. I haven't thought about this DMDA approach and it sounds really nice. With natural ordering, I assume you mean what I call Petsc ordering (in any case different from the ordering imposed by the user)? regards, Wienand -- Wienand Drenth PhD Eindhoven, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Oct 17 15:46:48 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 17 Oct 2011 15:46:48 -0500 Subject: [petsc-users] scattering strategies In-Reply-To: References: Message-ID: On Mon, Oct 17, 2011 at 15:44, Wienand Drenth wrote: > Thanks for the quick response. I haven't thought about this DMDA approach > and it sounds really nice. With natural ordering, I assume you mean what I > call Petsc ordering (in any case different from the ordering imposed by the > user)? See Figure 10 in the User's Manual (page 52). The natural ordering is the logical ordering for a structured grid if it was run in serial. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Oct 17 15:46:56 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Oct 2011 15:46:56 -0500 Subject: [petsc-users] scattering strategies In-Reply-To: References: Message-ID: On Mon, Oct 17, 2011 at 3:44 PM, Wienand Drenth wrote: > > > On Mon, Oct 17, 2011 at 10:22 PM, Jed Brown wrote: > >> On Mon, Oct 17, 2011 at 15:17, Wienand Drenth wrote: >> >>> >>> >> Simple solution: >> >> Make a DMDA to represent your multi-dimensional layout, put your array >> values into the Vec you get from the DM, and call VecView(). It will do a >> parallel write and your vector will end up in the natural ordering. You can >> VecLoad() it later or read it with other software. >> > > Hello Jed, > Thanks for the quick response. I haven't thought about this DMDA approach > and it sounds really nice. With natural ordering, I assume you mean what I > call Petsc ordering (in any case different from the ordering imposed by the > user)? > A DMDA is a Cartesian grid. Natural ordering: Ordering by dimension Petsc Ordering: Ordering contiguously on each process, which is a contiguous rectangular piece of the grid Matt > regards, > Wienand > > > > -- > Wienand Drenth PhD > Eindhoven, the Netherlands > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Mon Oct 17 16:03:26 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 17 Oct 2011 23:03:26 +0200 Subject: [petsc-users] parallel vector of integers Message-ID: Vec seems to be meant for only real numbers. I need to have a parallel vector containing integers. Most of them are indices, but some of them would be arbitrary. Questions: 1) Is 'IS' supposed to be a parallel/distributed vector of indices? Syntax to use it is very different from that of Vec. E.g. I see functions to query global and local size, but can not find functions to specify them. How to scatter/gather such objects? Scatter seems only for Vec's. 2) How about arbitrary parallel vectors of integers? Many thanks for any pointers, Dominik -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 17 16:05:22 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Oct 2011 16:05:22 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: Message-ID: On Mon, Oct 17, 2011 at 4:03 PM, Dominik Szczerba wrote: > Vec seems to be meant for only real numbers. I need to have a parallel > vector containing integers. Most of them are indices, but some of them would > be arbitrary. Questions: > > 1) Is 'IS' supposed to be a parallel/distributed vector of indices? Syntax > to use it is very different from that of Vec. E.g. I see functions to query > global and local size, but can not find functions to specify them. How to > scatter/gather such objects? Scatter seems only for Vec's. > > 2) How about arbitrary parallel vectors of integers? > Right now, PETSc just does not have this. There is an ongoing discussion about implementation. For now, putting integers in a Vec is the best option. Matt > Many thanks for any pointers, > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 17 17:29:00 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Oct 2011 17:29:00 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: Message-ID: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> On Oct 17, 2011, at 4:03 PM, Dominik Szczerba wrote: > Vec seems to be meant for only real numbers. I need to have a parallel vector containing integers. Most of them are indices, but some of them would be arbitrary. Questions: > > 1) Is 'IS' supposed to be a parallel/distributed vector of indices? Syntax to use it is very different from that of Vec. E.g. I see functions to query global and local size, but can not find functions to specify them. How to scatter/gather such objects? Scatter seems only for Vec's. An IS is NOT a Vec for integers, it is a very different best. > > 2) How about arbitrary parallel vectors of integers? You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. Barry > > Many thanks for any pointers, > Dominik From jedbrown at mcs.anl.gov Mon Oct 17 17:46:31 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 17 Oct 2011 17:46:31 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> Message-ID: On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > An IS is NOT a Vec for integers, it is a very different best. > Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". > > > > > 2) How about arbitrary parallel vectors of integers? > > You can put the integers in a Vec. Unless your code is all integers > (which is unlikely because why are you using PETSc for a code that just uses > integers) the overhead of shipping around a few integers stored as doubles > is not going to kill the overall performance of the code. In fact, I will > faint away if you can even measure the difference. This is likely a case of > premature over optimization. > The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 17 20:49:55 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Oct 2011 20:49:55 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> Message-ID: On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > An IS is NOT a Vec for integers, it is a very different best. > > Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". 
> > > > > > 2) How about arbitrary parallel vectors of integers? > > You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. > > The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. So let's increase the complexity of PETSc exponentially JUST so one little thing won't be "aesthetically displeasing"? From jedbrown at mcs.anl.gov Mon Oct 17 20:52:33 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 17 Oct 2011 20:52:33 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> Message-ID: On Mon, Oct 17, 2011 at 20:49, Barry Smith wrote: > So let's increase the complexity of PETSc exponentially JUST so one little > thing won't be "aesthetically displeasing"? No, but when we have a VecScatter-like abstraction at the quasi-MPI level, then we can talk about moving integers around without stuffing them in Vecs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecoon at lanl.gov Tue Oct 18 11:04:39 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Tue, 18 Oct 2011 10:04:39 -0600 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> Message-ID: <1318953879.8510.2.camel@echo.lanl.gov> Seems to me that the better argument for this would be that arbitrary precision scatters (done right) would be an important step on the path toward single-precision preconditioning. Surely this would make a measurable difference... Ethan On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: > On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > > > On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > > An IS is NOT a Vec for integers, it is a very different best. > > > > Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". > > > > > > > > > > 2) How about arbitrary parallel vectors of integers? > > > > You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. > > > > The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. > > So let's increase the complexity of PETSc exponentially JUST so one little thing won't be "aesthetically displeasing"? 
> -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From bsmith at mcs.anl.gov Tue Oct 18 11:11:43 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Oct 2011 11:11:43 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: <1318953879.8510.2.camel@echo.lanl.gov> References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> <1318953879.8510.2.camel@echo.lanl.gov> Message-ID: On Oct 18, 2011, at 11:04 AM, Ethan Coon wrote: > Seems to me that the better argument for this would be that arbitrary > precision scatters (done right) would be an important step on the path > toward single-precision preconditioning. Surely this would make a > measurable difference... The issue is handling objects that have different internal precision representations in C. Do we even try it? Or do we do it in C++ via templates yuck or some other way? So far, despite a few aborted attempts, we've punted on do this at all and the objects can only have a single internal precision representation determined at compile time. Barry > > Ethan > > On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: >> On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: >> >>> On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: >>> An IS is NOT a Vec for integers, it is a very different best. >>> >>> Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". >>> >>> >>>> >>>> 2) How about arbitrary parallel vectors of integers? >>> >>> You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. >>> >>> The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. >> >> So let's increase the complexity of PETSc exponentially JUST so one little thing won't be "aesthetically displeasing"? >> > > -- > ------------------------------------ > Ethan Coon > Post-Doctoral Researcher > Applied Mathematics - T-5 > Los Alamos National Laboratory > 505-665-8289 > > http://www.ldeo.columbia.edu/~ecoon/ > ------------------------------------ > From dominik at itis.ethz.ch Tue Oct 18 13:09:27 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 18 Oct 2011 20:09:27 +0200 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> <1318953879.8510.2.camel@echo.lanl.gov> Message-ID: May I draw your attention how Kitware did it in VTK - avoiding templates, but using C++: http://www.vtk.org/doc/release/5.8/html/a00466.html Regards, Dominik On Tue, Oct 18, 2011 at 6:11 PM, Barry Smith wrote: > > On Oct 18, 2011, at 11:04 AM, Ethan Coon wrote: > > > Seems to me that the better argument for this would be that arbitrary > > precision scatters (done right) would be an important step on the path > > toward single-precision preconditioning. 
Surely this would make a > > measurable difference... > > The issue is handling objects that have different internal precision > representations in C. Do we even try it? Or do we do it in C++ via templates > yuck or some other way? > > So far, despite a few aborted attempts, we've punted on do this at all > and the objects can only have a single internal precision representation > determined at compile time. > > Barry > > > > > Ethan > > > > On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: > >> On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > >> > >>> On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > >>> An IS is NOT a Vec for integers, it is a very different best. > >>> > >>> Besides immutability, an IS is contravariant. Although ISGeneral is > implemented with a similar data structure, it isn't meant to be used as "a > Vec for integers". > >>> > >>> > >>>> > >>>> 2) How about arbitrary parallel vectors of integers? > >>> > >>> You can put the integers in a Vec. Unless your code is all integers > (which is unlikely because why are you using PETSc for a code that just uses > integers) the overhead of shipping around a few integers stored as doubles > is not going to kill the overall performance of the code. In fact, I will > faint away if you can even measure the difference. This is likely a case of > premature over optimization. > >>> > >>> The downside of this is that single precision is useless because the > mantissa isn't big enough to hold useful integer sizes. If you always have > at least double precision, then you can still solve big problems this way > (2^53 is a big number), but I still find it aesthetically displeasing. > >> > >> So let's increase the complexity of PETSc exponentially JUST so one > little thing won't be "aesthetically displeasing"? > >> > > > > -- > > ------------------------------------ > > Ethan Coon > > Post-Doctoral Researcher > > Applied Mathematics - T-5 > > Los Alamos National Laboratory > > 505-665-8289 > > > > http://www.ldeo.columbia.edu/~ecoon/ > > ------------------------------------ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 18 13:39:36 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Oct 2011 13:39:36 -0500 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> <1318953879.8510.2.camel@echo.lanl.gov> Message-ID: On Oct 18, 2011, at 1:09 PM, Dominik Szczerba wrote: > May I draw your attention how Kitware did it in VTK - avoiding templates, but using C++: > > http://www.vtk.org/doc/release/5.8/html/a00466.html Thanks, we'll take a look at this. Barry > > Regards, Dominik > > On Tue, Oct 18, 2011 at 6:11 PM, Barry Smith wrote: > > On Oct 18, 2011, at 11:04 AM, Ethan Coon wrote: > > > Seems to me that the better argument for this would be that arbitrary > > precision scatters (done right) would be an important step on the path > > toward single-precision preconditioning. Surely this would make a > > measurable difference... > > The issue is handling objects that have different internal precision representations in C. Do we even try it? Or do we do it in C++ via templates yuck or some other way? > > So far, despite a few aborted attempts, we've punted on do this at all and the objects can only have a single internal precision representation determined at compile time. 
> > Barry > > > > > Ethan > > > > On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: > >> On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > >> > >>> On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > >>> An IS is NOT a Vec for integers, it is a very different best. > >>> > >>> Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". > >>> > >>> > >>>> > >>>> 2) How about arbitrary parallel vectors of integers? > >>> > >>> You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. > >>> > >>> The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. > >> > >> So let's increase the complexity of PETSc exponentially JUST so one little thing won't be "aesthetically displeasing"? > >> > > > > -- > > ------------------------------------ > > Ethan Coon > > Post-Doctoral Researcher > > Applied Mathematics - T-5 > > Los Alamos National Laboratory > > 505-665-8289 > > > > http://www.ldeo.columbia.edu/~ecoon/ > > ------------------------------------ > > > > From chetan.jhurani at gmail.com Tue Oct 18 14:17:04 2011 From: chetan.jhurani at gmail.com (Chetan Jhurani) Date: Tue, 18 Oct 2011 15:17:04 -0400 Subject: [petsc-users] parallel vector of integers In-Reply-To: References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> <1318953879.8510.2.camel@echo.lanl.gov> Message-ID: <4e9dd0be.e26a340a.2858.6726@mx.google.com> > From: petsc-users-bounces at mcs.anl.gov On Behalf Of Dominik Szczerba > Sent: Tuesday, October 18, 2011 2:09 PM > To: PETSc users list > Subject: Re: [petsc-users] parallel vector of integers > > May I draw your attention how Kitware did it in VTK - avoiding templates, but using C++: > > http://www.vtk.org/doc/release/5.8/html/a00466.html With all due respect to VTK folks, if you diff these files (vtkDoubleArray) http://www.vtk.org/doc/release/5.8/html/a04652.html (vtkFloatArray) http://www.vtk.org/doc/release/5.8/html/a04661.html you'll see that they used 'sed' and not C++. ;) Chetan > Regards, Dominik > On Tue, Oct 18, 2011 at 6:11 PM, Barry Smith wrote: > > On Oct 18, 2011, at 11:04 AM, Ethan Coon wrote: > > > Seems to me that the better argument for this would be that arbitrary > > precision scatters (done right) would be an important step on the path > > toward single-precision preconditioning. Surely this would make a > > measurable difference... > The issue is handling objects that have different internal precision representations in C. Do we even try it? Or do we do it in C++ via templates yuck or some other way? > > So far, despite a few aborted attempts, we've punted on do this at all and the objects can only have a single internal precision representation determined at compile time. 
> > Barry > > > > > Ethan > > > > On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: > >> On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > >> > >>> On Mon, Oct 17, 2011 at 17:29, Barry Smith wrote: > >>> An IS is NOT a Vec for integers, it is a very different best. > >>> > >>> Besides immutability, an IS is contravariant. Although ISGeneral is implemented with a similar data structure, it isn't meant to be used as "a Vec for integers". > >>> > >>> > >>>> > >>>> 2) How about arbitrary parallel vectors of integers? > >>> > >>> You can put the integers in a Vec. Unless your code is all integers (which is unlikely because why are you using PETSc for a code that just uses integers) the overhead of shipping around a few integers stored as doubles is not going to kill the overall performance of the code. In fact, I will faint away if you can even measure the difference. This is likely a case of premature over optimization. > >>> > >>> The downside of this is that single precision is useless because the mantissa isn't big enough to hold useful integer sizes. If you always have at least double precision, then you can still solve big problems this way (2^53 is a big number), but I still find it aesthetically displeasing. > >> > >> So let's increase the complexity of PETSc exponentially JUST so one little thing won't be "aesthetically displeasing"? > >> > > > > -- > > ------------------------------------ > > Ethan Coon > > Post-Doctoral Researcher > > Applied Mathematics - T-5 > > Los Alamos National Laboratory > > 505-665-8289 > > > > http://www.ldeo.columbia.edu/~ecoon/ > > ------------------------------------ > > > > From dominik at itis.ethz.ch Tue Oct 18 15:18:08 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 18 Oct 2011 22:18:08 +0200 Subject: [petsc-users] parallel vector of integers In-Reply-To: <4e9dd0be.e26a340a.2858.6726@mx.google.com> References: <07CAECF6-7D89-4FDC-A560-312C43469904@mcs.anl.gov> <1318953879.8510.2.camel@echo.lanl.gov> <4e9dd0be.e26a340a.2858.6726@mx.google.com> Message-ID: > > > May I draw your attention how Kitware did it in VTK - avoiding templates, > but using C++: > > > > http://www.vtk.org/doc/release/5.8/html/a00466.html > > With all due respect to VTK folks, if you diff these files > > (vtkDoubleArray) http://www.vtk.org/doc/release/5.8/html/a04652.html > (vtkFloatArray) http://www.vtk.org/doc/release/5.8/html/a04661.html > > you'll see that they used 'sed' and not C++. ;) > > Yes, these are custom parsing templates to avoid native C++ templates, yet they work all right. Dominik > Chetan > > > Regards, Dominik > > On Tue, Oct 18, 2011 at 6:11 PM, Barry Smith wrote: > > > > On Oct 18, 2011, at 11:04 AM, Ethan Coon wrote: > > > > > Seems to me that the better argument for this would be that arbitrary > > > precision scatters (done right) would be an important step on the path > > > toward single-precision preconditioning. Surely this would make a > > > measurable difference... > > The issue is handling objects that have different internal precision > representations in C. Do we even try it? Or do we do it in > C++ via templates yuck or some other way? > > > > So far, despite a few aborted attempts, we've punted on do this at all > and the objects can only have a single internal > precision representation determined at compile time. 
> > > > Barry > > > > > > > > Ethan > > > > > > On Mon, 2011-10-17 at 20:49 -0500, Barry Smith wrote: > > >> On Oct 17, 2011, at 5:46 PM, Jed Brown wrote: > > >> > > >>> On Mon, Oct 17, 2011 at 17:29, Barry Smith > wrote: > > >>> An IS is NOT a Vec for integers, it is a very different best. > > >>> > > >>> Besides immutability, an IS is contravariant. Although ISGeneral is > implemented with a similar data structure, it isn't meant > to be used as "a Vec for integers". > > >>> > > >>> > > >>>> > > >>>> 2) How about arbitrary parallel vectors of integers? > > >>> > > >>> You can put the integers in a Vec. Unless your code is all integers > (which is unlikely because why are you using PETSc for a > code that just uses integers) the overhead of shipping around a few > integers stored as doubles is not going to kill the overall > performance of the code. In fact, I will faint away if you can even measure > the difference. This is likely a case of premature over > optimization. > > >>> > > >>> The downside of this is that single precision is useless because the > mantissa isn't big enough to hold useful integer sizes. > If you always have at least double precision, then you can still solve big > problems this way (2^53 is a big number), but I still > find it aesthetically displeasing. > > >> > > >> So let's increase the complexity of PETSc exponentially JUST so one > little thing won't be "aesthetically displeasing"? > > >> > > > > > > -- > > > ------------------------------------ > > > Ethan Coon > > > Post-Doctoral Researcher > > > Applied Mathematics - T-5 > > > Los Alamos National Laboratory > > > 505-665-8289 > > > > > > http://www.ldeo.columbia.edu/~ecoon/ > > > ------------------------------------ > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From w.drenth at gmail.com Wed Oct 19 05:29:19 2011 From: w.drenth at gmail.com (Wienand Drenth) Date: Wed, 19 Oct 2011 12:29:19 +0200 Subject: [petsc-users] Question on DMDA in example ex11f90 Message-ID: Hello, Following suggestions on a previous query on handling distributed arrays, I set forth to use the DMDA path. I installed the latest petsc version (3.2 patch 3), and went looking into example 11f90 (in src/dm/examples/tutorials) for inspiration. 
Compilation of the example went ok, but when I run the example I get an error: 12:25] examples/tutorials 81 > mpiexec -n 1 ex11f90 Vector Object:Vec_0x80b3bb8_0 1 MPI processes type: mpi Process [0] *** glibc detected *** ex11f90: free(): invalid pointer: 0x080b6630 *** ======= Backtrace: ========= /lib/libc.so.6[0xb6f438c4] /lib/libc.so.6(cfree+0x90)[0xb6f47370] /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(PetscFreeAlign+0x1b)[0xb777f5b1] /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(VecDestroy_Seq+0x6a)[0xb7616e60] 1 2 3 4 0 /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(VecDestroy+0x51)[0xb7976a11] /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(DMClearGlobalVectors+0x6c)[0xb7666064] /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(DMDestroy+0x233)[0xb76917d8] /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(dmdestroy_+0x1b)[0xb7692b5a] ex11f90(MAIN__+0x172)[0x8049086] ex11f90(main+0x27)[0x8049967] /lib/libc.so.6(__libc_start_main+0xe0)[0xb6eee390] ex11f90[0x8048eb1] Petsc was installed with the following config options: --PETSC_ARCH=${PETSC_ARCH} --with-mpi-dir=${MPI_DIR} --download-f-blas-lapack=1 --with-shared-libraries=1 --with-debugging=no --download-mumps=1 --with-mumps=1 --with-fortran-interfaces --download-scalapack=1 --with-scalapack=1 --download-blacs=1 --with-blacs=1 --download-parmetis=1 --download-hypre=1 PETSC_ARCH is linux-gnu, and I have a locally installed MPI version, with MPI_DIR being the location. I use the openmpi 1.4.3 version of MPI. Thank for any support and help, kind regards, Wienand -- Wienand Drenth PhD Eindhoven, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Oct 19 07:00:55 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 19 Oct 2011 07:00:55 -0500 (CDT) Subject: [petsc-users] Question on DMDA in example ex11f90 In-Reply-To: References: Message-ID: Hm - perhaps this is an issue with gfortran 4.2 [we've seen some f90 issues with it]. My run with gfortran-4.6 [on linux does not give any errors. Will see if I can find gfortran 4.2 to reproduce.. Satish On Wed, 19 Oct 2011, Wienand Drenth wrote: > Hello, > > Following suggestions on a previous query on handling distributed arrays, I > set forth to use the DMDA path. > > I installed the latest petsc version (3.2 patch 3), and went looking into > example 11f90 (in src/dm/examples/tutorials) for inspiration. 
> > Compilation of the example went ok, but when I run the example I get an > error: > > 12:25] examples/tutorials 81 > mpiexec -n 1 ex11f90 > Vector Object:Vec_0x80b3bb8_0 1 MPI processes > type: mpi > Process [0] > *** glibc detected *** ex11f90: free(): invalid pointer: 0x080b6630 *** > ======= Backtrace: ========= > /lib/libc.so.6[0xb6f438c4] > /lib/libc.so.6(cfree+0x90)[0xb6f47370] > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(PetscFreeAlign+0x1b)[0xb777f5b1] > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(VecDestroy_Seq+0x6a)[0xb7616e60] > 1 > 2 > 3 > 4 > 0 > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(VecDestroy+0x51)[0xb7976a11] > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(DMClearGlobalVectors+0x6c)[0xb7666064] > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(DMDestroy+0x233)[0xb76917d8] > /home/wdrenth/software/petsc-3.2-p3/linux-gnu/lib/libpetsc.so(dmdestroy_+0x1b)[0xb7692b5a] > ex11f90(MAIN__+0x172)[0x8049086] > ex11f90(main+0x27)[0x8049967] > /lib/libc.so.6(__libc_start_main+0xe0)[0xb6eee390] > ex11f90[0x8048eb1] > > > Petsc was installed with the following config options: > > --PETSC_ARCH=${PETSC_ARCH} --with-mpi-dir=${MPI_DIR} > --download-f-blas-lapack=1 --with-shared-libraries=1 --with-debugging=no > --download-mumps=1 --with-mumps=1 --with-fortran-interfaces > --download-scalapack=1 --with-scalapack=1 --download-blacs=1 --with-blacs=1 > --download-parmetis=1 --download-hypre=1 > > PETSC_ARCH is linux-gnu, and I have a locally installed MPI version, with > MPI_DIR being the location. I use the openmpi 1.4.3 version of MPI. > > Thank for any support and help, > > kind regards, > Wienand > > From bourdin at math.lsu.edu Wed Oct 19 10:54:13 2011 From: bourdin at math.lsu.edu (Blaise Bourdin) Date: Wed, 19 Oct 2011 10:54:13 -0500 Subject: [petsc-users] Questions about TS Message-ID: Hi, I am trying to use TS to solve a simple transient problem in an unstructured finite element f90 code. 1. Section 6.1.1 of the manual refers to a TSSetMatrices function that can be used to set the RHS and LHS matrices, but I can;t find it. Is this section outdated? 2. Since we are using unstructured finite elements, the LHS matrix is not the identity. As far as I understand, we have two possible choices: - Use a mass lumping approximation of the variational identity matrix (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. - Use an IMEX method where the implicit matrix is the variational identity M. Is this right? What is the recommended way? 3. Is src/ts/examples/tests/ex1f.F supposed to work? Is there a fortran or fortran90 example I could start from? I get the following: galerkin:tests bourdin$ ./ex1f -linear_constant_matrix 1 Procs Avg. error 2 norm NaN NaN euler galerkin:tests bourdin$ ./ex1f 1 Procs Avg. error 2 norm NaN NaN euler Thanks, Blaise -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin From knepley at gmail.com Wed Oct 19 11:05:46 2011 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Oct 2011 16:05:46 +0000 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 3:54 PM, Blaise Bourdin wrote: > Hi, > > I am trying to use TS to solve a simple transient problem in an > unstructured finite element f90 code. > > 1. 
Section 6.1.1 of the manual refers to a TSSetMatrices function that can > be used to set the RHS and LHS matrices, but I can;t find it. Is this > section outdated? > Yes. What you want is to use TSSetIFunction() and TSSetIJacobian(). Jed and Sean have added some nice examples under TS of this, and in the latest release there is a manual section. Let us you know if these are unclear. Thanks, Matt 2. Since we are using unstructured finite elements, the LHS matrix is not > the identity. As far as I understand, we have two possible choices: > - Use a mass lumping approximation of the variational identity matrix > (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. > - Use an IMEX method where the implicit matrix is the variational > identity M. > Is this right? What is the recommended way? > 3. Is src/ts/examples/tests/ex1f.F supposed to work? Is there a fortran or > fortran90 example I could start from? I get the following: > galerkin:tests bourdin$ ./ex1f -linear_constant_matrix > 1 Procs Avg. error 2 norm NaN > NaN euler > galerkin:tests bourdin$ ./ex1f > 1 Procs Avg. error 2 norm NaN > NaN euler > > Thanks, > > Blaise > > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 > http://www.math.lsu.edu/~bourdin > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 19 11:14:10 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 11:14:10 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 10:54, Blaise Bourdin wrote: > Hi, > > I am trying to use TS to solve a simple transient problem in an > unstructured finite element f90 code. > > 1. Section 6.1.1 of the manual refers to a TSSetMatrices function that can > be used to set the RHS and LHS matrices, but I can;t find it. Is this > section outdated? > Yes, I must have missed this section when updating the documentation. > 2. Since we are using unstructured finite elements, the LHS matrix is not > the identity. As far as I understand, we have two possible choices: > - Use a mass lumping approximation of the variational identity matrix > (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. > You can also write this as a special case of the choice below where you use -ksp_type preonly -pc_type jacobi. > - Use an IMEX method where the implicit matrix is the variational > identity M. > Is this right? What is the recommended way? > I would just do this because it's the most flexible. See the user's manual section on IMEX methods. If you are interested in adaptive error control, then you should also check out -ts_type rosw in petsc-dev. In any case, you can write your mass matrix as well as any stiff terms that you want to treat implicitly into TSSetIFunction(), provide an (approximate) Jacobian with TSSetIJacobian(), and put the rest in TSSetRHSFunction(). You can be even more sloppy about it with TSROSW. Look at src/ts/examples/tutorials/ex22.c (has a Fortran twin, ex22f.F) or ex25.c in petsc-dev. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bourdin at lsu.edu Wed Oct 19 12:32:39 2011 From: bourdin at lsu.edu (Blaise Bourdin) Date: Wed, 19 Oct 2011 12:32:39 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: Hi, > On Wed, Oct 19, 2011 at 10:54, Blaise Bourdin wrote: > Hi, > > I am trying to use TS to solve a simple transient problem in an unstructured finite element f90 code. > > 1. Section 6.1.1 of the manual refers to a TSSetMatrices function that can be used to set the RHS and LHS matrices, but I can;t find it. Is this section outdated? > > Yes, I must have missed this section when updating the documentation. OK, that makes more sense now. > > 2. Since we are using unstructured finite elements, the LHS matrix is not the identity. As far as I understand, we have two possible choices: > - Use a mass lumping approximation of the variational identity matrix (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. > > You can also write this as a special case of the choice below where you use -ksp_type preonly -pc_type jacobi. I am not sure I am following you. Can you elaborate? > - Use an IMEX method where the implicit matrix is the variational identity M. > Is this right? What is the recommended way? > > I would just do this because it's the most flexible. See the user's manual section on IMEX methods. If you are interested in adaptive error control, then you should also check out -ts_type rosw in petsc-dev. In any case, you can write your mass matrix as well as any stiff terms that you want to treat implicitly into TSSetIFunction(), provide an (approximate) Jacobian with TSSetIJacobian(), and put the rest in TSSetRHSFunction(). You can be even more sloppy about it with TSROSW. > > Look at src/ts/examples/tutorials/ex22.c (has a Fortran twin, ex22f.F) or ex25.c in petsc-dev. > How about recasting the problem as a DAE? The documentation seems to imply that this is feasible. "For ODE with nontrivial mass matrices such as arise in FEM, the implicit/DAE interface significantly reduces overhead to prepare the system for algebraic solvers (SNES/KSP) by having the user assemble the correctly shifted matrix." Following ex15, solving u_t = \Delta u can be recast as solving F(t,u,\dot u) = 0 with F(t,u,v) = v-\Delta u. In this case, the IJacobian would be K+aM, where K is the stiffness matrix (K_{i,j} = \int_\Omega \nabla \phi_i \cdot \nabla \phi_j\,dx) and M the mass matrix (M_{i,j} = \int_\Omega \phi_i \phi_j\,dx) . At the continuous level, the IFunction would be v-\Delta u, which cannot be evaluated directly in the finite element basis by either solving M F = M\dot u + Ku or using mass lumping. Am I expected to do this in my IFunction or is there a way to pass the mass matrix to the TS? As far as I understand, using IMPEX will lead to the same issue regardless of the way I write my problem, i.e. wether I write G(t,u,v) = v-\Delta u and F(t,u) = 0, or G(t,u,v) = v and F(t,u) = \Delta u. As far as I can see, all examples and tests use structured meshes where the mass matrix is the identity matrix. Thanks for the clarification. Blaise -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin From dominik at itis.ethz.ch Wed Oct 19 12:42:55 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 19 Oct 2011 19:42:55 +0200 Subject: [petsc-users] Using ghosted vector as 'x' in KSPSolve In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 1:43 PM, Jed Brown wrote: > On Wed, Oct 12, 2011 at 06:38, Dominik Szczerba wrote: > >> Is it legal to call KSPSolve with the solution vector being a >> ghost-aware vector created e.g. with VecCreateGhost? >> > > Yes > > Would that also mean, that 'b' vector can also be ghosted? That would be really cool... Many thanks, Dominik -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Oct 19 12:45:43 2011 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Oct 2011 17:45:43 +0000 Subject: [petsc-users] Using ghosted vector as 'x' in KSPSolve In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 5:42 PM, Dominik Szczerba wrote: > > > On Wed, Oct 12, 2011 at 1:43 PM, Jed Brown wrote: > >> On Wed, Oct 12, 2011 at 06:38, Dominik Szczerba wrote: >> >>> Is it legal to call KSPSolve with the solution vector being a >>> ghost-aware vector created e.g. with VecCreateGhost? >>> >> >> Yes >> >> Would that also mean, that 'b' vector can also be ghosted? That would be > really cool... > In PETSc, ghosted vectors just have extra information, so they can always be used as normal vectors. Matt > Many thanks, > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 19 12:55:50 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 12:55:50 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 12:32, Blaise Bourdin wrote: > Hi, > > > > On Wed, Oct 19, 2011 at 10:54, Blaise Bourdin > wrote: > > Hi, > > > > I am trying to use TS to solve a simple transient problem in an > unstructured finite element f90 code. > > > > 1. Section 6.1.1 of the manual refers to a TSSetMatrices function that > can be used to set the RHS and LHS matrices, but I can;t find it. Is this > section outdated? > > > > Yes, I must have missed this section when updating the documentation. > > OK, that makes more sense now. > > > > 2. Since we are using unstructured finite elements, the LHS matrix is not > the identity. As far as I understand, we have two possible choices: > > - Use a mass lumping approximation of the variational identity matrix > (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. > > > > You can also write this as a special case of the choice below where you > use -ksp_type preonly -pc_type jacobi. > I am not sure I am following you. Can you elaborate? > To solve M*Xdot = F(X) IFunction(X,Xdot) = M*Xdot IJacobian(X,Xdot) = M RHSFunction(X) = F(X) Now if you lump M, then you can solve it exactly with one iteration of Jacobi. My options were just turn off any extra norms that would normally be computed by an implicit solve. If M is the consistent mass matrix, then you will need some iterations. > > > - Use an IMEX method where the implicit matrix is the variational > identity M. > > Is this right? What is the recommended way? > > > > I would just do this because it's the most flexible. 
See the user's > manual section on IMEX methods. If you are interested in adaptive error > control, then you should also check out -ts_type rosw in petsc-dev. In any > case, you can write your mass matrix as well as any stiff terms that you > want to treat implicitly into TSSetIFunction(), provide an (approximate) > Jacobian with TSSetIJacobian(), and put the rest in TSSetRHSFunction(). You > can be even more sloppy about it with TSROSW. > > > > Look at src/ts/examples/tutorials/ex22.c (has a Fortran twin, ex22f.F) or > ex25.c in petsc-dev. > > > > How about recasting the problem as a DAE? The documentation seems to imply > that this is feasible. "For ODE with nontrivial mass matrices such as arise > in FEM, the implicit/DAE interface significantly reduces overhead to prepare > the system for algebraic solvers (SNES/KSP) by having the user assemble the > correctly shifted matrix." > > Following ex15, solving > u_t = \Delta u > can be recast as solving > F(t,u,\dot u) = 0 > with > F(t,u,v) = v-\Delta u. > > In this case, the IJacobian would be > K+aM, > where K is the stiffness matrix (K_{i,j} = \int_\Omega \nabla \phi_i \cdot > \nabla \phi_j\,dx) and M the mass matrix (M_{i,j} = \int_\Omega \phi_i > \phi_j\,dx) . > > At the continuous level, the IFunction would be v-\Delta u, which cannot be > evaluated directly in the finite element basis by either solving > M F = M\dot u + Ku > or using mass lumping. > > Am I expected to do this in my IFunction or is there a way to pass the mass > matrix to the TS? > You should have some sense for what K is and whether it makes sense to integrate your system implicitly or not. If you know that you want to use fully implicit methods, then just dump everything into the IFunction. If part of your system is non-stiff (and especially if it has less than C^1 regularity, or you need special properties from a certain explicit method), then you can use the IMEX form G(X,Xdot) = F(X) with G(X,Xdot) := M*Xdot You can pass TSComputeIFunctionLinear and/or TSComputeIJacobianConstant to TSSetIFunction() and TSSetIJacobian respectively if you have a linear constant mass matrix. > > As far as I understand, using IMPEX will lead to the same issue regardless > of the way I write my problem, i.e. wether I write G(t,u,v) = v-\Delta u > and F(t,u) = 0, or G(t,u,v) = v and F(t,u) = \Delta u. > > As far as I can see, all examples and tests use structured meshes where the > mass matrix is the identity matrix. > Identity as a mass matrix has nothing to do with structured vs. unstructured, it's a matter of continuous finite elements versus finite difference/finite volume/certain Petrov-Galerkin methods. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Wed Oct 19 13:26:14 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Wed, 19 Oct 2011 20:26:14 +0200 Subject: [petsc-users] VecSetType and ghosted vectors Message-ID: Is it possible to VecCreate() first and set its type to ghosted later, just as it is possible with VECMPI? I do not seem to find VECGHOST or so in petscvec.h... http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Vec/VecSetType.html#VecSetType Thanks for any hints. Dominik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Wed Oct 19 13:27:50 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 13:27:50 -0500 Subject: [petsc-users] VecSetType and ghosted vectors In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 13:26, Dominik Szczerba wrote: > Is it possible to VecCreate() first and set its type to ghosted later, just > as it is possible with VECMPI? > Not currently, you have to use VecCreateGhost*(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Wed Oct 19 16:08:39 2011 From: bourdin at lsu.edu (Blaise Bourdin) Date: Wed, 19 Oct 2011 16:08:39 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: Dear Jed, I get it now... My confusion was due to being used to derived all scheme on the pde, then discretizing, whereas the documentation assumes that the equation is already discretized. I should have figured it out. Is it right to think of the division between ODE, DAE and IMEX in the documentation as Fully Explicit vs. Fully implicit vs. Semi-implicit? Blaise On Oct 19, 2011, at 12:55 PM, Jed Brown wrote: > On Wed, Oct 19, 2011 at 12:32, Blaise Bourdin wrote: > Hi, > > > > On Wed, Oct 19, 2011 at 10:54, Blaise Bourdin wrote: > > Hi, > > > > I am trying to use TS to solve a simple transient problem in an unstructured finite element f90 code. > > > > 1. Section 6.1.1 of the manual refers to a TSSetMatrices function that can be used to set the RHS and LHS matrices, but I can;t find it. Is this section outdated? > > > > Yes, I must have missed this section when updating the documentation. > > OK, that makes more sense now. > > > > 2. Since we are using unstructured finite elements, the LHS matrix is not the identity. As far as I understand, we have two possible choices: > > - Use a mass lumping approximation of the variational identity matrix (mass matrix), M, and use M^{-1}K for the RHS matrix instead of K. > > > > You can also write this as a special case of the choice below where you use -ksp_type preonly -pc_type jacobi. > I am not sure I am following you. Can you elaborate? > > To solve > > M*Xdot = F(X) > > IFunction(X,Xdot) = M*Xdot > IJacobian(X,Xdot) = M > RHSFunction(X) = F(X) > > Now if you lump M, then you can solve it exactly with one iteration of Jacobi. My options were just turn off any extra norms that would normally be computed by an implicit solve. If M is the consistent mass matrix, then you will need some iterations. > > > > - Use an IMEX method where the implicit matrix is the variational identity M. > > Is this right? What is the recommended way? > > > > I would just do this because it's the most flexible. See the user's manual section on IMEX methods. If you are interested in adaptive error control, then you should also check out -ts_type rosw in petsc-dev. In any case, you can write your mass matrix as well as any stiff terms that you want to treat implicitly into TSSetIFunction(), provide an (approximate) Jacobian with TSSetIJacobian(), and put the rest in TSSetRHSFunction(). You can be even more sloppy about it with TSROSW. > > > > Look at src/ts/examples/tutorials/ex22.c (has a Fortran twin, ex22f.F) or ex25.c in petsc-dev. > > > > How about recasting the problem as a DAE? The documentation seems to imply that this is feasible. 
"For ODE with nontrivial mass matrices such as arise in FEM, the implicit/DAE interface significantly reduces overhead to prepare the system for algebraic solvers (SNES/KSP) by having the user assemble the correctly shifted matrix." > > Following ex15, solving > u_t = \Delta u > can be recast as solving > F(t,u,\dot u) = 0 > with > F(t,u,v) = v-\Delta u. > > In this case, the IJacobian would be > K+aM, > where K is the stiffness matrix (K_{i,j} = \int_\Omega \nabla \phi_i \cdot \nabla \phi_j\,dx) and M the mass matrix (M_{i,j} = \int_\Omega \phi_i \phi_j\,dx) . > > At the continuous level, the IFunction would be v-\Delta u, which cannot be evaluated directly in the finite element basis by either solving > M F = M\dot u + Ku > or using mass lumping. > > Am I expected to do this in my IFunction or is there a way to pass the mass matrix to the TS? > > You should have some sense for what K is and whether it makes sense to integrate your system implicitly or not. If you know that you want to use fully implicit methods, then just dump everything into the IFunction. If part of your system is non-stiff (and especially if it has less than C^1 regularity, or you need special properties from a certain explicit method), then you can use the IMEX form G(X,Xdot) = F(X) with > > G(X,Xdot) := M*Xdot > > You can pass TSComputeIFunctionLinear and/or TSComputeIJacobianConstant to TSSetIFunction() and TSSetIJacobian respectively if you have a linear constant mass matrix. Got it. > As far as I understand, using IMPEX will lead to the same issue regardless of the way I write my problem, i.e. wether I write G(t,u,v) = v-\Delta u and F(t,u) = 0, or G(t,u,v) = v and F(t,u) = \Delta u. > > As far as I can see, all examples and tests use structured meshes where the mass matrix is the identity matrix. > > Identity as a mass matrix has nothing to do with structured vs. unstructured, it's a matter of continuous finite elements versus finite difference/finite volume/certain Petrov-Galerkin methods. yep. Sorry, I meant finite differences, not structured mesh. > > -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at mcs.anl.gov Wed Oct 19 16:13:01 2011 From: sean at mcs.anl.gov (Sean Farley) Date: Wed, 19 Oct 2011 16:13:01 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: > > Is it right to think of the division between ODE, DAE and IMEX in the > documentation as Fully Explicit vs. Fully implicit vs. Semi-implicit? > Mostly, yes. Indeed, we made 'shortcuts' such as "if the user provides no IFunction, then treat it as udot = F(u,t)" and "if the user provides no RHS, then treat it as G(udot, u, t) = 0." -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 19 16:16:00 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 16:16:00 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 16:08, Blaise Bourdin wrote: > I get it now... My confusion was due to being used to derived all scheme on > the pde, then discretizing, whereas the documentation assumes that the > equation is already discretized. I should have figured it out. 
> For software purposes and sometimes also for analysis, the "method of lines" approach is often useful. That's how the TS interfaces are set up. > > Is it right to think of the division between ODE, DAE and IMEX in the > documentation as Fully Explicit vs. Fully implicit vs. Semi-implicit? > Sure, but it's a matter of the interface more than the method. You can write Xdot = F(X) and use -ts_type beuler to solve it fully implicitly. I would consider G(X,Xdot) = F(X) to be the most general interface. When an IMEX method is used, this has the clear semantics that G is implicit and F is explicit. Explicit methods usually assume G(X,Xdot) = Xdot which is the default if you never call TSSetIFunction. I think we will eventually have support for using standard explicit methods where you just put the mass matrix into G, but that isn't done yet. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at ibrae.ac.ru Wed Oct 19 16:49:07 2011 From: ram at ibrae.ac.ru (=?UTF-8?B?0JDQu9C10LrRgdC10Lkg0KDRj9C30LDQvdC+0LI=?=) Date: Thu, 20 Oct 2011 01:49:07 +0400 Subject: [petsc-users] Using ghosted vector as 'x' in KSPSolve In-Reply-To: References: Message-ID: COOL! ) 2011/10/19 Matthew Knepley > On Wed, Oct 19, 2011 at 5:42 PM, Dominik Szczerba wrote: > >> >> >> On Wed, Oct 12, 2011 at 1:43 PM, Jed Brown wrote: >> >>> On Wed, Oct 12, 2011 at 06:38, Dominik Szczerba wrote: >>> >>>> Is it legal to call KSPSolve with the solution vector being a >>>> ghost-aware vector created e.g. with VecCreateGhost? >>>> >>> >>> Yes >>> >>> Would that also mean, that 'b' vector can also be ghosted? That would be >> really cool... >> > > In PETSc, ghosted vectors just have extra information, so they can always > be used as > normal vectors. > > Matt > > >> Many thanks, >> Dominik >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Best regards, Alexey Ryazanov ______________________________________ Nuclear Safety Institute of Russian Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Oct 19 16:54:53 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 19 Oct 2011 16:54:53 -0500 Subject: [petsc-users] VecSetType and ghosted vectors In-Reply-To: References: Message-ID: <0189A696-1142-4439-AFC2-CF09EF61C2C8@mcs.anl.gov> On Oct 19, 2011, at 1:27 PM, Jed Brown wrote: > On Wed, Oct 19, 2011 at 13:26, Dominik Szczerba wrote: > Is it possible to VecCreate() first and set its type to ghosted later, just as it is possible with VECMPI? > > Not currently, you have to use VecCreateGhost*(). Ghosting is basically a property of VECMPI not a different vector class. I have added a VecMPISetGhost() to petsc-dev that you can call after VecCreate(), VecSetType() and VecSetSizes(). It is not clear to me that this is a particularly useful function because you generally decide before you write the code if you are going to use ghosted vectors or not; that is, it is unlikely it is a runtime decision so you might as well just use VecCreateGhost() instead of the sequence VecCreate(), VecSetType(), VecSetSizes(), VecMPISetGhost(). But for uniformity of PETSc style it is good to have it so here it is. 
Barry From dominik at itis.ethz.ch Wed Oct 19 17:09:09 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 20 Oct 2011 00:09:09 +0200 Subject: [petsc-users] VecSetType and ghosted vectors In-Reply-To: <0189A696-1142-4439-AFC2-CF09EF61C2C8@mcs.anl.gov> References: <0189A696-1142-4439-AFC2-CF09EF61C2C8@mcs.anl.gov> Message-ID: Many thanks! My rationale: I have abstracted a C++ class called PetscLinearSolver that is supposed to wrap all the complex ksp & co functionality in an easy to use C++ class. There, I do not know apriori if a class user will want to have x and b ghosted or not - that depends on the problem type. So even it it is known in runtime, it is not known in coding or setup time. Therefore I would benefit from a function to declare a vector as ghosted. Dominik On Wed, Oct 19, 2011 at 11:54 PM, Barry Smith wrote: > > On Oct 19, 2011, at 1:27 PM, Jed Brown wrote: > > > On Wed, Oct 19, 2011 at 13:26, Dominik Szczerba > wrote: > > Is it possible to VecCreate() first and set its type to ghosted later, > just as it is possible with VECMPI? > > > > Not currently, you have to use VecCreateGhost*(). > > > Ghosting is basically a property of VECMPI not a different vector class. > I have added a VecMPISetGhost() to petsc-dev that you can call after > VecCreate(), VecSetType() and VecSetSizes(). > > It is not clear to me that this is a particularly useful function because > you generally decide before you write the code if you are going to use > ghosted vectors or not; that is, it is unlikely it is a runtime decision so > you might as well just use VecCreateGhost() instead of the sequence > VecCreate(), VecSetType(), VecSetSizes(), VecMPISetGhost(). But for > uniformity of PETSc style it is good to have it so here it is. > > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 19 17:22:40 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 17:22:40 -0500 Subject: [petsc-users] VecSetType and ghosted vectors In-Reply-To: References: <0189A696-1142-4439-AFC2-CF09EF61C2C8@mcs.anl.gov> Message-ID: On Wed, Oct 19, 2011 at 17:09, Dominik Szczerba wrote: > My rationale: I have abstracted a C++ class called PetscLinearSolver that > is supposed to wrap all the complex ksp & co functionality in an easy to use > C++ class. Just a word of caution: many, many packages have tried to this, but all the advanced users end up pulling out the PETSc objects anyway. The reason for "complexity" in the KSP and PC interfaces is language-independent. You're not going to make it simpler without losing a lot of flexibility if you wrap it in C++. (If you prove me wrong, I want to see how you do it, but I haven't seen a competitive solution yet, and not for lack of trying or lack of competence.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Oct 19 17:29:03 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 19 Oct 2011 17:29:03 -0500 Subject: [petsc-users] VecSetType and ghosted vectors In-Reply-To: References: <0189A696-1142-4439-AFC2-CF09EF61C2C8@mcs.anl.gov> Message-ID: >> My rationale:?I have abstracted a C++ class called PetscLinearSolver that >> is supposed to wrap all the complex ksp & co functionality in an easy to use >> C++ class. I don't see why you would have to create your own vector here. If PetscLinearSolver is a wrapper around KSP object, and x, b are supplied by the user, what other work vectors are you allocating ? 
If this is your own custom solver, then a consistent layout can be obtained with VecDuplicate(x, &work_vec). As Jed points out, many packages have done this with a C++ wrapper but mostly these are thin calls that just forward it to the petsc solver with right options. Do you have a more complicated requirement ? On Wed, Oct 19, 2011 at 5:22 PM, Jed Brown wrote: > On Wed, Oct 19, 2011 at 17:09, Dominik Szczerba > wrote: >> >> My rationale:?I have abstracted a C++ class called PetscLinearSolver that >> is supposed to wrap all the complex ksp & co functionality in an easy to use >> C++ class. > > Just a word of caution: many, many packages have tried to this, but all the > advanced users end up pulling out the PETSc objects anyway. The reason for > "complexity" in the KSP and PC interfaces is language-independent. You're > not going to make it simpler without losing a lot of flexibility if you wrap > it in C++. (If you prove me wrong, I want to see how you do it, but I > haven't seen a competitive solution yet, and not for lack of trying or lack > of competence.) From bourdin at lsu.edu Wed Oct 19 21:39:33 2011 From: bourdin at lsu.edu (Blaise Bourdin) Date: Wed, 19 Oct 2011 21:39:33 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: <4EA87598-C5D7-4515-A869-F3BE3231196C@lsu.edu> On Oct 19, 2011, at 4:16 PM, Jed Brown wrote: > On Wed, Oct 19, 2011 at 16:08, Blaise Bourdin wrote: > I get it now... My confusion was due to being used to derived all scheme on the pde, then discretizing, whereas the documentation assumes that the equation is already discretized. I should have figured it out. > > For software purposes and sometimes also for analysis, the "method of lines" approach is often useful. That's how the TS interfaces are set up. That make sense. I work mostly on problems where whose time-continuous formulation is obtained by writing a time discrete problem, then letting the time discretization interval go to 0. I have too much of a tendency to think this way. > > > Is it right to think of the division between ODE, DAE and IMEX in the documentation as Fully Explicit vs. Fully implicit vs. Semi-implicit? > > Sure, but it's a matter of the interface more than the method. You can write > > Xdot = F(X) > > and use -ts_type beuler to solve it fully implicitly. I would consider > > G(X,Xdot) = F(X) > > to be the most general interface. When an IMEX method is used, this has the clear semantics that G is implicit and F is explicit. Explicit methods usually assume G(X,Xdot) = Xdot which is the default if you never call TSSetIFunction. I think we will eventually have support for using standard explicit methods where you just put the mass matrix into G, but that isn't done yet. OK, this is clear. It will be really nice indeed when everything is reorganized this way. Thanks again, Blaise -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Wed Oct 19 21:40:05 2011 From: bourdin at lsu.edu (Blaise Bourdin) Date: Wed, 19 Oct 2011 21:40:05 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: References: Message-ID: On Oct 19, 2011, at 4:13 PM, Sean Farley wrote: > Is it right to think of the division between ODE, DAE and IMEX in the documentation as Fully Explicit vs. Fully implicit vs. Semi-implicit? 
> > Mostly, yes. Indeed, we made 'shortcuts' such as "if the user provides no IFunction, then treat it as udot = F(u,t)" and "if the user provides no RHS, then treat it as G(udot, u, t) = 0." That make sense, yes. Thanks, Blaise -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Oct 19 21:53:41 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 19 Oct 2011 21:53:41 -0500 Subject: [petsc-users] Questions about TS In-Reply-To: <4EA87598-C5D7-4515-A869-F3BE3231196C@lsu.edu> References: <4EA87598-C5D7-4515-A869-F3BE3231196C@lsu.edu> Message-ID: On Wed, Oct 19, 2011 at 21:39, Blaise Bourdin wrote: > OK, this is clear. It will be really nice indeed when everything is > reorganized this way. > Note that most methods work this way already. The main hole right now is that explicit methods don't support nontrivial mass matrices. The best error control is currently in the Rosenbrock-W methods (TSROSW), but the other methods will also grow support (via extrapolation for those methods that don't have any embedded error estimators). One aspect that we haven't settled on an API for is user-provided stability requirements (especially for explicit methods or for the explicit part of an IMEX scheme, sometimes strong stability). Often the user can cheaply characterize some extreme eigenvalues of their system in which case we could use those estimates to limit step size instead of needing to reject steps only after the error estimator decides the accuracy is unacceptable. Specifically, the user-provided stability limits are sharper and don't require tuning the tolerances on the time integrator so that it detects instability immediately, but does not shorten the steps excessively. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Oct 20 09:34:52 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 20 Oct 2011 14:34:52 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: I'm confused about how to use KSPMonitorSingularValue in my fortran code (not using the options database) By looking at ex30.c I have: call KSPMonitorSet(ksp,KSPMonitorSingularValue,PETSC_NULL,PETSC_NULL,ierr); call KSPSetComputeSingularValues(ksp,PETSC_TRUE,ierr) in the part where I initialize the KSP. But where to call KSPMonitorSingularValue? And how do I get "n" and "rnorm"? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From knepley at gmail.com Thu Oct 20 09:37:13 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Oct 2011 14:37:13 +0000 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 2:34 PM, Klaij, Christiaan wrote: > I'm confused about how to use KSPMonitorSingularValue > in my fortran code (not using the options database) > > By looking at ex30.c I have: > > call KSPMonitorSet(ksp,KSPMonitorSingularValue,PETSC_NULL,PETSC_NULL,ierr); > call KSPSetComputeSingularValues(ksp,PETSC_TRUE,ierr) > > in the part where I initialize the KSP. But where to call > KSPMonitorSingularValue? And how do I get "n" and "rnorm"? > You don't. 
The monitor routine is called by the solver. What exactly do you want to do? Matt > Chris > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 20 09:38:46 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 20 Oct 2011 09:38:46 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 09:34, Klaij, Christiaan wrote: > call KSPMonitorSet(ksp,KSPMonitorSingularValue,PETSC_NULL,PETSC_NULL,ierr); > This gets KSPMonitorSingularValue called after each iteration. That routine prints the singular values. > call KSPSetComputeSingularValues(ksp,PETSC_TRUE,ierr) > This instructs the KSP to save the extra information needed by the monitor above. > > in the part where I initialize the KSP. But where to call > KSPMonitorSingularValue? And how do I get "n" and "rnorm"? > Do you want a monitor or do you want to extract estimates of the singular values after KSPSolve has converged? You might be interested in KSPComputeExtremeSingularValues(), for example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rtm at eecs.utk.edu Thu Oct 20 09:52:08 2011 From: rtm at eecs.utk.edu (Richard Tran Mills) Date: Thu, 20 Oct 2011 10:52:08 -0400 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: <4EA03598.7000601@eecs.utk.edu> I noticed this thread and just wanted to point out that you need to be careful with what you use these singular values for. In my own (limited) experience, I seem to recall finding that the singular values you get from the Hessenberg matrix in GMRES are pretty inaccurate. Your mileage may vary. --Richard On 10/20/2011 10:38 AM, Jed Brown wrote: > On Thu, Oct 20, 2011 at 09:34, Klaij, Christiaan > wrote: > > call > KSPMonitorSet(ksp,KSPMonitorSingularValue,PETSC_NULL,PETSC_NULL,ierr); > > > This gets KSPMonitorSingularValue called after each iteration. That > routine prints the singular values. > > call KSPSetComputeSingularValues(ksp,PETSC_TRUE,ierr) > > > This instructs the KSP to save the extra information needed by the > monitor above. > > > in the part where I initialize the KSP. But where to call > KSPMonitorSingularValue? And how do I get "n" and "rnorm"? > > > Do you want a monitor or do you want to extract estimates of the > singular values after KSPSolve has converged? You might be interested > in KSPComputeExtremeSingularValues(), for example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 20 09:54:56 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 20 Oct 2011 09:54:56 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: <4EA03598.7000601@eecs.utk.edu> References: <4EA03598.7000601@eecs.utk.edu> Message-ID: On Thu, Oct 20, 2011 at 09:52, Richard Tran Mills wrote: > I noticed this thread and just wanted to point out that you need to be > careful with what you use these singular values for. 
In my own (limited) > experience, I seem to recall finding that the singular values you get from > the Hessenberg matrix in GMRES are pretty inaccurate. Your mileage may > vary. True, usually the largest eigen/singular value is fairly accurate, but the smallest is typically not found until the solve has converged. If you use restarts, then the estimates only come from the Hessenberg matrix associated with last restart. If you want meaningful estimates with GMRES, you really can't use restarts. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Oct 20 10:15:04 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 20 Oct 2011 15:15:04 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: Thanks for all the answers! First, I would like to see something printed to stdout, right now I don't. Assuming my KSPMonitorSet call is correct, could it be that I don't see anything because I'm using FGMRES? Should I get a warning when using this monitor with any other KSP beside CG and GMRES? Second, I have a case where FGMRES stagnates with my custom PC. I would like to get an impression of the extreme eigenvalues after applying the PC. If the estimates are only trustworthy upon convergence, then it's useless of course... The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard with SIMPLE preconditioning. It works for most case but stagnates for this particular case although SIMPLE as solver does converge nicely. dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Thu Oct 20 10:21:25 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 20 Oct 2011 10:21:25 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 10:15, Klaij, Christiaan wrote: > The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard > with SIMPLE preconditioning. It works for most case but stagnates > for this particular case although SIMPLE as solver does converge nicely. > How are you applying the action of the linear operator? If you use finite differencing, it could be inaccurate. Is this incompressible or a low-Mach compressible formulation? Try -ksp_monitor_true_residual, if the true residual drifts from the unpreconditioned residual computed by FGMRES, the Krylov space could be losing orthogonality. You can try -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in restarts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 20 10:29:29 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Oct 2011 15:29:29 +0000 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 3:15 PM, Klaij, Christiaan wrote: > Thanks for all the answers! > > First, I would like to see something printed to stdout, right now I don't. > Assuming my KSPMonitorSet call is correct, could it be that I don't > see anything because I'm using FGMRES? Should I get a warning > when using this monitor with any other KSP beside CG and GMRES? > If you just want to see the monitor, why not use the command line or PetscOptionsSetValue()? Matt > Second, I have a case where FGMRES stagnates with my custom > PC. 
I would like to get an impression of the extreme eigenvalues after > applying the PC. If the estimates are only trustworthy upon > convergence, then it's useless of course... > > The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard > with SIMPLE preconditioning. It works for most case but stagnates > for this particular case although SIMPLE as solver does converge nicely. > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 20 10:35:13 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 20 Oct 2011 10:35:13 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 10:29, Matthew Knepley wrote: > On Thu, Oct 20, 2011 at 3:15 PM, Klaij, Christiaan wrote: > >> Thanks for all the answers! >> >> First, I would like to see something printed to stdout, right now I don't. >> Assuming my KSPMonitorSet call is correct, could it be that I don't >> see anything because I'm using FGMRES? Should I get a warning >> when using this monitor with any other KSP beside CG and GMRES? >> > > If you just want to see the monitor, why not use the command line or > PetscOptionsSetValue()? > This is the code for setting up the monitor, you can call the part inside the if statement yourself if you like. ierr = PetscOptionsString("-ksp_monitor_singular_value","Monitor singular values","KSPMonitorSet","stdout",monfilename,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); if (flg) { ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = PetscViewerASCIIOpen(((PetscObject)ksp)->comm,monfilename,&monviewer);CHKERRQ(ierr); ierr = KSPMonitorSet(ksp,KSPMonitorSingularValue,monviewer,(PetscErrorCode (*)(void**))PetscViewerDestroy);CHKERRQ(ierr); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffrey.k.wiens at gmail.com Thu Oct 20 15:05:08 2011 From: jeffrey.k.wiens at gmail.com (Jeff Wiens) Date: Thu, 20 Oct 2011 13:05:08 -0700 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython Message-ID: I am in the process of learning PetSc 3.1 and petsc4py 1.2. Since there is so little documentation for petsc4py, I am resorting to the PetSc mailing list. If anyone knows any resources for petsc4py, it would be really helpful. I am attempting to make a simple test problem where I 1) Initialise PetSc from Python and then 2) Call a "C" function, using Cython as a Wrapper, to construct and perform a dot product for two Petsc Vectors. I know there are better and simpler ways to deal with this problem. However, this is a simple test problem which allows python PetSc objects to interact with a Petsc C library. The problem seems to be that PetSc is no longer initialised when you call a C function. If I initialise PETSc from C, the program works. However, if I don't call init() from C, the program won't use the python PetSc. Eventually, I would like to Initialise PETSc and construct my PETSc vectors in python and then have my C code perform operations on them. 
Again, this yet another test problem. The code is as follows: ------- Python Code ----------- import sys, petsc4py petsc4py.init(sys.argv) from petsc4py import PETSc import dot val = dot.mydot() PETSc.Sys.Print( "PETSc Dot Product: %i"%val ) ------- Cython Code ----------- cdef extern from "c_files/dot.h": void init() double dot() def mydot(): #init(); return dot(); ------- C Code ----------- void init() { PetscInitialize(NULL,NULL,(char *)0,NULL); } double dot() { PetscScalar r; Vec v1,v2; VecCreate(PETSC_COMM_WORLD, &v1); VecCreate(PETSC_COMM_WORLD, &v2); VecSetSizes(v1, PETSC_DECIDE, 5000); VecSetSizes(v2, PETSC_DECIDE, 5000); VecSetFromOptions(v1); VecSetFromOptions(v2); VecSet(v1,1.0); VecSet(v2,2.0); VecAssemblyBegin(v1); VecAssemblyEnd(v1); VecAssemblyBegin(v2); VecAssemblyEnd(v2); VecDot(v1, v2, &r); return r; } From ecoon at lanl.gov Thu Oct 20 15:26:10 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Thu, 20 Oct 2011 14:26:10 -0600 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: Message-ID: <1319142370.23290.14.camel@echo.lanl.gov> On Thu, 2011-10-20 at 13:05 -0700, Jeff Wiens wrote: > Eventually, I would like > to Initialise PETSc and construct my PETSc vectors in python and then > have my C code perform operations on them. Again, this yet another > test problem. > Maybe I don't completely understand your planned working model, but I don't think it's the right choice. petsc4py is very good at making setup and "high level" features of PETSc work well quickly and easily. If you plan to call things like VecSetSizes() from C, you might as well write your entire program in C. It would be much easier to simply call VecSetSizes() from petsc4py via PETSc.Vec().setSizes(). What I suspect you're trying to avoid is doing the work in python, including things like numerical calculations on the data contained within the Vec. Doing this in C via Cython is a good idea. The better model for implementing this sort of thing is to do all of your setup in python, getting the data from the Vec, and passing that data off to your cython/c code, i.e. something like: -- Python Code -- import sys, petsc4py petsc4py.init(sys.argv) from petsc4py import PETSc v1 = PETSc.Vec().createSeq(3) v2 = v1.duplicate() v1.setFromOptions() v2.setFromOptions() v1.set(3.) v2.set(6.) 
import dot val = dot.mydot(v1[...], v2[...]) # note that these ellipses treat the underlying C-data stored in Vec as data to a numpy array v1.destroy() v2.destroy() -- Cython code -- import numpy cimport numpy DTYPE = numpy.float ctypedef numpy.float_t DTYPE_t cdef mydot( numpy.ndarray[DTYPE_t, ndim=1] v1, numpy.ndarray[DTYPE_t, ndim=1] v2): cdef int np = v1.shape[0] cdef lcv cdef DTYPE_t val val = 0.0 for lcv in range(np): val += v1[lcv]*v2[lcv] return val --------- Ethan > The code is as follows: > > ------- Python Code ----------- > > import sys, petsc4py > petsc4py.init(sys.argv) > from petsc4py import PETSc > > import dot > val = dot.mydot() > PETSc.Sys.Print( "PETSc Dot Product: %i"%val ) > > ------- Cython Code ----------- > > cdef extern from "c_files/dot.h": > void init() > double dot() > > def mydot(): > #init(); > return dot(); > > ------- C Code ----------- > > void init() > { > PetscInitialize(NULL,NULL,(char *)0,NULL); > } > > > double dot() > { > PetscScalar r; > Vec v1,v2; > > VecCreate(PETSC_COMM_WORLD, &v1); > VecCreate(PETSC_COMM_WORLD, &v2); > > VecSetSizes(v1, PETSC_DECIDE, 5000); > VecSetSizes(v2, PETSC_DECIDE, 5000); > VecSetFromOptions(v1); > VecSetFromOptions(v2); > > VecSet(v1,1.0); > VecSet(v2,2.0); > > VecAssemblyBegin(v1); > VecAssemblyEnd(v1); > VecAssemblyBegin(v2); > VecAssemblyEnd(v2); > > VecDot(v1, v2, &r); > return r; > } -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From jeffrey.k.wiens at gmail.com Thu Oct 20 15:40:32 2011 From: jeffrey.k.wiens at gmail.com (Jeff Wiens) Date: Thu, 20 Oct 2011 13:40:32 -0700 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: <1319142370.23290.14.camel@echo.lanl.gov> References: <1319142370.23290.14.camel@echo.lanl.gov> Message-ID: The point of the example was to perform the PetSc operation in C. Although you could do the dot product in Cython or Python, it defeats my purpose. I am attempting to write a wrapper for an existing PetSc C code base. I would like to setup PetSc and construct the initial conditions from python. I would then like to send this information to the PetSc C program to execute and then return to Python the final state as a petsc4py vector. If there is a better way of doing this, I would love to hear it. However, because of the size of the existing PetSc C code, rewriting it in Cython or Python is not an option. Jeff On Thu, Oct 20, 2011 at 1:26 PM, Ethan Coon wrote: > On Thu, 2011-10-20 at 13:05 -0700, Jeff Wiens wrote: >> Eventually, I would like >> to Initialise PETSc and construct my PETSc vectors in python and then >> have my C code perform operations on them. Again, this yet another >> test problem. >> > > Maybe I don't completely understand your planned working model, but I > don't think it's the right choice. > > petsc4py is very good at making setup and "high level" features of PETSc > work well quickly and easily. ?If you plan to call things like > VecSetSizes() from C, you might as well write your entire program in C. > It would be much easier to simply call VecSetSizes() from petsc4py ?via > PETSc.Vec().setSizes(). > > What I suspect you're trying to avoid is doing the work in python, > including things like numerical calculations on the data contained > within the Vec. ?Doing this in C via Cython is a good idea. 
?The better > model for implementing this sort of thing is to do all of your setup in > python, getting the data from the Vec, and passing that data off to your > cython/c code, i.e. something like: > > -- Python Code -- > > import sys, petsc4py > petsc4py.init(sys.argv) > from petsc4py import PETSc > > v1 = PETSc.Vec().createSeq(3) > v2 = v1.duplicate() > v1.setFromOptions() > v2.setFromOptions() > > v1.set(3.) > v2.set(6.) > > import dot > val = dot.mydot(v1[...], v2[...]) # note that these ellipses treat the underlying C-data stored in Vec as data to a numpy array > > v1.destroy() > v2.destroy() > > > > -- Cython code -- > import numpy > cimport numpy > > DTYPE = numpy.float > ctypedef numpy.float_t DTYPE_t > > cdef mydot( numpy.ndarray[DTYPE_t, ndim=1] v1, > ? ? ? ? ? ?numpy.ndarray[DTYPE_t, ndim=1] v2): > ? ?cdef int np = v1.shape[0] > ? ?cdef lcv > ? ?cdef DTYPE_t val > > ? ?val = 0.0 > ? ?for lcv in range(np): > ? ? ? ?val += v1[lcv]*v2[lcv] > ? ?return val > > --------- > > Ethan > > > > >> The code is as follows: >> >> ------- Python Code ----------- >> >> import sys, petsc4py >> petsc4py.init(sys.argv) >> from petsc4py import PETSc >> >> import dot >> val = dot.mydot() >> PETSc.Sys.Print( "PETSc Dot Product: %i"%val ) >> >> ------- Cython Code ----------- >> >> cdef extern from "c_files/dot.h": >> ? ?void init() >> ? ?double dot() >> >> def mydot(): >> ? ?#init(); >> ? ?return dot(); >> >> ------- C Code ----------- >> >> void init() >> { >> ? ? PetscInitialize(NULL,NULL,(char *)0,NULL); >> } >> >> >> double dot() >> { >> ? ? PetscScalar r; >> ? ? Vec v1,v2; >> >> ? ? VecCreate(PETSC_COMM_WORLD, &v1); >> ? ? VecCreate(PETSC_COMM_WORLD, &v2); >> >> ? ? VecSetSizes(v1, PETSC_DECIDE, 5000); >> ? ? VecSetSizes(v2, PETSC_DECIDE, 5000); >> ? ? VecSetFromOptions(v1); >> ? ? VecSetFromOptions(v2); >> >> ? ? VecSet(v1,1.0); >> ? ? VecSet(v2,2.0); >> >> ? ? VecAssemblyBegin(v1); >> ? ? VecAssemblyEnd(v1); >> ? ? VecAssemblyBegin(v2); >> ? ? VecAssemblyEnd(v2); >> >> ? ? VecDot(v1, v2, &r); >> ? ? return r; >> } > > -- > ------------------------------------ > Ethan Coon > Post-Doctoral Researcher > Applied Mathematics - T-5 > Los Alamos National Laboratory > 505-665-8289 > > http://www.ldeo.columbia.edu/~ecoon/ > ------------------------------------ > > From dalcinl at gmail.com Thu Oct 20 15:57:52 2011 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 20 Oct 2011 17:57:52 -0300 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: Message-ID: On 20 October 2011 17:05, Jeff Wiens wrote: > > The problem seems to be that PetSc is no longer initialised when you > call a C function. If I initialise PETSc from C, the program works. > However, if I don't call init() from C, the program won't use the > python PetSc. Mmm... I bet you built PETSc with static libraries. There is not (easy&portable) way to mix petsc4py and other C codes when using static libraries. Please build PETSc with shared libraries. > Eventually, I would like > to Initialise PETSc and construct my PETSc vectors in python and then > have my C code perform operations on them. Again, this yet another > test problem. > Have you seen the demo/wrap-cython example in petsc4py sources? 
-- Lisandro Dalcin --------------- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo 3000 Santa Fe, Argentina Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169 From ecoon at lanl.gov Thu Oct 20 16:04:40 2011 From: ecoon at lanl.gov (Ethan Coon) Date: Thu, 20 Oct 2011 15:04:40 -0600 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: <1319142370.23290.14.camel@echo.lanl.gov> Message-ID: <1319144680.23290.16.camel@echo.lanl.gov> On Thu, 2011-10-20 at 13:40 -0700, Jeff Wiens wrote: > The point of the example was to perform the PetSc operation in C. > Although you could do the dot product in Cython or Python, it defeats > my purpose. I am attempting to write a wrapper for an existing PetSc C > code base. I would like to setup PetSc and construct the initial > conditions from python. I would then like to send this information to > the PetSc C program to execute and then return to Python the final > state as a petsc4py vector. If there is a better way of doing this, I > would love to hear it. However, because of the size of the existing > PetSc C code, rewriting it in Cython or Python is not an option. Yep, I didn't understand what you were trying to do. There is an example of wrapping existing PETSc code in cython in the demo/wrap-cython directory. Ethan > > Jeff > > On Thu, Oct 20, 2011 at 1:26 PM, Ethan Coon wrote: > > On Thu, 2011-10-20 at 13:05 -0700, Jeff Wiens wrote: > >> Eventually, I would like > >> to Initialise PETSc and construct my PETSc vectors in python and then > >> have my C code perform operations on them. Again, this yet another > >> test problem. > >> > > > > Maybe I don't completely understand your planned working model, but I > > don't think it's the right choice. > > > > petsc4py is very good at making setup and "high level" features of PETSc > > work well quickly and easily. If you plan to call things like > > VecSetSizes() from C, you might as well write your entire program in C. > > It would be much easier to simply call VecSetSizes() from petsc4py via > > PETSc.Vec().setSizes(). > > > > What I suspect you're trying to avoid is doing the work in python, > > including things like numerical calculations on the data contained > > within the Vec. Doing this in C via Cython is a good idea. The better > > model for implementing this sort of thing is to do all of your setup in > > python, getting the data from the Vec, and passing that data off to your > > cython/c code, i.e. something like: > > > > -- Python Code -- > > > > import sys, petsc4py > > petsc4py.init(sys.argv) > > from petsc4py import PETSc > > > > v1 = PETSc.Vec().createSeq(3) > > v2 = v1.duplicate() > > v1.setFromOptions() > > v2.setFromOptions() > > > > v1.set(3.) > > v2.set(6.) 
> > > > import dot > > val = dot.mydot(v1[...], v2[...]) # note that these ellipses treat the underlying C-data stored in Vec as data to a numpy array > > > > v1.destroy() > > v2.destroy() > > > > > > > > -- Cython code -- > > import numpy > > cimport numpy > > > > DTYPE = numpy.float > > ctypedef numpy.float_t DTYPE_t > > > > cdef mydot( numpy.ndarray[DTYPE_t, ndim=1] v1, > > numpy.ndarray[DTYPE_t, ndim=1] v2): > > cdef int np = v1.shape[0] > > cdef lcv > > cdef DTYPE_t val > > > > val = 0.0 > > for lcv in range(np): > > val += v1[lcv]*v2[lcv] > > return val > > > > --------- > > > > Ethan > > > > > > > > > >> The code is as follows: > >> > >> ------- Python Code ----------- > >> > >> import sys, petsc4py > >> petsc4py.init(sys.argv) > >> from petsc4py import PETSc > >> > >> import dot > >> val = dot.mydot() > >> PETSc.Sys.Print( "PETSc Dot Product: %i"%val ) > >> > >> ------- Cython Code ----------- > >> > >> cdef extern from "c_files/dot.h": > >> void init() > >> double dot() > >> > >> def mydot(): > >> #init(); > >> return dot(); > >> > >> ------- C Code ----------- > >> > >> void init() > >> { > >> PetscInitialize(NULL,NULL,(char *)0,NULL); > >> } > >> > >> > >> double dot() > >> { > >> PetscScalar r; > >> Vec v1,v2; > >> > >> VecCreate(PETSC_COMM_WORLD, &v1); > >> VecCreate(PETSC_COMM_WORLD, &v2); > >> > >> VecSetSizes(v1, PETSC_DECIDE, 5000); > >> VecSetSizes(v2, PETSC_DECIDE, 5000); > >> VecSetFromOptions(v1); > >> VecSetFromOptions(v2); > >> > >> VecSet(v1,1.0); > >> VecSet(v2,2.0); > >> > >> VecAssemblyBegin(v1); > >> VecAssemblyEnd(v1); > >> VecAssemblyBegin(v2); > >> VecAssemblyEnd(v2); > >> > >> VecDot(v1, v2, &r); > >> return r; > >> } > > > > -- > > ------------------------------------ > > Ethan Coon > > Post-Doctoral Researcher > > Applied Mathematics - T-5 > > Los Alamos National Laboratory > > 505-665-8289 > > > > http://www.ldeo.columbia.edu/~ecoon/ > > ------------------------------------ > > > > -- ------------------------------------ Ethan Coon Post-Doctoral Researcher Applied Mathematics - T-5 Los Alamos National Laboratory 505-665-8289 http://www.ldeo.columbia.edu/~ecoon/ ------------------------------------ From jeffrey.k.wiens at gmail.com Thu Oct 20 17:34:32 2011 From: jeffrey.k.wiens at gmail.com (Jeff Wiens) Date: Thu, 20 Oct 2011 15:34:32 -0700 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: <1319144680.23290.16.camel@echo.lanl.gov> References: <1319142370.23290.14.camel@echo.lanl.gov> <1319144680.23290.16.camel@echo.lanl.gov> Message-ID: Lisandro, the shared library was the problem. I seem to have everything working by including the --with-shared and --with-dynamic options in my PetSc 3.1 configuration. Note, I had to make significant modifications to your setup.py file for your cython demo to work (on my installation). I have attached my setup.py file, which is based on code I found in PyClaw. The main differences is overloading: build_src.build_src.generate_a_pyrex_source = generate_a_cython_source and including the following directories: INCLUDE_DIRS +=["/PATH/To/openmpi/1.4.3/include"] INCLUDE_DIRS += ["/PATH/To/hdf5/1.8.5-patch1/include"] Thanks for your help. It saved me a LOT of time. Jeff -------------- next part -------------- A non-text attachment was scrubbed... 
Name: setup.py Type: application/octet-stream Size: 3278 bytes Desc: not available URL: From dalcinl at gmail.com Thu Oct 20 18:51:53 2011 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 20 Oct 2011 20:51:53 -0300 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: <1319142370.23290.14.camel@echo.lanl.gov> <1319144680.23290.16.camel@echo.lanl.gov> Message-ID: On 20 October 2011 19:34, Jeff Wiens wrote: > Lisandro, the shared library was the problem. I seem to have > everything working by including the --with-shared and --with-dynamic > options in my PetSc 3.1 configuration. > The --with-dynamic stuff should not be required, and I think it's better if you remove it. > Note, I had to make significant modifications to your setup.py file > for your cython demo to work (on my installation). > Expected > I have attached my > setup.py file, which is based on code I found in PyClaw. The main > differences is overloading: > ? ? ? ? build_src.build_src.generate_a_pyrex_source = generate_a_cython_source > and including the following directories: > ? ?INCLUDE_DIRS +=["/PATH/To/openmpi/1.4.3/include"] > ? ?INCLUDE_DIRS += ["/PATH/To/hdf5/1.8.5-patch1/include"] > > Thanks for your help. It saved me a LOT of time. > Jeff > I'll take a look at your setup.py, thanks! -- Lisandro Dalcin --------------- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo 3000 Santa Fe, Argentina Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169 From C.Klaij at marin.nl Fri Oct 21 02:48:47 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 21 Oct 2011 07:48:47 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: > If you just want to see the monitor, why not use the command line or > PetscOptionsSetValue()? I have three ksp's: one for the momentum eqs (GMRES) and one for the pressure eq (CG) inside the SIMPLE preconditioner and one for the matrix-free coupled mass-momentum system (FGMRES). The command line shows me results for all but right now I'm only interested in monitoring FGMRES on the coupled system. > This is the code for setting up the monitor, you can call the part inside > the if statement yourself if you like. > > ierr = PetscOptionsString("-ksp_monitor_singular_value","Monitor > singular > values","KSPMonitorSet","stdout",monfilename,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); > if (flg) { > ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); > ierr = > PetscViewerASCIIOpen(((PetscObject)ksp)->comm,monfilename,&monviewer);CHKERRQ(ierr); > ierr = > KSPMonitorSet(ksp,KSPMonitorSingularValue,monviewer,(PetscErrorCode > (*)(void**))PetscViewerDestroy);CHKERRQ(ierr); > } I'm still confused whether it is supposed to work with FGMRES, the manual states only CG and GMRES. (Besides, I'm using fortran) > How are you applying the action of the linear operator? If you use finite > differencing, it could be inaccurate. Is this incompressible or a low-Mach > compressible formulation? Try -ksp_monitor_true_residual, if the true > residual drifts from the unpreconditioned residual computed by FGMRES, the > Krylov space could be losing orthogonality. You can try > -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in > restarts? It's incompressible Navier-Stokes. No finite differencing, the action is computed directly without approximations. It's right preconditioning, so preconditioned and true residual should be the same. 
I don't get any progress, the residual is stagnating from the very first iteration way before any restart. Regarding modified Gram Schmidt, I tried to set it as follows: call KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) But my compiler tells me: This name does not have a type, and must have an explicit type. [KSPGMRESMODIFIEDGRAMSCHMIDTORTHOGONALIZATIO] (petsc-3.1-p7, fortran with "use petscksp" and #include "finclude/petsckspdef.h") dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From dominik at itis.ethz.ch Fri Oct 21 04:18:53 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 11:18:53 +0200 Subject: [petsc-users] Error: Too many KSP monitors set Message-ID: I am getting this error when performing a series of KSP solves. Removing "-ksp_monitor_true_residual" from options removes the error, but I would like to see the residues on screen. I do not seem to find any functions in the docu to set max number of KSP monitors, but even if there is one - is it really wasteful to monitor each solve of my system? Thanks, Dominik [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Too many KSP monitors set! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 21 08:19:00 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Oct 2011 08:19:00 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Oct 21, 2011, at 2:48 AM, Klaij, Christiaan wrote: >> If you just want to see the monitor, why not use the command line or >> PetscOptionsSetValue()? > > I have three ksp's: one for the momentum eqs (GMRES) and one for > the pressure eq (CG) inside the SIMPLE preconditioner and one for > the matrix-free coupled mass-momentum system (FGMRES). The > command line shows me results for all but right now I'm only > interested in monitoring FGMRES on the coupled system. http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/docs/manualpages/KSP/KSPSetOptionsPrefix.html Give each ksp a unique prefix: for example -momentum -pressure and -mass-momentum Then use -mass-momentum_ksp_monitor_singular_value at the command line. Barry > > > > >> This is the code for setting up the monitor, you can call the part inside >> the if statement yourself if you like. >> >> ierr = PetscOptionsString("-ksp_monitor_singular_value","Monitor >> singular >> values","KSPMonitorSet","stdout",monfilename,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); >> if (flg) { >> ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); >> ierr = >> PetscViewerASCIIOpen(((PetscObject)ksp)->comm,monfilename,&monviewer);CHKERRQ(ierr); >> ierr = >> KSPMonitorSet(ksp,KSPMonitorSingularValue,monviewer,(PetscErrorCode >> (*)(void**))PetscViewerDestroy);CHKERRQ(ierr); >> } > > I'm still confused whether it is supposed to work with FGMRES, > the manual states only CG and GMRES. (Besides, I'm using fortran) > > > >> How are you applying the action of the linear operator? If you use finite >> differencing, it could be inaccurate. 
Is this incompressible or a low-Mach >> compressible formulation? Try -ksp_monitor_true_residual, if the true >> residual drifts from the unpreconditioned residual computed by FGMRES, the >> Krylov space could be losing orthogonality. You can try >> -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in >> restarts? > > It's incompressible Navier-Stokes. No finite differencing, the > action is computed directly without approximations. It's right > preconditioning, so preconditioned and true residual should be > the same. I don't get any progress, the residual is stagnating > from the very first iteration way before any restart. > > Regarding modified Gram Schmidt, I tried to set it as follows: > > call KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) > > But my compiler tells me: > > This name does not have a type, and must have an explicit type. [KSPGMRESMODIFIEDGRAMSCHMIDTORTHOGONALIZATIO] > > (petsc-3.1-p7, fortran with "use petscksp" and #include "finclude/petsckspdef.h") > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From manuel.perezcerquera at polito.it Fri Oct 21 08:24:44 2011 From: manuel.perezcerquera at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Fri, 21 Oct 2011 15:24:44 +0200 Subject: [petsc-users] PETSC-LINK- Error-Using Complex Types Message-ID: Hi all I build the complex version of Petsc. I used Microsoft Visual Stud . Run Petsc in this platform and got the following error: Error 1 error LNK2019: unresolved external symbol> __invalid_parameter_noinfo referenced in function "public:> char & __thiscall std::basic_string std::char_traits,class std::allocator> >::operator[](unsigned int)"> (??A?$basic_string at DU?$char_traits at D@std@@V?$allocator at D@2@@std@@QAEAADI at Z) libpetsc.lib(errtrace.o) Error 2 error LNK2001: unresolved external symbol __invalid_parameter_noinfo libpetsc.lib(err.o) Error 3 error LNK2001: unresolved external symbol __invalid_parameter_noinfo libpetsc.lib(plog.o) Error 4 fatal error LNK1120: 1 unresolved externals Debug\PAtreju.exe Could someone tell me how to resolve it. Best regards Manuel Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From bsmith at mcs.anl.gov Fri Oct 21 08:28:03 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Oct 2011 08:28:03 -0500 Subject: [petsc-users] Error: Too many KSP monitors set In-Reply-To: References: Message-ID: <51018310-2101-4A9D-846F-A6F07039430B@mcs.anl.gov> Dominik, This should not happen. Does this happen with petsc-3.2? Note that we are now supporting PETSc 3.2 and if the problem does not exist in 3.2 we will not fix it in 3.1. If it exists in 3.2 we will definitely fix it. Can you send us a small piece of code that exhibits this behavior or point to a PETSc example that exhibits this behavior so we can reproduce it and fix the error in 3.2 Barry We have specific code that prevents the On Oct 21, 2011, at 4:18 AM, Dominik Szczerba wrote: > I am getting this error when performing a series of KSP solves. 
Removing "-ksp_monitor_true_residual" from options removes the error, but I would like to see the residues on screen. > > I do not seem to find any functions in the docu to set max number of KSP monitors, but even if there is one - is it really wasteful to monitor each solve of my system? > > Thanks, > Dominik > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Too many KSP monitors set! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 > From petsc-maint at mcs.anl.gov Fri Oct 21 08:37:10 2011 From: petsc-maint at mcs.anl.gov (Satish Balay) Date: Fri, 21 Oct 2011 08:37:10 -0500 (CDT) Subject: [petsc-users] [petsc-maint #91113] PETSC-LINK- Error-Using Complex Types In-Reply-To: References: Message-ID: Do you get these errors with petsc examples? [when compiled with PETSc makefiles?] Satish On Fri, 21 Oct 2011, PEREZ CERQUERA MANUEL RICARDO wrote: > Hi all > > I build the complex version of Petsc. I used Microsoft > Visual Stud . Run Petsc in this platform and got the > following error: > > Error 1 error LNK2019: unresolved external symbol> > __invalid_parameter_noinfo referenced in function > "public:> char & __thiscall std::basic_string > std::char_traits,class std::allocator> > >::operator[](unsigned int)"> > (??A?$basic_string at DU?$char_traits at D@std@@V?$allocator at D@2@@std@@QAEAADI at Z) > libpetsc.lib(errtrace.o) > > Error 2 error LNK2001: unresolved external symbol > __invalid_parameter_noinfo libpetsc.lib(err.o) > Error 3 error LNK2001: unresolved external symbol > __invalid_parameter_noinfo libpetsc.lib(plog.o) > Error 4 fatal error LNK1120: 1 unresolved > externals Debug\PAtreju.exe > > Could someone tell me how to resolve it. > Best regards Manuel > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > > From dalcinl at gmail.com Fri Oct 21 08:53:12 2011 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 21 Oct 2011 10:53:12 -0300 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: <1319142370.23290.14.camel@echo.lanl.gov> <1319144680.23290.16.camel@echo.lanl.gov> Message-ID: On 20 October 2011 19:34, Jeff Wiens wrote: > > Thanks for your help. It saved me a LOT of time. > Jeff > BTW, have you ever used SWIG? If the functions you need to wrap are simple (let say, any PetscObject subtype and scalar paramenters) you can get your wrappers with less code to write on your side. -- Lisandro Dalcin --------------- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo 3000 Santa Fe, Argentina Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169 From dominik at itis.ethz.ch Fri Oct 21 09:05:37 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 16:05:37 +0200 Subject: [petsc-users] Error: Too many KSP monitors set In-Reply-To: <51018310-2101-4A9D-846F-A6F07039430B@mcs.anl.gov> References: <51018310-2101-4A9D-846F-A6F07039430B@mcs.anl.gov> Message-ID: Unfortunately, I am still forced to use version 3.1. I will come back to this issue after the upgrade to 3.2, which should be next week. 
Many thanks, Dominik On Fri, Oct 21, 2011 at 3:28 PM, Barry Smith wrote: > > Dominik, > > This should not happen. Does this happen with petsc-3.2? Note that we > are now supporting PETSc 3.2 and if the problem does not exist in 3.2 we > will not fix it in 3.1. If it exists in 3.2 we will definitely fix it. Can > you send us a small piece of code that exhibits this behavior or point to a > PETSc example that exhibits this behavior so we can reproduce it and fix the > error in 3.2 > > Barry > > We have specific code that prevents the > On Oct 21, 2011, at 4:18 AM, Dominik Szczerba wrote: > > > I am getting this error when performing a series of KSP solves. Removing > "-ksp_monitor_true_residual" from options removes the error, but I would > like to see the residues on screen. > > > > I do not seem to find any functions in the docu to set max number of KSP > monitors, but even if there is one - is it really wasteful to monitor each > solve of my system? > > > > Thanks, > > Dominik > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > [0]PETSC ERROR: Argument out of range! > > [0]PETSC ERROR: Too many KSP monitors set! > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 > CDT 2011 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chetan.jhurani at gmail.com Fri Oct 21 09:07:28 2011 From: chetan.jhurani at gmail.com (Chetan Jhurani) Date: Fri, 21 Oct 2011 10:07:28 -0400 Subject: [petsc-users] PETSC-LINK- Error-Using Complex Types In-Reply-To: References: Message-ID: <4ea17cad.f318340a.0950.ffffa9db@mx.google.com> You may be using a runtime link library for your visual studio project that is incompatible with what petsc used to build. This could be either due to debug/release choices or static/dynamic/ dynamic-multithreaded choices (or both). See http://msdn.microsoft.com/en-us/library/abx4dbyh%28v=vs.80%29.aspx as a starting point and compare the flags in your configure.log. I think petsc uses /MT by default, which is "Multithreaded, static link". Chetan > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of PEREZ > CERQUERA MANUEL RICARDO > Sent: Friday, October 21, 2011 9:25 AM > To: petsc-users at mcs.anl.gov; petsc-maint at mcs.anl.gov > Subject: [petsc-users] PETSC-LINK- Error-Using Complex Types > > Hi all > > I build the complex version of Petsc. I used Microsoft > Visual Stud . Run Petsc in this platform and got the > following error: > > Error 1 error LNK2019: unresolved external symbol> > __invalid_parameter_noinfo referenced in function > "public:> char & __thiscall std::basic_string > std::char_traits,class std::allocator> > >::operator[](unsigned int)"> > (??A?$basic_string at DU?$char_traits at D@std@@V?$allocator at D@2@@std@@QAEAADI at Z) > libpetsc.lib(errtrace.o) > > Error 2 error LNK2001: unresolved external symbol > __invalid_parameter_noinfo libpetsc.lib(err.o) > Error 3 error LNK2001: unresolved external symbol > __invalid_parameter_noinfo libpetsc.lib(plog.o) > Error 4 fatal error LNK1120: 1 unresolved > externals Debug\PAtreju.exe > > Could someone tell me how to resolve it. > Best regards Manuel > > Eng. Manuel Ricardo Perez Cerquera. MSc. 
Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 From dominik at itis.ethz.ch Fri Oct 21 09:29:59 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 16:29:59 +0200 Subject: [petsc-users] problem with initial value Message-ID: I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. Regards, Dominik -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Fri Oct 21 09:40:01 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 21 Oct 2011 14:40:01 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: >>> If you just want to see the monitor, why not use the command line or >>> PetscOptionsSetValue()? >> >> I have three ksp's: one for the momentum eqs (GMRES) and one for >> the pressure eq (CG) inside the SIMPLE preconditioner and one for >> the matrix-free coupled mass-momentum system (FGMRES). The >> command line shows me results for all but right now I'm only >> interested in monitoring FGMRES on the coupled system. > > http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/docs/manualpages/KSP/KSPSetOptionsPrefix.html > > Give each ksp a unique prefix: for example -momentum -pressure and -mass-momentum Then use -mass-momentum_ksp_monitor_singular_value at the command line. > > Barry Nice, thanks for the tip! Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From knepley at gmail.com Fri Oct 21 10:19:53 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 21 Oct 2011 15:19:53 +0000 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 7:48 AM, Klaij, Christiaan wrote: > > If you just want to see the monitor, why not use the command line or > > PetscOptionsSetValue()? > > I have three ksp's: one for the momentum eqs (GMRES) and one for > the pressure eq (CG) inside the SIMPLE preconditioner and one for > the matrix-free coupled mass-momentum system (FGMRES). The > command line shows me results for all but right now I'm only > interested in monitoring FGMRES on the coupled system. > > > > > > This is the code for setting up the monitor, you can call the part inside > > the if statement yourself if you like. 
> > > > ierr = PetscOptionsString("-ksp_monitor_singular_value","Monitor > > singular > > > values","KSPMonitorSet","stdout",monfilename,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); > > if (flg) { > > ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); > > ierr = > > > PetscViewerASCIIOpen(((PetscObject)ksp)->comm,monfilename,&monviewer);CHKERRQ(ierr); > > ierr = > > KSPMonitorSet(ksp,KSPMonitorSingularValue,monviewer,(PetscErrorCode > > (*)(void**))PetscViewerDestroy);CHKERRQ(ierr); > > } > > I'm still confused whether it is supposed to work with FGMRES, > the manual states only CG and GMRES. (Besides, I'm using fortran) > > > > > How are you applying the action of the linear operator? If you use finite > > differencing, it could be inaccurate. Is this incompressible or a > low-Mach > > compressible formulation? Try -ksp_monitor_true_residual, if the true > > residual drifts from the unpreconditioned residual computed by FGMRES, > the > > Krylov space could be losing orthogonality. You can try > > -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in > > restarts? > > It's incompressible Navier-Stokes. No finite differencing, the > action is computed directly without approximations. It's right > preconditioning, so preconditioned and true residual should be > the same. I don't get any progress, the residual is stagnating > from the very first iteration way before any restart. > > Regarding modified Gram Schmidt, I tried to set it as follows: > > call > KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) > > But my compiler tells me: > > This name does not have a type, and must have an explicit type. > [KSPGMRESMODIFIEDGRAMSCHMIDTORTHOGONALIZATIO] > It looks like you have a line length problem. Matt > (petsc-3.1-p7, fortran with "use petscksp" and #include > "finclude/petsckspdef.h") > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaonanavril at gmail.com Fri Oct 21 10:23:18 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Fri, 21 Oct 2011 09:23:18 -0600 Subject: [petsc-users] INSERTVALUES after ADDVALUES Message-ID: Dear all, I am assembling a matrix using ADD_VALUES option, I need to insert certain values to the matrix cause I need to apply some boundary conditions. But I got the error [0]PETSC ERROR: Object is in wrong state! [0]PETSC ERROR: You have already added values; you cannot now insert! Is anyone have a solution of this? Thanks, Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Oct 21 10:27:04 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 21 Oct 2011 15:27:04 +0000 Subject: [petsc-users] INSERTVALUES after ADDVALUES In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 3:23 PM, NAN ZHAO wrote: > Dear all, > > I am assembling a matrix using ADD_VALUES option, I need to insert certain > values to the matrix cause I need to apply some boundary conditions. But I > got the error > [0]PETSC ERROR: Object is in wrong state! 
> [0]PETSC ERROR: You have already added values; you cannot now insert! > > Is anyone have a solution of this? > Call Assembly between the ADD and INSERT calls. Matt > Thanks, > Nan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Oct 21 10:27:33 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 10:27:33 -0500 Subject: [petsc-users] INSERTVALUES after ADDVALUES In-Reply-To: References: Message-ID: 1. Assemble twice 2. Do not insert into Dirichlet rows/columns, perhaps by using negative indices. Then you can always ADD. On Oct 21, 2011 10:23 AM, "NAN ZHAO" wrote: > Dear all, > > I am assembling a matrix using ADD_VALUES option, I need to insert certain > values to the matrix cause I need to apply some boundary conditions. But I > got the error > [0]PETSC ERROR: Object is in wrong state! > [0]PETSC ERROR: You have already added values; you cannot now insert! > > Is anyone have a solution of this? > > Thanks, > Nan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Oct 21 10:29:00 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 21 Oct 2011 10:29:00 -0500 (CDT) Subject: [petsc-users] INSERTVALUES after ADDVALUES In-Reply-To: References: Message-ID: On Fri, 21 Oct 2011, NAN ZHAO wrote: > Dear all, > > I am assembling a matrix using ADD_VALUES option, I need to insert certain > values to the matrix cause I need to apply some boundary conditions. But I > got the error > [0]PETSC ERROR: Object is in wrong state! > [0]PETSC ERROR: You have already added values; you cannot now insert! > > Is anyone have a solution of this? You would have to add MatAssemblyBegin/End(MAT_FLUSH_ASSEMBLY) when you switch between ADD_VALUES and INSERT_VALUES [with a MatAssemblyBegin/End(MAT_FINAL_ASSEMBLY) at the end of inserting/adding all the necessary values in the matrix. Satish From manuel.perezcerquera at polito.it Fri Oct 21 10:48:55 2011 From: manuel.perezcerquera at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Fri, 21 Oct 2011 17:48:55 +0200 Subject: [petsc-users] PETSC-LINK- Error-Using Complex Types Message-ID: Does Anybody used Visual Studio with PETSC using Complex, Did you get problems in the linking process? Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From jeffrey.k.wiens at gmail.com Fri Oct 21 11:02:23 2011 From: jeffrey.k.wiens at gmail.com (Jeff Wiens) Date: Fri, 21 Oct 2011 09:02:23 -0700 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: <1319142370.23290.14.camel@echo.lanl.gov> <1319144680.23290.16.camel@echo.lanl.gov> Message-ID: On Fri, Oct 21, 2011 at 6:53 AM, Lisandro Dalcin wrote: > BTW, have you ever used SWIG? If the functions you need to wrap are > simple (let say, any PetscObject subtype and scalar paramenters) you > can get your wrappers with less code to write on your side. I have heard very little about SWIG. I will take a look at it. The PETSc program that I'm trying to encapsulate is far from simple. 
If SWIG can only handle simple functions, it probably won't be very helpful. BTW, I removed the dynamic flag --with-dynamic from my PETSc installation. For some reason, my test program didn't work (the first time) when I installed PETSc with only the --with-shared option. However, I must have changed something else because it worked the second time I tried re-installing PETSc. Jeff From petsc-maint at mcs.anl.gov Fri Oct 21 11:06:17 2011 From: petsc-maint at mcs.anl.gov (Satish Balay) Date: Fri, 21 Oct 2011 11:06:17 -0500 (CDT) Subject: [petsc-users] [petsc-maint #91113] PETSC-LINK- Error-Using Complex Types In-Reply-To: References: Message-ID: Then you would check cd src/ksp/ksp/examples/tutorials make PETSC_DIR=/home/d022117/petsc-3.2-p2 PETSC_ARCH=arch-mswin-cxx-debug ex2f and try to match the options in the project files with the onces used here.. Note: The link options like /MT are independent of language.. Satish On Fri, 21 Oct 2011, Manuel Ricardo Perez Cerquera wrote: > Yes, The Problem is I'm Using Fortran and Mostly of this options are only > available for C or C++ projects. I don't know if Petsc Support Complex in > Fortran language, Do I'm wrong? > > 2011/10/21 Satish Balay > > > Some of these options should be selectable in 'compile' or 'link tab' > > of the project file settings. > > > > >>>>>>> > > /MT link with LIBCMT.LIB [multithreaded build - not "multithreaded dll"] > > /GR[-] enable C++ RTTI > > /EHs enable synchronous C++ EH > > /EHc extern "C" defaults to nothrow > > /Z7 enable old-style debug info > > /Zm max memory alloc (% of default) > > <<<<<< > > > > As Chetan mentioned - the most important option to set is '/MT' for > > "Multithreaded, static link". You can probably ignore the debug > > options. Not sure if the c++ RTTI/EH options matter for user code. > > > > Satish > > > > On Fri, 21 Oct 2011, PEREZ CERQUERA MANUEL RICARDO wrote: > > > > > No, With the examples It Works, Is when I linked With > > > Visual Studio, for example following the instructions I > > > did: > > > $ make PETSC_DIR=/home/d022117/petsc-3.2-p2 > > > PETSC_ARCH=arch-mswin-cxx-debug ex2 > > > I got this > > > /home/d022117/petsc-3.2-p2/bin/win32fe/win32fe cl -o > > > ex2.o -c -MT -GR -EHsc -Z7 -Zm200 -TP > > > -I/home/d022117/petsc-3.2-p2/include > > > -I/home/d022117/petsc-3.2-p2/arc > > > h-mswin-cxx-debug/include > > > -I/cygdrive/c/ProgramFiles/MPICH2/include > > > -D__INSDIR__=src/ksp/ksp/examples/tutorials/ex2.c > > > ex2.c > > > /home/d022117/petsc-3.2-p2/bin/win32fe/win32fe cl -MT -GR > > > -EHsc -Z7 -Zm200 -o ex2 ex2.o > > > -L/home/d022117/petsc-3.2-p2/arch-mswin-cxx-debug/lib > > > -lpetsc -lflapack -lfblas > > > /cygdrive/c/Program\Files/MPICH2/lib/fmpich2.lib > > > /cygdrive/c/Program\ Files/MPICH2/lib/fmpich2g.lib > > > /cygdrive/c/Program\Files/MPICH2/lib/fmpich2s.lib > > > /cygdrive/c/Program\ Files/MPICH2/lib/mpi.lib Gdi32.lib > > > User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib > > > /usr/bin/rm -f ex2.o > > > > > > So in VS I linked the libraries libpetsc.lib > > > libflapack.lib libfblasfmpich2s.lib fmpich2g.lib mpi.lib > > > Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib > > > , actually I do not know where to set -MT -GR -EHsc -Z7 > > > -Zm200 options in VS. > > > > > > Manuel > > > > > > On Fri, 21 Oct 2011 08:37:10 -0500 (CDT) > > > Satish Balay wrote: > > > > Do you get these errors with petsc examples? [when > > > >compiled with PETSc makefiles?] 
> > > > > > > > Satish > > > > > > > > On Fri, 21 Oct 2011, PEREZ CERQUERA MANUEL RICARDO > > > >wrote: > > > > > > > >> Hi all > > > >> > > > >> I build the complex version of Petsc. I used Microsoft > > > >> Visual Stud . Run Petsc in this platform and got the > > > >> following error: > > > >> > > > >> Error 1 error LNK2019: unresolved external > > > >>symbol> > > > >> __invalid_parameter_noinfo referenced in function > > > >> "public:> char & __thiscall > > > >>std::basic_string > > > >> std::char_traits,class std::allocator> > > > >> >::operator[](unsigned int)"> > > > >> (??A?$basic_string at DU?$char_traits at D@std@@V?$allocator at D@2@@std@ > > @QAEAADI at Z) > > > >> libpetsc.lib(errtrace.o) > > > >> > > > >> Error 2 error LNK2001: unresolved external symbol > > > >> __invalid_parameter_noinfo libpetsc.lib(err.o) > > > >> Error 3 error LNK2001: unresolved external symbol > > > >> __invalid_parameter_noinfo libpetsc.lib(plog.o) > > > >> Error 4 fatal error LNK1120: 1 unresolved > > > >> externals Debug\PAtreju.exe > > > >> > > > >> Could someone tell me how to resolve it. > > > >> Best regards Manuel > > > >> > > > >> Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > > > >> Antenna and EMC Lab (LACE) > > > >> Istituto Superiore Mario Boella (ISMB) > > > >> Politecnico di Torino > > > >> Via Pier Carlo Boggio 61, Torino 10138, Italy > > > >> Email: manuel.perezcerquera at polito.it > > > >> Phone: +39 0112276704 > > > >> Fax: +39 011 2276 299 > > > >> > > > >> > > > > > > > > > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > > > Antenna and EMC Lab (LACE) > > > Istituto Superiore Mario Boella (ISMB) > > > Politecnico di Torino > > > Via Pier Carlo Boggio 61, Torino 10138, Italy > > > Email: manuel.perezcerquera at polito.it > > > Phone: +39 0112276704 > > > Fax: +39 011 2276 299 > > > > > > > > > > > > > From bsmith at mcs.anl.gov Fri Oct 21 11:29:23 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Oct 2011 11:29:23 -0500 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote: > I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). > For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). What happened if you did not set it to some constant (that is kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? It would be very odd that starting with a good initial guess would lead to breakdown but that cannot be completely ruled out. I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind Have you tried KSPBCGSL? This is "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab I guess. Barry > After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. > Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. 
> > Regards, Dominik From dominik at itis.ethz.ch Fri Oct 21 11:57:51 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 18:57:51 +0200 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith wrote: > > On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote: > > > I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). > > For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). > > ? What happened if you did not set it to some constant (that is kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? ? It would be very odd that starting with a good initial guess would lead to breakdown but that cannot be completely ruled out. > There was no error, the iterations reportedly converged. Only the results were wrong, sort of strong random spikes. > > ? I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind > There are 3 issues with valgrind: 1) Syscall param writev(vector[...]) points to uninitialised byte(s) -> tribbered by MatPartitioningApply, then leading deep into ParMetis 2) Conditional jump or move depends on uninitialised value(s) ? ? -> many times, in?VecMin and VecMax and KSPSolve_BCGSL and 3) Syscall param writev(vector[...]) points to uninitialised byte(s) -> just once, in VecScatterBegin triggered by VecCreateGhost on the 'x' vector, which is ghosted. Do they pose any serious threats? > > ? ?Have you tried KSPBCGSL? This is "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab I guess. > Thanks for the hint on bcgsl - it works as expected. So, do I have a problem in the code or bcgs is unreliable? If the latter: as a method or as this specific implementation? Thanks for any comments, Dominik > > ? Barry > > > > > After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. > > Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. > > > > Regards, Dominik > From jedbrown at mcs.anl.gov Fri Oct 21 12:02:43 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 12:02:43 -0500 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 11:57, Dominik Szczerba wrote: > 1) Syscall param writev(vector[...]) points to uninitialised byte(s) > -> tribbered by MatPartitioningApply, then leading deep into ParMetis > If we're lucky, this has been fixed in the latest ParMetis release. (PETSc can use it, but not all the external packages can yet. We're patching these before using the new release by default.) > > 2) Conditional jump or move depends on uninitialised value(s) -> > many times, in VecMin and VecMax and KSPSolve_BCGSL > and > > 3) Syscall param writev(vector[...]) points to uninitialised byte(s) > -> just once, in VecScatterBegin triggered by VecCreateGhost on the > 'x' vector, which is ghosted. > > Do they pose any serious threats? > > > > > Have you tried KSPBCGSL? 
This is "enhanced Bi-CG-stab" algorithm that > is designed to handle certain situations that may cause grief for regular > Bi-CG-stab I guess. > > > > Thanks for the hint on bcgsl - it works as expected. > > So, do I have a problem in the code or bcgs is unreliable? If the > latter: as a method or as this specific implementation? > It would be much easier to answer these questions if you sent a test case or showed us how to reproduce the problem with a PETSc example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 21 12:57:55 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Oct 2011 12:57:55 -0500 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: On Oct 21, 2011, at 11:57 AM, Dominik Szczerba wrote: > On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith wrote: >> >> On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote: >> >>> I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). >>> For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). >> >> What happened if you did not set it to some constant (that is kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? It would be very odd that starting with a good initial guess would lead to breakdown but that cannot be completely ruled out. >> > > There was no error, the iterations reportedly converged. Only the > results were wrong, sort of strong random spikes. > >> >> I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind >> > > There are 3 issues with valgrind: > > 1) Syscall param writev(vector[...]) points to uninitialised byte(s) > -> tribbered by MatPartitioningApply, then leading deep into ParMetis > > 2) Conditional jump or move depends on uninitialised value(s) -> > many times, in VecMin and VecMax and KSPSolve_BCGSL > and > > 3) Syscall param writev(vector[...]) points to uninitialised byte(s) > -> just once, in VecScatterBegin triggered by VecCreateGhost on the > 'x' vector, which is ghosted.\ These are very bad things and should not happen at all. They must be tracked down before anything can be trusted. Start by sending the full valgrind output from a PETSc 3.2 run to petsc-maint at mcs.anl.gov Barry > > Do they pose any serious threats? > >> >> Have you tried KSPBCGSL? This is "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab I guess. >> > > Thanks for the hint on bcgsl - it works as expected. > > So, do I have a problem in the code or bcgs is unreliable? If the > latter: as a method or as this specific implementation? > > Thanks for any comments, > Dominik > > >> >> Barry >> >> >> >>> After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. >>> Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. 
>>> >>> Regards, Dominik >> From baagaard at usgs.gov Fri Oct 21 15:13:48 2011 From: baagaard at usgs.gov (Brad Aagaard) Date: Fri, 21 Oct 2011 13:13:48 -0700 Subject: [petsc-users] petsc4py: PetSc is no longer initialised when calling C Function from Cython In-Reply-To: References: <1319142370.23290.14.camel@echo.lanl.gov> <1319144680.23290.16.camel@echo.lanl.gov> Message-ID: <4EA1D27C.2010304@usgs.gov> Jeff- We use SWIG in PyLith, which which uses PETSc and Sieve, and has high-level code in Python and low-level code in C++. With typemaps from numpy and just a very few of our own custom typemaps, we have wrapped all of our objects and functions using stripped down C++ header files. This has been much less work for us than using Pyrex or Cython. It seamless handles pointers. The typemaps from numpy make it possible to pass numpy arrays as C arrays (i.e., double*) with sizes with just one extra line to specify which typemap to use. Brad On 10/21/2011 09:02 AM, Jeff Wiens wrote: > On Fri, Oct 21, 2011 at 6:53 AM, Lisandro Dalcin wrote: >> BTW, have you ever used SWIG? If the functions you need to wrap are >> simple (let say, any PetscObject subtype and scalar paramenters) you >> can get your wrappers with less code to write on your side. > > I have heard very little about SWIG. I will take a look at it. The > PETSc program that I'm trying to encapsulate is far from simple. If > SWIG can only handle simple functions, it probably won't be very > helpful. > > BTW, I removed the dynamic flag --with-dynamic from my PETSc > installation. For some reason, my test program didn't work (the first > time) when I installed PETSc with only the --with-shared option. > However, I must have changed something else because it worked the > second time I tried re-installing PETSc. > > Jeff > From dominik at itis.ethz.ch Fri Oct 21 15:23:06 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 22:23:06 +0200 Subject: [petsc-users] VecValid gone in 3.2 Message-ID: VecValid is gone in 3.2 with no mention in the Changes. Is there any replacement? Dominik From jedbrown at mcs.anl.gov Fri Oct 21 15:32:12 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 15:32:12 -0500 Subject: [petsc-users] VecValid gone in 3.2 In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 15:23, Dominik Szczerba wrote: > VecValid is gone in 3.2 with no mention in the Changes. Is there any > replacement? > You can use PetscValidHeaderSpecific(vec,VEC_CLASSID,argnum); if you are validating function arguments. VecValid() didn't have meaningful semantics when called from user code and PetscValidHeaderSpecific is used consistently in PETSc's input validation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 21 15:50:40 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 22:50:40 +0200 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries Message-ID: The following used to work with Petsc 3.1, but fails in runtime at MatPartitioningApply with Petsc 3.2 with the above error. I do not seem to find any pointers in the Changelog. Can you please advise? 
Regards, Dominik // Connectivity arrays ia, ja allocated and filled on master, zero otherwise // mlocal is number of cells on master, zero otherwise Mat adj; ierr = MatCreateMPIAdj(PETSC_COMM_WORLD, mlocal, mlocal, ia, ja, PETSC_NULL, &adj); CHKERRQ(ierr); MatPartitioning part; ierr = MatPartitioningCreate(PETSC_COMM_WORLD, &part); CHKERRQ(ierr); ierr = MatPartitioningSetAdjacency(part, adj); CHKERRQ(ierr); ierr = MatPartitioningSetNParts(part, np); CHKERRQ(ierr); IS isAssignment; ierr = MatPartitioningApply(part, &isAssignment); CHKERRQ(ierr); From sean at mcs.anl.gov Fri Oct 21 15:52:08 2011 From: sean at mcs.anl.gov (Sean Farley) Date: Fri, 21 Oct 2011 15:52:08 -0500 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: > > The following used to work with Petsc 3.1, but fails in runtime at > MatPartitioningApply with Petsc 3.2 with the above error. > I do not seem to find any pointers in the Changelog. Can you please advise? I ran into this error with parmetis 4.0 (and 4.0.1). Which version of parmetis are you using? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 21 15:54:38 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 22:54:38 +0200 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 10:52 PM, Sean Farley wrote: >> The following used to work with Petsc 3.1, but fails in runtime at >> MatPartitioningApply with Petsc 3.2 with the above error. >> I do not seem to find any pointers in the Changelog. Can you please >> advise? > > I ran into this error with parmetis 4.0 (and 4.0.1). Which version of > parmetis are you using? The one that was downloaded automatically during configuration, which is ParMetis-3.2.0-p1 Dominik From jedbrown at mcs.anl.gov Fri Oct 21 16:02:16 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 16:02:16 -0500 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 15:54, Dominik Szczerba wrote: > The one that was downloaded automatically during configuration, which > is ParMetis-3.2.0-p1 > If I remember correctly, the problem is that ParMetis frequently crashes when called with empty subdomains. If it was working for you before, you were probably just lucky. The best fix is to have the bug/limitation fixed upstream, but barring that, we should build a subcommunicator of suitable size, migrate the graph to that subcomm, call ParMetis from there, and migrate the new partition back to the original comm. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 21 16:14:24 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 23:14:24 +0200 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: >> The one that was downloaded automatically during configuration, which >> is ParMetis-3.2.0-p1 > > If I remember correctly, the problem is that ParMetis frequently crashes > when called with empty subdomains. If it was working for you before, you > were probably just lucky. Hmm, I just want to partition my mesh that I read in serial. 
So adjacency vectors are only on root. How can I provide distributed adjacency information, before partitioning? This is an endless circle. Or do you mean I should just broadcast my adjacency vectors ia,ja to all the other processes? > The best fix is to have the bug/limitation fixed upstream, but barring that, > we should build a subcommunicator of suitable size, migrate the graph to > that subcomm, call ParMetis from there, and migrate the new partition back > to the original comm. Are you suggesting to use other communicator than PETSC_COMM_WORLD in the code I cited? How would I go about creating a subcommunicator "of suitable size". And what do you mean by "migrate"? Thanks, Dominik From jedbrown at mcs.anl.gov Fri Oct 21 16:21:03 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 16:21:03 -0500 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 16:14, Dominik Szczerba wrote: > Hmm, I just want to partition my mesh that I read in serial. So > adjacency vectors are only on root. How can I provide distributed > adjacency information, before partitioning? This is an endless circle. > Or do you mean I should just broadcast my adjacency vectors ia,ja to > all the other processes? > The usual thing is to distribute the mesh naively to begin with, then partition and move the mesh to the correct place with respect to the new partition. It is possible that the case you describe works reliably, in which case the guard was to blunt. The crashes I'm familiar with occur when there are very few nodes such that some processors don't get any. You can open up pmetis.c and remove the guard entirely or write a better guard. > > > The best fix is to have the bug/limitation fixed upstream, but barring > that, > > we should build a subcommunicator of suitable size, migrate the graph to > > that subcomm, call ParMetis from there, and migrate the new partition > back > > to the original comm. > > Are you suggesting to use other communicator than PETSC_COMM_WORLD in > the code I cited? > How would I go about creating a subcommunicator "of suitable size". > And what do you mean by "migrate"? > This is really intended for the other case where the graph is really small compared to the node count. (This can happen on coarse levels of multigrid.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 21 16:53:56 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 21 Oct 2011 23:53:56 +0200 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 11:21 PM, Jed Brown wrote: > On Fri, Oct 21, 2011 at 16:14, Dominik Szczerba > wrote: >> >> Hmm, I just want to partition my mesh that I read in serial. So >> adjacency vectors are only on root. How can I provide distributed >> adjacency information, before partitioning? This is an endless circle. >> Or do you mean I should just broadcast my adjacency vectors ia,ja to >> all the other processes? > > The usual thing is to distribute the mesh naively to begin with, then > partition and move the mesh to the correct place with respect to the new > partition. It is possible that the case you describe works reliably, in > which case the guard was to blunt. 
The crashes I'm familiar with occur when > there are very few nodes such that some processors don't get any. You can > open up pmetis.c and remove the guard entirely or write a better guard. I can not find anything resembling any guards in pmetis.c as pet Petsc 3.2. Could you please refer me to the specific section? Thanks a lot, Dominik From jedbrown at mcs.anl.gov Fri Oct 21 16:58:19 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 21 Oct 2011 16:58:19 -0500 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 16:53, Dominik Szczerba wrote: > I can not find anything resembling any guards in pmetis.c as pet Petsc > 3.2. Could you please refer me to the specific section? > $ grep -n 'Does not support any processor with' petsc-3.2/src/**/*.c petsc-3.2/src/mat/partition/impls/pmetis/pmetis.c:61: SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Does not support any processor with %d entries",vtxdist[rank+1] - vtxdist[rank]); -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Fri Oct 21 17:09:15 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Fri, 21 Oct 2011 16:09:15 -0600 Subject: [petsc-users] Log_summary Message-ID: Hello All, I got a strange thing in my log_summary output (see below). In the output file, the time spend on the function "MatLUFactorNum" is "1.6966e+02" but the percent time in this phase (T%) for this function is 20%, which means that the total time should be 8.483e+02. But the exact total time is 5.506e+02. So what kind of bug can cause this problem? Thanks, Best, Rongliang Max Max/Min Avg Total Time (sec): 5.506e+02 1.00014 5.506e+02 ......... ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ MatLUFactorNum 20 1.0 1.6966e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 0.0e+00 20 62 0 0 0 20 62 0 0 0 19505 =========================================================================== ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./joab on a Janus-nod named node1755 with 32 processors, by ronglian Sun Oct 16 16:34:25 2011 Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 Max Max/Min Avg Total Time (sec): 5.506e+02 1.00014 5.506e+02 Objects: 9.800e+02 1.00000 9.800e+02 Flops: 2.639e+11 2.37776 1.673e+11 5.353e+12 Flops/sec: 4.794e+08 2.37786 3.038e+08 9.722e+09 MPI Messages: 1.057e+05 5.06348 3.760e+04 1.203e+06 MPI Message Lengths: 1.191e+09 2.60674 2.113e+04 2.543e+10 MPI Reductions: 6.213e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.5061e+02 100.0% 5.3531e+12 100.0% 1.203e+06 100.0% 2.113e+04 100.0% 6.212e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 2589 1.0 5.0895e+01 7.9 4.45e+09 1.1 4.1e+05 2.2e+03 0.0e+00 5 3 34 4 0 5 3 34 4 0 2655 MatMultTranspose 5 1.0 1.5848e-02 1.3 6.15e+06 1.1 6.6e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 11882 MatSolve 2562 1.0 1.1045e+02 1.6 6.03e+10 1.7 0.0e+00 0.0e+00 0.0e+00 17 29 0 0 0 17 29 0 0 0 14238 MatLUFactorSym 4 1.0 2.0894e+00 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 20 1.0 1.6966e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 0.0e+00 20 62 0 0 0 20 62 0 0 0 19505 MatILUFactorSym 1 1.0 4.8680e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 110 1.0 1.6247e+0119.7 0.00e+00 0.0 1.7e+04 6.7e+04 1.8e+02 2 0 1 5 3 2 0 1 5 3 0 MatAssemblyEnd 110 1.0 1.5937e+02 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 1.7e+02 29 0 0 0 3 29 0 0 0 3 0 MatGetRowIJ 5 1.0 3.0132e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 20 1.0 2.9429e+00 1.4 0.00e+00 0.0 1.8e+04 3.9e+05 8.0e+01 0 0 1 27 1 0 0 1 27 1 0 MatGetOrdering 5 1.0 8.6869e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 1 1.0 3.3869e-02 1.0 0.00e+00 0.0 1.6e+03 1.3e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 1 1.0 1.1711e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 45 1.0 9.1191e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 19 1.0 2.4045e-0213.2 6.91e+05 1.1 0.0e+00 0.0e+00 1.9e+01 0 0 0 0 0 0 0 0 0 0 879 VecMDot 2518 1.0 1.7926e+01 2.5 5.33e+09 1.1 0.0e+00 0.0e+00 2.5e+03 2 3 0 0 41 2 3 0 0 41 9107 VecNorm 2601 1.0 3.0213e+00 5.2 1.23e+08 1.1 0.0e+00 0.0e+00 2.6e+03 0 0 0 0 42 0 0 0 0 42 1243 VecScale 2562 1.0 5.3008e-02 1.4 6.06e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 35014 VecCopy 166 1.0 1.3387e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 5223 1.0 4.8880e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 142 1.0 1.1994e-02 1.7 6.89e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 17581 VecWAXPY 20 1.0 2.2526e-03 1.6 3.50e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4755 VecMAXPY 2562 1.0 7.9590e+00 1.4 5.45e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 20969 VecAssemblyBegin 70 1.0 1.4910e-01 4.0 0.00e+00 0.0 5.0e+03 5.6e+02 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 VecAssemblyEnd 70 1.0 2.8467e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 7779 1.0 1.1640e+00 1.6 0.00e+00 0.0 1.2e+06 1.5e+04 0.0e+00 0 0 96 68 0 0 0 96 68 0 0 VecScatterEnd 7779 1.0 1.4521e+0235.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 VecReduceArith 8 1.0 8.0991e-04 1.1 3.05e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11525 VecReduceComm 4 1.0 3.4404e-04 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2562 1.0 2.9302e+00 4.9 1.81e+08 1.1 0.0e+00 
0.0e+00 2.5e+03 0 0 0 0 41 0 0 0 0 41 1893 SNESSolve 4 1.0 5.3863e+02 1.0 2.61e+11 2.4 9.9e+05 2.5e+04 4.3e+03 98 98 82 96 69 98 98 82 96 69 9778 SNESLineSearch 19 1.0 8.7346e+00 1.0 6.45e+07 1.1 1.9e+04 1.9e+04 3.5e+02 2 0 2 1 6 2 0 2 1 6 225 SNESFunctionEval 24 1.0 6.6409e+01 1.0 3.06e+07 1.1 2.0e+04 2.1e+04 3.6e+02 12 0 2 2 6 12 0 2 2 6 14 SNESJacobianEval 19 1.0 1.6340e+02 1.0 0.00e+00 0.0 1.1e+04 6.9e+04 1.5e+02 30 0 1 3 2 30 0 1 3 2 0 KSPGMRESOrthog 2518 1.0 2.5557e+01 1.7 1.07e+10 1.1 0.0e+00 0.0e+00 2.5e+03 3 6 0 0 41 3 6 0 0 41 12776 KSPSetup 40 1.0 2.5885e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 20 1.0 3.1525e+02 1.0 2.64e+11 2.4 1.2e+06 2.1e+04 5.2e+03 57100 96 95 84 57100 96 95 84 16973 PCSetUp 40 1.0 1.7504e+02 2.4 1.94e+11 3.2 2.1e+04 3.3e+05 1.6e+02 21 62 2 27 3 21 62 2 27 3 18906 PCSetUpOnBlocks 20 1.0 1.7249e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 2.7e+01 21 62 0 0 0 21 62 0 0 0 19186 PCApply 2562 1.0 2.0489e+02 2.0 6.03e+10 1.7 7.4e+05 2.2e+04 0.0e+00 28 29 62 65 0 28 29 62 65 0 7675 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 44 34 2570073620 0 Matrix Partitioning 1 1 640 0 Index Set 189 182 3025696 0 IS L to G Mapping 2 1 283604 0 Vector 688 452 114894176 0 Vector Scatter 29 25 26300 0 Application Order 2 2 9335968 0 SNES 4 2 2544 0 Krylov Solver 10 6 11645040 0 Preconditioner 10 6 5456 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.19209e-07 Average time for MPI_Barrier(): 1.02043e-05 Average time for zero size MPI_Send(): 2.90573e-06 #PETSc Option Table entries: -coarse_ksp_rtol 1.0e-1 -coarsegrid /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi -computeinitialguess -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi -geometric_asm -geometric_asm_overlap 8 -inletu 5.0 -ksp_atol 1e-8 -ksp_gmres_restart 600 -ksp_max_it 3000 -ksp_pc_side right -ksp_rtol 1.e-2 -ksp_type gmres -log_summary -mat_partitioning_type parmetis -nest_geometric_asm_overlap 4 -nest_ksp_atol 1e-8 -nest_ksp_gmres_restart 800 -nest_ksp_max_it 1000 -nest_ksp_pc_side right -nest_ksp_rtol 1.e-2 -nest_ksp_type gmres -nest_pc_asm_type basic -nest_pc_type asm -nest_snes_atol 1.e-10 -nest_snes_max_it 20 -nest_snes_rtol 1.e-4 -nest_sub_pc_factor_mat_ordering_type qmd -nest_sub_pc_factor_shift_amount 1e-8 -sub_pc_factor_shift_type nonzero -sub_pc_type lu -viscosity 0.01 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Tue Sep 13 13:28:48 2011 Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 --download-f-blas-lapack=1 
--download-hypre=1 --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --with-debugging=0 ----------------------------------------- Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 Machine characteristics: Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 Using PETSc arch: Janus-nodebug ----------------------------------------- Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 21 17:58:05 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Oct 2011 17:58:05 -0500 Subject: [petsc-users] Log_summary In-Reply-To: References: Message-ID: <1858AF88-52CA-46E7-BE68-F06E03FD38EB@mcs.anl.gov> Sorry for the confusing output but it is not a bug. The time given on the third column is the slowest time over all the processes; the next number is the ratio of fastest to slowest which is 2.4 hence the fastest process took 1.6966e+02/2.4 seconds. Meanwhile the percent is computed by ADDING the times over all the processes per event and dividing by the total time. Since some processes were much faster than others the percentage time for all processes is much lower than you would expect given the maximum time. The fact that the ratio is 2.4 indicates that some processes have many more non-zeros in their part of the matrix than other processes, i.e. load inbalance Barry On Oct 21, 2011, at 5:09 PM, Rongliang Chen wrote: > Hello All, > > I got a strange thing in my log_summary output (see below). In the output file, the time spend on the function "MatLUFactorNum" is "1.6966e+02" but the percent time in this phase (T%) for this function is 20%, which means that the total time should be 8.483e+02. But the exact total time is 5.506e+02. So what kind of bug can cause this problem? Thanks, > > Best, > Rongliang > > Max Max/Min Avg Total > Time (sec): 5.506e+02 1.00014 5.506e+02 > > ......... 
> > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > MatLUFactorNum 20 1.0 1.6966e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 0.0e+00 20 62 0 0 0 20 62 0 0 0 19505 > > =========================================================================== > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./joab on a Janus-nod named node1755 with 32 processors, by ronglian Sun Oct 16 16:34:25 2011 > Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011 > > Max Max/Min Avg Total > Time (sec): 5.506e+02 1.00014 5.506e+02 > Objects: 9.800e+02 1.00000 9.800e+02 > Flops: 2.639e+11 2.37776 1.673e+11 5.353e+12 > Flops/sec: 4.794e+08 2.37786 3.038e+08 9.722e+09 > MPI Messages: 1.057e+05 5.06348 3.760e+04 1.203e+06 > MPI Message Lengths: 1.191e+09 2.60674 2.113e+04 2.543e+10 > MPI Reductions: 6.213e+03 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 5.5061e+02 100.0% 5.3531e+12 100.0% 1.203e+06 100.0% 2.113e+04 100.0% 6.212e+03 100.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > --- Event Stage 0: Main Stage > > MatMult 2589 1.0 5.0895e+01 7.9 4.45e+09 1.1 4.1e+05 2.2e+03 0.0e+00 5 3 34 4 0 5 3 34 4 0 2655 > MatMultTranspose 5 1.0 1.5848e-02 1.3 6.15e+06 1.1 6.6e+02 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 11882 > MatSolve 2562 1.0 1.1045e+02 1.6 6.03e+10 1.7 0.0e+00 0.0e+00 0.0e+00 17 29 0 0 0 17 29 0 0 0 14238 > MatLUFactorSym 4 1.0 2.0894e+00 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 20 1.0 1.6966e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 0.0e+00 20 62 0 0 0 20 62 0 0 0 19505 > MatILUFactorSym 1 1.0 4.8680e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 110 1.0 1.6247e+0119.7 0.00e+00 0.0 1.7e+04 6.7e+04 1.8e+02 2 0 1 5 3 2 0 1 5 3 0 > MatAssemblyEnd 110 1.0 1.5937e+02 1.0 0.00e+00 0.0 3.0e+03 5.3e+02 1.7e+02 29 0 0 0 3 29 0 0 0 3 0 > MatGetRowIJ 5 1.0 3.0132e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 20 1.0 2.9429e+00 1.4 0.00e+00 0.0 1.8e+04 3.9e+05 8.0e+01 0 0 1 27 1 0 0 1 27 1 0 > MatGetOrdering 5 1.0 8.6869e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatIncreaseOvrlp 1 1.0 3.3869e-02 1.0 0.00e+00 0.0 1.6e+03 1.3e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 1 1.0 1.1711e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 45 1.0 9.1191e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 19 1.0 2.4045e-0213.2 6.91e+05 1.1 0.0e+00 0.0e+00 1.9e+01 0 0 0 0 0 0 0 0 0 0 879 > VecMDot 2518 1.0 1.7926e+01 2.5 5.33e+09 1.1 0.0e+00 0.0e+00 2.5e+03 2 3 0 0 41 2 3 0 0 41 9107 > VecNorm 2601 1.0 3.0213e+00 5.2 1.23e+08 1.1 0.0e+00 0.0e+00 2.6e+03 0 0 0 0 42 0 0 0 0 42 1243 > VecScale 2562 1.0 5.3008e-02 1.4 6.06e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 35014 > VecCopy 166 1.0 1.3387e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5223 1.0 4.8880e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 142 1.0 1.1994e-02 1.7 6.89e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 17581 > VecWAXPY 20 1.0 2.2526e-03 1.6 3.50e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4755 > VecMAXPY 2562 1.0 7.9590e+00 1.4 5.45e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 20969 > VecAssemblyBegin 70 1.0 1.4910e-01 4.0 0.00e+00 0.0 5.0e+03 5.6e+02 2.1e+02 0 0 0 0 3 0 0 0 0 3 0 > VecAssemblyEnd 70 1.0 2.8467e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 7779 1.0 1.1640e+00 1.6 0.00e+00 0.0 1.2e+06 1.5e+04 0.0e+00 0 0 96 68 0 0 0 96 68 0 0 > VecScatterEnd 7779 1.0 1.4521e+0235.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 > VecReduceArith 8 1.0 8.0991e-04 1.1 3.05e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11525 > VecReduceComm 4 1.0 3.4404e-04 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 
0 0 0 0 0 0 0 > VecNormalize 2562 1.0 2.9302e+00 4.9 1.81e+08 1.1 0.0e+00 0.0e+00 2.5e+03 0 0 0 0 41 0 0 0 0 41 1893 > SNESSolve 4 1.0 5.3863e+02 1.0 2.61e+11 2.4 9.9e+05 2.5e+04 4.3e+03 98 98 82 96 69 98 98 82 96 69 9778 > SNESLineSearch 19 1.0 8.7346e+00 1.0 6.45e+07 1.1 1.9e+04 1.9e+04 3.5e+02 2 0 2 1 6 2 0 2 1 6 225 > SNESFunctionEval 24 1.0 6.6409e+01 1.0 3.06e+07 1.1 2.0e+04 2.1e+04 3.6e+02 12 0 2 2 6 12 0 2 2 6 14 > SNESJacobianEval 19 1.0 1.6340e+02 1.0 0.00e+00 0.0 1.1e+04 6.9e+04 1.5e+02 30 0 1 3 2 30 0 1 3 2 0 > KSPGMRESOrthog 2518 1.0 2.5557e+01 1.7 1.07e+10 1.1 0.0e+00 0.0e+00 2.5e+03 3 6 0 0 41 3 6 0 0 41 12776 > KSPSetup 40 1.0 2.5885e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 20 1.0 3.1525e+02 1.0 2.64e+11 2.4 1.2e+06 2.1e+04 5.2e+03 57100 96 95 84 57100 96 95 84 16973 > PCSetUp 40 1.0 1.7504e+02 2.4 1.94e+11 3.2 2.1e+04 3.3e+05 1.6e+02 21 62 2 27 3 21 62 2 27 3 18906 > PCSetUpOnBlocks 20 1.0 1.7249e+02 2.4 1.94e+11 3.2 0.0e+00 0.0e+00 2.7e+01 21 62 0 0 0 21 62 0 0 0 19186 > PCApply 2562 1.0 2.0489e+02 2.0 6.03e+10 1.7 7.4e+05 2.2e+04 0.0e+00 28 29 62 65 0 28 29 62 65 0 7675 > ------------------------------------------------------------------------------------------------------------------------ > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 44 34 2570073620 0 > Matrix Partitioning 1 1 640 0 > Index Set 189 182 3025696 0 > IS L to G Mapping 2 1 283604 0 > Vector 688 452 114894176 0 > Vector Scatter 29 25 26300 0 > Application Order 2 2 9335968 0 > SNES 4 2 2544 0 > Krylov Solver 10 6 11645040 0 > Preconditioner 10 6 5456 0 > Viewer 1 0 0 0 > ======================================================================================================================== > Average time to get PetscTime(): 1.19209e-07 > Average time for MPI_Barrier(): 1.02043e-05 > Average time for zero size MPI_Send(): 2.90573e-06 > #PETSc Option Table entries: > -coarse_ksp_rtol 1.0e-1 > -coarsegrid /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E2000_N8241_D70170.fsi > -computeinitialguess > -f /scratch/stmp00/ronglian/input/Cannula/Cannula_Nest2_E32000_N128961_D1096650.fsi > -geometric_asm > -geometric_asm_overlap 8 > -inletu 5.0 > -ksp_atol 1e-8 > -ksp_gmres_restart 600 > -ksp_max_it 3000 > -ksp_pc_side right > -ksp_rtol 1.e-2 > -ksp_type gmres > -log_summary > -mat_partitioning_type parmetis > -nest_geometric_asm_overlap 4 > -nest_ksp_atol 1e-8 > -nest_ksp_gmres_restart 800 > -nest_ksp_max_it 1000 > -nest_ksp_pc_side right > -nest_ksp_rtol 1.e-2 > -nest_ksp_type gmres > -nest_pc_asm_type basic > -nest_pc_type asm > -nest_snes_atol 1.e-10 > -nest_snes_max_it 20 > -nest_snes_rtol 1.e-4 > -nest_sub_pc_factor_mat_ordering_type qmd > -nest_sub_pc_factor_shift_amount 1e-8 > -sub_pc_factor_shift_type nonzero > -sub_pc_type lu > -viscosity 0.01 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 > Configure run at: Tue Sep 13 13:28:48 2011 > Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 
--known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --with-batch=1 --with-mpi-shared-libraries=1 --known-mpi-shared-libraries=0 --download-f-blas-lapack=1 --download-hypre=1 --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --with-debugging=0 > ----------------------------------------- > Libraries compiled on Tue Sep 13 13:28:48 2011 on node1367 > Machine characteristics: Linux-2.6.18-238.12.1.el5-x86_64-with-redhat-5.6-Tikanga > Using PETSc directory: /home/ronglian/soft/petsc-3.2-p1 > Using PETSc arch: Janus-nodebug > ----------------------------------------- > > Using C compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: mpif90 -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS} > ----------------------------------------- > > Using include paths: -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/include -I/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/include -I/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/include > ----------------------------------------- > > Using C linker: mpicc > Using Fortran linker: mpif90 > Using libraries: -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lpetsc -lX11 -Wl,-rpath,/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -L/home/ronglian/soft/petsc-3.2-p1/Janus-nodebug/lib -lsuperlu_dist_2.5 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lHYPRE -lmpi_cxx -lstdc++ -lscalapack -lblacs -lsuperlu_4.2 -lflapack -lfblas -L/curc/tools/free/redhat_5_x86_64/openmpi-1.4.3_ib/lib -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl > ----------------------------------------- > From dominik at itis.ethz.ch Sat Oct 22 03:43:04 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 10:43:04 +0200 Subject: [petsc-users] --with-parmetis=1 --download-parmetis=1 needed with petsc 3.2? Message-ID: Do I need --with-parmetis=1 --download-parmetis=1 when building petsc 3.2? I already see parmetis under petsc-3.2-p3/src/mat/partition/impls/pmetis When specifying the above flags I have one more in externalpackages. Which one takes precedence? Thanks, Dominik From dominik at itis.ethz.ch Sat Oct 22 04:14:06 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 11:14:06 +0200 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: Oh, you meant pmetis.c from src/mat/partition/impls/pmetis, not the same file in externalpackages/ParMetis-3.2.0-p1... 
I removed this check: #if 1 if ((vtxdist[rank+1] - vtxdist[rank]) < 1) { SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Does not support any processor with %d entries",vtxdist[rank+1] - vtxdist[rank]); } #endif but it generates more errors in other files: dsz at tharsis:~/pack/petsc-3.2-p3$ rgrep "Poor vertex distribution" * externalpackages/ParMetis-3.2.0-p1/ParMETISLib/ametis.c: printf("Error: Poor vertex distribution (processor with no vertices).\n"); externalpackages/ParMetis-3.2.0-p1/ParMETISLib/ometis.c: printf("Error: Poor vertex distribution (processor with no vertices).\n"); externalpackages/ParMetis-3.2.0-p1/ParMETISLib/rmetis.c: printf("Error: Poor vertex distribution (processor with no vertices).\n"); externalpackages/ParMetis-3.2.0-p1/ParMETISLib/kmetis.c: printf("Error: Poor vertex distribution (processor with no vertices).\n"); I am not sure if tweaking parmetis code is a way I want to go... How about different solutions: 1) partitioning with metis and not parmetis? How to perform partitioning by definition just serially? I just have a mesh on master and want to get partitioning on master, I will take over then. 2) partitioning other than parmetis, will they work in my scenario? I see e.g. chaco in the docu, but can not specify it with -mat_partitioning_type chaco, I get an error. Are there any examples how to partition using different methods? I just need a quick fix for this problem, not necessarily the best one for the moment... Many thanks for any insight, Dominik On Fri, Oct 21, 2011 at 11:58 PM, Jed Brown wrote: > On Fri, Oct 21, 2011 at 16:53, Dominik Szczerba > wrote: >> >> I can not find anything resembling any guards in pmetis.c as pet Petsc >> 3.2. Could you please refer me to the specific section? > > $ grep -n 'Does not support any processor with' petsc-3.2/src/**/*.c > petsc-3.2/src/mat/partition/impls/pmetis/pmetis.c:61: > ?SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Does not support any processor with > %d entries",vtxdist[rank+1] - vtxdist[rank]); From dominik at itis.ethz.ch Sat Oct 22 08:45:53 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 15:45:53 +0200 Subject: [petsc-users] sequential partitioning? Message-ID: I do not seem to find any guidance on this subject in the docu. I built petsc with chaco and party to attempt sequential partitioning, where parmetis fails. However, I get the error: [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Distributed matrix format MPIAdj is not supported for sequential partitioners! I do not seem to find a sequential equivalent of MatCreateMPIAdj... Are there any examples how to perform partitioning sequentially? My mesh/graph is located entirely on master. Thanks a lot, Dominik From bsmith at mcs.anl.gov Sat Oct 22 09:01:49 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 09:01:49 -0500 Subject: [petsc-users] --with-parmetis=1 --download-parmetis=1 needed with petsc 3.2? In-Reply-To: References: Message-ID: <0316740A-2622-4DFB-8DE1-58C4B9FCD7F8@mcs.anl.gov> You only need --download-parmetis You should remove all the externalpackages/parmetis* to make sure the older one was deleted and not used Barry On Oct 22, 2011, at 3:43 AM, Dominik Szczerba wrote: > Do I need --with-parmetis=1 --download-parmetis=1 when building petsc > 3.2? I already see parmetis under > > petsc-3.2-p3/src/mat/partition/impls/pmetis > > When specifying the above flags I have one more in externalpackages. > Which one takes precedence? 
> > Thanks, > Dominik From bsmith at mcs.anl.gov Sat Oct 22 09:02:50 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 09:02:50 -0500 Subject: [petsc-users] ERROR: MatPartitioningApply_Parmetis() ... Does not support any processor with 0 entries In-Reply-To: References: Message-ID: Since you are starting with everything on one process just run the partitioner on one process Barry On Oct 22, 2011, at 4:14 AM, Dominik Szczerba wrote: > Oh, you meant pmetis.c from src/mat/partition/impls/pmetis, not the > same file in externalpackages/ParMetis-3.2.0-p1... > > I removed this check: > > #if 1 > if ((vtxdist[rank+1] - vtxdist[rank]) < 1) { > SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Does not support any > processor with %d entries",vtxdist[rank+1] - vtxdist[rank]); > } > #endif > > but it generates more errors in other files: > > dsz at tharsis:~/pack/petsc-3.2-p3$ rgrep "Poor vertex distribution" * > externalpackages/ParMetis-3.2.0-p1/ParMETISLib/ametis.c: > printf("Error: Poor vertex distribution (processor with no > vertices).\n"); > externalpackages/ParMetis-3.2.0-p1/ParMETISLib/ometis.c: > printf("Error: Poor vertex distribution (processor with no > vertices).\n"); > externalpackages/ParMetis-3.2.0-p1/ParMETISLib/rmetis.c: > printf("Error: Poor vertex distribution (processor with no > vertices).\n"); > externalpackages/ParMetis-3.2.0-p1/ParMETISLib/kmetis.c: > printf("Error: Poor vertex distribution (processor with no > vertices).\n"); > > I am not sure if tweaking parmetis code is a way I want to go... > > How about different solutions: > > 1) partitioning with metis and not parmetis? How to perform > partitioning by definition just serially? I just have a mesh on master > and want to get partitioning on master, I will take over then. > 2) partitioning other than parmetis, will they work in my scenario? I > see e.g. chaco in the docu, but can not specify it with > -mat_partitioning_type chaco, I get an error. Are there any examples > how to partition using different methods? > > I just need a quick fix for this problem, not necessarily the best one > for the moment... > > Many thanks for any insight, > Dominik > > On Fri, Oct 21, 2011 at 11:58 PM, Jed Brown wrote: >> On Fri, Oct 21, 2011 at 16:53, Dominik Szczerba >> wrote: >>> >>> I can not find anything resembling any guards in pmetis.c as pet Petsc >>> 3.2. Could you please refer me to the specific section? >> >> $ grep -n 'Does not support any processor with' petsc-3.2/src/**/*.c >> petsc-3.2/src/mat/partition/impls/pmetis/pmetis.c:61: >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Does not support any processor with >> %d entries",vtxdist[rank+1] - vtxdist[rank]); From bsmith at mcs.anl.gov Sat Oct 22 09:05:39 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 09:05:39 -0500 Subject: [petsc-users] sequential partitioning? In-Reply-To: References: Message-ID: <0A717446-E9EA-4175-8355-EC64A0C17EA4@mcs.anl.gov> You need to create the MPIAdj matrix with PETSC_COMM_SELF not COMM_WORLD Barry On Oct 22, 2011, at 8:45 AM, Dominik Szczerba wrote: > I do not seem to find any guidance on this subject in the docu. > I built petsc with chaco and party to attempt sequential partitioning, > where parmetis fails. > However, I get the error: > > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Distributed matrix format MPIAdj is not supported for > sequential partitioners! > > > I do not seem to find a sequential equivalent of MatCreateMPIAdj... 
> Are there any examples how to perform partitioning sequentially? > My mesh/graph is located entirely on master. > > Thanks a lot, > Dominik From dominik at itis.ethz.ch Sat Oct 22 09:18:18 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 16:18:18 +0200 Subject: [petsc-users] sequential partitioning? In-Reply-To: <0A717446-E9EA-4175-8355-EC64A0C17EA4@mcs.anl.gov> References: <0A717446-E9EA-4175-8355-EC64A0C17EA4@mcs.anl.gov> Message-ID: Thanks, this I had no chance to know. Is also true for MatPartitioningCreate? Which one should be run collectively and which one only on master? My code is along this line: MatCreateMPIAdj MatPartitioningCreate MatPartitioningSetAdjacency MatPartitioningSetNParts MatPartitioningApply Many thanks! Dominik On Sat, Oct 22, 2011 at 4:05 PM, Barry Smith wrote: > > ? You need to create the MPIAdj matrix with PETSC_COMM_SELF ? not COMM_WORLD > > ? Barry > > On Oct 22, 2011, at 8:45 AM, Dominik Szczerba wrote: > >> I do not seem to find any guidance on this subject in the docu. >> I built petsc with chaco and party to attempt sequential partitioning, >> where parmetis fails. >> However, I get the error: >> >> [0]PETSC ERROR: No support for this operation for this object type! >> [0]PETSC ERROR: Distributed matrix format MPIAdj is not supported for >> sequential partitioners! >> >> >> I do not seem to find a sequential equivalent of MatCreateMPIAdj... >> Are there any examples how to perform partitioning sequentially? >> My mesh/graph is located entirely on master. >> >> Thanks a lot, >> Dominik > > From bsmith at mcs.anl.gov Sat Oct 22 09:22:47 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 09:22:47 -0500 Subject: [petsc-users] sequential partitioning? In-Reply-To: References: <0A717446-E9EA-4175-8355-EC64A0C17EA4@mcs.anl.gov> Message-ID: <64300481-2868-4BCA-9660-EAC887B66695@mcs.anl.gov> The MatCreateMPIAdj and MatPartitioningCreate need to take PETSC_COMM_SELF so they are sequential. Barry On Oct 22, 2011, at 9:18 AM, Dominik Szczerba wrote: > Thanks, this I had no chance to know. Is also true for MatPartitioningCreate? > Which one should be run collectively and which one only on master? My > code is along this line: > > MatCreateMPIAdj > MatPartitioningCreate > MatPartitioningSetAdjacency > MatPartitioningSetNParts > MatPartitioningApply > > Many thanks! > Dominik > > On Sat, Oct 22, 2011 at 4:05 PM, Barry Smith wrote: >> >> You need to create the MPIAdj matrix with PETSC_COMM_SELF not COMM_WORLD >> >> Barry >> >> On Oct 22, 2011, at 8:45 AM, Dominik Szczerba wrote: >> >>> I do not seem to find any guidance on this subject in the docu. >>> I built petsc with chaco and party to attempt sequential partitioning, >>> where parmetis fails. >>> However, I get the error: >>> >>> [0]PETSC ERROR: No support for this operation for this object type! >>> [0]PETSC ERROR: Distributed matrix format MPIAdj is not supported for >>> sequential partitioners! >>> >>> >>> I do not seem to find a sequential equivalent of MatCreateMPIAdj... >>> Are there any examples how to perform partitioning sequentially? >>> My mesh/graph is located entirely on master. >>> >>> Thanks a lot, >>> Dominik >> >> From dominik at itis.ethz.ch Sat Oct 22 10:00:34 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 17:00:34 +0200 Subject: [petsc-users] sequential partitioning? 
In-Reply-To: <64300481-2868-4BCA-9660-EAC887B66695@mcs.anl.gov> References: <0A717446-E9EA-4175-8355-EC64A0C17EA4@mcs.anl.gov> <64300481-2868-4BCA-9660-EAC887B66695@mcs.anl.gov> Message-ID: Many thanlks, Barry, you saved my Saturday afternoon...! (so I can directly proceed to the valgrind issue reported separately... :)) Dominik On Sat, Oct 22, 2011 at 4:22 PM, Barry Smith wrote: > > ?The MatCreateMPIAdj and MatPartitioningCreate ?need to take PETSC_COMM_SELF so they are sequential. > > ? Barry > > On Oct 22, 2011, at 9:18 AM, Dominik Szczerba wrote: > >> Thanks, this I had no chance to know. Is also true for MatPartitioningCreate? >> Which one should be run collectively and which one only on master? My >> code is along this line: >> >> MatCreateMPIAdj >> MatPartitioningCreate >> MatPartitioningSetAdjacency >> MatPartitioningSetNParts >> MatPartitioningApply >> >> Many thanks! >> Dominik >> >> On Sat, Oct 22, 2011 at 4:05 PM, Barry Smith wrote: >>> >>> ? You need to create the MPIAdj matrix with PETSC_COMM_SELF ? not COMM_WORLD >>> >>> ? Barry >>> >>> On Oct 22, 2011, at 8:45 AM, Dominik Szczerba wrote: >>> >>>> I do not seem to find any guidance on this subject in the docu. >>>> I built petsc with chaco and party to attempt sequential partitioning, >>>> where parmetis fails. >>>> However, I get the error: >>>> >>>> [0]PETSC ERROR: No support for this operation for this object type! >>>> [0]PETSC ERROR: Distributed matrix format MPIAdj is not supported for >>>> sequential partitioners! >>>> >>>> >>>> I do not seem to find a sequential equivalent of MatCreateMPIAdj... >>>> Are there any examples how to perform partitioning sequentially? >>>> My mesh/graph is located entirely on master. >>>> >>>> Thanks a lot, >>>> Dominik >>> >>> > > From dominik at itis.ethz.ch Sat Oct 22 11:05:07 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 18:05:07 +0200 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: After upgrade to 3.2 the mentioned valgrind issues were still there (except for the first one, related to partitioning). However, I seem to be able to find the cause for them, which is NOT updating the ghost values in x BEFORE kspsolve, only AFTER. That way my coefficient matrix, depending on 'x', and obviously assembled before kspsolve, contained uninitialized values. When fixing the issue, bcgs solver behaves as expected, and as the other solvers. I am relieved the issue was with me. However, bcgs and only bcgs will occasionally break down (inconsistent state, division by zero), if the initial solution is exact zero. Pre-filling it with something small (compared to the expected solution) fixes the breakdown, this small issue, however, still bothers me a bit, what's so special about zero? Many thanks for your valuable support, Dominik On Fri, Oct 21, 2011 at 7:57 PM, Barry Smith wrote: > > On Oct 21, 2011, at 11:57 AM, Dominik Szczerba wrote: > >> On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith wrote: >>> >>> On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote: >>> >>>> I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). >>>> For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). >>> >>> ? What happened if you did not set it to some constant (that is kept the old solution)? 
Did you get KSP_DIVERGED_BREAKDOWN? ? It would be very odd that starting with a good initial guess would lead to breakdown but that cannot be completely ruled out. >>> >> >> There was no error, the iterations reportedly converged. Only the >> results were wrong, sort of strong random spikes. >> >>> >>> ? I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind >>> >> >> There are 3 issues with valgrind: >> >> 1) Syscall param writev(vector[...]) points to uninitialised byte(s) >> -> tribbered by MatPartitioningApply, then leading deep into ParMetis >> >> 2) Conditional jump or move depends on uninitialised value(s) ? ? -> >> many times, in VecMin and VecMax and KSPSolve_BCGSL >> and >> >> 3) Syscall param writev(vector[...]) points to uninitialised byte(s) >> -> just once, in VecScatterBegin triggered by VecCreateGhost on the >> 'x' vector, which is ghosted.\ > > ? These are very bad things and should not happen at all. ? They must be tracked down before anything can be trusted. Start by sending the full valgrind output from a PETSc 3.2 run to petsc-maint at mcs.anl.gov > > > ? Barry > >> >> Do they pose any serious threats? >> >>> >>> ? ?Have you tried KSPBCGSL? This is "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab I guess. >>> >> >> Thanks for the hint on bcgsl - it works as expected. >> >> So, do I have a problem in the code or bcgs is unreliable? If the >> latter: as a method or as this specific implementation? >> >> Thanks for any comments, >> Dominik >> >> >>> >>> ? Barry >>> >>> >>> >>>> After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. >>>> Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. >>>> >>>> Regards, Dominik >>> > > From bsmith at mcs.anl.gov Sat Oct 22 12:38:42 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 12:38:42 -0500 Subject: [petsc-users] problem with initial value In-Reply-To: References: Message-ID: <8671939D-01A6-421D-8E14-966936E27AD8@mcs.anl.gov> On Oct 22, 2011, at 11:05 AM, Dominik Szczerba wrote: > After upgrade to 3.2 the mentioned valgrind issues were still there > (except for the first one, related to partitioning). However, I seem > to be able to find the cause for them, which is NOT updating the ghost > values in x BEFORE kspsolve, only AFTER. That way my coefficient > matrix, depending on 'x', and obviously assembled before kspsolve, > contained uninitialized values. When fixing the issue, bcgs solver > behaves as expected, and as the other solvers. I am relieved the issue > was with me. > > However, bcgs and only bcgs will occasionally break down (inconsistent > state, division by zero), if the initial solution is exact zero. > Pre-filling it with something small (compared to the expected > solution) fixes the breakdown, this small issue, however, still > bothers me a bit, what's so special about zero? > We'd need an example code that reproduces this. 
You could use -ksp_view_binary to generate binaryoutput file and send it to petsc-maint at mcs.anl.gov Barry > Many thanks for your valuable support, > Dominik > > On Fri, Oct 21, 2011 at 7:57 PM, Barry Smith wrote: >> >> On Oct 21, 2011, at 11:57 AM, Dominik Szczerba wrote: >> >>> On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith wrote: >>>> >>>> On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote: >>>> >>>>> I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero). >>>>> For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down). >>>> >>>> What happened if you did not set it to some constant (that is kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? It would be very odd that starting with a good initial guess would lead to breakdown but that cannot be completely ruled out. >>>> >>> >>> There was no error, the iterations reportedly converged. Only the >>> results were wrong, sort of strong random spikes. >>> >>>> >>>> I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind >>>> >>> >>> There are 3 issues with valgrind: >>> >>> 1) Syscall param writev(vector[...]) points to uninitialised byte(s) >>> -> tribbered by MatPartitioningApply, then leading deep into ParMetis >>> >>> 2) Conditional jump or move depends on uninitialised value(s) -> >>> many times, in VecMin and VecMax and KSPSolve_BCGSL >>> and >>> >>> 3) Syscall param writev(vector[...]) points to uninitialised byte(s) >>> -> just once, in VecScatterBegin triggered by VecCreateGhost on the >>> 'x' vector, which is ghosted.\ >> >> These are very bad things and should not happen at all. They must be tracked down before anything can be trusted. Start by sending the full valgrind output from a PETSc 3.2 run to petsc-maint at mcs.anl.gov >> >> >> Barry >> >>> >>> Do they pose any serious threats? >>> >>>> >>>> Have you tried KSPBCGSL? This is "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab I guess. >>>> >>> >>> Thanks for the hint on bcgsl - it works as expected. >>> >>> So, do I have a problem in the code or bcgs is unreliable? If the >>> latter: as a method or as this specific implementation? >>> >>> Thanks for any comments, >>> Dominik >>> >>> >>>> >>>> Barry >>>> >>>> >>>> >>>>> After hours of debugging I was unable to find any errors in my coefficients, I experimentally found out, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector. >>>>> Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints. >>>>> >>>>> Regards, Dominik >>>> >> >> From dominik at itis.ethz.ch Sat Oct 22 12:56:13 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 19:56:13 +0200 Subject: [petsc-users] Error: Too many KSP monitors set In-Reply-To: References: <51018310-2101-4A9D-846F-A6F07039430B@mcs.anl.gov> Message-ID: The issue is gone in 3.2, at least with the same number of successive calls to kspsolve. Thanks, Dominik On Fri, Oct 21, 2011 at 4:05 PM, Dominik Szczerba wrote: > Unfortunately, I am still forced to use version 3.1. I will come back to > this issue after the upgrade to 3.2, which should be next week. 
> Many thanks, > Dominik > > On Fri, Oct 21, 2011 at 3:28 PM, Barry Smith wrote: >> >> ? Dominik, >> >> ? ? ?This should not happen. Does this happen with petsc-3.2? Note that we >> are now supporting PETSc 3.2 and if the problem does not exist in 3.2 we >> will not fix it in 3.1. If it exists in 3.2 we will definitely fix it. Can >> you send us a small piece of code that exhibits this behavior or point to a >> PETSc example that exhibits this behavior so we can reproduce it and fix the >> error in 3.2 >> >> ? ?Barry >> >> ? ?We have specific code that prevents the >> On Oct 21, 2011, at 4:18 AM, Dominik Szczerba wrote: >> >> > I am getting this error when performing a series of KSP solves. Removing >> > "-ksp_monitor_true_residual" from options removes the error, but I would >> > like to see the residues on screen. >> > >> > I do not seem to find any functions in the docu to set max number of KSP >> > monitors, but even if there is one - is it really wasteful to monitor each >> > solve of my system? >> > >> > Thanks, >> > Dominik >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > ------------------------------------ >> > [0]PETSC ERROR: Argument out of range! >> > [0]PETSC ERROR: Too many KSP monitors set! >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 >> > 13:37:48 CDT 2011 >> > >> > > From dominik at itis.ethz.ch Sat Oct 22 16:56:17 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 22 Oct 2011 23:56:17 +0200 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices Message-ID: After upgrade to 3.2 I face the following error: ERROR: VecScatterCreate() line 841 in /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Cannot pass default in for both input and output indices My code is: VecScatterCreate(bv0, PETSC_NULL, bv0Seq, PETSC_NULL, &ctx); According to the docu, there is nothing wrong passing PETSC_NULL for both, it was also fine in 3.1. I find no mention of any changes here in the 3.2 Changes documentation. What have I missed here? Regards, Dominik From bsmith at mcs.anl.gov Sat Oct 22 18:54:53 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Oct 2011 18:54:53 -0500 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: References: Message-ID: Dominik, In VecScatterCreate() is the new line if (!ix && !iy) SETERRQ(((PetscObject)xin)->comm,PETSC_ERR_SUP,"Cannot pass default in for both input and output indices"); you can try taking that out and see what happens. The reason it is there is because, except for input vectors x and y with very specific layout having VecScatterCreate() create default index sets leads to errors (because how is it suppose to know what parts of the vector you want0 and this is confusing for users. The best solution is for you to pass in an index set for either one or both of the IS arguments. Barry On Oct 22, 2011, at 4:56 PM, Dominik Szczerba wrote: > After upgrade to 3.2 I face the following error: > > ERROR: VecScatterCreate() line 841 in > /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Cannot pass > default in for both input and output indices > > My code is: > > VecScatterCreate(bv0, PETSC_NULL, bv0Seq, PETSC_NULL, &ctx); > > According to the docu, there is nothing wrong passing PETSC_NULL for > both, it was also fine in 3.1. I find no mention of any changes here > in the 3.2 Changes documentation. 
What have I missed here? > > Regards, > Dominik From dominik at itis.ethz.ch Sun Oct 23 04:33:56 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sun, 23 Oct 2011 11:33:56 +0200 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: References: Message-ID: > ? ?The reason it is there is because, except for input vectors x and y with very specific layout having VecScatterCreate() create default index sets leads to errors (because how is it suppose to know what parts of the vector you want0 and this is confusing for users. To me "scatters all values" and "fills entire vector yin", as per the documentation, is unambiguous. > ? ? The best solution is for you to pass in an index set for either one or both of the IS arguments. This is changing very many lines in a few codes, so I am looking for a minimalistic approach. Am I rigorously correct, or just lucky till the next release, to replace the first PETSC_NULL with IS created with ISCreateStride? I am only doing SCATTER_FORWARD. Many thanks, Dominik > > ? ? Barry > > On Oct 22, 2011, at 4:56 PM, Dominik Szczerba wrote: > >> After upgrade to 3.2 I face the following error: >> >> ERROR: VecScatterCreate() line 841 in >> /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Cannot pass >> default in for both input and output indices >> >> My code is: >> >> VecScatterCreate(bv0, PETSC_NULL, bv0Seq, PETSC_NULL, &ctx); >> >> According to the docu, there is nothing wrong passing PETSC_NULL for >> both, it was also fine in 3.1. I find no mention of any changes here >> in the 3.2 Changes documentation. What have I missed here? >> >> Regards, >> Dominik > > From bsmith at mcs.anl.gov Sun Oct 23 09:09:08 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 23 Oct 2011 09:09:08 -0500 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: References: Message-ID: <556E2B37-46C6-400C-8B69-7652E849AA19@mcs.anl.gov> On Oct 23, 2011, at 4:33 AM, Dominik Szczerba wrote: >> The reason it is there is because, except for input vectors x and y with very specific layout having VecScatterCreate() create default index sets leads to errors (because how is it suppose to know what parts of the vector you want0 and this is confusing for users. > > To me "scatters all values" and "fills entire vector yin", as per the > documentation, is unambiguous. > >> The best solution is for you to pass in an index set for either one or both of the IS arguments. > > This is changing very many lines in a few codes, so I am looking for a > minimalistic approach. Am I rigorously correct, or just lucky till the > next release, to replace the first PETSC_NULL with IS created with > ISCreateStride? Yes. Just make the local size of the scatter the same as the local size of the vector. > I am only doing SCATTER_FORWARD. > > Many thanks, > Dominik > >> >> Barry >> >> On Oct 22, 2011, at 4:56 PM, Dominik Szczerba wrote: >> >>> After upgrade to 3.2 I face the following error: >>> >>> ERROR: VecScatterCreate() line 841 in >>> /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Cannot pass >>> default in for both input and output indices >>> >>> My code is: >>> >>> VecScatterCreate(bv0, PETSC_NULL, bv0Seq, PETSC_NULL, &ctx); >>> >>> According to the docu, there is nothing wrong passing PETSC_NULL for >>> both, it was also fine in 3.1. I find no mention of any changes here >>> in the 3.2 Changes documentation. What have I missed here? 
>>> >>> Regards, >>> Dominik >> >> From gdiso at ustc.edu Mon Oct 24 02:56:05 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 24 Oct 2011 15:56:05 +0800 (CST) Subject: [petsc-users] CPU selection Message-ID: <8168154.288561319442965950.JavaMail.coremail@mail.ustc.edu> Hi all We'd like to buy one more computation node for TCAD simulation. The simulation code is based on petsc, 90% of the time spend on linear solver. Krylov solvers such as BCGS/GMRES and MUMPS solver are mostly used. Now we have two choise, one is based on Intel XEON E5620 (4 Core) and the othre is based on AMD Opteron 6128 (8 Core). The tow CPUs are nearly at the same price. Can anyone give some benckmark about the performance of above two system? Does petsc linear solve has performance gain with extra cores of AMD Opteron? Thanks. Gong Ding -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Oct 24 07:20:10 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 24 Oct 2011 07:20:10 -0500 Subject: [petsc-users] CPU selection In-Reply-To: <8168154.288561319442965950.JavaMail.coremail@mail.ustc.edu> References: <8168154.288561319442965950.JavaMail.coremail@mail.ustc.edu> Message-ID: 2011/10/24 Gong Ding > Hi all > We'd like to buy one more computation node for TCAD simulation. > The simulation code is based on petsc, 90% of the time spend on linear > solver. > Krylov solvers such as BCGS/GMRES and MUMPS solver are mostly used. > > Now we have two choise, one is based on Intel XEON E5620 (4 Core) and > the othre is based on AMD Opteron 6128 (8 Core). > The tow CPUs are nearly at the same price. > For sparse linear algebra, you are buying memory bandwidth. In this case, the Xeon has 3 channels of DDR3-1066 for an aggregate bandwidth of 25.5 GB/s where as the 6128 has 4 channels of DDR3-1333 for an aggregate bandwidth of 42.6 GB/s. Note however, that you are probably less likely to fully utilize the bandwidth on the Opteron because it has a deeper NUMA hierarchy. It's probably still faster, but probably not by much. If you get the Opteron, be sure to get a motherboard that supports DDR3-1333 and actually get the fast memory. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 24 07:27:32 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 24 Oct 2011 07:27:32 -0500 Subject: [petsc-users] CPU selection In-Reply-To: References: <8168154.288561319442965950.JavaMail.coremail@mail.ustc.edu> Message-ID: On Oct 24, 2011, at 7:20 AM, Jed Brown wrote: > 2011/10/24 Gong Ding > Hi all > We'd like to buy one more computation node for TCAD simulation. > The simulation code is based on petsc, 90% of the time spend on linear solver. > Krylov solvers such as BCGS/GMRES and MUMPS solver are mostly used. > > Now we have two choise, one is based on Intel XEON E5620 (4 Core) and > the othre is based on AMD Opteron 6128 (8 Core). > The tow CPUs are nearly at the same price. > > For sparse linear algebra, you are buying memory bandwidth. In this case, the Xeon has 3 channels of DDR3-1066 for an aggregate bandwidth of 25.5 GB/s where as the 6128 has 4 channels of DDR3-1333 for an aggregate bandwidth of 42.6 GB/s. Note however, that you are probably less likely to fully utilize the bandwidth on the Opteron because it has a deeper NUMA hierarchy. It's probably still faster, but probably not by much. If you get the Opteron, be sure to get a motherboard that supports DDR3-1333 and actually get the fast memory. 
If at all possible, you'll want to see the Streams benchmark numbers in parallel for both of these for comparison purposes. Barry From C.Klaij at marin.nl Mon Oct 24 08:07:24 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 24 Oct 2011 13:07:24 +0000 Subject: [petsc-users] KSPMonitorSingularValu Message-ID: >> Regarding modified Gram Schmidt, I tried to set it as follows: >> >> call >> KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) >> >> But my compiler tells me: >> >> This name does not have a type, and must have an explicit type. >> [KSPGMRESMODIFIEDGRAMSCHMIDTORTHOGONALIZATIO] >> > >It looks like you have a line length problem. > > Matt > > >> (petsc-3.1-p7, fortran with "use petscksp" and #include >> "finclude/petsckspdef.h") >> >> I'm using ifort and free form fortran so the lines can have any length in principle. call KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) or call KSPGMRESSetOrthogonalization(ksp, & KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) give the same compiler message... Any ideas how to fix this? dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Mon Oct 24 08:15:38 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 24 Oct 2011 08:15:38 -0500 Subject: [petsc-users] KSPMonitorSingularValu In-Reply-To: References: Message-ID: On Mon, Oct 24, 2011 at 08:07, Klaij, Christiaan wrote: > I'm using ifort and free form fortran so the lines can have any > length in principle. > > call > KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) > > or > > call KSPGMRESSetOrthogonalization(ksp, & > KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) > > give the same compiler message... Any ideas how to fix this? > Looks like there isn't a Fortran binding to this function so just use PetscOptionsSetValue("-ksp_gmres_modifiedgramschmidt", "1") for now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Mon Oct 24 08:31:02 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 24 Oct 2011 21:31:02 +0800 (CST) Subject: [petsc-users] CPU selection In-Reply-To: References: <8168154.288561319442965950.JavaMail.coremail@mail.ustc.edu> Message-ID: <26337493.288981319463062957.JavaMail.coremail@mail.ustc.edu> 2011/10/24 Gong Ding Hi all We'd like to buy one more computation node for TCAD simulation. The simulation code is based on petsc, 90% of the time spend on linear solver. Krylov solvers such as BCGS/GMRES and MUMPS solver are mostly used. Now we have two choise, one is based on Intel XEON E5620 (4 Core) and the othre is based on AMD Opteron 6128 (8 Core). The tow CPUs are nearly at the same price. For sparse linear algebra, you are buying memory bandwidth. In this case, the Xeon has 3 channels of DDR3-1066 for an aggregate bandwidth of 25.5 GB/s where as the 6128 has 4 channels of DDR3-1333 for an aggregate bandwidth of 42.6 GB/s. Note however, that you are probably less likely to fully utilize the bandwidth on the Opteron because it has a deeper NUMA hierarchy. It's probably still faster, but probably not by much. If you get the Opteron, be sure to get a motherboard that supports DDR3-1333 and actually get the fast memory. Thank you. I will consider Opteron system. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From manuel.perezcerquera at polito.it Mon Oct 24 09:37:52 2011 From: manuel.perezcerquera at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Mon, 24 Oct 2011 16:37:52 +0200 Subject: [petsc-users] About VecGetArray Message-ID: Hi all, I'm Trying to figure out, how to Get all The global elements of a Vector, I mean something like VecGetArray but instead of getting only the local elements, I need to obtain all the global elements, Is there any directive in Petsc FORTRAN to do this or should I need to use directly MPI directives? Thanks. Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From petsc-maint at mcs.anl.gov Mon Oct 24 09:40:12 2011 From: petsc-maint at mcs.anl.gov (Satish Balay) Date: Mon, 24 Oct 2011 09:40:12 -0500 (CDT) Subject: [petsc-users] [petsc-maint #91329] About VecGetArray In-Reply-To: References: Message-ID: http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#mpi-vec-to-mpi-vec satish On Mon, 24 Oct 2011, PEREZ CERQUERA MANUEL RICARDO wrote: > Hi all, > > I'm Trying to figure out, how to Get all The global > elements of a Vector, I mean something like VecGetArray > but instead of getting only the local elements, I need to > obtain all the global elements, Is there any directive in > Petsc FORTRAN to do this or should I need to use directly > MPI directives? > > Thanks. > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > > From balay at mcs.anl.gov Mon Oct 24 10:13:52 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 24 Oct 2011 10:13:52 -0500 (CDT) Subject: [petsc-users] petsc-3.2-p4.tar.gz now available Message-ID: Dear PETSc users, Patch-4 update to petsc-3.2 is now available for download. http://www.mcs.anl.gov/petsc/petsc-as/download/index.html Satish From dominik at itis.ethz.ch Mon Oct 24 14:41:13 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 24 Oct 2011 21:41:13 +0200 Subject: [petsc-users] Question about VecScatterCreate's Documentation Message-ID: Documentations says: ix - the indices of xin to scatter (if PETSC_NULL scatters all values) iy - the indices of yin to hold results (if PETSC_NULL fills entire vector yin) Are the indices expected in the application ordering or local ordering? From dominik at itis.ethz.ch Mon Oct 24 15:29:59 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 24 Oct 2011 22:29:59 +0200 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: <556E2B37-46C6-400C-8B69-7652E849AA19@mcs.anl.gov> References: <556E2B37-46C6-400C-8B69-7652E849AA19@mcs.anl.gov> Message-ID: >>> ? ? The best solution is for you to pass in an index set for either one or both of the IS arguments. >> >> This is changing very many lines in a few codes, so I am looking for a >> minimalistic approach. Am I rigorously correct, or just lucky till the >> next release, to replace the first PETSC_NULL with IS created with >> ISCreateStride? > > ? Yes. Just make the local size of the scatter the same as the local size of the vector. 
When doing so, I receive: [1]PETSC ERROR: VecScatterCreate() line 1136 in /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Local scatter sizes don't match Inspecting vscat.c reveals: ierr = ISGetLocalSize(ix,&nx);CHKERRQ(ierr); ... ierr = ISGetLocalSize(iy,&ny);CHKERRQ(ierr); ... if (nx != ny) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_SIZ,"Local scatter sizes don't match"); which looks quite suspicious to me, i.e. 1) If iy is PETSC_NULL, how can its size be queried? 2) looks like NONE of the index sets is actually allowed to be PETSC_NULL. I would be very grateful for some clarifications here. Many thanks and best regards, Dominik From jiangwen84 at gmail.com Mon Oct 24 15:37:19 2011 From: jiangwen84 at gmail.com (Wen Jiang) Date: Mon, 24 Oct 2011 16:37:19 -0400 Subject: [petsc-users] problem of running jobs on cluster Message-ID: Hi guys, I reported this problem a few days ago but I still cannot get it fixed. Right now I am learning how to debug the parallel code. And I just want to get some suggestions before I figure out how the debugger works. This is just a big run of my own fem code, which has almost the same structure as the ex3 in ksp examples. This code ( the largest dof I used is around 65,000 ) is running totally fine on one compute node with any number of processes. And the code with smaller dof ( less than 5000) is also working fine on more than one compute node. However, I am encountering a problem when I tries to run a large job ( for example, dof = 10,000 ) on two compute nodes. The problem is that my code will get stuck at the MatAssemblyEnd() stage. I use the option -info to print information about the code and find that only some of the processes gives the MatAssemblyEnd_SeqAIJ() information and thus the code gets stuck there. I have several questions here, 1. In ex3, the comments said that the matrix is intentionally laid out across processors differently from the way it is assembled. As far as I understand, this means that the MatSetValues() will insert the values to different processors.( am I correct?). Since generating the entries on the 'wrong' process is expensive, I am just wondering whether there is a better way to do it especially for the assembly the global stiffness matrix in FEM. ( In my code, the MatSetValues will add a 64 by 64 element stiffness matrix every time ) 2. Since my code (dofs around 10,000 ) is working fine on single node but get stuck on two nodes, I am guessing that might be due to the large chuck of data which needs to be communicated between different nodes in the stage of MatAssembly ? Will the data communication be slower between different nodes than within single node? I appreciate any of your suggestion and I will also keep working on the debugging. Thanks, Wen -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 24 15:41:22 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 24 Oct 2011 15:41:22 -0500 Subject: [petsc-users] Question about VecScatterCreate's Documentation In-Reply-To: References: Message-ID: The indices are ALWAYS with respect to the numbering of the Vec they are associated with. If the Vec they are associated with lives on just that one process (it is a VECSEQ) then you would say it is with respect to the local ordering if the Vec is parallel then you would say it is with respect to the global ordering. But that is the wrong way of looking at it: the indices are simply indices into the correct slot in the Vec. 
Barry On Oct 24, 2011, at 2:41 PM, Dominik Szczerba wrote: > Documentations says: > > ix - the indices of xin to scatter (if PETSC_NULL scatters all values) > iy - the indices of yin to hold results (if PETSC_NULL fills entire vector yin) > > Are the indices expected in the application ordering or local ordering? From bsmith at mcs.anl.gov Mon Oct 24 15:46:54 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 24 Oct 2011 15:46:54 -0500 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: References: <556E2B37-46C6-400C-8B69-7652E849AA19@mcs.anl.gov> Message-ID: Your problem is that you think that somehow if you pass defaults to VecScatter is going to magically determine which elements you want mapped from the first vector to which elements in the second vector. Unless both vectors have the exact same local sizes how is it to know what mapping you want? (Hence the error it generated) You need to determine exactly what you want mapped from the first vector to the second and then set up the IS to make that happen. Hoping that some defaults will do what you want is not a good idea. Note that each process has to provide the same size in ix and iy, that determine the froms and tos for the entries. It makes no sense for a process to provide the locations of some froms but not the locations of the corresponding toos. Barry On Oct 24, 2011, at 3:29 PM, Dominik Szczerba wrote: >>>> The best solution is for you to pass in an index set for either one or both of the IS arguments. >>> >>> This is changing very many lines in a few codes, so I am looking for a >>> minimalistic approach. Am I rigorously correct, or just lucky till the >>> next release, to replace the first PETSC_NULL with IS created with >>> ISCreateStride? >> >> Yes. Just make the local size of the scatter the same as the local size of the vector. > > When doing so, I receive: > > [1]PETSC ERROR: VecScatterCreate() line 1136 in > /home/dsz/pack/petsc-3.2-p3/src/vec/vec/utils/vscat.c Local scatter > sizes don't match > > Inspecting vscat.c reveals: > > > > ierr = ISGetLocalSize(ix,&nx);CHKERRQ(ierr); > ... > ierr = ISGetLocalSize(iy,&ny);CHKERRQ(ierr); > ... > if (nx != ny) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_SIZ,"Local scatter > sizes don't match"); > > > > which looks quite suspicious to me, i.e. 1) If iy is PETSC_NULL, how > can its size be queried? 2) looks like NONE of the index sets is > actually allowed to be PETSC_NULL. > > I would be very grateful for some clarifications here. > > Many thanks and best regards, > Dominik From bsmith at mcs.anl.gov Mon Oct 24 15:52:43 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 24 Oct 2011 15:52:43 -0500 Subject: [petsc-users] problem of running jobs on cluster In-Reply-To: References: Message-ID: On Oct 24, 2011, at 3:37 PM, Wen Jiang wrote: > Hi guys, > > I reported this problem a few days ago but I still cannot get it fixed. Right now I am learning how to debug the parallel code. And I just want to get some suggestions before I figure out how the debugger works. > > This is just a big run of my own fem code, which has almost the same structure as the ex3 in ksp examples. This code ( the largest dof I used is around 65,000 ) is running totally fine on one compute node with any number of processes. And the code with smaller dof ( less than 5000) is also working fine on more than one compute node. However, I am encountering a problem when I tries to run a large job ( for example, dof = 10,000 ) on two compute nodes. 
> > The problem is that my code will get stuck at the MatAssemblyEnd() stage. I use the option -info to print information about the code and find that only some of the processes gives the MatAssemblyEnd_SeqAIJ() information and thus the code gets stuck there. > > I have several questions here, > > 1. In ex3, the comments said that the matrix is intentionally laid out across processors differently from the way it is assembled. As far as I understand, this means that the MatSetValues() will insert the values to different processors.( am I correct?). Since generating the entries on the 'wrong' process is expensive, I am just wondering whether there is a better way to do it especially for the assembly the global stiffness matrix in FEM. ( In my code, the MatSetValues will add a 64 by 64 element stiffness matrix every time ) > > 2. Since my code (dofs around 10,000 ) is working fine on single node but get stuck on two nodes, I am guessing that might be due to the large chuck of data which needs to be communicated between different nodes in the stage of MatAssembly ? Will the data communication be slower between different nodes than within single node? Absolutely. You want to generate most of the matrix entries on the process where they will be stored. You also need to make sure you've done the correct matrix preallocation: http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly Working for small problem and taking "forever" for larger problem is a sign of bad preallocation or too much data computed on the wrong process. Also run the smaller problem with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind to make sure there are no memory corruption problems that are slipping by on the small mesh but causing problems on the large. Also run the small problem and check for correct memory preallocation; if it is wrong for the small problem it will be wrong for the large. Barry > > I appreciate any of your suggestion and I will also keep working on the debugging. > > Thanks, > Wen From jedbrown at mcs.anl.gov Mon Oct 24 19:15:51 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 24 Oct 2011 19:15:51 -0500 Subject: [petsc-users] ERROR: Cannot pass default in for both input and output indices In-Reply-To: References: <556E2B37-46C6-400C-8B69-7652E849AA19@mcs.anl.gov> Message-ID: On Mon, Oct 24, 2011 at 15:29, Dominik Szczerba wrote: > ierr = ISGetLocalSize(ix,&nx);CHKERRQ(ierr); > ... > ierr = ISGetLocalSize(iy,&ny);CHKERRQ(ierr); > ... > if (nx != ny) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_SIZ,"Local scatter > sizes don't match"); > > > > which looks quite suspicious to me, i.e. 1) If iy is PETSC_NULL, how > can its size be queried? > Your confusion comes from not looking at the earlier part of the function where index sets are created if you passed in NULL. The only reason ix=NULL and iy=NULL is not allowed is that it's a rare case that more frequently represents misunderstanding. If it's really what you want, then just make the index set. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Tue Oct 25 01:42:58 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 25 Oct 2011 06:42:58 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: >> I'm using ifort and free form fortran so the lines can have any >> length in principle. 
>> >> call >> KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) >> >> or >> >> call KSPGMRESSetOrthogonalization(ksp, & >> KSPGMRESModifiedGramSchmidtOrthogonalization,ierr) >> >> give the same compiler message... Any ideas how to fix this? >> > >Looks like there isn't a Fortran binding to this function so just use >PetscOptionsSetValue("-ksp_gmres_modifiedgramschmidt", "1") for now. That compiles without problem, thanks! dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From C.Klaij at marin.nl Tue Oct 25 03:45:42 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 25 Oct 2011 08:45:42 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: >> The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard >> with SIMPLE preconditioning. It works for most case but stagnates >> for this particular case although SIMPLE as solver does converge nicely. >> > >> How are you applying the action of the linear operator? If you use finite >> differencing, it could be inaccurate. Is this incompressible or a low-Mach >> compressible formulation? Try -ksp_monitor_true_residual, if the true >> residual drifts from the unpreconditioned residual computed by FGMRES, the >> Krylov space could be losing orthogonality. You can try >> -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in >> restarts? > > >It's incompressible Navier-Stokes. No finite differencing, the >action is computed directly without approximations. It's right >preconditioning, so preconditioned and true residual should be >the same. I don't get any progress, the residual is stagnating >from the very first iteration way before any restart. Now that I have the singular value monitor working this is what I get for a converging problem and the stagnating problem. Both problems are quite similar (same type of flow, same type of grid type, same BCs, just slightly different geometry). Any ideas? 
Converging problem: 0 KSP Residual norm 2.365447773598e+00 % max 1 min 1 max/min 1 1 KSP Residual norm 1.688717750865e+00 % max 2.43352 min 2.43352 max/min 1 2 KSP Residual norm 1.441677051369e+00 % max 3.39561 min 0.658191 max/min 5.15901 3 KSP Residual norm 1.309618066659e+00 % max 3.45429 min 0.477676 max/min 7.23145 4 KSP Residual norm 1.236059684429e+00 % max 3.68325 min 0.418244 max/min 8.80645 5 KSP Residual norm 1.178488235028e+00 % max 4.26371 min 0.37149 max/min 11.4773 Stagnating problem (classical Gram-Schmidt): 0 KSP Residual norm 6.466095059482e-02 % max 1 min 1 max/min 1 1 KSP Residual norm 6.466093783326e-02 % max 1.07631 min 1.07631 max/min 1 2 KSP Residual norm 6.466093747580e-02 % max 3.0223 min 1.06721 max/min 2.83196 3 KSP Residual norm 6.466093697245e-02 % max 6.79697 min 1.0663 max/min 6.37436 4 KSP Residual norm 6.466092195305e-02 % max 8.16759 min 1.06345 max/min 7.68027 5 KSP Residual norm 6.466090166280e-02 % max 8.80407 min 1.06326 max/min 8.28026 Stagnating problem (modified Gram-Schmidt): 0 KSP Residual norm 6.470362245549e-02 % max 1 min 1 max/min 1 1 KSP Residual norm 6.470334363709e-02 % max 1.08156 min 1.08156 max/min 1 2 KSP Residual norm 6.470326556761e-02 % max 3.05856 min 1.07224 max/min 2.8525 3 KSP Residual norm 6.470326223293e-02 % max 6.85267 min 1.0711 max/min 6.39777 4 KSP Residual norm 6.470312640256e-02 % max 8.17917 min 1.06837 max/min 7.65572 5 KSP Residual norm 6.470312619015e-02 % max 8.7711 min 1.06807 max/min 8.21213 dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From Debao.Shao at brion.com Tue Oct 25 03:45:24 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 25 Oct 2011 01:45:24 -0700 Subject: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime Message-ID: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> DA, I'm using PETSc ILU(1)+GMRES to solve an QP problem. But the job may hang at "MatLUFactorNumeric_SeqAIJ()" for long time. (gdb) bt #0 0x0000000002284520 in MatLUFactorNumeric_SeqAIJ () #1 0x0000000002265f89 in MatLUFactorNumeric () #2 0x0000000002342ff3 in PCSetUp_ILU () #3 0x00000000024f8d65 in PCSetUp () #4 0x0000000002362964 in KSPSetUp () #5 0x0000000002363455 in KSPSolve () Do you have idea for such phenomenon? Thanks, Debao ________________________________ -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From manuel.perezcerquera at polito.it Tue Oct 25 04:39:05 2011 From: manuel.perezcerquera at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Tue, 25 Oct 2011 11:39:05 +0200 Subject: [petsc-users] Using VecScatterCreateToAll Errors Message-ID: Hi all, I'm Trying to use VecScatterCreateToAll in FORTRAN so I'm doing: VecScatter ctx Vec AllCurrentValues CALL VecScatterCreateToAll(I,ctx,AllCurrentValues,ierr) CALL VecScatterBegin(ctx,I,AllCurrentValues,INSERT_VALUES,SCATTER_FORWARD,ierr) CALL VecScatterEnd(ctx,I,AllCurrentValues,INSERT_VALUES,SCATTER_FORWARD,ierr) I is al MPIVector previously created, initialized and assembled, However when I call VecScatterCreateToAll I got thid error message: [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 CDT 20 11 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ---------------------------------------------------------------- Then I tried to Declare ctx and Vec as pointers ,in this way: VecScatter,pointer :: ctx Vec,pointer :: AllCurrentValues But It didn't work, I obtained this other error message: [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 1! [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 CDT 20 11 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ---------------------------------------------------------------- I really appreciate some help, because there is not FORTRAN examples with this directive, and I'm afraid I'm missing some change in the CALL respect to C calling. Thank you! Manuel. Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From jedbrown at mcs.anl.gov Tue Oct 25 06:58:33 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 25 Oct 2011 06:58:33 -0500 Subject: [petsc-users] Using VecScatterCreateToAll Errors In-Reply-To: References: Message-ID: Don't declare as pointers. What is the complete error message? Is I really a Vec? On Oct 25, 2011 4:39 AM, "PEREZ CERQUERA MANUEL RICARDO" < manuel.perezcerquera at polito.it> wrote: > Hi all, > > I'm Trying to use VecScatterCreateToAll in FORTRAN so I'm doing: > > VecScatter ctx > Vec AllCurrentValues > > CALL VecScatterCreateToAll(I,ctx,**AllCurrentValues,ierr) > CALL VecScatterBegin(ctx,I,**AllCurrentValues,INSERT_** > VALUES,SCATTER_FORWARD,ierr) > CALL VecScatterEnd(ctx,I,**AllCurrentValues,INSERT_** > VALUES,SCATTER_FORWARD,ierr) > > I is al MPIVector previously created, initialized and assembled, However > when I call VecScatterCreateToAll I got thid error message: > > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! 
> [0]PETSC ERROR: ------------------------------** > ------------------------------**---- > -------- > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------** > ------------------------------**---- > > Then I tried to Declare ctx and Vec as pointers ,in this way: > > VecScatter,pointer :: ctx > Vec,pointer :: AllCurrentValues > > But It didn't work, I obtained this other error message: > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > [0]PETSC ERROR: ------------------------------** > ------------------------------**---- > -------- > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------** > ------------------------------**---- > > I really appreciate some help, because there is not FORTRAN examples with > this directive, and I'm afraid I'm missing some change in the CALL respect > to C calling. Thank you! > > Manuel. > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 25 08:00:14 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Oct 2011 08:00:14 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: <9685813B-4515-4246-9AE2-8685FE76BB22@mcs.anl.gov> You need to run for much more iterations to get any kind of reliable singular values. Use something like at least -ksp_gmres_restart 100 -ksp_max_it 100 and see what happens to the singular values. Barry On Oct 25, 2011, at 3:45 AM, Klaij, Christiaan wrote: >>> The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard >>> with SIMPLE preconditioning. It works for most case but stagnates >>> for this particular case although SIMPLE as solver does converge nicely. >>> >> >>> How are you applying the action of the linear operator? If you use finite >>> differencing, it could be inaccurate. Is this incompressible or a low-Mach >>> compressible formulation? Try -ksp_monitor_true_residual, if the true >>> residual drifts from the unpreconditioned residual computed by FGMRES, the >>> Krylov space could be losing orthogonality. You can try >>> -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in >>> restarts? >> >> >> It's incompressible Navier-Stokes. No finite differencing, the >> action is computed directly without approximations. It's right >> preconditioning, so preconditioned and true residual should be >> the same. I don't get any progress, the residual is stagnating >> from the very first iteration way before any restart. > > Now that I have the singular value monitor working this is what I > get for a converging problem and the stagnating problem. 
Both > problems are quite similar (same type of flow, same type of grid > type, same BCs, just slightly different geometry). Any ideas? > > Converging problem: > > 0 KSP Residual norm 2.365447773598e+00 % max 1 min 1 max/min 1 > 1 KSP Residual norm 1.688717750865e+00 % max 2.43352 min 2.43352 max/min 1 > 2 KSP Residual norm 1.441677051369e+00 % max 3.39561 min 0.658191 max/min 5.15901 > 3 KSP Residual norm 1.309618066659e+00 % max 3.45429 min 0.477676 max/min 7.23145 > 4 KSP Residual norm 1.236059684429e+00 % max 3.68325 min 0.418244 max/min 8.80645 > 5 KSP Residual norm 1.178488235028e+00 % max 4.26371 min 0.37149 max/min 11.4773 > > Stagnating problem (classical Gram-Schmidt): > > 0 KSP Residual norm 6.466095059482e-02 % max 1 min 1 max/min 1 > 1 KSP Residual norm 6.466093783326e-02 % max 1.07631 min 1.07631 max/min 1 > 2 KSP Residual norm 6.466093747580e-02 % max 3.0223 min 1.06721 max/min 2.83196 > 3 KSP Residual norm 6.466093697245e-02 % max 6.79697 min 1.0663 max/min 6.37436 > 4 KSP Residual norm 6.466092195305e-02 % max 8.16759 min 1.06345 max/min 7.68027 > 5 KSP Residual norm 6.466090166280e-02 % max 8.80407 min 1.06326 max/min 8.28026 > > Stagnating problem (modified Gram-Schmidt): > > 0 KSP Residual norm 6.470362245549e-02 % max 1 min 1 max/min 1 > 1 KSP Residual norm 6.470334363709e-02 % max 1.08156 min 1.08156 max/min 1 > 2 KSP Residual norm 6.470326556761e-02 % max 3.05856 min 1.07224 max/min 2.8525 > 3 KSP Residual norm 6.470326223293e-02 % max 6.85267 min 1.0711 max/min 6.39777 > 4 KSP Residual norm 6.470312640256e-02 % max 8.17917 min 1.06837 max/min 7.65572 > 5 KSP Residual norm 6.470312619015e-02 % max 8.7711 min 1.06807 max/min 8.21213 > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From bsmith at mcs.anl.gov Tue Oct 25 08:04:12 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Oct 2011 08:04:12 -0500 Subject: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> References: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> Message-ID: <128B46D5-40E7-4E55-BE7F-9A7B13AEFF73@mcs.anl.gov> My guess is that the code is repeatedly trying larger and large shifts to get a nonzero pivot as it factors. If you run with -info and let it run to completion it will print some information about the shifts it has tried. Barry On Oct 25, 2011, at 3:45 AM, Debao Shao wrote: > DA, > > I?m using PETSc ILU(1)+GMRES to solve an QP problem. But the job may hang at ?MatLUFactorNumeric_SeqAIJ()? for long time. > > (gdb) bt > #0 0x0000000002284520 in MatLUFactorNumeric_SeqAIJ () > #1 0x0000000002265f89 in MatLUFactorNumeric () > #2 0x0000000002342ff3 in PCSetUp_ILU () > #3 0x00000000024f8d65 in PCSetUp () > #4 0x0000000002362964 in KSPSetUp () > #5 0x0000000002363455 in KSPSolve () > > Do you have idea for such phenomenon? > > Thanks, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. 
To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From jedbrown at mcs.anl.gov Tue Oct 25 08:06:40 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 25 Oct 2011 08:06:40 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: As alway, listen to Barry. You might also look at how the eigenvalues of this matrix is distributed. If they encircle the origin, for example, then GMRES will be pathologically bad, but the condition number is still small. It's worth comparing with bcgs and cgne. On Oct 25, 2011 3:44 AM, "Klaij, Christiaan" wrote: > >> The setting is Navier-Stokes, colocated FVM, matrix-free Krylov-Picard > >> with SIMPLE preconditioning. It works for most case but stagnates > >> for this particular case although SIMPLE as solver does converge nicely. > >> > > > >> How are you applying the action of the linear operator? If you use > finite > >> differencing, it could be inaccurate. Is this incompressible or a > low-Mach > >> compressible formulation? Try -ksp_monitor_true_residual, if the true > >> residual drifts from the unpreconditioned residual computed by FGMRES, > the > >> Krylov space could be losing orthogonality. You can try > >> -ksp_gmres_modifiedgramschmidt. Are you losing a lot of progress in > >> restarts? > > > > > >It's incompressible Navier-Stokes. No finite differencing, the > >action is computed directly without approximations. It's right > >preconditioning, so preconditioned and true residual should be > >the same. I don't get any progress, the residual is stagnating > >from the very first iteration way before any restart. > > Now that I have the singular value monitor working this is what I > get for a converging problem and the stagnating problem. Both > problems are quite similar (same type of flow, same type of grid > type, same BCs, just slightly different geometry). Any ideas? 
> > Converging problem: > > 0 KSP Residual norm 2.365447773598e+00 % max 1 min 1 max/min 1 > 1 KSP Residual norm 1.688717750865e+00 % max 2.43352 min 2.43352 max/min 1 > 2 KSP Residual norm 1.441677051369e+00 % max 3.39561 min 0.658191 max/min > 5.15901 > 3 KSP Residual norm 1.309618066659e+00 % max 3.45429 min 0.477676 max/min > 7.23145 > 4 KSP Residual norm 1.236059684429e+00 % max 3.68325 min 0.418244 max/min > 8.80645 > 5 KSP Residual norm 1.178488235028e+00 % max 4.26371 min 0.37149 max/min > 11.4773 > > Stagnating problem (classical Gram-Schmidt): > > 0 KSP Residual norm 6.466095059482e-02 % max 1 min 1 max/min 1 > 1 KSP Residual norm 6.466093783326e-02 % max 1.07631 min 1.07631 max/min 1 > 2 KSP Residual norm 6.466093747580e-02 % max 3.0223 min 1.06721 max/min > 2.83196 > 3 KSP Residual norm 6.466093697245e-02 % max 6.79697 min 1.0663 max/min > 6.37436 > 4 KSP Residual norm 6.466092195305e-02 % max 8.16759 min 1.06345 max/min > 7.68027 > 5 KSP Residual norm 6.466090166280e-02 % max 8.80407 min 1.06326 max/min > 8.28026 > > Stagnating problem (modified Gram-Schmidt): > > 0 KSP Residual norm 6.470362245549e-02 % max 1 min 1 max/min 1 > 1 KSP Residual norm 6.470334363709e-02 % max 1.08156 min 1.08156 max/min 1 > 2 KSP Residual norm 6.470326556761e-02 % max 3.05856 min 1.07224 max/min > 2.8525 > 3 KSP Residual norm 6.470326223293e-02 % max 6.85267 min 1.0711 max/min > 6.39777 > 4 KSP Residual norm 6.470312640256e-02 % max 8.17917 min 1.06837 max/min > 7.65572 > 5 KSP Residual norm 6.470312619015e-02 % max 8.7711 min 1.06807 max/min > 8.21213 > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From curfman at mcs.anl.gov Tue Oct 25 07:46:56 2011 From: curfman at mcs.anl.gov (Lois Curfman McInnes) Date: Tue, 25 Oct 2011 08:46:56 -0400 Subject: [petsc-users] query about PETSc usage in Fortran applications Message-ID: I am collecting information about PETSc use in Fortran applications. If you are a using PETSc via the Fortran interface, please send email (to me only, curfman at mcs.anl.gov) to indicate: - application area - what parts of PETSc are used - pointer to any publications or other references Of particular interest are applications in which PETSc facilitated a transition to parallelism for an existing application that had previously been only sequential. Thanks, Lois -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jiangwen84 at gmail.com Tue Oct 25 14:38:22 2011 From: jiangwen84 at gmail.com (Wen Jiang) Date: Tue, 25 Oct 2011 15:38:22 -0400 Subject: [petsc-users] debugger does not work on my machine Message-ID: Hi all, I was trying to use the gdb debugger by mpiexec -n 2 ./myprogram -start_in_debugger I got the error: ******************************************************************************************* [1]PETSC ERROR: PETSC: Attaching gdb to ./myprogram of pid 60344 on display knossos:0.0 on machine knossos [0]PETSC ERROR: PETSC: Attaching gdb to ./myprogram of pid 60343 on display knossos:0.0 on machine knossos Unable to start debugger in xterm: No such file or directory Unable to start debugger in xterm: No such file or directory ******************************************************************************************* I tried mpiexec -n 2 ./myprogram -start_in_debugger noxterm I got the information, some of them are ******************************************************************************************* 0x00000033c50a6a50 in __nanosleep_nocancel () from /lib64/libc.so.6 0x00000033c50a6a50 in __nanosleep_nocancel () from /lib64/libc.so.6 Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6_0.5.x86_64 libX11-1.3-2.el6.x86_64 libXau-1.0.5-1.el6.x86_64 libgcc-4.4.4-13.el6.x86_64 libgfortran-4.4.4-13.el6.x86_ 64 libstdc++-4.4.4-13.el6.x86_64 libxcb-1.5-1.el6.x86_64 (gdb) Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6_0.5.x86_64 libX11-1.3-2.el6.x86_64 libXau-1.0.5-1.el6.x86_64 libgcc-4.4.4-13.el6.x86_64 libgfortran-4.4.4-13.el6.x86_64 libstdc++-4.4.4-13.el6.x86_64 libxcb-1.5-1.el6.x86_64 (gdb) ******************************************************************************************** ( Does this mean I need to install those libraries? I also tried to install as root, but it said that 'Could not find debuginfo pkg for dependency package libxcb-1.5.1-el6.x86_64'. My system is scientific linux and it use GNOME 2.28.2. I am wondering how to fix this problem. Thanks, Wen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Tue Oct 25 14:41:34 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 25 Oct 2011 14:41:34 -0500 (CDT) Subject: [petsc-users] debugger does not work on my machine In-Reply-To: References: Message-ID: you need to install xterm Satish On Tue, 25 Oct 2011, Wen Jiang wrote: > Hi all, > > I was trying to use the gdb debugger by > > mpiexec -n 2 ./myprogram -start_in_debugger > > I got the error: > > ******************************************************************************************* > [1]PETSC ERROR: PETSC: Attaching gdb to ./myprogram of pid 60344 on display > knossos:0.0 on machine knossos > [0]PETSC ERROR: PETSC: Attaching gdb to ./myprogram of pid 60343 on display > knossos:0.0 on machine knossos > Unable to start debugger in xterm: No such file or directory > Unable to start debugger in xterm: No such file or directory > ******************************************************************************************* > > I tried mpiexec -n 2 ./myprogram -start_in_debugger noxterm > > I got the information, some of them are > > ******************************************************************************************* > 0x00000033c50a6a50 in __nanosleep_nocancel () from /lib64/libc.so.6 > 0x00000033c50a6a50 in __nanosleep_nocancel () from /lib64/libc.so.6 > Missing separate debuginfos, use: debuginfo-install > glibc-2.12-1.7.el6_0.5.x86_64 libX11-1.3-2.el6.x86_64 > libXau-1.0.5-1.el6.x86_64 libgcc-4.4.4-13.el6.x86_64 > libgfortran-4.4.4-13.el6.x86_ > 64 libstdc++-4.4.4-13.el6.x86_64 libxcb-1.5-1.el6.x86_64 > (gdb) Missing separate debuginfos, use: debuginfo-install > glibc-2.12-1.7.el6_0.5.x86_64 libX11-1.3-2.el6.x86_64 > libXau-1.0.5-1.el6.x86_64 libgcc-4.4.4-13.el6.x86_64 > libgfortran-4.4.4-13.el6.x86_64 libstdc++-4.4.4-13.el6.x86_64 > libxcb-1.5-1.el6.x86_64 > (gdb) > ******************************************************************************************** > ( Does this mean I need to install those libraries? I also tried to install > as root, but it said that 'Could not find debuginfo pkg for dependency > package libxcb-1.5.1-el6.x86_64'. > > My system is scientific linux and it use GNOME 2.28.2. I am wondering how to > fix this problem. > > Thanks, > Wen > From dominik at itis.ethz.ch Tue Oct 25 15:46:30 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Tue, 25 Oct 2011 22:46:30 +0200 Subject: [petsc-users] default preconditioning side for KSPFGMRES in 3.2 Message-ID: Has the default preconditioning side for KSPFGMRES changed between 3.1 and 3.2? I am debugging one of my old 3.1 codes where I had two KSPFGMRES solvers with explicit defined PC side, and after upgrade to 3.2 one and only one of them complains he does not support left preconditioning. Many thanks for any directions. Dominik From jedbrown at mcs.anl.gov Tue Oct 25 15:51:18 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 25 Oct 2011 15:51:18 -0500 Subject: [petsc-users] default preconditioning side for KSPFGMRES in 3.2 In-Reply-To: References: Message-ID: On Tue, Oct 25, 2011 at 15:46, Dominik Szczerba wrote: > Has the default preconditioning side for KSPFGMRES changed between 3.1 and > 3.2? > I am debugging one of my old 3.1 codes where I had two KSPFGMRES > solvers with explicit defined PC side, and after upgrade to 3.2 one > and only one of them complains he does not support left > preconditioning. > FGMRES never supported left preconditioning. 
It may be that your choice was getting switched before (I removed some order dependence, setting the type after setting the side used to be able to lose the side). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Oct 25 15:51:19 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Oct 2011 15:51:19 -0500 Subject: [petsc-users] default preconditioning side for KSPFGMRES in 3.2 In-Reply-To: References: Message-ID: <6B04DB55-AA3C-4029-B329-AF28EE6DAC41@mcs.anl.gov> FGMRES never supported left preconditioning, it always used right It may have previously silently switched to right when needed. Barry It may be that the FGMRES algorithm only makes sense with right preconditioning On Oct 25, 2011, at 3:46 PM, Dominik Szczerba wrote: > Has the default preconditioning side for KSPFGMRES changed between 3.1 and 3.2? > I am debugging one of my old 3.1 codes where I had two KSPFGMRES > solvers with explicit defined PC side, and after upgrade to 3.2 one > and only one of them complains he does not support left > preconditioning. > > Many thanks for any directions. > > Dominik From s_g at berkeley.edu Tue Oct 25 16:18:00 2011 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 25 Oct 2011 14:18:00 -0700 Subject: [petsc-users] query about PETSc usage in Fortran applications In-Reply-To: References: Message-ID: <4EA72788.80201@berkeley.edu> Finite Element Analysis Linear solvers iterative and direct No explicit publications but it is referenced in our user manual see http://www.ce.berkeley.edu/feap In this case FEAP (our FEA code) has been around for roughly 30 years in its present incarnation. PETSc allowed us to parallelize is it. -sg On 10/25/11 5:46 AM, Lois Curfman McInnes wrote: > > I am collecting information about PETSc use in Fortran applications. > If you are a using PETSc via the Fortran interface, please send email > (to me only, curfman at mcs.anl.gov ) to > indicate: > - application area > - what parts of PETSc are used > - pointer to any publications or other references > > Of particular interest are applications in which PETSc facilitated a > transition to parallelism for an existing application that had > previously been only sequential. > > Thanks, > Lois > -- ----------------------------------------------- Sanjay Govindjee, PhD, PE Professor of Civil Engineering and Chancellor's Professor 779 Davis Hall Structural Engineering, Mechanics and Materials Department of Civil Engineering University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ----------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From Debao.Shao at brion.com Wed Oct 26 04:13:51 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Wed, 26 Oct 2011 02:13:51 -0700 Subject: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime In-Reply-To: <128B46D5-40E7-4E55-BE7F-9A7B13AEFF73@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> <128B46D5-40E7-4E55-BE7F-9A7B13AEFF73@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980E26254323@EX03> When will "PCSetUp" set up new PC? 
1), PCSetUp(): Setting up PC with same nonzero pattern 2), [0] PCSetUp(): Setting up new PC Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Tuesday, October 25, 2011 9:04 PM To: PETSc users list Subject: Re: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime My guess is that the code is repeatedly trying larger and large shifts to get a nonzero pivot as it factors. If you run with -info and let it run to completion it will print some information about the shifts it has tried. Barry On Oct 25, 2011, at 3:45 AM, Debao Shao wrote: > DA, > > I'm using PETSc ILU(1)+GMRES to solve an QP problem. But the job may hang at "MatLUFactorNumeric_SeqAIJ()" for long time. > > (gdb) bt > #0 0x0000000002284520 in MatLUFactorNumeric_SeqAIJ () > #1 0x0000000002265f89 in MatLUFactorNumeric () > #2 0x0000000002342ff3 in PCSetUp_ILU () > #3 0x00000000024f8d65 in PCSetUp () > #4 0x0000000002362964 in KSPSetUp () > #5 0x0000000002363455 in KSPSolve () > > Do you have idea for such phenomenon? > > Thanks, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From bsmith at mcs.anl.gov Wed Oct 26 08:13:22 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Oct 2011 08:13:22 -0500 Subject: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980E26254323@EX03> References: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> <128B46D5-40E7-4E55-BE7F-9A7B13AEFF73@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980E26254323@EX03> Message-ID: If you call KSPSetUp() directly it calls PCSetUp(). If you do not call KSPSetUp() directly it is called at the beginning of KSPSolve() the FIRST time KSPSolve() is called after a KSPSetOperators(). 
If KSPSolve() is called repeatedly without calling KSPSetFromOperators() again (meaning changing the right hand side but not the matrix) then KSPSetUp() is not called for those solves since the preconditioner has already been setup. Barry On Oct 26, 2011, at 4:13 AM, Debao Shao wrote: > When will "PCSetUp" set up new PC? > 1), PCSetUp(): Setting up PC with same nonzero pattern > 2), [0] PCSetUp(): Setting up new PC > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Tuesday, October 25, 2011 9:04 PM > To: PETSc users list > Subject: Re: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime > > > My guess is that the code is repeatedly trying larger and large shifts to get a nonzero pivot as it factors. If you run with -info and let it run to completion it will print some information about the shifts it has tried. > > Barry > > On Oct 25, 2011, at 3:45 AM, Debao Shao wrote: > >> DA, >> >> I'm using PETSc ILU(1)+GMRES to solve an QP problem. But the job may hang at "MatLUFactorNumeric_SeqAIJ()" for long time. >> >> (gdb) bt >> #0 0x0000000002284520 in MatLUFactorNumeric_SeqAIJ () >> #1 0x0000000002265f89 in MatLUFactorNumeric () >> #2 0x0000000002342ff3 in PCSetUp_ILU () >> #3 0x00000000024f8d65 in PCSetUp () >> #4 0x0000000002362964 in KSPSetUp () >> #5 0x0000000002363455 in KSPSolve () >> >> Do you have idea for such phenomenon? >> >> Thanks, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
From jedbrown at mcs.anl.gov Wed Oct 26 08:14:52 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 26 Oct 2011 08:14:52 -0500 Subject: [petsc-users] MatLUFactorNumeric_SeqAIJ() consumes long runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980E26254323@EX03> References: <384FF55F15E3E447802DC8CCA85696980E26254051@EX03> <128B46D5-40E7-4E55-BE7F-9A7B13AEFF73@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980E26254323@EX03> Message-ID: On Wed, Oct 26, 2011 at 04:13, Debao Shao wrote: > When will "PCSetUp" set up new PC? > What do you mean? It does a different amount of work depending on the MatStructure flag. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Wed Oct 26 09:52:46 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 26 Oct 2011 14:52:46 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: > You need to run for much more iterations to get any kind of reliable >singular values. Use something like at least -ksp_gmres_restart 100 >-ksp_max_it 100 and see what happens to the singular values. > > Barry Barry, Jed, I followed your advice, this is what happens to the singular values (taking out some for readability). Do you think these values are reliable even though the solver doesn't converge? If so what could be the reason for FGMRES to stagnate? 0 KSP Residual norm 2.069443859095e-01 % max 1 min 1 max/min 1 1 KSP Residual norm 2.069442759608e-01 % max 4.08809 min 4.08809 max/min 1 2 KSP Residual norm 2.069441760813e-01 % max 5.60619 min 4.06952 max/min 1.3776 3 KSP Residual norm 2.069441732278e-01 % max 9.04702 min 4.0695 max/min 2.22313 4 KSP Residual norm 2.069440548767e-01 % max 9.39853 min 4.06865 max/min 2.30998 5 KSP Residual norm 2.069439774947e-01 % max 11.5495 min 4.06865 max/min 2.83866 10 KSP Residual norm 2.069432503764e-01 % max 14.6194 min 4.04732 max/min 3.61213 20 KSP Residual norm 2.069424990159e-01 % max 22.7708 min 2.95767 max/min 7.69891 30 KSP Residual norm 2.069403094003e-01 % max 23.8491 min 2.93612 max/min 8.12265 50 KSP Residual norm 2.069374104472e-01 % max 24.0821 min 2.23473 max/min 10.7763 70 KSP Residual norm 2.069306551868e-01 % max 29.3557 min 2.13925 max/min 13.7224 100 KSP Residual norm 2.069163489527e-01 % max 29.4884 min 2.0458 max/min 14.4142 dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From bsmith at mcs.anl.gov Wed Oct 26 09:56:51 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Oct 2011 09:56:51 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: <28BE66BE-E26A-4F5C-AC84-FE6D0A14FF6D@mcs.anl.gov> What happens at restart to the residual norm? That is does the residual norm change dramatically on the first iteration after the restart? This is an indication of loss of orthogonality. You can put in a restart of 1000 and run for 1000 iterations. What happens then? Barry On Oct 26, 2011, at 9:52 AM, Klaij, Christiaan wrote: >> You need to run for much more iterations to get any kind of reliable >> singular values. Use something like at least -ksp_gmres_restart 100 >> -ksp_max_it 100 and see what happens to the singular values. >> >> Barry > > Barry, Jed, > > I followed your advice, this is what happens to the singular values > (taking out some for readability). 
Do you think these values are > reliable even though the solver doesn't converge? If so what could > be the reason for FGMRES to stagnate? > > 0 KSP Residual norm 2.069443859095e-01 % max 1 min 1 max/min 1 > 1 KSP Residual norm 2.069442759608e-01 % max 4.08809 min 4.08809 max/min 1 > 2 KSP Residual norm 2.069441760813e-01 % max 5.60619 min 4.06952 max/min 1.3776 > 3 KSP Residual norm 2.069441732278e-01 % max 9.04702 min 4.0695 max/min 2.22313 > 4 KSP Residual norm 2.069440548767e-01 % max 9.39853 min 4.06865 max/min 2.30998 > 5 KSP Residual norm 2.069439774947e-01 % max 11.5495 min 4.06865 max/min 2.83866 > 10 KSP Residual norm 2.069432503764e-01 % max 14.6194 min 4.04732 max/min 3.61213 > 20 KSP Residual norm 2.069424990159e-01 % max 22.7708 min 2.95767 max/min 7.69891 > 30 KSP Residual norm 2.069403094003e-01 % max 23.8491 min 2.93612 max/min 8.12265 > 50 KSP Residual norm 2.069374104472e-01 % max 24.0821 min 2.23473 max/min 10.7763 > 70 KSP Residual norm 2.069306551868e-01 % max 29.3557 min 2.13925 max/min 13.7224 > 100 KSP Residual norm 2.069163489527e-01 % max 29.4884 min 2.0458 max/min 14.4142 > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From bogdan at lmn.pub.ro Wed Oct 26 10:34:47 2011 From: bogdan at lmn.pub.ro (Bogdan Dita) Date: Wed, 26 Oct 2011 18:34:47 +0300 Subject: [petsc-users] Question about KSPSolve for multiple rhs Message-ID: Hello, First of all I'm new to PETSc so please be pacient with me. I'm trying to solve 2 linear systems with the same A matrix using superlu_dist, and so i'm using the same ksp context for both systems. The matrix is a square matrix of 84719 with 351289 nonzero elements. The time for the first call to KSPSolve is 66 sec and the second 0.14 sec. I was expecting some difference but not that big. Is this difference normal for a linear system with two right hand sides? I really think that I'm doing something wrong, but I even checked the solution in Matlab and it seems fine to me. Thanks, Bogdan -------------------------------------- Cosmin-Bogdan DITA, PhD Student "Politehnica" University of Bucharest Electrical Engineering Faculty Numerical Modelling Laboratory (EA-D07) Splaiul Independentei 313 Bucharest, zip 060042 Romania E-mail: bogdan at lmn.pub.ro bogdandita at gmail.com From bsmith at mcs.anl.gov Wed Oct 26 10:39:13 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Oct 2011 10:39:13 -0500 Subject: [petsc-users] Question about KSPSolve for multiple rhs In-Reply-To: References: Message-ID: On Oct 26, 2011, at 10:34 AM, Bogdan Dita wrote: > > Hello, > > First of all I'm new to PETSc so please be pacient with me. > I'm trying to solve 2 linear systems with the same A matrix using > superlu_dist, and so i'm using the same ksp context for both systems. > The matrix is a square matrix of 84719 with 351289 nonzero elements. > The time for the first call to KSPSolve is 66 sec and the second 0.14 > sec. I was expecting some difference but not that big. Is this > difference normal for a linear system with two right hand sides? I > really think that I'm doing something wrong, but I even checked the > solution in Matlab and it seems fine to me. This is completely normal. The sparse factorization is generally much much more time consuming than the triangular solves since it involves much more operations and memory motion. 
Hence when using direct solvers reusing a previous factorization is really beneficial. Barry > > Thanks, > Bogdan > > -------------------------------------- > > Cosmin-Bogdan DITA, PhD Student > > "Politehnica" University of Bucharest > Electrical Engineering Faculty > Numerical Modelling Laboratory (EA-D07) > Splaiul Independentei 313 > Bucharest, zip 060042 > Romania > > E-mail: bogdan at lmn.pub.ro > bogdandita at gmail.com > > From dominik at itis.ethz.ch Thu Oct 27 02:53:10 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 27 Oct 2011 09:53:10 +0200 Subject: [petsc-users] support for complex number computations Message-ID: I have several codes using the KSP linear solvers with real numbers. As I understood from the documentation, in order to use complex numbers I have to configure petsc in a different way. The information I failed to find is what does it take to port my current real-based codes to work with a complex-based Petsc version. Is a choice real/complex a transparent one or requires explicit adaptations in the code? Thanks for any clarifications, Regards, Dominik From behzad.baghapour at gmail.com Thu Oct 27 05:05:37 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 27 Oct 2011 13:35:37 +0330 Subject: [petsc-users] Joining SNES with user-defined object Message-ID: Dear All, I want to use SNES for nonlinear Newton Iterations. All of my RHS and Jacobian Matrix are calculated with my own defined object called "element". How should I pass my object into SNES functions to calculate required RHS and Jacobian? Thanks, B. B. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 27 07:44:34 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 27 Oct 2011 16:14:34 +0330 Subject: [petsc-users] Change SNES/TS setting from inside context Message-ID: Dear All, 1- Is there any way to change the setting of SNES within its iteration process? ( for example, changing PC type or fill level due to some reasons ) 2- page 120 of PETSc manual (TS): "For location-dependent pseudo-timestepping, the interface function has not yet been created." Is there any way to control time-step locally? (This is due to variable time updating usually needed in PDE solution) Thanks a lot, B. B. -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Thu Oct 27 08:15:12 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 27 Oct 2011 15:15:12 +0200 Subject: [petsc-users] Print ||r(i)||/||Ax0-b|| while monitoring Message-ID: Is there an equivalent of -ksp_monitor_true_residual to get ||r(i)||/||Ax0-b|| (x0 initial guess) instead of ||r(i)||/||b|| ? Many thanks, Dominik From knepley at gmail.com Thu Oct 27 08:40:51 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Oct 2011 13:40:51 +0000 Subject: [petsc-users] support for complex number computations In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 7:53 AM, Dominik Szczerba wrote: > I have several codes using the KSP linear solvers with real numbers. > As I understood from the documentation, in order to use complex > numbers I have to configure petsc in a different way. 
The information > I failed to find is what does it take to port my current real-based > codes to work with a complex-based Petsc version. Is a choice > real/complex a transparent one or requires explicit adaptations in the > code? > If you consistently use PetscScalar, and use PetscReal when you rely on the ordering property of the reals, everything should work. Matt > Thanks for any clarifications, > Regards, > Dominik -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 27 08:45:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Oct 2011 13:45:17 +0000 Subject: [petsc-users] Joining SNES with user-defined object In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 10:05 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear All, > > I want to use SNES for nonlinear Newton Iterations. All of my RHS and > Jacobian Matrix are calculated with my own defined object called "element". > How should I pass my object into SNES functions to calculate required RHS > and Jacobian? > Define functions CalculateRHS() and CalculateJacobian() and pass them to SNES, and pass 'element' as the user context. Then inside that function, you can get 'element' back from the 'ctx' argument. Matt > Thanks, > B. B. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Oct 27 08:46:52 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Oct 2011 13:46:52 +0000 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 12:44 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear All, > > 1- Is there any way to change the setting of SNES within its iteration > process? ( for example, changing PC type or fill level due to some reasons ) > > Yes, SNESGetKSP() KSPGetPC() PC***() I would probably put this code in a SNES monitor. Matt 2- page 120 of PETSc manual (TS): "For location-dependent > pseudo-timestepping, the interface function has not yet been created." > Is there any way to control time-step locally? (This is due to variable > time updating usually needed in PDE solution) > > Thanks a lot, > B. B. > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Oct 27 08:50:43 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Oct 2011 13:50:43 +0000 Subject: [petsc-users] Print ||r(i)||/||Ax0-b|| while monitoring In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 1:15 PM, Dominik Szczerba wrote: > Is there an equivalent of -ksp_monitor_true_residual to get > > ||r(i)||/||Ax0-b|| (x0 initial guess) > > instead of > > ||r(i)||/||b|| > > ? > You can easily make a custom monitor that does what you want. The code for the original is in src/ksp/ksp/interface/iterativ.c Matt > Many thanks, > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeronova.mailing at gmail.com Thu Oct 27 08:51:03 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Thu, 27 Oct 2011 21:51:03 +0800 Subject: [petsc-users] slepc compilation problem: 'module' object has no attribute 'INSTALL_DIR' Message-ID: Hi, slpec used to work, but somehow I cannot compile it again after I recompiled petsc to enable complex variable support. Here's what I did: $ export SLEPC_DIR=$PWD $ export PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.2-p4 $ export CXX="/usr/bin/mpicxx" $ export CC="/usr/bin/mpicc" $ ./configure --prefix=/Users/aeronova/Development/local/lib64/slepc/slepc-3.1-p6 Checking environment... Traceback (most recent call last): File "./configure", line 223, in log.Write('PETSc install directory: ' + petscconf.INSTALL_DIR) AttributeError: 'module' object has no attribute 'INSTALL_DIR' Plz help me with this error. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Oct 27 08:53:59 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 27 Oct 2011 15:53:59 +0200 Subject: [petsc-users] slepc compilation problem: 'module' object has no attribute 'INSTALL_DIR' In-Reply-To: References: Message-ID: <9B3ED0F6-16AE-4AAC-A731-487F95045EC9@dsic.upv.es> El 27/10/2011, a las 15:51, Kyunghoon Lee escribi?: > Hi, > > slpec used to work, but somehow I cannot compile it again after I recompiled petsc to enable complex variable support. Here's what I did: > > $ export SLEPC_DIR=$PWD > $ export PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.2-p4 > $ export CXX="/usr/bin/mpicxx" > $ export CC="/usr/bin/mpicc" > $ ./configure --prefix=/Users/aeronova/Development/local/lib64/slepc/slepc-3.1-p6 > Checking environment... > Traceback (most recent call last): > File "./configure", line 223, in > log.Write('PETSc install directory: ' + petscconf.INSTALL_DIR) > AttributeError: 'module' object has no attribute 'INSTALL_DIR' > > Plz help me with this error. > slepc-3.1 is not compatible with petsc-3.2. You have to wait for slepc-3.2, which will be out very soon (maybe today). Jose From behzad.baghapour at gmail.com Thu Oct 27 09:18:15 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 27 Oct 2011 17:48:15 +0330 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: So, I should define a monitor context "mon" and a function like "Monitor" which I want to change whatever I wand during each SNES iteration then pass it to the function SNESMonitorSet( snes, Monitor, &mon, 0 ); So is it correct? 
On Thu, Oct 27, 2011 at 5:16 PM, Matthew Knepley wrote: > On Thu, Oct 27, 2011 at 12:44 PM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Dear All, >> >> 1- Is there any way to change the setting of SNES within its iteration >> process? ( for example, changing PC type or fill level due to some reasons ) >> >> > > Yes, > > SNESGetKSP() > KSPGetPC() > PC***() > > I would probably put this code in a SNES monitor. > > Matt > > 2- page 120 of PETSc manual (TS): "For location-dependent >> pseudo-timestepping, the interface function has not yet been created." >> Is there any way to control time-step locally? (This is due to variable >> time updating usually needed in PDE solution) >> >> Thanks a lot, >> B. B. >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 27 09:19:01 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 08:19:01 -0600 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 06:44, behzad baghapour wrote: > 2- page 120 of PETSc manual (TS): "For location-dependent > pseudo-timestepping, the interface function has not yet been created." > Is there any way to control time-step locally? (This is due to variable > time updating usually needed in PDE solution) > The documentation needs to be updated. Use TSSetIJacobian() and regard the "shift" as a maximum shift (corresponds to minimum time step). If you have a cell in which you want to take a 10x longer time step, then use shift/10 in that cell. Note that you can use steady-state constraints on some fields. For example, src/ts/examples/tutorials/ex26.c (in petsc-dev) uses no shift for the velocity components (enforces incompressibility in velocity-vorticity form). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 27 09:20:22 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 08:20:22 -0600 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 08:18, behzad baghapour wrote: > So, I should define a monitor context "mon" and a function like "Monitor" > which I want to change whatever I wand during each SNES iteration then pass > it to the function SNESMonitorSet( > snes, Monitor, &mon, 0 ); > Yes, put whatever application-specific information in your monitor context and have it decide how to reconfigure the KSP or PC. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aeronova.mailing at gmail.com Thu Oct 27 09:20:48 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Thu, 27 Oct 2011 22:20:48 +0800 Subject: [petsc-users] slepc compilation problem: 'module' object has no attribute 'INSTALL_DIR' In-Reply-To: <9B3ED0F6-16AE-4AAC-A731-487F95045EC9@dsic.upv.es> References: <9B3ED0F6-16AE-4AAC-A731-487F95045EC9@dsic.upv.es> Message-ID: Jose, Thanks for the info. I guess I have to revert to petsc-3.1. K. Lee. On Thu, Oct 27, 2011 at 9:53 PM, Jose E. Roman wrote: > > El 27/10/2011, a las 15:51, Kyunghoon Lee escribi?: > > > Hi, > > > > slpec used to work, but somehow I cannot compile it again after I > recompiled petsc to enable complex variable support. Here's what I did: > > > > $ export SLEPC_DIR=$PWD > > $ export > PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.2-p4 > > $ export CXX="/usr/bin/mpicxx" > > $ export CC="/usr/bin/mpicc" > > $ ./configure > --prefix=/Users/aeronova/Development/local/lib64/slepc/slepc-3.1-p6 > > Checking environment... > > Traceback (most recent call last): > > File "./configure", line 223, in > > log.Write('PETSc install directory: ' + petscconf.INSTALL_DIR) > > AttributeError: 'module' object has no attribute 'INSTALL_DIR' > > > > Plz help me with this error. > > > > slepc-3.1 is not compatible with petsc-3.2. > You have to wait for slepc-3.2, which will be out very soon (maybe today). > Jose > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 27 09:28:33 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 27 Oct 2011 17:58:33 +0330 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: Actually I want to update the domain according to the global CFL number; e.g. dt = CFL * dx / Lambda; separately for each cell ( dx is the dimension of the cell and Lambda is the local maximum eigenvalue of the field ) So, here shift is the maximum time step and I should adjust other cells to this maximum value? Thanks, On Thu, Oct 27, 2011 at 5:49 PM, Jed Brown wrote: > On Thu, Oct 27, 2011 at 06:44, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> 2- page 120 of PETSc manual (TS): "For location-dependent >> pseudo-timestepping, the interface function has not yet been created." >> Is there any way to control time-step locally? (This is due to variable >> time updating usually needed in PDE solution) >> > > The documentation needs to be updated. Use TSSetIJacobian() and regard the > "shift" as a maximum shift (corresponds to minimum time step). If you have a > cell in which you want to take a 10x longer time step, then use shift/10 in > that cell. Note that you can use steady-state constraints on some fields. > For example, src/ts/examples/tutorials/ex26.c (in petsc-dev) uses no shift > for the velocity components (enforces incompressibility in > velocity-vorticity form). > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Thu Oct 27 09:34:30 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Thu, 27 Oct 2011 18:04:30 +0330 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: Thanks. 
I want to update my user-defined object "element" in each SNES iteration. Should I include it as user-defined context and pass it by void* in Monitor function too? On Thu, Oct 27, 2011 at 5:50 PM, Jed Brown wrote: > On Thu, Oct 27, 2011 at 08:18, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> So, I should define a monitor context "mon" and a function like "Monitor" >> which I want to change whatever I wand during each SNES iteration then pass >> it to the function SNESMonitorSet( >> snes, Monitor, &mon, 0 ); >> > > Yes, put whatever application-specific information in your monitor context > and have it decide how to reconfigure the KSP or PC. > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Oct 27 09:45:21 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 27 Oct 2011 14:45:21 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: > What happens at restart to the residual norm? That is does the > residual norm change dramatically on the first iteration after > the restart? This is an indication of loss of orthogonality. > > You can put in a restart of 1000 and run for 1000 > iterations. What happens then? > > Barry 1000 is a bit much but I did 100 without restart and 100 with default restart (every 30 its). I don't see significant change on the first iteration after the restart. dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From jedbrown at mcs.anl.gov Thu Oct 27 09:49:54 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 08:49:54 -0600 Subject: [petsc-users] Change SNES/TS setting from inside context In-Reply-To: References: Message-ID: On Thu, Oct 27, 2011 at 08:28, behzad baghapour wrote: > Actually I want to update the domain according to the global CFL number; > e.g. dt = CFL * dx / Lambda; separately for each cell > ( dx is the dimension of the cell and Lambda is the local maximum > eigenvalue of the field ) > TS does not know about "CFL" because it's not even a universal quantity. It gives you a step size, call it pseudo_dt. You can sweep over the grid to compute global_dt = min_e dx_e / lambda_e (the most restrictive step size according to CFL=1). Now you can define the global CFL number = pseudo_dt / global_dt and use this however you like. You might have some cheap estimate of global_dt in which case you could use it instead of making the sweep on each step to have an exact definition. Or you can do it while computing the residual and cache the result in your application context. > > So, here shift is the maximum time step and I should adjust other cells to > this maximum value? > It really is the minimum time step since the global CFL is computed for the fastest wave in the smallest cell (smallest value of dx/lambda). In your slower, larger cells, the number dx/lambda will be larger. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Thu Oct 27 10:16:37 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 09:16:37 -0600 Subject: [petsc-users] slepc compilation problem: 'module' object has no attribute 'INSTALL_DIR' In-Reply-To: References: <9B3ED0F6-16AE-4AAC-A731-487F95045EC9@dsic.upv.es> Message-ID: On Thu, Oct 27, 2011 at 08:20, Kyunghoon Lee wrote: > Thanks for the info. I guess I have to revert to petsc-3.1. > You can use slepc-dev now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Oct 27 11:37:20 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Oct 2011 11:37:20 -0500 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: <531584AA-28E7-4BB3-9D4C-7E7A6AB8B67C@mcs.anl.gov> If you have an explicit sparse matrix you are storing you can run with -ksp_view_binary and email to us the resulting file called binaryoutput at petsc-maint at mcs.anl.gov if the matrix is smaller than 20 megabytes. If it is larger you can anonymously ftp it to the ftp site called ftp.mcs.anl.gov and put it in the directory incoming then send email to petsc-maint at mcs.anl.gov and tell us where the file is. Barry On Oct 27, 2011, at 9:45 AM, Klaij, Christiaan wrote: >> What happens at restart to the residual norm? That is does the >> residual norm change dramatically on the first iteration after >> the restart? This is an indication of loss of orthogonality. >> >> You can put in a restart of 1000 and run for 1000 >> iterations. What happens then? >> >> Barry > > 1000 is a bit much but I did 100 without restart and 100 with > default restart (every 30 its). I don't see significant change on > the first iteration after the restart. > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From ping.rong at tuhh.de Thu Oct 27 13:34:13 2011 From: ping.rong at tuhh.de (Ping Rong) Date: Thu, 27 Oct 2011 20:34:13 +0200 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error Message-ID: <4EA9A425.9010306@tuhh.de> Hello all, I was trying to compile the PETSc-3.2-p4 with superlu_dist. The machine is running with Ubuntu 10.04, gcc 4.4.5 and cmake 2.8.5. the configuration looks like the following: ./config/configure.py \ --with-precision=double \ --with-blas-lib=$LIBS_DIR/lapack/lib/libblas.so \ --with-lapack-lib=$LIBS_DIR/lapack/lib/liblapack.so \ --with-scalar-type=complex \ --with-clanguage=c++ \ --with-mpi-dir=$LIBS_DIR/mpich2 \ --download-superlu=yes \ --with-superlu=1 \ --download-parmetis=yes \ --with-parmetis=1 \ --download-superlu_dist=yes \ --with-superlu_dist=1 \ --with-debugging=no \ --with-shared-libraries=1 when I try to run "make all", it throws the error //------------------------------- [ 60%] Building CXX object CMakeFiles/petsc.dir/src/mat/order/amd/amd.c.o /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In function ?PetscErrorCode MatGetFactor_aij_superlu_dist(_p_Mat*, MatFactorType, _p_Mat**)?: /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:634: error: ?DOUBLE? 
was not declared in this scope /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In function ?PetscErrorCode MatFactorInfo_SuperLU_DIST(_p_Mat*, _p_PetscViewer*)?: /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:698: error: ?DOUBLE? was not declared in this scope //--------------------------------------------------- I have been using this configuration with the dev version before 3.2 is released. It worked quite well. I also test the same configuration on a redhat machine with gcc 4.4.4-13 and cmake 2.8.5. it works well, too. Any ideas what may go wrong? is this compiler related? Thanks!! -- Ping Rong, M.Sc. Hamburg University of Technology Institut of modelling and computation Denickestra?e 17 (Room 3031) 21073 Hamburg Tel.: ++49 - (0)40 42878 2749 Fax: ++49 - (0)40 42878 43533 Email: ping.rong at tuhh.de From jedbrown at mcs.anl.gov Thu Oct 27 13:44:45 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 12:44:45 -0600 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: <4EA9A425.9010306@tuhh.de> References: <4EA9A425.9010306@tuhh.de> Message-ID: On Thu, Oct 27, 2011 at 12:34, Ping Rong wrote: > Hello all, > > I was trying to compile the PETSc-3.2-p4 with superlu_dist. The machine is > running with Ubuntu 10.04, gcc 4.4.5 and cmake 2.8.5. the configuration > looks like the following: > ./config/configure.py \ > --with-precision=double \ > --with-blas-lib=$LIBS_DIR/**lapack/lib/libblas.so \ > --with-lapack-lib=$LIBS_DIR/**lapack/lib/liblapack.so \ > --with-scalar-type=complex \ > --with-clanguage=c++ \ > --with-mpi-dir=$LIBS_DIR/**mpich2 \ > --download-superlu=yes \ > --with-superlu=1 \ > --download-parmetis=yes \ > --with-parmetis=1 \ > --download-superlu_dist=yes \ > --with-superlu_dist=1 \ > --with-debugging=no \ > --with-shared-libraries=1 > > when I try to run "make all", it throws the error > //----------------------------**--- > [ 60%] Building CXX object CMakeFiles/petsc.dir/src/mat/** > order/amd/amd.c.o > /home/xxx/libs/petsc/src/mat/**impls/aij/mpi/superlu_dist/**superlu_dist.c: > In function ?PetscErrorCode MatGetFactor_aij_superlu_dist(**_p_Mat*, > MatFactorType, _p_Mat**)?: > /home/xxx/libs/petsc/src/mat/**impls/aij/mpi/superlu_dist/**superlu_dist.c:634: > error: ?DOUBLE? was not declared in this scope > /home/xxx/libs/petsc/src/mat/**impls/aij/mpi/superlu_dist/**superlu_dist.c: > In function ?PetscErrorCode MatFactorInfo_SuperLU_DIST(_p_**Mat*, > _p_PetscViewer*)?: > /home/xxx/libs/petsc/src/mat/**impls/aij/mpi/superlu_dist/**superlu_dist.c:698: > error: ?DOUBLE? was not declared in this scope > The SuperLU maintainer made an API-incompatible change to the released tarball. Sherry, we talked about this last week and I thought you were going to make a SuperLU-4.3 release with the namespaced enum values? -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Thu Oct 27 14:11:51 2011 From: xsli at lbl.gov (Xiaoye S. Li) Date: Thu, 27 Oct 2011 12:11:51 -0700 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. In PETSc / SuperLU_DIST wrapper: superlu_dist.c, line 634, DOUBLE should be replaced by SLU_DOUBLE. Sherry On Thu, Oct 27, 2011 at 11:44 AM, Jed Brown wrote: > On Thu, Oct 27, 2011 at 12:34, Ping Rong wrote: >> >> Hello all, >> >> I was trying to compile the PETSc-3.2-p4 with superlu_dist. 
The machine is >> running with Ubuntu 10.04, gcc 4.4.5 and cmake 2.8.5. the configuration >> looks like the following: >> ./config/configure.py \ >> --with-precision=double \ >> --with-blas-lib=$LIBS_DIR/lapack/lib/libblas.so \ >> --with-lapack-lib=$LIBS_DIR/lapack/lib/liblapack.so \ >> --with-scalar-type=complex \ >> --with-clanguage=c++ \ >> --with-mpi-dir=$LIBS_DIR/mpich2 \ >> --download-superlu=yes \ >> --with-superlu=1 \ >> --download-parmetis=yes \ >> --with-parmetis=1 \ >> --download-superlu_dist=yes \ >> --with-superlu_dist=1 \ >> --with-debugging=no \ >> --with-shared-libraries=1 >> >> when I try to run "make all", it throws the error >> //------------------------------- >> [ 60%] Building CXX object CMakeFiles/petsc.dir/src/mat/order/amd/amd.c.o >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In >> function ?PetscErrorCode MatGetFactor_aij_superlu_dist(_p_Mat*, >> MatFactorType, _p_Mat**)?: >> >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:634: >> error: ?DOUBLE? was not declared in this scope >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In >> function ?PetscErrorCode MatFactorInfo_SuperLU_DIST(_p_Mat*, >> _p_PetscViewer*)?: >> >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:698: >> error: ?DOUBLE? was not declared in this scope > > The SuperLU maintainer made an API-incompatible change to the released > tarball. > Sherry, we talked about this last week and I thought you were going to make > a SuperLU-4.3 release with the namespaced enum values? From balay at mcs.anl.gov Thu Oct 27 14:17:04 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Oct 2011 14:17:04 -0500 (CDT) Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: Actually the issue is: "when installing both superlu & superlu_dist" together [if superlu gets installed by configure after superlu_dist] - then the include from superlu is the one that gets used. Petsc-3.2 uses superlu_dist-2.5 - and petsc-dev uses superlu_dist-3.0 Now that superlu was updated to match superlu_dist-3.0 [without a version change] - it breaks Petsc-3.2 build with superlu_dist-2.5. [because the updated include file from superlu is incompatible with superlu_dist-2.5 code] Satish On Thu, 27 Oct 2011, Xiaoye S. Li wrote: > The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. > In PETSc / SuperLU_DIST wrapper: superlu_dist.c, line 634, DOUBLE > should be replaced by SLU_DOUBLE. > > Sherry > > > On Thu, Oct 27, 2011 at 11:44 AM, Jed Brown wrote: > > On Thu, Oct 27, 2011 at 12:34, Ping Rong wrote: > >> > >> Hello all, > >> > >> I was trying to compile the PETSc-3.2-p4 with superlu_dist. The machine is > >> running with Ubuntu 10.04, gcc 4.4.5 and cmake 2.8.5. 
the configuration > >> looks like the following: > >> ./config/configure.py \ > >> --with-precision=double \ > >> --with-blas-lib=$LIBS_DIR/lapack/lib/libblas.so \ > >> --with-lapack-lib=$LIBS_DIR/lapack/lib/liblapack.so \ > >> --with-scalar-type=complex \ > >> --with-clanguage=c++ \ > >> --with-mpi-dir=$LIBS_DIR/mpich2 \ > >> --download-superlu=yes \ > >> --with-superlu=1 \ > >> --download-parmetis=yes \ > >> --with-parmetis=1 \ > >> --download-superlu_dist=yes \ > >> --with-superlu_dist=1 \ > >> --with-debugging=no \ > >> --with-shared-libraries=1 > >> > >> when I try to run "make all", it throws the error > >> //------------------------------- > >> [ 60%] Building CXX object CMakeFiles/petsc.dir/src/mat/order/amd/amd.c.o > >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In > >> function ?PetscErrorCode MatGetFactor_aij_superlu_dist(_p_Mat*, > >> MatFactorType, _p_Mat**)?: > >> > >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:634: > >> error: ?DOUBLE? was not declared in this scope > >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c: In > >> function ?PetscErrorCode MatFactorInfo_SuperLU_DIST(_p_Mat*, > >> _p_PetscViewer*)?: > >> > >> /home/xxx/libs/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:698: > >> error: ?DOUBLE? was not declared in this scope > > > > The SuperLU maintainer made an API-incompatible change to the released > > tarball. > > Sherry, we talked about this last week and I thought you were going to make > > a SuperLU-4.3 release with the namespaced enum values? > From sean at mcs.anl.gov Thu Oct 27 14:18:03 2011 From: sean at mcs.anl.gov (Sean Farley) Date: Thu, 27 Oct 2011 14:18:03 -0500 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: > > The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. > In PETSc / SuperLU_DIST wrapper: superlu_dist.c, line 634, DOUBLE > should be replaced by SLU_DOUBLE. Sherry, the problem is that you updated the SuperLU tarball to have the new enum type but that petsc-3.2 was not updated for the new interface, hence the breakage. If Ping were to switch to petsc-dev, this issue would go away. An alternative, where *everybody* wins is to update SuperLU to 4.3 and make a new tarball (reverting 4.2 back to the older definition for the enum). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 27 14:20:56 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 13:20:56 -0600 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: On Thu, Oct 27, 2011 at 13:11, Xiaoye S. Li wrote: > The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. > In PETSc / SuperLU_DIST wrapper: superlu_dist.c, line 634, DOUBLE > should be replaced by SLU_DOUBLE. > Sherry, this is in the released version of PETSc which uses SuperLU_DIST-2.5 and SuperLU-4.2. While petsc-dev has been updated for SuperLU_DIST-3.0, your API change in the SuperLU-4.2 tarball breaks petsc-3.2 (and any other library that was released with support for SuperLU-4.2). As we discussed last week, please revert the SuperLU-4.2 tarball to have the same API as it had when it was released and make the API change in a new tarball (e.g. 4.3). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bourdin at lsu.edu Thu Oct 27 14:53:45 2011 From: bourdin at lsu.edu (Blaise Bourdin) Date: Thu, 27 Oct 2011 14:53:45 -0500 Subject: [petsc-users] query about PETSc usage in Fortran applications In-Reply-To: References: Message-ID: <0CC12A09-7518-4E2A-8C4E-2323616E965F@lsu.edu> Lois, I use petsc through the fortran interface for my variational fracture mechanics code. (unstructured finite elements, 2d-3d). petsc has been instrumental in getting a parallel version. Sieve really helped me getting to large problems (largest to date being a 24M elements, 2,400 cores simulation, after which I cannot find tools to post-process my results) The application area is somewhere between engineering, computational mechanics, and applied mathematics I currently use Sieve, the KSP solvers and am in the (slow) process of adding TS. In the future, I will try to use VI in order to replace the optimization routines from TAO. I have acknowledged petsc in the following publications (I don't think that any of them is listed on the petsc web page). [Bourdin et al., 2011] Bourdin, B., Larsen, C., and Richardson, C. (2011). A time-discrete model for dynamic fracture based on crack regularization. International Journal of Fracture, 168:133?143. 10.1007/s10704- 010-9562-x. [Bourdin et al., 2010] Bourdin, B., Bucur, D., and Oudet, E. (2009/2010). Optimal partitions. SIAM J. Sci. Comput., 31(6):4100?4114. [Bourdin et al., 2008] Bourdin, B., Francfort, G., and Marigo, J.-J. (2008). The Variational Approach to Fracture. (reprinted from J. Elasticity 91(1-3):1?148, 2008). Springer. [Bourdin et al., 2008] Bourdin, B., Francfort, G., and Marigo, J.-J. (2008). The variational approach to fracture. J. Elasticity, 91(1-3):1?148. [Kimn and Bourdin, 2007] Kimn, J.-H. and Bourdin, B. (2007). Numerical implementation of overlapping balancing domain decomposition methods on unstructured meshes. In Widlund, O. B. and Keyes, D. E., editors, Domain Decomposition Methods in Science and Engineering XVI, volume 55 of Lecture Notes in Computational Science and Engineering, pages 309?315. Springer-Verlag. as well as in several conference proceedings, and talks. Blaise On Oct 25, 2011, at 7:46 AM, Lois Curfman McInnes wrote: > > I am collecting information about PETSc use in Fortran applications. If you are a using PETSc via the Fortran interface, please send email (to me only, curfman at mcs.anl.gov) to indicate: > - application area > - what parts of PETSc are used > - pointer to any publications or other references > > Of particular interest are applications in which PETSc facilitated a transition to parallelism for an existing application that had previously been only sequential. > > Thanks, > Lois > -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Thu Oct 27 15:06:00 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Thu, 27 Oct 2011 22:06:00 +0200 Subject: [petsc-users] query about PETSc usage in Fortran applications In-Reply-To: <0CC12A09-7518-4E2A-8C4E-2323616E965F@lsu.edu> References: <0CC12A09-7518-4E2A-8C4E-2323616E965F@lsu.edu> Message-ID: > instrumental in getting a parallel version. 
Sieve really helped me getting > to large problems (largest to date being a 24M elements, 2,400 cores > simulation, after which I cannot find tools to post-process my results) I was doing ~100M elements postpro in VTK. Dominik From ping.rong at tu-harburg.de Thu Oct 27 16:09:04 2011 From: ping.rong at tu-harburg.de (Ping Rong) Date: Thu, 27 Oct 2011 23:09:04 +0200 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: <4EA9C870.906@tu-harburg.de> Thank you guys! I think I understood the problem, have seen the changelog of superlu_dist3.0, but I checked the configuration report, saw that superlu_dist-2.5 is used, so I didn't think of that as a problem. anyway, I will give another try, when I get to the office tmw. On 27.10.2011 21:20, Jed Brown wrote: > On Thu, Oct 27, 2011 at 13:11, Xiaoye S. Li > wrote: > > The error is related to the new SuperLU_DIST (v3.0), not serial > SuperLU. > In PETSc / SuperLU_DIST wrapper: superlu_dist.c, line 634, DOUBLE > should be replaced by SLU_DOUBLE. > > > Sherry, this is in the released version of PETSc which uses > SuperLU_DIST-2.5 and SuperLU-4.2. While petsc-dev has been updated for > SuperLU_DIST-3.0, your API change in the SuperLU-4.2 tarball breaks > petsc-3.2 (and any other library that was released with support for > SuperLU-4.2). As we discussed last week, please revert the SuperLU-4.2 > tarball to have the same API as it had when it was released and make > the API change in a new tarball (e.g. 4.3). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Oct 27 16:16:58 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 15:16:58 -0600 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: <4EA9C870.906@tu-harburg.de> References: <4EA9A425.9010306@tuhh.de> <4EA9C870.906@tu-harburg.de> Message-ID: On Thu, Oct 27, 2011 at 15:09, Ping Rong wrote: > Thank you guys! I think I understood the problem, have seen the changelog > of superlu_dist3.0, but I checked the configuration report, saw that > superlu_dist-2.5 is used, so I didn't think of that as a problem. anyway, I > will give another try, when I get to the office tmw. You can revert SuperLU-4.2 to have the same API it was released with by unpacking it, running find . -type f -exec perl -pi -e 's,SLU_(SINGLE|DOUBLE|EXTRA),$1,g' {} \; and then building as usual. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Thu Oct 27 16:28:47 2011 From: xsli at lbl.gov (Xiaoye S. Li) Date: Thu, 27 Oct 2011 14:28:47 -0700 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: Okay, the older superlu_4.2.tar.gz is reverted, and the new one is superlu_4.3.tar.gz. Sherry On Thu, Oct 27, 2011 at 12:20 PM, Jed Brown wrote: > On Thu, Oct 27, 2011 at 13:11, Xiaoye S. Li wrote: >> >> The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. >> In PETSc / SuperLU_DIST wrapper: ?superlu_dist.c, line 634, ?DOUBLE >> should be replaced by SLU_DOUBLE. > > Sherry, this is in the released version of PETSc which uses SuperLU_DIST-2.5 > and SuperLU-4.2. While petsc-dev has been updated for SuperLU_DIST-3.0, your > API change in the SuperLU-4.2 tarball breaks petsc-3.2 (and any other > library that was released with support for SuperLU-4.2). 
As we discussed > last week, please revert the SuperLU-4.2 tarball to have the same API as it > had when it was released and make the API change in a new tarball (e.g. > 4.3). From balay at mcs.anl.gov Thu Oct 27 16:31:15 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Oct 2011 16:31:15 -0500 (CDT) Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: thanks! petsc-dev now updated to use the new tarball. Satish On Thu, 27 Oct 2011, Xiaoye S. Li wrote: > Okay, the older superlu_4.2.tar.gz is reverted, and the new one is > superlu_4.3.tar.gz. > > Sherry > > > On Thu, Oct 27, 2011 at 12:20 PM, Jed Brown wrote: > > On Thu, Oct 27, 2011 at 13:11, Xiaoye S. Li wrote: > >> > >> The error is related to the new SuperLU_DIST (v3.0), not serial SuperLU. > >> In PETSc / SuperLU_DIST wrapper: ?superlu_dist.c, line 634, ?DOUBLE > >> should be replaced by SLU_DOUBLE. > > > > Sherry, this is in the released version of PETSc which uses SuperLU_DIST-2.5 > > and SuperLU-4.2. While petsc-dev has been updated for SuperLU_DIST-3.0, your > > API change in the SuperLU-4.2 tarball breaks petsc-3.2 (and any other > > library that was released with support for SuperLU-4.2). As we discussed > > last week, please revert the SuperLU-4.2 tarball to have the same API as it > > had when it was released and make the API change in a new tarball (e.g. > > 4.3). > From jedbrown at mcs.anl.gov Thu Oct 27 16:56:21 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 27 Oct 2011 15:56:21 -0600 Subject: [petsc-users] Petsc-3.2 with superlu_dist building error In-Reply-To: References: <4EA9A425.9010306@tuhh.de> Message-ID: On Thu, Oct 27, 2011 at 15:28, Xiaoye S. Li wrote: > Okay, the older superlu_4.2.tar.gz is reverted, and the new one is > superlu_4.3.tar.gz. > Thank you. Ping, now you should be able to run: rm -rf $PETSC_ARCH/conf/SuperLU externalpackages/SuperLU_* $PETSC_ARCH/conf/reconfigure*.py make -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Fri Oct 28 01:39:25 2011 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 28 Oct 2011 06:39:25 +0000 Subject: [petsc-users] KSPMonitorSingularValue Message-ID: > If you have an explicit sparse matrix you are storing you can > run with -ksp_view_binary and email to us the resulting file > called binaryoutput at petsc-maint at mcs.anl.gov if the matrix is > smaller than 20 megabytes. If it is larger you can anonymously > ftp it to the ftp site called ftp.mcs.anl.gov and put it in the > directory incoming then send email to petsc-maint at mcs.anl.gov > and tell us where the file is. > > Barry > Thanks for the offer, but it's a matrix-free implementation, see my previous emails in this thread. Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From manuel.perezcerquera at polito.it Fri Oct 28 05:24:11 2011 From: manuel.perezcerquera at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Fri, 28 Oct 2011 12:24:11 +0200 Subject: [petsc-users] Collecting Values in a Parallel Vector Message-ID: Hi everybody, I would like to know How do I add values from a sequential PETSC vector from each process into a parallel vector? 
, I'm doing this: PetscScatter ctx PetscInt NOfTotalBEMFunctions idx=(/(i,i=0,NOfTotalBEMFunctions-1)/) I create the Vectors LocalZNearNOfNonZeros and GlobalZNearNOfNonZeros with VecCreateSeq(...) and VecCreateMPI(...) respectively CALL ISCreateGeneral(PETSC_COMM_SELF,NOfTotalBEMFunctions,idx,from,ierr); CALL ISCreateGeneral(PETSC_COMM_WORLD,NOfTotalBEMFunctions,idx,towards,ierr); CALL VecScatterCreate(LocalZNearNOfNonZeros,from,GlobalZNearNOfNonZeros,towards,ctx,ierr); CALL VecScatterBegin(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) CALL VecScatterEnd(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) CALL VecScatterDestroy(ctx,ierr); So when I run in two Process , It crashes in ISCreateGeneral and I got this error: -------- [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably m emory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h tml#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Ma c OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ---------------------------- -------- [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably m emory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h tml#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Ma c OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ---------------------------- -------- [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [1]PETSC ERROR: --------------------- Error Message ---------------------------- -------- [1]PETSC ERROR: Signal received! [1]PETSC ERROR: ---------------------------------------------------------------- -------- [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 CDT 20 11 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: ---------------------------------------------------------------- -------- [1]PETSC ERROR: C:\Documents and Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. 
exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 [1]PETSC ERROR: Libraries linked from /home/d022117/petsc-3.2-p2/arch-mswin-cxx- debug/lib [1]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 [1]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe ifor t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 --with-scalar-type=complex --with-clanguage=cxx --useThreads=0 [1]PETSC ERROR: ---------------------------------------------------------------- -------- [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown fil e application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 [0]PETSC ERROR: --------------------- Error Message ---------------------------- -------- [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 CDT 20 11 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: C:\Documents and Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 [0]PETSC ERROR: Libraries linked from /home/d022117/petsc-3.2-p2/arch-mswin-cxx- debug/lib [0]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 [0]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe ifor t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 --with-scalar-type=complex --with-clanguage=cxx --useThreads=0 [0]PETSC ERROR: ---------------------------------------------------------------- -------- [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown fil e application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 job aborted: rank: node: exit code[: error message] 0: gvsrv.delen.polito.it: 59: process 0 exited without calling finalize 1: gvsrv.delen.polito.it: 59: process 1 exited without calling finalize I don't know how to solve it, and I would like to know if I'm really doing well the gatter operation. Thanks, Manuel . Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From aeronova.mailing at gmail.com Fri Oct 28 05:46:21 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Fri, 28 Oct 2011 18:46:21 +0800 Subject: [petsc-users] petsc-3.1-p8 test error Message-ID: Hi all, I have configured petsc-3.1-p8 with the following options on my Mac OS X 10.6.8 (primarily to support complex variables): ./configure --prefix=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 --download-mpich=1 --download-blacs=1 --download-parmetis=1 --download-scalapack=1 --download-mumps=1 --download-umfpack=1 --with-scalar-type=complex --with-clanguage=C++ At the test stage, I got the following error --- it seems to have something to do with FORTRAN, although I do not need FORTRAN support. I'd appreciate if someone could help me with this error. Regards, K. Lee. 
$ make PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 test Running test examples to verify correct installation C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 MPI processes --------------Error detected during compile or link!----------------------- See http://www.mcs.anl.gov/petsc/petsc-2/documentation/troubleshooting.html /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 -c -Wall -Wno-unused-variable -g -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include -o ex5f.o ex5f.F Warning (115): Line 92 of ex5f.F is being truncated Warning (115): Line 113 of ex5f.F is being truncated Warning (115): Line 114 of ex5f.F is being truncated Warning (115): Line 113 of ex5f.F is being truncated Warning (115): Line 114 of ex5f.F is being truncated Warning (115): Line 125 of ex5f.F is being truncated Warning (115): Line 126 of ex5f.F is being truncated Warning (115): Line 127 of ex5f.F is being truncated Warning (115): Line 128 of ex5f.F is being truncated Warning (115): Line 125 of ex5f.F is being truncated Warning (115): Line 126 of ex5f.F is being truncated Warning (115): Line 127 of ex5f.F is being truncated Warning (115): Line 128 of ex5f.F is being truncated Warning (115): Line 130 of ex5f.F is being truncated Warning (115): Line 132 of ex5f.F is being truncated Warning (115): Line 188 of ex5f.F is being truncated Warning (115): Line 344 of ex5f.F is being truncated Warning (115): Line 348 of ex5f.F is being truncated Warning (115): Line 412 of ex5f.F is being truncated Warning (115): Line 417 of ex5f.F is being truncated Warning (115): Line 517 of ex5f.F is being truncated Warning (115): Line 522 of ex5f.F is being truncated Warning (115): Line 528 of ex5f.F is being truncated Warning (115): Line 537 of ex5f.F is being truncated /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wall -Wno-unused-variable -g -o ex5f ex5f.o -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lpetsc -L/usr/X11R6/lib -lX11 -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lparmetis -lmetis -lumfpack -lamd -llapack -lblas -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -L/usr/lib/gcc/i686-apple-darwin10/4.2.1/x86_64 -L/usr/lib/i686-apple-darwin10/4.2.1 -L/usr/lib/gcc/i686-apple-darwin10/4.2.1 -ldl -lpmpich -lmpich -lSystem -lmpichf90 -lf95 -lm -L/opt/local/lib/g95/x86_64-apple-darwin10/4.2.4 -L/usr/lib/gcc -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lpmpich -lmpich -lSystem -ldl /bin/rm -f ex5f.o Fortran example src/snes/examples/tutorials/ex5f run successfully with 1 MPI process Completed test examples -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hzhang at mcs.anl.gov Fri Oct 28 08:31:33 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 28 Oct 2011 08:31:33 -0500 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: petsc-3.1 is more than two years old. The latest release is 3.2. These are warnings in fortran. You can ignore. Hong On Fri, Oct 28, 2011 at 5:46 AM, Kyunghoon Lee wrote: > Hi all, > > I have configured petsc-3.1-p8 with the following options on my Mac OS X > 10.6.8 (primarily to support complex variables): > > ./configure > --prefix=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 > --download-mpich=1 --download-blacs=1 --download-parmetis=1 > --download-scalapack=1 --download-mumps=1 --download-umfpack=1 > --with-scalar-type=complex --with-clanguage=C++ > > At the test stage, I got the following error --- it seems to have something > to do with FORTRAN, although I do not need FORTRAN support.? I'd appreciate > if someone could help me with this error. > > Regards, > K. Lee. > > > $ make PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 > test > Running test examples to verify correct installation > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI > process > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 MPI > processes > --------------Error detected during compile or link!----------------------- > See http://www.mcs.anl.gov/petsc/petsc-2/documentation/troubleshooting.html > /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 -c > -Wall -Wno-unused-variable -g > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include??? 
-o > ex5f.o ex5f.F > Warning (115): Line 92 of ex5f.F is being truncated > Warning (115): Line 113 of ex5f.F is being truncated > Warning (115): Line 114 of ex5f.F is being truncated > Warning (115): Line 113 of ex5f.F is being truncated > Warning (115): Line 114 of ex5f.F is being truncated > Warning (115): Line 125 of ex5f.F is being truncated > Warning (115): Line 126 of ex5f.F is being truncated > Warning (115): Line 127 of ex5f.F is being truncated > Warning (115): Line 128 of ex5f.F is being truncated > Warning (115): Line 125 of ex5f.F is being truncated > Warning (115): Line 126 of ex5f.F is being truncated > Warning (115): Line 127 of ex5f.F is being truncated > Warning (115): Line 128 of ex5f.F is being truncated > Warning (115): Line 130 of ex5f.F is being truncated > Warning (115): Line 132 of ex5f.F is being truncated > Warning (115): Line 188 of ex5f.F is being truncated > Warning (115): Line 344 of ex5f.F is being truncated > Warning (115): Line 348 of ex5f.F is being truncated > Warning (115): Line 412 of ex5f.F is being truncated > Warning (115): Line 417 of ex5f.F is being truncated > Warning (115): Line 517 of ex5f.F is being truncated > Warning (115): Line 522 of ex5f.F is being truncated > Warning (115): Line 528 of ex5f.F is being truncated > Warning (115): Line 537 of ex5f.F is being truncated > /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 > -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress > -Wl,-commons,use_dylibs -Wl,-search_paths_first > -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress > -Wl,-commons,use_dylibs -Wl,-search_paths_first?? -Wall -Wno-unused-variable > -g? -o ex5f ex5f.o > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lpetsc > -L/usr/X11R6/lib -lX11 > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lcmumps > -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs > -lparmetis -lmetis -lumfpack -lamd -llapack -lblas > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib > -L/usr/lib/gcc/i686-apple-darwin10/4.2.1/x86_64 > -L/usr/lib/i686-apple-darwin10/4.2.1 > -L/usr/lib/gcc/i686-apple-darwin10/4.2.1 -ldl -lpmpich -lmpich -lSystem > -lmpichf90 -lf95 -lm -L/opt/local/lib/g95/x86_64-apple-darwin10/4.2.4 > -L/usr/lib/gcc -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lpmpich > -lmpich -lSystem -ldl > /bin/rm -f ex5f.o > Fortran example src/snes/examples/tutorials/ex5f run successfully with 1 MPI > process > Completed test examples > > From hzhang at mcs.anl.gov Fri Oct 28 08:39:18 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 28 Oct 2011 08:39:18 -0500 Subject: [petsc-users] Collecting Values in a Parallel Vector In-Reply-To: References: Message-ID: See an example on "Scatters from a sequential vector to a parallel vector.": ~petsc/src/vec/vec/examples/tests/ex12.c Hong On Fri, Oct 28, 2011 at 5:24 AM, PEREZ CERQUERA MANUEL RICARDO wrote: > Hi everybody, > > I would like to know How do I add values from a sequential PETSC vector from > each process into a parallel vector? , I'm doing this: > > ? ?PetscScatter ctx > ? ?PetscInt NOfTotalBEMFunctions > ? ?idx=(/(i,i=0,NOfTotalBEMFunctions-1)/) > > I create the Vectors LocalZNearNOfNonZeros and GlobalZNearNOfNonZeros with > VecCreateSeq(...) and VecCreateMPI(...) respectively > > ? ?CALL ISCreateGeneral(PETSC_COMM_SELF,NOfTotalBEMFunctions,idx,from,ierr); > ? 
?CALL > ISCreateGeneral(PETSC_COMM_WORLD,NOfTotalBEMFunctions,idx,towards,ierr); > ? ?CALL > VecScatterCreate(LocalZNearNOfNonZeros,from,GlobalZNearNOfNonZeros,towards,ctx,ierr); > ? ?CALL > VecScatterBegin(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > ? ?CALL > VecScatterEnd(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > ? ?CALL VecScatterDestroy(ctx,ierr); > > So when I run in two Process , It crashes in ISCreateGeneral and I got this > error: > > -------- > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably m > emory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and > Apple Ma > c OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- ?Stack Frames > ---------------------------- > -------- > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably m > emory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and > Apple Ma > c OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- ?Stack Frames > ---------------------------- > -------- > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: ? ? ? INSTEAD the line number of the start of the function > [1]PETSC ERROR: ? ? ? is given. > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: ? ? ? INSTEAD the line number of the start of the function > [0]PETSC ERROR: ? ? ? is given. > [1]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. 
> exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 > [1]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [1]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [1]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe > ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > ?--with-clanguage=cxx --useThreads=0 > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown > fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > [0]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. > exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 > [0]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [0]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [0]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe > ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > ?--with-clanguage=cxx --useThreads=0 > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown > fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > job aborted: > rank: node: exit code[: error message] > 0: gvsrv.delen.polito.it: 59: process 0 exited without calling finalize > 1: gvsrv.delen.polito.it: 59: process 1 exited without calling finalize > > I don't know how to solve it, and I would like to know if I'm really doing > well the gatter operation. > > Thanks, Manuel . > > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > From fredva at ifi.uio.no Fri Oct 28 05:24:17 2011 From: fredva at ifi.uio.no (Fredrik Heffer Valdmanis) Date: Fri, 28 Oct 2011 12:24:17 +0200 Subject: [petsc-users] Questions about setting values for GPU based matrices Message-ID: Hi, I am working on integrating the new GPU based vectors and matrices into FEniCS. Now, I'm looking at the possibility for getting some speedup during finite element assembly, specifically when inserting the local element matrix into the global element matrix. In that regard, I have a few questions I hope you can help me out with: - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, what exactly is it that happens? As far as I can see, MatSetValues is not implemented for GPU based matrices, neither is the mat->ops->setvalues set to point at any function for this Mat type. 
- Is it such that matrices are assembled in their entirety on the CPU, and then copied over to the GPU (after calling MatAssemblyBegin)? Or are values copied over to the GPU each time you call MatSetValues? - Can we expect to see any speedup from using MatSetValuesBatch over MatSetValues, or is the batch version simply a utility function? This question goes for both CPU- and GPU-based matrices. Thanks, Fredrik V -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Fri Oct 28 09:38:44 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Fri, 28 Oct 2011 16:38:44 +0200 Subject: [petsc-users] Collecting Values in a Parallel Vector In-Reply-To: References: Message-ID: Run in debugger, you will catch the error instantly. On Fri, Oct 28, 2011 at 12:24 PM, PEREZ CERQUERA MANUEL RICARDO wrote: > Hi everybody, > > I would like to know How do I add values from a sequential PETSC vector from > each process into a parallel vector? , I'm doing this: > > ? ?PetscScatter ctx > ? ?PetscInt NOfTotalBEMFunctions > ? ?idx=(/(i,i=0,NOfTotalBEMFunctions-1)/) > > I create the Vectors LocalZNearNOfNonZeros and GlobalZNearNOfNonZeros with > VecCreateSeq(...) and VecCreateMPI(...) respectively > > ? ?CALL ISCreateGeneral(PETSC_COMM_SELF,NOfTotalBEMFunctions,idx,from,ierr); > ? ?CALL > ISCreateGeneral(PETSC_COMM_WORLD,NOfTotalBEMFunctions,idx,towards,ierr); > ? ?CALL > VecScatterCreate(LocalZNearNOfNonZeros,from,GlobalZNearNOfNonZeros,towards,ctx,ierr); > ? ?CALL > VecScatterBegin(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > ? ?CALL > VecScatterEnd(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > ? ?CALL VecScatterDestroy(ctx,ierr); > > So when I run in two Process , It crashes in ISCreateGeneral and I got this > error: > > -------- > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably m > emory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and > Apple Ma > c OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- ?Stack Frames > ---------------------------- > -------- > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably m > emory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and > Apple Ma > c OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- ?Stack Frames > ---------------------------- > -------- > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: ? ? ? INSTEAD the line number of the start of the function > [1]PETSC ERROR: ? ? ? is given. > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: ? ? ? INSTEAD the line number of the start of the function > [0]PETSC ERROR: ? ? ? is given. 
> [1]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. > exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 > [1]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [1]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [1]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe > ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > ?--with-clanguage=cxx --useThreads=0 > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown > fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > [0]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri Sep 16 10:10:45 > CDT 20 > 11 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. > exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 12:18:28 2011 > [0]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [0]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [0]PETSC ERROR: Configure options --with-cc="win32fe cl" --with-fc="win32fe > ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > ?--with-clanguage=cxx --useThreads=0 > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown > fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > job aborted: > rank: node: exit code[: error message] > 0: gvsrv.delen.polito.it: 59: process 0 exited without calling finalize > 1: gvsrv.delen.polito.it: 59: process 1 exited without calling finalize > > I don't know how to solve it, and I would like to know if I'm really doing > well the gatter operation. > > Thanks, Manuel . > > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > > From vkuhlem at emory.edu Fri Oct 28 10:17:53 2011 From: vkuhlem at emory.edu (Kuhlemann, Verena) Date: Fri, 28 Oct 2011 15:17:53 +0000 Subject: [petsc-users] pcasm vs. 
pcgasm Message-ID: Hi, just a curious question: What is the difference between preconditioners of type PCASM and PCGASM? Thanks, Verena ________________________________ This e-mail message (including any attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message (including any attachments) is strictly prohibited. If you have received this message in error, please contact the sender by reply e-mail message and destroy all copies of the original message (including attachments). -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 28 10:59:24 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Oct 2011 10:59:24 -0500 Subject: [petsc-users] pcasm vs. pcgasm In-Reply-To: References: Message-ID: <5227F54D-9558-4764-804B-EBCAEB6FA66C@mcs.anl.gov> With PCASM there can be one or more subdomains per process but a single subdomain cannot extend over two (MPI) processes. With GASM subdomains can be over any subset of processes. Barry On Oct 28, 2011, at 10:17 AM, Kuhlemann, Verena wrote: > Hi, > > just a curious question: What is the difference between preconditioners > of type PCASM and PCGASM? > > Thanks, > Verena > > > This e-mail message (including any attachments) is for the sole use of > the intended recipient(s) and may contain confidential and privileged > information. If the reader of this message is not the intended > recipient, you are hereby notified that any dissemination, distribution > or copying of this message (including any attachments) is strictly > prohibited. > > If you have received this message in error, please contact > the sender by reply e-mail message and destroy all copies of the > original message (including attachments). From knepley at gmail.com Fri Oct 28 11:02:46 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 28 Oct 2011 16:02:46 +0000 Subject: [petsc-users] KSPMonitorSingularValue In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 6:39 AM, Klaij, Christiaan wrote: > > If you have an explicit sparse matrix you are storing you can > > run with -ksp_view_binary and email to us the resulting file > > called binaryoutput at petsc-maint at mcs.anl.gov if the matrix is > > smaller than 20 megabytes. If it is larger you can anonymously > > ftp it to the ftp site called ftp.mcs.anl.gov and put it in the > > directory incoming then send email to petsc-maint at mcs.anl.gov > > and tell us where the file is. > > > > Barry > > > > Thanks for the offer, but it's a matrix-free implementation, see > my previous emails in this thread. > You could try http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCComputeExplicitOperator.html Matt > Chris > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Oct 28 11:32:00 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 28 Oct 2011 16:32:00 +0000 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < fredva at ifi.uio.no> wrote: > Hi, > > I am working on integrating the new GPU based vectors and matrices into > FEniCS. Now, I'm looking at the possibility for getting some speedup during > finite element assembly, specifically when inserting the local element > matrix into the global element matrix. In that regard, I have a few > questions I hope you can help me out with: > > - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, what > exactly is it that happens? As far as I can see, MatSetValues is not > implemented for GPU based matrices, neither is the mat->ops->setvalues set > to point at any function for this Mat type. > Yes, MatSetValues always operates on the CPU side. It would not make sense to do individual operations on the GPU. I have written batched of assembly for element matrices that are all the same size: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > - Is it such that matrices are assembled in their entirety on the CPU, and > then copied over to the GPU (after calling MatAssemblyBegin)? Or are values > copied over to the GPU each time you call MatSetValues? > That function assembles the matrix on the GPU and then copies to the CPU. The only time you do not want this copy is when you are running in serial and never touch the matrix afterwards, so I left it in. > - Can we expect to see any speedup from using MatSetValuesBatch over > MatSetValues, or is the batch version simply a utility function? This > question goes for both CPU- and GPU-based matrices. > CPU: no GPU: yes, I see about the memory bandwidth ratio Matt > Thanks, > > Fredrik V > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 28 14:37:04 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Oct 2011 14:37:04 -0500 Subject: [petsc-users] [petsc-maint #94846] Collecting Values in a Parallel Vector In-Reply-To: References: Message-ID: <89CCF28A-E156-4957-83B6-39D79B94BC23@mcs.anl.gov> You need to run in the debugger, one of your arguments to the function call is not right for some reason. Barry On Oct 28, 2011, at 5:24 AM, PEREZ CERQUERA MANUEL RICARDO wrote: > Hi everybody, > > I would like to know How do I add values from a sequential > PETSC vector from each process into a parallel vector? , > I'm doing this: > > PetscScatter ctx > PetscInt NOfTotalBEMFunctions > idx=(/(i,i=0,NOfTotalBEMFunctions-1)/) > > I create the Vectors LocalZNearNOfNonZeros and > GlobalZNearNOfNonZeros with VecCreateSeq(...) and > VecCreateMPI(...) 
respectively > > CALL > ISCreateGeneral(PETSC_COMM_SELF,NOfTotalBEMFunctions,idx,from,ierr); > CALL > ISCreateGeneral(PETSC_COMM_WORLD,NOfTotalBEMFunctions,idx,towards,ierr); > CALL > VecScatterCreate(LocalZNearNOfNonZeros,from,GlobalZNearNOfNonZeros,towards,ctx,ierr); > CALL > VecScatterBegin(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > CALL > VecScatterEnd(ctx,LocalZNearNOfNonZeros,GlobalZNearNOfNonZeros,ADD_VALUES,SCATTER_FORWARD,ierr) > CALL VecScatterDestroy(ctx,ierr); > > So when I run in two Process , It crashes in > ISCreateGeneral and I got this error: > > -------- > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably m > emory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[1]PETSC ERROR: or try http://valgrind.org on > GNU/linux and Apple Ma > c OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack > below > [1]PETSC ERROR: --------------------- Stack Frames > ---------------------------- > -------- > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably m > emory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.h > tml#valgrind[0]PETSC ERROR: or try http://valgrind.org on > GNU/linux and Apple Ma > c OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack > below > [0]PETSC ERROR: --------------------- Stack Frames > ---------------------------- > -------- > [1]PETSC ERROR: Note: The EXACT line numbers in the stack > are not available, > [1]PETSC ERROR: INSTEAD the line number of the start > of the function > [1]PETSC ERROR: is given. > [0]PETSC ERROR: Note: The EXACT line numbers in the stack > are not available, > [0]PETSC ERROR: INSTEAD the line number of the start > of the function > [0]PETSC ERROR: is given. > [1]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri > Sep 16 10:10:45 CDT 20 > 11 > [1]PETSC ERROR: See docs/changes/index.html for recent > updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. 
> exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 > 12:18:28 2011 > [1]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [1]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [1]PETSC ERROR: Configure options --with-cc="win32fe cl" > --with-fc="win32fe ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > --with-clanguage=cxx --useThreads=0 > [1]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [1]PETSC ERROR: User provided function() line 0 in unknown > directory unknown fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process > 1 > [0]PETSC ERROR: --------------------- Error Message > ---------------------------- > -------- > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 2, Fri > Sep 16 10:10:45 CDT 20 > 11 > [0]PETSC ERROR: See docs/changes/index.html for recent > updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: C:\Documents and > Settings\d022117\Desktop\MPIRunsPatrju\PAtreju. > exe on a arch-mswi named GVSRV by d022117 Fri Oct 28 > 12:18:28 2011 > [0]PETSC ERROR: Libraries linked from > /home/d022117/petsc-3.2-p2/arch-mswin-cxx- > debug/lib > [0]PETSC ERROR: Configure run at Fri Sep 30 18:13:15 2011 > [0]PETSC ERROR: Configure options --with-cc="win32fe cl" > --with-fc="win32fe ifor > t" --with-cxx="win32fe cl" --download-f-blas-lapack=1 > --with-scalar-type=complex > --with-clanguage=cxx --useThreads=0 > [0]PETSC ERROR: > ---------------------------------------------------------------- > -------- > [0]PETSC ERROR: User provided function() line 0 in unknown > directory unknown fil > e > application called MPI_Abort(MPI_COMM_WORLD, 59) - process > 0 > > job aborted: > rank: node: exit code[: error message] > 0: gvsrv.delen.polito.it: 59: process 0 exited without > calling finalize > 1: gvsrv.delen.polito.it: 59: process 1 exited without > calling finalize > > I don't know how to solve it, and I would like to know if > I'm really doing well the gatter operation. > > Thanks, Manuel . > > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > From aeronova.mailing at gmail.com Fri Oct 28 14:50:11 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Sat, 29 Oct 2011 03:50:11 +0800 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: Thanks, but 3.2 does not work with slepc-3.1. :( On Fri, Oct 28, 2011 at 9:31 PM, Hong Zhang wrote: > petsc-3.1 is more than two years old. The latest release is 3.2. > > These are warnings in fortran. You can ignore. 
> Hong > > On Fri, Oct 28, 2011 at 5:46 AM, Kyunghoon Lee > wrote: > > Hi all, > > > > I have configured petsc-3.1-p8 with the following options on my Mac OS X > > 10.6.8 (primarily to support complex variables): > > > > ./configure > > --prefix=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 > > --download-mpich=1 --download-blacs=1 --download-parmetis=1 > > --download-scalapack=1 --download-mumps=1 --download-umfpack=1 > > --with-scalar-type=complex --with-clanguage=C++ > > > > At the test stage, I got the following error --- it seems to have > something > > to do with FORTRAN, although I do not need FORTRAN support. I'd > appreciate > > if someone could help me with this error. > > > > Regards, > > K. Lee. > > > > > > $ make > PETSC_DIR=/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8 > > test > > Running test examples to verify correct installation > > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 > MPI > > process > > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 > MPI > > processes > > --------------Error detected during compile or > link!----------------------- > > See > http://www.mcs.anl.gov/petsc/petsc-2/documentation/troubleshooting.html > > /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 -c > > -Wall -Wno-unused-variable -g > > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > > -I/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/include > -o > > ex5f.o ex5f.F > > Warning (115): Line 92 of ex5f.F is being truncated > > Warning (115): Line 113 of ex5f.F is being truncated > > Warning (115): Line 114 of ex5f.F is being truncated > > Warning (115): Line 113 of ex5f.F is being truncated > > Warning (115): Line 114 of ex5f.F is being truncated > > Warning (115): Line 125 of ex5f.F is being truncated > > Warning (115): Line 126 of ex5f.F is being truncated > > Warning (115): Line 127 of ex5f.F is being truncated > > Warning (115): Line 128 of ex5f.F is being truncated > > Warning (115): Line 125 of ex5f.F is being truncated > > Warning (115): Line 126 of ex5f.F is being truncated > > Warning (115): Line 127 of ex5f.F is being truncated > > Warning (115): Line 128 of ex5f.F is being truncated > > Warning (115): Line 130 of ex5f.F is being truncated > > Warning (115): Line 132 of ex5f.F is being truncated > > Warning (115): Line 188 of ex5f.F is being truncated > > Warning (115): Line 344 of ex5f.F is being truncated > > Warning (115): Line 348 of ex5f.F is being truncated > > Warning (115): Line 412 of ex5f.F is being truncated > > Warning (115): Line 417 of ex5f.F is being truncated > > Warning (115): Line 517 of ex5f.F is being truncated > > Warning (115): Line 522 of ex5f.F is being truncated > > Warning (115): Line 528 of ex5f.F is being truncated > > Warning (115): Line 537 of ex5f.F is being truncated > > /Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/bin/mpif90 > > -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress > > -Wl,-commons,use_dylibs -Wl,-search_paths_first > > -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress > > -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wall > -Wno-unused-variable > > -g -o ex5f ex5f.o > > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib > > 
-L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lpetsc > > -L/usr/X11R6/lib -lX11 > > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib -lcmumps > > -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs > > -lparmetis -lmetis -lumfpack -lamd -llapack -lblas > > -L/Users/aeronova/Development/local/lib64/petsc/petsc-3.1-p8/lib > > -L/usr/lib/gcc/i686-apple-darwin10/4.2.1/x86_64 > > -L/usr/lib/i686-apple-darwin10/4.2.1 > > -L/usr/lib/gcc/i686-apple-darwin10/4.2.1 -ldl -lpmpich -lmpich -lSystem > > -lmpichf90 -lf95 -lm -L/opt/local/lib/g95/x86_64-apple-darwin10/4.2.4 > > -L/usr/lib/gcc -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lpmpich > > -lmpich -lSystem -ldl > > /bin/rm -f ex5f.o > > Fortran example src/snes/examples/tutorials/ex5f run successfully with 1 > MPI > > process > > Completed test examples > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Oct 28 14:51:49 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 28 Oct 2011 13:51:49 -0600 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 13:50, Kyunghoon Lee wrote: > Thanks, but 3.2 does not work with slepc-3.1. :( You can use slepc-3.2: http://www.grycap.upv.es/slepc/download/download.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeronova.mailing at gmail.com Fri Oct 28 14:54:59 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Sat, 29 Oct 2011 03:54:59 +0800 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: I cannot find 3.2 download; do you mean the svn version? On Sat, Oct 29, 2011 at 3:51 AM, Jed Brown wrote: > On Fri, Oct 28, 2011 at 13:50, Kyunghoon Lee wrote: > >> Thanks, but 3.2 does not work with slepc-3.1. :( > > > You can use slepc-3.2: > > http://www.grycap.upv.es/slepc/download/download.htm > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Oct 28 14:57:02 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 28 Oct 2011 19:57:02 +0000 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 7:54 PM, Kyunghoon Lee wrote: > I cannot find 3.2 download; do you mean the svn version? > It is right at the top of the page: Here is the link http://www.grycap.upv.es/slepc/download/distrib/slepc-3.2-p0.tar.gz Matt > On Sat, Oct 29, 2011 at 3:51 AM, Jed Brown wrote: > >> On Fri, Oct 28, 2011 at 13:50, Kyunghoon Lee wrote: >> >>> Thanks, but 3.2 does not work with slepc-3.1. :( >> >> >> You can use slepc-3.2: >> >> http://www.grycap.upv.es/slepc/download/download.htm >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeronova.mailing at gmail.com Fri Oct 28 14:58:57 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Sat, 29 Oct 2011 03:58:57 +0800 Subject: [petsc-users] petsc-3.1-p8 test error In-Reply-To: References: Message-ID: Thanks! On Sat, Oct 29, 2011 at 3:57 AM, Matthew Knepley wrote: > On Fri, Oct 28, 2011 at 7:54 PM, Kyunghoon Lee > wrote: > >> I cannot find 3.2 download; do you mean the svn version? 
>> > > It is right at the top of the page: Here is the link > > http://www.grycap.upv.es/slepc/download/distrib/slepc-3.2-p0.tar.gz > > Matt > > >> On Sat, Oct 29, 2011 at 3:51 AM, Jed Brown wrote: >> >>> On Fri, Oct 28, 2011 at 13:50, Kyunghoon Lee >> > wrote: >>> >>>> Thanks, but 3.2 does not work with slepc-3.1. :( >>> >>> >>> You can use slepc-3.2: >>> >>> http://www.grycap.upv.es/slepc/download/download.htm >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeronova.mailing at gmail.com Fri Oct 28 17:50:24 2011 From: aeronova.mailing at gmail.com (Kyunghoon Lee) Date: Sat, 29 Oct 2011 06:50:24 +0800 Subject: [petsc-users] =?windows-1252?q?=5Bsplec-3=2E2=5D_cannot_convert_?= =?windows-1252?q?=91std=3A=3Acomplex=3Cdouble=3E=92_to_=91PetscRea?= =?windows-1252?q?l=92_in_assignment?= Message-ID: Hi, I got the following error for slepc-3.2-p0 ks-slice.c: In function ?PetscErrorCode EPSKrylovSchur_Slice(_p_EPS*)?: ks-slice.c:244: error: cannot convert ?std::complex? to ?PetscReal? in assignment ks-slice.c:257: error: cannot convert ?std::complex? to ?PetscReal? in assignment ks-slice.c:316: error: cannot convert ?std::complex? to ?PetscReal? in assignment I compiled petsc-3.2 with the following options: Using PETSc configure options: --prefix=/Users/aeronova/Development/local/lib64/petsc/petsc-3.2-p4 --download-mpich=1 --download-blacs=1 --download-parmetis=1 --download-scalapack=1 --download-mumps=1 --download-umfpack=1 --with-scalar-type=complex --with-clanguage=C++ --with-fc=g95 Using SLEPc I hope someone can help me with this error. Regards, K. Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From huangsc at gmail.com Fri Oct 28 18:48:26 2011 From: huangsc at gmail.com (Shao-Ching Huang) Date: Fri, 28 Oct 2011 16:48:26 -0700 Subject: [petsc-users] multiblock question Message-ID: Hi We are planning a new (finite volume) multiblock code, in which each block has logically structured mesh. We plan to create one DMDA for one block (which could span across 1 or more processes; we already have code). What would be the recommended PETSc-way to couple these blocks together for implicit solves? We also need ghosted region between two connected blocks (just like the ghost regions among the subdomains within a DMDA) for interpolation. Thanks. Shao-Ching From knepley at gmail.com Fri Oct 28 19:17:51 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 29 Oct 2011 00:17:51 +0000 Subject: [petsc-users] multiblock question In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 11:48 PM, Shao-Ching Huang wrote: > Hi > > We are planning a new (finite volume) multiblock code, in which each > block has logically structured mesh. We plan to create one DMDA for > one block (which could span across 1 or more processes; we already > have code). What would be the recommended PETSc-way to couple these > blocks together for implicit solves? We also need ghosted region > between two connected blocks (just like the ghost regions among the > subdomains within a DMDA) for interpolation. > I think the idea here is to use a DMComposite to couple together these DMDAs. You would have to specify the coupling explicitly since we have no way of knowing how they are connected, but after that, the GlobalToLocal() should work just the same. 
Thanks, Matt > Thanks. > > Shao-Ching > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Oct 28 21:07:43 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Oct 2011 21:07:43 -0500 Subject: [petsc-users] =?windows-1252?q?=5Bsplec-3=2E2=5D_cannot_convert_?= =?windows-1252?q?=91std=3A=3Acomplex=3Cdouble=3E=92_to_=91PetscReal=92_in?= =?windows-1252?q?_assignment?= In-Reply-To: References: Message-ID: <6274EED0-C92B-45A4-86CF-B110A50B1774@mcs.anl.gov> This appears to be a question suitable for slepc-maint not petsc-users On Oct 28, 2011, at 5:50 PM, Kyunghoon Lee wrote: > Hi, > > I got the following error for slepc-3.2-p0 > > ks-slice.c: In function ?PetscErrorCode EPSKrylovSchur_Slice(_p_EPS*)?: > ks-slice.c:244: error: cannot convert ?std::complex? to ?PetscReal? in assignment > ks-slice.c:257: error: cannot convert ?std::complex? to ?PetscReal? in assignment > ks-slice.c:316: error: cannot convert ?std::complex? to ?PetscReal? in assignment > > I compiled petsc-3.2 with the following options: > > Using PETSc configure options: --prefix=/Users/aeronova/Development/local/lib64/petsc/petsc-3.2-p4 --download-mpich=1 --download-blacs=1 --download-parmetis=1 --download-scalapack=1 --download-mumps=1 --download-umfpack=1 --with-scalar-type=complex --with-clanguage=C++ --with-fc=g95 > Using SLEPc > > I hope someone can help me with this error. > > Regards, > K. Lee. From huangsc at gmail.com Sat Oct 29 13:02:33 2011 From: huangsc at gmail.com (Shao-Ching Huang) Date: Sat, 29 Oct 2011 11:02:33 -0700 Subject: [petsc-users] multiblock question In-Reply-To: References: Message-ID: Thanks Matt. I will look into DMComposite. Shao-Ching On Fri, Oct 28, 2011 at 5:17 PM, Matthew Knepley wrote: > On Fri, Oct 28, 2011 at 11:48 PM, Shao-Ching Huang > wrote: >> >> Hi >> >> We are planning a new (finite volume) multiblock code, in which each >> block has logically structured mesh. We plan to create one DMDA for >> one block (which could span across 1 or more processes; we already >> have code). What would be the recommended PETSc-way to couple these >> blocks together for implicit solves? We also need ghosted region >> between two connected blocks (just like the ghost regions among the >> subdomains within a DMDA) for interpolation. > > I think the idea here is to use a DMComposite to couple together these > DMDAs. You > would have to specify the coupling explicitly since we have no way of > knowing how they > are connected, but after that, the GlobalToLocal() should work just the > same. > ? Thanks, > ? ? ?Matt > >> >> Thanks. >> >> Shao-Ching > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From huangsc at gmail.com Sat Oct 29 13:58:45 2011 From: huangsc at gmail.com (Shao-Ching Huang) Date: Sat, 29 Oct 2011 11:58:45 -0700 Subject: [petsc-users] multiblock question In-Reply-To: References: Message-ID: Hi, I have additional questions: 1. Suppose I create DMDA0 on two processes {0,1} (communicator comm0) and DMDA1 on another two processes {2,3} (communicator comm1). Can I DMCompositeAddDM() them into DMComposite (created using communicator MPI_COMM_WORLD, containing all 4 processes, 0-3)? 2. 
Suppose DMDA0 and DMDA1 are 2D Cartesian domains, and that the right-hand-side of DMDA0 is "connected" to DMDA1 (just like the subdomains within a regular DMDA). Which API should I use to tell DMComposite that the "right side" (say i=Nx, all j) of DMDA0 is connected to the "left side" (say i=0, all j) of DMDA1. I suppose I need to use IS index set somewhere. Thanks, Shao-Ching On Fri, Oct 28, 2011 at 5:17 PM, Matthew Knepley wrote: > On Fri, Oct 28, 2011 at 11:48 PM, Shao-Ching Huang > wrote: >> >> Hi >> >> We are planning a new (finite volume) multiblock code, in which each >> block has logically structured mesh. We plan to create one DMDA for >> one block (which could span across 1 or more processes; we already >> have code). What would be the recommended PETSc-way to couple these >> blocks together for implicit solves? We also need ghosted region >> between two connected blocks (just like the ghost regions among the >> subdomains within a DMDA) for interpolation. > > I think the idea here is to use a DMComposite to couple together these > DMDAs. You > would have to specify the coupling explicitly since we have no way of > knowing how they > are connected, but after that, the GlobalToLocal() should work just the > same. > ? Thanks, > ? ? ?Matt > >> >> Thanks. >> >> Shao-Ching > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From dominik at itis.ethz.ch Sat Oct 29 15:10:14 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 29 Oct 2011 22:10:14 +0200 Subject: [petsc-users] -ksp_type mumps ? Message-ID: I configured petsc with mumps. How can I use it? I do not seem to find any hints in the documentation, and naive -ksp_type mumps fails with an error "Unknown type" Any hints are appreciated. Dominik From dave.mayhem23 at gmail.com Sat Oct 29 15:15:52 2011 From: dave.mayhem23 at gmail.com (Dave May) Date: Sat, 29 Oct 2011 22:15:52 +0200 Subject: [petsc-users] -ksp_type mumps ? In-Reply-To: References: Message-ID: You need to use this command line option -pc_type lu -pc_factor_mat_solver_package mumps to use mumps. Cheers On 29 October 2011 22:10, Dominik Szczerba wrote: > I configured petsc with mumps. How can I use it? I do not seem to find > any hints in the documentation, and naive -ksp_type mumps fails with > an error "Unknown type" > > Any hints are appreciated. > > Dominik > From dominik at itis.ethz.ch Sat Oct 29 15:26:56 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 29 Oct 2011 22:26:56 +0200 Subject: [petsc-users] -ksp_type mumps ? In-Reply-To: References: Message-ID: Fan-tas-tic, I am indebted. PS. It could be simpler. Thanks, Dominik On Sat, Oct 29, 2011 at 10:15 PM, Dave May wrote: > You need to use this command line option > -pc_type lu -pc_factor_mat_solver_package mumps > to use mumps. > > > Cheers > > > On 29 October 2011 22:10, Dominik Szczerba wrote: >> I configured petsc with mumps. How can I use it? I do not seem to find >> any hints in the documentation, and naive -ksp_type mumps fails with >> an error "Unknown type" >> >> Any hints are appreciated. >> >> Dominik >> > > From knepley at gmail.com Sat Oct 29 21:03:02 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 02:03:02 +0000 Subject: [petsc-users] multiblock question In-Reply-To: References: Message-ID: On Sat, Oct 29, 2011 at 6:58 PM, Shao-Ching Huang wrote: > Hi, I have additional questions: > > 1. 
Suppose I create DMDA0 on two processes {0,1} (communicator comm0) > and DMDA1 on another two processes {2,3} (communicator comm1). Can I > DMCompositeAddDM() them into DMComposite (created using communicator > MPI_COMM_WORLD, containing all 4 processes, 0-3)? > No. This is due to the complexity of MPI for collectives, etc between communicators. Instead, you should use the full communicator for both, but give no vertices to ranks you want to leave out. This means you will have to partition the DMDA yourself, but this is straightforward. There are no performance hits when communicating ghost values, and the reductions inside the solve would need all the procs anyway. > 2. Suppose DMDA0 and DMDA1 are 2D Cartesian domains, and that the > right-hand-side of DMDA0 is "connected" to DMDA1 (just like the > subdomains within a regular DMDA). Which API should I use to tell > DMComposite that the "right side" (say i=Nx, all j) of DMDA0 is > connected to the "left side" (say i=0, all j) of DMDA1. I suppose I > need to use IS index set somewhere. > This is more complicated. All our examples (like SNES ex28) are not grids or scalar which are not coupled. You would need to construct the LocalToGlobal mapping for this collection of grids (which is a set of two ISes). Here is the current code: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/dm/impls/composite/pack.c.html#DMCompositeGetISLocalToGlobalMappings Notice that the mapping for each DMDA is just concatenated. You code would look similar, except that you would knit together one edge. Since we have never had anyone ask for this, the interface is still primitive. If you have a suggestion for a nice way to construct this IS, please let us know. Thanks, Matt > Thanks, > > Shao-Ching > > On Fri, Oct 28, 2011 at 5:17 PM, Matthew Knepley > wrote: > > On Fri, Oct 28, 2011 at 11:48 PM, Shao-Ching Huang > > wrote: > >> > >> Hi > >> > >> We are planning a new (finite volume) multiblock code, in which each > >> block has logically structured mesh. We plan to create one DMDA for > >> one block (which could span across 1 or more processes; we already > >> have code). What would be the recommended PETSc-way to couple these > >> blocks together for implicit solves? We also need ghosted region > >> between two connected blocks (just like the ghost regions among the > >> subdomains within a DMDA) for interpolation. > > > > I think the idea here is to use a DMComposite to couple together these > > DMDAs. You > > would have to specify the coupling explicitly since we have no way of > > knowing how they > > are connected, but after that, the GlobalToLocal() should work just the > > same. > > Thanks, > > Matt > > > >> > >> Thanks. > >> > >> Shao-Ching > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 02:49:25 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 11:19:25 +0330 Subject: [petsc-users] about Singular value Message-ID: Dear all, What is the procedure or method to calculate Extreme Singular Values when calling KSPComputeExtremeSingularValues( )? 
Thanks, B. B. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Oct 30 06:17:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 11:17:17 +0000 Subject: [petsc-users] about Singular value In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 7:49 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear all, > > What is the procedure or method to calculate Extreme Singular Values when > calling KSPComputeExtremeSingularValues( )? > Are you asking what is done internally? We call LAPACK SVD on the Hermitian matrix made by the Krylov method. Matt > Thanks, B. B. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 06:24:53 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 14:54:53 +0330 Subject: [petsc-users] about Singular value In-Reply-To: References: Message-ID: Then it would be an approximation to the Preconditioned Matrix (JM^(-1) or M^(-1)J) up to the dimension of the Krylov subspace? On Sun, Oct 30, 2011 at 2:47 PM, Matthew Knepley wrote: > On Sun, Oct 30, 2011 at 7:49 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Dear all, >> >> What is the procedure or method to calculate Extreme Singular Values when >> calling KSPComputeExtremeSingularValues( )? >> > > Are you asking what is done internally? We call LAPACK SVD on the > Hermitian matrix made by the Krylov method. > > Matt > > >> Thanks, B. B. >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Oct 30 07:12:19 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 12:12:19 +0000 Subject: [petsc-users] about Singular value In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 11:24 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Then it would be an approximation to the Preconditioned Matrix (JM^(-1) or > M^(-1)J) up to the dimension of the Krylov subspace? Yes. Matt > On Sun, Oct 30, 2011 at 2:47 PM, Matthew Knepley wrote: > >> On Sun, Oct 30, 2011 at 7:49 AM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> Dear all, >>> >>> What is the procedure or method to calculate Extreme Singular Values >>> when calling KSPComputeExtremeSingularValues( )? >>> >> >> Are you asking what is done internally? We call LAPACK SVD on the >> Hermitian matrix made by the Krylov method. >> >> Matt >> >> >>> Thanks, B. B. >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. 
Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredva at ifi.uio.no Sun Oct 30 07:22:20 2011 From: fredva at ifi.uio.no (Fredrik Heffer Valdmanis) Date: Sun, 30 Oct 2011 13:22:20 +0100 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: 2011/10/28 Matthew Knepley > On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > >> Hi, >> >> I am working on integrating the new GPU based vectors and matrices into >> FEniCS. Now, I'm looking at the possibility for getting some speedup during >> finite element assembly, specifically when inserting the local element >> matrix into the global element matrix. In that regard, I have a few >> questions I hope you can help me out with: >> >> - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, >> what exactly is it that happens? As far as I can see, MatSetValues is not >> implemented for GPU based matrices, neither is the mat->ops->setvalues set >> to point at any function for this Mat type. >> > > Yes, MatSetValues always operates on the CPU side. It would not make sense > to do individual operations on the GPU. > > I have written batched of assembly for element matrices that are all the > same size: > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > Thanks. I assume that the best way to use the batch function is to batch up all element matrices and insert all with one function call? Or is it recommended to split it up into several smaller batches? -- Fredrik -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 08:20:42 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 16:50:42 +0330 Subject: [petsc-users] about Singular value In-Reply-To: References: Message-ID: Thanks... On Sun, Oct 30, 2011 at 3:42 PM, Matthew Knepley wrote: > On Sun, Oct 30, 2011 at 11:24 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Then it would be an approximation to the Preconditioned Matrix (JM^(-1) >> or M^(-1)J) up to the dimension of the Krylov subspace? > > > Yes. > > Matt > > >> On Sun, Oct 30, 2011 at 2:47 PM, Matthew Knepley wrote: >> >>> On Sun, Oct 30, 2011 at 7:49 AM, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> What is the procedure or method to calculate Extreme Singular Values >>>> when calling KSPComputeExtremeSingularValues( )? >>>> >>> >>> Are you asking what is done internally? We call LAPACK SVD on the >>> Hermitian matrix made by the Krylov method. >>> >>> Matt >>> >>> >>>> Thanks, B. B. >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. 
Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sun Oct 30 10:19:56 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 30 Oct 2011 09:19:56 -0600 Subject: [petsc-users] about Singular value In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 05:17, Matthew Knepley wrote: > We call LAPACK SVD on the Hermitian matrix made by the Krylov method. GMRES builds a Hessenberg matrix. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sch at ucla.edu Sat Oct 29 23:55:39 2011 From: sch at ucla.edu (Shao-Ching Huang) Date: Sat, 29 Oct 2011 21:55:39 -0700 Subject: [petsc-users] multiblock question In-Reply-To: References: Message-ID: Matt, thanks again for the helpful comments. Really appreciate it. I will now work on putting things together. Shao-Ching On Sat, Oct 29, 2011 at 7:03 PM, Matthew Knepley wrote: > On Sat, Oct 29, 2011 at 6:58 PM, Shao-Ching Huang wrote: >> >> Hi, I have additional questions: >> >> 1. Suppose I create DMDA0 on two processes {0,1} (communicator comm0) >> and DMDA1 on another two processes {2,3} (communicator comm1). Can I >> DMCompositeAddDM() them into DMComposite (created using communicator >> MPI_COMM_WORLD, containing all 4 processes, 0-3)? > > No. This is due to the complexity of MPI for collectives, etc between > communicators. > Instead, you should use the full communicator for both, but give no vertices > to > ranks you want to leave out. This means you will have to partition the DMDA > yourself, > but this is straightforward. There are no performance hits when > communicating ghost > values, and the reductions inside the solve would need all the procs anyway. > >> >> 2. Suppose DMDA0 and DMDA1 are 2D Cartesian domains, and that the >> right-hand-side of DMDA0 is "connected" to DMDA1 (just like the >> subdomains within a regular DMDA). Which API should I use to tell >> DMComposite that the "right side" (say i=Nx, all j) of DMDA0 is >> connected to the "left side" (say i=0, all j) of DMDA1. I suppose I >> need to use IS index set somewhere. > > This is more complicated. All our examples (like SNES ex28) are not grids or > scalar which > are not coupled. You would need to construct the LocalToGlobal mapping for > this collection > of grids (which is a set of two ISes). Here is the current code: > ??http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/dm/impls/composite/pack.c.html#DMCompositeGetISLocalToGlobalMappings > Notice that the mapping for each DMDA is just concatenated. You code would > look similar, > except that you would knit together one edge. > Since we have never had anyone ask for this, the interface is still > primitive. If > you have a suggestion for a nice way to construct this IS, please let us > know. > ? Thanks, > ? ? ? 
Matt > >> >> Thanks, >> >> Shao-Ching >> >> On Fri, Oct 28, 2011 at 5:17 PM, Matthew Knepley >> wrote: >> > On Fri, Oct 28, 2011 at 11:48 PM, Shao-Ching Huang >> > wrote: >> >> >> >> Hi >> >> >> >> We are planning a new (finite volume) multiblock code, in which each >> >> block has logically structured mesh. We plan to create one DMDA for >> >> one block (which could span across 1 or more processes; we already >> >> have code). What would be the recommended PETSc-way to couple these >> >> blocks together for implicit solves? We also need ghosted region >> >> between two connected blocks (just like the ghost regions among the >> >> subdomains within a DMDA) for interpolation. >> > >> > I think the idea here is to use a DMComposite to couple together these >> > DMDAs. You >> > would have to specify the coupling explicitly since we have no way of >> > knowing how they >> > are connected, but after that, the GlobalToLocal() should work just the >> > same. >> > ? Thanks, >> > ? ? ?Matt >> > >> >> >> >> Thanks. >> >> >> >> Shao-Ching >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments >> > is infinitely more interesting than any results to which their >> > experiments >> > lead. >> > -- Norbert Wiener >> > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From knepley at gmail.com Sun Oct 30 10:33:57 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 15:33:57 +0000 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis < fredva at ifi.uio.no> wrote: > 2011/10/28 Matthew Knepley > >> On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < >> fredva at ifi.uio.no> wrote: >> >>> Hi, >>> >>> I am working on integrating the new GPU based vectors and matrices into >>> FEniCS. Now, I'm looking at the possibility for getting some speedup during >>> finite element assembly, specifically when inserting the local element >>> matrix into the global element matrix. In that regard, I have a few >>> questions I hope you can help me out with: >>> >>> - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, >>> what exactly is it that happens? As far as I can see, MatSetValues is not >>> implemented for GPU based matrices, neither is the mat->ops->setvalues set >>> to point at any function for this Mat type. >>> >> >> Yes, MatSetValues always operates on the CPU side. It would not make >> sense to do individual operations on the GPU. >> >> I have written batched of assembly for element matrices that are all the >> same size: >> >> >> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html >> > > Thanks. I assume that the best way to use the batch function is to batch > up all element matrices and insert all with one function call? Or is it > recommended to split it up into several smaller batches? > Right now, several batches does not work.For insertion to be efficient, you should keep the matrices in COO format, or convert them back. We do not do either right now. The idea is to see if it ever matters for applications. Matt > -- > Fredrik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 10:43:57 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 19:13:57 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? Message-ID: Dear all, Is there any way in Petsc to obtain Induced norm of matrix? ( especially NORM-2 ) Thanks a lot, B.B. -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sun Oct 30 10:47:42 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sun, 30 Oct 2011 09:47:42 -0600 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 09:43, behzad baghapour wrote: > Is there any way in Petsc to obtain Induced norm of matrix? ( especially > NORM-2 ) > Estimate the largest singular value using a Krylov method. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 10:50:06 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 19:20:06 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: OK. Thanks. On Sun, Oct 30, 2011 at 7:17 PM, Jed Brown wrote: > On Sun, Oct 30, 2011 at 09:43, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Is there any way in Petsc to obtain Induced norm of matrix? ( especially >> NORM-2 ) >> > > Estimate the largest singular value using a Krylov method. > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Oct 30 10:52:28 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 15:52:28 +0000 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 3:50 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > OK. Thanks. More commentary: There are lots of papers about estimating these norms (1-norms too), and nothing works well. There are no good ways to generically approximate the matrix norm. For certain very special classes of matrix, you can do it, but these are also the matrices for which you have a specialize very fast solver, like the Laplacian, so you rarely care. Matt > On Sun, Oct 30, 2011 at 7:17 PM, Jed Brown wrote: > >> On Sun, Oct 30, 2011 at 09:43, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> Is there any way in Petsc to obtain Induced norm of matrix? ( especially >>> NORM-2 ) >>> >> >> Estimate the largest singular value using a Krylov method. >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. 
Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 10:57:25 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 19:27:25 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: So, this means there is no clear way to obtain Induced Norm of matrix like NORM-2 ( unless using SVD and maximum SV ) ? On Sun, Oct 30, 2011 at 7:22 PM, Matthew Knepley wrote: > On Sun, Oct 30, 2011 at 3:50 PM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> OK. Thanks. > > > More commentary: There are lots of papers about estimating these norms > (1-norms too), and > nothing works well. There are no good ways to generically approximate the > matrix norm. For > certain very special classes of matrix, you can do it, but these are also > the matrices for which > you have a specialize very fast solver, like the Laplacian, so you rarely > care. > > Matt > > >> On Sun, Oct 30, 2011 at 7:17 PM, Jed Brown wrote: >> >>> On Sun, Oct 30, 2011 at 09:43, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> Is there any way in Petsc to obtain Induced norm of matrix? ( >>>> especially NORM-2 ) >>>> >>> >>> Estimate the largest singular value using a Krylov method. >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Oct 30 10:59:29 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 30 Oct 2011 15:59:29 +0000 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 3:57 PM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > So, this means there is no clear way to obtain Induced Norm of matrix like > NORM-2 ( unless using SVD and maximum SV ) ? I say yes. I invite you to examine the literature. Matt > On Sun, Oct 30, 2011 at 7:22 PM, Matthew Knepley wrote: > >> On Sun, Oct 30, 2011 at 3:50 PM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> OK. Thanks. >> >> >> More commentary: There are lots of papers about estimating these norms >> (1-norms too), and >> nothing works well. There are no good ways to generically approximate the >> matrix norm. For >> certain very special classes of matrix, you can do it, but these are also >> the matrices for which >> you have a specialize very fast solver, like the Laplacian, so you rarely >> care. 
>> >> Matt >> >> >>> On Sun, Oct 30, 2011 at 7:17 PM, Jed Brown wrote: >>> >>>> On Sun, Oct 30, 2011 at 09:43, behzad baghapour < >>>> behzad.baghapour at gmail.com> wrote: >>>> >>>>> Is there any way in Petsc to obtain Induced norm of matrix? ( >>>>> especially NORM-2 ) >>>>> >>>> >>>> Estimate the largest singular value using a Krylov method. >>>> >>> >>> >>> >>> -- >>> ================================== >>> Behzad Baghapour >>> Ph.D. Candidate, Mechecanical Engineering >>> University of Tehran, Tehran, Iran >>> https://sites.google.com/site/behzadbaghapour >>> Fax: 0098-21-88020741 >>> ================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 11:03:10 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 19:33:10 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: OK. Thanks. I should continue my research. On Sun, Oct 30, 2011 at 7:29 PM, Matthew Knepley wrote: > On Sun, Oct 30, 2011 at 3:57 PM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> So, this means there is no clear way to obtain Induced Norm of matrix >> like NORM-2 ( unless using SVD and maximum SV ) ? > > > I say yes. I invite you to examine the literature. > > Matt > > >> On Sun, Oct 30, 2011 at 7:22 PM, Matthew Knepley wrote: >> >>> On Sun, Oct 30, 2011 at 3:50 PM, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> OK. Thanks. >>> >>> >>> More commentary: There are lots of papers about estimating these norms >>> (1-norms too), and >>> nothing works well. There are no good ways to generically approximate >>> the matrix norm. For >>> certain very special classes of matrix, you can do it, but these are >>> also the matrices for which >>> you have a specialize very fast solver, like the Laplacian, so you >>> rarely care. >>> >>> Matt >>> >>> >>>> On Sun, Oct 30, 2011 at 7:17 PM, Jed Brown wrote: >>>> >>>>> On Sun, Oct 30, 2011 at 09:43, behzad baghapour < >>>>> behzad.baghapour at gmail.com> wrote: >>>>> >>>>>> Is there any way in Petsc to obtain Induced norm of matrix? ( >>>>>> especially NORM-2 ) >>>>>> >>>>> >>>>> Estimate the largest singular value using a Krylov method. >>>>> >>>> >>>> >>>> >>>> -- >>>> ================================== >>>> Behzad Baghapour >>>> Ph.D. Candidate, Mechecanical Engineering >>>> University of Tehran, Tehran, Iran >>>> https://sites.google.com/site/behzadbaghapour >>>> Fax: 0098-21-88020741 >>>> ================================== >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. 
Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Sun Oct 30 11:24:41 2011 From: jack.poulson at gmail.com (Jack Poulson) Date: Sun, 30 Oct 2011 11:24:41 -0500 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 10:52 AM, Matthew Knepley wrote: > More commentary: There are lots of papers about estimating these norms > (1-norms too), and > nothing works well. There are no good ways to generically approximate the > matrix norm. For > certain very special classes of matrix, you can do it, but these are also > the matrices for which > you have a specialize very fast solver, like the Laplacian, so you rarely > care. > > There is a nice paper by John D. Dixon, "Estimating Extremal Eigenvalues and Condition Numbers of Matrices", http://www.jstor.org/pss/2157241, which provides an extremely robust method for getting rough estimates of the condition number, and it only requires the ability to apply your operator and its adjoint. A typical usage would be to compute an estimate of the condition number K, such that the true condition number is within a factor of 2 of K with a probability of 1-10^-6. The 1-norm is actually pretty trivial to compute if you have access to your matrix entries; it is the maximum vector one norm of the columns of the matrix. Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 11:33:39 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 20:03:39 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: A good paper, I will work on it. Thanks a lot dear Jack. On Sun, Oct 30, 2011 at 7:54 PM, Jack Poulson wrote: > On Sun, Oct 30, 2011 at 10:52 AM, Matthew Knepley wrote: > >> More commentary: There are lots of papers about estimating these norms >> (1-norms too), and >> nothing works well. There are no good ways to generically approximate the >> matrix norm. For >> certain very special classes of matrix, you can do it, but these are also >> the matrices for which >> you have a specialize very fast solver, like the Laplacian, so you rarely >> care. >> >> > > There is a nice paper by John D. Dixon, "Estimating Extremal Eigenvalues > and Condition Numbers of Matrices", http://www.jstor.org/pss/2157241, > which provides an extremely robust method for getting rough estimates of > the condition number, and it only requires the ability to apply your > operator and its adjoint. A typical usage would be to compute an estimate > of the condition number K, such that the true condition number is within a > factor of 2 of K with a probability of 1-10^-6. 
> > The 1-norm is actually pretty trivial to compute if you have access to > your matrix entries; it is the maximum vector one norm of the columns of > the matrix. > > Jack > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Sun Oct 30 11:40:27 2011 From: jack.poulson at gmail.com (Jack Poulson) Date: Sun, 30 Oct 2011 11:40:27 -0500 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: Not a problem; though for some reason I repeatedly wrote "condition number" when I meant "two norm". Dixon's paper certainly provides a method for computing an estimate to the condition number, but the latter also requires the ability to apply the inverse of your operator and the inverse of its adjoint. Jack On Sun, Oct 30, 2011 at 11:33 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > A good paper, I will work on it. > Thanks a lot dear Jack. > > > On Sun, Oct 30, 2011 at 7:54 PM, Jack Poulson wrote: > >> On Sun, Oct 30, 2011 at 10:52 AM, Matthew Knepley wrote: >> >>> More commentary: There are lots of papers about estimating these norms >>> (1-norms too), and >>> nothing works well. There are no good ways to generically approximate >>> the matrix norm. For >>> certain very special classes of matrix, you can do it, but these are >>> also the matrices for which >>> you have a specialize very fast solver, like the Laplacian, so you >>> rarely care. >>> >>> >> >> There is a nice paper by John D. Dixon, "Estimating Extremal Eigenvalues >> and Condition Numbers of Matrices", http://www.jstor.org/pss/2157241, >> which provides an extremely robust method for getting rough estimates of >> the condition number, and it only requires the ability to apply your >> operator and its adjoint. A typical usage would be to compute an estimate >> of the condition number K, such that the true condition number is within a >> factor of 2 of K with a probability of 1-10^-6. >> >> The 1-norm is actually pretty trivial to compute if you have access to >> your matrix entries; it is the maximum vector one norm of the columns of >> the matrix. >> >> Jack >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Sun Oct 30 11:51:16 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Sun, 30 Oct 2011 20:21:16 +0330 Subject: [petsc-users] How to calculate Induced Norm of Matrix? In-Reply-To: References: Message-ID: OK. Thanks again. On Sun, Oct 30, 2011 at 8:10 PM, Jack Poulson wrote: > Not a problem; though for some reason I repeatedly wrote "condition > number" when I meant "two norm". Dixon's paper certainly provides a method > for computing an estimate to the condition number, but the latter also > requires the ability to apply the inverse of your operator and the inverse > of its adjoint. > > Jack > > > On Sun, Oct 30, 2011 at 11:33 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> A good paper, I will work on it. 
>> Thanks a lot dear Jack. >> >> >> On Sun, Oct 30, 2011 at 7:54 PM, Jack Poulson wrote: >> >>> On Sun, Oct 30, 2011 at 10:52 AM, Matthew Knepley wrote: >>> >>>> More commentary: There are lots of papers about estimating these norms >>>> (1-norms too), and >>>> nothing works well. There are no good ways to generically approximate >>>> the matrix norm. For >>>> certain very special classes of matrix, you can do it, but these are >>>> also the matrices for which >>>> you have a specialize very fast solver, like the Laplacian, so you >>>> rarely care. >>>> >>>> >>> >>> There is a nice paper by John D. Dixon, "Estimating Extremal Eigenvalues >>> and Condition Numbers of Matrices", http://www.jstor.org/pss/2157241, >>> which provides an extremely robust method for getting rough estimates of >>> the condition number, and it only requires the ability to apply your >>> operator and its adjoint. A typical usage would be to compute an estimate >>> of the condition number K, such that the true condition number is within a >>> factor of 2 of K with a probability of 1-10^-6. >>> >>> The 1-norm is actually pretty trivial to compute if you have access to >>> your matrix entries; it is the maximum vector one norm of the columns of >>> the matrix. >>> >>> Jack >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdliang at gmail.com Sun Oct 30 20:43:49 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Sun, 30 Oct 2011 21:43:49 -0400 Subject: [petsc-users] PaStiX is not included in Cray's PETSc module Message-ID: Hello everyone, I am trying to solve linear system with sparse-direct solver on Kraken Cray XT5 system. As I tested on our own small cluster, PaStiX works better than superlu_dist and mumps. However, PaStix was not installed on Kraken, while superlu_dist and mumps are installed as part of PETSc. I contact Kraken user support and was told the PETSc library was built by Cray and specially optimized for its structure. Do you happen to know the reason that PaStiX was not included in Cray's PETSc module? Thanks. Best, Xiangdong From bsmith at mcs.anl.gov Sun Oct 30 20:47:02 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Oct 2011 20:47:02 -0500 Subject: [petsc-users] PaStiX is not included in Cray's PETSc module In-Reply-To: References: Message-ID: You can install petsc-dev on the system yourself and use whatever solver you chose. Barry On Oct 30, 2011, at 8:43 PM, Xiangdong Liang wrote: > Hello everyone, > > I am trying to solve linear system with sparse-direct solver on Kraken > Cray XT5 system. As I tested on our own small cluster, PaStiX works > better than superlu_dist and mumps. > > However, PaStix was not installed on Kraken, while superlu_dist and > mumps are installed as part of PETSc. I contact Kraken user support > and was told the PETSc library was built by Cray and specially > optimized for its structure. Do you happen to know the reason that > PaStiX was not included in Cray's PETSc module? Thanks. 
> > Best, > Xiangdong From xdliang at gmail.com Sun Oct 30 20:57:26 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Sun, 30 Oct 2011 21:57:26 -0400 Subject: [petsc-users] PaStiX is not included in Cray's PETSc module In-Reply-To: References: Message-ID: Thanks, Barry. Do you have an idea of the performance of the PETSc library between Cray's module and the one I compiled? I was told by Kraken's user support that the PetSc module in Cray's LibSci is highly tuned for the interconnect of the hardware. I am wondering whether my own compiled library will heavily affect the performance. Thanks. Best, Xiangdong On Sun, Oct 30, 2011 at 9:47 PM, Barry Smith wrote: > > ? You can install petsc-dev on the system yourself and use whatever solver you chose. > > ? Barry > > On Oct 30, 2011, at 8:43 PM, Xiangdong Liang wrote: > >> Hello everyone, >> >> I am trying to solve linear system with sparse-direct solver on Kraken >> Cray XT5 system. As I tested on our own small cluster, PaStiX works >> better than superlu_dist and mumps. >> >> However, PaStix was not installed on Kraken, while superlu_dist and >> mumps are installed as part of PETSc. I contact Kraken user support >> and was told the PETSc library was built by Cray and specially >> optimized for its structure. Do you happen to know the reason that >> PaStiX was not included in Cray's PETSc module? Thanks. >> >> Best, >> Xiangdong > > From bsmith at mcs.anl.gov Sun Oct 30 21:01:28 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Oct 2011 21:01:28 -0500 Subject: [petsc-users] PaStiX is not included in Cray's PETSc module In-Reply-To: References: Message-ID: <24A7BB18-0FAE-4AFC-A357-D2264845D98C@mcs.anl.gov> What's the harm in installing petsc-dev with Pastix and running a comparison vs MUMPS and SuperLU_DIST. Direct solvers are notoriously fickle and can only be determined via experimentation. Barry On Oct 30, 2011, at 8:57 PM, Xiangdong Liang wrote: > Thanks, Barry. Do you have an idea of the performance of the PETSc > library between Cray's module and the one I compiled? > > I was told by Kraken's user support that the PetSc module in Cray's > LibSci is highly tuned for the interconnect of the hardware. I am > wondering whether my own compiled library will heavily affect the > performance. Thanks. > > Best, > Xiangdong > > > On Sun, Oct 30, 2011 at 9:47 PM, Barry Smith wrote: >> >> You can install petsc-dev on the system yourself and use whatever solver you chose. >> >> Barry >> >> On Oct 30, 2011, at 8:43 PM, Xiangdong Liang wrote: >> >>> Hello everyone, >>> >>> I am trying to solve linear system with sparse-direct solver on Kraken >>> Cray XT5 system. As I tested on our own small cluster, PaStiX works >>> better than superlu_dist and mumps. >>> >>> However, PaStix was not installed on Kraken, while superlu_dist and >>> mumps are installed as part of PETSc. I contact Kraken user support >>> and was told the PETSc library was built by Cray and specially >>> optimized for its structure. Do you happen to know the reason that >>> PaStiX was not included in Cray's PETSc module? Thanks. >>> >>> Best, >>> Xiangdong >> >> From gdiso at ustc.edu Sun Oct 30 23:03:46 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 31 Oct 2011 12:03:46 +0800 (CST) Subject: [petsc-users] Block ILU with AIJ matrix? Message-ID: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> Hi, I have several materials, each has different dofs. It is a block matrix with nonuniform block size. 
However, I know it is impossible to set different block size within a BAIJ matrix -- sigh. I can only use AIJ matrix here. My question is let a ILU preconditioner of AIJ matrix do block factorization? I guess the block ILU can be 1) more efficient since it is cache friendly. 2) more accurate since it has a "matched" structure. Or any suggestion? Gong Ding From knepley at gmail.com Sun Oct 30 23:07:32 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 04:07:32 +0000 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> Message-ID: 2011/10/31 Gong Ding > Hi, > I have several materials, each has different dofs. > It is a block matrix with nonuniform block size. > However, I know it is impossible to set different block size within a BAIJ > matrix -- sigh. > I can only use AIJ matrix here. > My question is let a ILU preconditioner of AIJ matrix do block > factorization? > > I guess the block ILU can be > 1) more efficient since it is cache friendly. > 2) more accurate since it has a "matched" structure. > > Or any suggestion? I do not believe any sparse factorization package handles varying block size. Matt > > Gong Ding -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Mon Oct 31 00:37:57 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 31 Oct 2011 09:07:57 +0330 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> Message-ID: I think you may choose the maximum block size as a BAIJ dimension. However ITSOL has variable BLOCK ILU routine which may help you BUT it is not paralleled. http://www-users.cs.umn.edu/~saad/software/ITSOL/index.html B.B. On Mon, Oct 31, 2011 at 7:37 AM, Matthew Knepley wrote: > 2011/10/31 Gong Ding > >> Hi, >> I have several materials, each has different dofs. >> It is a block matrix with nonuniform block size. >> However, I know it is impossible to set different block size within a >> BAIJ matrix -- sigh. >> I can only use AIJ matrix here. >> My question is let a ILU preconditioner of AIJ matrix do block >> factorization? >> >> I guess the block ILU can be >> 1) more efficient since it is cache friendly. >> 2) more accurate since it has a "matched" structure. >> >> Or any suggestion? > > > I do not believe any sparse factorization package handles varying block > size. > > Matt > > >> >> Gong Ding > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Mon Oct 31 00:50:28 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 31 Oct 2011 13:50:28 +0800 (CST) Subject: [petsc-users] Block ILU with AIJ matrix? 
In-Reply-To: References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> Message-ID: <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> I had also considered use max block size. But it is not efficient here. Hope petsc support nonuniform block size one day. I think you may choose the maximum block size as a BAIJ dimension. However ITSOL has variable BLOCK ILU routine which may help you BUT it is not paralleled. http://www-users.cs.umn.edu/~saad/software/ITSOL/index.html B.B. -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Mon Oct 31 00:58:05 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 31 Oct 2011 09:28:05 +0330 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> Message-ID: I hope so. On Mon, Oct 31, 2011 at 9:20 AM, Gong Ding wrote: > I had also considered use max block size. But it is not efficient here. > Hope petsc support nonuniform block size one day. > > > I think you may choose the maximum block size as a BAIJ dimension. However > ITSOL has variable BLOCK ILU routine which may help you BUT it is not > paralleled. > > http://www-users.cs.umn.edu/~saad/software/ITSOL/index.html > > B.B. > > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Oct 31 01:18:47 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 31 Oct 2011 00:18:47 -0600 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> Message-ID: On Sun, Oct 30, 2011 at 23:50, Gong Ding wrote: > I had also considered use max block size. But it is not efficient here. > Hope petsc support nonuniform block size one day. > Inodes do a sort of partial blocking. There has to a clear performance benefit to explicit variable blocking in order to justify the implementation and interface complexity. I have not yet seen a demonstration of this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Mon Oct 31 01:33:06 2011 From: xsli at lbl.gov (Xiaoye S. Li) Date: Sun, 30 Oct 2011 23:33:06 -0700 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> Message-ID: The ILUTP in SuperLU (4.2 and up) exploits blocking (supernodes) in the (approximate) L & U factors. Note the block boundary is discovered on the fly, which is usually larger than the block size from the input matrix A. So the efficiency is pretty good. It uses threshold dropping with partial pivoting, numerically is quite good too. It's available only in the serial version. Sherry Li On Sun, Oct 30, 2011 at 11:18 PM, Jed Brown wrote: > On Sun, Oct 30, 2011 at 23:50, Gong Ding wrote: >> >> I had also considered use max block size. But it is not efficient here. 
>> Hope petsc support nonuniform block size one day. > > Inodes do a sort of partial blocking. There has to a clear performance > benefit to explicit variable blocking in order to justify the implementation > and interface complexity. I have not yet seen a demonstration of this. From gdiso at ustc.edu Mon Oct 31 03:15:41 2011 From: gdiso at ustc.edu (Gong Ding) Date: Mon, 31 Oct 2011 16:15:41 +0800 (CST) Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> Message-ID: <22314575.298931320048941114.JavaMail.coremail@mail.ustc.edu> Yes, modern direct solver do block factorization as super node method. Can ASM type preconditioner use superlu ILUT on each subdomain? At preset, my solver (BCGS) only works with ASM+ILU(1) on a "smart partitioned" mesh. > The ILUTP in SuperLU (4.2 and up) exploits blocking (supernodes) in > > the (approximate) L & U factors. Note the block boundary is > > discovered on the fly, which is usually larger than the block size > > from the input matrix A. So the efficiency is pretty good. It uses > > threshold dropping with partial pivoting, numerically is quite good > > too. It's available only in the serial version. > > > > Sherry Li > > > > > > On Sun, Oct 30, 2011 at 11:18 PM, Jed Brown wrote: > > > On Sun, Oct 30, 2011 at 23:50, Gong Ding wrote: > > >> > > >> I had also considered use max block size. But it is not efficient here. > > >> Hope petsc support nonuniform block size one day. > > > > > > Inodes do a sort of partial blocking. There has to a clear performance > > > benefit to explicit variable blocking in order to justify the implementation > > > and interface complexity. I have not yet seen a demonstration of this. > > From behzad.baghapour at gmail.com Mon Oct 31 03:55:11 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 31 Oct 2011 12:25:11 +0330 Subject: [petsc-users] How to change PC content during Iterations Message-ID: Dear all, I'm using KSP iteration (for now) to solve my nonlinear problem and handling Newton Iterations manually. Here I want to change the PC method from some specified KSP iteration. How should I do it correctly in Petsc? Thanks, B.B. -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 31 05:46:43 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 10:46:43 +0000 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: <22314575.298931320048941114.JavaMail.coremail@mail.ustc.edu> References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> <22314575.298931320048941114.JavaMail.coremail@mail.ustc.edu> Message-ID: On Mon, Oct 31, 2011 at 8:15 AM, Gong Ding wrote: > Yes, modern direct solver do block factorization as super node method. > Can ASM type preconditioner use superlu ILUT on each subdomain? > Yes. Matt > At preset, my solver (BCGS) only works with ASM+ILU(1) on a "smart > partitioned" mesh. > > > > The ILUTP in SuperLU (4.2 and up) exploits blocking (supernodes) in > > > > the (approximate) L & U factors. 
Note the block boundary is > > > > discovered on the fly, which is usually larger than the block size > > > > from the input matrix A. So the efficiency is pretty good. It uses > > > > threshold dropping with partial pivoting, numerically is quite good > > > > too. It's available only in the serial version. > > > > > > > > Sherry Li > > > > > > > > > > > > On Sun, Oct 30, 2011 at 11:18 PM, Jed Brown > wrote: > > > > > On Sun, Oct 30, 2011 at 23:50, Gong Ding wrote: > > > > >> > > > > >> I had also considered use max block size. But it is not efficient > here. > > > > >> Hope petsc support nonuniform block size one day. > > > > > > > > > > Inodes do a sort of partial blocking. There has to a clear performance > > > > > benefit to explicit variable blocking in order to justify the > implementation > > > > > and interface complexity. I have not yet seen a demonstration of this. > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 31 05:49:34 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 10:49:34 +0000 Subject: [petsc-users] How to change PC content during Iterations In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 8:55 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > Dear all, > > I'm using KSP iteration (for now) to solve my nonlinear problem and > handling Newton Iterations manually. Here I want to change the PC method > from some specified KSP iteration. How should I do it correctly in Petsc? > Call PCSetType() and then KSPSetOperators() again in your loop Matt > Thanks, B.B. > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Mon Oct 31 05:58:41 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 31 Oct 2011 14:28:41 +0330 Subject: [petsc-users] How to change PC content during Iterations In-Reply-To: References: Message-ID: I did it but received this Error when I want to change the level of fill for PCILU: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Object is in wrong state! [0]PETSC ERROR: Cannot change levels after use! [0]PETSC ERROR: ------------------------------------------------------------------------ On Mon, Oct 31, 2011 at 2:19 PM, Matthew Knepley wrote: > On Mon, Oct 31, 2011 at 8:55 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> Dear all, >> >> I'm using KSP iteration (for now) to solve my nonlinear problem and >> handling Newton Iterations manually. Here I want to change the PC method >> from some specified KSP iteration. How should I do it correctly in Petsc? >> > > Call PCSetType() and then KSPSetOperators() again in your loop > > Matt > > >> Thanks, B.B. >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. 
Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Oct 31 06:02:56 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 11:02:56 +0000 Subject: [petsc-users] How to change PC content during Iterations In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 10:58 AM, behzad baghapour < behzad.baghapour at gmail.com> wrote: > I did it but received this Error when I want to change the level of fill > for PCILU: > Right, you would have to recreate the preconditioner. The SetType() would not work since you did not actually change the type. Matt > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Object is in wrong state! > [0]PETSC ERROR: Cannot change levels after use! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > On Mon, Oct 31, 2011 at 2:19 PM, Matthew Knepley wrote: > >> On Mon, Oct 31, 2011 at 8:55 AM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> Dear all, >>> >>> I'm using KSP iteration (for now) to solve my nonlinear problem and >>> handling Newton Iterations manually. Here I want to change the PC method >>> from some specified KSP iteration. How should I do it correctly in Petsc? >>> >> >> Call PCSetType() and then KSPSetOperators() again in your loop >> >> Matt >> >> >>> Thanks, B.B. >>> >>> -- >>> ================================== >>> Behzad Baghapour >>> Ph.D. Candidate, Mechecanical Engineering >>> University of Tehran, Tehran, Iran >>> https://sites.google.com/site/behzadbaghapour >>> Fax: 0098-21-88020741 >>> ================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From behzad.baghapour at gmail.com Mon Oct 31 06:14:02 2011 From: behzad.baghapour at gmail.com (behzad baghapour) Date: Mon, 31 Oct 2011 14:44:02 +0330 Subject: [petsc-users] How to change PC content during Iterations In-Reply-To: References: Message-ID: This means to PCDestroy() and then PCCreate again at that iteration? 
On Mon, Oct 31, 2011 at 2:32 PM, Matthew Knepley wrote: > On Mon, Oct 31, 2011 at 10:58 AM, behzad baghapour < > behzad.baghapour at gmail.com> wrote: > >> I did it but received this Error when I want to change the level of fill >> for PCILU: >> > > Right, you would have to recreate the preconditioner. The SetType() would > not work since > you did not actually change the type. > > Matt > > >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Object is in wrong state! >> [0]PETSC ERROR: Cannot change levels after use! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> >> On Mon, Oct 31, 2011 at 2:19 PM, Matthew Knepley wrote: >> >>> On Mon, Oct 31, 2011 at 8:55 AM, behzad baghapour < >>> behzad.baghapour at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> I'm using KSP iteration (for now) to solve my nonlinear problem and >>>> handling Newton Iterations manually. Here I want to change the PC method >>>> from some specified KSP iteration. How should I do it correctly in Petsc? >>>> >>> >>> Call PCSetType() and then KSPSetOperators() again in your loop >>> >>> Matt >>> >>> >>>> Thanks, B.B. >>>> >>>> -- >>>> ================================== >>>> Behzad Baghapour >>>> Ph.D. Candidate, Mechecanical Engineering >>>> University of Tehran, Tehran, Iran >>>> https://sites.google.com/site/behzadbaghapour >>>> Fax: 0098-21-88020741 >>>> ================================== >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> ================================== >> Behzad Baghapour >> Ph.D. Candidate, Mechecanical Engineering >> University of Tehran, Tehran, Iran >> https://sites.google.com/site/behzadbaghapour >> Fax: 0098-21-88020741 >> ================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- ================================== Behzad Baghapour Ph.D. Candidate, Mechecanical Engineering University of Tehran, Tehran, Iran https://sites.google.com/site/behzadbaghapour Fax: 0098-21-88020741 ================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredva at ifi.uio.no Mon Oct 31 07:31:35 2011 From: fredva at ifi.uio.no (Fredrik Heffer Valdmanis) Date: Mon, 31 Oct 2011 13:31:35 +0100 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: 2011/10/30 Matthew Knepley > On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > >> 2011/10/28 Matthew Knepley >> >>> On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < >>> fredva at ifi.uio.no> wrote: >>> >>>> Hi, >>>> >>>> I am working on integrating the new GPU based vectors and matrices into >>>> FEniCS. Now, I'm looking at the possibility for getting some speedup during >>>> finite element assembly, specifically when inserting the local element >>>> matrix into the global element matrix. In that regard, I have a few >>>> questions I hope you can help me out with: >>>> >>>> - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, >>>> what exactly is it that happens? 
As far as I can see, MatSetValues is not >>>> implemented for GPU based matrices, neither is the mat->ops->setvalues set >>>> to point at any function for this Mat type. >>>> >>> >>> Yes, MatSetValues always operates on the CPU side. It would not make >>> sense to do individual operations on the GPU. >>> >>> I have written batched of assembly for element matrices that are all the >>> same size: >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html >>> >> >> Thanks. I assume that the best way to use the batch function is to batch >> up all element matrices and insert all with one function call? Or is it >> recommended to split it up into several smaller batches? >> > > Right now, several batches does not work.For insertion to be efficient, > you should keep the matrices in COO > format, or convert them back. We do not do either right now. The idea is > to see if it ever matters for applications. > > > OK, thanks. Any estimate on when additive mode will be added to MatSetValuesBatch? As it is now, this batch function is of limited use to us, as it forces us to maintain an extra internal data structure to handle accumulation of numbers that are inserted at the same indices in the matrix. Any particular reason you chose not to support additive mode in this first implementation? Are there any considerations I should be aware of? Thanks, Fredrik -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Oct 31 08:08:32 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 31 Oct 2011 07:08:32 -0600 Subject: [petsc-users] How to change PC content during Iterations In-Reply-To: References: Message-ID: As a hack, you can probably PCSetType(pc, PCNONE); PCSetType(pc, PCILU); and then set the new number of levels. PCILU should be updated to be able to do this. On Oct 31, 2011 4:14 AM, "behzad baghapour" wrote: > This means to PCDestroy() and then PCCreate again at that iteration? > > > On Mon, Oct 31, 2011 at 2:32 PM, Matthew Knepley wrote: > >> On Mon, Oct 31, 2011 at 10:58 AM, behzad baghapour < >> behzad.baghapour at gmail.com> wrote: >> >>> I did it but received this Error when I want to change the level of fill >>> for PCILU: >>> >> >> Right, you would have to recreate the preconditioner. The SetType() would >> not work since >> you did not actually change the type. >> >> Matt >> >> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Object is in wrong state! >>> [0]PETSC ERROR: Cannot change levels after use! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> >>> >>> >>> >>> On Mon, Oct 31, 2011 at 2:19 PM, Matthew Knepley wrote: >>> >>>> On Mon, Oct 31, 2011 at 8:55 AM, behzad baghapour < >>>> behzad.baghapour at gmail.com> wrote: >>>> >>>>> Dear all, >>>>> >>>>> I'm using KSP iteration (for now) to solve my nonlinear problem and >>>>> handling Newton Iterations manually. Here I want to change the PC method >>>>> from some specified KSP iteration. How should I do it correctly in Petsc? >>>>> >>>> >>>> Call PCSetType() and then KSPSetOperators() again in your loop >>>> >>>> Matt >>>> >>>> >>>>> Thanks, B.B. >>>>> >>>>> -- >>>>> ================================== >>>>> Behzad Baghapour >>>>> Ph.D. 
Candidate, Mechecanical Engineering >>>>> University of Tehran, Tehran, Iran >>>>> https://sites.google.com/site/behzadbaghapour >>>>> Fax: 0098-21-88020741 >>>>> ================================== >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> ================================== >>> Behzad Baghapour >>> Ph.D. Candidate, Mechecanical Engineering >>> University of Tehran, Tehran, Iran >>> https://sites.google.com/site/behzadbaghapour >>> Fax: 0098-21-88020741 >>> ================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > ================================== > Behzad Baghapour > Ph.D. Candidate, Mechecanical Engineering > University of Tehran, Tehran, Iran > https://sites.google.com/site/behzadbaghapour > Fax: 0098-21-88020741 > ================================== > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Oct 31 08:17:26 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 31 Oct 2011 08:17:26 -0500 Subject: [petsc-users] Block ILU with AIJ matrix? In-Reply-To: References: <17945181.298501320033826965.JavaMail.coremail@mail.ustc.edu> <2039792.298621320040228798.JavaMail.coremail@mail.ustc.edu> <22314575.298931320048941114.JavaMail.coremail@mail.ustc.edu> Message-ID: >> Yes, modern direct solver do block factorization as super node method. >> Can ASM type preconditioner use superlu ILUT on each subdomain? > > Yes. Yes. Example ~petsc/src/ksp/ksp/examples/tutorials>mpiexec -n 2 ./ex2 -pc_type asm -sub_pc_type ilu -sub_pc_factor_mat_solver_package superlu -ksp_view Hong >> >> At preset, my solver (BCGS) only works with ASM+ILU(1) on a "smart >> partitioned" mesh. >> >> >> > The ILUTP in SuperLU (4.2 and up) exploits blocking (supernodes) in >> > >> > the (approximate) L & U factors. ?Note the block boundary is >> > >> > discovered on the fly, which is usually larger than the block size >> > >> > from the input matrix A. ?So the efficiency is pretty good. ?It uses >> > >> > threshold dropping with partial pivoting, numerically is quite good >> > >> > too. ? ?It's available only in the serial version. >> > >> > >> > >> > Sherry Li >> > >> > >> > >> > >> > >> > On Sun, Oct 30, 2011 at 11:18 PM, Jed Brown >> > wrote: >> > >> > > On Sun, Oct 30, 2011 at 23:50, Gong Ding wrote: >> > >> > >> >> > >> > >> I had also considered use max block size. But it is not efficient >> > >> here. >> > >> > >> Hope petsc support nonuniform block size one day. >> > >> > > >> > >> > > Inodes do a sort of partial blocking. There has to a clear performance >> > >> > > benefit to explicit variable blocking in order to justify the >> > > implementation >> > >> > > and interface complexity. I have not yet seen a demonstration of this. >> > >> > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
> -- Norbert Wiener > From dominik at itis.ethz.ch Mon Oct 31 10:11:10 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 31 Oct 2011 16:11:10 +0100 Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? Message-ID: In the documentation of VecGetArray I read that it returns a pointer to the local data array and does not use any copies. Does this mean, that changing the values of the vector followed by VecAssemblyBegin/End does NOT invalidate the pointer? In other words, do I need to re-get the array when vec's values are changed? Thanks, Dominik From balay at mcs.anl.gov Mon Oct 31 10:19:39 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 31 Oct 2011 10:19:39 -0500 (CDT) Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? In-Reply-To: References: Message-ID: On Mon, 31 Oct 2011, Dominik Szczerba wrote: > In the documentation of VecGetArray I read that it returns a pointer > to the local data array and does not use any copies. Does this mean, > that changing the values of the vector followed by > VecAssemblyBegin/End does NOT invalidate the pointer? In other words, > do I need to re-get the array when vec's values are changed? For one - you don't need VecAssemblyBegin/End if you obtain the array with VecGetArray() - and change values there. Also - after you are done using the array - you should call VecRestoreArray(). And you should be changing the values only between these two calls. [i.e do not stash the pointer for later use - and keep changing values with this pointer - after the call to VecRestroeArray()]. If you need to change the vec again - call VecGetArray() again. Note: With VecGetArray() - you get access to local array - so can modify only local values. If you have to set values that might go to a different processor - then VecSetValues() - with VecAssemblyBegin/End() - is the correct thing to do. Satish From jedbrown at mcs.anl.gov Mon Oct 31 10:47:44 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 31 Oct 2011 09:47:44 -0600 Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 09:19, Satish Balay wrote: > Also - after you are done using the array - you should call > VecRestoreArray(). And you should be changing the values only between > these two calls. [i.e do not stash the pointer for later use - and > keep changing values with this pointer - after the call to > VecRestroeArray()]. If you need to change the vec again - call > VecGetArray() again. > This seems to come up frequently. People check that the implementation is mostly correct when they stash the pointer and then want to do this, perhaps because of some false sense that it is an "optimization". Don't do that. Changing the values in this way can invalidate norms and artificially restricts what can be done with other Vec types or possible future multicore memory optimizations. Just call VecGetArray() and VecRestoreArray() around any place that you modify values. Some people want to wrap PETSc Vecs in their own C++ type that implements operator[]. I don't recommend wrapping because it makes implementing callbacks from PETSc more complicated. (You get a plain PETSc Vec and have to wrap it before you can use it, which means that any extra semantic information that you wanted to have in the wrapper needs to be reconstructed (where?). 
You should be able to compose that extra semantic information with a PETSc Vec (or DM) so that all the callbacks get that information.) But if you insist on wrapping and implementing operator[], you have a few choices: 1. Put VecGetArray() in the operator[] implementation, return a proxy object with VecRestoreArray() in the proxy's destructor. This is the finest grain option and I don't like its semantics. To have multiple proxies live at once, you would need to hold a reference count so that VecRestoreArray() is only called when the last one goes out of scope. Although unfortunately a common idiom in C++, returning proxy objects from operator[] is complicated and unavoidably causes lots of painful semantics, see std::vector. 2. Call VecGetArray() when needed in the operator[] implementation and have an explicit function in your wrapper that ends the access phase by calling VecRestoreArray(). I don't like this because there isn't a good way to declare whether you will be modifying the vector or whether access is read-only (can use VecGetArrayRead()) and I think some explicit statement that you are about to modify a Vec is good documentation. 3. Have explicit functions in your wrappers to gain and restore access, with operator[] only allowed between these calls. This has the same semantics as using the native PETSc API, but it lets you access without needing to declare local pointers (you still needed your wrappers). 4. Have a function in your wrapper that returns an accessor object that implements operator[] (in this model, your wrapper does not implement operator[] itself). The accessor's destructor can call VecRestoreArray(). A variant of this is to have an accessor whose constructor takes your wrapper or a PETSc Vec directly. This is my favorite option if you are trying to avoid raw pointers in your internal interfaces. PetscErrorCode YourCallback(...,Vec vec) { VecAccessor a(vec, access_mode); // Can associate grid information and possibly update ghost values. for (int i=a.xs; i From knepley at gmail.com Mon Oct 31 10:48:52 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 15:48:52 +0000 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 12:31 PM, Fredrik Heffer Valdmanis < fredva at ifi.uio.no> wrote: > 2011/10/30 Matthew Knepley > >> On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis < >> fredva at ifi.uio.no> wrote: >> >>> 2011/10/28 Matthew Knepley >>> >>>> On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < >>>> fredva at ifi.uio.no> wrote: >>>> >>>>> Hi, >>>>> >>>>> I am working on integrating the new GPU based vectors and matrices >>>>> into FEniCS. Now, I'm looking at the possibility for getting some speedup >>>>> during finite element assembly, specifically when inserting the local >>>>> element matrix into the global element matrix. In that regard, I have a few >>>>> questions I hope you can help me out with: >>>>> >>>>> - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, >>>>> what exactly is it that happens? As far as I can see, MatSetValues is not >>>>> implemented for GPU based matrices, neither is the mat->ops->setvalues set >>>>> to point at any function for this Mat type. >>>>> >>>> >>>> Yes, MatSetValues always operates on the CPU side. It would not make >>>> sense to do individual operations on the GPU. 
>>>> >>>> I have written batched of assembly for element matrices that are all >>>> the same size: >>>> >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html >>>> >>> >>> Thanks. I assume that the best way to use the batch function is to batch >>> up all element matrices and insert all with one function call? Or is it >>> recommended to split it up into several smaller batches? >>> >> >> Right now, several batches does not work.For insertion to be efficient, >> you should keep the matrices in COO >> format, or convert them back. We do not do either right now. The idea is >> to see if it ever matters for applications. >> >> >> OK, thanks. > > Any estimate on when additive mode will be added to MatSetValuesBatch? As > it is now, this batch function is of limited use to us, as it forces us to > maintain an extra internal data structure to handle accumulation of numbers > that are inserted at the same indices in the matrix. > I cannot understand what you need this for. All you need is the complete list of element matrices. Any particular reason you chose not to support additive mode in this first > implementation? Are there any considerations I should be aware of? > I said why above. It would require data structure changes and code support, and we would need some level of user request for that. Matt > Thanks, > > Fredrik > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 31 10:54:00 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 31 Oct 2011 10:54:00 -0500 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: Message-ID: <7ED800AB-C79C-4A5A-8436-796D9797B2DE@mcs.anl.gov> Matt, Won't you always want ADD_VALUES support? After all this is how finite element matrix assembly is done. Each element stiffness becomes one block in the batch and then they get added into the global matrix? How would one use the current batch routine with finite elements? Barry On Oct 31, 2011, at 10:48 AM, Matthew Knepley wrote: > On Mon, Oct 31, 2011 at 12:31 PM, Fredrik Heffer Valdmanis wrote: > 2011/10/30 Matthew Knepley > On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis wrote: > 2011/10/28 Matthew Knepley > On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis wrote: > Hi, > > I am working on integrating the new GPU based vectors and matrices into FEniCS. Now, I'm looking at the possibility for getting some speedup during finite element assembly, specifically when inserting the local element matrix into the global element matrix. In that regard, I have a few questions I hope you can help me out with: > > - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, what exactly is it that happens? As far as I can see, MatSetValues is not implemented for GPU based matrices, neither is the mat->ops->setvalues set to point at any function for this Mat type. > > Yes, MatSetValues always operates on the CPU side. It would not make sense to do individual operations on the GPU. > > I have written batched of assembly for element matrices that are all the same size: > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > > Thanks. 
I assume that the best way to use the batch function is to batch up all element matrices and insert all with one function call? Or is it recommended to split it up into several smaller batches? > > Right now, several batches does not work.For insertion to be efficient, you should keep the matrices in COO > format, or convert them back. We do not do either right now. The idea is to see if it ever matters for applications. > > > OK, thanks. > > Any estimate on when additive mode will be added to MatSetValuesBatch? As it is now, this batch function is of limited use to us, as it forces us to maintain an extra internal data structure to handle accumulation of numbers that are inserted at the same indices in the matrix. > > I cannot understand what you need this for. All you need is the complete list of element matrices. > > Any particular reason you chose not to support additive mode in this first implementation? Are there any considerations I should be aware of? > > I said why above. It would require data structure changes and code support, and we would need some level of user request for that. > > Matt > > Thanks, > > Fredrik > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Mon Oct 31 11:14:00 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 16:14:00 +0000 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: <7ED800AB-C79C-4A5A-8436-796D9797B2DE@mcs.anl.gov> References: <7ED800AB-C79C-4A5A-8436-796D9797B2DE@mcs.anl.gov> Message-ID: On Mon, Oct 31, 2011 at 3:54 PM, Barry Smith wrote: > > Matt, > > Won't you always want ADD_VALUES support? After all this is how finite > element matrix assembly is done. Each element stiffness becomes one block > in the batch and then they get added into the global matrix? How would one > use the current batch routine with finite elements? Of course ADD_VALUES works. What does not work is calling it multiple times on the same matrix, for the reasons I stated. Matt > > Barry > > On Oct 31, 2011, at 10:48 AM, Matthew Knepley wrote: > > > On Mon, Oct 31, 2011 at 12:31 PM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > 2011/10/30 Matthew Knepley > > On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > 2011/10/28 Matthew Knepley > > On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > Hi, > > > > I am working on integrating the new GPU based vectors and matrices into > FEniCS. Now, I'm looking at the possibility for getting some speedup during > finite element assembly, specifically when inserting the local element > matrix into the global element matrix. In that regard, I have a few > questions I hope you can help me out with: > > > > - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, > what exactly is it that happens? As far as I can see, MatSetValues is not > implemented for GPU based matrices, neither is the mat->ops->setvalues set > to point at any function for this Mat type. > > > > Yes, MatSetValues always operates on the CPU side. It would not make > sense to do individual operations on the GPU. 
> > > > I have written batched of assembly for element matrices that are all the > same size: > > > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > > > > Thanks. I assume that the best way to use the batch function is to batch > up all element matrices and insert all with one function call? Or is it > recommended to split it up into several smaller batches? > > > > Right now, several batches does not work.For insertion to be efficient, > you should keep the matrices in COO > > format, or convert them back. We do not do either right now. The idea is > to see if it ever matters for applications. > > > > > > OK, thanks. > > > > Any estimate on when additive mode will be added to MatSetValuesBatch? > As it is now, this batch function is of limited use to us, as it forces us > to maintain an extra internal data structure to handle accumulation of > numbers that are inserted at the same indices in the matrix. > > > > I cannot understand what you need this for. All you need is the complete > list of element matrices. > > > > Any particular reason you chose not to support additive mode in this > first implementation? Are there any considerations I should be aware of? > > > > I said why above. It would require data structure changes and code > support, and we would need some level of user request for that. > > > > Matt > > > > Thanks, > > > > Fredrik > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Oct 31 11:29:01 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 31 Oct 2011 11:29:01 -0500 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: References: <7ED800AB-C79C-4A5A-8436-796D9797B2DE@mcs.anl.gov> Message-ID: <6F8DCEE0-C6EC-49A5-AD9B-F144193AC5D0@mcs.anl.gov> On Oct 31, 2011, at 11:14 AM, Matthew Knepley wrote: > On Mon, Oct 31, 2011 at 3:54 PM, Barry Smith wrote: > > Matt, > > Won't you always want ADD_VALUES support? After all this is how finite element matrix assembly is done. Each element stiffness becomes one block in the batch and then they get added into the global matrix? How would one use the current batch routine with finite elements? > > Of course ADD_VALUES works. What does not work is calling it multiple times on the same matrix, for the reasons I stated. Maybe you should change the manual page to make this whole business clear. Currently one could interpret it in a variety of ways. For example "Inserts many blocks of values into a matrix at once." "Inserts" isn't really right and "In the future, we may extend this routine to handle rectangular blocks, and additive mode." What the hey is "additive mode"? I thought ADD_VALUES as opposed to INSERT_VALUES. and just say that it cannot be called multiply times before MatAssemblyBegin/End .... 
Barry > > Matt > > > Barry > > On Oct 31, 2011, at 10:48 AM, Matthew Knepley wrote: > > > On Mon, Oct 31, 2011 at 12:31 PM, Fredrik Heffer Valdmanis wrote: > > 2011/10/30 Matthew Knepley > > On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis wrote: > > 2011/10/28 Matthew Knepley > > On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis wrote: > > Hi, > > > > I am working on integrating the new GPU based vectors and matrices into FEniCS. Now, I'm looking at the possibility for getting some speedup during finite element assembly, specifically when inserting the local element matrix into the global element matrix. In that regard, I have a few questions I hope you can help me out with: > > > > - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, what exactly is it that happens? As far as I can see, MatSetValues is not implemented for GPU based matrices, neither is the mat->ops->setvalues set to point at any function for this Mat type. > > > > Yes, MatSetValues always operates on the CPU side. It would not make sense to do individual operations on the GPU. > > > > I have written batched of assembly for element matrices that are all the same size: > > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > > > > Thanks. I assume that the best way to use the batch function is to batch up all element matrices and insert all with one function call? Or is it recommended to split it up into several smaller batches? > > > > Right now, several batches does not work.For insertion to be efficient, you should keep the matrices in COO > > format, or convert them back. We do not do either right now. The idea is to see if it ever matters for applications. > > > > > > OK, thanks. > > > > Any estimate on when additive mode will be added to MatSetValuesBatch? As it is now, this batch function is of limited use to us, as it forces us to maintain an extra internal data structure to handle accumulation of numbers that are inserted at the same indices in the matrix. > > > > I cannot understand what you need this for. All you need is the complete list of element matrices. > > > > Any particular reason you chose not to support additive mode in this first implementation? Are there any considerations I should be aware of? > > > > I said why above. It would require data structure changes and code support, and we would need some level of user request for that. > > > > Matt > > > > Thanks, > > > > Fredrik > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Mon Oct 31 11:46:27 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 31 Oct 2011 16:46:27 +0000 Subject: [petsc-users] Questions about setting values for GPU based matrices In-Reply-To: <6F8DCEE0-C6EC-49A5-AD9B-F144193AC5D0@mcs.anl.gov> References: <7ED800AB-C79C-4A5A-8436-796D9797B2DE@mcs.anl.gov> <6F8DCEE0-C6EC-49A5-AD9B-F144193AC5D0@mcs.anl.gov> Message-ID: On Mon, Oct 31, 2011 at 4:29 PM, Barry Smith wrote: > > On Oct 31, 2011, at 11:14 AM, Matthew Knepley wrote: > > > On Mon, Oct 31, 2011 at 3:54 PM, Barry Smith wrote: > > > > Matt, > > > > Won't you always want ADD_VALUES support? 
After all this is how > finite element matrix assembly is done. Each element stiffness becomes one > block in the batch and then they get added into the global matrix? How > would one use the current batch routine with finite elements? > > > > Of course ADD_VALUES works. What does not work is calling it multiple > times on the same matrix, for the reasons I stated. > > Maybe you should change the manual page to make this whole business > clear. Currently one could interpret it in a variety of ways. > > For example "Inserts many blocks of values into a matrix at once." > "Inserts" isn't really right > > and "In the future, we may extend this routine to handle rectangular > blocks, and additive mode." What the hey is "additive mode"? I thought > ADD_VALUES as opposed to INSERT_VALUES. > > and just say that it cannot be called multiply times before > MatAssemblyBegin/End .... Done. Matt > > Barry > > > > > Matt > > > > > > Barry > > > > On Oct 31, 2011, at 10:48 AM, Matthew Knepley wrote: > > > > > On Mon, Oct 31, 2011 at 12:31 PM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > > 2011/10/30 Matthew Knepley > > > On Sun, Oct 30, 2011 at 12:22 PM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > > 2011/10/28 Matthew Knepley > > > On Fri, Oct 28, 2011 at 10:24 AM, Fredrik Heffer Valdmanis < > fredva at ifi.uio.no> wrote: > > > Hi, > > > > > > I am working on integrating the new GPU based vectors and matrices > into FEniCS. Now, I'm looking at the possibility for getting some speedup > during finite element assembly, specifically when inserting the local > element matrix into the global element matrix. In that regard, I have a few > questions I hope you can help me out with: > > > > > > - When calling MatSetValues with a MATSEQAIJCUSP matrix as parameter, > what exactly is it that happens? As far as I can see, MatSetValues is not > implemented for GPU based matrices, neither is the mat->ops->setvalues set > to point at any function for this Mat type. > > > > > > Yes, MatSetValues always operates on the CPU side. It would not make > sense to do individual operations on the GPU. > > > > > > I have written batched of assembly for element matrices that are all > the same size: > > > > > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBatch.html > > > > > > Thanks. I assume that the best way to use the batch function is to > batch up all element matrices and insert all with one function call? Or is > it recommended to split it up into several smaller batches? > > > > > > Right now, several batches does not work.For insertion to be > efficient, you should keep the matrices in COO > > > format, or convert them back. We do not do either right now. The idea > is to see if it ever matters for applications. > > > > > > > > > OK, thanks. > > > > > > Any estimate on when additive mode will be added to MatSetValuesBatch? > As it is now, this batch function is of limited use to us, as it forces us > to maintain an extra internal data structure to handle accumulation of > numbers that are inserted at the same indices in the matrix. > > > > > > I cannot understand what you need this for. All you need is the > complete list of element matrices. > > > > > > Any particular reason you chose not to support additive mode in this > first implementation? Are there any considerations I should be aware of? > > > > > > I said why above. 
It would require data structure changes and code > support, and we would need some level of user request for that. > > > > > > Matt > > > > > > Thanks, > > > > > > Fredrik > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Mon Oct 31 16:53:20 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 31 Oct 2011 22:53:20 +0100 Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? In-Reply-To: References: Message-ID: Seems I was not clear enough. I have my pointer to the local vector values obtained witth VecGetArray. This function is called only once at the beginning of the program and VecRestoreArray is called at the end. The pointer is only used to read the values. I am doing an iterative scheme whereby I change the vector values (VecSetValues, i.e., cross-process) and subsequent VecAssemblyBegin/End. All I want to know is if I need to VecRestoreArray and VecGetArray again, of if it remains valid throughout the scheme. Many thanks and sorry for the confusion, Dominik On Mon, Oct 31, 2011 at 4:19 PM, Satish Balay wrote: > On Mon, 31 Oct 2011, Dominik Szczerba wrote: > >> In the documentation of VecGetArray I read that it returns a pointer >> to the local data array and does not use any copies. Does this mean, >> that changing the values of the vector followed by >> VecAssemblyBegin/End does NOT invalidate the pointer? In other words, >> do I need to re-get the array when vec's values are changed? > > For one - you don't need VecAssemblyBegin/End if you obtain the array > with VecGetArray() - and change values there. > > Also - after you are done using the array - you should call > VecRestoreArray(). ?And you should be changing the values only between > these two calls. [i.e do not stash the pointer for later use - and > keep changing values with this pointer - after the call to > VecRestroeArray()]. ?If you need to change the vec again - call > VecGetArray() again. > > > Note: With VecGetArray() - you get access to local array - so can > modify only local values. If you have to set values that might go to a > different processor - then VecSetValues() - with > VecAssemblyBegin/End() - is the correct thing to do. > > Satish > > From dominik at itis.ethz.ch Mon Oct 31 16:56:55 2011 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Mon, 31 Oct 2011 22:56:55 +0100 Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? In-Reply-To: References: Message-ID: Thanks Jed for the detailed elaboration. No, I am not overloading [] operator, really doing C++ minimalistically actually. My question is much simpler, just if the pointer obtained with VecGetArray will continue to reflect the correct values despite the intermittent calls to VecSetValues and VecAssemblyBegin/End. 
Many thanks, Dominik On Mon, Oct 31, 2011 at 4:47 PM, Jed Brown wrote: > On Mon, Oct 31, 2011 at 09:19, Satish Balay wrote: >> >> Also - after you are done using the array - you should call >> VecRestoreArray(). ?And you should be changing the values only between >> these two calls. [i.e do not stash the pointer for later use - and >> keep changing values with this pointer - after the call to >> VecRestroeArray()]. ?If you need to change the vec again - call >> VecGetArray() again. > > This seems to come up frequently. People check that the implementation is > mostly correct when they stash the pointer and then want to do this, perhaps > because of some false sense that it is an "optimization". Don't do that. > Changing the values in this way can invalidate norms and artificially > restricts what can be done with other Vec types or possible future multicore > memory optimizations. Just call VecGetArray() and VecRestoreArray() around > any place that you modify values. > Some people want to wrap PETSc Vecs in their own C++ type that implements > operator[]. I don't recommend wrapping because it makes implementing > callbacks from PETSc more complicated. (You get a plain PETSc Vec and have > to wrap it before you can use it, which means that any extra semantic > information that you wanted to have in the wrapper needs to be reconstructed > (where?). You should be able to compose that extra semantic information with > a PETSc Vec (or DM) so that all the callbacks get that information.) But if > you insist on wrapping and implementing operator[], you have a few choices: > 1. Put VecGetArray() in the operator[] implementation, return a proxy object > with VecRestoreArray() in the proxy's destructor. This is the finest grain > option and I don't like its semantics. To have multiple proxies live at > once, you would need to hold a reference count so that VecRestoreArray() is > only called when the last one goes out of scope. Although unfortunately a > common idiom in C++, returning proxy objects from operator[] is complicated > and unavoidably causes lots of painful semantics, see std::vector. > 2. Call VecGetArray() when needed in the operator[] implementation and have > an explicit function in your wrapper that ends the access phase by calling > VecRestoreArray(). I don't like this because there isn't a good way to > declare whether you will be modifying the vector or whether access is > read-only (can use VecGetArrayRead()) and I think some explicit statement > that you are about to modify a Vec is good documentation. > 3. Have explicit functions in your wrappers to gain and restore access, with > operator[] only allowed between these calls. This has the same semantics as > using the native PETSc API, but it lets you access without needing to > declare local pointers (you still needed your wrappers). > 4. Have a function in your wrapper that returns an accessor object that > implements operator[] (in this model, your wrapper does not implement > operator[] itself). The accessor's destructor can call VecRestoreArray(). A > variant of this is to have an accessor whose constructor takes your wrapper > or a PETSc Vec directly. This is my favorite option if you are trying to > avoid raw pointers in your internal interfaces. > PetscErrorCode YourCallback(...,Vec vec) { > VecAccessor a(vec, access_mode); // Can associate grid information and > possibly update ghost values. > for (int i=a.xs; i ?a[i] = ...; > } > // Destructor calls VecRestoreArray(). 
It could also accumulate ghost values > depending on access_mode. From jedbrown at mcs.anl.gov Mon Oct 31 21:24:27 2011 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 31 Oct 2011 20:24:27 -0600 Subject: [petsc-users] Does an array obtained with VecGetArray remain valid after vec's values changed? In-Reply-To: References: Message-ID: On Mon, Oct 31, 2011 at 15:53, Dominik Szczerba wrote: > Seems I was not clear enough. I have my pointer to the local vector > values obtained witth VecGetArray. This function is called only once > at the beginning of the program and VecRestoreArray is called at the > end. The pointer is only used to read the values. I am doing an > iterative scheme whereby I change the vector values (VecSetValues, > i.e., cross-process) and subsequent VecAssemblyBegin/End. All I want > to know is if I need to VecRestoreArray and VecGetArray again, of if > it remains valid throughout the scheme. > Yes, call VecRestoreArray() when you are done looking at values and call VecGetArray() the next time you want to look at values. With PETSc native Vec, both will work, at least under typical use conditions. But why misuse an API in a fragile way just to create more implicit, non-referenced counted, confusing sharing? Your question sounds to me like asking whether it's okay to replaced structured control flow with a linked list of function addresses stored as char* that you "call" using longjmp(). Yes, it's possible and will usually work, but please don't. -------------- next part -------------- An HTML attachment was scrubbed... URL: