From sapphire.jxy at gmail.com  Tue Sep  1 18:25:40 2009
From: sapphire.jxy at gmail.com (xiaoyin ji)
Date: Tue, 1 Sep 2009 19:25:40 -0400
Subject: PETSc is getting slower in C++?
Message-ID: <6985a8f00909011625n76e67e73m37f843bb0d28a075@mail.gmail.com>

Hi,

I'm embedding a PETSc solver inside a molecular dynamics program. I use
the MATMPIAIJ matrix format and set up the KSP solver with KSPBICG and
the preconditioner PCBJACOBI. The problem is that the KSP solve is
getting slower and slower with each MD step: at the very beginning it
takes about 1 sec, and after 5000 steps the solve takes up to 10 sec,
increasing gradually. However, I also have an identical Fortran version
of the program, which does exactly the same thing and has shown no such
problem over 1,000,000 steps; the Fortran version always solves within
1 sec. MPI_Wtime() shows that the KSP solve is exactly the part that is
slowing down. Thanks for any hints.

Best,
Xiaoyin Ji

Department of Material Science and Engineering
North Carolina State University

From knepley at gmail.com  Tue Sep  1 18:31:01 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 1 Sep 2009 18:31:01 -0500
Subject: PETSc is getting slower in C++?
In-Reply-To: <6985a8f00909011625n76e67e73m37f843bb0d28a075@mail.gmail.com>
References: <6985a8f00909011625n76e67e73m37f843bb0d28a075@mail.gmail.com>
Message-ID:

We can't say anything without at least seeing

a) The output of -log_summary

b) The output of -ksp_monitor and -ksp_view for a few solves

  Matt

On Tue, Sep 1, 2009 at 6:25 PM, xiaoyin ji wrote:
> I'm embedding a PETSc solver inside a molecular dynamics program. I use
> the MATMPIAIJ matrix format and set up the KSP solver with KSPBICG and
> the preconditioner PCBJACOBI.

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

From bsmith at mcs.anl.gov  Tue Sep  1 20:40:22 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 1 Sep 2009 20:40:22 -0500
Subject: PETSc is getting slower in C++?
In-Reply-To: <6985a8f00909011625n76e67e73m37f843bb0d28a075@mail.gmail.com>
References: <6985a8f00909011625n76e67e73m37f843bb0d28a075@mail.gmail.com>
Message-ID:

What is the stopping criterion for the KSPSolve? Are you providing a
nonzero initial guess (and calling KSPSetInitialGuessNonzero())? Is the
number of iterations of BiCG increasing with the step? Is the matrix
changing, or does it remain the same? If it changes, are you calling
KSPSetOperators() with the new matrix?

  Barry

On Sep 1, 2009, at 6:25 PM, xiaoyin ji wrote:
> I'm embedding a PETSc solver inside a molecular dynamics program. I use
> the MATMPIAIJ matrix format and set up the KSP solver with KSPBICG and
> the preconditioner PCBJACOBI. The problem is that the KSP solve is
> getting slower and slower with each MD step.
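In code, Barry's checklist corresponds to a pattern roughly like the
following minimal sketch (assuming the 3.0-era C API; ksp, A, b, and x
stand in for the program's own solver, matrix, and vectors, and the
nonzero pattern is assumed fixed across MD steps):

    #include "petscksp.h"

    /* Solve one MD step, reusing the same KSP and warm-starting from the
       previous solution left in x. */
    PetscErrorCode solve_step(KSP ksp, Mat A, Vec b, Vec x)
    {
      PetscErrorCode ierr;

      /* If A changed this step, hand it to the solver again; with
         SAME_NONZERO_PATTERN the preconditioner can reuse symbolic data. */
      ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);

      /* Iterate from the previous solution instead of zero; without this,
         -ksp_view reports "initial guess is zero". */
      ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);

      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      return 0;
    }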
From juhaj at iki.fi  Wed Sep  2 07:23:36 2009
From: juhaj at iki.fi (Juha Jäykkä)
Date: Wed, 2 Sep 2009 13:23:36 +0100
Subject: HDF5 and DA ndof > 1
Message-ID: <200909021323.36847.juhaj@iki.fi>

Hi all!

I noticed a bug in 3.0.0-p8/src/dm/da/src/gr2.c, which causes a DA
vector to be saved incorrectly into the HDF5 file.

However, this seems to be fixed in the latest nightly build, so my
question is: how usable is the nightly, or when will -p9 come out with
working HDF5 DA saving when ndof > 1?

Cheers,
Juha
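For reference, the save path under discussion looks roughly like this
(a sketch assuming the HDF5 viewer API present in petsc-dev of this
vintage; the grid sizes and ndof = 3 are illustrative, and gr2.c
supplies the VecView implementation being fixed):

    #include "petscda.h"

    /* Write a multi-component DA vector to an HDF5 file. */
    PetscErrorCode save_field(void)
    {
      DA             da;
      Vec            g;
      PetscViewer    h5;
      PetscErrorCode ierr;

      ierr = DACreate2d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_BOX,
                        64, 64, PETSC_DECIDE, PETSC_DECIDE, 3 /* ndof */, 1,
                        PETSC_NULL, PETSC_NULL, &da);CHKERRQ(ierr);
      ierr = DACreateGlobalVector(da, &g);CHKERRQ(ierr);
      /* ... fill g ... */
      ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "field.h5",
                                 FILE_MODE_WRITE, &h5);CHKERRQ(ierr);
      ierr = VecView(g, h5);CHKERRQ(ierr);  /* the ndof > 1 case that was broken */
      ierr = PetscViewerDestroy(h5);CHKERRQ(ierr);
      ierr = VecDestroy(g);CHKERRQ(ierr);
      ierr = DADestroy(da);CHKERRQ(ierr);
      return 0;
    }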
From bsmith at mcs.anl.gov  Wed Sep  2 12:12:24 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 2 Sep 2009 12:12:24 -0500
Subject: HDF5 and DA ndof > 1
In-Reply-To: <200909021323.36847.juhaj@iki.fi>
References: <200909021323.36847.juhaj@iki.fi>
Message-ID: <29740F76-86B8-4B96-AE1E-26A6F899C2D8@mcs.anl.gov>

Juha,

You will need to switch to petsc-dev to have this ability. This fix
won't get backported to petsc-3.0.0. If you have trouble with petsc-dev,
let us know at petsc-maint at mcs.anl.gov.

  Barry

On Sep 2, 2009, at 7:23 AM, Juha Jäykkä wrote:
> I noticed a bug in 3.0.0-p8/src/dm/da/src/gr2.c, which causes a DA
> vector to be saved incorrectly into the HDF5 file.

From dalcinl at gmail.com  Wed Sep  2 14:49:17 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 2 Sep 2009 16:49:17 -0300
Subject: HDF5 and DA ndof > 1
In-Reply-To: <29740F76-86B8-4B96-AE1E-26A6F899C2D8@mcs.anl.gov>
References: <200909021323.36847.juhaj@iki.fi>
 <29740F76-86B8-4B96-AE1E-26A6F899C2D8@mcs.anl.gov>
Message-ID:

On Wed, Sep 2, 2009 at 2:12 PM, Barry Smith wrote:
> You will need to switch to petsc-dev to have this ability. This fix
> won't get backported to petsc-3.0.0.

The problem for Juha is likely not petsc-dev itself, but building the
latest TAO 1.10 against petsc-dev...

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From bsmith at mcs.anl.gov  Wed Sep  2 14:53:43 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 2 Sep 2009 14:53:43 -0500
Subject: HDF5 and DA ndof > 1
In-Reply-To:
References: <200909021323.36847.juhaj@iki.fi>
 <29740F76-86B8-4B96-AE1E-26A6F899C2D8@mcs.anl.gov>
Message-ID:

It may be possible to make the changes to 3.0.0 (by simply diffing with
petsc-dev) but I don't have time to do it.

  Barry

On Sep 2, 2009, at 2:49 PM, Lisandro Dalcin wrote:
> The problem for Juha is likely not petsc-dev itself, but building the
> latest TAO 1.10 against petsc-dev...

From balay at mcs.anl.gov  Wed Sep  2 15:00:39 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 2 Sep 2009 15:00:39 -0500 (CDT)
Subject: HDF5 and DA ndof > 1
In-Reply-To:
References: <200909021323.36847.juhaj@iki.fi>
 <29740F76-86B8-4B96-AE1E-26A6F899C2D8@mcs.anl.gov>
Message-ID:

The easiest thing to do is to isolate the changeset, grab its diff, and
apply that to petsc-3.0.0. If you know the changeset id [or the lines
that belong to this changeset] I can take a look. If the fix spans
multiple changesets, then we'll have to find and extract them all
[finding all of them might be tricky].

Satish

On Wed, 2 Sep 2009, Barry Smith wrote:
> It may be possible to make the changes to 3.0.0 (by simply diffing
> with petsc-dev) but I don't have time to do it.
From juhaj at iki.fi  Wed Sep  2 15:10:17 2009
From: juhaj at iki.fi (Juha Jäykkä)
Date: Wed, 02 Sep 2009 23:10:17 +0300
Subject: HDF5 and DA ndof > 1
In-Reply-To:
References: <200909021323.36847.juhaj@iki.fi>
Message-ID: <200909022310.21088.juhaj@iki.fi>

> If you know the changeset id [or the lines that belong to this
> changeset] I can take a look.

No need to, I have already backported the change to 3.0.0-p8. Thanks
anyway. It is rather simple, really: one can simply replace the
3.0.0-p8 gr2.c with the petsc-dev gr2.c, provided one also adds the
define

  #define PetscHDF5CastInt(a) (a)

someplace suitable (I wanted to confine my changes to gr2.c, so I put
it there, too). That fixes it - at least for the uniprocessor case; I
did not check the multiprocessor case yet. I'll see about that tomorrow.

My thought was that if -p9 would include that fix AND would be out
soon, I would have waited for it before going into production with a
tweaked source. But thank you all for a set of prompt replies anyway!

-Juha

-- 
-----------------------------------------------
| Juha Jäykkä, juhaj at iki.fi                 |
| http://www.utu.fi/~juolja/                  |
-----------------------------------------------
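Concretely, the workaround Juha describes amounts to dropping the
petsc-dev gr2.c into the 3.0.0-p8 tree with one pass-through macro
added near the top of the file (a sketch; the macro body is exactly
what he quotes):

    /* petsc-dev's gr2.c uses PetscHDF5CastInt(), which 3.0.0-p8 does
       not define; for this backport a pass-through is all it needs. */
    #define PetscHDF5CastInt(a) (a)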
From sapphire.jxy at gmail.com  Thu Sep  3 07:34:30 2009
From: sapphire.jxy at gmail.com (xiaoyin ji)
Date: Thu, 3 Sep 2009 08:34:30 -0400
Subject: PETSc is slowing down in C++? continued
Message-ID: <6985a8f00909030534s7f3c10f7v74d02437cb150e99@mail.gmail.com>

Hi,

Here are the printouts.

For the very beginning, average time is about 0.8 sec for the KSP solver:

KSP Object:
  type: bicg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
  left preconditioning
PC Object:
  type: bjacobi
    block Jacobi: number of blocks = 16
    Local solve is same for all blocks, in the following KSP and PC objects:
  KSP Object:(sub_)
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
    left preconditioning
  PC Object:(sub_)
    type: ilu
      ILU: 0 levels of fill
      ILU: factor fill ratio allocated 1
      ILU: tolerance for zero pivot 1e-12
      ILU: using diagonal shift to prevent zero pivot
      ILU: using diagonal shift on blocks to prevent zero pivot
      out-of-place factorization
      matrix ordering: natural
      ILU: factor fill ratio needed 1
        Factored matrix follows
        Matrix Object:
          type=seqaij, rows=5672, cols=5672
          package used to perform factorization: petsc
          total: nonzeros=39090, allocated nonzeros=39704
            not using I-node routines
    linear system matrix = precond matrix:
    Matrix Object:
      type=seqaij, rows=5672, cols=5672
      total: nonzeros=39090, allocated nonzeros=39704
        not using I-node routines
  linear system matrix = precond matrix:
  Matrix Object:
    type=mpiaij, rows=90746, cols=90746
    total: nonzeros=636378, allocated nonzeros=1279114
      not using I-node (on process 0) routines
Norm of error 48.144, Iterations 137

After 4000 steps, the solver takes 7.5 sec:

KSP Object:
  type: bicg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
  left preconditioning
PC Object:
  type: bjacobi
    block Jacobi: number of blocks = 16
    Local solve is same for all blocks, in the following KSP and PC objects:
  KSP Object:(sub_)
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
    left preconditioning
  PC Object:(sub_)
    type: ilu
      ILU: 0 levels of fill
      ILU: factor fill ratio allocated 1
      ILU: tolerance for zero pivot 1e-12
      ILU: using diagonal shift to prevent zero pivot
      ILU: using diagonal shift on blocks to prevent zero pivot
      out-of-place factorization
      matrix ordering: natural
      ILU: factor fill ratio needed 1
        Factored matrix follows
        Matrix Object:
          type=seqaij, rows=5672, cols=5672
          package used to perform factorization: petsc
          total: nonzeros=39090, allocated nonzeros=39704
            not using I-node routines
    linear system matrix = precond matrix:
    Matrix Object:
      type=seqaij, rows=5672, cols=5672
      total: nonzeros=39090, allocated nonzeros=39704
        not using I-node routines
  linear system matrix = precond matrix:
  Matrix Object:
    type=mpiaij, rows=90746, cols=90746
    total: nonzeros=636378, allocated nonzeros=1279114
      not using I-node (on process 0) routines
Norm of error 48.7467, Iterations 132

The iterations are similar; the solving time is actually increasing
exponentially, and the matrix should not be too complicated here, as
the Fortran PETSc code solves this in 1 sec.

By the way, will there be a way to set a PETSc vector directly into a
preconditioner for the KSP solver?

Thanks!

Best,
Xiaoyin Ji

Department of Materials Science and Engineering
North Carolina State University
From knepley at gmail.com  Thu Sep  3 07:53:37 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 3 Sep 2009 07:53:37 -0500
Subject: PETSc is slowing down in C++? continued
In-Reply-To: <6985a8f00909030534s7f3c10f7v74d02437cb150e99@mail.gmail.com>
References: <6985a8f00909030534s7f3c10f7v74d02437cb150e99@mail.gmail.com>
Message-ID:

On Thu, Sep 3, 2009 at 7:34 AM, xiaoyin ji wrote:
> Here are the printouts. For the very beginning, average time is about
> 0.8 sec for the KSP solver; after 4000 steps, the solver takes 7.5 sec.
> The iterations are similar.

You did not send the output of -log_summary. You will probably have to
segregate the solves into different stages in order to see the
difference.
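Segregating the solves as suggested would look something like the
sketch below (3.0-era logging calls assumed; the argument order of
PetscLogStageRegister() has varied across PETSc versions, so check the
manual page for yours; the stage name is illustrative):

    /* Give the per-step solves their own stage so -log_summary reports
       them separately from assembly and the rest of the MD step. */
    PetscLogStage  stage;
    PetscErrorCode ierr;

    ierr = PetscLogStageRegister("MD KSPSolve", &stage);CHKERRQ(ierr);
    /* ... inside the MD loop ... */
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = PetscLogStagePop();CHKERRQ(ierr);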
I suspect

a) You do not have something preallocated correctly

b) You are not freeing something (run with -malloc_dump) and thus
clogging memory

c) Something apart from PETSc is taking the time

> By the way, will there be a way to set a PETSc vector directly into a
> preconditioner for the ksp solver?

Do not know what you mean here.

  Matt

From sapphire.jxy at gmail.com  Thu Sep  3 12:54:41 2009
From: sapphire.jxy at gmail.com (xiaoyin ji)
Date: Thu, 3 Sep 2009 13:54:41 -0400
Subject: petsc-users Digest, Vol 9, Issue 3
Message-ID: <6985a8f00909031054j100728cbte8e1f27fd144cc21@mail.gmail.com>

I thought the preconditioner for the KSP solver was a vector from which
the solver starts to iterate and solve... so my idea was that if I
could set a vector which is close to the solution as the
preconditioner, then KSP would do fewer iterations.

Xiaoyin Ji

>> By the way, will there be a way to set a PETSc vector directly into a
>> preconditioner for the ksp solver?
>
> Do not know what you mean here.
>
>   Matt

From knepley at gmail.com  Thu Sep  3 14:08:14 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 3 Sep 2009 14:08:14 -0500
Subject: petsc-users Digest, Vol 9, Issue 3
In-Reply-To: <6985a8f00909031054j100728cbte8e1f27fd144cc21@mail.gmail.com>
References: <6985a8f00909031054j100728cbte8e1f27fd144cc21@mail.gmail.com>
Message-ID:

On Thu, Sep 3, 2009 at 12:54 PM, xiaoyin ji wrote:
> I thought the preconditioner for the KSP solver was a vector from
> which the solver starts to iterate and solve...

It is not. I suggest Yousef Saad's book, Iterative Methods for Sparse
Linear Systems.

  Matt
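A preconditioner is an operator applied at every iteration, not a
starting vector. The closest legitimate version of "putting a vector
into the preconditioner" is a shell PC that applies a user-supplied
vector as, say, a diagonal scaling. A minimal sketch (the PCSHELL
callback signature shown here is the 3.0-era one and has varied between
releases; pc comes from KSPGetPC(), and diag is a user-filled Vec):

    #include "petscksp.h"

    /* Apply y = D^{-1} x, where the shell context holds the diagonal. */
    PetscErrorCode MyDiagApply(PC pc, Vec x, Vec y)
    {
      Vec            diag;
      PetscErrorCode ierr;

      ierr = PCShellGetContext(pc, (void**)&diag);CHKERRQ(ierr);
      ierr = VecPointwiseDivide(y, x, diag);CHKERRQ(ierr); /* y = x ./ diag */
      return 0;
    }

    /* setup, once */
    ierr = PCSetType(pc, PCSHELL);CHKERRQ(ierr);
    ierr = PCShellSetContext(pc, diag);CHKERRQ(ierr);
    ierr = PCShellSetApply(pc, MyDiagApply);CHKERRQ(ierr);

A good guess of the solution, by contrast, belongs in x with
KSPSetInitialGuessNonzero() set, as sketched earlier in the thread.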
From nicolas.aunai at gmail.com  Fri Sep  4 07:39:12 2009
From: nicolas.aunai at gmail.com (nicolas aunai)
Date: Fri, 4 Sep 2009 14:39:12 +0200
Subject: bug ? number of processor, coarse and fine DA for DMMG
Message-ID:

Hello,

I have a code that uses the DMMG object for a linear problem. The DA I
create for the coarse grid is DA_XPERIODIC and 2D. I let PETSc decide
the number of processors in each direction.

I have a problem which depends on the number of processors I use. If I
want a 1024x1025 grid with 4 levels, the coarsest grid will be 128x129
with the default refinement factor.

Everything seems to work fine when I run with 1, 2, 3, 4, 6, 8...
processors, but 5, 7, 11 procs fail with the following error:

nicolas at lx-nau:~/code/tests/petsc/dmmg$ mpiexec -n 5 ./inertia6 -ksp_type gmres -pc_type sor -ksp_rtol 1e-9

size of the coarse level : 128 129
debug 1, number of proc in each dir. : 1 5

[3]PETSC ERROR: --------------------- Error Message ------------------------------------
[3]PETSC ERROR: Arguments are incompatible!
[3]PETSC ERROR: Processor's coarse DA must lie over fine DA
  j_start 615 j_c 307 j_start_ghost_c 308!
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul  6 11:33:34 CDT 2009
[3]PETSC ERROR: See docs/changes/index.html for recent updates.
[3]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[3]PETSC ERROR: See docs/index.html for manual pages.
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: ./inertia6 on a linux-gnu named lx-nau by nicolas Fri Sep  4 14:39:27 2009
[3]PETSC ERROR: Libraries linked from /home/nicolas/bin/petsc-3.0.0-p7/linux-gnu-debug/lib
[3]PETSC ERROR: Configure run at Mon Aug 17 11:34:27 2009
[3]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 PETSC_ARCH=linux-gnu-debug --with-shared=0
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: DAGetInterpolation_2D_Q1() line 280 in src/dm/da/src/dainterp.c
[3]PETSC ERROR: DAGetInterpolation() line 879 in src/dm/da/src/dainterp.c
[3]PETSC ERROR: DMGetInterpolation() line 144 in src/dm/da/utils/dm.c
[3]PETSC ERROR: DMMGSetDM() line 309 in src/snes/utils/damg.c

It seems that the error comes from the routine DMMGSetDM, when PETSc
builds the restriction/interpolation operators... but I can't figure
out why I couldn't use 5 processors with this grid size.

If the code is needed, it is here: http://nico.aunai.free.fr/inertia6.c

Thx for your help

Nico
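For readers following along, the setup in question is roughly the
following sketch (3.0-era DMMG interface; ComputeRHS/ComputeMatrix are
the user's callbacks, with signatures assumed from the ksp tutorials of
that release). With the default refinement factor of 2, the 128x129
coarse DA and 4 levels yield the 1024x1025 fine grid: the periodic
dimension doubles each level, the non-periodic one goes 129 -> 257 ->
513 -> 1025.

    #include "petscdmmg.h"

    DMMG           *dmmg;
    DA             da;
    PetscErrorCode ierr;

    ierr = DMMGCreate(PETSC_COMM_WORLD, 4 /* levels */, PETSC_NULL, &dmmg);CHKERRQ(ierr);
    ierr = DACreate2d(PETSC_COMM_WORLD, DA_XPERIODIC, DA_STENCIL_STAR,
                      128, 129, PETSC_DECIDE, PETSC_DECIDE, 1, 1,
                      PETSC_NULL, PETSC_NULL, &da);CHKERRQ(ierr);
    ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); /* interpolation built here */
    ierr = DADestroy(da);CHKERRQ(ierr);
    ierr = DMMGSetKSP(dmmg, ComputeRHS, ComputeMatrix);CHKERRQ(ierr);
    ierr = DMMGSolve(dmmg);CHKERRQ(ierr);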
From C.Klaij at marin.nl  Fri Sep  4 10:35:40 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Fri, 4 Sep 2009 17:35:40 +0200
Subject: dummy in MyKSPConverged
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F7C4@MAR150CV1.marin.local>

Hello,

With the new version 3.0.0-p8, if I assign a value to the dummy in
MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F, I
get a segmentation fault (this didn't happen with version 2.3.3-p13).
Any idea why?

$ tail ex2f.F
      else
        flag = 0
      endif
      ierr = 0
      dummy = 0

      end
$ ./ex2f -my_ksp_convergence
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function is given.
[0]PETSC ERROR: [0] GMREScycle line 133 src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: [0] KSPSolve_GMRES line 228 src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: [0] KSPSolve line 308 src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------

dr. ir. Christiaan Klaij
CFD Researcher, Research & Development
MARIN, 2 Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
T +31 317 49 33 44, mailto:C.Klaij at marin.nl
From knepley at gmail.com  Fri Sep  4 10:49:25 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 4 Sep 2009 10:49:25 -0500
Subject: dummy in MyKSPConverged
In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F7C4@MAR150CV1.marin.local>
References: <5D9143EF9FADE942BEF6F2A636A861170800F7C4@MAR150CV1.marin.local>
Message-ID:

On Fri, Sep 4, 2009 at 10:35 AM, Klaij, Christiaan wrote:
> With the new version 3.0.0-p8, if I assign a value to the dummy in
> MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F, I
> get a segmentation fault (this didn't happen with version 2.3.3-p13).
> Any idea why?

1) I do not get a SEGV

2) You should not be setting this at all. You pass in PETSC_NULL_OBJECT,
which means you are changing the definition of that basic thing in
PETSc. This can produce undefined results.

  Matt
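For comparison, the C form of a custom convergence test makes the role
of that last argument explicit: it is the opaque context pointer handed
to KSPSetConvergenceTest() (PETSC_NULL here, so the test must never
write through it). A sketch against the 3.0 interface, which also takes
a context-destroy routine; the stopping criterion is illustrative only:

    #include "petscksp.h"

    /* Called each iteration: n is the iteration number, rnorm the
       residual norm, ctx whatever was passed to KSPSetConvergenceTest
       -- PETSC_NULL below, so treat it as read-only. */
    PetscErrorCode MyKSPConverged(KSP ksp, PetscInt n, PetscReal rnorm,
                                  KSPConvergedReason *reason, void *ctx)
    {
      if (rnorm < 1.e-10) *reason = KSP_CONVERGED_RTOL;
      else                *reason = KSP_CONVERGED_ITERATING;
      return 0;
    }

    /* registration: no context, no destroy routine */
    ierr = KSPSetConvergenceTest(ksp, MyKSPConverged, PETSC_NULL,
                                 PETSC_NULL);CHKERRQ(ierr);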
From bsmith at mcs.anl.gov  Fri Sep  4 13:30:25 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 4 Sep 2009 13:30:25 -0500
Subject: bug ? number of processor, coarse and fine DA for DMMG
Message-ID:

This is a limitation in the way the interpolation is computed. It is
not really an error. There is no easy way to extend the code for these
other configurations.

  Barry

On Sep 4, 2009, at 7:39 AM, nicolas aunai wrote:
> Everything seems to work fine when I run with 1, 2, 3, 4, 6, 8...
> processors, but 5, 7, 11 procs fail with the following error:
>
> [3]PETSC ERROR: Processor's coarse DA must lie over fine DA
>   j_start 615 j_c 307 j_start_ghost_c 308!
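The numbers in the error message can be reproduced with back-of-envelope
arithmetic, assuming PETSc's default 1-D ownership split (the first
M % size ranks get one extra row). On 5 processes the 1025-row fine grid
splits evenly at 205 rows each, but the 513-row next-coarser grid does
not: rank 3's first fine row is 615, whose coarse parent 615/2 = 307
falls just below rank 3's coarse ghost region, which starts at 308 --
exactly the j_start 615, j_c 307, j_start_ghost_c 308 reported above.

    #include <stdio.h>

    /* Ownership starts for the fine (1025) and coarse (513) grids on 5 ranks. */
    int main(void)
    {
      int size = 5, Mf = 1025, Mc = 513, rank, fs = 0, cs = 0;

      for (rank = 0; rank < size; rank++) {
        int fn = Mf / size + (rank < Mf % size); /* fine rows here   */
        int cn = Mc / size + (rank < Mc % size); /* coarse rows here */
        printf("rank %d: fine start %4d -> needs coarse row %3d; "
               "coarse ghosts start at %3d\n",
               rank, fs, fs / 2, cs > 0 ? cs - 1 : 0);
        fs += fn; cs += cn;
      }
      return 0; /* rank 3 needs coarse row 307, ghosts start at 308 */
    }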
From Hung.V.Nguyen at usace.army.mil  Fri Sep  4 14:25:02 2009
From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS)
Date: Fri, 4 Sep 2009 14:25:02 -0500
Subject: Problem with saving the binary matrix via Matview
Message-ID:

Hello,

I have a problem with saving a matrix via the binary viewer
PetscViewerBinaryOpen() for a matrix with nrows=853564,
nnzeros=47191472. The application ran with 32 PEs for more than 3 hours
without writing anything into the file. However, the same executable,
run on a small case (nrows=12856, nnzeros=675744), was able to write
the binary files successfully.

Note: using Totalview I found that it hangs in the line
MatView(matrix->petsc, fd);

Thank you,

-hung

--- code:
   petsc_analyst_mat(matrix->petsc);
   sprintf(file[LOAD_MATRIX],"Matrix.at%f",t_prev);
   PetscViewerBinaryOpen(PETSC_COMM_WORLD, file[LOAD_MATRIX],FILE_MODE_WRITE,&fd);
   MatView(matrix->petsc, fd);
   PetscViewerDestroy(fd);

---- Matrix info using the salsa/AnaMod module:

   Computed as <9.362113e+03>
   Computed as <9.362623e+03>
   Computed as <2.800000e+01>
   Computed as <2.800000e+01>
   Computed as <2.404511e+02>
   Computed as <-3.064463e-02>
   Could not compute
   Could not compute
   Could not compute
   Could not compute
   Could not compute
   Computed as <853564>
   Could not compute
   Computed as <47191472>
   Computed as <112>
   Computed as <16>

From knepley at gmail.com  Fri Sep  4 14:31:48 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 4 Sep 2009 14:31:48 -0500
Subject: Problem with saving the binary matrix via Matview
Message-ID:

On Fri, Sep 4, 2009 at 2:25 PM, Nguyen, Hung V ERDC-ITL-MS wrote:
> I have a problem with saving a matrix via the binary viewer
> PetscViewerBinaryOpen() for a matrix with nrows=853564,
> nnzeros=47191472.

Much bigger matrices than this are routinely saved. I do not think this
has to do with MatView() in particular, unless this is a very slow
disk. You can check with the debugger where the code is currently
occupied. This sounds to me more like a deadlock (not calling the
routine with every process).

  Matt
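The collective-call point is worth making concrete: with a viewer on
PETSC_COMM_WORLD, every rank must reach all three calls, or the ranks
that did call MatView() wait forever. A sketch of the same save path
(A stands for the application's Mat, and t_prev is from the fragment
above; PetscSNPrintf is used instead of plain sprintf):

    PetscViewer    fd;
    char           fname[PETSC_MAX_PATH_LEN];
    PetscErrorCode ierr;

    /* every process in PETSC_COMM_WORLD must execute this block */
    ierr = PetscSNPrintf(fname, sizeof(fname), "Matrix.at%f", t_prev);CHKERRQ(ierr);
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, fname, FILE_MODE_WRITE, &fd);CHKERRQ(ierr);
    ierr = MatView(A, fd);CHKERRQ(ierr);         /* collective on A's communicator */
    ierr = PetscViewerDestroy(fd);CHKERRQ(ierr); /* flushes and closes the file    */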
From bsmith at mcs.anl.gov  Fri Sep  4 15:23:33 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 4 Sep 2009 15:23:33 -0500
Subject: Problem with saving the binary matrix via Matview
Message-ID: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov>

Is this a Cray? You may need to set the environment variables
MPICH_UNEX_BUFFER_SIZE and/or MPICH_PTL_MATCH_OFF and/or
MPICH_PTL_OTHER_EVENTS and/or MPICH_MSGS_PER_PROC and/or
MPICH_PTL_SEND_CREDITS; you will likely need to hunt through the Cray
documentation to find the meaning of all this stuff. The problems will
be worse if you don't have the latest Cray software on the system.

Note we generally frown upon saving huge matrices to disk except for
debugging/testing purposes. But as Matt notes this is not a
particularly huge matrix. With any reasonable configuration it should
take very little time to write the file.

Good luck,

  Barry

On Sep 4, 2009, at 2:25 PM, Nguyen, Hung V ERDC-ITL-MS wrote:
> Note: using Totalview I found that it hangs in the line
> MatView(matrix->petsc, fd);

From Hung.V.Nguyen at usace.army.mil  Fri Sep  4 15:59:20 2009
From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS)
Date: Fri, 4 Sep 2009 15:59:20 -0500
Subject: Problem with saving the binary matrix via Matview
In-Reply-To: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov>
References: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov>
Message-ID:

Hello,

Yes, it is a Cray XT4. I just reran with the variables below set; I
will let you know if it helps.

hvnguyen:jade20% setenv MPICH_PTL_SEND_CREDITS -1
hvnguyen:jade20% setenv MPICH_MAX_SHORT_MSG_SIZE 64000
hvnguyen:jade20% setenv MPICH_UNEX_BUFFER_SIZE 240M
hvnguyen:jade20% setenv MPICH_PTL_UNEX_EVENTS 60000

> The problems will be worse if you don't have the latest Cray software
> on the system.

What do you mean exactly by the latest Cray software?

> Note we generally frown upon saving huge matrices to disk except for
> debugging/testing purposes.

We need to dump out some matrices that require a large number of
iterations for testing. As for the small application case (nrows=12856,
nnzeros=675744), it took a minute to write the binary matrix to a file
on the Cray XT4 system, so I don't know why there is a deadlock in this
case.

Thank you.

-hung
From sapphire.jxy at gmail.com  Sun Sep  6 20:39:03 2009
From: sapphire.jxy at gmail.com (xiaoyin ji)
Date: Sun, 6 Sep 2009 21:39:03 -0400
Subject: PETSc is slowing down in C++? again
Message-ID: <6985a8f00909061839s3aa75fk387cdd300a196031@mail.gmail.com>

Hi,

I cannot pass running options like -ksp_monitor to my program, as it's
not a simple PETSc code. However, after testing the KSP examples I've
found a similar problem.

Here is the code I'm testing: src/ksp/ksp/examples/tutorials/ex2.c

What I've done is add a loop of 10000 steps between MatCreate and
MatDestroy (right before PetscFinalize), and print the time for each
loop. The time increases exponentially, just like in my program.
Moreover, if I narrow the loop so that only the KSP create and destroy
are included, the solving time does not change. The -ksp_monitor option
shows that the KSP loop is running fine; however, I cannot use this
option together with the time test, as the printout changes the loop
time significantly.

It seems to me that either MatDestroy or VecDestroy does not clean
everything up in C++ codes (in Fortran codes they work well). Besides,
instead of directly calling PETSc functions, I've also created a class
which contains the PETSc Mat and KSP utilities, and I create/destroy an
object of this class for each loop. The problem still exists.

Best,
Xiaoyin Ji

On Thu, Sep 3, 2009 at 8:34 AM, xiaoyin ji wrote:
> Here are the printouts. For the very beginning, average time is about
> 0.8 sec for the KSP solver; after 4000 steps, the solver takes 7.5 sec.
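The test described reduces to a harness like this sketch (PetscGetTime()
is the 3.0-era timer; the ex2.c create/assemble/destroy details are
elided behind comments):

    PetscLogDouble t0, t1;
    PetscErrorCode ierr;
    int            step;

    for (step = 0; step < 10000; step++) {
      ierr = PetscGetTime(&t0);CHKERRQ(ierr);
      /* MatCreate ... MatSetValues ... MatAssemblyBegin/End        */
      /* VecCreate ... KSPCreate ... KSPSetOperators ...            */
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      /* KSPDestroy(ksp); MatDestroy(A); VecDestroy(x); VecDestroy(b); */
      ierr = PetscGetTime(&t1);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "step %d: %g s\n",
                         step, t1 - t0);CHKERRQ(ierr);
    }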
From knepley at gmail.com  Sun Sep  6 21:07:29 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 6 Sep 2009 21:07:29 -0500
Subject: PETSc is slowing down in C++? again
In-Reply-To: <6985a8f00909061839s3aa75fk387cdd300a196031@mail.gmail.com>
References: <6985a8f00909061839s3aa75fk387cdd300a196031@mail.gmail.com>
Message-ID:

On Sun, Sep 6, 2009 at 8:39 PM, xiaoyin ji wrote:
> It seems to me that either MatDestroy or VecDestroy does not clean
> everything up in C++ codes (in Fortran codes they work well).

1) I cannot reproduce this bug. However, the description is not that
clear.

2) The Fortran and C++ interfaces are just wrappers. They do not handle
memory allocation or calculation.

3) This must be something specific to your computer.

  Matt
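One cheap way to test the earlier leak suspicion is to watch resident
memory per step: if a Mat, Vec, or KSP is not actually being destroyed
each pass, this number grows steadily. A sketch to drop into the step
loop, paired with -malloc_dump at PetscFinalize:

    PetscLogDouble rss;
    PetscErrorCode ierr;

    /* inside the step loop, after the destroys */
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "step %d: resident %g bytes\n",
                       step, rss);CHKERRQ(ierr);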
From nicolas.aunai at gmail.com  Mon Sep  7 01:33:07 2009
From: nicolas.aunai at gmail.com (nicolas aunai)
Date: Mon, 7 Sep 2009 08:33:07 +0200
Subject: bug ? number of processor, coarse and fine DA for DMMG
Message-ID:

Hello, and thanks for your answer. Could you tell me what exactly the
limitation is, in terms of number of processors and grid size?

Nico

2009/9/4 Barry Smith:
> This is a limitation in the way the interpolation is computed. It is
> not really an error. There is no easy way to extend the code for
> these other configurations.
>
>   Barry
From C.Klaij at marin.nl  Mon Sep  7 02:01:58 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Mon, 7 Sep 2009 09:01:58 +0200
Subject: dummy in MyKSPConverged
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local>

Matt,

Thanks for your answer!

dummy is defined as a PETSc integer inside MyKSPConverged, so why can't
I set it to 0? I know it's not useful because it's just a dummy and
nothing happens with it, but setting it shouldn't crash the code,
right? Also, if setting it changes the basic definition of
PETSC_NULL_OBJECT, shouldn't you get a SEGV as well, or at least some
kind of trouble?

Chris

> 1) I do not get a SEGV
>
> 2) You should not be setting this at all. You pass in
> PETSC_NULL_OBJECT, which means you are changing the definition of
> that basic thing in PETSc. This can produce undefined results.
>
>   Matt

dr. ir. Christiaan Klaij
CFD Researcher, Research & Development
MARIN, 2 Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
From knepley at gmail.com  Mon Sep  7 07:03:01 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 7 Sep 2009 07:03:01 -0500
Subject: dummy in MyKSPConverged
In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local>
References: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local>
Message-ID:

dummy is something *you passed in* earlier in the code, and you passed
in something that should not change.
>> [3]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [3]PETSC ERROR: ./inertia6 on a linux-gnu named lx-nau by nicolas Fri
>> Sep 4 14:39:27 2009
>> [3]PETSC ERROR: Libraries linked from
>> /home/nicolas/bin/petsc-3.0.0-p7/linux-gnu-debug/lib
>> [3]PETSC ERROR: Configure run at Mon Aug 17 11:34:27 2009
>> [3]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran
>> --download-f-blas-lapack=1 --download-mpich=1
>> PETSC_ARCH=linux-gnu-debug --with-shared=0
>> [3]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [3]PETSC ERROR: DAGetInterpolation_2D_Q1() line 280 in
>> src/dm/da/src/dainterp.c
>> [3]PETSC ERROR: DAGetInterpolation() line 879 in src/dm/da/src/dainterp.c
>> [3]PETSC ERROR: DMGetInterpolation() line 144 in src/dm/da/utils/dm.c
>> [3]PETSC ERROR: DMMGSetDM() line 309 in src/snes/utils/damg.c
>>
>> It seems that the error comes from the routine DMMGSetDM, when petsc
>> builds the restriction/interpolation operators... but I can't figure
>> out why I couldn't use 5 processors with this grid size?
>>
>> If the code is needed, it is here: http://nico.aunai.free.fr/inertia6.c
>>
>> Thx for your help
>>
>> Nico
>
>

From C.Klaij at marin.nl Mon Sep 7 02:01:58 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Mon, 7 Sep 2009 09:01:58 +0200
Subject: dummy in MyKSPConverged
References: 
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local>

Matt,

Thanks for your answer!

dummy is defined as a PETSc integer inside MyKSPConverged, so why can't
I set it to 0? I know it's not useful because it's just a dummy and
nothing happens with it, but setting it shouldn't crash the code,
right? Also, if setting it changes the basic definition of
PETSC_NULL_OBJECT, shouldn't you get a SEGV as well, or at least some
kind of trouble?

Chris

Message: 5
Date: Fri, 4 Sep 2009 10:49:25 -0500
From: Matthew Knepley 
Subject: Re: dummy in MyKSPConverged
To: PETSc users list 
Message-ID: 
Content-Type: text/plain; charset="iso-8859-1"

On Fri, Sep 4, 2009 at 10:35 AM, Klaij, Christiaan wrote:

> Hello,
>
> With the new version 3.0.0-p8, if I assign a value to the dummy in
> MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F,
> I get a segmentation fault (this didn't happen with version
> 2.3.3-p13). Any idea why?
>

1) I do not get a SEGV

2) You should not be setting this at all. You pass in PETSC_NULL_OBJECT,
which means you are changing the definition of that
basic thing in PETSc. This can produce undefined results.

Matt

> $ tail ex2f.F
>       else
>         flag = 0
>       endif
>       ierr = 0
>       dummy = 0
>
>       end
> $ ./ex2f -my_ksp_convergence
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
> [0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc
> on Apple to find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below
> [0]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> [0]PETSC ERROR:       INSTEAD the line number of the start of the
> function
> [0]PETSC ERROR:       is given.
> [0]PETSC ERROR: [0] GMREScycle line 133 src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: [0] KSPSolve_GMRES line 228 src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: [0] KSPSolve line 308 src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > CDT 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ dr. ir. Christiaan Klaij CFD Researcher Research & Development mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ http://www.marin.nl/web/News/News-items/RD-seminar-on-September-22-few-places-left.htm R&D seminar on September 22: few places left This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 4396 bytes Desc: not available URL: From knepley at gmail.com Mon Sep 7 07:03:01 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 7 Sep 2009 07:03:01 -0500 Subject: dummy in MyKSPConverged In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F7C8@MAR150CV1.marin.local> Message-ID: On Mon, Sep 7, 2009 at 2:01 AM, Klaij, Christiaan wrote: > Matt, > > Thanks for your answer! > > dummy is defined as a PETSc integer inside MyKSPConverged, so why can't I > set it to 0? I know it's not useful because it's just a dummy and nothing > happens with it, but setting it shouldn't crash the code, right? Also, if > setting it changes the basic definition of PETSC_NULL_OBJECT shouldn't you > get a SEGV as well, or at least some kind of trouble? > dummy is something *you passed in* earlier in the code, and you passed in something that should not change. I do not know why you are getting an SEGV (I do not get one), but I do know you should not change this for the code to function correctly. Matt > Chris > > > > Message: 5 > Date: Fri, 4 Sep 2009 10:49:25 -0500 > From: Matthew Knepley > Subject: Re: dummy in MyKSPConverged > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Fri, Sep 4, 2009 at 10:35 AM, Klaij, Christiaan > wrote: > > > Hello, > > > > With the new version 3.0.0-p8, if I assign a value to the dummy in > > MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F, I get > a > > segmentation fault (this didn't happen with version 2.3.3-p13). Any idea > > why? > > > > 1) I do not get a SEGV > > 2) You should not be setting this at all. You pass in PETSC_NULL_OBJECT, > which means you are changing the definition of that > basic thing in PETSc. This can produce undefined results. 
> > Matt > > > > $ tail ex2f.F > > else > > flag = 0 > > endif > > ierr = 0 > > dummy = 0 > > > > end > > $ ./ex2f -my_ksp_convergence > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to > > find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: [0] GMREScycle line 133 src/ksp/ksp/impls/gmres/gmres.c > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 228 > src/ksp/ksp/impls/gmres/gmres.c > > [0]PETSC ERROR: [0] KSPSolve line 308 src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > > CDT 2009 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ > > http://www.marin.nl/web/News/News-items/RD-seminar-on-September-22-few-places-left.htmR&D seminar on September 22: few places left > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sapphire.jxy at gmail.com Mon Sep 7 11:26:32 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Mon, 7 Sep 2009 12:26:32 -0400 Subject: petsc-users Digest, Vol 9, Issue 6 In-Reply-To: References: Message-ID: <6985a8f00909070926k647e23d3m43cee4111a6961aa@mail.gmail.com> Thanks Matt, very useful hint, I'll contact the administrator. Xiaoyin Ji >> > > 1) I cannot reproduce this bug. However, the description is not that clear. > > 2) The Fortran and C++ interfaces are just wrappers. They do not handle > ? ? memory allocation or calculation. > > 3) This must be something specific to your computer. > > ?Matt > > From bsmith at mcs.anl.gov Mon Sep 7 11:38:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 7 Sep 2009 11:38:20 -0500 Subject: PETSc is slowing down in C++? 
again
In-Reply-To: <6985a8f00909061839s3aa75fk387cdd300a196031@mail.gmail.com>
References: <6985a8f00909061839s3aa75fk387cdd300a196031@mail.gmail.com>
Message-ID: 

   There must be something in your C++ classes that is not being freed
or your class destructors are not calling the appropriate MatDestroy()s
and KSPDestroy()s.

   If you run a few cycles with -malloc_dump it will indicate what
memory (and where it was created) was not destroyed and you can track
down your problem.

   Barry

On Sep 6, 2009, at 8:39 PM, xiaoyin ji wrote:

> Hi,
>
> I cannot include running options like -ksp_monitor in my program as
> it's not a simple petsc code. However, after testing the ksp examples
> I've found a similar problem.
>
> Here is the code I'm testing: src/ksp/ksp/examples/tutorials/ex2.c
>
> What I've done is add a loop of 10000 steps between MatCreate and
> MatDestroy (right before PetscFinalize), and print the time for each
> loop. The time will increase exponentially just like in my program.
> Moreover, if I narrow the loop so that only the KSP create and destroy
> are included, solving time does not change. The -ksp_monitor option
> shows that the ksp loop is running fine; however, I cannot use this
> option with the time test as the printout will change the loop time
> significantly.
>
> It seems to me that either MatDestroy or VecDestroy does not clear
> everything well in C++ codes (in Fortran codes they work well).
> Besides, instead of directly calling petsc functions, I've also
> created a class which contains petsc mat and ksp utilities, and
> create/destroy the object of this class for each loop. However, the
> problem still exists.
>
> Best,
> Xiaoyin Ji
>
> On Thu, Sep 3, 2009 at 8:34 AM, xiaoyin ji wrote:
>> Hi,
>>
>> Here are the printouts
>>
>> for the very beginning, average time is about 0.8sec for the ksp
>> solver
>> KSP Object:
>>   type: bicg
>>   maximum iterations=10000, initial guess is zero
>>   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
>>   left preconditioning
>> PC Object:
>>   type: bjacobi
>>     block Jacobi: number of blocks
= 16 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(sub_) >> type: ilu >> ILU: 0 levels of fill >> ILU: factor fill ratio allocated 1 >> ILU: tolerance for zero pivot 1e-12 >> ILU: using diagonal shift to prevent zero pivot >> ILU: using diagonal shift on blocks to prevent zero pivot >> out-of-place factorization >> matrix ordering: natural >> ILU: factor fill ratio needed 1 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=5672, cols=5672 >> package used to perform factorization: petsc >> total: nonzeros=39090, allocated nonzeros=39704 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=5672, cols=5672 >> total: nonzeros=39090, allocated nonzeros=39704 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=90746, cols=90746 >> total: nonzeros=636378, allocated nonzeros=1279114 >> not using I-node (on process 0) routines >> Norm of error 48.7467, Iterations 132 >> >> >> The iterations are similar, solving time is actually increasing >> exponentially, and the matrix should not be too complicated here as >> the PETSc in Fortran solved this in 1sec. >> >> By the way, will there be a way to set a PETSc vector directly into a >> preconditioner for the ksp solver? >> >> Thanks! >> >> Best, >> Xiaoyin Ji >> >> Department of Materials Science and Engineering >> North Carolina State University >> From bsmith at mcs.anl.gov Mon Sep 7 11:38:50 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 7 Sep 2009 11:38:50 -0500 Subject: bug ? number of processor, coarse and fine DA for DMMG In-Reply-To: References: Message-ID: There is no simple formula. Barry On Sep 7, 2009, at 1:33 AM, nicolas aunai wrote: > Hello and thanks for your answer, could you tell me what exactly is > the limitation, in terms of number of processors and grid size ? > > > Nico > > > > 2009/9/4 Barry Smith : >> >> This is a limitation in the way the interpolation is computed. It >> is not >> really an error. There is no easy way to extend the code for these >> other >> configurations. >> >> >> Barry >> >> On Sep 4, 2009, at 7:39 AM, nicolas aunai wrote: >> >>> Hello, >>> >>> >>> I have a code that uses the DMMG objet for a linear problem. The >>> DA I >>> create for the coarse grid is DA_XPERIODIC and 2D. I let petsc >>> decide >>> for the number of processors in each direction. >>> >>> I have a problem which depends on the number of processors I use. >>> If I want a 1024x1025 grid with 4 levels, the coarsest grid will be >>> 128x129 with the default refinement factor. >>> >>> Everything seems to work fine when I run with 1, 2, 3, 4, 6, 8... >>> processors... but 5, 7, 11 proc fail with the following error : >>> >>> >>> nicolas at lx-nau:~/code/tests/petsc/dmmg$ mpiexec -n 5 ./inertia6 >>> -ksp_type gmres -pc_type sor -ksp_rtol 1e-9 >>> >>> size of the coarse level : 128 129 >>> debug 1, number of proc in each dir. : 1 5 >>> >>> [3]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [3]PETSC ERROR: Arguments are incompatible! >>> [3]PETSC ERROR: Processor's coarse DA must lie over fine DA >>> j_start 615 j_c 307 j_start_ghost_c 308! 
>>> [3]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [3]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul 6 >>> 11:33:34 CDT 2009 >>> [3]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [3]PETSC ERROR: See docs/index.html for manual pages. >>> [3]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [3]PETSC ERROR: ./inertia6 on a linux-gnu named lx-nau by nicolas >>> Fri >>> Sep 4 14:39:27 2009 >>> [3]PETSC ERROR: Libraries linked from >>> /home/nicolas/bin/petsc-3.0.0-p7/linux-gnu-debug/lib >>> [3]PETSC ERROR: Configure run at Mon Aug 17 11:34:27 2009 >>> [3]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran >>> --download-f-blas-lapack=1 --download-mpich=1 >>> PETSC_ARCH=linux-gnu-debug --with-shared=0 >>> [3]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [3]PETSC ERROR: DAGetInterpolation_2D_Q1() line 280 in >>> src/dm/da/src/dainterp.c >>> [3]PETSC ERROR: DAGetInterpolation() line 879 in src/dm/da/src/ >>> dainterp.c >>> [3]PETSC ERROR: DMGetInterpolation() line 144 in src/dm/da/utils/ >>> dm.c >>> [3]PETSC ERROR: DMMGSetDM() line 309 in src/snes/utils/damg.c >>> >>> >>> >>> it seems that the error come from the routine DMMGSetDM, when petsc >>> build the restriction/interpolation operators... but I can't figure >>> out why I couln't use 5 processors with this grid size ? >>> >>> if the code is needed, it is here : http://nico.aunai.free.fr/inertia6.c >>> >>> >>> Thx for your help >>> >>> Nico >> >> From sapphire.jxy at gmail.com Mon Sep 7 11:41:16 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Mon, 7 Sep 2009 12:41:16 -0400 Subject: petsc-users Digest, Vol 9, Issue 6 In-Reply-To: <6985a8f00909070926k647e23d3m43cee4111a6961aa@mail.gmail.com> References: <6985a8f00909070926k647e23d3m43cee4111a6961aa@mail.gmail.com> Message-ID: <6985a8f00909070941q4e86bd35mb5541b7a95a13d10@mail.gmail.com> Sorry for the description, here is the example code. Xiaoyin Ji On Mon, Sep 7, 2009 at 12:26 PM, xiaoyin ji wrote: > Thanks Matt, very useful hint, I'll contact the administrator. > > Xiaoyin Ji > >>> >> >> 1) I cannot reproduce this bug. However, the description is not that clear. >> >> 2) The Fortran and C++ interfaces are just wrappers. They do not handle >> ? ? memory allocation or calculation. >> >> 3) This must be something specific to your computer. >> >> ?Matt >> >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2.c Type: application/octet-stream Size: 10125 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Sep 7 16:13:30 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 7 Sep 2009 16:13:30 -0500 Subject: petsc-users Digest, Vol 9, Issue 6 In-Reply-To: <6985a8f00909070941q4e86bd35mb5541b7a95a13d10@mail.gmail.com> References: <6985a8f00909070926k647e23d3m43cee4111a6961aa@mail.gmail.com> <6985a8f00909070941q4e86bd35mb5541b7a95a13d10@mail.gmail.com> Message-ID: <1A63CA6F-75FC-438A-BA1E-B4BA7A250B0D@mcs.anl.gov> Move the PetscLogStageRegister() call out of the loop. Each time it is called it allocates a little memory to keep information about that stage. Millions of iterations later, it runs out of memory. We will change the code in the future so it generates an error if the same stage is registered more than once to prevent this. 
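A minimal C sketch of that fix: register the stage once before the loop,
then just push/pop it each iteration (hedged: i, nsteps, and the stage
name are placeholders, the usual ierr/CHKERRQ error handling is assumed,
and the argument order of PetscLogStageRegister() has differed between
PETSc releases, so check the prototype for your version):

  PetscLogStage stage;

  /* Register ONCE, outside the loop: each registration allocates a
     little bookkeeping memory that is freed only at PetscFinalize(). */
  ierr = PetscLogStageRegister("MD step", &stage);CHKERRQ(ierr);
  for (i = 0; i < nsteps; i++) {
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    /* ... MatCreate(), assembly, KSPSolve(), MatDestroy() ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
  }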
Barry On Sep 7, 2009, at 11:41 AM, xiaoyin ji wrote: > Sorry for the description, here is the example code. > > Xiaoyin Ji > > On Mon, Sep 7, 2009 at 12:26 PM, xiaoyin ji > wrote: >> Thanks Matt, very useful hint, I'll contact the administrator. >> >> Xiaoyin Ji >> >>>> >>> >>> 1) I cannot reproduce this bug. However, the description is not >>> that clear. >>> >>> 2) The Fortran and C++ interfaces are just wrappers. They do not >>> handle >>> memory allocation or calculation. >>> >>> 3) This must be something specific to your computer. >>> >>> Matt >>> >>> >> > From C.Klaij at marin.nl Tue Sep 8 01:55:36 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 8 Sep 2009 08:55:36 +0200 Subject: dummy in MyKSPConverged References: Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F7CD@MAR150CV1.marin.local> Matt, Thanks for your explanation. I guess we are using different compilers (I'm using the intel 10.1) which may explain why I get a SEGV and you don't. I didn't change anything else in src/ksp/ksp/examples/tutorials/ex2f.F. To help me understand, could you please tell me where exactly the dummy is passed in? Chris ---------------------------------------------------------------------- Message: 1 Date: Mon, 7 Sep 2009 07:03:01 -0500 From: Matthew Knepley Subject: Re: dummy in MyKSPConverged To: PETSc users list Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Mon, Sep 7, 2009 at 2:01 AM, Klaij, Christiaan wrote: > Matt, > > Thanks for your answer! > > dummy is defined as a PETSc integer inside MyKSPConverged, so why can't I > set it to 0? I know it's not useful because it's just a dummy and nothing > happens with it, but setting it shouldn't crash the code, right? Also, if > setting it changes the basic definition of PETSC_NULL_OBJECT shouldn't you > get a SEGV as well, or at least some kind of trouble? > dummy is something *you passed in* earlier in the code, and you passed in something that should not change. I do not know why you are getting an SEGV (I do not get one), but I do know you should not change this for the code to function correctly. Matt > Chris > > > > Message: 5 > Date: Fri, 4 Sep 2009 10:49:25 -0500 > From: Matthew Knepley > Subject: Re: dummy in MyKSPConverged > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Fri, Sep 4, 2009 at 10:35 AM, Klaij, Christiaan > wrote: > > > Hello, > > > > With the new version 3.0.0-p8, if I assign a value to the dummy in > > MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F, I get > a > > segmentation fault (this didn't happen with version 2.3.3-p13). Any idea > > why? > > > > 1) I do not get a SEGV > > 2) You should not be setting this at all. You pass in PETSC_NULL_OBJECT, > which means you are changing the definition of that > basic thing in PETSc. This can produce undefined results. 
> > Matt > > > > $ tail ex2f.F > > else > > flag = 0 > > endif > > ierr = 0 > > dummy = 0 > > > > end > > $ ./ex2f -my_ksp_convergence > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to > > find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: [0] GMREScycle line 133 src/ksp/ksp/impls/gmres/gmres.c > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 228 > src/ksp/ksp/impls/gmres/gmres.c > > [0]PETSC ERROR: [0] KSPSolve line 308 src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > > CDT 2009 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ > > http://www.marin.nl/web/News/News-items/RD-seminar-on-September-22-few-places-left.htmR&D seminar on September 22: few places left > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 4940 bytes Desc: not available URL: From knepley at gmail.com Tue Sep 8 07:11:37 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Sep 2009 07:11:37 -0500 Subject: dummy in MyKSPConverged In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F7CD@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F7CD@MAR150CV1.marin.local> Message-ID: On Tue, Sep 8, 2009 at 1:55 AM, Klaij, Christiaan wrote: > Matt, > > Thanks for your explanation. I guess we are using different compilers (I'm > using the intel 10.1) which may explain why I get a SEGV and you don't. I > didn't change anything else in src/ksp/ksp/examples/tutorials/ex2f.F. To > help me understand, could you please tell me where exactly the dummy is > passed in? > call KSPSetConvergenceTest(ksp,MyKSPConverged, & & PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr) The PETSC_NULL_OBJECT is "dummy" in the callback. 
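The corollary: the context is only yours to write to if you supplied real
storage for it. A sketch in C of the same callback with an actual context
(hedged: MyCtx and the tolerances are invented for illustration; the
Fortran callback in ex2f.F is analogous; and some PETSc versions add a
fourth destroy-routine argument to KSPSetConvergenceTest(), so check your
version's prototype):

  typedef struct {
    PetscInt calls;              /* ours to modify: we own this memory */
  } MyCtx;

  PetscErrorCode MyKSPConverged(KSP ksp, PetscInt n, PetscReal rnorm,
                                KSPConvergedReason *reason, void *ctx)
  {
    MyCtx *user = (MyCtx*)ctx;

    user->calls++;               /* safe, unlike writing through a NULL dummy */
    if (rnorm < 1.e-7)  *reason = KSP_CONVERGED_RTOL;
    else if (n > 10000) *reason = KSP_DIVERGED_ITS;
    else                *reason = KSP_CONVERGED_ITERATING;
    return 0;
  }

  /* at setup time:
       MyCtx ctx = {0};
       ierr = KSPSetConvergenceTest(ksp, MyKSPConverged, &ctx);CHKERRQ(ierr);
  */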
Matt > Chris > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 7 Sep 2009 07:03:01 -0500 > From: Matthew Knepley > Subject: Re: dummy in MyKSPConverged > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Mon, Sep 7, 2009 at 2:01 AM, Klaij, Christiaan > wrote: > > > Matt, > > > > Thanks for your answer! > > > > dummy is defined as a PETSc integer inside MyKSPConverged, so why can't I > > set it to 0? I know it's not useful because it's just a dummy and nothing > > happens with it, but setting it shouldn't crash the code, right? Also, if > > setting it changes the basic definition of PETSC_NULL_OBJECT shouldn't > you > > get a SEGV as well, or at least some kind of trouble? > > > > dummy is something *you passed in* earlier in the code, and you passed in > something that should not change. I > do not know why you are getting an SEGV (I do not get one), but I do know > you should not change this for the > code to function correctly. > > Matt > > > > Chris > > > > > > > > Message: 5 > > Date: Fri, 4 Sep 2009 10:49:25 -0500 > > From: Matthew Knepley > > Subject: Re: dummy in MyKSPConverged > > To: PETSc users list > > Message-ID: > > > > Content-Type: text/plain; charset="iso-8859-1" > > > > On Fri, Sep 4, 2009 at 10:35 AM, Klaij, Christiaan > > wrote: > > > > > Hello, > > > > > > With the new version 3.0.0-p8, if I assign a value to the dummy in > > > MyKSPConverged at the end of src/ksp/ksp/examples/tutorials/ex2f.F, I > get > > a > > > segmentation fault (this didn't happen with version 2.3.3-p13). Any > idea > > > why? > > > > > > > 1) I do not get a SEGV > > > > 2) You should not be setting this at all. You pass in PETSC_NULL_OBJECT, > > which means you are changing the definition of that > > basic thing in PETSc. This can produce undefined results. > > > > Matt > > > > > > > $ tail ex2f.F > > > else > > > flag = 0 > > > endif > > > ierr = 0 > > > dummy = 0 > > > > > > end > > > $ ./ex2f -my_ksp_convergence > > > [0]PETSC ERROR: > > > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > > > > > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > < > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal%5B0%5DPETSC > > > > > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple > to > > > find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > > > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > > > available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > > function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] GMREScycle line 133 src/ksp/ksp/impls/gmres/gmres.c > > > [0]PETSC ERROR: [0] KSPSolve_GMRES line 228 > > src/ksp/ksp/impls/gmres/gmres.c > > > [0]PETSC ERROR: [0] KSPSolve line 308 src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: --------------------- Error Message > > > ------------------------------------ > > > [0]PETSC ERROR: Signal received! 
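For reference, a self-contained version of the kind of standalone test
Keita describes above (a sketch: the matrix size and tridiagonal stencil
are invented for illustration; the viewer calls are the same ones shown
in his fragment):

  #include "petscmat.h"

  int main(int argc, char **argv)
  {
    Mat            A;
    PetscViewer    view;
    PetscInt       i, rstart, rend, N = 100000;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);
    /* distributed tridiagonal test matrix, preallocated */
    ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                           N, N, 3, PETSC_NULL, 1, PETSC_NULL, &A);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
    for (i = rstart; i < rend; i++) {
      PetscInt    cols[3], ncols = 0;
      PetscScalar vals[3];
      if (i > 0)   { cols[ncols] = i-1; vals[ncols++] = -1.0; }
      cols[ncols] = i; vals[ncols++] = 2.0;
      if (i < N-1) { cols[ncols] = i+1; vals[ncols++] = -1.0; }
      ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

    /* write the matrix in PETSc binary format */
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat",
                                 FILE_MODE_WRITE, &view);CHKERRQ(ierr);
    ierr = MatView(A, view);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(view);CHKERRQ(ierr);

    ierr = MatDestroy(A);CHKERRQ(ierr);
    ierr = PetscFinalize();CHKERRQ(ierr);
    return 0;
  }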
> > > [0]PETSC ERROR: > > > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 > 14:02:12 > > > CDT 2009 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > [0]PETSC ERROR: > > > > ------------------------------------------------------------------------ > > > > dr. ir. Christiaan Klaij > > CFD Researcher > > Research & Development > > mailto:C.Klaij at marin.nl > > T +31 317 49 33 44 > > > > MARIN > > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > > T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ > > > > > http://www.marin.nl/web/News/News-items/RD-seminar-on-September-22-few-places-left.htmR&Dseminar on September 22: few places left > > > > This e-mail may be confidential, privileged and/or protected by > copyright. > > If you are not the intended recipient, you should return it to the sender > > immediately and delete your copy from your system. > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From keita at cray.com Tue Sep 8 10:56:02 2009 From: keita at cray.com (Keita Teranishi) Date: Tue, 8 Sep 2009 10:56:02 -0500 Subject: Problem with saving the binary matrix via Matview In-Reply-To: References: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov> Message-ID: <925346A443D4E340BEB20248BAFCDBDF0C981CB8@CFEVS1-IP.americas.cray.com> Hung, I did not see any problem calling MatView on 8 nodes of XT4 (32 cores). I was able to save a 900Kx900K (40 nonzeros per row) sparse matrix within a few seconds. Thanks, ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_WRITE,&view); CHKERRQ(ierr); ierr = MatView(A,view);CHKERRQ(ierr); ierr = PetscViewerDestroy(view);CHKERRQ(ierr); ================================ ?Keita Teranishi ?Scientific Library Group ?Cray, Inc. ?keita at cray.com ================================ -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Nguyen, Hung V ERDC-ITL-MS Sent: Friday, September 04, 2009 3:59 PM To: PETSc users list Subject: RE: Problem with saving the binary matrix via Matview Hello, Yes, It is a CrayXT4. I just rerun with setting the variables below. I will let you know if it helps. hvnguyen:jade20% setenv MPICH_PTL_SEND_CREDITS -1 hvnguyen:jade20% setenv MPICH_MAX_SHORT_MSG_SIZE 64000 hvnguyen:jade20% setenv MPICH_UNEX_BUFFER_SIZE 240M hvnguyen:jade20% setenv MPICH_PTL_UNEX_EVENTS 60000 >The problems will be worse if you don't have the latest Cray software on the system. What do you mean exactly about the latest Cray software? >Note we generally frown up saving huge matrices to disk except for debugging testing purposes. But as Matt notes this is not a particularly huge matrix. With any reasonable configuration it should take very little time to write the file. We need to dump out some matrices required large number of iterations for testing. As for small application case (nrows-12856,nnzeros=675744), it took a minute to write binary matrix to a file on CrayXT4 system so I don't know why it has a deadlock in this case. Thank you. 
-hung -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, September 04, 2009 3:24 PM To: PETSc users list Subject: Re: Problem with saving the binary matrix via Matview Is this a Cray? You may need to set the environmental variables MPI UNEX BUFFER SIZE and/or MPICH PTL MATCH OFF and/or MPICH PTL OTHER EVENTS and/or MPICH MSGS PER PROC and/or MPICH PTL SEND CREDITS you will likely need to hunt through Cray documentation to find the meaning of all this stuff. The problems will be worse if you don't have the latest Cray software on the system. Note we generally frown up saving huge matrices to disk except for debugging testing purposes. But as Matt notes this is not a particularly huge matrix. With any reasonable configuration it should take very little time to write the file. Good luck, Barry On Sep 4, 2009, at 2:25 PM, Nguyen, Hung V ERDC-ITL-MS wrote: > > Hello, > > I have a problem with saving a matrix with the binary viewer > PetscBinaryViewerOpen() for the matrix with nrow=853564, > nnzeros=47191472. > The application ran with 32 pes for more than 3 hours without writing > any into file. > However, the same executable ran for small size of application > (nrows-12856,nnzeros=675744) and were able to write successfully > binary files. > > Note: using Totalview I found that it hang in the line of > Matview(matrix->petsc, fd); > > Thank you, > > -hung > > --- code: > petsc_analyst_mat(matrix->petsc); > sprintf(file[LOAD_MATRIX],"Matrix.at%f",t_prev); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, > file[LOAD_MATRIX],FILE_MODE_WRITE,&fd); > MatView(matrix->petsc, fd); > PetscViewerDestroy(fd); > > ---- Matrix info using salsa/AnaMod module: > > Computed as <9.362113e+03> Computed > as <9.362623e+03> Computed as <2.800000e+01> Computed > as <2.800000e+01> Computed as > <2.404511e+02> Computed as <-3.064463e-02> > Could not compute Could not compute > Could not compute > Could not compute Could not compute > Computed as <853564> > Could not compute Computed as > <47191472> Computed as <112> Computed > as <16> From Hung.V.Nguyen at usace.army.mil Tue Sep 8 15:21:10 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Tue, 8 Sep 2009 15:21:10 -0500 Subject: Problem with saving the binary matrix via Matview In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF0C981CB8@CFEVS1-IP.americas.cray.com> References: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov> <925346A443D4E340BEB20248BAFCDBDF0C981CB8@CFEVS1-IP.americas.cray.com> Message-ID: Hello Keita, Are you running a simple program to test MatView function on CrayXT4 or a *big* application codes? I have a simple program that reads its portion of matrix and save a matrix with nrows=2124108, nnzeros=43024032, max-nnzeros-per-row=27 on 16 cores. However, I have a problem with MatView function in the ADaptive Hydraulics Modeling (ADH)code with petsc interface for a large matrix application as indicated in previous email. By the way, setting MPI environment variables is not help. Thank you, -hung -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Keita Teranishi Sent: Tuesday, September 08, 2009 10:56 AM To: PETSc users list Subject: RE: Problem with saving the binary matrix via Matview Hung, I did not see any problem calling MatView on 8 nodes of XT4 (32 cores). I was able to save a 900Kx900K (40 nonzeros per row) sparse matrix within a few seconds. 
Thanks, ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_WRITE,&view); CHKERRQ(ierr); ierr = MatView(A,view);CHKERRQ(ierr); ierr = PetscViewerDestroy(view);CHKERRQ(ierr); ================================ ?Keita Teranishi ?Scientific Library Group ?Cray, Inc. ?keita at cray.com ================================ -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Nguyen, Hung V ERDC-ITL-MS Sent: Friday, September 04, 2009 3:59 PM To: PETSc users list Subject: RE: Problem with saving the binary matrix via Matview Hello, Yes, It is a CrayXT4. I just rerun with setting the variables below. I will let you know if it helps. hvnguyen:jade20% setenv MPICH_PTL_SEND_CREDITS -1 hvnguyen:jade20% setenv MPICH_MAX_SHORT_MSG_SIZE 64000 hvnguyen:jade20% setenv MPICH_UNEX_BUFFER_SIZE 240M hvnguyen:jade20% setenv MPICH_PTL_UNEX_EVENTS 60000 >The problems will be worse if you don't have the latest Cray software >on the system. What do you mean exactly about the latest Cray software? >Note we generally frown up saving huge matrices to disk except for >debugging testing purposes. But as Matt notes this is not a particularly huge matrix. With any reasonable configuration it should take very little time to write the file. We need to dump out some matrices required large number of iterations for testing. As for small application case (nrows-12856,nnzeros=675744), it took a minute to write binary matrix to a file on CrayXT4 system so I don't know why it has a deadlock in this case. Thank you. -hung -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, September 04, 2009 3:24 PM To: PETSc users list Subject: Re: Problem with saving the binary matrix via Matview Is this a Cray? You may need to set the environmental variables MPI UNEX BUFFER SIZE and/or MPICH PTL MATCH OFF and/or MPICH PTL OTHER EVENTS and/or MPICH MSGS PER PROC and/or MPICH PTL SEND CREDITS you will likely need to hunt through Cray documentation to find the meaning of all this stuff. The problems will be worse if you don't have the latest Cray software on the system. Note we generally frown up saving huge matrices to disk except for debugging testing purposes. But as Matt notes this is not a particularly huge matrix. With any reasonable configuration it should take very little time to write the file. Good luck, Barry On Sep 4, 2009, at 2:25 PM, Nguyen, Hung V ERDC-ITL-MS wrote: > > Hello, > > I have a problem with saving a matrix with the binary viewer > PetscBinaryViewerOpen() for the matrix with nrow=853564, > nnzeros=47191472. > The application ran with 32 pes for more than 3 hours without writing > any into file. > However, the same executable ran for small size of application > (nrows-12856,nnzeros=675744) and were able to write successfully > binary files. 
> > Note: using Totalview I found that it hang in the line of > Matview(matrix->petsc, fd); > > Thank you, > > -hung > > --- code: > petsc_analyst_mat(matrix->petsc); > sprintf(file[LOAD_MATRIX],"Matrix.at%f",t_prev); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, > file[LOAD_MATRIX],FILE_MODE_WRITE,&fd); > MatView(matrix->petsc, fd); > PetscViewerDestroy(fd); > > ---- Matrix info using salsa/AnaMod module: > > Computed as <9.362113e+03> Computed > as <9.362623e+03> Computed as <2.800000e+01> Computed > as <2.800000e+01> Computed as > <2.404511e+02> Computed as <-3.064463e-02> > Could not compute Could not compute > Could not compute > Could not compute Could not compute > Computed as <853564> > Could not compute Computed as > <47191472> Computed as <112> Computed > as <16> From knepley at gmail.com Tue Sep 8 15:43:17 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Sep 2009 15:43:17 -0500 Subject: Problem with saving the binary matrix via Matview In-Reply-To: References: <934F80E4-717B-45DB-9136-5EC8B8109054@mcs.anl.gov> <925346A443D4E340BEB20248BAFCDBDF0C981CB8@CFEVS1-IP.americas.cray.com> Message-ID: On Tue, Sep 8, 2009 at 3:21 PM, Nguyen, Hung V ERDC-ITL-MS < Hung.V.Nguyen at usace.army.mil> wrote: > Hello Keita, > > Are you running a simple program to test MatView function on CrayXT4 or a > *big* application codes? I have a simple program that reads its portion of > matrix and save a matrix with nrows=2124108, nnzeros=43024032, > max-nnzeros-per-row=27 on 16 cores. However, I have a problem with MatView > function in the ADaptive Hydraulics Modeling (ADH)code with petsc interface > for a large matrix application as indicated in previous email. > > By the way, setting MPI environment variables is not help. > If MatView() with a large matrix works in your example, but is slow in your application (for the same size matrix), doesn't that lead you to believe that PETSc is not responsible for the slow down, but something else in your application is? Matt > Thank you, > > -hung > > > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Keita Teranishi > Sent: Tuesday, September 08, 2009 10:56 AM > To: PETSc users list > Subject: RE: Problem with saving the binary matrix via Matview > > Hung, > > I did not see any problem calling MatView on 8 nodes of XT4 (32 cores). I > was > able to save a 900Kx900K (40 nonzeros per row) sparse matrix within a few > seconds. > > Thanks, > > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_WRITE,&view); > CHKERRQ(ierr); > ierr = MatView(A,view);CHKERRQ(ierr); > ierr = PetscViewerDestroy(view);CHKERRQ(ierr); > > ================================ > Keita Teranishi > Scientific Library Group > Cray, Inc. > keita at cray.com > ================================ > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Nguyen, Hung V > ERDC-ITL-MS > Sent: Friday, September 04, 2009 3:59 PM > To: PETSc users list > Subject: RE: Problem with saving the binary matrix via Matview > > Hello, > > Yes, It is a CrayXT4. I just rerun with setting the variables below. I will > let you know if it helps. 
> > hvnguyen:jade20% setenv MPICH_PTL_SEND_CREDITS -1 hvnguyen:jade20% setenv > MPICH_MAX_SHORT_MSG_SIZE 64000 hvnguyen:jade20% setenv > MPICH_UNEX_BUFFER_SIZE > 240M hvnguyen:jade20% setenv MPICH_PTL_UNEX_EVENTS 60000 > > >The problems will be worse if you don't have the latest Cray software > >on the > system. > > What do you mean exactly about the latest Cray software? > > >Note we generally frown up saving huge matrices to disk except for > >debugging > testing purposes. But as Matt notes this is not a particularly huge matrix. > With any reasonable configuration it should take very little time to write > the file. > > We need to dump out some matrices required large number of iterations for > testing. > As for small application case (nrows-12856,nnzeros=675744), it took a > minute > to write binary matrix to a file on CrayXT4 system so I don't know why it > has > a deadlock in this case. > > Thank you. > > -hung > > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, September 04, 2009 3:24 PM > To: PETSc users list > Subject: Re: Problem with saving the binary matrix via Matview > > > Is this a Cray? You may need to set the environmental variables MPI UNEX > BUFFER SIZE and/or MPICH PTL MATCH OFF and/or MPICH PTL OTHER EVENTS and/or > MPICH MSGS PER PROC and/or MPICH PTL SEND CREDITS you will likely need to > hunt through Cray documentation to find the meaning of all this stuff. The > problems will be worse if you don't have the latest Cray software on the > system. > > Note we generally frown up saving huge matrices to disk except for > debugging testing purposes. But as Matt notes this is not a particularly > huge > matrix. With any reasonable configuration it should take very little time > to > write the file. > > Good luck, > > Barry > > > > > On Sep 4, 2009, at 2:25 PM, Nguyen, Hung V ERDC-ITL-MS wrote: > > > > > Hello, > > > > I have a problem with saving a matrix with the binary viewer > > PetscBinaryViewerOpen() for the matrix with nrow=853564, > > nnzeros=47191472. > > The application ran with 32 pes for more than 3 hours without writing > > any into file. > > However, the same executable ran for small size of application > > (nrows-12856,nnzeros=675744) and were able to write successfully > > binary files. > > > > Note: using Totalview I found that it hang in the line of > > Matview(matrix->petsc, fd); > > > > Thank you, > > > > -hung > > > > --- code: > > petsc_analyst_mat(matrix->petsc); > > sprintf(file[LOAD_MATRIX],"Matrix.at%f",t_prev); > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, > > file[LOAD_MATRIX],FILE_MODE_WRITE,&fd); > > MatView(matrix->petsc, fd); > > PetscViewerDestroy(fd); > > > > ---- Matrix info using salsa/AnaMod module: > > > > Computed as <9.362113e+03> Computed > > as <9.362623e+03> Computed as <2.800000e+01> Computed > > as <2.800000e+01> Computed as > > <2.404511e+02> Computed as <-3.064463e-02> > > Could not compute Could not compute > > Could not compute > > Could not compute Could not compute > > Computed as <853564> > > Could not compute Computed as > > <47191472> Computed as <112> Computed > > as <16> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From tsjb00 at hotmail.com Thu Sep 10 13:05:40 2009
From: tsjb00 at hotmail.com (tsjb00)
Date: Thu, 10 Sep 2009 18:05:40 +0000
Subject: KSP quick question
Message-ID: 

Hi! I have a quick question about KSP solvers. I will use the KSP solver
to solve a time varying problem. My question is: do I need to call the
following functions once in initialization, or do I need to call them in
the time loop (solving equations in each time step)?

The functions I am not sure about include:
KSPSetOperators
KSPGetPC
PCSetType
KSPSetType
KSPSetFromOptions

Many thanks!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com Thu Sep 10 13:32:47 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Sep 2009 13:32:47 -0500
Subject: KSP quick question
In-Reply-To: 
References: 
Message-ID: 

2009/9/10 tsjb00 

> Hi! I have a quick question about KSP solvers. I will use the KSP solver
> to solve a time varying problem. My question is: do I need to call the
> following functions once in initialization, or do I need to call them in
> the time loop (solving equations in each time step)?
>
> The functions I am not sure about include:
> KSPSetOperators
>
^^^^^ This is the only one to call in the loop.

   Matt

> KSPGetPC
> PCSetType
> KSPSetType
> KSPSetFromOptions
>
> Many thanks!
>
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rxk at cfdrc.com Thu Sep 10 13:22:32 2009
From: rxk at cfdrc.com (Ravi)
Date: Thu, 10 Sep 2009 13:22:32 -0500
Subject: parallel statistics.
Message-ID: <000001ca3243$a9f65050$fde2f0f0$@com>

Hi all,

     This is Ravi Kannan from CFD Research Corporation. Are there
readily available or installable commands or chunks of code to get the
parallel statistics including important ones like time spent in waiting
and latency, communication time, available memory spent in each machine
etc. when solving AX=B type problem.

Thanks in advance

Ravi

Ravi Kannan, PhD
Research Engineer
CFD Research Corporation
215 Wynn Drive, Suite 501
Huntsville, AL 35805
(256)726-4851
rxk at cfdrc.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com Thu Sep 10 14:05:35 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Sep 2009 14:05:35 -0500
Subject: parallel statistics.
In-Reply-To: <000001ca3243$a9f65050$fde2f0f0$@com>
References: <000001ca3243$a9f65050$fde2f0f0$@com>
Message-ID: 

On Thu, Sep 10, 2009 at 1:22 PM, Ravi wrote:

> Hi all,
>
>      This is Ravi Kannan from CFD Research Corporation. Are there
> readily available or installable commands or chunks of code to get the
> parallel statistics including important ones like time spent in waiting
> and latency, communication time, available memory spent in each machine
> etc. when solving AX=B type problem
>

In -log_summary.
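-log_summary reports, for each logged event and stage, the time, flops,
number and average length of MPI messages, and number of reductions, so a
computation/communication picture can be read off per phase. Wrapping your
own phases in stages makes that breakdown explicit. A sketch (hedged:
"Assembly" and "Solve" are illustrative stage names, and the argument
order of PetscLogStageRegister() varies between PETSc releases):

  PetscLogStage assembly, solve;

  ierr = PetscLogStageRegister("Assembly", &assembly);CHKERRQ(ierr);
  ierr = PetscLogStageRegister("Solve", &solve);CHKERRQ(ierr);

  ierr = PetscLogStagePush(assembly);CHKERRQ(ierr);
  /* ... MatSetValues(), MatAssemblyBegin()/MatAssemblyEnd() ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = PetscLogStagePush(solve);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  /* then run with, e.g.:  mpiexec -n 8 ./app -log_summary */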
Matt > > Thanks in advance > > Ravi > > > > Ravi Kannan, PhD > > Research Engineer > > CFD Research Corporation > > 215 Wynn Drive, Suite 501 > > Huntsville, AL 35805 > > (256)726-4851 > > rxk at cfdrc.com > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 10 15:12:34 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 10 Sep 2009 15:12:34 -0500 Subject: KSP quick question In-Reply-To: References: Message-ID: On Sep 10, 2009, at 1:32 PM, Matthew Knepley wrote: > 2009/9/10 tsjb00 > Hi! I have a quick question about KSP solvers. I will use the KSP > solver to solve a time varying problem. My question is: do I need to > call the following functions once in initialization, or do I need to > call them in the time loop (solving equations in each time step)? > > The functions I am not sure about include: > KSPSetOperators > > ^^^^^ This is the only one to call in the loop. This assumes that matrix is changing with time. If the matrix is fixed then you do not need to call this more than once. Barry > > Matt > > KSPGetPC > PCSetType > KSPSetType > KSPSetFromOptions > > Many thanks! > ??+??+?? ??????,??MSN????! ????? > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From rxk at cfdrc.com Fri Sep 11 10:09:04 2009 From: rxk at cfdrc.com (Ravi) Date: Fri, 11 Sep 2009 10:09:04 -0500 Subject: parallel statistics. In-Reply-To: References: <000001ca3243$a9f65050$fde2f0f0$@com> Message-ID: <000001ca32f1$cd376ff0$67a64fd0$@com> Hi Matt, log_summary does give information like time for barrier, zero size MPI_send. However interesting statistics like total latency time, computation/communication time ratio etc. This is needed by me to study the effect of partitions and BCs. Thanks Ravi From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Thursday, September 10, 2009 2:06 PM To: PETSc users list Subject: Re: parallel statistics. On Thu, Sep 10, 2009 at 1:22 PM, Ravi wrote: Hi all, This is Ravi Kannan from CFD Research Corporation. Are there readily available or installable commands or chunks of code to get the parallel statistics including important ones like time spent in waiting and latency, communication time, available memory spent in each machine etc. when solving AX=B type problem In -log_summary. Matt Thanks in advance Ravi Ravi Kannan, PhD Research Engineer CFD Research Corporation 215 Wynn Drive, Suite 501 Huntsville, AL 35805 (256)726-4851 rxk at cfdrc.com -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rxk at cfdrc.com Fri Sep 11 10:11:07 2009 From: rxk at cfdrc.com (Ravi) Date: Fri, 11 Sep 2009 10:11:07 -0500 Subject: parallel statistics. References: <000001ca3243$a9f65050$fde2f0f0$@com> Message-ID: <000801ca32f2$16d67bb0$44837310$@com> log_summary does give information like time for barrier, zero size MPI_send. However interesting statistics like total latency time, computation/communication time ratio etc ARE NOT GIVEN. 
IS THERE A WAY to get those. Thanks Ravi From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Thursday, September 10, 2009 2:06 PM To: PETSc users list Subject: Re: parallel statistics. On Thu, Sep 10, 2009 at 1:22 PM, Ravi wrote: Hi all, This is Ravi Kannan from CFD Research Corporation. Are there readily available or installable commands or chunks of code to get the parallel statistics including important ones like time spent in waiting and latency, communication time, available memory spent in each machine etc. when solving AX=B type problem In -log_summary. Matt Thanks in advance Ravi Ravi Kannan, PhD Research Engineer CFD Research Corporation 215 Wynn Drive, Suite 501 Huntsville, AL 35805 (256)726-4851 rxk at cfdrc.com -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 11 10:12:14 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 Sep 2009 10:12:14 -0500 Subject: parallel statistics. In-Reply-To: <000001ca32f1$cd376ff0$67a64fd0$@com> References: <000001ca3243$a9f65050$fde2f0f0$@com> <000001ca32f1$cd376ff0$67a64fd0$@com> Message-ID: On Fri, Sep 11, 2009 at 10:09 AM, Ravi wrote: > Hi Matt, > > log_summary does give information like time for barrier, zero size > MPI_send. However interesting statistics like total latency time, > computation/communication time ratio etc. This is needed by me to study the > effect of partitions and BCs. > It gives these numbers for individual functions, like VecDot(), which is what is important for solvers. Maybe you can get generic things from JUMPSHOT, but the interpretation would be problematic. What routine you you fix if total latency was high? Matt > Thanks > > Ravi > > > > *From:* petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] *On Behalf Of *Matthew Knepley > *Sent:* Thursday, September 10, 2009 2:06 PM > *To:* PETSc users list > *Subject:* Re: parallel statistics. > > > > On Thu, Sep 10, 2009 at 1:22 PM, Ravi wrote: > > Hi all, > > This is Ravi Kannan from CFD Research Corporation. Are > there readily available or installable commands or chunks of code to get the > parallel statistics including important ones like time spent in waiting and > latency, communication time, available memory spent in each machine etc. > when solving AX=B type problem > > > In -log_summary. > > Matt > > > > > Thanks in advance > > Ravi > > > > Ravi Kannan, PhD > > Research Engineer > > CFD Research Corporation > > 215 Wynn Drive, Suite 501 > > Huntsville, AL 35805 > > (256)726-4851 > > rxk at cfdrc.com > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rxk at cfdrc.com Fri Sep 11 10:48:13 2009 From: rxk at cfdrc.com (Ravi) Date: Fri, 11 Sep 2009 10:48:13 -0500 Subject: parallel statistics. 
From knepley at gmail.com  Fri Sep 11 10:12:14 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 11 Sep 2009 10:12:14 -0500
Subject: parallel statistics.
In-Reply-To: <000001ca32f1$cd376ff0$67a64fd0$@com>
References: <000001ca3243$a9f65050$fde2f0f0$@com> <000001ca32f1$cd376ff0$67a64fd0$@com>

On Fri, Sep 11, 2009 at 10:09 AM, Ravi wrote:

> However, interesting statistics like the total latency time and the
> computation/communication time ratio are not given. I need these to
> study the effect of partitions and BCs.

It gives these numbers for individual functions, like VecDot(), which is
what is important for solvers. Maybe you can get generic things from
JUMPSHOT, but the interpretation would be problematic. What routine would
you fix if total latency was high?

   Matt

From rxk at cfdrc.com  Fri Sep 11 10:48:13 2009
From: rxk at cfdrc.com (Ravi)
Date: Fri, 11 Sep 2009 10:48:13 -0500
Subject: parallel statistics.
References: <000001ca3243$a9f65050$fde2f0f0$@com> <000001ca32f1$cd376ff0$67a64fd0$@com>
Message-ID: <001701ca32f7$453f07b0$cfbd1710$@com>

Matt,

I see your point: lots of interesting information is given by -log_summary.
However, I need to report the total latency time and the
computation/communication ratio for some simulations. How do I get those?

Thanks
Ravi
From knepley at gmail.com  Fri Sep 11 10:57:48 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 11 Sep 2009 10:57:48 -0500
Subject: parallel statistics.
In-Reply-To: <001701ca32f7$453f07b0$cfbd1710$@com>
References: <000001ca3243$a9f65050$fde2f0f0$@com> <000001ca32f1$cd376ff0$67a64fd0$@com> <001701ca32f7$453f07b0$cfbd1710$@com>

On Fri, Sep 11, 2009 at 10:48 AM, Ravi wrote:

> However, I need to report the total latency time and the
> computation/communication ratio for some simulations. How do I get those?

We have no way of getting that information. You can try the MPI profiling
tools (like JUMPSHOT).

   Matt
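One way to get at least a coarse computation/communication breakdown from -log_summary itself is to wrap the phases of interest in user-defined logging stages; the summary then reports times, flops, message counts, and reductions per stage. A hedged sketch follows; the stage names are made up, and the argument order of PetscLogStageRegister() has varied between PETSc releases (the 3.0-era order is assumed here):

    PetscLogStage assembly, solve;

    /* Register once, shortly after PetscInitialize() */
    ierr = PetscLogStageRegister(&assembly, "Assembly");CHKERRQ(ierr);
    ierr = PetscLogStageRegister(&solve,    "Solve");CHKERRQ(ierr);

    ierr = PetscLogStagePush(assembly);CHKERRQ(ierr);
    /* ... MatSetValues()/MatAssemblyBegin()/MatAssemblyEnd() ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    ierr = PetscLogStagePush(solve);CHKERRQ(ierr);
    /* ... KSPSolve() ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    /* run with -log_summary to see per-stage timings and message traffic */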
From tsjb00 at hotmail.com  Fri Sep 11 11:07:27 2009
From: tsjb00 at hotmail.com (tsjb00)
Date: Fri, 11 Sep 2009 16:07:27 +0000
Subject: ksp follow-up Qs

Many thanks for your replies! I have three follow-up questions. I use the
AIJ format for the matrix, BCGS for the KSP, and ILU for the PC. The matrix
changes with time.

Now I put the PC and KSP definition in initialization. In each timestep, I
redefine the matrix and rhs vector, then use KSPSetOperators and KSPSolve.
When I run the program, I get the following error info, which seems to
occur right before KSPSetOperators:

[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Arguments are incompatible!
[0]PETSC ERROR: Incompatible vector global lengths!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: ../src_o_petsc/min3pp on a linux-gnu named pardiff by jinbei Fri Sep 11 08:49:22 2009
[0]PETSC ERROR: Libraries linked from /home/jinbei/Soft/petsc-3.0.0-p5/linux-gnu-c-debug/lib
[0]PETSC ERROR: Configure run at Thu Aug 20 11:36:14 2009
[0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=ifort --download-f-blas-lapack=1 --download-mpich=1 --with-shared=0
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: VecCopy() line 1685 in src/vec/vec/interface/vector.c
[0]PETSC ERROR: KSPInitialResidual() line 60 in src/ksp/ksp/interface/itres.c
[0]PETSC ERROR: KSPSolve_BCGS() line 44 in src/ksp/ksp/impls/bcgs/bcgs.c
[0]PETSC ERROR: KSPSolve() line 385 in src/ksp/ksp/interface/itfunc.c

What might cause the problem?

Another question is that the BCGS+ILU combination is kind of slow. Would
you please give me some suggestions on how to optimize the solver
performance? I've tried setting the PC level to 1 and reducing the maximum
iterations, but that doesn't really work well.

The last question is, if I want to use the true residual norm (based on the
original A*x=y) and the norm of the solution updates as the criteria of
convergence, which functions would do that? Is there any example in the
package I can refer to?

Many thanks again and have a nice day!
From knepley at gmail.com  Fri Sep 11 11:16:47 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 11 Sep 2009 11:16:47 -0500
Subject: ksp follow-up Qs

2009/9/11 tsjb00

> In each timestep, I redefine the matrix and rhs vector, then use
> KSPSetOperators and KSPSolve. When I run the program, I get the following
> error info, which seems to occur right before KSPSetOperators:
>
> [0]PETSC ERROR: Arguments are incompatible!
> [0]PETSC ERROR: Incompatible vector global lengths!
> [0]PETSC ERROR: VecCopy() line 1685 in src/vec/vec/interface/vector.c
> [0]PETSC ERROR: KSPInitialResidual() line 60 in src/ksp/ksp/interface/itres.c
> [0]PETSC ERROR: KSPSolve_BCGS() line 44 in src/ksp/ksp/impls/bcgs/bcgs.c
> [0]PETSC ERROR: KSPSolve() line 385 in src/ksp/ksp/interface/itfunc.c
>
> What might cause the problem?

You cannot change the problem size dynamically with KSP, since it allocates
work vectors, etc. to support the computation. If the size changes,
everything will have to be recreated.

> Another question is that the BCGS+ILU combination is kind of slow. Would
> you please give me some suggestions on how to optimize the solver
> performance?

Preconditioning is highly problem dependent. I do not believe in black-box
preconditioners.

> The last question is, if I want to use the true residual norm (based on
> the original A*x=y) and the norm of the solution updates as the criteria
> of convergence, which functions would do that?

http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetNormType.html

If you want something more sophisticated, you can always use

http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetConvergenceTest.html

   Matt
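A sketch of the recreate-on-resize pattern Matt describes, assuming the global problem size can change between timesteps. BuildSystem() is a hypothetical application routine, and the 3.0-era Destroy signatures (taking the object, not a pointer) are assumed:

    if (newGlobalSize != oldGlobalSize) {
      /* KSP work vectors are sized at setup time, so the old solver
         cannot be reused; destroy and rebuild everything. */
      ierr = KSPDestroy(ksp);CHKERRQ(ierr);
      ierr = MatDestroy(A);CHKERRQ(ierr);
      ierr = VecDestroy(x);CHKERRQ(ierr);
      ierr = VecDestroy(b);CHKERRQ(ierr);

      ierr = BuildSystem(newGlobalSize, &A, &x, &b);CHKERRQ(ierr); /* placeholder */
      ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
      oldGlobalSize = newGlobalSize;
    }
    ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);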
From bsmith at mcs.anl.gov  Fri Sep 11 11:28:27 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 11 Sep 2009 11:28:27 -0500
Subject: anyone running PETSc application code with thousands of processors?
Message-ID: <9C9CD825-AC1C-427D-A8F3-403C54615D9E@mcs.anl.gov>

PETSc-users,

We are thinking of organizing a minisymposium at the SIAM Parallel
Processing Conference (http://www.siam.org/meetings/pp10/index.php) in Feb
2010. We need one additional speaker. Is anyone running a (nontrivial)
application using PETSc on at least thousands of cores who would be
interested in presenting? Please respond to petsc-maint at mcs.anl.gov

Thanks

   Barry

From bernardosk at gmail.com  Tue Sep 15 14:01:46 2009
From: bernardosk at gmail.com (Bernardo Rocha)
Date: Tue, 15 Sep 2009 16:01:46 -0300
Subject: memory usage of a SeqAIJ matrix
Message-ID: <8d285c840909151201m7fdc8e89ge0d547a0d9ea7c4d@mail.gmail.com>

Hi everyone,

I need to know the number of nonzero elements of the matrix in an
application using PETSc. How can I do it? What is the best way to do it?

So far, running on a single processor, I've been using the command line
argument "-info" and then I get this information in a line of the output
that looks like this:

[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5100 X 5100; storage space: 92706 unneeded,44994 used

and then I simply take the number of used entries.

But when I have a large simulation, where the matrix does not fit into the
memory of one processor, I must use several processors. My question is how
to get the number of nonzero entries of the "global" matrix. I wrote a
simple Python script to parse the output and sum the number of entries used
on each processor, but I found out that my calculations were wrong: I was
getting twice as many nonzero elements (I tested against a tiny simulation
on a single processor). It seems that in the output I'm parsing there are
two kinds of lines about the entries used:

[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5100 X 5100; storage space: 92706 unneeded,44994 used
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5100 X 5100; storage space: 0 unneeded,44994 used

That is, one where the "unneeded" field has some value and another where
this field is zero. When I discarded the lines where the "unneeded" field
is zero, the results finally matched the single-processor case perfectly.

So, I would like to know (1) why I have these lines with "0 unneeded" and
(2) whether there is a more elegant way to measure this.

That's all!
Best regards,
Bernardo M. R.
From knepley at gmail.com  Tue Sep 15 14:06:18 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 15 Sep 2009 14:06:18 -0500
Subject: memory usage of a SeqAIJ matrix
In-Reply-To: <8d285c840909151201m7fdc8e89ge0d547a0d9ea7c4d@mail.gmail.com>
References: <8d285c840909151201m7fdc8e89ge0d547a0d9ea7c4d@mail.gmail.com>

You can use the output of -ksp_view, which gives the matrix information.

   Matt

From bernardosk at gmail.com  Tue Sep 15 14:07:41 2009
From: bernardosk at gmail.com (Bernardo Rocha)
Date: Tue, 15 Sep 2009 16:07:41 -0300
Subject: memory usage of a SeqAIJ matrix
References: <8d285c840909151201m7fdc8e89ge0d547a0d9ea7c4d@mail.gmail.com>
Message-ID: <8d285c840909151207m372b67bap5edc66523226c2cf@mail.gmail.com>

Thanks a lot! =)
From vyan2000 at gmail.com  Tue Sep 15 17:18:09 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 15 Sep 2009 18:18:09 -0400
Subject: a question about tolerance

Hi All,

I have a question about -ksp_rtol. When I run my application with
-ksp_rtol 1e-2, I get the following convergence history:

  0 KSP preconditioned resid norm 2.129970351489e+03 true resid norm 1.040507012955e-02 ||Ae||/||Ax|| 1.000000000000e+00
  1 KSP preconditioned resid norm 3.979448548799e+01 true resid norm 7.588481061015e-03 ||Ae||/||Ax|| 7.293060946766e-01
  2 KSP preconditioned resid norm 2.177668520277e+01 true resid norm 6.608917234932e-03 ||Ae||/||Ax|| 6.351631610983e-01
  3 KSP preconditioned resid norm 2.519310038389e+00 true resid norm 4.725744786785e-03 ||Ae||/||Ax|| 4.541771201874e-01
  KSP Object:
    type: gmres
  ...

With -ksp_rtol 1e-3, I get:

  0 KSP preconditioned resid norm 2.129970351489e+03 true resid norm 1.040507012955e-02 ||Ae||/||Ax|| 1.000000000000e+00
  1 KSP preconditioned resid norm 3.979448548799e+01 true resid norm 7.588481061015e-03 ||Ae||/||Ax|| 7.293060946766e-01
  2 KSP preconditioned resid norm 2.177668520277e+01 true resid norm 6.608917234932e-03 ||Ae||/||Ax|| 6.351631610983e-01
  3 KSP preconditioned resid norm 2.519310038389e+00 true resid norm 4.725744786785e-03 ||Ae||/||Ax|| 4.541771201874e-01
  4 KSP preconditioned resid norm 5.945684204702e-01 true resid norm 2.622092992533e-03 ||Ae||/||Ax|| 2.520014723483e-01
  KSP Object:
    type: gmres

How can I tell whether the relative tolerance is functioning correctly? As
you can see from the ||Ae||/||Ax|| column, neither of my test runs seems to
stop at the right place (the ratio should be 10^(-2) and 10^(-3)
respectively).

Thanks a lot.
Yan

From jed at 59A2.org  Tue Sep 15 17:27:00 2009
From: jed at 59A2.org (Jed Brown)
Date: Tue, 15 Sep 2009 17:27:00 -0500
Subject: a question about tolerance
Message-ID: <4AB014B4.8060508@59A2.org>

Ryan Yan wrote:
> How can I tell whether the relative tolerance is functioning correctly?
> As you can see from the ||Ae||/||Ax|| column, neither of my test runs
> seems to stop at the right place (should be 10^(-2), 10^(-3)
> respectively).

The ||Ae||/||Ax|| column is not the preconditioned norm. You are using
left-preconditioned GMRES (the default), which uses the preconditioned
norm. Use

  -ksp_right_pc -ksp_norm_type unpreconditioned

for the behavior you seem to expect.

Jed
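The same choice can be hardwired in code rather than given on the command line. A hedged sketch using the 3.0-era names; these calls were later renamed (KSPSetPreconditionerSide() became KSPSetPCSide()) and the enum spellings have shifted between releases:

    /* Apply the preconditioner on the right and base the convergence
       test on the true (unpreconditioned) residual norm, mirroring
       -ksp_right_pc -ksp_norm_type unpreconditioned */
    ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
    ierr = KSPSetPreconditionerSide(ksp, PC_RIGHT);CHKERRQ(ierr);
    ierr = KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED);CHKERRQ(ierr);
    ierr = KSPSetTolerances(ksp, 1.e-2, PETSC_DEFAULT, PETSC_DEFAULT,
                            PETSC_DEFAULT);CHKERRQ(ierr);

With this combination the first monitor column and the ||Ae||/||Ax|| column track the same quantity, so -ksp_rtol behaves the way the original post expected.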
From vyan2000 at gmail.com  Tue Sep 15 17:35:38 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 15 Sep 2009 18:35:38 -0400
Subject: a question about tolerance
In-Reply-To: <4AB014B4.8060508@59A2.org>
References: <4AB014B4.8060508@59A2.org>

Hi Jed,

Thanks a lot, it works after switching to the right side. But can you say
a little more about what happened in the first scenario? I did not quite
follow you. Let's say I set -ksp_rtol 1e-2: what does this mean in the
left-PC case? More specifically, which term is used as the stopping
criterion for the KSP solve?

Yan

From jed at 59A2.org  Tue Sep 15 17:41:44 2009
From: jed at 59A2.org (Jed Brown)
Date: Tue, 15 Sep 2009 17:41:44 -0500
Subject: a question about tolerance
In-Reply-To: <4AB014B4.8060508@59A2.org>
Message-ID: <4AB01828.8050406@59A2.org>

Ryan Yan wrote:
> Let's say I set -ksp_rtol 1e-2: what does this mean in the left-PC case?
> More specifically, which term is used as the stopping criterion for the
> KSP solve?

Look at the first column (preconditioned residual norm). Notice that this
is decreasing by 1e-2 and 1e-3 respectively in your examples.

Jed

From vyan2000 at gmail.com  Tue Sep 15 18:04:23 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 15 Sep 2009 19:04:23 -0400
Subject: a question about tolerance
In-Reply-To: <4AB01828.8050406@59A2.org>
References: <4AB014B4.8060508@59A2.org> <4AB01828.8050406@59A2.org>

Thanks Jed.

And in the term ||Ae||/||Ax||, PETSc implicitly assumes that we use zero as
the initial guess, and here the x really means the exact solution, right?

Yan

From knepley at gmail.com  Tue Sep 15 19:29:04 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 15 Sep 2009 19:29:04 -0500
Subject: a question about tolerance
References: <4AB014B4.8060508@59A2.org> <4AB01828.8050406@59A2.org>

On Tue, Sep 15, 2009 at 6:04 PM, Ryan Yan wrote:

> And in the term ||Ae||/||Ax||, PETSc implicitly assumes that we use zero
> as the initial guess, and here the x really means the exact solution,
> right?

This has nothing to do with the initial guess. Ae is the residual, and x is
the current guess. We do not have the exact solution, but I guess A x^* = b,
so that could have been used.

   Matt
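For reference, the unpreconditioned "true" residual that the monitor prints can be reproduced by hand after (or during) a solve. A minimal sketch, assuming A, b, and the KSP solution x already exist:

    Vec       r;
    PetscReal rnorm, bnorm;

    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);     /* r = A x     */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);  /* r = b - A x */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||b - Ax|| = %G, relative %G\n",
                       rnorm, rnorm/bnorm);CHKERRQ(ierr);
    ierr = VecDestroy(r);CHKERRQ(ierr);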
From bsmith at mcs.anl.gov  Tue Sep 15 19:32:07 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 15 Sep 2009 19:32:07 -0500
Subject: a question about tolerance
References: <4AB014B4.8060508@59A2.org> <4AB01828.8050406@59A2.org>

On Sep 15, 2009, at 6:04 PM, Ryan Yan wrote:

> And in the term ||Ae||/||Ax||, PETSc implicitly assumes that we use zero
> as the initial guess, and here the x really means the exact solution,
> right?

If you call KSPSetInitialGuessNonzero() then PETSc will use the initial
value in x as the initial guess. The default convergence tests use the
ratio ||r_current|| / ||b||; thus if you start with a great initial guess
it may not iterate much at all. You can also use
-ksp_converged_use_initial_residual_norm or KSPDefaultConvergedSetUIRNorm()
to have it use the ratio ||r_current|| / ||r_initial|| for the relative
convergence test.

You can also provide your own convergence test with KSPSetConvergenceTest().

   Barry
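A hedged sketch of such a user-defined test, stopping on the true relative residual. The exact signature of KSPSetConvergenceTest() has varied between releases (some take an extra context-destroy argument), the tolerance 1e-8 is arbitrary, and the context argument is unused here:

    /* Converged when ||b - Ax|| / ||b|| drops below 1e-8 */
    PetscErrorCode MyConvergedTrueResidual(KSP ksp, PetscInt it, PetscReal rnorm,
                                           KSPConvergedReason *reason, void *ctx)
    {
      Vec            r, b;
      PetscReal      truenorm, bnorm;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = KSPBuildResidual(ksp, PETSC_NULL, PETSC_NULL, &r);CHKERRQ(ierr); /* r = b - A x */
      ierr = VecNorm(r, NORM_2, &truenorm);CHKERRQ(ierr);
      ierr = VecDestroy(r);CHKERRQ(ierr);
      ierr = KSPGetRhs(ksp, &b);CHKERRQ(ierr);
      ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
      *reason = (truenorm < 1.e-8*bnorm) ? KSP_CONVERGED_RTOL
                                         : KSP_CONVERGED_ITERATING;
      PetscFunctionReturn(0);
    }

    /* elsewhere, after KSPCreate(): */
    ierr = KSPSetConvergenceTest(ksp, MyConvergedTrueResidual, PETSC_NULL);CHKERRQ(ierr);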
From bsmith at mcs.anl.gov  Tue Sep 15 19:33:50 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 15 Sep 2009 19:33:50 -0500
Subject: memory usage of a SeqAIJ matrix
In-Reply-To: <8d285c840909151207m372b67bap5edc66523226c2cf@mail.gmail.com>
References: <8d285c840909151201m7fdc8e89ge0d547a0d9ea7c4d@mail.gmail.com> <8d285c840909151207m372b67bap5edc66523226c2cf@mail.gmail.com>

You can also call MatGetInfo() with a second argument of MAT_GLOBAL_SUM.

   Barry
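A minimal sketch of the MatGetInfo() route Barry mentions, which sums the nonzero counts across all processes and so avoids parsing -info output. The field names come from the MatInfo struct; the counts are PetscLogDouble (double) values:

    MatInfo info;

    ierr = MatGetInfo(A, MAT_GLOBAL_SUM, &info);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,
                       "global nonzeros: allocated %g, used %g, memory %g bytes\n",
                       info.nz_allocated, info.nz_used, info.memory);CHKERRQ(ierr);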
From vyan2000 at gmail.com  Tue Sep 15 22:11:19 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 15 Sep 2009 23:11:19 -0400
Subject: a question about tolerance
References: <4AB014B4.8060508@59A2.org> <4AB01828.8050406@59A2.org>

On Tue, Sep 15, 2009 at 8:29 PM, Matthew Knepley wrote:

> This has nothing to do with the initial guess. Ae is the residual, and x
> is the current guess. We do not have the exact solution, but I guess
> A x^* = b, so that could have been used.

Thanks Matt. I agree that in the printout ||Ae||/||Ax||, the x here just
represents x*.

Yan

From vyan2000 at gmail.com  Tue Sep 15 22:14:40 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 15 Sep 2009 23:14:40 -0400
Subject: a question about tolerance
References: <4AB014B4.8060508@59A2.org> <4AB01828.8050406@59A2.org>

Hi Barry,

Thank you very much for the explanation. It is good to know there are more
options to start with.

Yan

From rodrigowpa at gmail.com  Wed Sep 16 09:38:27 2009
From: rodrigowpa at gmail.com (Rodrigo Araujo)
Date: Wed, 16 Sep 2009 11:38:27 -0300
Subject: Questions about Blas
Message-ID: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>

Hi All,

I am multiplying matrices using this line of code:

  ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, ((PetscReal) 2.0), &C);CHKERRQ(ierr);

But I need to see the source code of the file which has this function,
because I'd like to see how the functions from BLAS are being used. Is that
possible? If yes, could somebody tell me how?

Thanks very much,

Regards.

-- 
Rodrigo W. Pimentel Araujo
Engenharia da Computação
UFPE

From bsmith at mcs.anl.gov  Wed Sep 16 09:54:49 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 16 Sep 2009 10:54:49 -0400
Subject: Questions about Blas
In-Reply-To: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>
References: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>

Since you are using a fill factor of 2, I assume that you are multiplying
sparse matrices? In that case the BLAS are not used, since they are only
for dense matrix computations.

If you use Emacs, take a look at the Emacs Users section in the users
manual. It will tell you a simple way to find all the MatMatMult_***
functions. Essentially, in Emacs just use esc . MatMatMult_ and then
esc 0 esc . to find the next use.

   Barry
From hzhang at mcs.anl.gov  Wed Sep 16 09:56:30 2009
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Wed, 16 Sep 2009 09:56:30 -0500 (CDT)
Subject: Questions about Blas
In-Reply-To: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>
References: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>

> I am multiplying matrices using this line of code:
>
>   ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, ((PetscReal) 2.0), &C);CHKERRQ(ierr);
>
> But I need to see the source code of the file which has this function,
> because I'd like to see how the functions from BLAS are being used.

We have implementations for different matrix data structures. You can
search for 'MatMatMult_' under petsc/src/ to find them all. BLAS is used in
MatMatMultNumeric_SeqDense_SeqDense() in src/mat/impls/dense/seq/dense.c.
For sparse matrices, we do not use BLAS in MatMatMult().

Hong
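To exercise the BLAS-backed dense path Hong points to, the matrices must be dense. A hedged sketch on one process; the values are arbitrary, and since the fill argument is only meaningful for sparse products, PETSC_DEFAULT is assumed to suffice here:

    Mat      A, B, C;
    PetscInt n = 4, i, j;

    ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &A);CHKERRQ(ierr);
    ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &B);CHKERRQ(ierr);
    for (i = 0; i < n; i++) {
      for (j = 0; j < n; j++) {
        PetscScalar v = (PetscScalar)(i + j);  /* arbitrary test values */
        ierr = MatSetValue(A, i, j, v, INSERT_VALUES);CHKERRQ(ierr);
        ierr = MatSetValue(B, i, j, v, INSERT_VALUES);CHKERRQ(ierr);
      }
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

    /* for two SeqDense matrices this should dispatch to
       MatMatMultNumeric_SeqDense_SeqDense(), i.e. a BLAS gemm */
    ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);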
From rodrigowpa at gmail.com  Wed Sep 16 13:09:37 2009
From: rodrigowpa at gmail.com (Rodrigo Araujo)
Date: Wed, 16 Sep 2009 15:09:37 -0300
Subject: Questions about Blas
References: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com>
Message-ID: <357feb30909161109n7ab9ca25had0ead12d881999d@mail.gmail.com>

Thanks for the answer, but I have another question.

I am trying to multiply the matrices using 2 PCs, so I found the function
MatMatMultNumeric_MPIDense_MPIDense(), which probably could do this using
the BLAS. My question is: how do the BLAS or MPI split the matrices in
order to multiply them on separate PCs?

-- 
Rodrigo W. Pimentel Araujo
Engenharia da Computação
UFPE

From knepley at gmail.com  Wed Sep 16 13:10:40 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 16 Sep 2009 13:10:40 -0500
Subject: Questions about Blas
In-Reply-To: <357feb30909161109n7ab9ca25had0ead12d881999d@mail.gmail.com>
References: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com> <357feb30909161109n7ab9ca25had0ead12d881999d@mail.gmail.com>

On Wed, Sep 16, 2009 at 1:09 PM, Rodrigo Araujo wrote:

> My question is: how do the BLAS or MPI split the matrices in order to
> multiply them on separate PCs?

The same divisions the matrices already have, if they are MPIDENSE.

   Matt

From bsmith at mcs.anl.gov  Wed Sep 16 13:46:31 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 16 Sep 2009 14:46:31 -0400
Subject: Questions about Blas
In-Reply-To: <357feb30909161109n7ab9ca25had0ead12d881999d@mail.gmail.com>
References: <357feb30909160738n4ef1fc04p2d40c3bfd7d44c5a@mail.gmail.com> <357feb30909161109n7ab9ca25had0ead12d881999d@mail.gmail.com>
Message-ID: <8CC70C06-8379-42ED-B379-9E6FF0F73561@mcs.anl.gov>

This uses PLAPACK's PLA_Gemm(), which calls the BLAS appropriately.

   Barry
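A hedged sketch of the row distribution Matt refers to, using the 3.0-era call MatCreateMPIDense() (later renamed MatCreateDense()). PETSC_DECIDE lets PETSc split the rows roughly evenly across the communicator:

    Mat         A;
    PetscInt    M = 1000, N = 1000, rstart, rend;
    PetscMPIInt rank;

    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    /* Each process owns a contiguous block of whole rows */
    ierr = MatCreateMPIDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                             M, N, PETSC_NULL, &A);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] owns rows %d to %d\n",
                                   rank, rstart, rend-1);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);

It is this pre-existing row layout, not anything chosen by the BLAS, that determines how the product is split across the two machines.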
How > does one interpret the memory consumed (in parallel) for object types like > Vec Scatter, matrix etc when using the output obtained from log_summary. > For instance, the ?Matrix? object type in many simulations decrease half in > size when the # of procs is doubled. So is the number shown as simple as the > average size stored per processor OR is am I missing something more deeper? > > Thanks in advance > > Ravi > > Ravi Kannan, PhD > > Research Engineer > > CFD Research Corporation > > 215 Wynn Drive, Suite 501 > > Huntsville, AL 35805 > > (256)726-4851 > > rxk at cfdrc.com > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Sep 16 14:55:04 2009 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 17 Sep 2009 05:55:04 +1000 Subject: DMMG with different Message-ID: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com> Hi, I'm trying to use DMMG to solve linear Stokes flow discretised with finite elements. Does anyone know how to specify different matrices for the matrix associated with the linear system and the matrix used to construct the preconditioner? Ie, like choosing a different Amat and Bmat with KSPSetOperators(). In looking at DMMGSetKSP(), I don't see how this is possible. DMMG seems to generate the operator B via DMGetMatrix(), and then sets J equal to B. A dirty hack around the problem seemed to be to used PCDMMG, but I think I should be able to define different A and B mat's with DMMG directly. I also noted that PCDMMG does not appear to be a registered PC implementation in petsc 3. Thanks in advance, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 16 15:53:30 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 16 Sep 2009 16:53:30 -0400 Subject: DMMG with different In-Reply-To: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com> References: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com> Message-ID: <29EB6141-2DEB-45AC-A6E6-FB7B33BD9FFD@mcs.anl.gov> On Sep 16, 2009, at 3:55 PM, Dave May wrote: > Hi, > I'm trying to use DMMG to solve linear Stokes flow discretised > with finite elements. > Does anyone know how to specify different matrices for the matrix > associated with the linear system and the matrix used to construct > the preconditioner? Ie, like choosing a different Amat and Bmat with > KSPSetOperators(). There is not a current user friendly way to specify this. After you have called DMMGSetKSP() or DMMGSetSNES() you can call MatDestroy( dmmg[level]->B) or MatDestroy( dmmg[level]->J ) for any levels you want to change; create new matrices and put them into those locations. Different ones for B and J if you like. > > In looking at DMMGSetKSP(), I don't see how this is possible. DMMG > seems to generate the operator B via DMGetMatrix(), and then sets J > equal to B. A dirty hack around the problem seemed to be to used > PCDMMG, but I think I should be able to define different A and B > mat's with DMMG directly. > > I also noted that PCDMMG does not appear to be a registered PC > implementation in petsc 3. That is all dead code. It was an experiment that failed. 
From dave.mayhem23 at gmail.com  Wed Sep 16 14:55:04 2009
From: dave.mayhem23 at gmail.com (Dave May)
Date: Thu, 17 Sep 2009 05:55:04 +1000
Subject: DMMG with different
Message-ID: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com>

Hi,

I'm trying to use DMMG to solve linear Stokes flow discretised with finite
elements. Does anyone know how to specify different matrices for the matrix
associated with the linear system and the matrix used to construct the
preconditioner? I.e., like choosing a different Amat and Bmat with
KSPSetOperators().

In looking at DMMGSetKSP(), I don't see how this is possible. DMMG seems to
generate the operator B via DMGetMatrix(), and then sets J equal to B. A
dirty hack around the problem seemed to be to use PCDMMG, but I think I
should be able to define different A and B mats with DMMG directly.

I also noted that PCDMMG does not appear to be a registered PC
implementation in petsc 3.

Thanks in advance,
Dave

From bsmith at mcs.anl.gov  Wed Sep 16 15:53:30 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 16 Sep 2009 16:53:30 -0400
Subject: DMMG with different
In-Reply-To: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com>
References: <956373f0909161255u772c333fr17e4523229efd505@mail.gmail.com>
Message-ID: <29EB6141-2DEB-45AC-A6E6-FB7B33BD9FFD@mcs.anl.gov>

On Sep 16, 2009, at 3:55 PM, Dave May wrote:

> Does anyone know how to specify different matrices for the matrix
> associated with the linear system and the matrix used to construct the
> preconditioner? I.e., like choosing a different Amat and Bmat with
> KSPSetOperators().

There is no current user-friendly way to specify this. After you have
called DMMGSetKSP() or DMMGSetSNES() you can call MatDestroy(dmmg[level]->B)
or MatDestroy(dmmg[level]->J) for any levels you want to change; create new
matrices and put them into those locations. Different ones for B and J if
you like.

> I also noted that PCDMMG does not appear to be a registered PC
> implementation in petsc 3.

That is all dead code. It was an experiment that failed.

   Barry
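A hedged sketch of the replacement Barry describes, poking directly at the DMMG struct fields. It assumes dmmg was already set up with DMMGSetKSP(), that nlevels is known to the caller, that DMMG's reference counting keeps the matrix still held by J alive, and BuildStokesPmat() is a hypothetical routine building the preconditioning matrix for one level:

    PetscInt level;

    for (level = 0; level < nlevels; level++) {
      Mat Bnew;
      ierr = BuildStokesPmat(dmmg[level]->dm, &Bnew);CHKERRQ(ierr); /* placeholder */
      ierr = MatDestroy(dmmg[level]->B);CHKERRQ(ierr);  /* drop the DMGetMatrix() one */
      dmmg[level]->B = Bnew;   /* J is left alone, so the KSP on this level
                                  now sees different Amat and Bmat */
    }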
From fls2 at cin.ufpe.br  Thu Sep 17 12:37:06 2009
From: fls2 at cin.ufpe.br (Fabio Leite Soares)
Date: Thu, 17 Sep 2009 14:37:06 -0300
Subject: MatMatMult_MPIDense_MPIDense() works currently?
Message-ID: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com>

Hi everyone, I have the same problem and I don't know how to fix it.

I need to multiply two MPI dense matrices using the BLAS3 routines. I have
tried the MatMatMult_MPIDense_MPIDense() function, but the console shows
this message (output from the two ranks interleaved):

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function is given.
[0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 src/mat/impls/dense/mpi/mpidense.c
[0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 src/mat/impls/dense/mpi/mpidense.c
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 CDT 2009
[0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Thu Sep 17 14:28:28 2009
[0]PETSC ERROR: Libraries linked from /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib
[0]PETSC ERROR: Configure run at Wed Sep 16 17:06:08 2009
[0]PETSC ERROR: Configure options --download-f-blas-lapack=1 --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 --with-scalar-type=real --with-precision=double --with-shared=0
[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
[1]PETSC ERROR: [1] MatMPIDenseCopyToPlapack line 1028 src/mat/impls/dense/mpi/mpidense.c
[1]PETSC ERROR: [1] MatMatMultNumeric_MPIDense_MPIDense line 1078 src/mat/impls/dense/mpi/mpidense.c
[1]PETSC ERROR: Signal received!
[1]PETSC ERROR: ./mult on a linux-gnu named hpcin-desktop by hpcin Thu Sep 17 14:28:27 2009
[1]PETSC ERROR: Libraries linked from /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib
[1]PETSC ERROR: Configure run at Tue Sep 15 15:57:39 2009
[1]PETSC ERROR: Configure options --download-plapack=1 --download-f-blas-lapack=1 --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 --with-scalar-type=real --with-precision=double --with-shared=0
[1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1
rank 1 in job 1 hpcin08_34697 caused collective abort of all ranks
  exit status of rank 1: return code 59
rank 0 in job 1 hpcin08_34697 caused collective abort of all ranks
  exit status of rank 0: return code 59

I tried to execute the ex123.c example and did not succeed either.

Regards

-- 
Fábio Leite Soares
Undergraduate Student of Computing Engineering
Centro de Informática - UFPE - BRAZIL

From nmoran at thphys.nuim.ie  Thu Sep 17 12:44:21 2009
From: nmoran at thphys.nuim.ie (Niall Moran)
Date: Thu, 17 Sep 2009 18:44:21 +0100
Subject: Problem with MatMatMultTranspose
In-Reply-To: <48FED8DB.4020902@ewi.tudelft.nl>
References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl>
Message-ID: <4AB27575.4040206@thphys.nuim.ie>

Hi,

I am just wondering if anything has changed on the status of this feature.
It would be great to be able to perform matrix-vector multiplications with
complex Hermitian matrices while only providing one half of the matrix.

Regards,

Niall.

zhifeng sheng wrote:
> You mean the conjugate transpose for complex matrices is not supported?
> Then how can you implement the iterative solvers for complex matrices?
> Some iterative solvers need it.
>
> Hong Zhang wrote:
>> Zhifeng,
>>
>> We do not have support for matrix operations on Hermitian matrices yet.
>> Hong
>>
>> On Mon, 20 Oct 2008, zhifeng sheng wrote:
>>> Dear all,
>>>
>>> I am using this MatMatMultTranspose function for complex matrices, but
>>> it seems to be doing something weird. For instance, if I have a complex
>>> matrix A and I compute A^T*A with this function, it does not generate a
>>> Hermitian matrix.
>>>
>>> I am thinking that maybe the function takes the transpose of A instead
>>> of the conjugate transpose ...
>>>
>>> Do you know how I can get an A^H*A instead of A^T*A for complex
>>> matrices?
>>>
>>> Thanks a lot
>>> Best regards
>>> Zhifeng

From knepley at gmail.com  Thu Sep 17 14:26:42 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 17 Sep 2009 14:26:42 -0500
Subject: Problem with MatMatMultTranspose
In-Reply-To: <4AB27575.4040206@thphys.nuim.ie>
References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie>

On Thu, Sep 17, 2009 at 12:44 PM, Niall Moran wrote:

> I am just wondering if anything has changed on the status of this feature.
> It would be great to be able to perform matrix-vector multiplications with
> complex Hermitian matrices while only providing one half of the matrix.

It is not currently on the todo list, since we have only had one request.
It seems like it would just take being careful about the complex case for
SBAIJ, if you would like to try the implementation. We can answer
questions.

Thanks,

   Matt

From knepley at gmail.com  Thu Sep 17 14:27:52 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 17 Sep 2009 14:27:52 -0500
Subject: MatMatMult_MPIDense_MPIDense() works currently?
In-Reply-To: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com>
References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com>

Give us the exact command line you use for ex123 and the error output.
Send it to petsc-maint.

   Matt

On Thu, Sep 17, 2009 at 12:37 PM, Fabio Leite Soares wrote:

> Hi everyone, I have the same problem and I don't know how to fix it.
>
> I need to multiply two MPI dense matrices using the BLAS3 routines. I have
> tried the MatMatMult_MPIDense_MPIDense() function, but the console shows
> this message:
>
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 src/mat/impls/dense/mpi/mpidense.c
> [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 src/mat/impls/dense/mpi/mpidense.c
>
> I tried to execute the ex123.c example and did not succeed either.
I have > tried the MatMatMult_MPIDense_MPIDense() function but the console shows this > message: > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSCERROR: or try > http://valgrind.org on linux or man libgmalloc on Apple to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 > src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 > src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > CDT 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Thu Sep 17 > 14:28:28 2009 > [0]PETSC ERROR: Libraries linked from > /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Wed Sep 16 17:06:08 2009 > [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 > --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 > --with-scalar-type=real --with-precision=double --with-shared=0 > [0]PETSC ERROR: --------------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[1]PETSCERROR: or try > http://valgrind.org on linux or man libgmalloc on Apple to find memory > corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. 
> [1]PETSC ERROR: [1] MatMPIDenseCopyToPlapack line 1028 > src/mat/impls/dense/mpi/mpidense.c > [1]PETSC ERROR: [1] MatMatMultNumeric_MPIDense_MPIDense line 1078 > src/mat/impls/dense/mpi/mpidense.c > [1]PETSC ERR---------------------------------------------- > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > OR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > CDT 2009 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: ./mult on a linux-gnu named hpcin-desktop by hpcin Thu Sep > 17 14:28:27 2009 > [1]PETSC ERROR: Libraries linked from > /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib > [1]PETSC ERROR: Configure run at Tue Sep 15 15:57:39 2009 > [1]PETSC ERROR: Configure options --download-plapack=1 > --download-f-blas-lapack=1 --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 > --with-scalar-type=real --with-precision=double --with-shared=0 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > rank 1 in job 1 hpcin08_34697 caused collective abort of all ranks > exit status of rank 1: return code 59 > rank 0 in job 1 hpcin08_34697 caused collective abort of all ranks > exit status of rank 0: return code 59 > > > I tried to execute the ex123.c example and I did not succeeded to. > > Regards > > -- > F?bio Leite Soares > Undergraduate Student of Computing Engineering > Centro de Inform?tica - UFPE - BRAZIL > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mafunk at nmsu.edu Thu Sep 17 18:04:09 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Thu, 17 Sep 2009 17:04:09 -0600 Subject: memory reporting question Message-ID: <200909171704.09298.mafunk@nmsu.edu> Hi, I am wondering whether the -memory_info option and the PetscMemoryGetMaximumUsage() call report different things. The reason i am asking is because i call the PetscMemoryGetCurrentUsage fcn in my code and it shows: 5.25558e+08. At the end of the run the -memory_info option reports: max process malloc()'ed: 4.00524e+08 max petsc malloc()'ed: 1.15254e+08. So i am a little confused by those numbers unless the fcn call is the more complete picture (as said on the manual page) I guess then my question is whether the -memory_info corresponds to the PetscMalloc* fcns? thanks matt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mafunk at nmsu.edu Thu Sep 17 18:13:44 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Thu, 17 Sep 2009 17:13:44 -0600 Subject: memory reporting question In-Reply-To: <200909171704.09298.mafunk@nmsu.edu> References: <200909171704.09298.mafunk@nmsu.edu> Message-ID: <200909171713.45032.mafunk@nmsu.edu> I forgot to mention: Some of the memory allocated in the code is from non-petsc structures. Not sure if that is important. Another question: The number reported by -memory_info, is it in bytes? thanks matt On Thursday 17 September 2009, you wrote: > Hi, > > I am wondering whether the -memory_info option and the > PetscMemoryGetMaximumUsage() call report different things. > > The reason i am asking is because i call the PetscMemoryGetCurrentUsage fcn > in my code and it shows: > 5.25558e+08. > > At the end of the run the -memory_info option reports: > max process malloc()'ed: 4.00524e+08 > max petsc malloc()'ed: 1.15254e+08. > > So i am a little confused by those numbers unless the fcn call is the more > complete picture (as said on the manual page) > > I guess then my question is whether the -memory_info corresponds to the > PetscMalloc* fcns? > > thanks > matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 17 18:23:28 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 Sep 2009 18:23:28 -0500 Subject: memory reporting question In-Reply-To: <200909171704.09298.mafunk@nmsu.edu> References: <200909171704.09298.mafunk@nmsu.edu> Message-ID: On Thu, Sep 17, 2009 at 6:04 PM, Matt Funk wrote: > Hi, > > > I am wondering whether the -memory_info option and the > PetscMemoryGetMaximumUsage() > call report different things. > > > The reason i am asking is because i call the PetscMemoryGetCurrentUsage fcn > in my code and it shows: > 5.25558e+08. > This call get_rusage(), so it gives you the entire process size. > At the end of the run the -memory_info option reports: > max process malloc()'ed: 4.00524e+08 > This is a sampling of rusage every time an object is destroyed. > max petsc malloc()'ed: 1.15254e+08. > This is all the memory malloced using PetscMalloc() summed. Matt > So i am a little confused by those numbers unless the fcn call is the more > complete picture (as said on the manual page) > > > I guess then my question is whether the -memory_info corresponds to the > PetscMalloc* fcns? > > > thanks > matt > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Sep 17 20:41:15 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 17 Sep 2009 20:41:15 -0500 (CDT) Subject: MatMatMult_MPIDense_MPIDense() works currently? In-Reply-To: References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> Message-ID: Fabio, Did you install plapack with petsc? Hong On Thu, 17 Sep 2009, Matthew Knepley wrote: > Give us the exact command line you use for ex123 and the error output. Send > to petsc-maint. > > Matt > > On Thu, Sep 17, 2009 at 12:37 PM, Fabio Leite Soares wrote: > >> Hi everyone, I have the same problem and I don't know how to fix it. >> >> I need to multiply two mpi dense matrices using the BLAS3 routines. 
I have >> tried the MatMatMult_MPIDense_MPIDense() function but the console shows this >> message: >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSCERROR: or try >> http://valgrind.org on linux or man libgmalloc on Apple to find memory >> corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 >> src/mat/impls/dense/mpi/mpidense.c >> [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 >> src/mat/impls/dense/mpi/mpidense.c >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >> CDT 2009 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Thu Sep 17 >> 14:28:28 2009 >> [0]PETSC ERROR: Libraries linked from >> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib >> [0]PETSC ERROR: Configure run at Wed Sep 16 17:06:08 2009 >> [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 >> --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 >> --with-scalar-type=real --with-precision=double --with-shared=0 >> [0]PETSC ERROR: --------------------------[1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[1]PETSCERROR: or try >> http://valgrind.org on linux or man libgmalloc on Apple to find memory >> corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. 
>> [1]PETSC ERROR: [1] MatMPIDenseCopyToPlapack line 1028 >> src/mat/impls/dense/mpi/mpidense.c >> [1]PETSC ERROR: [1] MatMatMultNumeric_MPIDense_MPIDense line 1078 >> src/mat/impls/dense/mpi/mpidense.c >> [1]PETSC ERR---------------------------------------------- >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> OR: --------------------- Error Message >> ------------------------------------ >> [1]PETSC ERROR: Signal received! >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >> CDT 2009 >> [1]PETSC ERROR: See docs/changes/index.html for recent updates. >> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [1]PETSC ERROR: See docs/index.html for manual pages. >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: ./mult on a linux-gnu named hpcin-desktop by hpcin Thu Sep >> 17 14:28:27 2009 >> [1]PETSC ERROR: Libraries linked from >> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib >> [1]PETSC ERROR: Configure run at Tue Sep 15 15:57:39 2009 >> [1]PETSC ERROR: Configure options --download-plapack=1 >> --download-f-blas-lapack=1 --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 >> --with-scalar-type=real --with-precision=double --with-shared=0 >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 >> rank 1 in job 1 hpcin08_34697 caused collective abort of all ranks >> exit status of rank 1: return code 59 >> rank 0 in job 1 hpcin08_34697 caused collective abort of all ranks >> exit status of rank 0: return code 59 >> >> >> I tried to execute the ex123.c example and I did not succeeded to. >> >> Regards >> >> -- >> F?bio Leite Soares >> Undergraduate Student of Computing Engineering >> Centro de Inform?tica - UFPE - BRAZIL >> > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From nmoran at thphys.nuim.ie Fri Sep 18 04:10:41 2009 From: nmoran at thphys.nuim.ie (Niall Moran) Date: Fri, 18 Sep 2009 10:10:41 +0100 Subject: Issue compiling on snow leopard Message-ID: Hi, I am trying to compile petsc on snow leopard. I need to have complex support and support for C++. I am using the gcc and g++ compilers that come packaged with the developers tools i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646) Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. and the gfortran compiler from http://hpc.sourceforge.net/. 
The configuration file I am using is

#!/usr/bin/env python

configure_options = [
  '--FFLAGS=-m64',
  '--CFLAGS=-m64',
  '--CXXFLAGS=-m64',
  '--LDFLAGS=-L/usr/lib',
  '--with-python=0',
  '--with-shared=0',
  '--with-dynamic=0',
  '--with-mpi-dir=/Users/nmoran/local/openmpi',
  '--with-clanguage=C++',
  '--with-scalar-type=complex',
  '--with-debugging=yes',
  '--with-gcov=0'
]

if __name__ == '__main__':
    import sys,os
    sys.path.insert(0,os.path.abspath('config'))
    import configure
    configure.petsc_configure(configure_options)

I am getting errors that the macros isinf and isnan cannot be found in the scope for various files. The first one listed is src/sys/ftn-custom/zutils.c on the lines

return (PetscTruth) PetscIsInfOrNanScalar(*v);
and
return (PetscTruth) PetscIsInfOrNanReal(*v);

These macros are defined in /usr/include/architecture/i386/math.h. No errors are found if scalar-type is real. It seems the #include <complex> somehow undefines these macros. I have got petsc to compile by redefining these macros at the top of each of the problem files but this is not a very elegant solution.

Regards.

Niall.

From David.Colignon at ulg.ac.be Fri Sep 18 04:21:51 2009
From: David.Colignon at ulg.ac.be (David Colignon)
Date: Fri, 18 Sep 2009 11:21:51 +0200
Subject: Issue compiling on snow leopard
In-Reply-To: 
References: 
Message-ID: <4AB3512F.2020801@ulg.ac.be>

Hi,

I have already had the same kind of problem and here is the answer I got from Barry Smith :

This happens because the BuildSystem inherits the bad feature of autoconf that checks for functions using the C compiler even when the C++ compiler will be used to actually compile the package.

Edit $PETSC_ARCH/include/petscconf.h and remove the lines

#ifndef PETSC_HAVE_ISINF
#define PETSC_HAVE_ISINF 1
#endif

#ifndef PETSC_HAVE_ISNAN
#define PETSC_HAVE_ISNAN 1
#endif

DO NOT rerun config/configure.py just do the make all again.

Barry

-- 
David Colignon, Ph.D.
Collaborateur Logistique du F.R.S.-FNRS
CÉCI - Consortium des Équipements de Calcul Intensif
ACE - Applied & Computational Electromagnetics
Sart-Tilman B28
Université de Liège
4000 Liège - BELGIQUE
Tél: +32 (0)4 366 37 32
Fax: +32 (0)4 366 29 10
WWW: http://hpc.montefiore.ulg.ac.be/
Agenda: http://www.google.com/calendar/embed?src=david.colignon%40gmail.com

Niall Moran wrote:
> Hi,
>
> I am trying to compile petsc on snow leopard. I need to have complex
> support and support for C++. I am using the gcc and g++ compilers that
> come packaged with the developers tools
>
> i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646)
> Copyright (C) 2007 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> and the gfortran compiler from http://hpc.sourceforge.net/. The
> configuration file I am using is
>
> #!/usr/bin/env python
>
> configure_options = [
> '--FFLAGS=-m64',
> '--CFLAGS=-m64',
> '--CXXFLAGS=-m64',
> '--LDFLAGS=-L/usr/lib',
> '--with-python=0',
> '--with-shared=0',
> '--with-dynamic=0',
> '--with-mpi-dir=/Users/nmoran/local/openmpi',
> '--with-clanguage=C++',
> '--with-scalar-type=complex',
> '--with-debugging=yes',
> '--with-gcov=0'
> ]
>
> if __name__ == '__main__':
> import sys,os
> sys.path.insert(0,os.path.abspath('config'))
> import configure
> configure.petsc_configure(configure_options)
>
>
> I am getting errors that the macros isinf and isnan cannot be found in
> the scope for various files.
The first one listed is > src/sys/ftn-custom/zutils.c on the lines > > return (PetscTruth) PetscIsInfOrNanScalar(*v); > and > return (PetscTruth) PetscIsInfOrNanReal(*v); > > These macros are defined in /usr/include/architecture/i386/math.h. No > errors are found if scalar-type is real. It seems the #include > somehow undefines these macros. I have got petsc to compile by > redefining these macros at the top of each of the problem files but this > is not a very elegant solution. > > Regards. > > Niall. From michel.cancelliere at polito.it Fri Sep 18 05:04:22 2009 From: michel.cancelliere at polito.it (Michel Cancelliere) Date: Fri, 18 Sep 2009 12:04:22 +0200 Subject: Petsc and .Net Message-ID: <7f18de3b0909180304o78bda701r1e052da8397032f3@mail.gmail.com> Hi Petsc Users, I'm trying to use petsc in a .Net app but i'm experiencing problem when linking (ijw/native module detected; cannot link with pure modules). Has somebody managed to do that? Thank you in advance, Michel Cancelliere -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 18 05:35:54 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 18 Sep 2009 05:35:54 -0500 Subject: Issue compiling on snow leopard In-Reply-To: <4AB3512F.2020801@ulg.ac.be> References: <4AB3512F.2020801@ulg.ac.be> Message-ID: Barry is correct. I wanted to note that this is fixed in petsc-dev. Matt On Fri, Sep 18, 2009 at 4:21 AM, David Colignon wrote: > Hi, > > I already have had the same kind of problem and here is the answer I got > from Barry Smith : > > > This happens because the BuildSystem inherits the bad feature of autoconf > that checks for functions > using the C compiler even when the C++ compiler will be used to actually > compile the package. > > Edit $PETSC_ARCH/include/petscconf.h and remove the lines > > > #ifndef PETSC_HAVE_ISINF > #define PETSC_HAVE_ISINF 1 > #endif > > > #ifndef PETSC_HAVE_ISNAN > #define PETSC_HAVE_ISNAN 1 > #endif > > DO NOT rerun config/configure.py just do the make all again. > > Barry > > > -- > David Colignon, Ph.D. > Collaborateur Logistique du F.R.S.-FNRS > C?CI - Consortium des ?quipements de Calcul Intensif > ACE - Applied & Computational Electromagnetics > Sart-Tilman B28 > Universit? de Li?ge > 4000 Li?ge - BELGIQUE > T?l: +32 (0)4 366 37 32 > Fax: +32 (0)4 366 29 10 > WWW: http://hpc.montefiore.ulg.ac.be/ > Agenda: > http://www.google.com/calendar/embed?src=david.colignon%40gmail.com > > > > > Niall Moran wrote: > >> Hi, >> >> I am trying to compile petsc on snow leopard. I need to have complex >> support and support for C++. I am using the gcc and g++ compilers that come >> packaged with the developers tools >> >> i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646) >> Copyright (C) 2007 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR >> PURPOSE. >> >> and the gfortran compiler from http://hpc.sourceforge.net/. 
The >> configuration file I am using is
>>
>> #!/usr/bin/env python
>>
>> configure_options = [
>> '--FFLAGS=-m64',
>> '--CFLAGS=-m64',
>> '--CXXFLAGS=-m64',
>> '--LDFLAGS=-L/usr/lib',
>> '--with-python=0',
>> '--with-shared=0',
>> '--with-dynamic=0',
>> '--with-mpi-dir=/Users/nmoran/local/openmpi',
>> '--with-clanguage=C++',
>> '--with-scalar-type=complex',
>> '--with-debugging=yes',
>> '--with-gcov=0'
>> ]
>>
>> if __name__ == '__main__':
>> import sys,os
>> sys.path.insert(0,os.path.abspath('config'))
>> import configure
>> configure.petsc_configure(configure_options)
>>
>>
>> I am getting errors that the macros isinf and isnan cannot be found in the
>> scope for various files. The first one listed is src/sys/ftn-custom/zutils.c
>> on the lines
>>
>> return (PetscTruth) PetscIsInfOrNanScalar(*v);
>> and
>> return (PetscTruth) PetscIsInfOrNanReal(*v);
>>
>> These macros are defined in /usr/include/architecture/i386/math.h. No
>> errors are found if scalar-type is real. It seems the #include <complex>
>> somehow undefines these macros. I have got petsc to compile by redefining
>> these macros at the top of each of the problem files but this is not a very
>> elegant solution.
>>
>> Regards.
>>
>> Niall.
>>
>
-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nmoran at thphys.nuim.ie Fri Sep 18 07:32:35 2009
From: nmoran at thphys.nuim.ie (Niall Moran)
Date: Fri, 18 Sep 2009 13:32:35 +0100
Subject: Problem with MatMatMultTranspose
In-Reply-To: 
References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie>
Message-ID: <4AB37DE3.7020605@thphys.nuim.ie>

Matthew Knepley wrote:
> On Thu, Sep 17, 2009 at 12:44 PM, Niall Moran > wrote:
>
> Hi,
>
> I am just wondering if anything has changed on the status of this
> feature. It would be great to be able to perform matrix vector
> multiplications on complex Hermitian matrices by only providing
> one half of the matrix.
>
>
> It is not currently in the todo list since we have only had one
> request. It seems like it would just take being careful about the
> complex case for SBAIJ if you would like to try the implementation.
> We can answer questions.
>
Thanks for your rapid response. I would be interested in attempting to implement this for MPIAIJ if it would not be too involved. Would you be able to sketch a rough outline of what this would involve? Would I need to just modify the MatMult_MPIAIJ function or would I need to modify the creation of the scatterers in MatAssemblyBegin_MPIAIJ?

Could anyone suggest some documentation that describes how the matrix vector multiplication works in petsc for MPIAIJ typed matrices?

Does matrix vector multiplication with just the upper half work for real symmetric matrices?

Regards,

Niall.
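As a point of reference for the half-storage multiply being discussed in this thread, the replies below boil down to a single extra conjugation relative to the real symmetric case. Here is a minimal sketch in plain C99, not PETSc's SBAIJ code; the CSR-style arrays ai (row pointers), aj (column indices) and aa (stored values), and the function name, are hypothetical names chosen only for this illustration. Only the upper triangle (column j >= row i) of the Hermitian matrix is stored, and the mirrored entry below the diagonal is supplied on the fly as the conjugate:

#include <complex.h>

/* Minimal sketch, not PETSc's SBAIJ implementation: computes y = A*x where
   A is Hermitian and only its upper triangle (j >= i) is stored in CSR form.
   ai = row pointers, aj = column indices, aa = stored values (all
   hypothetical names used only for this illustration). */
void hermitian_upper_mult(int n, const int *ai, const int *aj,
                          const double complex *aa,
                          const double complex *x, double complex *y)
{
  for (int i = 0; i < n; i++) y[i] = 0.0;
  for (int i = 0; i < n; i++) {
    for (int k = ai[i]; k < ai[i+1]; k++) {
      int j = aj[k];
      y[i] += aa[k] * x[j];               /* stored entry a_ij with j >= i */
      if (j > i)
        y[j] += conj(aa[k]) * x[i];       /* implied entry a_ji = conj(a_ij) */
    }
  }
}

Dropping the conj() recovers the real symmetric multiply; that one-line difference is what makes the change to SBAIJ a minor one, as the replies below suggest.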
From knepley at gmail.com Fri Sep 18 07:38:12 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 18 Sep 2009 07:38:12 -0500 Subject: Problem with MatMatMultTranspose In-Reply-To: <4AB37DE3.7020605@thphys.nuim.ie> References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie> <4AB37DE3.7020605@thphys.nuim.ie> Message-ID: On Fri, Sep 18, 2009 at 7:32 AM, Niall Moran wrote: > Matthew Knepley wrote: > >> On Thu, Sep 17, 2009 at 12:44 PM, Niall Moran > nmoran at thphys.nuim.ie>> wrote: >> >> Hi, >> >> I am just wondering if anything has changed on the status of this >> feature. Would be great to be able to perform matrix vector >> multiplications on complex Hermitian matrices by only providing >> one half of the matrix. >> >> >> It is not curently in the todo list since we have only had one request. It >> seems like it would just take being careful about the >> complex case for SBAIJ if you would like the try the implementation. We >> can answer questions. >> >> Thanks for you rapid response. I would be interested in attempting to > implement this for MPIAIJ if it would not be too involved. Would you be able > to sketch a rough outline of what this would involve? Would I need to just > modify the MatMult_MPIAIJ function or would I need to modify the creation of > the scatterers in MatAssemblyBegin_MPIAIJ? > > Could anyone suggest some documentation that describes how the matrix > vector multiplication works in petsc for MPIAIJ typed matrices? > > Does matrix vector multiplication with just the upper half work for real > symmetric matrices? > 1) Understand the SBAIJ implementation. This is used for real symmetric matrices 2) In the places where the lower triangle is retrieved, add a complex conjugation. Matt > Regards, > > Niall. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Sep 18 08:48:12 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 18 Sep 2009 08:48:12 -0500 (CDT) Subject: Problem with MatMatMultTranspose In-Reply-To: References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie> <4AB37DE3.7020605@thphys.nuim.ie> Message-ID: Should we create a new matrix type, say hbaij, for operations of Hermitian matrices? Hong On Fri, 18 Sep 2009, Matthew Knepley wrote: > On Fri, Sep 18, 2009 at 7:32 AM, Niall Moran wrote: > >> Matthew Knepley wrote: >> >>> On Thu, Sep 17, 2009 at 12:44 PM, Niall Moran >> nmoran at thphys.nuim.ie>> wrote: >>> >>> Hi, >>> >>> I am just wondering if anything has changed on the status of this >>> feature. Would be great to be able to perform matrix vector >>> multiplications on complex Hermitian matrices by only providing >>> one half of the matrix. >>> >>> >>> It is not curently in the todo list since we have only had one request. It >>> seems like it would just take being careful about the >>> complex case for SBAIJ if you would like the try the implementation. We >>> can answer questions. >>> >>> Thanks for you rapid response. I would be interested in attempting to >> implement this for MPIAIJ if it would not be too involved. Would you be able >> to sketch a rough outline of what this would involve? 
Would I need to just >> modify the MatMult_MPIAIJ function or would I need to modify the creation of >> the scatterers in MatAssemblyBegin_MPIAIJ? >> >> Could anyone suggest some documentation that describes how the matrix >> vector multiplication works in petsc for MPIAIJ typed matrices? >> >> Does matrix vector multiplication with just the upper half work for real >> symmetric matrices? >> > > 1) Understand the SBAIJ implementation. This is used for real symmetric > matrices > > 2) In the places where the lower triangle is retrieved, add a complex > conjugation. > > Matt > > >> Regards, >> >> Niall. >> > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From knepley at gmail.com Fri Sep 18 09:03:44 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 18 Sep 2009 09:03:44 -0500 Subject: Problem with MatMatMultTranspose In-Reply-To: References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie> <4AB37DE3.7020605@thphys.nuim.ie> Message-ID: On Fri, Sep 18, 2009 at 8:48 AM, Hong Zhang wrote: > > Should we create a new matrix type, say hbaij, > for operations of Hermitian matrices? > No, it should be a minor perturbation. Matt > Hong > > > On Fri, 18 Sep 2009, Matthew Knepley wrote: > > On Fri, Sep 18, 2009 at 7:32 AM, Niall Moran >> wrote: >> >> Matthew Knepley wrote: >>> >>> On Thu, Sep 17, 2009 at 12:44 PM, Niall Moran >>> >>> nmoran at thphys.nuim.ie>> wrote: >>>> >>>> Hi, >>>> >>>> I am just wondering if anything has changed on the status of this >>>> feature. Would be great to be able to perform matrix vector >>>> multiplications on complex Hermitian matrices by only providing >>>> one half of the matrix. >>>> >>>> >>>> It is not curently in the todo list since we have only had one request. >>>> It >>>> seems like it would just take being careful about the >>>> complex case for SBAIJ if you would like the try the implementation. We >>>> can answer questions. >>>> >>>> Thanks for you rapid response. I would be interested in attempting to >>>> >>> implement this for MPIAIJ if it would not be too involved. Would you be >>> able >>> to sketch a rough outline of what this would involve? Would I need to >>> just >>> modify the MatMult_MPIAIJ function or would I need to modify the creation >>> of >>> the scatterers in MatAssemblyBegin_MPIAIJ? >>> >>> Could anyone suggest some documentation that describes how the matrix >>> vector multiplication works in petsc for MPIAIJ typed matrices? >>> >>> Does matrix vector multiplication with just the upper half work for real >>> symmetric matrices? >>> >>> >> 1) Understand the SBAIJ implementation. This is used for real symmetric >> matrices >> >> 2) In the places where the lower triangle is retrieved, add a complex >> conjugation. >> >> Matt >> >> >> Regards, >>> >>> Niall. >>> >>> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Sep 18 09:19:31 2009 From: jroman at dsic.upv.es (Jose E. 
Roman)
Date: Fri, 18 Sep 2009 16:19:31 +0200
Subject: Problem with MatMatMultTranspose
In-Reply-To: 
References: <48FCAA45.6050609@ewi.tudelft.nl> <48FED8DB.4020902@ewi.tudelft.nl> <4AB27575.4040206@thphys.nuim.ie> <4AB37DE3.7020605@thphys.nuim.ie>
Message-ID: 

On 18/09/2009, Matthew Knepley wrote:
> On Fri, Sep 18, 2009 at 8:48 AM, Hong Zhang > wrote:
>
> Should we create a new matrix type, say hbaij,
> for operations of Hermitian matrices?
>
> No, it should be a minor perturbation.
>
> Matt

I think that the complex SBAIJ can handle both cases by taking into account the MAT_SYMMETRIC and MAT_HERMITIAN flags (set with MatSetOption), then deciding whether elements should be conjugated or not.

Jose

From mafunk at nmsu.edu Fri Sep 18 10:32:24 2009
From: mafunk at nmsu.edu (Matt Funk)
Date: Fri, 18 Sep 2009 09:32:24 -0600
Subject: memory reporting question
In-Reply-To: 
References: <200909171704.09298.mafunk@nmsu.edu>
Message-ID: <200909180932.24495.mafunk@nmsu.edu>

Ok,

i am more confused now.
The following is part of my code:

PetscLogDouble mem;
PetscMemoryGetCurrentUsage(&mem);
cout<<"PetscMemoryGetCurrentUsage: "<<mem<<endl;
PetscMemoryGetMaximumUsage(&mem);
cout<<"PetscMemoryGetMaximumUsage: "<<mem<<endl;

This is what is reported:

PetscMemoryGetCurrentUsage: 5.25525e+08
PetscMemoryGetMaximumUsage: 3.09055e+08

Now, unless i am missing something obvious, i am completely confused how it is that the maximum usage can be less than the current usage.

What am i missing here?

thanks
matt

On Thursday 17 September 2009, you wrote:
> On Thu, Sep 17, 2009 at 6:04 PM, Matt Funk wrote:
> > Hi,
> >
> > I am wondering whether the -memory_info option and the
> > PetscMemoryGetMaximumUsage<http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Sys/PetscMemoryGetMaximumUsage.html#PetscMemoryGetMaximumUsage>() call report different things.
> >
> > The reason i am asking is because i call the PetscMemoryGetCurrentUsage
> > fcn in my code and it shows:
> > 5.25558e+08.
>
> This call get_rusage(), so it gives you the entire process size.
>
> > At the end of the run the -memory_info option reports:
> > max process malloc()'ed: 4.00524e+08
>
> This is a sampling of rusage every time an object is destroyed.
>
> > max petsc malloc()'ed: 1.15254e+08.
>
> This is all the memory malloced using PetscMalloc() summed.
>
> Matt
>
> > So i am a little confused by those numbers unless the fcn call is the
> > more complete picture (as said on the manual page)
> >
> > I guess then my question is whether the -memory_info corresponds to the
> > PetscMalloc* fcns?
> >
> > thanks
> > matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From balay at mcs.anl.gov Fri Sep 18 10:48:20 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 18 Sep 2009 10:48:20 -0500 (CDT)
Subject: memory reporting question
In-Reply-To: <200909180932.24495.mafunk@nmsu.edu>
References: <200909171704.09298.mafunk@nmsu.edu> <200909180932.24495.mafunk@nmsu.edu>
Message-ID: 

There is no destroy call between PetscMemoryGetCurrentUsage(), PetscMemoryGetMaximumUsage - so MaxUsage isn't updated. Perhaps it should be updated in the PetscMemoryGetCurrentUsage() call as well?

Satish

On Fri, 18 Sep 2009, Matt Funk wrote:

> Ok,
>
> i am more confused now.
> The following is part of my code:
>
> PetscLogDouble mem;
> PetscMemoryGetCurrentUsage(&mem);
> cout<<"PetscMemoryGetCurrentUsage: "<<mem<<endl;
> PetscMemoryGetMaximumUsage(&mem);
> cout<<"PetscMemoryGetMaximumUsage: "<<mem<<endl;
>
> This is what is reported:
>
> PetscMemoryGetCurrentUsage: 5.25525e+08
> PetscMemoryGetMaximumUsage: 3.09055e+08
>
>
> Now, unless i am missing something obvious, i am completely confused how it is
> that the maximum usage can be less than the current usage.
>
> What am i missing here?
> > > thanks > matt > > > > > On Thursday 17 September 2009, you wrote: > > On Thu, Sep 17, 2009 at 6:04 PM, Matt Funk wrote: > > > Hi, > > > > > > > > > I am wondering whether the -memory_info option and the > > > PetscMemoryGetMaximumUsage > >s/petsc-current/docs/manualpages/Sys/PetscMemoryGetMaximumUsage.html#Petsc > > >MemoryGetMaximumUsage>() call report different things. > > > > > > > > > The reason i am asking is because i call the PetscMemoryGetCurrentUsage > > > fcn in my code and it shows: > > > 5.25558e+08. > > > > This call get_rusage(), so it gives you the entire process size. > > > > > At the end of the run the -memory_info option reports: > > > max process malloc()'ed: 4.00524e+08 > > > > This is a sampling of rusage every time an object is destroyed. > > > > > max petsc malloc()'ed: 1.15254e+08. > > > > This is all the memory malloced using PetscMalloc() summed. > > > > Matt > > > > > So i am a little confused by those numbers unless the fcn call is the > > > more complete picture (as said on the manual page) > > > > > > > > > I guess then my question is whether the -memory_info corresponds to the > > > PetscMalloc* fcns? > > > > > > > > > thanks > > > matt > > > From fabioleite0 at gmail.com Fri Sep 18 13:16:23 2009 From: fabioleite0 at gmail.com (=?ISO-8859-1?Q?F=E1bio_Leite?=) Date: Fri, 18 Sep 2009 15:16:23 -0300 Subject: MatMatMult_MPIDense_MPIDense() works currently? In-Reply-To: References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> Message-ID: <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com> Hong Zhang, I put --download-plapack=1 to install plapack. Is there something extra to do ? Regards 2009/9/17 Hong Zhang > > Fabio, > > Did you install plapack with petsc? > Hong > > On Thu, 17 Sep 2009, Matthew Knepley wrote: > > Give us the exact command line you use for ex123 and the error output. >> Send >> to petsc-maint. >> >> Matt >> >> On Thu, Sep 17, 2009 at 12:37 PM, Fabio Leite Soares > >wrote: >> >> Hi everyone, I have the same problem and I don't know how to fix it. >>> >>> I need to multiply two mpi dense matrices using the BLAS3 routines. I >>> have >>> tried the MatMatMult_MPIDense_MPIDense() function but the console shows >>> this >>> message: >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC >>> < >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal%5B0%5DPETSC>ERROR: >>> or try >>> >>> http://valgrind.org on linux or man libgmalloc on Apple to find memory >>> corruption errors >>> [0]PETSC ERROR: likely location of problem given in stack below >>> [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>> available, >>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>> function >>> [0]PETSC ERROR: is given. 
>>> [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 >>> src/mat/impls/dense/mpi/mpidense.c >>> [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 >>> src/mat/impls/dense/mpi/mpidense.c >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >>> CDT 2009 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Thu Sep 17 >>> 14:28:28 2009 >>> [0]PETSC ERROR: Libraries linked from >>> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib >>> [0]PETSC ERROR: Configure run at Wed Sep 16 17:06:08 2009 >>> [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 >>> --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 >>> --with-scalar-type=real --with-precision=double --with-shared=0 >>> [0]PETSC ERROR: --------------------------[1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [1]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [1]PETSC ERROR: or see >>> >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[1]PETSC >>> < >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal%5B1%5DPETSC>ERROR: >>> or try >>> >>> http://valgrind.org on linux or man libgmalloc on Apple to find memory >>> corruption errors >>> [1]PETSC ERROR: likely location of problem given in stack below >>> [1]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>> available, >>> [1]PETSC ERROR: INSTEAD the line number of the start of the >>> function >>> [1]PETSC ERROR: is given. >>> [1]PETSC ERROR: [1] MatMPIDenseCopyToPlapack line 1028 >>> src/mat/impls/dense/mpi/mpidense.c >>> [1]PETSC ERROR: [1] MatMatMultNumeric_MPIDense_MPIDense line 1078 >>> src/mat/impls/dense/mpi/mpidense.c >>> [1]PETSC ERR---------------------------------------------- >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> OR: --------------------- Error Message >>> ------------------------------------ >>> [1]PETSC ERROR: Signal received! >>> [1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >>> CDT 2009 >>> [1]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [1]PETSC ERROR: See docs/index.html for manual pages. 
>>> [1]PETSC ERROR:
>>> ------------------------------------------------------------------------
>>> [1]PETSC ERROR: ./mult on a linux-gnu named hpcin-desktop by hpcin Thu
>>> Sep
>>> 17 14:28:27 2009
>>> [1]PETSC ERROR: Libraries linked from
>>> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib
>>> [1]PETSC ERROR: Configure run at Tue Sep 15 15:57:39 2009
>>> [1]PETSC ERROR: Configure options --download-plapack=1
>>> --download-f-blas-lapack=1 --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1
>>> --with-scalar-type=real --with-precision=double --with-shared=0
>>> [1]PETSC ERROR:
>>> ------------------------------------------------------------------------
>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory
>>> unknown file
>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1
>>> rank 1 in job 1 hpcin08_34697 caused collective abort of all ranks
>>> exit status of rank 1: return code 59
>>> rank 0 in job 1 hpcin08_34697 caused collective abort of all ranks
>>> exit status of rank 0: return code 59
>>>
>>>
>>> I tried to execute the ex123.c example and I did not succeed.
>>>
>>> Regards
>>>
>>> --
>>> Fábio Leite Soares
>>> Undergraduate Student of Computing Engineering
>>> Centro de Informática - UFPE - BRAZIL
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments
>> is infinitely more interesting than any results to which their experiments
>> lead.
>> -- Norbert Wiener
>>
>
-- 
Fábio Leite Soares
Graduando em Engenharia da Computação
Centro de Informática - UFPE - 2007.1
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hzhang at mcs.anl.gov Fri Sep 18 15:37:57 2009
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Fri, 18 Sep 2009 15:37:57 -0500 (CDT)
Subject: MatMatMult_MPIDense_MPIDense() works currently?
In-Reply-To: <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com>
References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com>
Message-ID: 

You may test your plapack installation using /src/mat/examples/tests/ex103.c or ex107.c
See runex103 or runex107 in /src/mat/examples/tests/makefile.

Hmm, ex123.c is a test for MatMatMult() using plapack. Running it, I get
[1]PETSC ERROR: Error in external library!
[1]PETSC ERROR: Due to aparent bugs in PLAPACK,this is not currently supported!

Thus, as it says, MatMatMult() for mpidense matrix is not supported.

Hong

On Fri, 18 Sep 2009, Fábio Leite wrote:

> Hong Zhang,
>
> I put --download-plapack=1 to install plapack. Is there something extra to
> do ?
>
> Regards
>
>
>
> 2009/9/17 Hong Zhang
>
>>
>> Fabio,
>>
>> Did you install plapack with petsc?
>> Hong
>>
>> On Thu, 17 Sep 2009, Matthew Knepley wrote:
>>
>> Give us the exact command line you use for ex123 and the error output.
>>> Send
>>> to petsc-maint.
>>>
>>> Matt
>>>
>>> On Thu, Sep 17, 2009 at 12:37 PM, Fabio Leite Soares
>>> wrote:
>>>
>>> Hi everyone, I have the same problem and I don't know how to fix it.
>>>>
>>>> I need to multiply two mpi dense matrices using the BLAS3 routines.
I >>>> have >>>> tried the MatMatMult_MPIDense_MPIDense() function but the console shows >>>> this >>>> message: >>>> >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [0]PETSC ERROR: or see >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC >>>> < >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal%5B0%5DPETSC>ERROR: >>>> or try >>>> >>>> http://valgrind.org on linux or man libgmalloc on Apple to find memory >>>> corruption errors >>>> [0]PETSC ERROR: likely location of problem given in stack below >>>> [0]PETSC ERROR: --------------------- Stack Frames >>>> ------------------------------------ >>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>> available, >>>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>>> function >>>> [0]PETSC ERROR: is given. >>>> [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 >>>> src/mat/impls/dense/mpi/mpidense.c >>>> [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 >>>> src/mat/impls/dense/mpi/mpidense.c >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >>>> CDT 2009 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Thu Sep 17 >>>> 14:28:28 2009 >>>> [0]PETSC ERROR: Libraries linked from >>>> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib >>>> [0]PETSC ERROR: Configure run at Wed Sep 16 17:06:08 2009 >>>> [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 >>>> --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 >>>> --with-scalar-type=real --with-precision=double --with-shared=0 >>>> [0]PETSC ERROR: --------------------------[1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [1]PETSC ERROR: or see >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[1]PETSC >>>> < >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal%5B1%5DPETSC>ERROR: >>>> or try >>>> >>>> http://valgrind.org on linux or man libgmalloc on Apple to find memory >>>> corruption errors >>>> [1]PETSC ERROR: likely location of problem given in stack below >>>> [1]PETSC ERROR: --------------------- Stack Frames >>>> ------------------------------------ >>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>> available, >>>> [1]PETSC ERROR: INSTEAD the line number of the start of the >>>> function >>>> [1]PETSC ERROR: is given. 
>>>> [1]PETSC ERROR: [1] MatMPIDenseCopyToPlapack line 1028 >>>> src/mat/impls/dense/mpi/mpidense.c >>>> [1]PETSC ERROR: [1] MatMatMultNumeric_MPIDense_MPIDense line 1078 >>>> src/mat/impls/dense/mpi/mpidense.c >>>> [1]PETSC ERR---------------------------------------------- >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>>> OR: --------------------- Error Message >>>> ------------------------------------ >>>> [1]PETSC ERROR: Signal received! >>>> [1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 >>>> CDT 2009 >>>> [1]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [1]PETSC ERROR: See docs/index.html for manual pages. >>>> [1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: ./mult on a linux-gnu named hpcin-desktop by hpcin Thu >>>> Sep >>>> 17 14:28:27 2009 >>>> [1]PETSC ERROR: Libraries linked from >>>> /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib >>>> [1]PETSC ERROR: Configure run at Tue Sep 15 15:57:39 2009 >>>> [1]PETSC ERROR: Configure options --download-plapack=1 >>>> --download-f-blas-lapack=1 --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 >>>> --with-scalar-type=real --with-precision=double --with-shared=0 >>>> [1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 >>>> rank 1 in job 1 hpcin08_34697 caused collective abort of all ranks >>>> exit status of rank 1: return code 59 >>>> rank 0 in job 1 hpcin08_34697 caused collective abort of all ranks >>>> exit status of rank 0: return code 59 >>>> >>>> >>>> I tried to execute the ex123.c example and I did not succeeded to. >>>> >>>> Regards >>>> >>>> -- >>>> F?bio Leite Soares >>>> Undergraduate Student of Computing Engineering >>>> Centro de Inform?tica - UFPE - BRAZIL >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener >>> >> > > > -- > F?bio Leite Soares > Graduando em Engenharia da Computa??o > Centro de Inform?tica - UFPE - 2007.1 > From fabioleite0 at gmail.com Fri Sep 18 18:17:39 2009 From: fabioleite0 at gmail.com (=?ISO-8859-1?Q?F=E1bio_Leite?=) Date: Fri, 18 Sep 2009 20:17:39 -0300 Subject: MatMatMult_MPIDense_MPIDense() works currently? In-Reply-To: References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com> Message-ID: <1f9a136d0909181617yaf02b2esf53fb4e1045f15f6@mail.gmail.com> Thanks Hong, I tested this examples (ex103.c and ex107.c) and worked fine ! Now I want to multiply two dense matrices using BLAS3 routines. I try to use the MatMatMultNumeric_MPIDense_MPIDense() function implemented in src/mat/impls/dense/mpi/mpidense.c but I see this error in console: [...] 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSCERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 /home/hpcin/soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 /home/hpcin/soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 CDT 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Fri Sep 18 19:14:55 2009 [0]PETSC ERROR: Libraries linked from /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Fri Sep 18 16:03:03 2009 [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 --with-scalar-type=real --with-precision=d[1] [...] Can I really use this function to multiply two dense matrices ? If not, how can I multiply these matrices using BLAS3 routines ? I am attaching my code if you want to see how I am doing. Regards. -- F?bio Leite Soares Undergraduate Student of Computing Engineering Centro de Inform?tica - UFPE - BRAZIL -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 10 10 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 -------------- next part -------------- A non-text attachment was scrubbed... Name: mult.c Type: text/x-csrc Size: 2564 bytes Desc: not available URL: From bsmith at mcs.anl.gov Fri Sep 18 19:28:02 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 18 Sep 2009 19:28:02 -0500 Subject: MatMatMult_MPIDense_MPIDense() works currently? 
In-Reply-To: <1f9a136d0909181617yaf02b2esf53fb4e1045f15f6@mail.gmail.com> References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com> <1f9a136d0909181617yaf02b2esf53fb4e1045f15f6@mail.gmail.com> Message-ID: <49FDF73E-E4B8-4AD1-B694-A39B20C6B96C@mcs.anl.gov> This mail belongs in petsc-maint at mcs.anl.gov You will need to run in the debugger to see what it is crashing or even better run in the debugger with valgrind (see valgrind.org). Barry On Sep 18, 2009, at 6:17 PM, F?bio Leite wrote: > Thanks Hong, I tested this examples (ex103.c and ex107.c) and worked > fine ! > Now I want to multiply two dense matrices using BLAS3 routines. I > try to use the MatMatMultNumeric_MPIDense_MPIDense() function > implemented in src/mat/impls/dense/mpi/mpidense.c but I see this > error in console: > > > [...] > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or - > on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal > [0]PETSC ERROR: or try http://valgrind.org on linux or man > libgmalloc on Apple to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 /home/hpcin/ > soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 / > home/hpcin/soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 > 14:02:12 CDT 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Fri Sep > 18 19:14:55 2009 > [0]PETSC ERROR: Libraries linked from /home/hpcin/soft/petsc-3.0.0- > p8/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Fri Sep 18 16:03:03 2009 > [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 -- > download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 --with- > scalar-type=real --with-precision=d[1] > [...] > > > > Can I really use this function to multiply two dense matrices ? If > not, how can I multiply these matrices using BLAS3 routines ? > I am attaching my code if you want to see how I am doing. > > Regards. > > > > -- > F?bio Leite Soares > Undergraduate Student of Computing Engineering > Centro de Inform?tica - UFPE - BRAZIL > From hzhang at mcs.anl.gov Fri Sep 18 20:50:50 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 18 Sep 2009 20:50:50 -0500 (CDT) Subject: MatMatMult_MPIDense_MPIDense() works currently? 
In-Reply-To: <1f9a136d0909181617yaf02b2esf53fb4e1045f15f6@mail.gmail.com> References: <1f9a136d0909171037o6b93c113xec0fd3b6c6255e16@mail.gmail.com> <1f9a136d0909181116m3793e6ebj46f0502c3c157135@mail.gmail.com> <1f9a136d0909181617yaf02b2esf53fb4e1045f15f6@mail.gmail.com> Message-ID: In my previous email, I tested petsc-dev/src/mat/examples/tests/ex123.c on MatMatMult_MPIDense_MPIDense() and get error: Hmm, ex123.c is a test for MatMatMult() using plapack. runnning it, I get [1]PETSC ERROR: Error in external library! [1]PETSC ERROR: Due to aparent bugs in PLAPACK,this is not currently supported! Thus, as it says, MatMatMult() for mpidense matrix is not supported by petsc. > Thanks Hong, I tested this examples (ex103.c and ex107.c) and worked fine ! > Now I want to multiply two dense matrices using BLAS3 routines. I try to use > the MatMatMultNumeric_MPIDense_MPIDense() function implemented in > src/mat/impls/dense/mpi/mpidense.c but I see this error in console: This is likely the same error. We added error message in petsc-dev. BLAS3 is for sequential run. We do not support interface with scalapack, which is a parallel dense matrix package. Hong > > > [...] > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSCERROR: > or try > http://valgrind.org on linux or man libgmalloc on Apple to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatMPIDenseCopyToPlapack line 1028 > /home/hpcin/soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: [0] MatMatMultNumeric_MPIDense_MPIDense line 1078 > /home/hpcin/soft/petsc-3.0.0-p8/src/mat/impls/dense/mpi/mpidense.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 > CDT 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./mult on a linux-gnu named hpcin08 by hpcin Fri Sep 18 > 19:14:55 2009 > [0]PETSC ERROR: Libraries linked from > /home/hpcin/soft/petsc-3.0.0-p8/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Fri Sep 18 16:03:03 2009 > [0]PETSC ERROR: Configure options --download-f-blas-lapack=1 > --download-plapack --with-mpi-dir=/usr/local/bin/mpich2-1.1.1p1 > --with-scalar-type=real --with-precision=d[1] > [...] > > > > Can I really use this function to multiply two dense matrices ? If not, how > can I multiply these matrices using BLAS3 routines ? > I am attaching my code if you want to see how I am doing. > > Regards. 
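On a single process, the dense product asked about above can still be formed inside PETSc with sequential dense matrices, where MatMatMult() runs through the dense (BLAS3-backed) kernels. A minimal sketch against the 3.0-era C API — the sizes and fill value are invented for illustration, and it assumes the release at hand supports MatMatMult() for two MATSEQDENSE operands:

#include "petscmat.h"

int main(int argc, char **argv)
{
  Mat            A, B, C;
  PetscInt       i, j, n = 10;
  PetscScalar    v = 5.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);

  /* two n x n sequential dense matrices; PETSC_NULL lets PETSc allocate storage */
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &A); CHKERRQ(ierr);
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &B); CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      ierr = MatSetValues(A, 1, &i, 1, &j, &v, INSERT_VALUES); CHKERRQ(ierr);
      ierr = MatSetValues(B, 1, &i, 1, &j, &v, INSERT_VALUES); CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  /* C = A*B; for MATSEQDENSE this dispatches to the dense (BLAS) kernels */
  ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C); CHKERRQ(ierr);

  ierr = MatDestroy(A); CHKERRQ(ierr);
  ierr = MatDestroy(B); CHKERRQ(ierr);
  ierr = MatDestroy(C); CHKERRQ(ierr);
  ierr = PetscFinalize(); CHKERRQ(ierr);
  return 0;
}

For a genuinely parallel dense product an external package is needed, which, as Hong explains above, PETSc does not interface to at this point.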
> 
> 
> 
> --
> Fábio Leite Soares
> Undergraduate Student of Computing Engineering
> Centro de Informática - UFPE - BRAZIL
> 
From vyan2000 at gmail.com Sat Sep 19 14:12:17 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sat, 19 Sep 2009 15:12:17 -0400
Subject: Out of Memory Error.
Message-ID: 

Hi All,
My application code is reading PETSc binary files to obtain the
information about a linear system and then solve it in parallel.

The code works well for medium-size problems. Now, I am testing the largest
case requested by our customer on *one* processor. I got the following errors.

It looks like the error happened when PETSc requested a malloc of size
"[0]PETSC ERROR: Memory requested 44784088!", but I did see that there are
PETSc routines that use even more memory than "44784088",
for instance, "[0] 46 5520000 ISGetIndices_Stride()". So can I guess that the
error is caused by the hardware memory limitation?

The code was running on a MIPS machine with 6 CPUs on each node.
The code broke for 1 node with 1 process,
               for 1 node with 2 processes,
               for 1 node with 6 processes.

But the code succeeded for 2 nodes with 2 processes,
                     and for 2 nodes with 4 processes.
The code also succeeded when the node number is bigger than 2.

Is this another indicator of the hardware limitation?

Thanks a lot,

Yan

$ srun -p sci-comp -N 1 -n 1 ./rpisolve_25_field -ksp_monitor_true_residual -log_summary -malloc_dump -malloc_log >& out.rpisolve.N1.n1
$ cat out.rpisolve.N1.n1

breakpoint 1
breakpoint 2
breakpoint 750000
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
[0] Maximum memory PetscMalloc()ed 3172769832 maximum size of entire process 0
[0] Memory usage sorted by function
[0] 2 3216 ClassPerfLogCreate()
[0] 2 1616 ClassRegLogCreate()
[0] 2 6416 EventPerfLogCreate()
[0] 1 12800 EventPerfLogEnsureSize()
[0] 2 1616 EventRegLogCreate()
[0] 1 3200 EventRegLogRegister()
[0] 92 11960 ISCreateBlock()
[0] 292 36792 ISCreateStride()
[0] 46 5520000 ISGetIndices_Stride()
[0] 78 21632 KSPCreate()
[0] 1 200 KSPCreate_FGMRES()
[0] 26 416 KSPDefaultConvergedCreate()
[0] 6 17600 KSPSetUp_FGMRES()
[0] 475 180880 MatCreate()
[0] 24 3648 MatCreate_MPIAIJ()
[0] 71 22152 MatCreate_SeqAIJ()
[0] 1 1504 MatGetRow_MPIAIJ()
[0] 23 368 MatGetSubMatrices_MPIAIJ()
[0] 690 140770488 MatGetSubMatrices_MPIAIJ_Local()
[0] 22 5280176 MatGetSubMatrix_MPIAIJ()
[0] 7 1497800024 MatLoad_MPIAIJ()
[0] 68 13920000 MatMarkDiagonal_SeqAIJ()
[0] 138 1236969200 MatSeqAIJSetPreallocation_SeqAIJ()
[0] 23 184 MatSetUpMultiply_MPIAIJ()
[0] 24 192 MatStashCreate_Private()
[0] 138 1288 MatStashScatterBegin_Private()
[0] 23 184 Mat_CheckCompressedRow()
[0] 45 8280360 Mat_CheckInode()
[0] 78 14768 PCCreate()
[0] 1 120 PCCreate_FieldSplit()
[0] 2 208 PCFieldSplitSetDefaults()
[0] 50 2400 PCFieldSplitSetFields_FieldSplit()
[0] 1 104 PCSetFromOptions_FieldSplit()
[0] 1 200 PCSetUp_FieldSplit()
[0] 3 24 PetscCommDuplicate()
[0] 1768 84864 PetscFListAdd()
[0] 46 368 PetscGatherNumberOfMessages()
[0] 237 1896 PetscMapSetUp()
[0] 4 32 PetscMaxSum()
[0] 22 5984 PetscOListAdd()
[0] 75 4800 PetscOptionsCreate_Private()
[0] 4 96 PetscOptionsGetEList()
[0] 6 384000 PetscOptionsInsertFile()
[0] 75 600 PetscOptionsInt()
[0] 92 736 PetscPostIrecvInt()
[0] 46 368 PetscPostIrecvScalar()
[0] 0 32 PetscPushSignalHandler()
[0] 4570 130832 PetscStrallocpy()
[0] 69 16924048 PetscTableCreate()
[0] 1 16
PetscViewerASCIIMonitorCreate() [0] 1 16 PetscViewerASCIIOpen() [0] 12 1952 PetscViewerCreate() [0] 1 56 PetscViewerCreate_ASCII() [0] 3 192 PetscViewerCreate_Binary() [0] 2 528 StackCreate() [0] 2 1008 StageLogCreate() [0] 2 16 VecAssemblyBegin_MPI() [0] 236 74104 VecCreate() [0] 49 78003168 VecCreate_MPI_Private() [0] 23 552 VecCreate_Seq_Private() [0] 2 80 VecDuplicateVecs_Default() [0] 92 11224 VecScatterCreate() [0] 72 576 VecStashCreate_Private() [0] 28 1056 VecStashScatterBegin_Private() [0]PETSC ERROR: Memory requested 44784088! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 CDT 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /tmp/lustre/home/yy2250/local/PETSc/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ttt_5fld/./rpisolve_25_field on a O-hypre-n named sci-m0n0.scsystem by yy2250 Sat Sep 19 14:37:43 2009 [0]PETSC ERROR: Libraries linked from /home/yy2250/local/PETSc/petsc-test-3-p5/O-hypre-nodebug/lib [0]PETSC ERROR: Configure run at Tue Jul 21 15:19:41 2009 [0]PETSC ERROR: Configure options --with-cc=mpicc --with-fc=mpif77 --with-mpiexec=srun --with-debugging=0 --with-fortran-kernels=generic --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2986 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatSeqAIJSetPreallocation() line 2928 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ_Local() line 1267 in src/mat/impls/aij/mpi/mpiov.c [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ() line 787 in src/mat/impls/aij/mpi/mpiov.c [0]PETSC ERROR: MatGetSubMatrices() line 5524 in src/mat/interface/matrix.c [0]PETSC ERROR: MatGetSubMatrix_MPIAIJ() line 3069 in src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: MatGetSubMatrix() line 6212 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_FieldSplit() line 285 in src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 246 in src/ksp/ksp/examples/tutorials/rpisolve_25_field.c application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 In: PMI_Abort(1, application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0) srun: error: task 0: Exited with exit code 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 19 14:24:03 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 19 Sep 2009 14:24:03 -0500 Subject: Out of Memory Error. In-Reply-To: References: Message-ID: On Sat, Sep 19, 2009 at 2:12 PM, Ryan Yan wrote: > Hi All, > My application code is reading PETSc binary files to obtain the information > about a linear system and then solve it in parallel. > > The code works well for median size problem. Now, I am testing a largest > case requested by our custom on *one* processor. I got the following errors. 
> > It looks like that error happenned when PETSc is requesting an malloc of > size "[0]PETSC ERROR: Memory requested 44784088!", but I did see there are > PETSc routines the use even more memory than "44784088", > for instance, "[0] 46 5520000 ISGetIndices_Stride()". So can I guess error > is caused by the hardware memory limitation? > You are running out of memory. If you want to run bigger problems, you will have to use more nodes. Matt > The code was running on MPIS machine with 6 CPUs one each Node. > The code broke for 1 Node with 1 process > for 1 Node with 2 process > for 1 Node with 6 process > > But the code succeed for 2 Node with 2 process. > for 2 Node with 4 process. > The code also succeed when Node number is big than 2. > > Is this another indicator of the hardware limitation? > > Thanks a lot, > > Yan > > > $ srun -p sci-comp -N 1 -n 1 ./rpisolve_25_field -ksp_monitor_true_residual > -log_summary -malloc_dump -malloc_log >& out.rpisolve.N1.n1 > $ cat out.rpisolve.N1.n1 > > > > > > breakpoint 1 > breakpoint 2 > breakpoint 750000 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0] Maximum memory PetscMalloc()ed 3172769832 maximum size of entire > process 0 > [0] Memory usage sorted by function > [0] 2 3216 ClassPerfLogCreate() > [0] 2 1616 ClassRegLogCreate() > [0] 2 6416 EventPerfLogCreate() > [0] 1 12800 EventPerfLogEnsureSize() > [0] 2 1616 EventRegLogCreate() > [0] 1 3200 EventRegLogRegister() > [0] 92 11960 ISCreateBlock() > [0] 292 36792 ISCreateStride() > [0] 46 5520000 ISGetIndices_Stride() > [0] 78 21632 KSPCreate() > [0] 1 200 KSPCreate_FGMRES() > [0] 26 416 KSPDefaultConvergedCreate() > [0] 6 17600 KSPSetUp_FGMRES() > [0] 475 180880 MatCreate() > [0] 24 3648 MatCreate_MPIAIJ() > [0] 71 22152 MatCreate_SeqAIJ() > [0] 1 1504 MatGetRow_MPIAIJ() > [0] 23 368 MatGetSubMatrices_MPIAIJ() > [0] 690 140770488 MatGetSubMatrices_MPIAIJ_Local() > [0] 22 5280176 MatGetSubMatrix_MPIAIJ() > [0] 7 1497800024 MatLoad_MPIAIJ() > [0] 68 13920000 MatMarkDiagonal_SeqAIJ() > [0] 138 1236969200 MatSeqAIJSetPreallocation_SeqAIJ() > [0] 23 184 MatSetUpMultiply_MPIAIJ() > [0] 24 192 MatStashCreate_Private() > [0] 138 1288 MatStashScatterBegin_Private() > [0] 23 184 Mat_CheckCompressedRow() > [0] 45 8280360 Mat_CheckInode() > [0] 78 14768 PCCreate() > [0] 1 120 PCCreate_FieldSplit() > [0] 2 208 PCFieldSplitSetDefaults() > [0] 50 2400 PCFieldSplitSetFields_FieldSplit() > [0] 1 104 PCSetFromOptions_FieldSplit() > [0] 1 200 PCSetUp_FieldSplit() > [0] 3 24 PetscCommDuplicate() > [0] 1768 84864 PetscFListAdd() > [0] 46 368 PetscGatherNumberOfMessages() > [0] 237 1896 PetscMapSetUp() > [0] 4 32 PetscMaxSum() > [0] 22 5984 PetscOListAdd() > [0] 75 4800 PetscOptionsCreate_Private() > [0] 4 96 PetscOptionsGetEList() > [0] 6 384000 PetscOptionsInsertFile() > [0] 75 600 PetscOptionsInt() > [0] 92 736 PetscPostIrecvInt() > [0] 46 368 PetscPostIrecvScalar() > [0] 0 32 PetscPushSignalHandler() > [0] 4570 130832 PetscStrallocpy() > [0] 69 16924048 PetscTableCreate() > [0] 1 16 PetscViewerASCIIMonitorCreate() > [0] 1 16 PetscViewerASCIIOpen() > [0] 12 1952 PetscViewerCreate() > [0] 1 56 PetscViewerCreate_ASCII() > [0] 3 192 PetscViewerCreate_Binary() > [0] 2 528 StackCreate() > [0] 2 1008 StageLogCreate() > [0] 2 16 VecAssemblyBegin_MPI() > [0] 236 74104 VecCreate() > [0] 49 78003168 
VecCreate_MPI_Private() > [0] 23 552 VecCreate_Seq_Private() > [0] 2 80 VecDuplicateVecs_Default() > [0] 92 11224 VecScatterCreate() > [0] 72 576 VecStashCreate_Private() > [0] 28 1056 VecStashScatterBegin_Private() > [0]PETSC ERROR: Memory requested 44784088! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 > CDT 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: > /tmp/lustre/home/yy2250/local/PETSc/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ttt_5fld/./rpisolve_25_field > on a O-hypre-n named sci-m0n0.scsystem by yy2250 Sat Sep 19 14:37:43 2009 > [0]PETSC ERROR: Libraries linked from > /home/yy2250/local/PETSc/petsc-test-3-p5/O-hypre-nodebug/lib > [0]PETSC ERROR: Configure run at Tue Jul 21 15:19:41 2009 > [0]PETSC ERROR: Configure options --with-cc=mpicc --with-fc=mpif77 > --with-mpiexec=srun --with-debugging=0 --with-fortran-kernels=generic > --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2986 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatSeqAIJSetPreallocation() line 2928 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ_Local() line 1267 in > src/mat/impls/aij/mpi/mpiov.c > [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ() line 787 in > src/mat/impls/aij/mpi/mpiov.c > [0]PETSC ERROR: MatGetSubMatrices() line 5524 in src/mat/interface/matrix.c > [0]PETSC ERROR: MatGetSubMatrix_MPIAIJ() line 3069 in > src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: MatGetSubMatrix() line 6212 in src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_FieldSplit() line 285 in > src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: main() line 246 in > src/ksp/ksp/examples/tutorials/rpisolve_25_field.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > In: PMI_Abort(1, application called MPI_Abort(MPI_COMM_WORLD, 1) - process > 0) > srun: error: task 0: Exited with exit code 1 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sat Sep 19 14:46:09 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sat, 19 Sep 2009 15:46:09 -0400 Subject: Out of Memory Error. In-Reply-To: References: Message-ID: Matt, thanks for the confirmation. Yan On Sat, Sep 19, 2009 at 3:24 PM, Matthew Knepley wrote: > On Sat, Sep 19, 2009 at 2:12 PM, Ryan Yan wrote: > >> Hi All, >> My application code is reading PETSc binary files to obtain the >> information about a linear system and then solve it in parallel. >> >> The code works well for median size problem. 
Now, I am testing a largest >> case requested by our custom on *one* processor. I got the following errors. >> >> It looks like that error happenned when PETSc is requesting an malloc of >> size "[0]PETSC ERROR: Memory requested 44784088!", but I did see there are >> PETSc routines the use even more memory than "44784088", >> for instance, "[0] 46 5520000 ISGetIndices_Stride()". So can I guess error >> is caused by the hardware memory limitation? >> > > You are running out of memory. If you want to run bigger problems, you will > have to use more nodes. > > Matt > > >> The code was running on MPIS machine with 6 CPUs one each Node. >> The code broke for 1 Node with 1 process >> for 1 Node with 2 process >> for 1 Node with 6 process >> >> But the code succeed for 2 Node with 2 process. >> for 2 Node with 4 process. >> The code also succeed when Node number is big than 2. >> >> Is this another indicator of the hardware limitation? >> >> Thanks a lot, >> >> Yan >> >> >> $ srun -p sci-comp -N 1 -n 1 ./rpisolve_25_field >> -ksp_monitor_true_residual -log_summary -malloc_dump -malloc_log >& >> out.rpisolve.N1.n1 >> $ cat out.rpisolve.N1.n1 >> >> >> >> >> >> breakpoint 1 >> breakpoint 2 >> breakpoint 750000 >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Out of memory. This could be due to allocating >> [0]PETSC ERROR: too large an object or bleeding by not properly >> [0]PETSC ERROR: destroying unneeded objects. >> [0] Maximum memory PetscMalloc()ed 3172769832 maximum size of entire >> process 0 >> [0] Memory usage sorted by function >> [0] 2 3216 ClassPerfLogCreate() >> [0] 2 1616 ClassRegLogCreate() >> [0] 2 6416 EventPerfLogCreate() >> [0] 1 12800 EventPerfLogEnsureSize() >> [0] 2 1616 EventRegLogCreate() >> [0] 1 3200 EventRegLogRegister() >> [0] 92 11960 ISCreateBlock() >> [0] 292 36792 ISCreateStride() >> [0] 46 5520000 ISGetIndices_Stride() >> [0] 78 21632 KSPCreate() >> [0] 1 200 KSPCreate_FGMRES() >> [0] 26 416 KSPDefaultConvergedCreate() >> [0] 6 17600 KSPSetUp_FGMRES() >> [0] 475 180880 MatCreate() >> [0] 24 3648 MatCreate_MPIAIJ() >> [0] 71 22152 MatCreate_SeqAIJ() >> [0] 1 1504 MatGetRow_MPIAIJ() >> [0] 23 368 MatGetSubMatrices_MPIAIJ() >> [0] 690 140770488 MatGetSubMatrices_MPIAIJ_Local() >> [0] 22 5280176 MatGetSubMatrix_MPIAIJ() >> [0] 7 1497800024 MatLoad_MPIAIJ() >> [0] 68 13920000 MatMarkDiagonal_SeqAIJ() >> [0] 138 1236969200 MatSeqAIJSetPreallocation_SeqAIJ() >> [0] 23 184 MatSetUpMultiply_MPIAIJ() >> [0] 24 192 MatStashCreate_Private() >> [0] 138 1288 MatStashScatterBegin_Private() >> [0] 23 184 Mat_CheckCompressedRow() >> [0] 45 8280360 Mat_CheckInode() >> [0] 78 14768 PCCreate() >> [0] 1 120 PCCreate_FieldSplit() >> [0] 2 208 PCFieldSplitSetDefaults() >> [0] 50 2400 PCFieldSplitSetFields_FieldSplit() >> [0] 1 104 PCSetFromOptions_FieldSplit() >> [0] 1 200 PCSetUp_FieldSplit() >> [0] 3 24 PetscCommDuplicate() >> [0] 1768 84864 PetscFListAdd() >> [0] 46 368 PetscGatherNumberOfMessages() >> [0] 237 1896 PetscMapSetUp() >> [0] 4 32 PetscMaxSum() >> [0] 22 5984 PetscOListAdd() >> [0] 75 4800 PetscOptionsCreate_Private() >> [0] 4 96 PetscOptionsGetEList() >> [0] 6 384000 PetscOptionsInsertFile() >> [0] 75 600 PetscOptionsInt() >> [0] 92 736 PetscPostIrecvInt() >> [0] 46 368 PetscPostIrecvScalar() >> [0] 0 32 PetscPushSignalHandler() >> [0] 4570 130832 PetscStrallocpy() >> [0] 69 16924048 PetscTableCreate() >> [0] 1 16 PetscViewerASCIIMonitorCreate() >> [0] 1 16 PetscViewerASCIIOpen() >> [0] 12 1952 
PetscViewerCreate() >> [0] 1 56 PetscViewerCreate_ASCII() >> [0] 3 192 PetscViewerCreate_Binary() >> [0] 2 528 StackCreate() >> [0] 2 1008 StageLogCreate() >> [0] 2 16 VecAssemblyBegin_MPI() >> [0] 236 74104 VecCreate() >> [0] 49 78003168 VecCreate_MPI_Private() >> [0] 23 552 VecCreate_Seq_Private() >> [0] 2 80 VecDuplicateVecs_Default() >> [0] 92 11224 VecScatterCreate() >> [0] 72 576 VecStashCreate_Private() >> [0] 28 1056 VecStashScatterBegin_Private() >> [0]PETSC ERROR: Memory requested 44784088! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37 >> CDT 2009 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: >> /tmp/lustre/home/yy2250/local/PETSc/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ttt_5fld/./rpisolve_25_field >> on a O-hypre-n named sci-m0n0.scsystem by yy2250 Sat Sep 19 14:37:43 2009 >> [0]PETSC ERROR: Libraries linked from >> /home/yy2250/local/PETSc/petsc-test-3-p5/O-hypre-nodebug/lib >> [0]PETSC ERROR: Configure run at Tue Jul 21 15:19:41 2009 >> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-fc=mpif77 >> --with-mpiexec=srun --with-debugging=0 --with-fortran-kernels=generic >> --with-shared=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c >> [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c >> [0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2986 in >> src/mat/impls/aij/seq/aij.c >> [0]PETSC ERROR: MatSeqAIJSetPreallocation() line 2928 in >> src/mat/impls/aij/seq/aij.c >> [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ_Local() line 1267 in >> src/mat/impls/aij/mpi/mpiov.c >> [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ() line 787 in >> src/mat/impls/aij/mpi/mpiov.c >> [0]PETSC ERROR: MatGetSubMatrices() line 5524 in >> src/mat/interface/matrix.c >> [0]PETSC ERROR: MatGetSubMatrix_MPIAIJ() line 3069 in >> src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: MatGetSubMatrix() line 6212 in src/mat/interface/matrix.c >> [0]PETSC ERROR: PCSetUp_FieldSplit() line 285 in >> src/ksp/pc/impls/fieldsplit/fieldsplit.c >> [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c >> [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: main() line 246 in >> src/ksp/ksp/examples/tutorials/rpisolve_25_field.c >> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 >> In: PMI_Abort(1, application called MPI_Abort(MPI_COMM_WORLD, 1) - process >> 0) >> srun: error: task 0: Exited with exit code 1 >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sun Sep 20 10:20:29 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 20 Sep 2009 11:20:29 -0400 Subject: BAIJ or AIJ Message-ID: Hi All, I have a large size application. Mesh size is 30000 nodes, with dof 25 on each vertex. 
And it's a 3d unstructured mesh, with both Tet and Hex elements. In the
following I will denote blksize=25.

I am testing how to build up a PETSc matrix object quickly.

The data I have is Block Compressed Sparse Row (BCSR) files. And my
objective is to read the BCSR files and generate PETSc binaries.

First, I chose MATMPIAIJ: I opened the BCSR data files on each processor
and set up the preallocation using
MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); The
reason why I chose 25 as the number for d_nz and o_nz is that I do not have
access to the ordering of the vertices. So it's the worst-case setup, and
it takes about 7 minutes on 30 MIPS nodes (180 processors) to write the
output into PETSc binaries.

Second, I chose MATMPIBAIJ and followed the same procedure as above, but
set up
MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL),
where blksize = 25; this is also the worst case. This experiment takes
forever and could not generate the PETSc binaries.

I guess the reason why it takes so long in the MATMPIBAIJ case is that I
did not set up the preallocation accurately. Although I think the
preallocation is also not accurate in the MATMPIAIJ case, it seems the
preallocation effect is not as serious as for MPIBAIJ. Please correct me
if there are other reasons.

Can anyone please give a hint on how to set up the preallocation right for
an unstructured mesh without knowing the mesh ordering?

Thank you very much in advance,

Yan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From vyan2000 at gmail.com Sun Sep 20 10:28:04 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 20 Sep 2009 11:28:04 -0400
Subject: BAIJ or AIJ
In-Reply-To: 
References: 
Message-ID: 

I forgot to mention that the stencil of the mesh has face connectivity.
For each tetrahedron mesh point, it has a self-connectivity and 4 other
connectivities (a tet has 4 faces). For each hexahedron, it has a
self-connectivity and 6 other connectivities (a hex has 6 faces).

The dof 25 results from [u, v, w, p, a]^5; here 5 means 5 physical phases,
for instance, bubble1, liquid1, bubble2, liquid2, solid1. u, v, w, p, and a
come from the conservation laws of the physics.

Thanks,

Yan

On Sun, Sep 20, 2009 at 11:20 AM, Ryan Yan wrote:

> Hi All,
> I have a large size application. Mesh size is 30000 nodes, with dof 25 on
> each vertex. And it's 3d unstructured, Tet, and Hex mesh. In the following
> I will denote blksize=25
>
> I am testing how to build up a PETSc matrix object quick and fast.
>
> The data I have is Block Compressed Sparse Row(BCSR) files. And my
> objective is to read BCSR files and generate PETSc Binaries
>
> Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on each
> processor, and set up the preallocation use
> MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); The
> reason why I choose 25 as the number for d_nz and o_nz is that I do not have
> access to the ordering of the vertices. So it's the worst case set up, and
> it takes about 7 minutes on 30 MIPS node(180 processors) to write the output
> into PETSc binaries.
>
> Secondly, I choose the MatMPIBAIJ, and same procedure as above, but also
> set up
> MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL),
> here blksize = 25 and it's also the worst case; This experiments takes
> forever and could not generate the PETSc binaries.
>
> I guess the reason why it takes so long in the MATMPIBAIJ case is that I
> did not set up the preallocation accurately.
> Alougth I think the preallocation is also not accurate in the MATMPIAIJ
> case, but it seems like the preallocation effect is not as serious as for
> the MPIBAIJ. Correct me please, if there are other reasons.
>
> Can I anyone please give a hint on how to set up the preallocation right
> for a unstructured mesh without knowing the mesh ordering?
>
> Thank you very much in advance,
>
> Yan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From bsmith at mcs.anl.gov Sun Sep 20 10:36:53 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 20 Sep 2009 10:36:53 -0500
Subject: BAIJ or AIJ
In-Reply-To: 
References: 
Message-ID: <7513EF7F-6D89-4AD2-8F51-1765785A1766@mcs.anl.gov>

  Yan,

    Simply read through the ASCII file(s) twice. The first time, count the
number of blocks per row; then preallocate; then read through the ASCII
file again, reading and setting the values. This will be very fast.

   Barry

On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote:

> Hi All,
> I have a large size application. Mesh size is 30000 nodes, with dof
> 25 on each vertex. And it's 3d unstructured, Tet, and Hex mesh. In
> the following I will denote blksize=25
>
> I am testing how to build up a PETSc matrix object quick and fast.
>
> The data I have is Block Compressed Sparse Row(BCSR) files. And my
> objective is to read BCSR files and generate PETSc Binaries
>
> Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on
> each processor, and set up the preallocation use
> MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL);
> The reason why I choose 25 as the number for d_nz and o_nz is that I
> do not have access to the ordering of the vertices. So it's the
> worst case set up, and it takes about 7 minutes on 30 MIPS node(180
> processors) to write the output into PETSc binaries.
>
> Secondly, I choose the MatMPIBAIJ, and same procedure as above, but
> also set up
> MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL),
> here blksize = 25 and it's also the worst case; This experiments takes
> forever and could not generate the PETSc binaries.
>
> I guess the reason why it takes so long in the MATMPIBAIJ case is
> that I did not set up the preallocation accurately. Alougth I think
> the preallocation is also not accurate in the MATMPIAIJ case, but it
> seems like the preallocation effect is not as serious as for the
> MPIBAIJ. Correct me please, if there are other reasons.
>
> Can I anyone please give a hint on how to set up the preallocation
> right for a unstructured mesh without knowing the mesh ordering?
>
> Thank you very much in advance,
>
> Yan

From sperif at gmail.com Sun Sep 20 10:44:00 2009
From: sperif at gmail.com (Pierre-Yves Aquilanti)
Date: Sun, 20 Sep 2009 17:44:00 +0200
Subject: extremly slow MatSetValues
Message-ID: <4C6F9D8F-7A0E-4DC7-A9D0-957D108A9F11@gmail.com>

Hello,

I'm facing a very slow MatSetValues during the discretization of my
mathematical problem when creating the matrix.
My matrix is seven-banded and comes from a discretization of a Helmholtz
problem using finite differences.

In the 2D case, creating the matrix is really fast. But in the 3D case it
goes well for the first seconds of loop processing and then it gets really,
really slow.
Here is a small part of my code:

ierr = DAGetCorners(*da, &Istartx, &Istarty, &Istartz, &Iendx, &Iendy, &Iendz);
CHKERRQ(ierr);

for (j = Istartz; j < Istartz + Iendz; j++) {
  for (k = Istarty; k < Istarty + Iendy; k++) {
    for (i = Istartx; i < Istartx + Iendx; i++) {
      /* get values into an array, and their coordinates too */
      MatSetValues(*B, 1, ind_mat_x, value_count, ind_mat_y, val_mat_ligne, ADD_VALUES);
    }
  }
}

Did you encounter any slowness with the MatSetValues function?

Thanks a lot

Regards

PYA

From vyan2000 at gmail.com Sun Sep 20 10:48:32 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 20 Sep 2009 11:48:32 -0400
Subject: BAIJ or AIJ
In-Reply-To: <7513EF7F-6D89-4AD2-8F51-1765785A1766@mcs.anl.gov>
References: <7513EF7F-6D89-4AD2-8F51-1765785A1766@mcs.anl.gov>
Message-ID: 

Hi Barry,
Thank you very much for the suggestion, I will try it.

Yan

On Sun, Sep 20, 2009 at 11:36 AM, Barry Smith wrote:
>
> Yan,
>
> Simply read through the ASCII file(s) twice. The first time count the
> number of blocks per row, then preallocate then read through the ASCII file
> again reading and setting the values. This will be very fast.
>
> Barry
>
>
> On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote:
>
> Hi All,
>> I have a large size application. Mesh size is 30000 nodes, with dof 25 on
>> each vertex. And it's 3d unstructured, Tet, and Hex mesh. In the following
>> I will denote blksize=25
>>
>> I am testing how to build up a PETSc matrix object quick and fast.
>>
>> The data I have is Block Compressed Sparse Row(BCSR) files. And my
>> objective is to read BCSR files and generate PETSc Binaries
>>
>> Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on each
>> processor, and set up the preallocation use
>> MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); The
>> reason why I choose 25 as the number for d_nz and o_nz is that I do not have
>> access to the ordering of the vertices. So it's the worst case set up, and
>> it takes about 7 minutes on 30 MIPS node(180 processors) to write the output
>> into PETSc binaries.
>>
>> Secondly, I choose the MatMPIBAIJ, and same procedure as above, but also
>> set up
>> MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL),
>> here blksize = 25 and it's also the worst case; This experiments takes
>> forever and could not generate the PETSc binaries.
>>
>> I guess the reason why it takes so long in the MATMPIBAIJ case is that I
>> did not set up the preallocation accurately. Alougth I think the
>> preallocation is also not accurate in the MATMPIAIJ case, but it seems like
>> the preallocation effect is not as serious as for the MPIBAIJ. Correct me
>> please, if there are other reasons.
>>
>> Can I anyone please give a hint on how to set up the preallocation right
>> for a unstructured mesh without knowing the mesh ordering?
>>
>> Thank you very much in advance,
>>
>> Yan
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
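Returning to Pierre-Yves's MatSetValues question: the slowdown he describes is the classic symptom of missing preallocation — every entry that lands outside the allocated nonzero pattern forces the matrix storage to be reallocated and copied. For a DA-managed grid like his, one fix is to let the DA hand back a matrix already preallocated for the stencil. A minimal sketch, assuming the petsc-3.0-era DA interface; the function name and grid sizes are illustrative only:

#include "petscda.h"

/* Sketch: let a 3D DA build the preallocated matrix for a 7-point stencil. */
PetscErrorCode CreateStencilMatrix(PetscInt nx, PetscInt ny, PetscInt nz,
                                   DA *da, Mat *B)
{
  PetscErrorCode ierr;

  /* star stencil of width 1, one unknown per grid point */
  ierr = DACreate3d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_STAR,
                    nx, ny, nz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                    1, 1, PETSC_NULL, PETSC_NULL, PETSC_NULL, da); CHKERRQ(ierr);

  /* the matrix comes back with the stencil's nonzero pattern preallocated,
     so setting values inside the stencil never triggers a malloc */
  ierr = DAGetMatrix(*da, MATMPIAIJ, B); CHKERRQ(ierr);
  return 0;
}

Values can then be written with MatSetValues() (or MatSetValuesStencil()) into stencil positions without ever growing the pattern; Matt's reply below points to the FAQ entry on the same issue.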
URL: From knepley at gmail.com Sun Sep 20 12:49:53 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 20 Sep 2009 12:49:53 -0500 Subject: extremly slow MatSetValues In-Reply-To: <4C6F9D8F-7A0E-4DC7-A9D0-957D108A9F11@gmail.com> References: <4C6F9D8F-7A0E-4DC7-A9D0-957D108A9F11@gmail.com> Message-ID: Your matrix is not correctly preallocated: http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#slow Matt On Sun, Sep 20, 2009 at 10:44 AM, Pierre-Yves Aquilanti wrote: > Hello, > > i'm facing a very slow MatSetValues during the discretization of my > mathematical problem when creating the matrix. > My matrix is seven banded, come from the a discretization of an helmholtz > problem using finite differences. > > In the 2D case, creating the matrix is really fast. But in 3D case it goes > well for the first second loop processing and then it goes really really > slow. Here is a small part of my code : > > > ierr = DAGetCorners(*da, &Istartx, &Istarty, &Istartz, &Iendx, &Iendy, > &Iendz); > CHKERRQ(ierr); > > for (j = Istartz; j < Istartz + Iendz; j++) { > > for(k=Istarty; k < Istarty+Iendy; k++){ > for (i = Istartx; i < Istartx + Iendx; i++) { > > > /*get values on an array, their coordinate too*/ > > > MatSetValues(*B,1,ind_mat_x,value_count,ind_mat_y,val_mat_ligne,ADD_VALUES); > } > } > > Did you encounter any slowness for MatSetValues function ? > > Thanks a lot > > Regards > > > PYA > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Sep 21 12:44:12 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 13:44:12 -0400 Subject: ell1 norm convergence test Message-ID: Hi All, The following question is only for test and comparison reason. Is there a command line option to set up using 1-Norm(sum of absolute value of residuals) as convergence monitor and test. Alternatively, if I can get the residual "r" out at the end of each iteration, then it is going to be much simpler, since I can just call VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor and test. Can anyone inform me if PETSc has an interface for the resuidual r? Or one has to call the convergence test: MyKSPConverged(ksp,n,rnorm,flag,dummy); and following the steps: 1. fetch the exact solution: KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) 2. calculate the residual: pass the Right Hand side and the Matrix in to calculate the 1-Norm. But this step is *not* obvious, since the matrix and rhs is already distributed over each processes. 3. set up the convergence test based on the 1_Norm residual. Thank you very much, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 21 13:40:29 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 21 Sep 2009 13:40:29 -0500 Subject: ell1 norm convergence test In-Reply-To: References: Message-ID: You can use KSPBuildResidual(). Matt On Mon, Sep 21, 2009 at 12:44 PM, Ryan Yan wrote: > Hi All, > The following question is only for test and comparison reason. > > Is there a command line option to set up using 1-Norm(sum of absolute value > of residuals) as convergence monitor and test. 
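(No stock command-line switch computes a 1-norm here, but both a monitor and a convergence test can be supplied as callbacks. A sketch of the monitor, assuming the petsc-3.0-era KSP interface; MyMonitor1Norm is a made-up name, and the true residual is rebuilt from the current solution, which costs an extra MatMult() per iteration:)

#include "petscksp.h"

/* Hypothetical monitor: prints the 1-norm of the true residual b - A x. */
PetscErrorCode MyMonitor1Norm(KSP ksp, PetscInt it, PetscReal rnorm2, void *ctx)
{
  Vec            x, b, r;
  Mat            A;
  MPI_Comm       comm;
  PetscReal      r1;
  PetscErrorCode ierr;

  ierr = KSPGetRhs(ksp, &b); CHKERRQ(ierr);
  ierr = KSPGetOperators(ksp, &A, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);
  ierr = KSPBuildSolution(ksp, PETSC_NULL, &x); CHKERRQ(ierr);
  ierr = VecDuplicate(b, &r); CHKERRQ(ierr);
  ierr = MatMult(A, x, r); CHKERRQ(ierr);      /* r = A x     */
  ierr = VecAYPX(r, -1.0, b); CHKERRQ(ierr);   /* r = b - A x */
  ierr = VecNorm(r, NORM_1, &r1); CHKERRQ(ierr);
  ierr = PetscObjectGetComm((PetscObject)ksp, &comm); CHKERRQ(ierr);
  ierr = PetscPrintf(comm, "%D KSP true residual 1-norm %G (2-norm %G)\n",
                     it, r1, rnorm2); CHKERRQ(ierr);
  ierr = VecDestroy(r); CHKERRQ(ierr);
  return 0;
}

(It attaches with KSPMonitorSet(ksp, MyMonitor1Norm, PETSC_NULL, PETSC_NULL); a stopping test built on the same residual construction goes through KSPSetConvergenceTest() — check the man page for the exact calling sequence in your release. The replies that follow discuss what this costs for methods like GMRES that never form the true residual.)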
> > Alternatively, if I can get the residual "r" out at the end of each > iteration, then it is going to be much simpler, since I can just call > VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor > and test. Can anyone inform me if PETSc has an interface for the resuidual > r? > > Or one has to call the convergence test: > MyKSPConverged(ksp,n,rnorm,flag,dummy); > and following the steps: > 1. fetch the exact solution: > > KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) > > 2. calculate the residual: > > pass the Right Hand side and the Matrix in to calculate the 1-Norm. But > this step is *not* obvious, since the matrix and rhs is already distributed > over each processes. > > 3. set up the convergence test based on the 1_Norm residual. > > > Thank you very much, > > > Yan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 21 13:43:53 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 21 Sep 2009 13:43:53 -0500 Subject: ell1 norm convergence test In-Reply-To: References: Message-ID: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Yan, This depends on the Krylov method being used. For example with GMRES the residual vector is NOT available at each iteration (the 2- norm of the residual is approximated via a recurrence relationship). You can call KSPBuildResidual() to have the true residual computed for you. Note: it is expensive because it actually builds the current solution and computes r = b - A*x What we intended is that if you want the residual efficiently for example for the CG method, you determine what Krylov method you want to use the include the appropriate private include file and access the residual directly from the data structure. This would be efficient. (but like I said does not work for all methods). For cg include src/ ksp/ksp/impls/cg/cgctx.h Barry On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: > Hi All, > The following question is only for test and comparison reason. > > Is there a command line option to set up using 1-Norm(sum of > absolute value of residuals) as convergence monitor and test. > > Alternatively, if I can get the residual "r" out at the end of each > iteration, then it is going to be much simpler, since I can just > call VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the > convergence monitor and test. Can anyone inform me if PETSc has an > interface for the resuidual r? > > Or one has to call the convergence test: > MyKSPConverged(ksp,n,rnorm,flag,dummy); > and following the steps: > 1. fetch the exact solution: > > KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) > > 2. calculate the residual: > > pass the Right Hand side and the Matrix in to calculate the 1-Norm. > But this step is *not* obvious, since the matrix and rhs is already > distributed over each processes. > > 3. set up the convergence test based on the 1_Norm residual. > > > Thank you very much, > > > Yan From vyan2000 at gmail.com Mon Sep 21 13:59:07 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 14:59:07 -0400 Subject: ell1 norm convergence test In-Reply-To: References: Message-ID: Hi Matt, Thanks for the pointer. Yan On Mon, Sep 21, 2009 at 2:40 PM, Matthew Knepley wrote: > You can use KSPBuildResidual(). 
> > Matt > > > On Mon, Sep 21, 2009 at 12:44 PM, Ryan Yan wrote: > >> Hi All, >> The following question is only for test and comparison reason. >> >> Is there a command line option to set up using 1-Norm(sum of absolute >> value of residuals) as convergence monitor and test. >> >> Alternatively, if I can get the residual "r" out at the end of each >> iteration, then it is going to be much simpler, since I can just call >> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >> and test. Can anyone inform me if PETSc has an interface for the resuidual >> r? >> >> Or one has to call the convergence test: >> MyKSPConverged(ksp,n,rnorm,flag,dummy); >> and following the steps: >> 1. fetch the exact solution: >> >> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >> >> 2. calculate the residual: >> >> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >> this step is *not* obvious, since the matrix and rhs is already distributed >> over each processes. >> >> 3. set up the convergence test based on the 1_Norm residual. >> >> >> Thank you very much, >> >> >> Yan >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Sep 21 14:04:18 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 15:04:18 -0400 Subject: ell1 norm convergence test In-Reply-To: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: Hi Barry, Thanks for the suggestion. I am using FGMRES as the krylov solver. Is there a direct access to the residual vector for FGMRES. I guess the answer is yes, since that's the Krylov method which supports right preconditioning and provides the true residual 2_Norm monitor, right? If so, which header should I include to fetch the residual directly from KSP_FGMRES? Yan On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: > > Yan, > > This depends on the Krylov method being used. For example with GMRES the > residual vector is NOT available at each iteration (the 2-norm of the > residual is approximated via a recurrence relationship). > > You can call KSPBuildResidual() to have the true residual computed for > you. Note: it is expensive because it actually builds the current solution > and computes r = b - A*x > > What we intended is that if you want the residual efficiently for example > for the CG method, you determine what Krylov method you want to use the > include the appropriate private include file and access the residual > directly from the data structure. This would be efficient. (but like I said > does not work for all methods). For cg include src/ksp/ksp/impls/cg/cgctx.h > > > Barry > > > On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: > > Hi All, >> The following question is only for test and comparison reason. >> >> Is there a command line option to set up using 1-Norm(sum of absolute >> value of residuals) as convergence monitor and test. >> >> Alternatively, if I can get the residual "r" out at the end of each >> iteration, then it is going to be much simpler, since I can just call >> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >> and test. Can anyone inform me if PETSc has an interface for the resuidual >> r? 
>> >> Or one has to call the convergence test: >> MyKSPConverged(ksp,n,rnorm,flag,dummy); >> and following the steps: >> 1. fetch the exact solution: >> >> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >> >> 2. calculate the residual: >> >> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >> this step is *not* obvious, since the matrix and rhs is already distributed >> over each processes. >> >> 3. set up the convergence test based on the 1_Norm residual. >> >> >> Thank you very much, >> >> >> Yan >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Sep 21 14:12:50 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 15:12:50 -0400 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: Sorry, FGMRES does not need to compute residual expilictly. Please forget about what I just asked. Yan On Mon, Sep 21, 2009 at 3:04 PM, Ryan Yan wrote: > Hi Barry, > Thanks for the suggestion. > > I am using FGMRES as the krylov solver. Is there a direct access to the > residual vector for FGMRES. I guess the answer is yes, since that's the > Krylov method which supports right preconditioning and provides the true > residual 2_Norm monitor, right? > > If so, which header should I include to fetch the residual directly from > KSP_FGMRES? > > Yan > > > > > On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: > >> >> Yan, >> >> This depends on the Krylov method being used. For example with GMRES >> the residual vector is NOT available at each iteration (the 2-norm of the >> residual is approximated via a recurrence relationship). >> >> You can call KSPBuildResidual() to have the true residual computed for >> you. Note: it is expensive because it actually builds the current solution >> and computes r = b - A*x >> >> What we intended is that if you want the residual efficiently for >> example for the CG method, you determine what Krylov method you want to use >> the include the appropriate private include file and access the residual >> directly from the data structure. This would be efficient. (but like I said >> does not work for all methods). For cg include src/ksp/ksp/impls/cg/cgctx.h >> >> >> Barry >> >> >> On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: >> >> Hi All, >>> The following question is only for test and comparison reason. >>> >>> Is there a command line option to set up using 1-Norm(sum of absolute >>> value of residuals) as convergence monitor and test. >>> >>> Alternatively, if I can get the residual "r" out at the end of each >>> iteration, then it is going to be much simpler, since I can just call >>> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >>> and test. Can anyone inform me if PETSc has an interface for the resuidual >>> r? >>> >>> Or one has to call the convergence test: >>> MyKSPConverged(ksp,n,rnorm,flag,dummy); >>> and following the steps: >>> 1. fetch the exact solution: >>> >>> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >>> >>> 2. calculate the residual: >>> >>> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >>> this step is *not* obvious, since the matrix and rhs is already distributed >>> over each processes. >>> >>> 3. set up the convergence test based on the 1_Norm residual. >>> >>> >>> Thank you very much, >>> >>> >>> Yan >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Mon Sep 21 14:29:54 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 21 Sep 2009 14:29:54 -0500 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: Likely it is too expensive to use the 1-norm for stopping criteria for fgmres unless you can reformulate fgmres to give you back the residual efficiently. Barry On Sep 21, 2009, at 2:04 PM, Ryan Yan wrote: > Hi Barry, > Thanks for the suggestion. > > I am using FGMRES as the krylov solver. Is there a direct access to > the residual vector for FGMRES. I guess the answer is yes, since > that's the Krylov method which supports right preconditioning and > provides the true residual 2_Norm monitor, right? > > If so, which header should I include to fetch the residual directly > from KSP_FGMRES? > > Yan > > > > On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith > wrote: > > Yan, > > This depends on the Krylov method being used. For example with > GMRES the residual vector is NOT available at each iteration (the 2- > norm of the residual is approximated via a recurrence relationship). > > You can call KSPBuildResidual() to have the true residual > computed for you. Note: it is expensive because it actually builds > the current solution and computes r = b - A*x > > What we intended is that if you want the residual efficiently for > example for the CG method, you determine what Krylov method you want > to use the include the appropriate private include file and access > the residual directly from the data structure. This would be > efficient. (but like I said does not work for all methods). For cg > include src/ksp/ksp/impls/cg/cgctx.h > > > Barry > > > On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: > > Hi All, > The following question is only for test and comparison reason. > > Is there a command line option to set up using 1-Norm(sum of > absolute value of residuals) as convergence monitor and test. > > Alternatively, if I can get the residual "r" out at the end of each > iteration, then it is going to be much simpler, since I can just > call VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the > convergence monitor and test. Can anyone inform me if PETSc has an > interface for the resuidual r? > > Or one has to call the convergence test: > MyKSPConverged(ksp,n,rnorm,flag,dummy); > and following the steps: > 1. fetch the exact solution: > > KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) > > 2. calculate the residual: > > pass the Right Hand side and the Matrix in to calculate the 1-Norm. > But this step is *not* obvious, since the matrix and rhs is already > distributed over each processes. > > 3. set up the convergence test based on the 1_Norm residual. > > > Thank you very much, > > > Yan > > From knepley at gmail.com Mon Sep 21 14:33:37 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 21 Sep 2009 14:33:37 -0500 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: On Mon, Sep 21, 2009 at 2:29 PM, Barry Smith wrote: > > Likely it is too expensive to use the 1-norm for stopping criteria for > fgmres unless you can reformulate fgmres to give you back the residual > efficiently. > I have been told that QMR has a natural 1-norm formulation, but I have never seen it. Matt > Barry > > On Sep 21, 2009, at 2:04 PM, Ryan Yan wrote: > > Hi Barry, >> Thanks for the suggestion. >> >> I am using FGMRES as the krylov solver. 
Is there a direct access to the >> residual vector for FGMRES. I guess the answer is yes, since that's the >> Krylov method which supports right preconditioning and provides the true >> residual 2_Norm monitor, right? >> >> If so, which header should I include to fetch the residual directly from >> KSP_FGMRES? >> >> Yan >> >> >> >> On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: >> >> Yan, >> >> This depends on the Krylov method being used. For example with GMRES the >> residual vector is NOT available at each iteration (the 2-norm of the >> residual is approximated via a recurrence relationship). >> >> You can call KSPBuildResidual() to have the true residual computed for >> you. Note: it is expensive because it actually builds the current solution >> and computes r = b - A*x >> >> What we intended is that if you want the residual efficiently for example >> for the CG method, you determine what Krylov method you want to use the >> include the appropriate private include file and access the residual >> directly from the data structure. This would be efficient. (but like I said >> does not work for all methods). For cg include src/ksp/ksp/impls/cg/cgctx.h >> >> >> Barry >> >> >> On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: >> >> Hi All, >> The following question is only for test and comparison reason. >> >> Is there a command line option to set up using 1-Norm(sum of absolute >> value of residuals) as convergence monitor and test. >> >> Alternatively, if I can get the residual "r" out at the end of each >> iteration, then it is going to be much simpler, since I can just call >> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >> and test. Can anyone inform me if PETSc has an interface for the resuidual >> r? >> >> Or one has to call the convergence test: >> MyKSPConverged(ksp,n,rnorm,flag,dummy); >> and following the steps: >> 1. fetch the exact solution: >> >> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >> >> 2. calculate the residual: >> >> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >> this step is *not* obvious, since the matrix and rhs is already distributed >> over each processes. >> >> 3. set up the convergence test based on the 1_Norm residual. >> >> >> Thank you very much, >> >> >> Yan >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Sep 21 15:00:56 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 16:00:56 -0400 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: I agree that reformulate 1-norm is going to be a pain. Actually, I just want to see the history of true residual 1_norm. If using the 2_norm as stop criteria and I still can see the history of 1-norm, than I am happy. But I guess there has to be a price for visiting the 1_norm. I will try to use KSPBuildResidual() to make a visit. The reason of checking L_1 norm is that our customer has a existing linear solver which uses the true residual 1_norm as the stopping criteria, and the 1_Norm or 2_Norm has a great difference. Imagine one has a huge vector with all small values but various magnitude. The 1_norm of that vector is going to be much larger than the 2_norm. That's what exactly happens in my case. 
Thank you very much for the suggestion, Yan On Mon, Sep 21, 2009 at 3:29 PM, Barry Smith wrote: > > Likely it is too expensive to use the 1-norm for stopping criteria for > fgmres unless you can reformulate fgmres to give you back the residual > efficiently. > > Barry > > > On Sep 21, 2009, at 2:04 PM, Ryan Yan wrote: > > Hi Barry, >> Thanks for the suggestion. >> >> I am using FGMRES as the krylov solver. Is there a direct access to the >> residual vector for FGMRES. I guess the answer is yes, since that's the >> Krylov method which supports right preconditioning and provides the true >> residual 2_Norm monitor, right? >> >> If so, which header should I include to fetch the residual directly from >> KSP_FGMRES? >> >> Yan >> >> >> >> On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: >> >> Yan, >> >> This depends on the Krylov method being used. For example with GMRES the >> residual vector is NOT available at each iteration (the 2-norm of the >> residual is approximated via a recurrence relationship). >> >> You can call KSPBuildResidual() to have the true residual computed for >> you. Note: it is expensive because it actually builds the current solution >> and computes r = b - A*x >> >> What we intended is that if you want the residual efficiently for example >> for the CG method, you determine what Krylov method you want to use the >> include the appropriate private include file and access the residual >> directly from the data structure. This would be efficient. (but like I said >> does not work for all methods). For cg include src/ksp/ksp/impls/cg/cgctx.h >> >> >> Barry >> >> >> On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: >> >> Hi All, >> The following question is only for test and comparison reason. >> >> Is there a command line option to set up using 1-Norm(sum of absolute >> value of residuals) as convergence monitor and test. >> >> Alternatively, if I can get the residual "r" out at the end of each >> iteration, then it is going to be much simpler, since I can just call >> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >> and test. Can anyone inform me if PETSc has an interface for the resuidual >> r? >> >> Or one has to call the convergence test: >> MyKSPConverged(ksp,n,rnorm,flag,dummy); >> and following the steps: >> 1. fetch the exact solution: >> >> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >> >> 2. calculate the residual: >> >> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >> this step is *not* obvious, since the matrix and rhs is already distributed >> over each processes. >> >> 3. set up the convergence test based on the 1_Norm residual. >> >> >> Thank you very much, >> >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Sep 21 15:01:51 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 16:01:51 -0400 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID: Matt, thanks for letting me be aware of this. Yan On Mon, Sep 21, 2009 at 3:33 PM, Matthew Knepley wrote: > On Mon, Sep 21, 2009 at 2:29 PM, Barry Smith wrote: > >> >> Likely it is too expensive to use the 1-norm for stopping criteria for >> fgmres unless you can reformulate fgmres to give you back the residual >> efficiently. >> > > I have been told that QMR has a natural 1-norm formulation, but I have > never seen it. 
> > Matt > > >> Barry >> >> On Sep 21, 2009, at 2:04 PM, Ryan Yan wrote: >> >> Hi Barry, >>> Thanks for the suggestion. >>> >>> I am using FGMRES as the krylov solver. Is there a direct access to the >>> residual vector for FGMRES. I guess the answer is yes, since that's the >>> Krylov method which supports right preconditioning and provides the true >>> residual 2_Norm monitor, right? >>> >>> If so, which header should I include to fetch the residual directly from >>> KSP_FGMRES? >>> >>> Yan >>> >>> >>> >>> On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: >>> >>> Yan, >>> >>> This depends on the Krylov method being used. For example with GMRES >>> the residual vector is NOT available at each iteration (the 2-norm of the >>> residual is approximated via a recurrence relationship). >>> >>> You can call KSPBuildResidual() to have the true residual computed for >>> you. Note: it is expensive because it actually builds the current solution >>> and computes r = b - A*x >>> >>> What we intended is that if you want the residual efficiently for >>> example for the CG method, you determine what Krylov method you want to use >>> the include the appropriate private include file and access the residual >>> directly from the data structure. This would be efficient. (but like I said >>> does not work for all methods). For cg include src/ksp/ksp/impls/cg/cgctx.h >>> >>> >>> Barry >>> >>> >>> On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: >>> >>> Hi All, >>> The following question is only for test and comparison reason. >>> >>> Is there a command line option to set up using 1-Norm(sum of absolute >>> value of residuals) as convergence monitor and test. >>> >>> Alternatively, if I can get the residual "r" out at the end of each >>> iteration, then it is going to be much simpler, since I can just call >>> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >>> and test. Can anyone inform me if PETSc has an interface for the resuidual >>> r? >>> >>> Or one has to call the convergence test: >>> MyKSPConverged(ksp,n,rnorm,flag,dummy); >>> and following the steps: >>> 1. fetch the exact solution: >>> >>> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >>> >>> 2. calculate the residual: >>> >>> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >>> this step is *not* obvious, since the matrix and rhs is already distributed >>> over each processes. >>> >>> 3. set up the convergence test based on the 1_Norm residual. >>> >>> >>> Thank you very much, >>> >>> >>> Yan >>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredrik.bengzon at math.umu.se Mon Sep 21 15:24:08 2009 From: fredrik.bengzon at math.umu.se (Fredrik Bengzon) Date: Mon, 21 Sep 2009 22:24:08 +0200 Subject: sparsity pattern setup Message-ID: <4AB7E0E8.3000203@math.umu.se> Hi, This is probably the wrong forum to ask, but does anyone have a piece of code for computing the correct ndnz and nodnz vectors needed for assembly of the stiffness (MPIAIJ) matrix on an unstructured tetrahedral mesh given the node-to-element adjacency? 
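To be concrete, the shape of the routine I'm after is roughly the following, my own rough, untested sketch: adj_ptr/adj_ele (node-to-element adjacency in CSR form), ele (the 4 node numbers of each tet), maxneigh, CountNnz and AddUnique are all placeholder names for my mesh data and helpers, and the counts would then feed MatMPIAIJSetPreallocation():

#include "petscmat.h"

/* append v to list if not already present (quadratic, but per-row only) */
static void AddUnique(PetscInt *list, PetscInt *n, PetscInt v)
{
  PetscInt j;
  for (j = 0; j < *n; j++) if (list[j] == v) return;
  list[(*n)++] = v;
}

/* count diagonal/off-diagonal couplings of each locally owned node (row)
   in [rstart,rend); de-duplication avoids counting a coupling once per tet */
PetscErrorCode CountNnz(PetscInt rstart, PetscInt rend,
                        const PetscInt *adj_ptr, const PetscInt *adj_ele,
                        const PetscInt *ele, PetscInt maxneigh,
                        PetscInt *d_nnz, PetscInt *o_nnz)
{
  PetscInt       i, p, k, q, nneigh, *neigh;
  PetscErrorCode ierr;

  ierr = PetscMalloc(maxneigh*sizeof(PetscInt), &neigh); CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    nneigh = 0;
    for (p = adj_ptr[i]; p < adj_ptr[i+1]; p++)           /* tets around node i */
      for (k = 0; k < 4; k++)
        AddUnique(neigh, &nneigh, ele[4*adj_ele[p]+k]);   /* includes i itself */
    d_nnz[i-rstart] = o_nnz[i-rstart] = 0;
    for (q = 0; q < nneigh; q++) {
      if (neigh[q] >= rstart && neigh[q] < rend) d_nnz[i-rstart]++;
      else                                       o_nnz[i-rstart]++;
    }
  }
  ierr = PetscFree(neigh); CHKERRQ(ierr);
  return 0;
}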
Thanks, Fredrik
From vyan2000 at gmail.com Mon Sep 21 16:35:30 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 21 Sep 2009 17:35:30 -0400 Subject: sparsity pattern setup In-Reply-To: <4AB7E0E8.3000203@math.umu.se> References: <4AB7E0E8.3000203@math.umu.se> Message-ID:
Hi Fredrik, If I understand correctly, I have the same issue you have here. I do not have the code yet (it also depends on how you store your matrix data). But I can forward Barry's idea to you. Hope this is helpful to you. Yan, Simply read through the ASCII file(s) twice. The first time count the number of blocks per row, then preallocate, then read through the ASCII file again reading and setting the values. This will be very fast. Barry
On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: Hi All, > I have a large size application. Mesh size is 30000 nodes, with dof 25 on > each vertex. And it's 3d unstructured, Tet, and Hex mesh. In the following > I will denote blksize=25 > > I am testing how to build up a PETSc matrix object quick and fast. > > The data I have is Block Compressed Sparse Row(BCSR) files. And my > objective is to read BCSR files and generate PETSc Binaries > > Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on each > processor, and set up the preallocation use > MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); The > reason why I choose 25 as the number for d_nz and o_nz is that I do not have > access to the ordering of the vertices. So it's the worst case set up, and > it takes about 7 minutes on 30 MIPS node(180 processors) to write the output > into PETSc binaries. > > Secondly, I choose the MatMPIBAIJ, and same procedure as above, but also > set up > MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), > here blksize = 25 and it's also the worst case; This experiments takes > forever and could not generate the PETSc binaries. > > I guess the reason why it takes so long in the MATMPIBAIJ case is that I > did not set up the preallocation accurately. Alougth I think the > preallocation is also not accurate in the MATMPIAIJ case, but it seems like > the preallocation effect is not as serious as for the MPIBAIJ. Correct me > please, if there are other reasons. > > Can I anyone please give a hint on how to set up the preallocation right > for a unstructured mesh without knowing the mesh ordering? > > Thank you very much in advance, > > Yan > On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon < fredrik.bengzon at math.umu.se> wrote: > Hi, > This is probably the wrong forum to ask, but does anyone have a piece of > code for computing the correct ndnz and nodnz vectors needed for assembly of > the stiffness (MPIAIJ) matrix on an unstructured tetrahedral mesh given the > node-to-element adjacency? > Thanks, > Fredrik > >
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From fredrik.bengzon at math.umu.se Mon Sep 21 16:53:18 2009 From: fredrik.bengzon at math.umu.se (Fredrik Bengzon) Date: Mon, 21 Sep 2009 23:53:18 +0200 Subject: sparsity pattern setup In-Reply-To: References: <4AB7E0E8.3000203@math.umu.se> Message-ID: <4AB7F5CE.1060607@math.umu.se>
Hi, Ryan, I'm aware of Barry's post https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003020.html and it works fine for triangle meshes. However, I do not see how this can be used for tetrahedral meshes. /Fredrik
Ryan Yan wrote: > Hi Fredrik, > If I understand correctly, I have the same issue as what you have here.
> > I do not have the code yet(It is also depends on how your store your > matrix data.). But I can forward Barry's idea to you. Hope this is > helpful to you. > > Yan, > > Simply read through the ASCII file(s) twice. The first time count > the number of blocks per row, then preallocate then read through the > ASCII file again reading and setting the values. This will be very fast. > > Barry > - Hide quoted text - > > > On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: > > Hi All, > I have a large size application. Mesh size is 30000 nodes, with > dof 25 on each vertex. And it's 3d unstructured, Tet, and Hex > mesh. In the following I will denote blksize=25 > > I am testing how to build up a PETSc matrix object quick and fast. > > The data I have is Block Compressed Sparse Row(BCSR) files. And my > objective is to read BCSR files and generate PETSc Binaries > > Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on > each processor, and set up the preallocation use > MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); > The reason why I choose 25 as the number for d_nz and o_nz is that > I do not have access to the ordering of the vertices. So it's the > worst case set up, and it takes about 7 minutes on 30 MIPS > node(180 processors) to write the output into PETSc binaries. > > Secondly, I choose the MatMPIBAIJ, and same procedure as above, > but also set up > MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), > here blksize = 25 and it's also the worst case; This experiments > takes forever and could not generate the PETSc binaries. > > I guess the reason why it takes so long in the MATMPIBAIJ case is > that I did not set up the preallocation accurately. Alougth I > think the preallocation is also not accurate in the MATMPIAIJ > case, but it seems like the preallocation effect is not as serious > as for the MPIBAIJ. Correct me please, if there are other reasons. > > Can I anyone please give a hint on how to set up the preallocation > right for a unstructured mesh without knowing the mesh ordering? > > Thank you very much in advance, > > Yan > > > > > > On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon > > wrote: > > Hi, > This is probably the wrong forum to ask, but does anyone have a > piece of code for computing the correct ndnz and nodnz vectors > needed for assembly of the stiffness (MPIAIJ) matrix on an > unstructured tetrahedral mesh given the node-to-element adjacency? > Thanks, > Fredrik > > From knepley at gmail.com Mon Sep 21 17:08:47 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 21 Sep 2009 17:08:47 -0500 Subject: sparsity pattern setup In-Reply-To: <4AB7F5CE.1060607@math.umu.se> References: <4AB7E0E8.3000203@math.umu.se> <4AB7F5CE.1060607@math.umu.se> Message-ID: On Mon, Sep 21, 2009 at 4:53 PM, Fredrik Bengzon < fredrik.bengzon at math.umu.se> wrote: > Hi, > Ryan, I'm aware of Barry's post > > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003020.html > > and it workes fine for triangle meshes. However, I do not see how this can > be used for tetrahedral meshes. > It is the same for tetrahedra. In fact, this algorithm can be generalized to work for any topology: http://arxiv.org/abs/0908.4427 Matt > /Fredrik > > > Ryan Yan wrote: > >> Hi Fredrik, >> If I understand correctly, I have the same issue as what you have here. >> >> I do not have the code yet(It is also depends on how your store your >> matrix data.). But I can forward Barry's idea to you. Hope this is helpful >> to you. 
>> >> Yan, >> >> Simply read through the ASCII file(s) twice. The first time count the >> number of blocks per row, then preallocate then read through the ASCII file >> again reading and setting the values. This will be very fast. >> >> Barry >> - Hide quoted text - >> >> >> On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: >> >> Hi All, >> I have a large size application. Mesh size is 30000 nodes, with >> dof 25 on each vertex. And it's 3d unstructured, Tet, and Hex >> mesh. In the following I will denote blksize=25 >> >> I am testing how to build up a PETSc matrix object quick and fast. >> >> The data I have is Block Compressed Sparse Row(BCSR) files. And my >> objective is to read BCSR files and generate PETSc Binaries >> >> Firstly, I choose the MATMPIAIJ, I opened the BCSR data files on >> each processor, and set up the preallocation use >> MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); >> The reason why I choose 25 as the number for d_nz and o_nz is that >> I do not have access to the ordering of the vertices. So it's the >> worst case set up, and it takes about 7 minutes on 30 MIPS >> node(180 processors) to write the output into PETSc binaries. >> >> Secondly, I choose the MatMPIBAIJ, and same procedure as above, >> but also set up >> >> MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), >> here blksize = 25 and it's also the worst case; This experiments >> takes forever and could not generate the PETSc binaries. >> >> I guess the reason why it takes so long in the MATMPIBAIJ case is >> that I did not set up the preallocation accurately. Alougth I >> think the preallocation is also not accurate in the MATMPIAIJ >> case, but it seems like the preallocation effect is not as serious >> as for the MPIBAIJ. Correct me please, if there are other reasons. >> >> Can I anyone please give a hint on how to set up the preallocation >> right for a unstructured mesh without knowing the mesh ordering? >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> >> On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon < >> fredrik.bengzon at math.umu.se > wrote: >> >> Hi, >> This is probably the wrong forum to ask, but does anyone have a >> piece of code for computing the correct ndnz and nodnz vectors >> needed for assembly of the stiffness (MPIAIJ) matrix on an >> unstructured tetrahedral mesh given the node-to-element adjacency? >> Thanks, >> Fredrik >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredrik.bengzon at math.umu.se Mon Sep 21 17:31:40 2009 From: fredrik.bengzon at math.umu.se (Fredrik Bengzon) Date: Tue, 22 Sep 2009 00:31:40 +0200 Subject: sparsity pattern setup In-Reply-To: References: <4AB7E0E8.3000203@math.umu.se> <4AB7F5CE.1060607@math.umu.se> Message-ID: <4AB7FECC.3060302@math.umu.se> Matt, Clearly, I must be missing something. If I loop over the elements and record all edges between the nodes in a face by adding to the vectors "on" and "off" as the algorithm says, won't an edge be counted more than two times in general, thus giving a too big value in "on" and "off"? 
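(The naive element loop I mean is the sketch below, where nele, ele, rstart/rend and the on/off arrays are placeholders for my mesh data; a coupling shared by m tets gets recorded m times:)

/* sketch of the naive counting: every tet adds all of its 4x4 node pairs,
   so a pair shared by m tets is counted m times, inflating on/off */
PetscInt e, k, l, i, j;  /* nele, ele[], rstart, rend, on[], off[] assumed given */
for (e = 0; e < nele; e++)
  for (k = 0; k < 4; k++) {
    i = ele[4*e+k];
    if (i < rstart || i >= rend) continue;   /* only locally owned rows */
    for (l = 0; l < 4; l++) {
      j = ele[4*e+l];
      if (j >= rstart && j < rend) on[i-rstart]++;
      else                         off[i-rstart]++;
    }
  }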
/Fredrik Matthew Knepley wrote: > On Mon, Sep 21, 2009 at 4:53 PM, Fredrik Bengzon > > wrote: > > Hi, > Ryan, I'm aware of Barry's post > > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003020.html > > and it workes fine for triangle meshes. However, I do not see how > this can be used for tetrahedral meshes. > > > It is the same for tetrahedra. In fact, this algorithm can be > generalized to work > for any topology: > > http://arxiv.org/abs/0908.4427 > > Matt > > > /Fredrik > > > Ryan Yan wrote: > > Hi Fredrik, > If I understand correctly, I have the same issue as what you > have here. > > I do not have the code yet(It is also depends on how your > store your matrix data.). But I can forward Barry's idea to > you. Hope this is helpful to you. > > Yan, > > Simply read through the ASCII file(s) twice. The first time > count the number of blocks per row, then preallocate then read > through the ASCII file again reading and setting the values. > This will be very fast. > > Barry > - Hide quoted text - > > > On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: > > Hi All, > I have a large size application. Mesh size is 30000 nodes, with > dof 25 on each vertex. And it's 3d unstructured, Tet, and Hex > mesh. In the following I will denote blksize=25 > > I am testing how to build up a PETSc matrix object quick > and fast. > > The data I have is Block Compressed Sparse Row(BCSR) files. > And my > objective is to read BCSR files and generate PETSc Binaries > > Firstly, I choose the MATMPIAIJ, I opened the BCSR data > files on > each processor, and set up the preallocation use > > MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); > The reason why I choose 25 as the number for d_nz and o_nz > is that > I do not have access to the ordering of the vertices. So > it's the > worst case set up, and it takes about 7 minutes on 30 MIPS > node(180 processors) to write the output into PETSc binaries. > > Secondly, I choose the MatMPIBAIJ, and same procedure as above, > but also set up > > MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), > here blksize = 25 and it's also the worst case; This > experiments > takes forever and could not generate the PETSc binaries. > > I guess the reason why it takes so long in the MATMPIBAIJ > case is > that I did not set up the preallocation accurately. Alougth I > think the preallocation is also not accurate in the MATMPIAIJ > case, but it seems like the preallocation effect is not as > serious > as for the MPIBAIJ. Correct me please, if there are other > reasons. > > Can I anyone please give a hint on how to set up the > preallocation > right for a unstructured mesh without knowing the mesh > ordering? > > Thank you very much in advance, > > Yan > > > > > > On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon > > >> wrote: > > Hi, > This is probably the wrong forum to ask, but does anyone have a > piece of code for computing the correct ndnz and nodnz vectors > needed for assembly of the stiffness (MPIAIJ) matrix on an > unstructured tetrahedral mesh given the node-to-element > adjacency? > Thanks, > Fredrik > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener From knepley at gmail.com Mon Sep 21 18:06:56 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 21 Sep 2009 18:06:56 -0500 Subject: sparsity pattern setup In-Reply-To: <4AB7FECC.3060302@math.umu.se> References: <4AB7E0E8.3000203@math.umu.se> <4AB7F5CE.1060607@math.umu.se> <4AB7FECC.3060302@math.umu.se> Message-ID: On Mon, Sep 21, 2009 at 5:31 PM, Fredrik Bengzon < fredrik.bengzon at math.umu.se> wrote: > Matt, > Clearly, I must be missing something. If I loop over the elements and > record all edges between the nodes in a face by adding to the vectors "on" > and "off" as the algorithm says, won't an edge be counted more than two > times in general, thus giving a too big value in "on" and "off"? > You do not loop over edges ever that I see. He loops over nodes, and checks nodes that share an edge (in 2D) or face (in 3D). Matt > /Fredrik > > > Matthew Knepley wrote: > > On Mon, Sep 21, 2009 at 4:53 PM, Fredrik Bengzon < >> fredrik.bengzon at math.umu.se > wrote: >> >> Hi, >> Ryan, I'm aware of Barry's post >> >> >> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003020.html >> >> and it workes fine for triangle meshes. However, I do not see how >> this can be used for tetrahedral meshes. >> >> >> It is the same for tetrahedra. In fact, this algorithm can be generalized >> to work >> for any topology: >> >> http://arxiv.org/abs/0908.4427 >> >> Matt >> >> /Fredrik >> >> >> Ryan Yan wrote: >> >> Hi Fredrik, >> If I understand correctly, I have the same issue as what you >> have here. >> >> I do not have the code yet(It is also depends on how your >> store your matrix data.). But I can forward Barry's idea to >> you. Hope this is helpful to you. >> >> Yan, >> >> Simply read through the ASCII file(s) twice. The first time >> count the number of blocks per row, then preallocate then read >> through the ASCII file again reading and setting the values. >> This will be very fast. >> >> Barry >> - Hide quoted text - >> >> >> On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: >> >> Hi All, >> I have a large size application. Mesh size is 30000 nodes, with >> dof 25 on each vertex. And it's 3d unstructured, Tet, and Hex >> mesh. In the following I will denote blksize=25 >> >> I am testing how to build up a PETSc matrix object quick >> and fast. >> >> The data I have is Block Compressed Sparse Row(BCSR) files. >> And my >> objective is to read BCSR files and generate PETSc Binaries >> >> Firstly, I choose the MATMPIAIJ, I opened the BCSR data >> files on >> each processor, and set up the preallocation use >> >> MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); >> The reason why I choose 25 as the number for d_nz and o_nz >> is that >> I do not have access to the ordering of the vertices. So >> it's the >> worst case set up, and it takes about 7 minutes on 30 MIPS >> node(180 processors) to write the output into PETSc binaries. >> >> Secondly, I choose the MatMPIBAIJ, and same procedure as above, >> but also set up >> >> MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), >> here blksize = 25 and it's also the worst case; This >> experiments >> takes forever and could not generate the PETSc binaries. >> >> I guess the reason why it takes so long in the MATMPIBAIJ >> case is >> that I did not set up the preallocation accurately. Alougth I >> think the preallocation is also not accurate in the MATMPIAIJ >> case, but it seems like the preallocation effect is not as >> serious >> as for the MPIBAIJ. 
Correct me please, if there are other >> reasons. >> >> Can I anyone please give a hint on how to set up the >> preallocation >> right for a unstructured mesh without knowing the mesh >> ordering? >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> >> On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon >> > >> > >> wrote: >> >> Hi, >> This is probably the wrong forum to ask, but does anyone have a >> piece of code for computing the correct ndnz and nodnz vectors >> needed for assembly of the stiffness (MPIAIJ) matrix on an >> unstructured tetrahedral mesh given the node-to-element >> adjacency? >> Thanks, >> Fredrik >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredrik.bengzon at math.umu.se Tue Sep 22 07:51:59 2009 From: fredrik.bengzon at math.umu.se (Fredrik Bengzon) Date: Tue, 22 Sep 2009 14:51:59 +0200 Subject: sparsity pattern setup In-Reply-To: References: <4AB7E0E8.3000203@math.umu.se> <4AB7F5CE.1060607@math.umu.se> Message-ID: <4AB8C86F.6010404@math.umu.se> Hi, This is the problem with the allocation algorithm in 3d I was talking about. I guess i should have read the whole thread before asking stuff :) https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003022.html /Fredrik Matthew Knepley wrote: > On Mon, Sep 21, 2009 at 4:53 PM, Fredrik Bengzon > > wrote: > > Hi, > Ryan, I'm aware of Barry's post > > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2008-May/003020.html > > and it workes fine for triangle meshes. However, I do not see how > this can be used for tetrahedral meshes. > > > It is the same for tetrahedra. In fact, this algorithm can be > generalized to work > for any topology: > > http://arxiv.org/abs/0908.4427 > > Matt > > > /Fredrik > > > Ryan Yan wrote: > > Hi Fredrik, > If I understand correctly, I have the same issue as what you > have here. > > I do not have the code yet(It is also depends on how your > store your matrix data.). But I can forward Barry's idea to > you. Hope this is helpful to you. > > Yan, > > Simply read through the ASCII file(s) twice. The first time > count the number of blocks per row, then preallocate then read > through the ASCII file again reading and setting the values. > This will be very fast. > > Barry > - Hide quoted text - > > > On Sep 20, 2009, at 10:20 AM, Ryan Yan wrote: > > Hi All, > I have a large size application. Mesh size is 30000 nodes, with > dof 25 on each vertex. And it's 3d unstructured, Tet, and Hex > mesh. In the following I will denote blksize=25 > > I am testing how to build up a PETSc matrix object quick > and fast. > > The data I have is Block Compressed Sparse Row(BCSR) files. > And my > objective is to read BCSR files and generate PETSc Binaries > > Firstly, I choose the MATMPIAIJ, I opened the BCSR data > files on > each processor, and set up the preallocation use > > MatMPIAIJSetPreallocation(A,blksize,PETSC_NULL,blksize,PETSC_NULL); > The reason why I choose 25 as the number for d_nz and o_nz > is that > I do not have access to the ordering of the vertices. 
So > it's the > worst case set up, and it takes about 7 minutes on 30 MIPS > node(180 processors) to write the output into PETSc binaries. > > Secondly, I choose the MatMPIBAIJ, and same procedure as above, > but also set up > > MatMPIBAIJSetPreallocation(A,blksize,blksize,PETSC_NULL,blksize,PETSC_NULL), > here blksize = 25 and it's also the worst case; This > experiments > takes forever and could not generate the PETSc binaries. > > I guess the reason why it takes so long in the MATMPIBAIJ > case is > that I did not set up the preallocation accurately. Alougth I > think the preallocation is also not accurate in the MATMPIAIJ > case, but it seems like > the preallocation effect is not as > serious > as for the MPIBAIJ. Correct me > please, if there are other > reasons. > > Can I anyone please give a hint on how to set up the > preallocation > right for a unstructured mesh without knowing the mesh > ordering? > > Thank you very much in advance, > > Yan > > > > > > On Mon, Sep 21, 2009 at 4:24 PM, Fredrik Bengzon > > >> wrote: > > Hi, > This is probably the wrong forum to ask, but does anyone have a > piece of code for computing the correct ndnz and nodnz vectors > needed for assembly of the stiffness (MPIAIJ) matrix on an > unstructured tetrahedral mesh given the > node-to-element adjacency? > Thanks, > Fredrik > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener
From rodrigowpa at gmail.com Tue Sep 22 08:50:07 2009 From: rodrigowpa at gmail.com (Rodrigo Araujo) Date: Tue, 22 Sep 2009 10:50:07 -0300 Subject: Cluster of FPGA Message-ID: <357feb30909220650p56704133mdafdb8620c1f9a9@mail.gmail.com>
Good Morning All, I need to implement a cluster in which each PC has an FPGA connected to the PCIe slot, the goal being to multiply dense matrices. Is that possible using PETSc? Best Regards. -- Rodrigo W. Pimentel Araujo Engenharia da Computação UFPE
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From s.kramer at imperial.ac.uk Tue Sep 22 08:47:52 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Tue, 22 Sep 2009 14:47:52 +0100 Subject: some sor questions Message-ID: <4AB8D588.8080606@imperial.ac.uk>
Hi all, I have some questions basically about the MatRelax_SeqAIJ routine: If I understand correctly there are 2 versions of the sor routine depending on whether or not there is a zero guess, so that with a zero guess in the forward sweep you don't need to multiply the upper triangular part U with the part of the x vector that is still zero. Why then does it look like both versions log the same number of flops? I would have expected that the full forward sweep (i.e. no zero guess) takes 2*a->nz flops (i.e. the same as a matvec) and not a->nz. Why does the Richardson iteration with sor not take the zero guess into account, i.e. why does PCApplyRichardson_SOR not set SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp has a zero initial guess set? In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or SOR_LOCAL_BACKWARD_SWEEP it calls MatRelax on the local part of the matrix, mat->A, with its=lits and lits=PETSC_NULL (?). However the first line of MatRelax_SeqAIJ then says: its = its*lits. Is that right? Please tell me if I'm totally misunderstanding how the routine works; thanks for any help.
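(Just to make my flop bookkeeping explicit, the zero-guess forward sweep I have in mind is, schematically, the following; ai/aj/aa (CSR arrays), adiag[i] (position of the diagonal of row i), b, x, m and omega are all invented names:)

PetscInt    i, p;
PetscScalar s;  /* the CSR arrays and omega assumed given */
for (i = 0; i < m; i++) {
  s = b[i];
  for (p = ai[i]; p < adiag[i]; p++) s -= aa[p]*x[aj[p]];  /* j < i only */
  x[i] = omega*s/aa[adiag[i]];
}
/* roughly 2 flops per stored strictly-lower entry, i.e. about a->nz flops in
   total; the general sweep also subtracts U*x and costs about 2*a->nz */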
Cheers Stephan -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London
From knepley at gmail.com Tue Sep 22 09:26:04 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 Sep 2009 09:26:04 -0500 Subject: Cluster of FPGA In-Reply-To: <357feb30909220650p56704133mdafdb8620c1f9a9@mail.gmail.com> References: <357feb30909220650p56704133mdafdb8620c1f9a9@mail.gmail.com> Message-ID:
On Tue, Sep 22, 2009 at 8:50 AM, Rodrigo Araujo wrote: > Good Morning All, > > I need to implement a cluster in which each PC has an FPGA > connected to the PCIe slot, the goal being to multiply dense > matrices. Is that possible using PETSc? > 1) I have never seen great performance from FPGAs (with the possible exception of GRAPE). I would strongly encourage you to rethink this plan, and instead go with something like a CUDA solution. 2) Software already exists to do dense linear algebra on GPUs, and would be easy to integrate with our existing dense matrix implementation. Matt > Best Regards. > > -- > Rodrigo W. Pimentel Araujo > Engenharia da Computação > UFPE > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From vyan2000 at gmail.com Tue Sep 22 11:35:01 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 22 Sep 2009 12:35:01 -0400 Subject: ell1 norm convergence test In-Reply-To: References: <9C51ACE4-A8CB-4F0D-A1CD-E921B6106947@mcs.anl.gov> Message-ID:
Just to share the experience. It is indeed very expensive to fetch the true residual out. I print the ell-1 norm by setting up my own monitor, alongside -ksp_monitor_true_residual. An interesting observation is that in this problem the ell_1 and ell_2 norms have almost the same error-reduction history, at least in order of magnitude. Yan
On Mon, Sep 21, 2009 at 3:29 PM, Barry Smith wrote: > > Likely it is too expensive to use the 1-norm for stopping criteria for > fgmres unless you can reformulate fgmres to give you back the residual > efficiently. > > Barry > > > On Sep 21, 2009, at 2:04 PM, Ryan Yan wrote: > > Hi Barry, >> Thanks for the suggestion. >> >> I am using FGMRES as the krylov solver. Is there a direct access to the >> residual vector for FGMRES. I guess the answer is yes, since that's the >> Krylov method which supports right preconditioning and provides the true >> residual 2_Norm monitor, right? >> >> If so, which header should I include to fetch the residual directly from >> KSP_FGMRES? >> >> Yan >> >> >> >> On Mon, Sep 21, 2009 at 2:43 PM, Barry Smith wrote: >> >> Yan, >> >> This depends on the Krylov method being used. For example with GMRES the >> residual vector is NOT available at each iteration (the 2-norm of the >> residual is approximated via a recurrence relationship). >> >> You can call KSPBuildResidual() to have the true residual computed for >> you. Note: it is expensive because it actually builds the current solution >> and computes r = b - A*x >> >> What we intended is that if you want the residual efficiently for example >> for the CG method, you determine what Krylov method you want to use the >> include the appropriate private include file and access the residual >> directly from the data structure. This would be efficient. (but like I said >> does not work for all methods).
For cg include src/ksp/ksp/impls/cg/cgctx.h >> >> >> Barry >> >> >> On Sep 21, 2009, at 12:44 PM, Ryan Yan wrote: >> >> Hi All, >> The following question is only for test and comparison reason. >> >> Is there a command line option to set up using 1-Norm(sum of absolute >> value of residuals) as convergence monitor and test. >> >> Alternatively, if I can get the residual "r" out at the end of each >> iteration, then it is going to be much simpler, since I can just call >> VecNorm(r, NORM_1, &r_ell1) and pass the r_ell1 into the convergence monitor >> and test. Can anyone inform me if PETSc has an interface for the resuidual >> r? >> >> Or one has to call the convergence test: >> MyKSPConverged(ksp,n,rnorm,flag,dummy); >> and following the steps: >> 1. fetch the exact solution: >> >> KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr) >> >> 2. calculate the residual: >> >> pass the Right Hand side and the Matrix in to calculate the 1-Norm. But >> this step is *not* obvious, since the matrix and rhs is already distributed >> over each processes. >> >> 3. set up the convergence test based on the 1_Norm residual. >> >> >> Thank you very much, >> >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 22 13:38:34 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 22 Sep 2009 13:38:34 -0500 Subject: some sor questions In-Reply-To: <4AB8D588.8080606@imperial.ac.uk> References: <4AB8D588.8080606@imperial.ac.uk> Message-ID: <08102395-B009-4844-AD16-0EB4DC22848F@mcs.anl.gov> On Sep 22, 2009, at 8:47 AM, Stephan Kramer wrote: > Hi all, > > I have some questions basically about the MatRelax_SeqAIJ routine: > > If I understand correctly there are 2 versions of the sor routine > depending on whether or not there is a zero guess, so that with a > zero guess in the forward sweep you don't need to multiply the upper > diagonal part U with the part of the x vector that is still zero. > Why then does it look like that both versions log the same number of > flops? I would have expected that the full forward sweep (i.e. no > zero guess) takes 2*a->nz flops (i.e. the same as a matvec) and not > a->nz. You are right. This is an error in our code. It will be in the next patch. > > Why does the Richardson iteration with sor not take the zero guess > into account, i.e. why does PCApplyRichardson_SOR not set > SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp > has a zero initial guess set? This appears to be a design limitation. There is no mechanism to pass the information that the initial guess is zero into PCApplyRichardson(). We could add support for this by adding one more argument to PCApplyRichardson() for this information. I don't see a simpler way. If one is running, say 2 iterations of Richardson then this would be a measurable improvement in time. If one is running many iterations then the savings is tiny. Perhaps this support should be added. > > In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or > SOR_LOCAL_BACKWARD_SWEEP it > calls MatRelax on the local part of the matrix, mat->A, with > its=lits and lits=PETSC_NULL (?). > However the first line of MatRelax_SeqAIJ then says: its = its*lits. > Is that right? This is all wrong. It should be passing 1 in, not PETSC_NULL. This was fixed in petsc-dev but not in petsc-3.0.0 I will fix it in petsc-3.0.0 and it will be in the next patch. Thanks for pointing out the problems. 
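Schematically, the PCApplyRichardson() extension mentioned above would look something like the sketch below (the idea only, not a committed interface; the guesszero argument is hypothetical):

/* one extra argument carries the zero-initial-guess information down */
PetscErrorCode PCApplyRichardson(PC pc, Vec b, Vec y, Vec w,
                                 PetscReal rtol, PetscReal abstol, PetscReal dtol,
                                 PetscInt its, PetscTruth guesszero);

/* inside the SOR implementation the flag would select the cheap first sweep */
if (guesszero) sortype = (MatSORType)(sortype | SOR_ZERO_INITIAL_GUESS);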
If you plan to use SOR a lot, you might consider switching to petsc-dev since I have made some improvements there. Also consider the Eisenstat trick preconditioner. Barry > > Please tell me if I'm totally misunderstanding how the routine > works, thanks for any help. > > Cheers > Stephan > > -- > Stephan Kramer > Applied Modelling and Computation Group, > Department of Earth Science and Engineering, > Imperial College London From recrusader at gmail.com Tue Sep 22 15:14:48 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 22 Sep 2009 15:14:48 -0500 Subject: where to realize "petscsetcommonblock_" and "petscgetcommoncomm_" Message-ID: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> Hello, PETSc developer I am trying to compile my codes. However, I got the following errors. "Linking myproj-dbg... /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined reference to `petscsetcommonblock_' /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined reference to `petscgetcommoncomm_' " I can't find "petscsetcommonblock_" and "petscgetcommoncomm_". I have checked libpetsc.so using nm commands as follows: "00000000000c65a4 T petscsequentialphasebegin_ 00000000000c6574 T petscsequentialphaseend_ U petscsetcommonblock_ 0000000000032cb4 T petscsetdefaultdebugger_ ...... 00000000003dcba0 B petscfortran9_ 00000000001441a2 T petscfprintf_ 000000000013e2c8 T petscgetarchtype_ U petscgetcommoncomm_ 0000000000052f18 T petscgetcputime_ 000000000014761e T petscgetflops_ " There is not the address linker for them. I can't find the realization for these two functions in PETSc codes. Could you give me some advice? CPU in my PC is x86_64bits. GCC is 4.2 version. Thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 22 15:26:53 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 22 Sep 2009 15:26:53 -0500 Subject: where to realize "petscsetcommonblock_" and "petscgetcommoncomm_" In-Reply-To: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> References: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> Message-ID: <29A783E3-B2A7-4EF0-AA3B-1E32D9D66428@mcs.anl.gov> After PETSc is installed do the PETSc examples work? That is, does "make test" run correctly? If it does not run then send configure.log and make.log to petsc-maint at mcs.anl.gov If the tests do run then there is something wrong with your makefiles and how you link to PETSc libraries. cd to src/snes/examples/ tutorials do make ex1f and make sure your makefile links using the same libraries. Best to base your makefile on a PETSc makefile then it is portable and easy to maintain. Barry On Sep 22, 2009, at 3:14 PM, Yujie wrote: > Hello, PETSc developer > > I am trying to compile my codes. However, I got the following errors. > "Linking myproj-dbg... > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined > reference to `petscsetcommonblock_' > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined > reference to `petscgetcommoncomm_' > " > I can't find "petscsetcommonblock_" and "petscgetcommoncomm_". I > have checked libpetsc.so using nm commands as follows: > "00000000000c65a4 T petscsequentialphasebegin_ > 00000000000c6574 T petscsequentialphaseend_ > U petscsetcommonblock_ > 0000000000032cb4 T petscsetdefaultdebugger_ > ...... 
> > 00000000003dcba0 B petscfortran9_ > 00000000001441a2 T petscfprintf_ > 000000000013e2c8 T petscgetarchtype_ > U petscgetcommoncomm_ > 0000000000052f18 T petscgetcputime_ > 000000000014761e T petscgetflops_ > " > There is not the address linker for them. I can't find the > realization for these two functions in PETSc codes. Could you give > me some advice? > CPU in my PC is x86_64bits. GCC is 4.2 version. Thanks a lot. > > Regards, > Yujie > From balay at mcs.anl.gov Tue Sep 22 15:26:54 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 22 Sep 2009 15:26:54 -0500 (CDT) Subject: where to realize "petscsetcommonblock_" and "petscgetcommoncomm_" In-Reply-To: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> References: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> Message-ID: petscgetcommoncomm_ should be in somefort.F Can you verify if you've configured/compiled PETSc with a fortran compiler. And also verify that there are no errors during petsc build [i.e in make.log] >>>>>> asterix:/home/balay/tmp/petsc-dist-test/asterix64/lib>nm -Ao *.a |grep -i petscgetcommoncomm libpetsc.a:zstart.o: U petscgetcommoncomm_ libpetsc.a:somefort.o:00000000000000a2 T petscgetcommoncomm_ asterix:/home/balay/tmp/petsc-dist-test/asterix64/lib> <<<<<<< Satish On Tue, 22 Sep 2009, Yujie wrote: > Hello, PETSc developer > > I am trying to compile my codes. However, I got the following errors. > "Linking myproj-dbg... > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined reference > to `petscsetcommonblock_' > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined reference > to `petscgetcommoncomm_' > " > I can't find "petscsetcommonblock_" and "petscgetcommoncomm_". I have > checked libpetsc.so using nm commands as follows: > "00000000000c65a4 T petscsequentialphasebegin_ > 00000000000c6574 T petscsequentialphaseend_ > U petscsetcommonblock_ > 0000000000032cb4 T petscsetdefaultdebugger_ > ...... > > 00000000003dcba0 B petscfortran9_ > 00000000001441a2 T petscfprintf_ > 000000000013e2c8 T petscgetarchtype_ > U petscgetcommoncomm_ > 0000000000052f18 T petscgetcputime_ > 000000000014761e T petscgetflops_ > " > There is not the address linker for them. I can't find the realization for > these two functions in PETSc codes. Could you give me some advice? > CPU in my PC is x86_64bits. GCC is 4.2 version. Thanks a lot. > > Regards, > Yujie > From s.kramer at imperial.ac.uk Tue Sep 22 15:37:37 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Tue, 22 Sep 2009 21:37:37 +0100 Subject: some sor questions In-Reply-To: <08102395-B009-4844-AD16-0EB4DC22848F@mcs.anl.gov> References: <4AB8D588.8080606@imperial.ac.uk> <08102395-B009-4844-AD16-0EB4DC22848F@mcs.anl.gov> Message-ID: <4AB93591.2020306@imperial.ac.uk> Thanks for your answers Barry Smith wrote: > On Sep 22, 2009, at 8:47 AM, Stephan Kramer wrote: > >> Hi all, >> >> I have some questions basically about the MatRelax_SeqAIJ routine: >> >> If I understand correctly there are 2 versions of the sor routine >> depending on whether or not there is a zero guess, so that with a >> zero guess in the forward sweep you don't need to multiply the upper >> diagonal part U with the part of the x vector that is still zero. >> Why then does it look like that both versions log the same number of >> flops? I would have expected that the full forward sweep (i.e. no >> zero guess) takes 2*a->nz flops (i.e. the same as a matvec) and not >> a->nz. > > You are right. This is an error in our code. 
It will be in the > next patch. > >> Why does the Richardson iteration with sor not take the zero guess >> into account, i.e. why does PCApplyRichardson_SOR not set >> SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp >> has a zero initial guess set? > > This appears to be a design limitation. There is no mechanism to > pass the information that the initial guess is zero into > PCApplyRichardson(). We could add support for this by adding one more > argument to PCApplyRichardson() for this information. I don't see a > simpler way. If one is running, say 2 iterations of Richardson then > this would be a measurable improvement in time. If one is running many > iterations then the savings is tiny. Perhaps this support should be > added. I'm thinking of the application of ksp richardson with sor as a smoother in pcmg. In which case the down smoother will have zero initial guess (as it only acts on the residual), and there will be typicaly only 1 or 2 iterations, so the saving would be significant. Is there another way I should set this up instead? > > > >> In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or >> SOR_LOCAL_BACKWARD_SWEEP it > > >> calls MatRelax on the local part of the matrix, mat->A, with >> its=lits and lits=PETSC_NULL (?). > >> However the first line of MatRelax_SeqAIJ then says: its = its*lits. >> Is that right? > > This is all wrong. It should be passing 1 in, not PETSC_NULL. This > was fixed in petsc-dev but not in petsc-3.0.0 I will fix it in > petsc-3.0.0 and it will be in the next patch. > > Thanks for pointing out the problems. > > If you plan to use SOR a lot, you might consider switching to > petsc-dev since I have made some improvements there. Also consider the > Eisenstat trick preconditioner. > > Barry > > >> Please tell me if I'm totally misunderstanding how the routine >> works, thanks for any help. >> >> Cheers >> Stephan >> >> -- >> Stephan Kramer >> Applied Modelling and Computation Group, >> Department of Earth Science and Engineering, >> Imperial College London > > From bsmith at mcs.anl.gov Tue Sep 22 15:40:51 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 22 Sep 2009 15:40:51 -0500 Subject: some sor questions In-Reply-To: <4AB93591.2020306@imperial.ac.uk> References: <4AB8D588.8080606@imperial.ac.uk> <08102395-B009-4844-AD16-0EB4DC22848F@mcs.anl.gov> <4AB93591.2020306@imperial.ac.uk> Message-ID: <1B44D38B-6D3F-49F4-B82F-5568D503BB80@mcs.anl.gov> On Sep 22, 2009, at 3:37 PM, Stephan Kramer wrote: > Thanks for your answers > > Barry Smith wrote: >> On Sep 22, 2009, at 8:47 AM, Stephan Kramer wrote: >>> Hi all, >>> >>> I have some questions basically about the MatRelax_SeqAIJ routine: >>> >>> If I understand correctly there are 2 versions of the sor routine >>> depending on whether or not there is a zero guess, so that with a >>> zero guess in the forward sweep you don't need to multiply the >>> upper diagonal part U with the part of the x vector that is still >>> zero. Why then does it look like that both versions log the same >>> number of flops? I would have expected that the full forward >>> sweep (i.e. no zero guess) takes 2*a->nz flops (i.e. the same as >>> a matvec) and not a->nz. >> You are right. This is an error in our code. It will be in the >> next patch. >>> Why does the Richardson iteration with sor not take the zero >>> guess into account, i.e. why does PCApplyRichardson_SOR not set >>> SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp >>> has a zero initial guess set? 
>> This appears to be a design limitation. There is no mechanism >> to pass the information that the initial guess is zero into >> PCApplyRichardson(). We could add support for this by adding one >> more argument to PCApplyRichardson() for this information. I don't >> see a simpler way. If one is running, say 2 iterations of >> Richardson then this would be a measurable improvement in time. If >> one is running many iterations then the savings is tiny. Perhaps >> this support should be added. > > I'm thinking of the application of ksp richardson with sor as a > smoother in pcmg. In which case the down smoother will have zero > initial guess (as it only acts on the residual), and there will be > typicaly only 1 or 2 iterations, so the saving would be significant. > Is there another way I should set this up instead? I will add this support to petsc-dev. It should be there by tomorrow morning. Thanks for pointing this out and please let me know if you see other issues like these that I can fix. Barry > >>> In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or >>> SOR_LOCAL_BACKWARD_SWEEP it >>> calls MatRelax on the local part of the matrix, mat->A, with >>> its=lits and lits=PETSC_NULL (?). >>> However the first line of MatRelax_SeqAIJ then says: its = >>> its*lits. Is that right? >> This is all wrong. It should be passing 1 in, not PETSC_NULL. >> This was fixed in petsc-dev but not in petsc-3.0.0 I will fix it >> in petsc-3.0.0 and it will be in the next patch. >> Thanks for pointing out the problems. >> If you plan to use SOR a lot, you might consider switching to >> petsc-dev since I have made some improvements there. Also consider >> the Eisenstat trick preconditioner. >> Barry >>> Please tell me if I'm totally misunderstanding how the routine >>> works, thanks for any help. >>> >>> Cheers >>> Stephan >>> >>> -- >>> Stephan Kramer >>> Applied Modelling and Computation Group, >>> Department of Earth Science and Engineering, >>> Imperial College London > From recrusader at gmail.com Tue Sep 22 19:35:07 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 22 Sep 2009 18:35:07 -0600 Subject: where to realize "petscsetcommonblock_" and "petscgetcommoncomm_" In-Reply-To: References: <7ff0ee010909221314s158b20a6u355262adfb640e28@mail.gmail.com> Message-ID: <7ff0ee010909221735m3cfda2ffi3885ab272b99acda@mail.gmail.com> I have solved it. Thanks so much, Barry and Satish:). Regards, Yujie On Tue, Sep 22, 2009 at 2:26 PM, Satish Balay wrote: > petscgetcommoncomm_ should be in somefort.F > > Can you verify if you've configured/compiled PETSc with a > fortran compiler. > > And also verify that there are no errors during petsc build [i.e in > make.log] > > > >>>>>> > asterix:/home/balay/tmp/petsc-dist-test/asterix64/lib>nm -Ao *.a |grep -i > petscgetcommoncomm > libpetsc.a:zstart.o: U petscgetcommoncomm_ > libpetsc.a:somefort.o:00000000000000a2 T petscgetcommoncomm_ > asterix:/home/balay/tmp/petsc-dist-test/asterix64/lib> > <<<<<<< > > Satish > > On Tue, 22 Sep 2009, Yujie wrote: > > > Hello, PETSc developer > > > > I am trying to compile my codes. However, I got the following errors. > > "Linking myproj-dbg... > > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined > reference > > to `petscsetcommonblock_' > > /home/yujie/codes/petsc-3.0.0-p3/linux/lib/libpetsc.so: undefined > reference > > to `petscgetcommoncomm_' > > " > > I can't find "petscsetcommonblock_" and "petscgetcommoncomm_". 
I have > > checked libpetsc.so using nm commands as follows: > > "00000000000c65a4 T petscsequentialphasebegin_ > > 00000000000c6574 T petscsequentialphaseend_ > > U petscsetcommonblock_ > > 0000000000032cb4 T petscsetdefaultdebugger_ > > ...... > > > > 00000000003dcba0 B petscfortran9_ > > 00000000001441a2 T petscfprintf_ > > 000000000013e2c8 T petscgetarchtype_ > > U petscgetcommoncomm_ > > 0000000000052f18 T petscgetcputime_ > > 000000000014761e T petscgetflops_ > > " > > There is not the address linker for them. I can't find the realization > for > > these two functions in PETSc codes. Could you give me some advice? > > CPU in my PC is x86_64bits. GCC is 4.2 version. Thanks a lot. > > > > Regards, > > Yujie > > > >
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From dave.mayhem23 at gmail.com Wed Sep 23 00:04:00 2009 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 23 Sep 2009 15:04:00 +1000 Subject: DA with imposed parallel decomposition Message-ID: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com>
Hello, Suppose I have a DA and I enforce the parallel decomposition during creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). If I now create a second DA using DARefine(), am I always also able to obtain an interpolation operator between the two DA's via DAGetInterpolation()? Under what circumstances will DAGetInterpolation() fail when used between DA's generated in this manner? Cheers, Dave
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From knepley at gmail.com Wed Sep 23 07:33:02 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 Sep 2009 07:33:02 -0500 Subject: DA with imposed parallel decomposition In-Reply-To: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> Message-ID:
Since it just inserts a point on every edge and face (in 2D), I do not see why it would fail. Does it? Matt
On Wed, Sep 23, 2009 at 12:04 AM, Dave May wrote: > Hello, > Suppose I have a DA and I enforce the parallel decomposition during > creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). If I now > create a second DA using DARefine(), am I always also able to obtain an > interpolation operator between the two DA's via DAGetInterpolation()? > > Under what circumstances will DAGetInterpolation() fail when used between > DA's generated in this manner? > > Cheers, > Dave > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From dave.mayhem23 at gmail.com Wed Sep 23 07:47:42 2009 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 23 Sep 2009 14:47:42 +0200 Subject: DA with imposed parallel decomposition In-Reply-To: References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> Message-ID: <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com>
Hey Matt, In one piece of code I have, yes the call to DAGetInterpolation() does seem to cause a nasty crash. It doesn't occur all the time, just with certain processor counts (64) and certain mesh sizes (80x80x40). I was wondering if there were some pathological cases I did not know about. I think I will have to write a stand alone test case to see if I can reproduce the error in a simpler code.
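Something like the following is what I have in mind for the test (a rough sketch with petsc-3.0 style calls; the lx/ly/lz values are hard coded by hand for the 4x4x4 = 64 process, 80x80x40 case that bites me, so it would be run with mpiexec -n 64):

#include "petscda.h"

int main(int argc, char **argv)
{
  DA             da, daf;
  Mat            A;
  Vec            scale;
  PetscErrorCode ierr;
  /* 80x80x40 nodes over a 4x4x4 process grid: 20/20/10 nodes per process */
  PetscInt       lx[4] = {20,20,20,20}, ly[4] = {20,20,20,20}, lz[4] = {10,10,10,10};

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);
  ierr = DACreate3d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_BOX,
                    80, 80, 40, 4, 4, 4, 1, 1, lx, ly, lz, &da); CHKERRQ(ierr);
  ierr = DARefine(da, PETSC_COMM_WORLD, &daf); CHKERRQ(ierr);
  ierr = DAGetInterpolation(da, daf, &A, &scale); CHKERRQ(ierr);  /* suspected crash site */
  ierr = MatDestroy(A); CHKERRQ(ierr);
  ierr = VecDestroy(scale); CHKERRQ(ierr);
  ierr = DADestroy(daf); CHKERRQ(ierr);
  ierr = DADestroy(da); CHKERRQ(ierr);
  ierr = PetscFinalize(); CHKERRQ(ierr);
  return 0;
}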
I don't think what I'm doing should cause a problem, but I'm not sure how best to debug the problem I have. Any hints would be appreciated. :) Cheers, Dave On Wed, Sep 23, 2009 at 2:33 PM, Matthew Knepley wrote: > Since it just inserts a point on every edge and face (in 2D), I do not see > why it would fail. Does it? > > Matt > > > On Wed, Sep 23, 2009 at 12:04 AM, Dave May wrote: > >> Hello, >> Suppose I have a DA and I enforce the parallel decomposition during >> creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). If I now >> create a second DA using DARefine(), am I alays also able to obtain an >> interpolation operator between the two DA's via DAGetInterpolation()? >> >> Under what circumstance will DAGetInterpolation() fail when used between >> DA's generated in this manner? >> >> Cheers, >> Dave >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 23 08:00:10 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Sep 2009 08:00:10 -0500 Subject: DA with imposed parallel decomposition In-Reply-To: <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> Message-ID: On Sep 23, 2009, at 7:47 AM, Dave May wrote: > Hey Matt, > In one piece of code I have, yes the call to > DAGetInterpolation() does seem to cause a nasty crash. A crash? Segmentation violation etc? Or an error message saying that the decomposition is not supported? They are very different things: a crash is a bug we need to deal with, unsupported is just because it is too hard to support refinement with all decompositions. Barry > It doesn't occur all the time, just with certain processor sizes > (64) and certain mesh sizes (80x80x40). I was wondering if there was > some pathological cases I did not know about. > > I think I will have to write a stand alone test case to see if I can > reproduce the error in a simpler code. > I don't think what I'm doing should cause a problem, but I'm not > sure how best to debug the problem I have. > > Any hints would be appreciated. :) > > Cheers, > Dave > > > > On Wed, Sep 23, 2009 at 2:33 PM, Matthew Knepley > wrote: > Since it just inserts a point on every edge and face (in 2D), I do > not see why it would fail. Does it? > > Matt > > > On Wed, Sep 23, 2009 at 12:04 AM, Dave May > wrote: > Hello, > Suppose I have a DA and I enforce the parallel decomposition > during creation by specifying the arrays lx[], ly[], lz[] in > DACreate3d(). If I now create a second DA using DARefine(), am I > alays also able to obtain an interpolation operator between the two > DA's via DAGetInterpolation()? > > Under what circumstance will DAGetInterpolation() fail when used > between DA's generated in this manner? > > Cheers, > Dave > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > From knepley at gmail.com Wed Sep 23 08:02:35 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 Sep 2009 08:02:35 -0500 Subject: DA with imposed parallel decomposition In-Reply-To: <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> Message-ID: On Wed, Sep 23, 2009 at 7:47 AM, Dave May wrote: > Hey Matt, > In one piece of code I have, yes the call to DAGetInterpolation() does > seem to cause a nasty crash. > It doesn't occur all the time, just with certain processor sizes (64) and > certain mesh sizes (80x80x40). I was wondering if there was some > pathological cases I did not know about. > > I think I will have to write a stand alone test case to see if I can > reproduce the error in a simpler code. > I don't think what I'm doing should cause a problem, but I'm not sure how > best to debug the problem I have. > > Any hints would be appreciated. :) > I guess it might be possible for you to specify a partition that breaks it, but I have a hard time envisioning it. I would print out the DA sizes (happens with DAView()) for each level every time. Matt > Cheers, > Dave > > > > > On Wed, Sep 23, 2009 at 2:33 PM, Matthew Knepley wrote: > >> Since it just inserts a point on every edge and face (in 2D), I do not see >> why it would fail. Does it? >> >> Matt >> >> >> On Wed, Sep 23, 2009 at 12:04 AM, Dave May wrote: >> >>> Hello, >>> Suppose I have a DA and I enforce the parallel decomposition during >>> creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). If I now >>> create a second DA using DARefine(), am I alays also able to obtain an >>> interpolation operator between the two DA's via DAGetInterpolation()? >>> >>> Under what circumstance will DAGetInterpolation() fail when used between >>> DA's generated in this manner? >>> >>> Cheers, >>> Dave >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Sep 23 08:07:09 2009 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 23 Sep 2009 15:07:09 +0200 Subject: DA with imposed parallel decomposition In-Reply-To: References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> Message-ID: <956373f0909230607q3e4725d3r5e3caf655cb5c02@mail.gmail.com> Sorry Barry, I should have been more specific. I get a segmentation violation. I think writing a simple test to isolate the bug is the best option to debug this error. Do you agree? Cheers, Dave On Wed, Sep 23, 2009 at 3:00 PM, Barry Smith wrote: > > On Sep 23, 2009, at 7:47 AM, Dave May wrote: > > Hey Matt, >> In one piece of code I have, yes the call to DAGetInterpolation() does >> seem to cause a nasty crash. >> > > A crash? Segmentation violation etc? Or an error message saying that the > decomposition is not supported? They are very different things: a crash is a > bug we need to deal with, unsupported is just because it is too hard to > support refinement with all decompositions. 
> > Barry > > > It doesn't occur all the time, just with certain processor sizes (64) and >> certain mesh sizes (80x80x40). I was wondering if there was some >> pathological cases I did not know about. >> >> I think I will have to write a stand alone test case to see if I can >> reproduce the error in a simpler code. >> I don't think what I'm doing should cause a problem, but I'm not sure how >> best to debug the problem I have. >> >> Any hints would be appreciated. :) >> >> Cheers, >> Dave >> >> >> >> On Wed, Sep 23, 2009 at 2:33 PM, Matthew Knepley >> wrote: >> Since it just inserts a point on every edge and face (in 2D), I do not see >> why it would fail. Does it? >> >> Matt >> >> >> On Wed, Sep 23, 2009 at 12:04 AM, Dave May >> wrote: >> Hello, >> Suppose I have a DA and I enforce the parallel decomposition during >> creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). If I now >> create a second DA using DARefine(), am I alays also able to obtain an >> interpolation operator between the two DA's via DAGetInterpolation()? >> >> Under what circumstance will DAGetInterpolation() fail when used between >> DA's generated in this manner? >> >> Cheers, >> Dave >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 23 09:36:12 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Sep 2009 09:36:12 -0500 Subject: DA with imposed parallel decomposition In-Reply-To: <956373f0909230607q3e4725d3r5e3caf655cb5c02@mail.gmail.com> References: <956373f0909222204w7f8cf47fp37fbd1bd3a573336@mail.gmail.com> <956373f0909230547u210449e4n575c75b51719a509@mail.gmail.com> <956373f0909230607q3e4725d3r5e3caf655cb5c02@mail.gmail.com> Message-ID: <230F8065-4B91-4533-BF15-5EFF9FCC635B@mcs.anl.gov> This is a bug, we need to capture it in a debugger somehow. If you can reproduce it in something we can run then we'll debug it. Barry On Sep 23, 2009, at 8:07 AM, Dave May wrote: > Sorry Barry, > I should have been more specific. I get a segmentation violation. > I think writing a simple test to isolate the bug is the best option > to debug this error. > Do you agree? > > Cheers, > Dave > > > On Wed, Sep 23, 2009 at 3:00 PM, Barry Smith > wrote: > > On Sep 23, 2009, at 7:47 AM, Dave May wrote: > > Hey Matt, > In one piece of code I have, yes the call to DAGetInterpolation() > does seem to cause a nasty crash. > > A crash? Segmentation violation etc? Or an error message saying > that the decomposition is not supported? They are very different > things: a crash is a bug we need to deal with, unsupported is just > because it is too hard to support refinement with all decompositions. > > Barry > > > It doesn't occur all the time, just with certain processor sizes > (64) and certain mesh sizes (80x80x40). I was wondering if there was > some pathological cases I did not know about. > > I think I will have to write a stand alone test case to see if I can > reproduce the error in a simpler code. > I don't think what I'm doing should cause a problem, but I'm not > sure how best to debug the problem I have. > > Any hints would be appreciated. :) > > Cheers, > Dave > > > > On Wed, Sep 23, 2009 at 2:33 PM, Matthew Knepley > wrote: > Since it just inserts a point on every edge and face (in 2D), I do > not see why it would fail. Does it? 
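For reference, a standalone reproducer along the lines Dave and Barry discuss might look as follows. This is only a sketch against the PETSc 3.0-era DA interface; the 80x80x40 global size is taken from Dave's failing case, while the 2x2x1 process grid and the lx[]/ly[]/lz[] values are illustrative stand-ins for the production decomposition (run with, e.g., mpiexec -n 4):

#include "petscda.h"

int main(int argc,char **argv)
{
  DA             da,daf;
  Mat            interp;
  Vec            scale;
  /* imposed decomposition: 2 x 2 x 1 processes; each array must sum to M,N,P */
  PetscInt       lx[2] = {40,40},ly[2] = {40,40},lz[1] = {40};
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,0,0);CHKERRQ(ierr);
  ierr = DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,
                    80,80,40,2,2,1,1,1,lx,ly,lz,&da);CHKERRQ(ierr);
  ierr = DARefine(da,PETSC_COMM_WORLD,&daf);CHKERRQ(ierr);
  /* print both decompositions, as Matt suggests */
  ierr = DAView(da,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = DAView(daf,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = DAGetInterpolation(da,daf,&interp,&scale);CHKERRQ(ierr);
  ierr = MatDestroy(interp);CHKERRQ(ierr);
  ierr = VecDestroy(scale);CHKERRQ(ierr);
  ierr = DADestroy(daf);CHKERRQ(ierr);
  ierr = DADestroy(da);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}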
> > Matt > > > On Wed, Sep 23, 2009 at 12:04 AM, Dave May > wrote: > Hello, > Suppose I have a DA and I enforce the parallel decomposition during > creation by specifying the arrays lx[], ly[], lz[] in DACreate3d(). > If I now create a second DA using DARefine(), am I alays also able > to obtain an interpolation operator between the two DA's via > DAGetInterpolation()? > > Under what circumstance will DAGetInterpolation() fail when used > between DA's generated in this manner? > > Cheers, > Dave > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > > From bsmith at mcs.anl.gov Wed Sep 23 20:27:31 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 23 Sep 2009 20:27:31 -0500 Subject: some sor questions In-Reply-To: <4AB93591.2020306@imperial.ac.uk> References: <4AB8D588.8080606@imperial.ac.uk> <08102395-B009-4844-AD16-0EB4DC22848F@mcs.anl.gov> <4AB93591.2020306@imperial.ac.uk> Message-ID: <47F6C949-D8F1-4CB8-8BFD-3675697483D3@mcs.anl.gov> I have pushed the support, so when possible the first Richardson iteration will take advantage of the fact that it has a zero initial guess. Please let me know if you have difficulties. Note the convergence residuals will be slightly different because the computation of the smoother is slightly different when the guess is zero. Both answers are equally "correct". Barry On Sep 22, 2009, at 3:37 PM, Stephan Kramer wrote: > Thanks for your answers > > Barry Smith wrote: >> On Sep 22, 2009, at 8:47 AM, Stephan Kramer wrote: >>> Hi all, >>> >>> I have some questions basically about the MatRelax_SeqAIJ routine: >>> >>> If I understand correctly there are 2 versions of the sor routine >>> depending on whether or not there is a zero guess, so that with a >>> zero guess in the forward sweep you don't need to multiply the >>> upper diagonal part U with the part of the x vector that is still >>> zero. Why then does it look like that both versions log the same >>> number of flops? I would have expected that the full forward >>> sweep (i.e. no zero guess) takes 2*a->nz flops (i.e. the same as >>> a matvec) and not a->nz. >> You are right. This is an error in our code. It will be in the >> next patch. >>> Why does the Richardson iteration with sor not take the zero >>> guess into account, i.e. why does PCApplyRichardson_SOR not set >>> SOR_ZERO_INIT_GUESS in the call to MatRelax if the Richardson ksp >>> has a zero initial guess set? >> This appears to be a design limitation. There is no mechanism >> to pass the information that the initial guess is zero into >> PCApplyRichardson(). We could add support for this by adding one >> more argument to PCApplyRichardson() for this information. I don't >> see a simpler way. If one is running, say 2 iterations of >> Richardson then this would be a measurable improvement in time. If >> one is running many iterations then the savings is tiny. Perhaps >> this support should be added. > > I'm thinking of the application of ksp richardson with sor as a > smoother in pcmg. In which case the down smoother will have zero > initial guess (as it only acts on the residual), and there will be > typicaly only 1 or 2 iterations, so the saving would be significant. > Is there another way I should set this up instead? 
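A minimal sketch of the multigrid smoother setup Stephan describes, assuming a PC that is already a PCMG with nlevels levels and the petsc-3.0 headers (PCMGGetSmoother and friends live in petscmg.h there); level 0 is the coarse solve and is left untouched:

#include "petscksp.h"
#include "petscmg.h"

PetscErrorCode SetRichardsonSORSmoothers(PC pc,PetscInt nlevels)
{
  PetscInt       l;
  PetscErrorCode ierr;

  for (l=1; l<nlevels; l++) {
    KSP smooth;
    PC  smoothpc;
    ierr = PCMGGetSmoother(pc,l,&smooth);CHKERRQ(ierr);
    ierr = KSPSetType(smooth,KSPRICHARDSON);CHKERRQ(ierr);
    /* 1 or 2 smoothing sweeps, as in Stephan's setting */
    ierr = KSPSetTolerances(smooth,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,2);CHKERRQ(ierr);
    ierr = KSPGetPC(smooth,&smoothpc);CHKERRQ(ierr);
    ierr = PCSetType(smoothpc,PCSOR);CHKERRQ(ierr);
  }
  return 0;
}

The same configuration is available from the command line with -mg_levels_ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_max_it 2; with the change Barry pushed, the first Richardson sweep of the down-smooth can then exploit the zero initial guess automatically.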
> >>> In parallel if you specify SOR_LOCAL_FORWARD_SWEEP or >>> SOR_LOCAL_BACKWARD_SWEEP it >>> calls MatRelax on the local part of the matrix, mat->A, with >>> its=lits and lits=PETSC_NULL (?). >>> However the first line of MatRelax_SeqAIJ then says: its = >>> its*lits. Is that right? >> This is all wrong. It should be passing 1 in, not PETSC_NULL. >> This was fixed in petsc-dev but not in petsc-3.0.0 I will fix it >> in petsc-3.0.0 and it will be in the next patch. >> Thanks for pointing out the problems. >> If you plan to use SOR a lot, you might consider switching to >> petsc-dev since I have made some improvements there. Also consider >> the Eisenstat trick preconditioner. >> Barry >>> Please tell me if I'm totally misunderstanding how the routine >>> works, thanks for any help. >>> >>> Cheers >>> Stephan >>> >>> -- >>> Stephan Kramer >>> Applied Modelling and Computation Group, >>> Department of Earth Science and Engineering, >>> Imperial College London > From thomas.witkowski at tu-dresden.de Thu Sep 24 02:26:04 2009 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 24 Sep 2009 09:26:04 +0200 Subject: How to get more information about mallocs during MatSetValues() Message-ID: <4ABB1F0C.60200@tu-dresden.de> Hi, in my FEM code I can determine the number of non zeros per row before starting petsc's matrix assembling. So, if everything is done correctly, there should be no mallocs within MatSetValues(). I've created the matrix using: MatCreateMPIAIJ(PETSC_COMM_WORLD, nRankRows, nRankRows, nOverallRows, nOverallRows, 0, d_nnz, 0, o_nnz, &petscMatrix); But I get the following info output (here just as an example for the last of the four processes): [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 25119 X 25119; storage space: 420 unneeded,1333557 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 72 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 231 [3] Mat_CheckInode(): Found 25119 nodes out of 25119 rows. Not using Inode routines [3] PetscCommDuplicate(): Using internal PETSc communicator 14631424 31909600 [3] PetscCommDuplicate(): returning tag 2147483645 [3] PetscCommDuplicate(): returning tag 2147483628 [3] PetscCommDuplicate(): Using internal PETSc communicator 14631424 31909600 [3] PetscCommDuplicate(): returning tag 2147483644 [3] PetscCommDuplicate(): returning tag 2147483627 [3] PetscCommDuplicate(): returning tag 2147483622 [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 25119 X 3510; storage space: 47 unneeded,22453 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 161 Is there an easy way to find out, why Petsc must do some mallocs? Thomas From jed at 59A2.org Thu Sep 24 04:32:11 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 24 Sep 2009 11:32:11 +0200 Subject: How to get more information about mallocs during MatSetValues() In-Reply-To: <4ABB1F0C.60200@tu-dresden.de> References: <4ABB1F0C.60200@tu-dresden.de> Message-ID: <4ABB3C9B.4030305@59A2.org> Thomas Witkowski wrote: > Is there an easy way to find out, why Petsc must do some mallocs? try MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE) and catch the error in a debugger. Jed -------------- next part -------------- A non-text attachment was scrubbed... 
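Putting Jed's suggestion into Thomas's setting, a fragment might look like the following; it reuses the names from Thomas's message (nRankRows, nOverallRows, d_nnz, o_nnz are assumed to be computed as there):

Mat A;
ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD,nRankRows,nRankRows,
                       nOverallRows,nOverallRows,0,d_nnz,0,o_nnz,
                       &A);CHKERRQ(ierr);
/* any insertion that would need a fresh malloc now raises an error
   instead of silently allocating */
ierr = MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);

Running with -start_in_debugger (or -on_error_attach_debugger) then shows in the stack exactly which MatSetValues() call exceeded the preallocation.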
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From thomas.witkowski at tu-dresden.de Thu Sep 24 06:01:42 2009 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 24 Sep 2009 13:01:42 +0200 Subject: How to get more information about mallocs during MatSetValues() In-Reply-To: <4ABB3C9B.4030305@59A2.org> References: <4ABB1F0C.60200@tu-dresden.de> <4ABB3C9B.4030305@59A2.org> Message-ID: <4ABB5196.7070508@tu-dresden.de> Jed Brown wrote: > Thomas Witkowski wrote: > > >> Is there an easy way to find out, why Petsc must do some mallocs? >> > > try MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE) and catch the error in a debugger. > That's exactly what I've search for. Thanks ... Thomas > > Jed > > From rafaelsantoscoelho at gmail.com Thu Sep 24 10:50:58 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 24 Sep 2009 12:50:58 -0300 Subject: Parallel graph coloring heuristics to color large-scale general jacobian matrices Message-ID: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> Hello to everyone, I've been working for some time on this piece of code for estimating large-scale and general jacobian matrices via finite differences whose sparsity patterns can be determined a priori. It basically uses a parallel graph coloring heuristic to find a good vertex coloring for the column intersection graph relative to the jacobian matrix at hand. Now, I'd like to integrate that code into PETSc, but the problem is that I don't know exactly how. So after spending a lot of time sifting through and trying to grasp "the logic" of PETSc source code, especially, the source code to the MatFDColoring module, I came up with a "tacky" solution proposal: firstly, I would call my code in order to obtain the coloring in parallel, and then I would have a routine (which would have to be embedded in PETSc code base) that, on each processor, would build the "MatFDColoring" data structure based on two things, the coloring found and the sparsity pattern of the jacobian matrix in question. It seems to me that this might work out, though I'm not entirely sure. What do you guys think? Is there an easier and/or more "elegant" way to do this? Thanks in advance, Rafael -------------- next part -------------- An HTML attachment was scrubbed... URL: From i.gutheil at fz-juelich.de Thu Sep 24 02:15:20 2009 From: i.gutheil at fz-juelich.de (I.Gutheil) Date: Thu, 24 Sep 2009 09:15:20 +0200 Subject: Where is sles in PETSc 3.0.0? Message-ID: <4ABB1C88.1090505@fz-juelich.de> Hello all, we installed the new PETSc 3.0.0-p6 version of PETSc on a new computer and now a user is missing petscsles.h. When I looked into the src directory I saw that there is no sles subdirectory any longer. Where has this moved to? I did not find it in the changes page of PETSc. Sincerely Inge Gutheil -- Inge Gutheil Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Phone: +49-2461-61-3135 Fax: +49-2461-61-6656 E-mail: i.gutheil at fz-juelich.de ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe Geschaeftsfuehrung: Prof. Dr. 
Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From balay at mcs.anl.gov Thu Sep 24 11:07:51 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 24 Sep 2009 11:07:51 -0500 (CDT) Subject: Where is sles in PETSc 3.0.0? In-Reply-To: <4ABB1C88.1090505@fz-juelich.de> References: <4ABB1C88.1090505@fz-juelich.de> Message-ID: SLES has been renamed/merged-into KSP a few releases back.. Satish On Thu, 24 Sep 2009, I.Gutheil wrote: > Hello all, > > we installed the new PETSc 3.0.0-p6 version of PETSc on a new computer > and now a user is missing petscsles.h. > When I looked into the src directory I saw that there is no sles > subdirectory any longer. Where has this moved to? I did not find it in > the changes page of PETSc. > > Sincerely > > Inge Gutheil > > -- > > Inge Gutheil > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Phone: +49-2461-61-3135 > Fax: +49-2461-61-6656 > E-mail: i.gutheil at fz-juelich.de > > > > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe > Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), > Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > Prof. Dr. Sebastian M. Schmidt > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > From hzhang at mcs.anl.gov Thu Sep 24 11:15:41 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 24 Sep 2009 11:15:41 -0500 (CDT) Subject: Parallel graph coloring heuristics to color large-scale general jacobian matrices In-Reply-To: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> References: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> Message-ID: Rafael, > I've been working for some time on this piece of code for estimating > large-scale and general jacobian matrices via finite differences whose > sparsity patterns can be determined a priori. It basically uses a parallel > graph coloring heuristic to find a good vertex coloring for the column > intersection graph relative to the jacobian matrix at hand. If you have sparsity patterns, then petsc can create coloring context easily. 
See ~petsc/src/snes/examples/tutorials/ex10d/ex10.c : } else { /* Use matfdcoloring */ ISColoring iscoloring; MatStructure flag; /* Get the data structure of Jac */ ierr = FormJacobian(snes,x,&Jac,&Jac,&flag,&user);CHKERRQ(ierr); /* Create coloring context */ ierr = MatGetColoring(Jac,MATCOLORING_SL,&iscoloring);CHKERRQ(ierr); ierr = MatFDColoringCreate(Jac,iscoloring,&matfdcoloring);CHKERRQ(ierr); ierr = MatFDColoringSetFunction(matfdcoloring,(PetscErrorCode (*)(void))FormFunction,&user);CHKERRQ(ierr); ierr = MatFDColoringSetFromOptions(matfdcoloring);CHKERRQ(ierr); /* ierr = MatFDColoringView(matfdcoloring,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); */ ierr = SNESSetJacobian(snes,Jac,Jac,SNESDefaultComputeJacobianColor,matfdcoloring);CHKERRQ(ierr); ierr = ISColoringDestroy(iscoloring);CHKERRQ(ierr); } in which, coloring context is created based on Jac sparse pattern. See its makefile on how to run it. Hong > > Now, I'd like to integrate that code into PETSc, but the problem is that I > don't know exactly how. So after spending a lot of time sifting through and > trying to grasp "the logic" of PETSc source code, especially, the source > code to the MatFDColoring module, I came up with a "tacky" solution > proposal: firstly, I would call my code in order to obtain the coloring in > parallel, and then I would have a routine (which would have to be embedded > in PETSc code base) that, on each processor, would build the "MatFDColoring" > data structure based on two things, the coloring found and the sparsity > pattern of the jacobian matrix in question. > > It seems to me that this might work out, though I'm not entirely sure. What > do you guys think? Is there an easier and/or more "elegant" way to do this? > > Thanks in advance, > > Rafael > From Chun.SUN at 3ds.com Thu Sep 24 10:42:04 2009 From: Chun.SUN at 3ds.com (SUN Chun) Date: Thu, 24 Sep 2009 11:42:04 -0400 Subject: Krylov vectors of matrix-free KSP In-Reply-To: <4ABB5196.7070508@tu-dresden.de> References: <4ABB1F0C.60200@tu-dresden.de> <4ABB3C9B.4030305@59A2.org> <4ABB5196.7070508@tu-dresden.de> Message-ID: <2545DC7A42DF804AAAB2ADA5043D57DA28E49E@CORP-CLT-EXB01.ds> Hi, Sorry for the intrusion. I am planning to use MATSHELL together with KSP to achieve matrix-free Krylov solve. I understand that I need to provide a function as MatVec to build the Krylov space. I also want to retain all the input vectors that I got from PETSC. i.e. I want to keep the Krylov vectors on my own side, together with some intermediate results. However I found that my MatVec is not only called during the building of Krylov space, it's also called else where like initialization and norm calculation. If I say -ksp_norm_type=no, the number of MatVec calls might still slightly off from what I see for the number of iterations. I just want to check: Is there a way to know if a particular MatVec call is from Krylov iteration or from elsewhere like norm check? Or after -ksp_norm_type=no I should see exactly the same number of Krylov iterations comparing to the number that MatVec is called? The reason for doing this is that I want to obtain each coefficient of all the Krylov basis from *this* solve and use them elsewhere to build solution to another related equation from the same Krylov space. I guess I can't access these coefficients but if I can take advantage of all vectors being orthogonal, I can recover the coefficients in a rather cheap way. Do you see any way around this? 
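A minimal MATSHELL skeleton for the kind of matrix-free setup Chun describes; the context, the counter, and all names here are illustrative. Note that the count will include the multiplies PETSc does outside the Krylov loop (initial residual, norm checks), which is exactly the discrepancy he observes:

#include "petscksp.h"

typedef struct {
  PetscInt nmults;    /* how many times PETSc applied the operator */
  /* ... data for the real operator ... */
} ShellCtx;

PetscErrorCode MyMult(Mat A,Vec x,Vec y)
{
  ShellCtx       *ctx;
  PetscErrorCode ierr;

  ierr = MatShellGetContext(A,(void**)&ctx);CHKERRQ(ierr);
  ctx->nmults++;
  /* ... compute y = A*x matrix-free ... */
  return 0;
}

/* setup, with m local and M global rows assumed known:
   ShellCtx ctx = {0};
   ierr = MatCreateShell(PETSC_COMM_WORLD,m,m,M,M,&ctx,&A);CHKERRQ(ierr);
   ierr = MatShellSetOperation(A,MATOP_MULT,(void(*)(void))MyMult);CHKERRQ(ierr);
   ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); */

Comparing ctx.nmults with KSPGetIterationNumber() after the solve shows how many applications came from outside the iteration proper.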
Thank you very much, Chun From bsmith at mcs.anl.gov Thu Sep 24 11:31:27 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 24 Sep 2009 11:31:27 -0500 Subject: Krylov vectors of matrix-free KSP In-Reply-To: <2545DC7A42DF804AAAB2ADA5043D57DA28E49E@CORP-CLT-EXB01.ds> References: <4ABB1F0C.60200@tu-dresden.de> <4ABB3C9B.4030305@59A2.org> <4ABB5196.7070508@tu-dresden.de> <2545DC7A42DF804AAAB2ADA5043D57DA28E49E@CORP-CLT-EXB01.ds> Message-ID: <212B128A-66EB-44D0-BC1B-DFA844C7107F@mcs.anl.gov> I assume that you are working with GMRES? If so, rather than trying to "trap" this information during the computation I suggest using the data structures that KSPGMRES constructs AFTER the solve to do your own calculations. The GMRES data structure is defined in src/ ksp/ksp/impls/gmres/gmresp.h and the code that uses it is in the various source files in that directory. Barry On Sep 24, 2009, at 10:42 AM, SUN Chun wrote: > Hi, > > Sorry for the intrusion. > > I am planning to use MATSHELL together with KSP to achieve matrix-free > Krylov solve. I understand that I need to provide a function as MatVec > to build the Krylov space. > > I also want to retain all the input vectors that I got from PETSC. > i.e. > I want to keep the Krylov vectors on my own side, together with some > intermediate results. However I found that my MatVec is not only > called > during the building of Krylov space, it's also called else where like > initialization and norm calculation. If I say -ksp_norm_type=no, the > number of MatVec calls might still slightly off from what I see for > the > number of iterations. > > I just want to check: Is there a way to know if a particular MatVec > call > is from Krylov iteration or from elsewhere like norm check? > > Or after -ksp_norm_type=no I should see exactly the same number of > Krylov iterations comparing to the number that MatVec is called? > > The reason for doing this is that I want to obtain each coefficient of > all the Krylov basis from *this* solve and use them elsewhere to build > solution to another related equation from the same Krylov space. I > guess > I can't access these coefficients but if I can take advantage of all > vectors being orthogonal, I can recover the coefficients in a rather > cheap way. Do you see any way around this? > > Thank you very much, > Chun From rafaelsantoscoelho at gmail.com Thu Sep 24 11:53:47 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 24 Sep 2009 13:53:47 -0300 Subject: Parallel graph coloring heuristics to color large-scale general jacobian matrices In-Reply-To: References: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> Message-ID: <3b6f83d40909240953u7b598253mc8d619b9c65b5823@mail.gmail.com> Hello, Hong, thank you very much for the help. Well, I'm not positive wether I really understood what that code snippet does, but I'll definitely take a closer look at it later. For now, I have some doubts I'd like to clear out. As far as I'm concerned, the following line > ierr = MatGetColoring(Jac,MATCOLORING_SL,&iscoloring);CHKERRQ(ierr); gives me the coloring, right? But what if I want to test my own parallel coloring routine, instead of just using PETSc's? Please, correct me if I'm wrong: the way I see it, the MatFDColoring context in PETSc is always built on top of 1) the coloring (ISColoring) *and* 2) the sparsity pattern of the underlying matrix, right? The thing is that I've developed my code completely separate from the PETSc API, I mean, it's written in "pure" ANSI C language/MPI. 
My goal now is to link it to the PETSc framework, so that I can
benefit from all the other things already implemented within PETSc.
As I'm not quite the expert in the "inner workings" of the PETSc
library, my first guess was that I would have to provide a special
routine to create the MatFDColoring context on each processor based
on the coloring I found and on how my code "represents" the sparsity
pattern of the jacobian matrix.

Rafael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From sapphire.jxy at gmail.com Thu Sep 24 12:10:57 2009
From: sapphire.jxy at gmail.com (xiaoyin ji)
Date: Thu, 24 Sep 2009 13:10:57 -0400
Subject: MATMPIAIJ preallocation may not be cleaned by MatDestroy?
Message-ID: <6985a8f00909241010k1e6d7f15p6cb28b8d7815fbe5@mail.gmail.com>

Hi all,

I've finally fixed the problem with the increasing solving time for
repeated calls of the ksp solver. The fix is to set the mat type as
MATAIJ instead of MATMPIAIJ. Here is the new code for the mat
construction.

MatCreate(PETSC_COMM_WORLD,&A);
MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,ngridtot,ngridtot);
MatSetType(A,MATAIJ);
MatSetFromOptions(A);
MPI_Comm_size(PETSC_COMM_WORLD,&size);
if (size > 1){
  MatMPIAIJSetPreallocation(A,7,PETSC_NULL,2,PETSC_NULL);
} else {
  MatSeqAIJSetPreallocation(A,7,PETSC_NULL);
}

The older version is like this; the mat type is defined separately,
based on the system size, as MATMPIAIJ or MATSEQAIJ respectively.
This code works ok in Fortran subroutines, but in C++ it will cause
the ksp solver to get slower and slower like a parabola, which may be
a memory leak.

MatCreate(world,&A);
MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,ngridtot,ngridtot);

// setup sparse matrix in either MATMPIAIJ or MATSEQAIJ format
MPI_Comm_size(world,&size);
if (size > 1) {
  MatSetType(A,MATMPIAIJ);
  MatSetFromOptions(A);
  MatMPIAIJSetPreallocation(A,7,PETSC_NULL,2,PETSC_NULL);
} else {
  MatSetType(A,MATSEQAIJ);
  MatSetFromOptions(A);
  MatSeqAIJSetPreallocation(A,7,PETSC_NULL);
}

However, I think the petsc library still has a problem, as both codes
should work in C++ or Fortran. Thanks.

Best,
Xiaoyin Ji

From bsmith at mcs.anl.gov Thu Sep 24 12:49:50 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 24 Sep 2009 12:49:50 -0500
Subject: MATMPIAIJ preallocation may not be cleaned by MatDestroy?
In-Reply-To: <6985a8f00909241010k1e6d7f15p6cb28b8d7815fbe5@mail.gmail.com>
References: <6985a8f00909241010k1e6d7f15p6cb28b8d7815fbe5@mail.gmail.com>
Message-ID:

On Sep 24, 2009, at 12:10 PM, xiaoyin ji wrote:

> Hi all,
>
> I've finally fixed the problem with the increasing solving time for
> repeated calls of the ksp solver. The fix is to set the mat type as
> MATAIJ instead of MATMPIAIJ. Here is the new code for the mat
> construction.
>
> MatCreate(PETSC_COMM_WORLD,&A);
> MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,ngridtot,ngridtot);
> MatSetType(A,MATAIJ);
> MatSetFromOptions(A);
> MPI_Comm_size(PETSC_COMM_WORLD,&size);
> if (size > 1){
> MatMPIAIJSetPreallocation(A,7,PETSC_NULL,2,PETSC_NULL);
> } else {
> MatSeqAIJSetPreallocation(A,7,PETSC_NULL);
> }

    You do not need to do the if test here. You only need to do

  MatMPIAIJSetPreallocation(A,7,PETSC_NULL,2,PETSC_NULL);
  MatSeqAIJSetPreallocation(A,7,PETSC_NULL);

PETSc automatically selects the appropriate one and ignores the
inappropriate one.

   Barry

>
> The older version is like this; the mat type is defined separately,
> based on the system size, as MATMPIAIJ or MATSEQAIJ respectively.
> This code works ok in Fortran subroutines, but in C++ it will cause
> the ksp solver to get slower and slower like a parabola, which may
> be a memory leak.
>
> MatCreate(world,&A);
> MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,ngridtot,ngridtot);
>
> // setup sparse matrix in either MATMPIAIJ or MATSEQAIJ format
> MPI_Comm_size(world,&size);
> if (size > 1) {
> MatSetType(A,MATMPIAIJ);
> MatSetFromOptions(A);
> MatMPIAIJSetPreallocation(A,7,PETSC_NULL,2,PETSC_NULL);
> } else {
> MatSetType(A,MATSEQAIJ);
> MatSetFromOptions(A);
> MatSeqAIJSetPreallocation(A,7,PETSC_NULL);
> }
>
> However, I think the petsc library still has a problem, as both
> codes should work in C++ or Fortran. Thanks.
>
> Best,
> Xiaoyin Ji

From bsmith at mcs.anl.gov Thu Sep 24 13:09:44 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 24 Sep 2009 13:09:44 -0500
Subject: Parallel graph coloring heuristics to color large-scale general jacobian matrices
In-Reply-To: <3b6f83d40909240953u7b598253mc8d619b9c65b5823@mail.gmail.com>
References: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> <3b6f83d40909240953u7b598253mc8d619b9c65b5823@mail.gmail.com>
Message-ID:

On Sep 24, 2009, at 11:53 AM, Rafael Santos Coelho wrote:

> Hello, Hong,
>
> thank you very much for the help. Well, I'm not positive whether I
> really understood what that code snippet does, but I'll definitely
> take a closer look at it later. For now, I have some doubts I'd like
> to clear out. As far as I'm concerned, the following line
>
> ierr = MatGetColoring(Jac,MATCOLORING_SL,&iscoloring);CHKERRQ(ierr);
>
> gives me the coloring, right? But what if I want to test my own
> parallel coloring routine, instead of just using PETSc's? Please,
> correct me if I'm wrong: the way I see it, the MatFDColoring context
> in PETSc is always built on top of 1) the coloring (ISColoring) *and*
> 2) the sparsity pattern of the underlying matrix, right?
>
> The thing is that I've developed my code completely separate from
> the PETSc API, I mean, it's written in "pure" ANSI C language/MPI.
> My goal now is to link it to the PETSc framework, so that I can
> benefit from all the other things already implemented within PETSc.
> As I'm not quite the expert in the "inner workings" of the PETSc
> library, my first guess was that I would have to provide a special
> routine to create the MatFDColoring context on each processor based
> on the coloring I found and on how my code "represents" the sparsity
> pattern of the jacobian matrix.
>
> Rafael
>

There are two distinct parts of using coloring in PETSc to compute
Jacobians:

1) computing the coloring of the sparse matrix: done with
MatGetColoring(); this is done in PETSc with classes allowing one to
easily and transparently provide new coloring algorithms.

2) using the computed coloring to actually compute a Jacobian
(efficiently): done with MatFDColoringCreate() and its methods. This
is hardwired in PETSc to use one particular data structure for
actually computing the Jacobian.

What you should do:

1) Write a "wrapper" for your code that computes the coloring. It
should have the calling sequence (Mat mat,MatColoringType
name,ISColoring *iscoloring); you ignore the name argument (since it
is the name you select). Then you call MatColoringRegisterDynamic()
to register it. Now one can use it by simply calling
MatGetColoring(mat,"yournameyouregisteredwith",&iscoloring); just
like the "built in" colorings. I would do this first because having
an efficient parallel coloring for PETSc would be great!
2) MatFDColoring code was not designed to allow other implementations to be easily plugged in. (There is no MatFDColoringType to allow having different ways of computing this). So you need to decide if your code that does 2) is possibly "better" then in PETSc, or if 1) is really the useful thing you are providing. If you want to also provide 2) then write new routines MyMatFDColoringCreate(), MyFDColoringDestroy(), MatFDColoringApply() etc with the same calling sequences as PETSc's and make them wrappers to call your code (do not try to reproduce the data structures inside the MatFDColoring data structure (use whatever data structures you currently use), why because if you use the same data structure then you are just rewriting MatFDColoringCreate() to do what it already does, the only reason to use your own is if it uses its own data structures that may be better then in PETSc. If you do write MyMatFD.. routines etc and they are as good or better then PETSc's then we will change PETSc's MatFDColoringCreate() code to support multiple instances (that is introduce a MatFDColoringType) and your new MyMatFDColoring stuff would be registratable. Hope this clarifies things, feel free to direct further questions to petsc-dev at mcs.anl.gov since that is the mailing list for people extending PETSc code. Barry From mafunk at nmsu.edu Thu Sep 24 13:22:48 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Thu, 24 Sep 2009 12:22:48 -0600 Subject: question on MatSetValues In-Reply-To: References: <6985a8f00909241010k1e6d7f15p6cb28b8d7815fbe5@mail.gmail.com> Message-ID: <200909241222.49261.mafunk@nmsu.edu> Hi, i had a question concerning MatSetValues with respect to which values are ignored. I believe that PETSC ignores column indices of -1? My question then is, are the corresponding entries in the values array ignored as well? For example: int sizeOfNonZeroColumnEntries = 9 int a_numRowsToInsert = 1 int a_numRowsToInsert = 1 int rowNumber 0 //col. index vector for a given row (length 9) a_GlobalColNumber = {1 2 -1 4 5 6 -1 8 9} //corresponsing values vector for given row (length 9) a_Values = {x1 x2 x3 x4 x5 x6 x7 x8 x9} Now calling MatSetValues to insert this: m_ierr = MatSetValues(m_globalMatrix, a_numRowsToInsert, rowNumber, sizeOfNonZeroColumnEntries, &a_GlobalColNumber, &a_Values, INSERT_VALUES); question 1: will petsc insert 7 column/value pairs into the matrix, or will insert 9 question 2: assuming it only inserts 7, am assume it is ok that i only preallocated 7 memory slots for this entry? thanks matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 24 13:25:55 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 24 Sep 2009 13:25:55 -0500 Subject: question on MatSetValues In-Reply-To: <200909241222.49261.mafunk@nmsu.edu> References: <6985a8f00909241010k1e6d7f15p6cb28b8d7815fbe5@mail.gmail.com> <200909241222.49261.mafunk@nmsu.edu> Message-ID: <4392929A-635C-4486-98E2-7C7841C91D13@mcs.anl.gov> On Sep 24, 2009, at 1:22 PM, Matt Funk wrote: > Hi, > > i had a question concerning MatSetValues with respect to which > values are ignored. > > I believe that PETSC ignores column indices of -1? > My question then is, are the corresponding entries in the values > array ignored as well? > > > For example: > > int sizeOfNonZeroColumnEntries = 9 > int a_numRowsToInsert = 1 > int a_numRowsToInsert = 1 > int rowNumber 0 > > //col. 
index vector for a given row (length 9) > a_GlobalColNumber = {1 2 -1 4 5 6 -1 8 9} > > //corresponsing values vector for given row (length 9) > a_Values = {x1 x2 x3 x4 x5 x6 x7 x8 x9} > > Now calling MatSetValues to insert this: > > m_ierr = > MatSetValues(m_globalMatrix, > a_numRowsToInsert, > rowNumber, > sizeOfNonZeroColumnEntries, > &a_GlobalColNumber, > &a_Values, > INSERT_VALUES); > > question 1: > will petsc insert 7 column/value pairs into the matrix, or will > insert 9 7 > > question 2: > assuming it only inserts 7, am assume it is ok that i only > preallocated 7 memory slots for this entry? Yes, you only need allocate 7. BTW: column (and row) indices always start at 0. Barry > > > thanks > matt > > > From rafaelsantoscoelho at gmail.com Thu Sep 24 13:41:04 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 24 Sep 2009 15:41:04 -0300 Subject: Parallel graph coloring heuristics to color large-scale general jacobian matrices In-Reply-To: References: <3b6f83d40909240850u16a34c52p398d0ac10e6f4713@mail.gmail.com> <3b6f83d40909240953u7b598253mc8d619b9c65b5823@mail.gmail.com> Message-ID: <3b6f83d40909241141p73d8eaa5w76cd2e1fcb1eb817@mail.gmail.com> Hello, Barry, thanks for your reply, it has surely made things way more clear for me now. I believe I'm going to stick with strategy # 2 (writing my own MyMatFDColoringCreate, etc) because, to me, it sounds simpler, since I'm on a tight time schedule right now and that I'm quite knowledgeable about how things go inside the SNES (two years dealing with nonlinear problems in PETSc finally paid off) and the MatFDColoring modules. Plus, using my own data structures would be of great help to me as well, given that it took me a considerable amount of time to design them efficiently. And I would gladly contribute my coding effort to PETSc if my routine proves to be more efficient and flexible. :-) Rafael -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsjb00 at hotmail.com Fri Sep 25 10:25:50 2009 From: tsjb00 at hotmail.com (tsjb00) Date: Fri, 25 Sep 2009 15:25:50 +0000 Subject: KSP advice Message-ID: Hi! I am looking for suggestion for KSP solvers. I need to solve a multi-component reactive transport problem with KSP solver. The problem involves wide range of time and length scales. Which KSP method would be my best choice? Many thanks in advance! _________________________________________________________________ ?Windows Live ??????????Messenger? http://www.windowslive.cn -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 25 10:34:05 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 25 Sep 2009 10:34:05 -0500 Subject: KSP advice In-Reply-To: References: Message-ID: 2009/9/25 tsjb00 > Hi! I am looking for suggestion for KSP solvers. I need to solve a > multi-component reactive transport problem with KSP solver. The problem > involves wide range of time and length scales. Which KSP method would be my > best choice? > The whole points is to try them all. Anyone who says they know which will work is lying. Write a loop over -ksp_type. Matt > Many thanks in advance! > > > ------------------------------ > ??????????????MSN??????????????? ????? > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
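Matt's "loop over -ksp_type" can be done by hand from the command line, rerunning with -ksp_type gmres, bcgs, tfqmr, cg, ... plus -ksp_converged_reason, or in code. A sketch of the latter, assuming ksp, b, and x are already set up; the candidate list is illustrative, not exhaustive:

const char         *types[] = {KSPGMRES,KSPBCGS,KSPTFQMR,KSPCG};
PetscInt           i,its;
KSPConvergedReason reason;

for (i=0; i<4; i++) {
  ierr = KSPSetType(ksp,types[i]);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"%s: %D iterations, reason %d\n",
                     types[i],its,(int)reason);CHKERRQ(ierr);
}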
URL: From tsjb00 at hotmail.com Fri Sep 25 10:44:21 2009 From: tsjb00 at hotmail.com (tsjb00) Date: Fri, 25 Sep 2009 15:44:21 +0000 Subject: KSP advice In-Reply-To: References: Message-ID: Many thanks for the advice! Date: Fri, 25 Sep 2009 10:34:05 -0500 Subject: Re: KSP advice From: knepley at gmail.com To: petsc-users at mcs.anl.gov 2009/9/25 tsjb00 Hi! I am looking for suggestion for KSP solvers. I need to solve a multi-component reactive transport problem with KSP solver. The problem involves wide range of time and length scales. Which KSP method would be my best choice? The whole points is to try them all. Anyone who says they know which will work is lying. Write a loop over -ksp_type. Matt Many thanks in advance! ??????????????MSN??????????????? ????? -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener _________________________________________________________________ ????????????MClub????????? http://club.msn.cn/?from=10 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 28 15:26:37 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 28 Sep 2009 15:26:37 -0500 Subject: Petsc and .Net In-Reply-To: <7f18de3b0909180304o78bda701r1e052da8397032f3@mail.gmail.com> References: <7f18de3b0909180304o78bda701r1e052da8397032f3@mail.gmail.com> Message-ID: <185410AD-5005-4B2E-BF14-4D9F4FA54E88@mcs.anl.gov> I found this with a google search, don't know if it helps. http://www.cs.ru.nl/~marko/onderwijs/oss/portable.dotnet.linking.pdf Barry On Sep 18, 2009, at 5:04 AM, Michel Cancelliere wrote: > Hi Petsc Users, > > I'm trying to use petsc in a .Net app but i'm experiencing problem > when linking (ijw/native module detected; cannot link with pure > modules). Has somebody managed to do that? > > Thank you in advance, > > Michel Cancelliere From recrusader at gmail.com Mon Sep 28 17:45:58 2009 From: recrusader at gmail.com (Yujie) Date: Mon, 28 Sep 2009 17:45:58 -0500 Subject: errors in make test (petsc 3.0.0 p8 with complex) Message-ID: <7ff0ee010909281545w6bd7879co252e39d51938b126@mail.gmail.com> Hi, PETSc Developers I am trying to compile the latest version (3.0.0 p8) of PETSc in my PC (AMD 64bits, GCC4.2). My configure command is: config/configure.py --with-mpi-dir=/home/yujie/codes/mpich127/ --with-clanguage=C++ --with-debugging=1 --with-shared=1 --with-scalar-type=complex --with-spooles=1 --download-spooles=1 --with-parmetis=1 --download-parmetis=1 --with-superlu_dist=1 --download-superlu_dist=1 --with-mumps=1 --download-mumps=1 --with-scalapack=1 --download-scalapack=1 --with-plapack=1 --download-plapack=1 --download-f-blas-lapack=1 --with-blacs --download-blacs=1 I can configure and make PETSc. 
However, when I make test, I got the following errros: " Running test examples to verify correct installation Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol lookup error: /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: undefined symbol: _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol lookup error: /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: undefined symbol: _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol lookup error: /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: undefined symbol: _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: symbol lookup error: /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: undefined symbol: mpi_comm_size_ Completed test examples " Could you give me some help? Thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at mcs.anl.gov Mon Sep 28 18:02:13 2009 From: knepley at mcs.anl.gov (Matthew Knepley) Date: Mon, 28 Sep 2009 18:02:13 -0500 Subject: errors in make test (petsc 3.0.0 p8 with complex) In-Reply-To: <7ff0ee010909281545w6bd7879co252e39d51938b126@mail.gmail.com> References: <7ff0ee010909281545w6bd7879co252e39d51938b126@mail.gmail.com> Message-ID: Look like some things did not compile. Send make_log* to petsc-maint. Matt On Mon, Sep 28, 2009 at 5:45 PM, Yujie wrote: > Hi, PETSc Developers > > I am trying to compile the latest version (3.0.0 p8) of PETSc in my PC (AMD > 64bits, GCC4.2). > My configure command is: > config/configure.py --with-mpi-dir=/home/yujie/codes/mpich127/ > --with-clanguage=C++ --with-debugging=1 --with-shared=1 > --with-scalar-type=complex --with-spooles=1 --download-spooles=1 > --with-parmetis=1 --download-parmetis=1 --with-superlu_dist=1 > --download-superlu_dist=1 --with-mumps=1 --download-mumps=1 > --with-scalapack=1 --download-scalapack=1 --with-plapack=1 > --download-plapack=1 --download-f-blas-lapack=1 --with-blacs > --download-blacs=1 > > I can configure and make PETSc. 
However, when I make test, I got the > following errros: > " > Running test examples to verify correct installation > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI > process > See > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol > lookup error: > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: > undefined symbol: > _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI > processes > See > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol > lookup error: > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: > undefined symbol: > _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: symbol > lookup error: > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: > undefined symbol: > _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ > Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI > process > See > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: symbol > lookup error: > /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: > undefined symbol: mpi_comm_size_ > Completed test examples > " > Could you give me some help? Thanks a lot. > > Regards, > Yujie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.schauer at tu-bs.de Tue Sep 29 10:05:19 2009 From: m.schauer at tu-bs.de (Marco Schauer) Date: Tue, 29 Sep 2009 17:05:19 +0200 Subject: Schur decomposition Message-ID: <4AC2222F.4030300@tu-bs.de> Dear PETSc developers, I translate a sequential Fortran code, which uses LAPACK into a parallel c++ code based on PETSc 3.0.0-p7. The Fortran code uses a function called DGEES to calculate Schur decomposition. I have absolutely no idea how to do this in PETSc. Is there a PETSc example of this problem? Every hint is very welcome to me. Best regards Marco -- Dipl. Ing. Marco Schauer Technische Universit?t Carolo-Wilhelmina zu Braunschweig Institut f?r Angewandte Mechanik Spielmannstra?e 11 D-38106 Braunschweig, Germany phone +49 (0) 531 391 7108 fax +49 (0) 531 391 5843 email m.schauer at tu-braunschweig.de web http://www.infam.tu-braunschweig.de From knepley at mcs.anl.gov Tue Sep 29 10:22:54 2009 From: knepley at mcs.anl.gov (Matthew Knepley) Date: Tue, 29 Sep 2009 10:22:54 -0500 Subject: Schur decomposition In-Reply-To: <4AC2222F.4030300@tu-bs.de> References: <4AC2222F.4030300@tu-bs.de> Message-ID: On Tue, Sep 29, 2009 at 10:05 AM, Marco Schauer wrote: > Dear PETSc developers, > I translate a sequential Fortran code, which uses LAPACK into a parallel > c++ code based on PETSc 3.0.0-p7. The Fortran code uses a function called > DGEES to calculate Schur decomposition. I have absolutely no idea how to do > this in PETSc. Is there a PETSc example of this problem? 
Every hint is very > welcome to me. > Do you want this to run in parallel? You might be able to use PLAPACK. I would check the documentation. Then you could call it in PETSc. Matt > Best regards > Marco > > -- > Dipl. Ing. Marco Schauer > > Technische Universit?t Carolo-Wilhelmina zu Braunschweig > Institut f?r Angewandte Mechanik > Spielmannstra?e 11 > D-38106 Braunschweig, Germany > > phone +49 (0) 531 391 7108 > fax +49 (0) 531 391 5843 > email m.schauer at tu-braunschweig.de > web http://www.infam.tu-braunschweig.de > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Sep 29 11:45:57 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 29 Sep 2009 11:45:57 -0500 Subject: errors in make test (petsc 3.0.0 p8 with complex) In-Reply-To: References: <7ff0ee010909281545w6bd7879co252e39d51938b126@mail.gmail.com> Message-ID: <7ff0ee010909290945s3cec13ddr921a0a9cd27d9491@mail.gmail.com> Thanks very much, Matthew. I have sent my make.log file to petsc-maint. Regards, Yujie On Mon, Sep 28, 2009 at 6:02 PM, Matthew Knepley wrote: > Look like some things did not compile. Send make_log* to petsc-maint. > > Matt > > > On Mon, Sep 28, 2009 at 5:45 PM, Yujie wrote: > >> Hi, PETSc Developers >> >> I am trying to compile the latest version (3.0.0 p8) of PETSc in my PC >> (AMD 64bits, GCC4.2). >> My configure command is: >> config/configure.py --with-mpi-dir=/home/yujie/codes/mpich127/ >> --with-clanguage=C++ --with-debugging=1 --with-shared=1 >> --with-scalar-type=complex --with-spooles=1 --download-spooles=1 >> --with-parmetis=1 --download-parmetis=1 --with-superlu_dist=1 >> --download-superlu_dist=1 --with-mumps=1 --download-mumps=1 >> --with-scalapack=1 --download-scalapack=1 --with-plapack=1 >> --download-plapack=1 --download-f-blas-lapack=1 --with-blacs >> --download-blacs=1 >> >> I can configure and make PETSc. 
However, when I make test, I got the >> following errros: >> " >> Running test examples to verify correct installation >> Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI >> process >> See >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> symbol lookup error: >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> undefined symbol: >> _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ >> Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI >> processes >> See >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> symbol lookup error: >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> undefined symbol: >> _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> symbol lookup error: >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex19: >> undefined symbol: >> _Z25DMMGSetSNESLocali_PrivatePP7_n_DMMGPFiP11DALocalInfoP10MatStencilPvPSt7complexIdES6_EPFiS3_S5_S6_S6_S6_ESD_ >> Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI >> process >> See >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: >> symbol lookup error: >> /home/yujie/codes/petsc-3.0.0-p8/src/snes/examples/tutorials/./ex5f: >> undefined symbol: mpi_comm_size_ >> Completed test examples >> " >> Could you give me some help? Thanks a lot. >> >> Regards, >> Yujie >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Sep 29 12:52:58 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 29 Sep 2009 19:52:58 +0200 Subject: Schur decomposition In-Reply-To: <4AC2222F.4030300@tu-bs.de> References: <4AC2222F.4030300@tu-bs.de> Message-ID: On 29/09/2009, Marco Schauer wrote: > Dear PETSc developers, > I translate a sequential Fortran code, which uses LAPACK into a > parallel c++ code based on PETSc 3.0.0-p7. The Fortran code uses a > function called DGEES to calculate Schur decomposition. I have > absolutely no idea how to do this in PETSc. Is there a PETSc example > of this problem? Every hint is very welcome to me. > Best regards > Marco If in your application it is sufficient to compute a partial Schur decomposition, you may consider using SLEPc. Jose From mafunk at nmsu.edu Tue Sep 29 17:32:53 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Tue, 29 Sep 2009 16:32:53 -0600 Subject: PETSC memory usage Message-ID: <200909291632.53606.mafunk@nmsu.edu> Hi, i have another question regarding how petsc uses memory w.r.t caching in MatSetValues . My code does the standard stuff: 1) i preallocate the memory 2) i insert the values via MatSetValues 3)i assemble it. Say, for example i declare a 100x100 matrix with 10 NZ entries per row. After 1), will the memory used for the matrix be 100^2*10*sizeof(double)? 
After 2), will the memory used be 100^2*10*sizeof(double) from the prealloc PLUS 100^2*10*sizeof(double) form the caching of values After 3), will the memory then be reduced back to 100^2*10*sizeof(double)? My concern is step 2). If it is using memory for prealloc and seperately for caching, then is there a way to flush the cached values to the preallocated slots? I tried finding stuff in the manual pages but i am not quite sure if i can or not. thanks matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From mafunk at nmsu.edu Tue Sep 29 18:13:11 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Tue, 29 Sep 2009 17:13:11 -0600 Subject: PETSC memory usage In-Reply-To: <200909291632.53606.mafunk@nmsu.edu> References: <200909291632.53606.mafunk@nmsu.edu> Message-ID: <200909291713.11571.mafunk@nmsu.edu> Eehh, i mean 100*10*sizeof(double). matt On Tuesday 29 September 2009, Matt Funk wrote: > Hi, > > i have another question regarding how petsc uses memory w.r.t caching in > MatSetValues . > > My code does the standard stuff: > 1) i preallocate the memory > 2) i insert the values via MatSetValues > 3)i assemble it. > > Say, for example i declare a 100x100 matrix with 10 NZ entries per row. > After 1), will the memory used for the matrix be 100^2*10*sizeof(double)? > > After 2), will the memory used be 100^2*10*sizeof(double) from the prealloc > PLUS 100^2*10*sizeof(double) form the caching of values > > After 3), will the memory then be reduced back to 100^2*10*sizeof(double)? > > My concern is step 2). If it is using memory for prealloc and seperately > for caching, then is there a way to flush the cached values to the > preallocated slots? I tried finding stuff in the manual pages but i am not > quite sure if i can or not. > > > thanks > matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at mcs.anl.gov Tue Sep 29 18:19:32 2009 From: knepley at mcs.anl.gov (Matthew Knepley) Date: Tue, 29 Sep 2009 18:19:32 -0500 Subject: PETSC memory usage In-Reply-To: <200909291632.53606.mafunk@nmsu.edu> References: <200909291632.53606.mafunk@nmsu.edu> Message-ID: On Tue, Sep 29, 2009 at 5:32 PM, Matt Funk wrote: > Hi, > > > i have another question regarding how petsc uses memory w.r.t caching in > MatSetValues . > > > My code does the standard stuff: > 1) i preallocate the memory > 2) i insert the values via MatSetValues > 3)i assemble it. > > > Say, for example i declare a 100x100 matrix with 10 NZ entries per row. > After 1), will the memory used for the matrix be 100^2*10*sizeof(double)? > 100*10 > After 2), will the memory used be 100^2*10*sizeof(double) from the prealloc > PLUS 100^2*10*sizeof(double) form the caching of values > Not sure what caching you mean. No extra memory will be needed. Do you mean storing values set for off-process rows? We refer to that as "stashing". You can clear the stash by calling MatAssemblyEnd() with ASSEMBLY_FLUSH. Matt > After 3), will the memory then be reduced back to 100^2*10*sizeof(double)? > > > My concern is step 2). If it is using memory for prealloc and seperately > for caching, then is there a way to flush the cached values to the > preallocated slots? I tried finding stuff in the manual pages but i am not > quite sure if i can or not. > > > > thanks > matt > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
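In the current API the flag Matt refers to is spelled MAT_FLUSH_ASSEMBLY. A sketch of the batched pattern, with nbatches and the insertion loop left abstract:

for (batch=0; batch<nbatches; batch++) {
  /* ... many MatSetValues() calls for this batch ... */
  /* push stashed off-process values out without closing the matrix */
  ierr = MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);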
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From likask at civil.gla.ac.uk Wed Sep 30 10:53:32 2009
From: likask at civil.gla.ac.uk (Lukasz Kaczmarczyk)
Date: Wed, 30 Sep 2009 16:53:32 +0100
Subject: SNES and internal variables
Message-ID:

Hello,

I would like to use SNES to solve a nonlinear system of equations. For
my problem, I have to update some internal variables for the computed
Newton increments of the primary unknowns before I update the residual
(right hand side) and the Jacobian. I found the function SNESSetUpdate,
which is called at the beginning of every iteration, but I am not sure
whether that function is the proper place for updating internal
variables. Can you advise me whether it is right to use internal
variables, e.g. increments of plastic strain at Gauss points?

Lukasz Kaczmarczyk

From likask at civil.gla.ac.uk Wed Sep 30 10:15:43 2009
From: likask at civil.gla.ac.uk (Lukasz Kaczmarczyk)
Date: Wed, 30 Sep 2009 16:15:43 +0100
Subject: SNES and internal variables
Message-ID:

Hello,

I would like to use SNES to solve a nonlinear system of equations. For
my problem, I have to update some internal variables for the computed
Newton increments of the primary unknowns before I update the residual
(right hand side) and the Jacobian. I found the function SNESSetUpdate,
which is called at the beginning of every iteration, but I am not sure
whether that function is the proper place for updating internal
variables. Can you advise me whether it is right to use internal
variables, e.g. increments of plastic strain at Gauss points?

Such problems exist in solid mechanics when the increment of plastic
strain and the plastic multiplier have to be computed. Another example
is a hybrid/mixed FE discretisation having some variables approximated
with C(-1) continuity. Unknowns associated with the C(-1) continuous
approximations can be statically condensed, see the attached file. As a
result, at each iteration step the increments of the statically
condensed variables have to be updated.

Lukasz Kaczmarczyk
-------------- next part --------------
A non-text attachment was scrubbed...
Name: eq.pdf
Type: application/pdf
Size: 46799 bytes
Desc: not available
URL: From knepley at mcs.anl.gov Wed Sep 30 11:04:28 2009
From: knepley at mcs.anl.gov (Matthew Knepley)
Date: Wed, 30 Sep 2009 11:04:28 -0500
Subject: SNES and internal variables
In-Reply-To:
References:
Message-ID:

Yes, that is a general update function, suitable for updating this
information at the start of a Newton iterate.

   Matt

On Wed, Sep 30, 2009 at 10:53 AM, Lukasz Kaczmarczyk
wrote:

> Hello,
>
> I would like to use SNES to solve a nonlinear system of equations. For
> my problem, I have to update some internal variables for the computed
> Newton increments of the primary unknowns before I update the residual
> (right hand side) and the Jacobian. I found the function SNESSetUpdate,
> which is called at the beginning of every iteration, but I am not sure
> whether that function is the proper place for updating internal
> variables. Can you advise me whether it is right to use internal
> variables, e.g. increments of plastic strain at Gauss points?
>
> Lukasz Kaczmarczyk

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From likask at civil.gla.ac.uk Wed Sep 30 11:20:17 2009
From: likask at civil.gla.ac.uk (Lukasz Kaczmarczyk)
Date: Wed, 30 Sep 2009 17:20:17 +0100
Subject: SNES and internal variables
In-Reply-To:
References:
Message-ID: <1AF7912F-AC7C-4B2A-BCC6-45AD68A880A9@civil.gla.ac.uk>

Thank you for your fast response.

Lukasz

On 30 Sep 2009, at 17:04, Matthew Knepley wrote:

> Yes, that is a general update function, suitable for updating this
> information at the start of a Newton iterate.
>
>    Matt
>
> On Wed, Sep 30, 2009 at 10:53 AM, Lukasz Kaczmarczyk
> wrote:
> Hello,
>
> I would like to use SNES to solve a nonlinear system of equations.
> For my problem, I have to update some internal variables for the
> computed Newton increments of the primary unknowns before I update
> the residual (right hand side) and the Jacobian. I found the function
> SNESSetUpdate, which is called at the beginning of every iteration,
> but I am not sure whether that function is the proper place for
> updating internal variables. Can you advise me whether it is right to
> use internal variables, e.g. increments of plastic strain at Gauss
> points?
>
> Lukasz Kaczmarczyk
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener

Lukasz Kaczmarczyk
Lecturer
Department of Civil Engineering, University of Glasgow, GLASGOW, G12 8LT
Tel: +44 141 3305348
email: likask at civil.gla.ac.uk
web: http://www.civil.gla.ac.uk/~kaczmarczyk/
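To round off the thread, a hedged sketch of the SNESSetUpdate() usage discussed above: a callback run at the start of each Newton step that refreshes history variables from the current iterate. The AppCtx type and the Gauss-point update are placeholders for the application's own state handling, not part of PETSc:

#include "petscsnes.h"

/* placeholder application context; in practice this would hold the
   Gauss-point history data */
typedef struct {
  PetscInt dummy;
} AppCtx;

/* called by SNES at the start of every Newton step */
PetscErrorCode UpdateInternalVariables(SNES snes,PetscInt step)
{
  AppCtx         *user;
  Vec            x;
  PetscErrorCode ierr;

  ierr = SNESGetApplicationContext(snes,(void**)&user);CHKERRQ(ierr);
  ierr = SNESGetSolution(snes,&x);CHKERRQ(ierr);
  /* ... recompute the internal variables (e.g. plastic strain
     increments at the Gauss points) from the current iterate x,
     before the next FormFunction/FormJacobian calls ... */
  return 0;
}

/* registration, e.g. right after SNESCreate():
   ierr = SNESSetApplicationContext(snes,&user);CHKERRQ(ierr);
   ierr = SNESSetUpdate(snes,UpdateInternalVariables);CHKERRQ(ierr); */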