From jwicks at cs.brown.edu Sun Dec 2 08:01:59 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Sun, 2 Dec 2007 09:01:59 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000201c834eb$e89ecfa0$0201a8c0@jwickslptp> I am specifically interested in knowing if one can expect the residual matrix (A - LU) to be significantly more sparse than the original matrix, A. Does anyone know if this is the case for sparse A? > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Thursday, November 29, 2007 3:43 PM > To: petsc-users at mcs.anl.gov > Subject: Re: PCGetFactoredMatrix > > > > John, > > There is no immediate way to do this. > For the SeqAIJ format, we store both L and U in a single CSR > format, with, for each row, first the part of L (below the > diagonal), then 1/D_i, then the part of U for that row. You can > see how the triangular solves are done by looking at > src/mat/impls/aij/seq/aijfact.c the routine > MatSolve_SeqAIJ(). > Note that it is actually more complicated due to the row and column > permutations > (the factored matrix is stored in the ordering of the > permutations). For a BAIJ matrix the storage is similar except > it is stored by block > instead of point > and the inverse of the block diagonal is stored. > > One could take the MatSolve_SeqAIJ() routine and modify it to do the > matrix > vector product without too much difficulty. > > If you decide to do this we would gladly include it in our > distribution. > > Barry > > One can ask why we don't provide this functionality in PETSc since > computing > A - LU is a reasonable thing to do if one wants to understand the > convergence > of the method. The answer is two-fold: 1) time and energy, and > 2) though > we > like everyone to use PETSc, we are driven more by people who are not > interested > in the solution algorithms etc. but only in getting the answer easily > and relatively > efficiently. 
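John's question can be probed outside PETSc with a small self-contained sketch (plain Python, no PETSc; the 4x4 matrix and the ilu0 helper below are made up for illustration). For ILU(0) the residual R = A - LU is, by construction, zero on the sparsity pattern of A, so its nonzeros appear only at the fill positions the factorization dropped:

```python
def ilu0(A):
    """IKJ-ordered ILU(0) of a small dense-stored matrix: entries are
    updated only where A was originally nonzero (no fill is kept).
    Returns (L, U) with unit diagonal on L."""
    n = len(A)
    pattern = [[A[i][j] != 0 for j in range(n)] for i in range(n)]
    F = [row[:] for row in A]          # work on a copy of A
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            F[i][k] /= F[k][k]         # multiplier, becomes L[i][k]
            for j in range(k + 1, n):
                if pattern[i][j]:      # drop updates outside A's pattern
                    F[i][j] -= F[i][k] * F[k][j]
    L = [[F[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[F[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U

def matmul(L, U):
    n = len(L)
    return [[sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Made-up symmetric test matrix whose pattern generates fill at (1,3), (3,1).
A = [[4.0, 1.0, 0.0, 1.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [1.0, 0.0, 1.0, 4.0]]

L, U = ilu0(A)
LU = matmul(L, U)
R = [[A[i][j] - LU[i][j] for j in range(4)] for i in range(4)]
print("nonzeros of R:", [(i, j) for i in range(4) for j in range(4)
                         if abs(R[i][j]) > 1e-12])   # -> [(1, 3), (3, 1)]
```

For this matrix R has exactly two nonzeros, at the two dropped fill positions, and is zero everywhere A is nonzero; whether A - LU is sparser than A in general therefore depends entirely on how much fill A's pattern would generate.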
> > > On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > > > I would like to compute the residual A - LU, where LU is the ILU > > factorization of A. What is the most convenient way of doing so? > > > >> -----Original Message----- > >> From: owner-petsc-users at mcs.anl.gov > >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > >> Sent: Thursday, November 29, 2007 12:04 PM > >> To: petsc-users at mcs.anl.gov > >> Subject: Re: PCGetFactoredMatrix > >> > >> > >> It depends on the package, but the petsc stuff stores L and U in one > >> matrix. > >> > >> Matt > >> > >> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > >>> The documentation for PCGetFactoredMatrix is not clear. What does > >>> this return for ILU(0), for example? Does it return the > >> product LU or > >>> the in-place factorization? > >>> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin > >> their experiments is infinitely more interesting than any > >> results to which their experiments lead. > >> -- Norbert Wiener > >> > > > From timothy.stitt at ichec.ie Mon Dec 3 11:49:07 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Mon, 03 Dec 2007 17:49:07 +0000 Subject: Global to Local Vector Mapping Message-ID: <47544193.6050501@ichec.ie> Hi all, Is there a quick way to map a global index for a parallel vector to a local mapping tuple (p,i) where 'p' represents the process containing the value and 'i' is the local index number on that process? As always, thanks in advance for any information provided. Tim. -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Mon Dec 3 12:03:50 2007 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Dec 2007 12:03:50 -0600 Subject: Global to Local Vector Mapping In-Reply-To: <47544193.6050501@ichec.ie> References: <47544193.6050501@ichec.ie> Message-ID: On Dec 3, 2007 11:49 AM, Tim Stitt wrote: > Hi all, > > Is there a quick way to map a global index for a parallel vector to a > local mapping tuple (p,i) where 'p' represents the process containing the > value and 'i' is the local index number on that process? PetscMapGetGlobalRange(&v->map,&range); for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; localInd = globalInd - range[p]; Matt > As always, thanks in advance for any information provided. > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Mon Dec 3 12:23:36 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Mon, 03 Dec 2007 18:23:36 +0000 Subject: Global to Local Vector Mapping In-Reply-To: References: <47544193.6050501@ichec.ie> Message-ID: <475449A8.7000307@ichec.ie> I have a problem, Matthew... this is a Fortran code, which I don't think this routine is compatible with. Is there any other way? 
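The ownership-range lookup Matt sketches in C can also be written as a short plain-Python sketch (the ranges array below is a made-up three-process layout, standing in for what PetscMapGetGlobalRange() would return); a binary search with bisect gives the (p, i) tuple directly:

```python
import bisect

# ranges has length size+1: process p owns global indices
# ranges[p] .. ranges[p+1]-1 (made-up layout for three processes).
ranges = [0, 5, 9, 14]

def global_to_local(gidx, ranges):
    """Map a global index to the (process, local index) tuple."""
    # Largest p with ranges[p] <= gidx, i.e. gidx < ranges[p+1].
    p = bisect.bisect_right(ranges, gidx) - 1
    return p, gidx - ranges[p]

print(global_to_local(6, ranges))   # -> (1, 1): on process 1, local index 1
```

Unlike the linear scan, the binary search stays cheap when the number of processes is large; both assume the contiguous first-n1/next-n2 layout of PETSc vectors.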
Matthew Knepley wrote: > On Dec 3, 2007 11:49 AM, Tim Stitt wrote: > >> Hi all, >> >> Is there a quick way to map a global index for a parallel vector to a >> local mapping tuple (p,i) where 'p' represents the process containing the >> value and 'i' is the local index number on that process? >> > > PetscMapGetGlobalRange(&v->map,&range); > for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; > localInd = globalInd - range[p]; > > Matt > > >> As always, thanks in advance for any information provided. >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From gdiso at ustc.edu Mon Dec 3 17:59:31 2007 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 4 Dec 2007 07:59:31 +0800 Subject: Small bug about line search Message-ID: <7C7393CDF2354390AEC3A1E27815999F@nintatmel> Hi, After a SNESLineSearchPostCheck call, the functions SNESLineSearchCubic and SNESLineSearchQuadratic should recompute the residual norm ||g|| and the search length norm ||y||. But the code is src/snes/impls/ls/ls.c 676: VecNormBegin(g,NORM_2,gnorm); 677: if (*gnorm != *gnorm) SETERRQ(PETSC_ERR_FP,"User provided compute function generated a Not-a-Number"); 678: VecNormBegin(w,NORM_2,ynorm); 679: VecNormEnd(g,NORM_2,gnorm); 680: VecNormEnd(w,NORM_2,ynorm); and 850: VecNormBegin(g,NORM_2,gnorm); 851: VecNormBegin(w,NORM_2,ynorm); 852: VecNormEnd(g,NORM_2,gnorm); 853: VecNormEnd(w,NORM_2,ynorm); it sets ynorm to ||w||, which I think should be ||y||. Yours Gong Ding From bsmith at mcs.anl.gov Tue Dec 4 15:39:07 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Dec 2007 15:39:07 -0600 Subject: Global to 
Local Vector Mapping In-Reply-To: <475449A8.7000307@ichec.ie> References: <47544193.6050501@ichec.ie> <475449A8.7000307@ichec.ie> Message-ID: Tim, Sorry for the delay. You will need to call VecGetOwnershipRanges(). Unfortunately it does not exist in either C or Fortran. I have put it into petsc-dev. You can add the following line to include/petscvec.h EXTERN PetscErrorCode PETSCVEC_DLLEXPORT VecGetOwnershipRanges(Vec,const PetscInt *[]); add the following lines in src/vec/vec/interface/vector.c #undef __FUNCT__ #define __FUNCT__ "VecGetOwnershipRanges" /*@C VecGetOwnershipRanges - Returns the range of indices owned by EACH processor, assuming that the vectors are laid out with the first n1 elements on the first processor, next n2 elements on the second, etc. For certain parallel layouts this range may not be well defined. Not Collective Input Parameter: . x - the vector Output Parameters: . range - array of length size+1 with the start and end+1 for each process Note: The high argument is one more than the last element stored locally. 
Fortran: You must PASS in an array of length size+1 Level: beginner Concepts: ownership^of vectors Concepts: vector^ownership of elements .seealso: MatGetOwnershipRange(), MatGetOwnershipRanges(), VecGetOwnershipRange() @*/ PetscErrorCode PETSCVEC_DLLEXPORT VecGetOwnershipRanges(Vec x,const PetscInt *ranges[]) { PetscErrorCode ierr; PetscFunctionBegin; PetscValidHeaderSpecific(x,VEC_COOKIE,1); PetscValidType(x,1); ierr = PetscMapGetGlobalRange(&x->map,ranges);CHKERRQ(ierr); PetscFunctionReturn(0); } Run make in that directory, then add to src/vec/vec/interface/ftn-custom/zvectorf.c #if defined(PETSC_HAVE_FORTRAN_CAPS) #define vecgetownershipranges_ VECGETOWNERSHIPRANGES #elif !defined(PETSC_HAVE_FORTRAN_UNDERSCORE) #define vecgetownershipranges_ vecgetownershipranges #endif void PETSC_STDCALL vecgetownershipranges_(Vec *x,PetscInt *range,PetscErrorCode *ierr) { PetscMPIInt size; const PetscInt *r; *ierr = MPI_Comm_size((*x)->map.comm,&size);if (*ierr) return; *ierr = VecGetOwnershipRanges(*x,&r);if (*ierr) return; *ierr = PetscMemcpy(range,r,(size+1)*sizeof(PetscInt)); } and again run make in that directory. Let us know if any problems come up, Barry On Dec 3, 2007, at 12:23 PM, Tim Stitt wrote: > I have a problem Matthew...this is a Fortran code..which I don't > think this routine is compatible with. Is there any other way? > > Matthew Knepley wrote: >> On Dec 3, 2007 11:49 AM, Tim Stitt wrote: >> >>> Hi all, >>> >>> Is there a quick way to map a global index for a parallel vector >>> to a >>> local mapping tuple (p,i) where 'p' represents the process >>> containing the >>> value and 'i' is the local index number on that process? >>> >> >> PetscMapGetGlobalRange(&v->map,&range); >> for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; >> localInd = globalInd - range[p]; >> >> Matt >> >> >>> As always, thanks in advance for any information provided. >>> >>> Tim. >>> >>> -- >>> Dr. 
Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From zonexo at gmail.com Wed Dec 5 04:40:08 2007 From: zonexo at gmail.com (Ben Tay) Date: Wed, 05 Dec 2007 18:40:08 +0800 Subject: Estimating PETSc performance using SuperPI's results Message-ID: <47568008.20506@gmail.com> Hi, I'm thinking of ways to estimate and compare the performance of PETSc on different CPUs. I think it will also enable one to make the wise choice of whether to upgrade or not. Of course, the best way is to run your own code on the new machine to see how much increase there is. However, most of the time this option is not available. I have found many forums whereby users post the time required to run programs such as SuperPI or other benchmarking software. I wonder if such software can be used to estimate the performance of PETSc too? In other words, if CPU A runs 4 times faster than CPU B at SuperPi, is it safe to assume that the ratio will be roughly the same running PETSc? Btw, SuperPi is a single-threaded program. Thanks From keita at cray.com Wed Dec 5 15:31:47 2007 From: keita at cray.com (Keita Teranishi) Date: Wed, 5 Dec 2007 15:31:47 -0600 Subject: Usage of fun3d (flow) Message-ID: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> Hi, I am trying to run the fun3d bundled with the petsc distribution. The main program says it needs to access a petsc.opt file, but it is not provided with the petsc package. Can you tell me the format of the file, or give me any sample files? Thank you, ================================ Keita Teranishi Math Software Group Cray, Inc. 
keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From keita at cray.com Wed Dec 5 15:45:27 2007 From: keita at cray.com (Keita Teranishi) Date: Wed, 5 Dec 2007 15:45:27 -0600 Subject: Usage of fun3d (flow) In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> Message-ID: <925346A443D4E340BEB20248BAFCDBDF034D140D@CFEVS1-IP.americas.cray.com> Hi, I also found fun3d requires many input files. I'd like to know the format and sample of these files. Thanks, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ ________________________________ From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Keita Teranishi Sent: Wednesday, December 05, 2007 3:32 PM To: petsc-users at mcs.anl.gov Subject: Usage of fun3d (flow) Hi, I am trying to run fun3d bundled with petsc distribution. The main program says it needs to access petsc.opt file, but it is not provided with the petsc package. Can you tell me the format of the file, or give me any sample files? Thank you, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From timothy.stitt at ichec.ie Thu Dec 6 06:09:49 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 06 Dec 2007 12:09:49 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> Message-ID: <4757E68D.3030208@ichec.ie> Barry, I will be using these routines from Fortran..so I am assuming that Fortran interfaces are available for each routine? Also, how do I know how many sub ksp's there will be? I am assuming I need to dynamically allocate the subksp array in Fortran but do I know the size in advance? Is this related to the value 'n' ? If so, how do I calculate 'n'. What is the significance of subksp[0]? Is it just the sub ksp at this position I should be interested in? Finally, which of the PCFactorSetxxxxxx routines should I be using? Sorry for the twenty questions (well nearly) but I am just a bit confused with this approach. Thanks, Tim. Barry Smith wrote: > > KSP *subksp; > > KSPGetPC(ksp,pc) > PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) > KSPGetPC(subksp[0],&subpc); > PCFactorSetxxxxxx(subpc, .... > > Barry > > > On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > >> I should also add that the code executes without this error when >> using 1 processor...but then displays the error when running in >> parallel with more than one process. >> >> Tim Stitt wrote: >>> Hi all, >>> >>> Can anyone suggest ways of overcoming the following pivot error I >>> keep receiving in my PETSc code during a KSPSolve(). >>> >>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>> see >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! >>> >>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>> 0.00165189 * rowsum 1.65189e+09! 
>>> >>> From checking the documentation....the error is in row 1801, which >>> means it is most likely not a matrix assembly issue? >>> >>> I tried the following prior to the solve with no luck either..... >>> >>> call KSPGetPC(ksp,pc,error) >>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>> >>> Is there anything else I can try? >>> >>> Thanks, >>> >>> Tim. >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From bsmith at mcs.anl.gov Thu Dec 6 11:26:29 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Dec 2007 11:26:29 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4757E68D.3030208@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> Message-ID: On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > Barry, > > I will be using these routines from Fortran, so I am assuming that > Fortran interfaces are available for each routine? > > Also, how do I know how many sub ksp's there will be? I am assuming > I need to dynamically allocate the subksp array in Fortran but do I > know the size in advance? Is this related to the value 'n' ? If so, > how do I calculate 'n'. There will always be one sub ksp by default. There will only be more than one if you use PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or -pc_bjacobi_blocks. In general we recommend keeping it at one. This means you do not need to allocate any KSP, just pass in a KSP variable. > > > What is the significance of subksp[0]? Is it just the sub ksp at > this position I should be interested in? 
This is just the first one. If you have multiple ones then you must loop over them, but I recommend having just one. > > > Finally, which of the PCFactorSetxxxxxx routines should I be using? PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or PCFactorSetShiftPd() depending on what you want to have happen. Barry > > > Sorry for the twenty questions (well nearly) but I am just a bit > confused with this approach. > > Thanks, > > Tim. > > Barry Smith wrote: >> >> KSP *subksp; >> >> KSPGetPC(ksp,pc) >> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) >> KSPGetPC(subksp[0],&subpc); >> PCFactorSetxxxxxx(subpc, .... >> >> Barry >> >> >> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: >> >>> I should also add that the code executes without this error when >>> using 1 processor...but then displays the error when running in >>> parallel with more than one process. >>> >>> Tim Stitt wrote: >>>> Hi all, >>>> >>>> Can anyone suggest ways of overcoming the following pivot error I >>>> keep receiving in my PETSc code during a KSPSolve(). >>>> >>>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>>> see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot >>>> ! >>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>>> 0.00165189 * rowsum 1.65189e+09! >>>> >>>> From checking the documentation....the error is in row 1801, >>>> which means it is most likely not a matrix assembly issue? >>>> >>>> I tried the following prior to the solve with no luck either..... >>>> >>>> call KSPGetPC(ksp,pc,error) >>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>>> >>>> Is there anything else I can try? >>>> >>>> Thanks, >>>> >>>> Tim. >>>> >>> >>> >>> --Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From amjad11 at gmail.com Thu Dec 6 23:44:37 2007 From: amjad11 at gmail.com (amjad ali) Date: Fri, 7 Dec 2007 10:44:37 +0500 Subject: Selecting specific board Message-ID: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> Hello all, I want to build a Beowulf cluster of 16+1 nodes with each node having one Intel Core2Duo (2.66 GHz, FSB 1333MHz, 4MB L2) processor and GigE as the interconnect. On this cluster, I would run my PETSc-based CFD/FEM codes (REQUIRING VERY FAST MEMORY/high memory bandwidth). Please help me out to select any one of the following boards: 1) Intel Server board S3200SH, System Bus 1333MHz, supporting 240-pin DDR2 800 MHz RAM 2) Intel Desktop board DX38BT, System Bus 1333MHz, supporting 240-pin DDR3 1333 MHz RAM See that RAM speed difference. Given that keeping the cluster running all the time and logging in of many users simultaneously is not the concern. The cluster may be dedicated to be used by one user whenever required. But it may be the case that running a code for several days will be required. Would the desktop board DX38BT be suitable to run the cluster for several hours/days? Which board do you recommend for this scenario? Regards, Amjad Ali. From dalcinl at gmail.com Fri Dec 7 06:43:44 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 7 Dec 2007 09:43:44 -0300 Subject: Selecting specific board In-Reply-To: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> References: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> Message-ID: I'm not a hardware expert, but if your processors have FSB 1333MHz, then you should select the matching board, that is, the DX38BT one. 
Additionally, you have to carefully select your switch. However, this is not an easy task. If you have any chance of getting a switch from your provider for a trial, then try to do some testing based on the MPI_Alltoall() routine. I believe many (almost all?) switches have some performance drop in this scenario. On 12/7/07, amjad ali wrote: > Hello all, > > I want to build a Beowulf cluster of 16+1 nodes with each node having one > Intel Core2Duo (2.66 GHz, FSB 1333MHz, 4MB L2) processor and GigE as the > interconnect. On this cluster, I would run my PETSc-based CFD/FEM codes > (REQUIRING VERY FAST MEMORY/high memory bandwidth). Please help me out to > select any one of the following boards: > > 1) Intel Server board S3200SH, System Bus 1333MHz, supporting 240-pin DDR2 > 800 MHz RAM > 2) Intel Desktop board DX38BT, System Bus 1333MHz, supporting 240-pin DDR3 > 1333 MHz RAM > > See that RAM speed difference. Given that keeping the cluster running all > the time and logging in of many users simultaneously is not the concern. The > cluster may be dedicated to be used by one user whenever required. But it > may be the case that running a code for several days will be required. > > Would the desktop board DX38BT be suitable to run the cluster for several > hours/days? > Which board do you recommend for this scenario? > > Regards, > Amjad Ali. 
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From timothy.stitt at ichec.ie Fri Dec 7 09:58:30 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Fri, 07 Dec 2007 15:58:30 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> Message-ID: <47596DA6.1030809@ichec.ie> Barry, I added the following lines to my Fortran code: call KSPGetPC(ksp,pc,error) call KSPSetUp(ksp,error) oneInt=1 call PCBJacobiGetSubKSP(pc,oneInt,PETSC_NULL,kspSub,error) call KSPGetPC(kspSub,pcSub,error) call PCFactorSetShiftNonzero(pcSub,PETSC_DECIDE,error) Now the parallel code goes beyond the zero pivot problem I was getting in the KSPSolve()...but only process 0 seems to complete the KSPSolve() and Process 1 and higher never makes it out of the KSPSolve() i.e. process 0 moves on and performs post-KSPSolve work (just some print statements) while the other processes never get out of KSPSolve(). My job only terminates once the requested wallclock expires. Again when running with only 1 process everything terminates successfully. Any ideas? Have I done something stupid with the instructions above? Thanks, Tim. Barry Smith wrote: > > On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > >> Barry, >> >> I will be using these routines from Fortran..so I am assuming that >> Fortran interfaces are available for each routine? >> >> Also, how do I know how many sub ksp's there will be? I am assuming I >> need to dynamically allocate the subksp array in Fortran but do I >> know the size in advance? Is this related to the value 'n' ? If so, >> how do I calculate 'n'. 
> > There will always be one sub ksp be default. There will only be > more than one if you use > PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or > -pc_bjacobi_blocks. > In general we recommend keeping it one. This means you do not need to > allocate > any KSP, just pass in a KSP variable > >> >> >> What is the significance of subksp[0]? Is it just the sub ksp at this >> position I should be interested in? > > This is just the first one. If you have multiply ones then you must > loop over them, but I > recommend having just one. >> >> >> Finally, which of the PCFactorSetxxxxxx routines should I be using? > > PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or > PCFactorSetShiftPd() depending > on what you want to have happen. > > Barry > >> >> >> Sorry for the twenty questions (well nearly) but I am just a bit >> confused with this approach. >> >> Thanks, >> >> Tim. >> >> Barry Smith wrote: >>> >>> KSP *subksp; >>> >>> KSPGetPC(ksp,pc) >>> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) >>> KSPGetPC(subksp[0],&subpc); >>> PCFactorSetxxxxxx(subpc, .... >>> >>> Barry >>> >>> >>> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: >>> >>>> I should also add that the code executes without this error when >>>> using 1 processor...but then displays the error when running in >>>> parallel with more than one process. >>>> >>>> Tim Stitt wrote: >>>>> Hi all, >>>>> >>>>> Can anyone suggest ways of overcoming the following pivot error I >>>>> keep receiving in my PETSc code during a KSPSolve(). >>>>> >>>>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>>>> see >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! >>>>> >>>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>>>> 0.00165189 * rowsum 1.65189e+09! >>>>> >>>>> From checking the documentation....the error is in row 1801, which >>>>> means it is most likely not a matrix assembly issue? 
>>>>> >>>>> I tried the following prior to the solve with no luck either..... >>>>> >>>>> call KSPGetPC(ksp,pc,error) >>>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>>>> >>>>> Is there anything else I can try? >>>>> >>>>> Thanks, >>>>> >>>>> Tim. >>>>> >>>> >>>> >>>> --Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Fri Dec 7 11:44:57 2007 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Dec 2007 11:44:57 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <47596DA6.1030809@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> <47596DA6.1030809@ichec.ie> Message-ID: On Dec 7, 2007 9:58 AM, Tim Stitt wrote: > Barry, > > I added the following lines to my Fortran code: > > call KSPGetPC(ksp,pc,error) > call KSPSetUp(ksp,error) > oneInt=1 > call PCBJacobiGetSubKSP(pc,oneInt,PETSC_NULL,kspSub,error) > call KSPGetPC(kspSub,pcSub,error) > call PCFactorSetShiftNonzero(pcSub,PETSC_DECIDE,error) > > Now the parallel code goes beyond the zero pivot problem I was getting > in the KSPSolve()...but only process 0 seems to complete the KSPSolve() > and Process 1 and higher never makes it out of the KSPSolve() i.e. 
> process 0 moves on and performs post-KSPSolve work (just some print > statements) while the other processes never get out of KSPSolve(). My > job only terminates once the requested wallclock expires. Again when > running with only 1 process everything terminates successfully. This does not seem possible. BJacobi synchronizes at each step for a residual evaluation. Are you sure you did not call KSPSolve() on the inner KSP? Matt > Any ideas? Have I done something stupid with the instructions above? > > Thanks, > > Tim. > > Barry Smith wrote: > > > > On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > > > >> Barry, > >> > >> I will be using these routines from Fortran..so I am assuming that > >> Fortran interfaces are available for each routine? > >> > >> Also, how do I know how many sub ksp's there will be? I am assuming I > >> need to dynamically allocate the subksp array in Fortran but do I > >> know the size in advance? Is this related to the value 'n' ? If so, > >> how do I calculate 'n'. > > > > There will always be one sub ksp be default. There will only be > > more than one if you use > > PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or > > -pc_bjacobi_blocks. > > In general we recommend keeping it one. This means you do not need to > > allocate > > any KSP, just pass in a KSP variable > > > >> > >> > >> What is the significance of subksp[0]? Is it just the sub ksp at this > >> position I should be interested in? > > > > This is just the first one. If you have multiply ones then you must > > loop over them, but I > > recommend having just one. > >> > >> > >> Finally, which of the PCFactorSetxxxxxx routines should I be using? > > > > PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or > > PCFactorSetShiftPd() depending > > on what you want to have happen. > > > > Barry > > > >> > >> > >> Sorry for the twenty questions (well nearly) but I am just a bit > >> confused with this approach. > >> > >> Thanks, > >> > >> Tim. 
> >> > >> Barry Smith wrote: > >>> > >>> KSP *subksp; > >>> > >>> KSPGetPC(ksp,pc) > >>> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) > >>> KSPGetPC(subksp[0],&subpc); > >>> PCFactorSetxxxxxx(subpc, .... > >>> > >>> Barry > >>> > >>> > >>> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > >>> > >>>> I should also add that the code executes without this error when > >>>> using 1 processor...but then displays the error when running in > >>>> parallel with more than one process. > >>>> > >>>> Tim Stitt wrote: > >>>>> Hi all, > >>>>> > >>>>> Can anyone suggest ways of overcoming the following pivot error I > >>>>> keep receiving in my PETSc code during a KSPSolve(). > >>>>> > >>>>> [1]PETSC ERROR: Detected zero pivot in LU factorization > >>>>> see > >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > >>>>> > >>>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > >>>>> 0.00165189 * rowsum 1.65189e+09! > >>>>> > >>>>> From checking the documentation....the error is in row 1801, which > >>>>> means it is most likely not a matrix assembly issue? > >>>>> > >>>>> I tried the following prior to the solve with no luck either..... > >>>>> > >>>>> call KSPGetPC(ksp,pc,error) > >>>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > >>>>> > >>>>> Is there anything else I can try? > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Tim. > >>>>> > >>>> > >>>> > >>>> --Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>> > >> > >> > >> --Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > > > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From ondrej at certik.cz Sat Dec 8 16:33:24 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Sat, 8 Dec 2007 23:33:24 +0100 Subject: which MPI can we use In-Reply-To: References: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Message-ID: <85b5c3130712081433i163b9a2fx24dc57cc62878675@mail.gmail.com> On Nov 28, 2007 4:49 PM, Lisandro Dalcin wrote: > On 11/28/07, amjad ali wrote: > > Please name the MPI libraries (other than MPICH2) which can be used > > efficiently with PETSc? > > On Linux/GNU, surely Open-MPI. You also have Intel-MPI (actually, it > is based on MPICH2). Yep, openmpi works nicely. You can also use petsc4py with that. If you use Debian, just install python-petsc4py and you'll get everything installed with openmpi. Ondrej From timothy.stitt at ichec.ie Sun Dec 9 11:37:07 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 09 Dec 2007 17:37:07 +0000 Subject: MatView() and Multiple Processes Message-ID: <475C27C3.7030001@ichec.ie> Hi all, I was just wondering if someone can tell me how to configure my dual-core laptop to allow graphical X11 output of my PETSc matrices. MatView() works fine on my parallel code with 1 process but I get the following errors on each process when I use more than one: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Error in external library! [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 . 
Make sure your COMPUTE NODES are authorized to connect to this X server and either your DISPLAY variable is set or you use the -display name option Thanks in advance as always, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Dec 9 11:41:56 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Dec 2007 11:41:56 -0600 Subject: MatView() and Multiple Processes In-Reply-To: <475C27C3.7030001@ichec.ie> References: <475C27C3.7030001@ichec.ie> Message-ID: I would try -display :0.0 Matt On Dec 9, 2007 11:37 AM, Tim Stitt wrote: > Hi all, > > I was just wondering if someone can tell me how to configure my > dual-core laptop to allow graphical X11 output of my PETSc matrices. > MatView() works fine on my parallel code with 1 process but I get the > following errors on each process when I use more than one: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Error in external library! > [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 > . Make sure your COMPUTE NODES are authorized to connect > to this X server and either your DISPLAY variable > is set or you use the -display name option > > Thanks in advance as always, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Dec 9 11:54:42 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 09 Dec 2007 17:54:42 +0000 Subject: MatView() and Multiple Processes In-Reply-To: References: <475C27C3.7030001@ichec.ie> Message-ID: <475C2BE2.8030302@ichec.ie> Perfect, Matthew, thanks...I was trying -display with localhost:0.0 and all possible permutations, but it seems your incantation is the correct one. Cheers, Tim. Matthew Knepley wrote: > I would try -display :0.0 > > Matt > > On Dec 9, 2007 11:37 AM, Tim Stitt wrote: > >> Hi all, >> >> I was just wondering if someone can tell me how to configure my >> dual-core laptop to allow graphical X11 output of my PETSc matrices. >> MatView() works fine on my parallel code with 1 process but I get the >> following errors on each process when I use more than one: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Error in external library! >> [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 >> . Make sure your COMPUTE NODES are authorized to connect >> to this X server and either your DISPLAY variable >> is set or you use the -display name option >> >> Thanks in advance as always, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From amjad11 at gmail.com Tue Dec 11 09:33:03 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 11 Dec 2007 20:33:03 +0500 Subject: PETSc on ROCKS Message-ID: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> Hi all, Can we install PETSc on a ROCKS-based cluster?
Or do we have to find some kind of PETSc-Roll? Have any of you experienced PETSc on ROCKS? regards, Amjad. -------------- next part -------------- An HTML attachment was scrubbed... URL: From randy at geosystem.us Tue Dec 11 09:35:29 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 11 Dec 2007 07:35:29 -0800 Subject: PETSc on ROCKS In-Reply-To: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> References: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> Message-ID: <475EAE41.6040806@geosystem.us> Yes, we use PETSc on a ROCKS-based cluster and they work quite well together. Randy M. amjad ali wrote: > Hi all, > CAn we install PETSc on ROCKS-based-cluster? or we have find some kind > of PETSc-Roll? > > Have any of you experienced PETSc on ROCKS? > > regards, > Amjad. > From jwicks at cs.brown.edu Thu Dec 13 14:42:14 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 13 Dec 2007 15:42:14 -0500 Subject: Norm computation In-Reply-To: <000201c834eb$e89ecfa0$0201a8c0@jwickslptp> Message-ID: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> I recently solved a linear system of very high dimension distributed over 32 Mac XServes. I was rather surprised by the performance statistics it reported, given below. In particular, how can VecNorm be so much more expensive than VecDot, since VecNorm should simply involve taking a single square root of a dot product?
--- Event Stage 2: LinearSolve

MatMult 19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00 2 17 49 49 0 16 17 50 50 0 2214
MatMultTranspose 19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00 2 18 49 49 0 11 18 50 50 0 2601
MatSolve 20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3200
MatSolveTranspos 20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00 2 18 0 0 0 11 18 0 0 0 2976
MatLUFactorNum 1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 10 14 0 0 0 1635
MatILUFactorSym 1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 4 0 0 0 2 0
MatGetRowIJ 1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 3 0 0 0 0 3 0
VecDot 38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01 1 4 0 0 49 10 4 0 0 62 710
VecNorm 20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01 4 2 0 0 26 25 2 0 0 33 134
VecCopy 4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 2 6 0 0 0 4118
VecAYPX 36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 1 4 0 0 0 5430
VecScatterBegin 38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00 0 0 97 98 0 1 0100100 0 0
VecScatterEnd 38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0
KSPSetup 2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
KSPSolve 20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3144
PCSetUp 2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00 2 14 0 0 4 13 14 0 0 5 1265
PCApply 40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00 5 49 0 0 4 35 49 0 0 5 2400

From bsmith at mcs.anl.gov Thu Dec 13 15:58:10 2007 From: 
bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Dec 2007 15:58:10 -0600 Subject: Norm computation In-Reply-To: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> References: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> Message-ID: <0DC6210C-F590-4CC0-95E0-E590F5F6F9D4@mcs.anl.gov> The time for VecNorm and VecDot reflects two factors: 1) the time to perform the local floating point operations and 2) the time a process waits until all the other processes are ready to exchange data. 2) depends on whatever calculations are being done BEFORE the norm or dot and is largely related to the load balancing of the work there. If you look at the 4th column of numbers below, it is a measure of the load balance up to that point: for the VecDot it is 1.9, which means the fastest process was in the routine (mostly waiting) 1/1.9 times as long as the slowest process was in the routine. For VecNorm it is 4! Meaning some processes are waiting in VecNorm for a long time before the slowest gets to that routine and does its communications. Barry On Dec 13, 2007, at 2:42 PM, John R. Wicks wrote: > I recently peformed solved a linear system of very high dimension > distributed over 32 Mac XServe's. I was rather surprised by the > performance > statistics it reported, given below. In particular, how can VecNorm > be so > much more expensive than VecDot, since VecNorm should simply involve > taking > a single square root of a dot product. 
> 
> --- Event Stage 2: LinearSolve
> 
> MatMult 19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00 2 17 49 49 0 16 17 50 50 0 2214
> MatMultTranspose 19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00 2 18 49 49 0 11 18 50 50 0 2601
> MatSolve 20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3200
> MatSolveTranspos 20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00 2 18 0 0 0 11 18 0 0 0 2976
> MatLUFactorNum 1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 10 14 0 0 0 1635
> MatILUFactorSym 1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 4 0 0 0 2 0
> MatGetRowIJ 1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 3 0 0 0 0 3 0
> VecDot 38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01 1 4 0 0 49 10 4 0 0 62 710
> VecNorm 20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01 4 2 0 0 26 25 2 0 0 33 134
> VecCopy 4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecAXPY 57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 2 6 0 0 0 4118
> VecAYPX 36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 1 4 0 0 0 5430
> VecScatterBegin 38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00 0 0 97 98 0 1 0100100 0 0
> VecScatterEnd 38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0
> KSPSetup 2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> KSPSolve 20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3144
> PCSetUp 2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00 2 14 0 0 4 13 14 0 0 5 1265
> PCApply 40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00 5 49 0 0 4 35 49 0 0 5 2400
> 
From randy at geosystem.us Thu Dec 13 17:32:00 2007 From: randy at geosystem.us (Randall Mackie) Date: Thu, 13 Dec 2007 15:32:00 -0800 Subject: Question on Index Sets and VecScatters Message-ID: <4761C0F0.1070700@geosystem.us> I have a situation where I've put a model vector m(i,j,k) into a parallel PETSc vector for use in my modeling code. However, I'm now adding a bit of code where I want to do some calculations based on the 1D average of the model. In other words, for each k, I want to average m(i,j), and so produce a new model vector m_avg(k). So, to do this, it would seem that I need to create a VecScatter that will, for each layer, scatter all the m(i,j) into a 2D vector, then I can take the average. It would seem that I need to create an Index Set to do this, but I'm a bit confused as to how to go about it actually, since I've never used Index Sets. Can someone outline the basic steps given my description above? Thanks, Randy -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Thu Dec 13 17:46:39 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Dec 2007 17:46:39 -0600 Subject: Question on Index Sets and VecScatters In-Reply-To: <4761C0F0.1070700@geosystem.us> References: <4761C0F0.1070700@geosystem.us> Message-ID: You could do it like that, but it seems pretty wasteful, especially in parallel where you might be sending a considerable amount of data. Why not do something like this: 1) Average all slabs into a local vector, indexed by the given k value, meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. 2) Now construct a scatter that maps each local vector into a parallel vector of all the ks. 
The IndexSet for the from (local) vector will be {0, 1, ..., m} and the IndexSet for the to (global) vector will be {k_0, k_1, ... , k_m} on each process. 3) When you scatter use ADD_VALUES. Then you will have the sum, and just scale the vector by the slab size. Does this make sense to you? Thanks, Matt On Dec 13, 2007 5:32 PM, Randall Mackie wrote: > I have a situation where I've put a model vector m(i,j,k) into a parallel > PETSc vector for use in my modeling code. However, I'm now adding a bit of code > where I want to do some calculations based on the 1D average of the model. > In other words, for each k, I want to average m(i,j), and so produce a new > model vector m_avg(k). > > So, to do this, it would seem that I need to create a VecScatter that will, > for each layer, scatter all the m(i,j) into a 2D vector, then I can take > the average. It would seem that I need to create an Index Set to do this, > but I'm a bit confused as to how to go about it actually, since I've never > used Index Sets. > > Can someone outline the basic steps given my description above? > > Thanks, Randy > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From randy at geosystem.us Thu Dec 13 18:16:31 2007 From: randy at geosystem.us (Randall Mackie) Date: Thu, 13 Dec 2007 16:16:31 -0800 Subject: Question on Index Sets and VecScatters In-Reply-To: References: <4761C0F0.1070700@geosystem.us> Message-ID: <4761CB5F.3080609@geosystem.us> Hi Matt, Yes, I see what you're saying, and it makes sense. I'll give it a try. 
Randy Matthew Knepley wrote: > You could do it like that, but it seems pretty wasteful, especially in parallel > where you might be sending a considerable amount of data. Why not do > something like this: > > 1) Average all slabs into a local vector, indexed by the given k value, > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > 2) Now construct a scatter that maps each local vector into a parallel > vector of all the ks. The IndexSet for the from (local) vector will be > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > {k_0, k_1, ... , k_m} on each process. > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > just scale the vector by the slab size. > > Does this makes sense to you? > > Thanks, > > Matt > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: >> I have a situation where I've put a model vector m(i,j,k) into a parallel >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code >> where I want to do some calculations based on the 1D average of the model. >> In other words, for each k, I want to average m(i,j), and so produce a new >> model vector m_avg(k). >> >> So, to do this, it would seem that I need to create a VecScatter that will, >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take >> the average. It would seem that I need to create an Index Set to do this, >> but I'm a bit confused as to how to go about it actually, since I've never >> used Index Sets. >> >> Can someone outline the basic steps given my description above? >> >> Thanks, Randy >> >> -- >> Randall Mackie >> GSY-USA, Inc. >> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. GP 1034 >> >> > > > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. 
GP 1034 From thomas.fabry at uz.kuleuven.ac.be Tue Dec 18 06:22:17 2007 From: thomas.fabry at uz.kuleuven.ac.be (Thomas Fabry) Date: Tue, 18 Dec 2007 13:22:17 +0100 Subject: Petsc + Matlab Compute Engine Message-ID: I have a problem using the Matlab Compute Engine via Petsc. The line

ierr = PetscMatlabEngineCreate(PETSC_COMM_WORLD,PETSC_NULL,e); CHKERRQ(ierr);

and using this makefile:

CFLAGS = -c -I/usr/local/matlab14.3/extern/include -I/usr/local/matlab14.3/simulink/include
FFLAGS = -I${PETSC_DIR}/include/finclude
CPPFLAGS =
FPPFLAGS =

include ${PETSC_DIR}/bmake/common/base

secondPETScTest: secondPETScTest.o
        -${CLINKER} -o secondPETScTest secondPETScTest.o ${PETSC_KSP_LIB}
        ${RM} secondPETScTest.o

secondPETScTestm: secondPETScTest.o chkopts
        -${CLINKER} -O -pthread -shared -m32 -Wl,--version-script,/usr/local/matlab14.3/extern/lib/glnx86/mexFunction.map -o secondPETScTest secondPETScTest.o -Wl,-rpath-link,/usr/local/matlab14.3/bin/glnx86 -L/usr/local/matlab14.3/bin/glnx86 -lmx -lmex -lmat -lm -lstdc++ ${PETSC_KSP_LIB}
        ${RM} secondPETScTest.o

gives "/PETSc impl/secondPETScTest.c:38: undefined reference to `PetscMatlabEngineCreate'" when trying make secondPETScTest, and when I compile with make secondPETScTestm, compilation works, but running the program gives a segmentation fault. I hope someone can help me. Kind regards, Thomas Fabry From knepley at gmail.com Tue Dec 18 08:02:53 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Dec 2007 08:02:53 -0600 Subject: Petsc + Matlab Compute Engine In-Reply-To: References: Message-ID: If you want to use the Matlab engine, you must configure PETSc to use Matlab: --with-matlab-dir= --with-matlab-engine. Thanks, Matt On Dec 18, 2007 6:22 AM, Thomas Fabry wrote: > I have a problem using the Matlab Compute Engine via Petsc. 
> The line > > ierr = PetscMatlabEngineCreate(PETSC_COM_WORLD,PETSC_NULL,e); > CHKERRQ(ierr); > > and using this makefile: > > CFLAGS = -c -I/usr/local/matlab14.3/extern/include > -I/usr/local/matlab14.3/simulink/include > FFLAGS = -I${PETSC_DIR}/include/finclude > CPPFLAGS = > FPPFLAGS = > > include ${PETSC_DIR}/bmake/common/base > > secondPETScTest: secondPETScTest.o > -${CLINKER} -o secondPETScTest secondPETScTest.o > ${PETSC_KSP_LIB} > ${RM} secondPETScTest.o > > secondPETScTestm: secondPETScTest.o chkopts > -${CLINKER} -O -pthread -shared -m32 > -Wl,--version-script,/usr/local/matlab14.3/extern/lib/glnx86/mexFunction > .map -o secondPETScTest secondPETScTest.o > -Wl,-rpath-link,/usr/local/matlab14.3/bin/glnx86 > -L/usr/local/matlab14.3/bin/glnx86 -lmx -lmex -lmat -lm -lstdc++ > ${PETSC_KSP_LIB} > ${RM} secondPETScTest.o > > gives "/PETSc impl/secondPETScTest.c:38: undefined reference to > `PetscMatlabEngineCreate'" when trying make secondPETScTest, and when I > compile with make secondPETScTestm, compilation works, but running the > program gives a segmentation fault. > > > I hope someone can help me > > Kind regards > > Thomas Fabry > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From randy at geosystem.us Tue Dec 18 13:43:25 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 18 Dec 2007 11:43:25 -0800 Subject: Question on Index Sets and VecScatters In-Reply-To: References: <4761C0F0.1070700@geosystem.us> Message-ID: <476822DD.7070603@geosystem.us> Matt, Just a quick follow up question. The local vectors created in (1) are SEQ vectors on PETSC_COMM_SELF. To create the index sets, it seems like I should just use ISCreateStride, using 0, m for start and length of the index set. My question is should the communicator be PETSC_COMM_SELF or PETSC_COMM_WORLD? 
Similarly, the index set for the global vector should also be created with ISCreateStride, using k_0 and m for start and lengths. Same question about the communicator. Thanks, Randy Matthew Knepley wrote: > You could do it like that, but it seems pretty wasteful, especially in parallel > where you might be sending a considerable amount of data. Why not do > something like this: > > 1) Average all slabs into a local vector, indexed by the given k value, > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > 2) Now construct a scatter that maps each local vector into a parallel > vector of all the ks. The IndexSet for the from (local) vector will be > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > {k_0, k_1, ... , k_m} on each process. > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > just scale the vector by the slab size. > > Does this makes sense to you? > > Thanks, > > Matt > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: >> I have a situation where I've put a model vector m(i,j,k) into a parallel >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code >> where I want to do some calculations based on the 1D average of the model. >> In other words, for each k, I want to average m(i,j), and so produce a new >> model vector m_avg(k). >> >> So, to do this, it would seem that I need to create a VecScatter that will, >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take >> the average. It would seem that I need to create an Index Set to do this, >> but I'm a bit confused as to how to go about it actually, since I've never >> used Index Sets. >> >> Can someone outline the basic steps given my description above? >> >> Thanks, Randy >> >> -- >> Randall Mackie >> GSY-USA, Inc. >> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. 
GP 1034 >> >> > > > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Tue Dec 18 16:01:34 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Dec 2007 16:01:34 -0600 Subject: Question on Index Sets and VecScatters In-Reply-To: <476822DD.7070603@geosystem.us> References: <4761C0F0.1070700@geosystem.us> <476822DD.7070603@geosystem.us> Message-ID: On Dec 18, 2007 1:43 PM, Randall Mackie wrote: > Matt, > > Just a quick follow up question. The local vectors created in (1) > are SEQ vectors on PETSC_COMM_SELF. To create the index sets, it > seems like I should just use ISCreateStride, using 0, m for start > and length of the index set. My question is should the communicator > be PETSC_COMM_SELF or PETSC_COMM_WORLD? SELF. > Similarly, the index set for the global vector should also be > created with ISCreateStride, using k_0 and m for start and lengths. > Same question about the communicator. Comms on IndexSets do not actually matter. Thanks, Matt > Thanks, Randy > > > Matthew Knepley wrote: > > You could do it like that, but it seems pretty wasteful, especially in parallel > > where you might be sending a considerable amount of data. Why not do > > something like this: > > > > 1) Average all slabs into a local vector, indexed by the given k value, > > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > > > 2) Now construct a scatter that maps each local vector into a parallel > > vector of all the ks. The IndexSet for the from (local) vector will be > > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > > {k_0, k_1, ... , k_m} on each process. > > > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > > just scale the vector by the slab size. > > > > Does this makes sense to you? 
> > > > Thanks, > > > > Matt > > > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: > >> I have a situation where I've put a model vector m(i,j,k) into a parallel > >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code > >> where I want to do some calculations based on the 1D average of the model. > >> In other words, for each k, I want to average m(i,j), and so produce a new > >> model vector m_avg(k). > >> > >> So, to do this, it would seem that I need to create a VecScatter that will, > >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take > >> the average. It would seem that I need to create an Index Set to do this, > >> but I'm a bit confused as to how to go about it actually, since I've never > >> used Index Sets. > >> > >> Can someone outline the basic steps given my description above? > >> > >> Thanks, Randy > >> > >> -- > >> Randall Mackie > >> GSY-USA, Inc. > >> PMB# 643 > >> 2261 Market St., > >> San Francisco, CA 94114-1600 > >> Tel (415) 469-8649 > >> Fax (415) 469-5044 > >> > >> California Registered Geophysicist > >> License No. GP 1034 > >> > >> > > > > > > > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From recrusader at gmail.com Fri Dec 21 01:16:29 2007 From: recrusader at gmail.com (Yujie) Date: Thu, 20 Dec 2007 23:16:29 -0800 Subject: how to visit the variable "bs" in pmat of preconditioner Message-ID: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> hi, everyone now, I want to use the Hypre package via PETSc inside a third-party package. I need to access the variable "bs" in the Mat struct. In hypre.c, this variable lets BoomerAMG know the block size of the Mat. 
The code is as follows:

127: /* special case for BoomerAMG */
128: if (jac->setup == HYPRE_BoomerAMGSetup) {
129:   MatGetBlockSize(pc->pmat,&bs);
130:   if (bs > 1) {
131:     HYPRE_BoomerAMGSetNumFunctions(jac->hsolver,bs);
132:   }
133: };

However, I can't access this variable. I have obtained the pointer to the PC I use, but I can't access the variable pmat from my code, and I can't find any function in the PETSc manual that provides this. Could you give me some advice on how to do it? Merry X'mas! Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Dec 21 07:44:12 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Dec 2007 07:44:12 -0600 Subject: how to visit the variable "bs" in pmat of preconditioner In-Reply-To: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> References: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> Message-ID: <219D5EF9-DFFB-409A-B2E3-825C64F1E54E@mcs.anl.gov> pmat is the matrix you set with KSPSetOperators() so you just need to set the block size of that matrix. On Dec 21, 2007, at 1:16 AM, Yujie wrote: > hi, everyone > > now, I want to use Hypre package via PETSc in third package. I need > to visit the variable "bs" in Mat struct. In hypre.c, this variable > may let BoomerAMG know the block size of Mat. The code is as follows: > > 127: /* special case for BoomerAMG */ > 128: if (jac->setup == HYPRE_BoomerAMGSetup) { > 129: MatGetBlockSize(pc->pmat,&bs); > 130: if (bs > 1) { > 131: HYPRE_BoomerAMGSetNumFunctions(jac->hsolver,bs); > 132: } > 133: }; > > However, I can't visit this variable. Now, I have get the pointer of > PC I use. I can't visit the variable pmat in my code. I can't find > any function to realize this function from PETSc manual. > Could you give me some advice about how to do? > > Merry X'mas! 
> > Regards, > Yujie > From billy at dem.uminho.pt Sat Dec 29 17:56:20 2007 From: billy at dem.uminho.pt (Billy Araújo) Date: Sat, 29 Dec 2007 23:56:20 -0000 Subject: Maintaining accuracy while increasing number of processors Message-ID: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> Hi, I need to know more about the PETSc parallel GMRES solver. Does the solver maintain the same accuracy independent of the number of processors? For example, if I subdivide a mesh with 1000 unknowns across 10, 100, or 1000 processors, should I expect to always get the same result? If no, why not? Are there any studies on this? Thank you, Billy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Dec 29 19:00:56 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 29 Dec 2007 19:00:56 -0600 Subject: Maintaining accuracy while increasing number of processors In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> Message-ID: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> Billy, By default GMRES and most of the other KSP solvers stop after a reduction in the 2-norm of the PRECONDITIONED residual by a factor of 10^-5. See the manual page for KSPDefaultConverged() http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html There are a couple of things to consider: 1) even with the exact same preconditioner (for example Jacobi) the convergence history will be slightly different since the computations are done in a different order and so the floating point results will be slightly different. The converged SOLUTIONS for a different number of processes are ALL correct, even though they have different values since the calculations are done in floating point. 
As you decrease the tolerance factors, you will see the SOLUTIONS for different numbers of processes all converge to the same answer (i.e. the solutions will share more and more significant digits.) 2) Most parallel preconditioners (even in exact precision) are different for a different number of processes, for example block Jacobi and the additive Schwarz method. So you get all the issues of 1) plus the fact that the convergence histories with different numbers of processes will be different. Again IF the solver is converging then the answers from any number of processes are equally correct. Also as you decrease the convergence tolerances you will see more and more common significant digits in the different solutions. Sometimes with a larger number of processes the preconditioner may stop working and you do not get convergence of GMRES and then, of course, the "answer" is garbage. You should always call KSPGetConvergedReason() to make sure the solver has converged. Barry On Dec 29, 2007, at 5:56 PM, Billy Araújo wrote: > > Hi, > > I need to know more about the PETSc parallel GMRES solver. Does the > solver maintain the same accuracy independent of the number of > processors. For example, if I subdivide a mesh with 1000 unkowns > into 10, 100, 1000 processors should I expect to get always the same > result? If no, why not? Are there any studies on this? > > Thank you, > > Billy. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Sat Dec 29 20:07:11 2007 From: vijay.m at gmail.com (Vijay M) Date: Sat, 29 Dec 2007 20:07:11 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> Message-ID: <000601c84a88$b1f54cb0$203010ac@neutrino> Hi all, I was trying to compile and run the ex20.c example code in the tutorial section of SNES. 
Although it does not explicitly specify that -snes_mf option can be used, my understanding is that as long as a nonlinear residual function is written correctly, PETSc will calculate via finite difference the action of the Jacobian on a given vector. Is that correct ? Now if that is the case, then please observe the discrepancy in the number of linear iterations taken with an analytical Jacobian and matrix-free option. What puzzles me is that the SNES function norms are quite close for both methods but the linear iterations differ by a factor of 3. Why exactly is this ? Here's the output to make this clearer. vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.881516100891e-02 2 SNES Function norm 1.813939751552e-02 3 SNES Function norm 2.354176462207e-03 4 SNES Function norm 3.063728077362e-05 5 SNES Function norm 3.106106268946e-08 6 SNES Function norm 5.344742712545e-12 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.881516100891e-02 2 SNES Function norm 1.813939751552e-02 3 SNES Function norm 2.354176462207e-03 4 SNES Function norm 3.063728077362e-05 5 SNES Function norm 3.106106268946e-08 6 SNES Function norm 5.344742712545e-12 Number of Newton iterations = 6 Number of Linear iterations = 18 Average Linear its / Newton = 3.000000e+00 vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.870629867542e-02 2 SNES Function norm 1.804335379848e-02 3 SNES Function norm 2.290074339682e-03 4 SNES Function norm 3.082384186373e-05 5 SNES Function norm 3.926396277038e-09 6 SNES Function norm 3.754922566585e-16 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.870629867542e-02 2 SNES Function norm 1.804335379848e-02 3 SNES Function norm 2.290074339682e-03 4 SNES Function norm 3.082384186373e-05 5 SNES Function norm 3.926396277038e-09 6 SNES Function norm 3.754922566585e-16 Number of Newton iterations = 6
Number of Linear iterations = 54 Average Linear its / Newton = 9.000000e+00 Thanks, Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Dec 29 21:05:26 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 29 Dec 2007 21:05:26 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000601c84a88$b1f54cb0$203010ac@neutrino> References: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> <000601c84a88$b1f54cb0$203010ac@neutrino> Message-ID: On Dec 29, 2007 8:07 PM, Vijay M wrote: > Hi all, > > I was trying to compile and run the ex20.c example code in the tutorial > section of SNES. Although it does not explicitly specify that ?snes_mf > option can be used, my understanding is that as long as a nonlinear residual > function is written correctly, PETSc will calculate via finite difference > the action of the Jacobian on a given vector. Is that correct ? Yes. > Now if that is the case, then please observe the discrepancy in the number > of linear iterations taken with an analytical Jacobian and matrix-free > option. What puzzles me is that the SNES function norm are quite close for > both the methods but the linear iterations differ by a factor of 3. Why > exactly is this ? There is no PC when using -snes_mf whereas the default is ILU for the analytic Jacobian. Matt > Here's the output to make this clearer. 
> > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > Number of Newton iterations = 6 > > Number of Linear iterations = 18 > > Average Linear its / Newton = 3.000000e+00 > > > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > Number of Newton iterations = 6 > > Number of Linear iterations = 54 > > Average Linear its / Newton = 9.000000e+00 > > > > Thanks, > > Vijay > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From vijay.m at gmail.com Sun Dec 30 12:44:24 2007 From: vijay.m at gmail.com (Vijay M) Date: Sun, 30 Dec 2007 12:44:24 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: Message-ID: <000001c84b14$00ed25f0$6c00a8c0@neutrino> Matt, Thanks for the reply. What you suggested makes sense and so to start from a common ground, I used no preconditioner at all in both the J-free and analytical Jacobian cases. But now, interestingly, the analytical Jacobian takes around twice the number of linear iterations. mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -pc_type none Number of Newton iterations = 6 Number of Linear iterations = 112 Average Linear its / Newton = 1.866667e+01 mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf -pc_type none Number of Newton iterations = 6 Number of Linear iterations = 54 Average Linear its / Newton = 9.000000e+00 I understand that both the methods will not give me the same number of total linear iterations but a factor of 2 seems a little odd to me. This leads to another question whether the user can actually change the epsilon used for computing the perturbation in J-free scheme or is this fixed in PETSc ? If not, then what do you think is the reason for this ? Do let me know your comments when you get some time. Thanks. Vijay -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Saturday, December 29, 2007 9:05 PM To: petsc-users at mcs.anl.gov Subject: Re: Matrix free example snes/ex20.c On Dec 29, 2007 8:07 PM, Vijay M wrote: > Hi all, > > I was trying to compile and run the ex20.c example code in the tutorial > section of SNES. Although it does not explicitly specify that -snes_mf > option can be used, my understanding is that as long as a nonlinear residual > function is written correctly, PETSc will calculate via finite difference > the action of the Jacobian on a given vector. Is that correct ? Yes. 
> Now if that is the case, then please observe the discrepancy in the number > of linear iterations taken with an analytical Jacobian and matrix-free > option. What puzzles me is that the SNES function norm are quite close for > both the methods but the linear iterations differ by a factor of 3. Why > exactly is this ? There is no PC when using -snes_mf whereas the default is ILU for the analytic Jacobian. Matt > Here's the output to make this clearer. > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > Number of Newton iterations = 6 > > Number of Linear iterations = 18 > > Average Linear its / Newton = 3.000000e+00 > > > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > Number of Newton iterations = 6 > > Number of Linear 
iterations = 54 > > Average Linear its / Newton = 9.000000e+00 > > > > Thanks, > > Vijay > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Sun Dec 30 13:46:14 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Dec 2007 13:46:14 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000001c84b14$00ed25f0$6c00a8c0@neutrino> References: <000001c84b14$00ed25f0$6c00a8c0@neutrino> Message-ID: <81FBC8B4-77DE-453B-A226-CDFB8D10C1B6@mcs.anl.gov> On Dec 30, 2007, at 12:44 PM, Vijay M wrote: > Matt, > > Thanks for the reply. What you suggested makes sense and so to start > from a > common ground, I used no preconditioner at all in both the J-free and > analytical Jacobian cases. But now, interestingly, the analytical > Jacobian > takes around twice the number of linear iterations. > > mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -pc_type none > Number of Newton iterations = 6 > Number of Linear iterations = 112 > Average Linear its / Newton = 1.866667e+01 > > mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf -pc_type none > Number of Newton iterations = 6 > Number of Linear iterations = 54 > Average Linear its / Newton = 9.000000e+00 > > I understand that both the methods will not give me the same number > of total > linear iterations but a factor of 2 seems a little odd to me. Yes, this is surprising. Run with -ksp_monitor; how is the linear convergence different? > This leads to > another question whether the user can actually change the epsilon > used for > computing the perturbation in J-free scheme or is this fixed in > PETSc ? Yes, see the manual page for MatMFFDSetFromOptions() and related manual pages. > > > If not, then what do you think is the reason for this ? Bug in your analytic Jacobian? Run with -snes_monitor and -ksp_monitor and send all output.
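Conceptually, what -snes_mf does is one extra residual evaluation per Krylov vector: J(u)v is approximated as (F(u + h*v) - F(u))/h, so the Jacobian is never formed. A toy sketch in plain Python -- the residual F and the fixed step h below are invented for illustration; PETSc's matrix-free machinery chooses the differencing parameter adaptively (see MatMFFDSetFromOptions()):

```python
def F(u):
    """Toy nonlinear residual (a stand-in for the user's SNES function)."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]

def jacobian_action(F, u, v, h=1e-7):
    """Approximate J(u)*v by finite-differencing the residual:
    J(u)v ~= (F(u + h*v) - F(u)) / h, without ever forming J."""
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / h for fp, f0 in zip(Fp, Fu)]

u = [1.0, 2.0]
v = [1.0, 0.0]
approx = jacobian_action(F, u, v)
# The exact Jacobian here is [[2*u0, 1], [1, 3*u1**2]], so J(u)*v = [2, 1].
print(approx)
```

The choice of h trades truncation error against rounding error, which is exactly the "epsilon" Vijay asks about; in PETSc it is tunable rather than fixed.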
Barry > Do let me know your > comments when you get some time. Thanks. > > Vijay > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] > On Behalf Of Matthew Knepley > Sent: Saturday, December 29, 2007 9:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Matrix free example snes/ex20.c > > On Dec 29, 2007 8:07 PM, Vijay M wrote: >> Hi all, >> >> I was trying to compile and run the ex20.c example code in the >> tutorial >> section of SNES. Although it does not explicitly specify that - >> snes_mf >> option can be used, my understanding is that as long as a nonlinear > residual >> function is written correctly, PETSc will calculate via finite >> difference >> the action of the Jacobian on a given vector. Is that correct ? > > Yes. > >> Now if that is the case, then please observe the discrepancy in the >> number >> of linear iterations taken with an analytical Jacobian and matrix- >> free >> option. What puzzles me is that the SNES function norm are quite >> close for >> both the methods but the linear iterations differ by a factor of 3. >> Why >> exactly is this ? > > There is no PC when using -snes_mf whereas the default is ILU for the > analytic > Jacobian. > > Matt > >> Here's the output to make this clearer. 
>> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 18 >> >> Average Linear its / Newton = 3.000000e+00 >> >> >> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 54 >> >> Average Linear its / Newton = 9.000000e+00 >> >> >> >> Thanks, >> >> Vijay >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > From vijay.m at gmail.com Sun Dec 30 14:19:32 2007 From: vijay.m at gmail.com (Vijay M) Date: Sun, 30 Dec 2007 14:19:32 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <81FBC8B4-77DE-453B-A226-CDFB8D10C1B6@mcs.anl.gov> Message-ID: <000001c84b21$4b6b8240$163010ac@neutrino> I ran both the cases with -ksp_monitor on and have attached the output in two different files. 1.txt is the Jfree case and 2.txt is the analytical case. Barry, the example problem is ex20 from the snes tutorial directory. The petsc version is 2.3.3-p7 if that helps to clear things a little. Now I haven't yet completely checked for a bug in the analytical Jacobian but I would imagine that if it were incorrect, wouldn't that affect only how the nonlinear iteration converges and not the linear iteration since the matrix sparsity structure is still the same (well assuming the condition number is not very different from the exact Jacobian !). Just my 2 cents. Anyway, I will look into the code for ex20 and then see if something is messed up. Let me know if you find out the problem from the output. Thanks, Vijay > I understand that both the methods will not give me the same number > of total > linear iterations but a factor of 2 seems a little odd to me. Yes, this is surprising. Run with -ksp_monitor how are the linear convergence different? > This leads to > another question whether the user can actually change the epsilon > used for > computing the perturbation in J-free scheme or is this fixed in > PETSc ? Yes, see the manual page for MatMFFDSetFromOptions() and related manual pages. > > > If not, then what do you think is the reason for this ? Bug in your analytic Jacobian? Run with -snes_monitor and - ksp_monitor and send all output. Barry > Do let me know your > comments when you get some time. Thanks. 
> > Vijay > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] > On Behalf Of Matthew Knepley > Sent: Saturday, December 29, 2007 9:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Matrix free example snes/ex20.c > > On Dec 29, 2007 8:07 PM, Vijay M wrote: >> Hi all, >> >> I was trying to compile and run the ex20.c example code in the >> tutorial >> section of SNES. Although it does not explicitly specify that - >> snes_mf >> option can be used, my understanding is that as long as a nonlinear > residual >> function is written correctly, PETSc will calculate via finite >> difference >> the action of the Jacobian on a given vector. Is that correct ? > > Yes. > >> Now if that is the case, then please observe the discrepancy in the >> number >> of linear iterations taken with an analytical Jacobian and matrix- >> free >> option. What puzzles me is that the SNES function norm are quite >> close for >> both the methods but the linear iterations differ by a factor of 3. >> Why >> exactly is this ? > > There is no PC when using -snes_mf whereas the default is ILU for the > analytic > Jacobian. > > Matt > >> Here's the output to make this clearer. 
>> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 18 >> >> Average Linear its / Newton = 3.000000e+00 >> >> >> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 54 >> >> Average Linear its / Newton = 9.000000e+00 >> >> >> >> Thanks, >> >> Vijay >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 1.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 2.txt URL: From bsmith at mcs.anl.gov Sun Dec 30 16:51:20 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Dec 2007 16:51:20 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000001c84b21$4b6b8240$163010ac@neutrino> References: <000001c84b21$4b6b8240$163010ac@neutrino> Message-ID: Vijay, This is a very cool problem. Because of the exact symmetry of the domain the EXACT Jacobian at each step has exactly 9 different eigenvalues. This means the GMRES will take exactly 9 iterations (and "completely" converge in the ninth iteration) if the "exact" Jacobian is used. You can run with -pc_type none -snes_mf -ksp_monitor_singular_value -ksp_plot_eigenvalues -display :0.0 -draw_pause -1 to see the 9 eigenvalues. Now run without the -snes_mf option. You will see the first Newton iteration's eigenvalues still look like 9; but starting at the second Newton iteration the "identical" eigenvalues are now not all identically placed so GMRES needs more iterations. The question then becomes how come the matrix-free application of the Jacobian is more accurate than actually computing it as a sparse matrix then applying it? Here is my non-rigorous answer; the multiplication of the sparse matrix values (even if very accurate) against the vector introduces some rounding error that screws up the eigenvalues slightly. For some reason for this problem the matrix-free application is accurate enough not to perturb the eigenvalues. Barry On Dec 30, 2007, at 2:19 PM, Vijay M wrote: > I ran both the cases with -ksp_monitor on and have attached the > output in > two different files. 1.txt is the Jfree case and 2.txt is the > analytical > case. > > Barry, the example problem is ex20 from the snes tutorial directory.
> The > petsc version is 2.3.3-p7 if that helps to clear things a little. > Now I > haven't yet completely checked for a bug in the analytical Jacobian > but I > would imagine that if it were incorrect, wouldn't that affect only > how the > nonlinear iteration converges and not the linear iteration since the > matrix > sparsity structure is still the same (well assuming the condition > number is > not very different from the exact Jacobian !). Just my 2 cents. > > Anyway, I will look into the code for ex20 and then see if something > is > messed up. Let me know if you find out the problem from the output. > > Thanks, > Vijay > >> I understand that both the methods will not give me the same number >> of total >> linear iterations but a factor of 2 seems a little odd to me. > > Yes, this is surprising. > > Run with -ksp_monitor how are the linear convergence different? > >> This leads to >> another question whether the user can actually change the epsilon >> used for >> computing the perturbation in J-free scheme or is this fixed in >> PETSc ? > > Yes, see the manual page for MatMFFDSetFromOptions() and related > manual > pages. > >> >> >> If not, then what do you think is the reason for this ? > > Bug in your analytic Jacobian? Run with -snes_monitor and - > ksp_monitor and > send all output. > > Barry > >> Do let me know your >> comments when you get some time. Thanks. >> >> Vijay >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov >> ] >> On Behalf Of Matthew Knepley >> Sent: Saturday, December 29, 2007 9:05 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: Matrix free example snes/ex20.c >> >> On Dec 29, 2007 8:07 PM, Vijay M wrote: >>> Hi all, >>> >>> I was trying to compile and run the ex20.c example code in the >>> tutorial >>> section of SNES. 
Although it does not explicitly specify that - >>> snes_mf >>> option can be used, my understanding is that as long as a nonlinear >> residual >>> function is written correctly, PETSc will calculate via finite >>> difference >>> the action of the Jacobian on a given vector. Is that correct ? >> >> Yes. >> >>> Now if that is the case, then please observe the discrepancy in the >>> number >>> of linear iterations taken with an analytical Jacobian and matrix- >>> free >>> option. What puzzles me is that the SNES function norm are quite >>> close for >>> both the methods but the linear iterations differ by a factor of 3. >>> Why >>> exactly is this ? >> >> There is no PC when using -snes_mf whereas the default is ILU for the >> analytic >> Jacobian. >> >> Matt >> >>> Here's the output to make this clearer. >>> >>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.881516100891e-02 >>> >>> 2 SNES Function norm 1.813939751552e-02 >>> >>> 3 SNES Function norm 2.354176462207e-03 >>> >>> 4 SNES Function norm 3.063728077362e-05 >>> >>> 5 SNES Function norm 3.106106268946e-08 >>> >>> 6 SNES Function norm 5.344742712545e-12 >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.881516100891e-02 >>> >>> 2 SNES Function norm 1.813939751552e-02 >>> >>> 3 SNES Function norm 2.354176462207e-03 >>> >>> 4 SNES Function norm 3.063728077362e-05 >>> >>> 5 SNES Function norm 3.106106268946e-08 >>> >>> 6 SNES Function norm 5.344742712545e-12 >>> >>> Number of Newton iterations = 6 >>> >>> Number of Linear iterations = 18 >>> >>> Average Linear its / Newton = 3.000000e+00 >>> >>> >>> >>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.870629867542e-02 >>> >>> 2 SNES Function norm 1.804335379848e-02 >>> >>> 3 SNES Function norm 2.290074339682e-03 >>> >>> 4 SNES Function norm 
3.082384186373e-05 >>> >>> 5 SNES Function norm 3.926396277038e-09 >>> >>> 6 SNES Function norm 3.754922566585e-16 >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.870629867542e-02 >>> >>> 2 SNES Function norm 1.804335379848e-02 >>> >>> 3 SNES Function norm 2.290074339682e-03 >>> >>> 4 SNES Function norm 3.082384186373e-05 >>> >>> 5 SNES Function norm 3.926396277038e-09 >>> >>> 6 SNES Function norm 3.754922566585e-16 >>> >>> Number of Newton iterations = 6 >>> >>> Number of Linear iterations = 54 >>> >>> Average Linear its / Newton = 9.000000e+00 >>> >>> >>> >>> Thanks, >>> >>> Vijay >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > <1.txt><2.txt> From vijay.m at gmail.com Mon Dec 31 18:06:17 2007 From: vijay.m at gmail.com (Vijay M) Date: Mon, 31 Dec 2007 18:06:17 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: Message-ID: <000001c84c0a$222dccf0$6e00a8c0@neutrino> Barry, Thanks for the detailed explanation. That sure is a tricky and interesting problem. I did run the problem with the options you suggested and see what you mean. I just have one another question though that is not quite related to the example: Say when you do J-free Newton-Krylov iteration, then is it correct to say that the F.D calculation of the action of Jacobian on a vector is more accurate than using a numerical Jacobian (not analytical) found at the start of a Newton iteration ? Because even though in both cases, the Jacobian is technically found by perturbation about the last Newton iteration, it seems to me that there is some gain in this convergence respect with J-free immaterial of the problem being solved. Now is that confusing or am I making sense ? I'll be glad to explain more on that and awaiting to hear your comments. Well, happy new year to you Barry and all the PETSc team !! 
Cheers, Vijay -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Sunday, December 30, 2007 4:51 PM To: petsc-users at mcs.anl.gov Subject: Re: Matrix free example snes/ex20.c Vijay, This is a very cool problem. Because of the exact symmetry of the domain the EXACT Jacobian at each step has exactly 9 different eigenvalues. This means the GMRES will take exactly 9 iterations (and "completely" converge in the ninth iteration) if the "exact" Jacobian is used. You can run with -pc_type none -snes_mf -ksp_monitor_singular_value - ksp_plot_eigenvalues -display :0.0 -draw_pause -1 to see the 9 eigenvalues. Now run without the -snes_mf option. You will see the first Newton iteration's eigenvalues still look like 9; but starting at the second Newton iteration the "identical" eigenvalues are now not all identically placed so GMRES needs more iterations. The question then becomes how come the matrix-free application of the Jacobian is more accurate than actually computing it as a sparse matrix then applying it? Here is my non-rigorous answer; the multiplication of the sparse matrix values (even if very accurate) against the vector introduces some rounding error that screws up the eigenvalues slightly. For some reason for this problem the matrix-free application is accurate enough not to perturb the eigenvalues. Barry On Dec 30, 2007, at 2:19 PM, Vijay M wrote: > I ran both the cases with -ksp_monitor on and have attached the > output in > two different files. 1.txt is the Jfree case and 2.txt is the > analytical > case. > > Barry, the example problem is ex20 from the snes tutorial directory. > The > petsc version is 2.3.3-p7 if that helps to clear things a little. 
> Now I haven't yet completely checked for a bug in the analytical
> Jacobian, but I would imagine that if it were incorrect, it would only
> affect how the nonlinear iteration converges and not the linear
> iteration, since the matrix sparsity structure is still the same (well,
> assuming the condition number is not very different from that of the
> exact Jacobian!). Just my 2 cents.
>
> Anyway, I will look into the code for ex20 and see if something is
> messed up. Let me know if you find out the problem from the output.
>
> Thanks,
> Vijay
>
>> I understand that the two methods will not give me the same total
>> number of linear iterations, but a factor of 2 seems a little odd to me.
>
> Yes, this is surprising.
>
> Run with -ksp_monitor; how does the linear convergence differ?
>
>> This leads to another question: can the user actually change the
>> epsilon used for computing the perturbation in the J-free scheme, or is
>> this fixed in PETSc?
>
> Yes, see the manual page for MatMFFDSetFromOptions() and related manual
> pages.
>
>> If not, then what do you think is the reason for this?
>
> Bug in your analytic Jacobian? Run with -snes_monitor and -ksp_monitor
> and send all output.
>
> Barry
>
>> Do let me know your comments when you get some time. Thanks.
>>
>> Vijay
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov]
>> On Behalf Of Matthew Knepley
>> Sent: Saturday, December 29, 2007 9:05 PM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: Matrix free example snes/ex20.c
>>
>> On Dec 29, 2007 8:07 PM, Vijay M wrote:
>>> Hi all,
>>>
>>> I was trying to compile and run the ex20.c example code in the
>>> tutorial section of SNES. Although it does not explicitly state that
>>> the -snes_mf option can be used, my understanding is that as long as
>>> the nonlinear residual function is written correctly, PETSc will
>>> compute the action of the Jacobian on a given vector via finite
>>> differences. Is that correct?
>>
>> Yes.
>>
>>> Now if that is the case, then please observe the discrepancy in the
>>> number of linear iterations taken with the analytical Jacobian and the
>>> matrix-free option. What puzzles me is that the SNES function norms
>>> are quite close for both methods, but the linear iterations differ by
>>> a factor of 3. Why exactly is this?
>>
>> There is no PC when using -snes_mf, whereas the default is ILU for the
>> analytic Jacobian.
>>
>> Matt
>>
>>> Here's the output to make this clearer.
>>>
>>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor
>>>
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.881516100891e-02
>>> 2 SNES Function norm 1.813939751552e-02
>>> 3 SNES Function norm 2.354176462207e-03
>>> 4 SNES Function norm 3.063728077362e-05
>>> 5 SNES Function norm 3.106106268946e-08
>>> 6 SNES Function norm 5.344742712545e-12
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.881516100891e-02
>>> 2 SNES Function norm 1.813939751552e-02
>>> 3 SNES Function norm 2.354176462207e-03
>>> 4 SNES Function norm 3.063728077362e-05
>>> 5 SNES Function norm 3.106106268946e-08
>>> 6 SNES Function norm 5.344742712545e-12
>>>
>>> Number of Newton iterations = 6
>>> Number of Linear iterations = 18
>>> Average Linear its / Newton = 3.000000e+00
>>>
>>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf
>>>
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.870629867542e-02
>>> 2 SNES Function norm 1.804335379848e-02
>>> 3 SNES Function norm 2.290074339682e-03
>>> 4 SNES Function norm 3.082384186373e-05
>>> 5 SNES Function norm 3.926396277038e-09
>>> 6 SNES Function norm 3.754922566585e-16
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.870629867542e-02
>>> 2 SNES Function norm 1.804335379848e-02
>>> 3 SNES Function norm 2.290074339682e-03
>>> 4 SNES Function norm 3.082384186373e-05
>>> 5 SNES Function norm 3.926396277038e-09
>>> 6 SNES Function norm 3.754922566585e-16
>>>
>>> Number of Newton iterations = 6
>>> Number of Linear iterations = 54
>>> Average Linear its / Newton = 9.000000e+00
>>>
>>> Thanks,
>>>
>>> Vijay
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>
> <1.txt><2.txt>

From bsmith at mcs.anl.gov Mon Dec 31 18:08:29 2007
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 31 Dec 2007 18:08:29 -0600
Subject: Matrix free example snes/ex20.c
In-Reply-To: <000001c84c0a$222dccf0$6e00a8c0@neutrino>
References: <000001c84c0a$222dccf0$6e00a8c0@neutrino>
Message-ID: <69A26E5B-749B-4055-BFBF-8703B2704F03@mcs.anl.gov>

On Dec 31, 2007, at 6:06 PM, Vijay M wrote:

> Barry,
>
> Thanks for the detailed explanation. That sure is a tricky and
> interesting problem. I did run the problem with the options you
> suggested and see what you mean.
>
> I have one more question, though, that is not quite related to the
> example: when you do a J-free Newton-Krylov iteration, is it correct to
> say that the F.D. calculation of the action of the Jacobian on a vector
> is more accurate than using a numerical Jacobian (not analytical) found
> at the start of a Newton iteration?
> Because even though in both cases the Jacobian is technically found by
> perturbation about the last Newton iterate, it seems to me that there is
> some gain in convergence with J-free, regardless of the problem being
> solved.

I would say no; this is just a fluke thing.

Barry

> Now is that confusing, or am I making sense? I'll be glad to explain
> more on that, and I look forward to your comments.
>
> Well, happy new year to you, Barry, and all the PETSc team!!
>
> Cheers,
> Vijay
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov]
> On Behalf Of Barry Smith
> Sent: Sunday, December 30, 2007 4:51 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Matrix free example snes/ex20.c
>
> Vijay,
>
> This is a very cool problem.
>
> Because of the exact symmetry of the domain, the EXACT Jacobian at each
> step has exactly 9 different eigenvalues. This means GMRES will take
> exactly 9 iterations (and "completely" converge in the ninth iteration)
> if the "exact" Jacobian is used. You can run with
> -pc_type none -snes_mf -ksp_monitor_singular_value -ksp_plot_eigenvalues -display :0.0 -draw_pause -1
> to see the 9 eigenvalues.
>
> Now run without the -snes_mf option. You will see that the first Newton
> iteration's eigenvalues still look like 9, but starting at the second
> Newton iteration the "identical" eigenvalues are no longer all
> identically placed, so GMRES needs more iterations.
>
> The question then becomes: how come the matrix-free application of the
> Jacobian is more accurate than actually computing it as a sparse matrix
> and then applying it? Here is my non-rigorous answer: the multiplication
> of the sparse matrix values (even if very accurate) against the vector
> introduces some rounding error that screws up the eigenvalues slightly.
> For some reason, for this problem, the matrix-free application is
> accurate enough not to perturb the eigenvalues.
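The eigenvalue-count argument above rests on a standard Krylov-method fact: if a diagonalizable matrix has k distinct eigenvalues, its minimal polynomial has degree k, so the Krylov subspace span{b, Ab, A^2 b, ...} stops growing after k vectors and unpreconditioned GMRES converges in at most k iterations. A minimal NumPy sketch of that fact (a toy 20x20 matrix with made-up eigenvalues 1, 2, and 5; illustrative only, not PETSc code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Build a symmetric 20x20 matrix with only 3 distinct eigenvalues: 1, 2, 5.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.array([1.0] * 8 + [2.0] * 8 + [5.0] * 4)
A = Q @ np.diag(eigs) @ Q.T

b = rng.standard_normal(n)

# With k = 3 distinct eigenvalues the minimal polynomial is
# (x-1)(x-2)(x-5) = x^3 - 8x^2 + 17x - 10, so A^3 b is an exact linear
# combination of {b, Ab, A^2 b}: the Krylov space stops growing after
# 3 vectors, and unpreconditioned GMRES terminates in 3 iterations.
v1 = A @ b
v2 = A @ v1
v3 = A @ v2
residual = v3 - (8 * v2 - 17 * v1 + 10 * b)
print(np.linalg.norm(residual) / np.linalg.norm(v3))  # tiny, at rounding level
```

With k = 9, as for the exact Jacobian of ex20, the same argument gives the nine GMRES iterations described above; perturbing the matrix entries splits the repeated eigenvalues and costs extra iterations.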
> Barry
>
> On Dec 30, 2007, at 2:19 PM, Vijay M wrote:
>
>> I ran both the cases with -ksp_monitor on and have attached the output
>> in two different files. 1.txt is the J-free case and 2.txt is the
>> analytical case.
>>
>> Barry, the example problem is ex20 from the snes tutorial directory.
>> The petsc version is 2.3.3-p7, if that helps to clear things up a
>> little. Now I haven't yet completely checked for a bug in the
>> analytical Jacobian, but I would imagine that if it were incorrect, it
>> would only affect how the nonlinear iteration converges and not the
>> linear iteration, since the matrix sparsity structure is still the same
>> (well, assuming the condition number is not very different from that of
>> the exact Jacobian!). Just my 2 cents.
>>
>> Anyway, I will look into the code for ex20 and see if something is
>> messed up. Let me know if you find out the problem from the output.
>>
>> Thanks,
>> Vijay
>>
>> [...]
>>
> <1.txt><2.txt>
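Throughout this thread, -snes_mf stands for approximating the action of the Jacobian by a one-sided difference of the residual, J(u)v = (F(u + hv) - F(u)) / h. The sketch below is a toy illustration in plain Python, not PETSc source: the residual F, its analytic Jacobian, and the step-size heuristic are all made up for the example (the heuristic is only similar in spirit to PETSc's MatMFFD defaults, which MatMFFDSetFromOptions() controls):

```python
import numpy as np

def F(u):
    """Toy nonlinear residual (made up): F_i(u) = u_i^2 + sum(u) - 1."""
    return u ** 2 + np.sum(u) - 1.0

def analytic_jacobian(u):
    """Exact Jacobian of the toy F: diag(2u) plus an all-ones matrix."""
    n = u.size
    return np.diag(2.0 * u) + np.ones((n, n))

def jacobian_vector_fd(F, u, v):
    """Matrix-free J(u) @ v via a one-sided finite difference.

    The step h ~ sqrt(machine eps) * (1 + ||u||) / ||v|| is a common
    heuristic that balances truncation against rounding error."""
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return np.zeros_like(u)
    h = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(u)) / nv
    return (F(u + h * v) - F(u)) / h

u = np.array([0.3, -0.2, 0.7])
v = np.array([1.0, 2.0, -1.0])

exact = analytic_jacobian(u) @ v
approx = jacobian_vector_fd(F, u, v)
# Relative error is small, on the order of sqrt(machine eps).
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

Shrinking h reduces truncation error but amplifies rounding error in the subtraction F(u + hv) - F(u); the sqrt(machine eps) scaling roughly balances the two, which is why the differenced product can be accurate enough to compete with an explicitly assembled Jacobian-vector multiply, as the thread observes.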