From devteam at transvalor.com Wed May 3 08:34:56 2006 From: devteam at transvalor.com (DevTeam) Date: Wed, 3 May 2006 15:34:56 +0200 Subject: how to get the best solution Message-ID: <006101c66eb6$5fe2a530$089621c0@dev> Hi all, I'm a newbie with PetSc so sorry if it is a well known question (I've seen nothing in the mailing archive). I have a linear system wich is qui ill conditionned. So whatever solver and preconditionner I use convergence is a bit chaotic and it use to quit the solver when the maximum number of iteration has been reached with the last residual. Last residual that is never the best residual found during the whole convergence path. Is there a way to get this best solution from PetSc ? Thanks, Etienne Perchat -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 3 14:22:24 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 May 2006 14:22:24 -0500 (CDT) Subject: how to get the best solution In-Reply-To: <006101c66eb6$5fe2a530$089621c0@dev> References: <006101c66eb6$5fe2a530$089621c0@dev> Message-ID: Etienne, One thing to note, most PETSc KSP methods by default use the norm of the PRECONDITIONED residual (as opposed to the true b - Ax residual). So just picking the solution with the best "residual" may not be what you want. See the manual pages for KSPSetNormType() and KSPSetPreconditionerSide(). Also, GMRES for example, only computes an "estimate" of the residual, which often gets wrong for very ill-conditioned problems. To try to answer your question: You can use KSPSetMonitor() to provide your function that checks the current norm against the best so far. If it is better then the best so far it copies over the solution into a buffer. You can put the "best so far" and buffer inside a C struct that you pass into KSPSetMonitor. Sadly, this may still not be a good solution. GMRES, for example, does not actually COMPUTE the solution at each iteration; it only computes it at restarts and when KSPSolve() ends. You need to call KSPBuildSolution() to get the current solution that you put into the buffer. Alternatively you could call the solver twice, locating the lowest norm in the first run and then collecting the solution from that location in the second run. Again not a great solution. My recommendation for very ill-conditioned problems with no good preconditioner/solver; use a direct method if at all possible. Sparse direct methods can now solve upwards of a million unknowns. Yes it is slow and uses LOTS of memory but usually it works. Barry On Wed, 3 May 2006, DevTeam wrote: > Hi all, > I'm a newbie with PetSc so sorry if it is a well known question (I've seen nothing in the mailing archive). > > I have a linear system wich is qui ill conditionned. So whatever solver and preconditionner I use convergence is a bit chaotic > and it use to quit the solver when the maximum number of iteration has been reached with the last residual. > > Last residual that is never the best residual found during the whole convergence path. Is there a way to get this best solution from PetSc ? > > Thanks, > Etienne Perchat > From pbauman at ices.utexas.edu Wed May 3 16:02:57 2006 From: pbauman at ices.utexas.edu (Paul T. Bauman) Date: Wed, 03 May 2006 16:02:57 -0500 Subject: Two questions regarding SNES/KSP Message-ID: <44591A81.3040005@ices.utexas.edu> Hello, I'm a new PETSc user, so if the questions I ask are redundant, I sincerely apologize. I was, however, unavailable to find the following information in the manual pages. 
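A minimal sketch of the monitor Barry describes above, written against the 2.3.x interface (the call is named KSPMonitorSet() in later releases). The struct and function names (BestSoFar, TrackBest) are made up for illustration, and the norm handed to the monitor is whatever norm the KSP is using, by default the preconditioned residual norm (see KSPSetNormType()).

------------------------------------------------------------------
#include "petscksp.h"

typedef struct {
  PetscReal best_rnorm;   /* smallest residual norm seen so far */
  Vec       best_x;       /* copy of the corresponding iterate  */
} BestSoFar;

PetscErrorCode TrackBest(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
{
  BestSoFar *best = (BestSoFar*)ctx;
  if (rnorm < best->best_rnorm) {
    Vec x;
    best->best_rnorm = rnorm;
    /* GMRES does not form the iterate every step, so ask the KSP for it */
    KSPBuildSolution(ksp, best->best_x, &x);
  }
  return 0;
}

/* registration, after KSPCreate()/KSPSetOperators():
 *   BestSoFar ctx = {1.e30, x_best};     x_best from VecDuplicate()
 *   KSPSetMonitor(ksp, TrackBest, &ctx, PETSC_NULL);                  */
------------------------------------------------------------------

For the direct-solver fallback Barry recommends, -ksp_type preonly -pc_type lu on the command line is usually all that is needed, provided the factorization fits in memory.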
I have two questions. First: When using the SNES package, is the initial guess for the /iterative linear solver/ (e.g. conjugate gradient) 0.0 or is it something else (e.g. the current Newton solution)? Second: I see that PETSc checks the preconditioner matrix for positive definiteness, but does it check whether the current solution iterate is positive def. (i.e. if x_i is the current solution iterate and A is the current Jacobian matrix, does the algorithm check if (x_i)^T * A * (x_i) > 0)? Thanks so much for your help. Best Regards, Paul From knepley at gmail.com Wed May 3 16:14:05 2006 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 May 2006 16:14:05 -0500 Subject: Two questions regarding SNES/KSP In-Reply-To: <44591A81.3040005@ices.utexas.edu> References: <44591A81.3040005@ices.utexas.edu> Message-ID: On 5/3/06, Paul T. Bauman wrote: > > Hello, > > I'm a new PETSc user, so if the questions I ask are redundant, I > sincerely apologize. I was, however, unavailable to find the following > information in the manual pages. > > I have two questions. > > First: When using the SNES package, is the initial guess for the > /iterative linear solver/ (e.g. conjugate gradient) 0.0 or is it > something else (e.g. the current Newton solution)? If you want that, you need to pull out the KSP from the SNES: http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/SNES/SNESGetKSP.html and then flip the flag in the KSP: http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetInitialGuessNonzero.html Second: I see that PETSc checks the preconditioner matrix for positive > definiteness, but does it check whether the current solution iterate is > positive def. (i.e. if x_i is the current solution iterate and A is the > current Jacobian matrix, does the algorithm check if (x_i)^T * A * (x_i) > > 0)? We cannot check that the preconditioner is positive definite. What we actually check it that the A-inner product calculated at each iteration is positive. Matt Thanks so much for your help. > > Best Regards, > > Paul > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From abdul-rahman at tu-harburg.de Thu May 4 06:47:01 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Thu, 4 May 2006 13:47:01 +0200 (METDST) Subject: help in capturing matrix patterns to files Message-ID: Hi all, I'd appreciate if anyone can point to a tutorial/example on how to direct the matrix pattern plot (one with -mat_view_draw) to a file (preferrably PNG for presentation and postscript for publications) I want to be able to capture the patterns in monochrome and also resize the graphics size. Thanks so much. Razi From knepley at gmail.com Thu May 4 08:47:21 2006 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2006 08:47:21 -0500 Subject: help in capturing matrix patterns to files In-Reply-To: References: Message-ID: There is nothing built into PETSc to do this. It is also X-Windows render commands inside. I used a screen capture program on the Window it pops up. 
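For the first of the two questions above, a sketch of the calls Matt links to in his reply: by default the inner Krylov solve starts from a zero initial guess, and flipping this flag makes it start from whatever is already in the solution vector. The SNES object snes is assumed to be already created and configured.

------------------------------------------------------------------
#include "petscsnes.h"

KSP ksp;
SNESGetKSP(snes, &ksp);                      /* pull the KSP out of the SNES */
KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);  /* reuse the vector's contents  */
------------------------------------------------------------------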
Thanks, Matt On 5/4/06, abdul-rahman at tu-harburg.de wrote: > > Hi all, > > I'd appreciate if anyone can point to a tutorial/example on how to direct > the matrix pattern plot (one with -mat_view_draw) to a file (preferrably > PNG for presentation and postscript for publications) > > I want to be able to capture the patterns in monochrome and also resize > the graphics size. > > Thanks so much. > > > Razi > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From harald at tapir.caltech.edu Thu May 4 10:59:55 2006 From: harald at tapir.caltech.edu (Harald Pfeiffer) Date: Thu, 04 May 2006 08:59:55 -0700 Subject: help in capturing matrix patterns to files In-Reply-To: References: Message-ID: <445A24FB.7050903@tapir.caltech.edu> Hello, That's a pity that this is not implemented. As soon as I saw Razi's question, I thought that this would be a really useful feature. I have found myself many times saying in talks that my matrix is non-symmetric for this-or-that reason. Having a picture of the matrix to make the point would be very powerful. I guess, outputting a picture with as many pixels as the matrix-size in a non-compressed format (tiff?) should be straightforward: You iterate through all matrix entries, if the entry is non-zero, you output "1", otherwise "0". You could even output different colors depending on the size or sign of an entry. This will result in huge files, but any image-software should be able to down-size and convert to more efficient formats. Harald Matthew Knepley wrote: > There is nothing built into PETSc to do this. It is also X-Windows > render commands inside. I used a screen capture program on > the Window it pops up. > > Thanks, > > Matt > > On 5/4/06, *abdul-rahman at tu-harburg.de > * > wrote: > > Hi all, > > I'd appreciate if anyone can point to a tutorial/example on how to > direct > the matrix pattern plot (one with -mat_view_draw) to a file > (preferrably > PNG for presentation and postscript for publications) > > I want to be able to capture the patterns in monochrome and also > resize > the graphics size. > > Thanks so much. > > > Razi > > > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir > Alec Guiness -- Harald P. Pfeiffer harald at tapir.caltech.edu Theoretical Astrophysics Phone (626) 395-8413 Caltech 130-33 Fax (626) 796-5675 Pasadena, CA 91125, USA From mleung at u.washington.edu Thu May 4 11:25:43 2006 From: mleung at u.washington.edu (Mary Ann Leung) Date: Thu, 4 May 2006 09:25:43 -0700 Subject: help in capturing matrix patterns to files References: <445A24FB.7050903@tapir.caltech.edu> Message-ID: <004301c66f97$6429be80$6500a8c0@reinhardtlt> One thing that I have done is use one of the PETSc MatViewers to write you the matrix nonzero structure and then create a graphic using Matlab's "spy" command. It creates a nice visual representation of the nonzero patterns. ---Mary Ann ----- Original Message ----- From: "Harald Pfeiffer" To: Cc: Sent: Thursday, May 04, 2006 8:59 AM Subject: Re: help in capturing matrix patterns to files > Hello, > > That's a pity that this is not implemented. As soon as I saw Razi's > question, I thought that this would be a really useful feature. I have > found myself many times saying in talks that my matrix is non-symmetric > for this-or-that reason. Having a picture of the matrix to make the point > would be very powerful. 
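A sketch of the MatViewer route Mary Ann describes above, assuming an assembled matrix A and an arbitrary file name; the calls follow the 2.3.x API (PetscViewerDestroy() takes a pointer argument in later releases). Running the resulting file in MATLAB defines a sparse matrix that spy() can plot.

------------------------------------------------------------------
#include "petscmat.h"

PetscViewer viewer;
PetscViewerASCIIOpen(PETSC_COMM_WORLD, "matrix.m", &viewer);
PetscViewerSetFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);
MatView(A, viewer);                    /* A is the assembled matrix */
PetscViewerDestroy(viewer);
------------------------------------------------------------------

For very large matrices, the binary viewer together with the MATLAB helpers shipped with PETSc (PetscBinaryRead.m) is a better fit than ASCII output.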
> > I guess, outputting a picture with as many pixels as the matrix-size in a > non-compressed format (tiff?) should be straightforward: You iterate > through all matrix entries, if the entry is non-zero, you output "1", > otherwise "0". > You could even output different colors depending on the size or sign of an > entry. This will result in huge files, but any image-software should be > able to down-size and convert to more efficient formats. > > Harald > > > Matthew Knepley wrote: >> There is nothing built into PETSc to do this. It is also X-Windows >> render commands inside. I used a screen capture program on >> the Window it pops up. >> >> Thanks, >> >> Matt >> >> On 5/4/06, *abdul-rahman at tu-harburg.de >> * > > wrote: >> >> Hi all, >> >> I'd appreciate if anyone can point to a tutorial/example on how to >> direct >> the matrix pattern plot (one with -mat_view_draw) to a file >> (preferrably >> PNG for presentation and postscript for publications) >> >> I want to be able to capture the patterns in monochrome and also >> resize >> the graphics size. >> >> Thanks so much. >> >> >> Razi >> >> >> >> >> -- >> "Failure has a thousand explanations. Success doesn't need one" -- Sir >> Alec Guiness > > -- > Harald P. Pfeiffer harald at tapir.caltech.edu > Theoretical Astrophysics Phone (626) 395-8413 > Caltech 130-33 Fax (626) 796-5675 > Pasadena, CA 91125, USA > From balay at mcs.anl.gov Thu May 4 11:34:13 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 4 May 2006 11:34:13 -0500 (CDT) Subject: help in capturing matrix patterns to files In-Reply-To: <445A24FB.7050903@tapir.caltech.edu> References: <445A24FB.7050903@tapir.caltech.edu> Message-ID: There are already tools to do get this dump xwd | xpr -device ps > foo.ps [now when the curser changes - click on the window to be dumped] xv [use the grab option, and then save as tiff/jpg/ps format] Perhaps there are other tools that can be used. Hence there is no extra tools in PETSc to do the capture part. Satish On Thu, 4 May 2006, Harald Pfeiffer wrote: > Hello, > > That's a pity that this is not implemented. As soon as I saw Razi's question, > I thought that this would be a really useful feature. I have found myself > many times saying in talks that my matrix is non-symmetric for this-or-that > reason. Having a picture of the matrix to make the point would be very > powerful. > > I guess, outputting a picture with as many pixels as the matrix-size in a > non-compressed format (tiff?) should be straightforward: You iterate through > all matrix entries, if the entry is non-zero, you output "1", otherwise "0". > You could even output different colors depending on the size or sign of an > entry. This will result in huge files, but any image-software should be able > to down-size and convert to more efficient formats. > > Harald > > > Matthew Knepley wrote: > > There is nothing built into PETSc to do this. It is also X-Windows > > render commands inside. I used a screen capture program on > > the Window it pops up. > > > > Thanks, > > > > Matt > > > > On 5/4/06, *abdul-rahman at tu-harburg.de * > > > wrote: > > > > Hi all, > > > > I'd appreciate if anyone can point to a tutorial/example on how to > > direct > > the matrix pattern plot (one with -mat_view_draw) to a file > > (preferrably > > PNG for presentation and postscript for publications) > > > > I want to be able to capture the patterns in monochrome and also > > resize > > the graphics size. > > > > Thanks so much. 
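Harald's iterate-and-print idea quoted above takes only a few lines for a sequential matrix. This sketch writes an ASCII PBM bitmap with one pixel per entry (1 where a value is stored), which common image tools can then downscale or convert; it assumes a sequential assembled matrix A with sorted column indices (true for AIJ) and omits error checking.

------------------------------------------------------------------
#include <stdio.h>
#include "petscmat.h"

PetscInt       m, n, row, col, ncols, j;
const PetscInt *cols;
FILE           *fp = fopen("pattern.pbm", "w");

MatGetSize(A, &m, &n);
fprintf(fp, "P1\n%d %d\n", (int)n, (int)m);   /* PBM header: width height */
for (row = 0; row < m; row++) {
  MatGetRow(A, row, &ncols, &cols, PETSC_NULL);
  for (col = 0, j = 0; col < n; col++) {
    if (j < ncols && cols[j] == col) { fprintf(fp, "1 "); j++; }
    else                             { fprintf(fp, "0 "); }
  }
  fprintf(fp, "\n");
  MatRestoreRow(A, row, &ncols, &cols, PETSC_NULL);
}
fclose(fp);
------------------------------------------------------------------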
> > > > > > Razi > > > > > > > > > > -- > > "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec > > Guiness > > From gay-pierre at numericable.fr Thu May 4 16:42:01 2006 From: gay-pierre at numericable.fr (Pierre Gay) Date: Thu, 04 May 2006 23:42:01 +0200 Subject: Linear system partitioning Message-ID: <1146778921.9697.21.camel@aneu> Hello, I'm trying to use PETSc to partition and renumber a linear system. I found some examples in the distribution: $(PETSC_DIR)/ksp/ksp/examples/tutorials/ex10.c $(PETSC_DIR)/mat/examples/tests/ex73.c But these seem to only renumber the matrix. I also need to renumber associated vectors (right hand side, solution) and I did not manage to get some working code. Could someone lead me to an example? Many thanks, Pierre From knepley at gmail.com Thu May 4 17:50:07 2006 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2006 17:50:07 -0500 Subject: Linear system partitioning In-Reply-To: <1146778921.9697.21.camel@aneu> References: <1146778921.9697.21.camel@aneu> Message-ID: We use the numbering and MatGetSubmatrix() to permute the matrix in those examples. You can permute the vector by creating a VecScatter between two global vectors. The first IS is "is" from the example, and the second is 0...n-1 (use ISCreateStride). Thanks, Matt On 5/4/06, Pierre Gay wrote: > > > Hello, > > I'm trying to use PETSc to partition and renumber a linear system. I > found some examples in the distribution: > > $(PETSC_DIR)/ksp/ksp/examples/tutorials/ex10.c > $(PETSC_DIR)/mat/examples/tests/ex73.c > > But these seem to only renumber the matrix. I also need to renumber > associated vectors (right hand side, solution) and I did not manage to > get some working code. > > Could someone lead me to an example? > > Many thanks, > > Pierre > > > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 4 21:39:19 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 May 2006 21:39:19 -0500 (CDT) Subject: help in capturing matrix patterns to files In-Reply-To: References: <445A24FB.7050903@tapir.caltech.edu> Message-ID: You can simply use -mat_view_draw -draw_type ps and it will save the image in defaultps0.ps Draw backs: 1) only generates color postscript 2) silly file name defaultps0.ps 3) the file can get large, megabytes for moderately large matrices (cause it draws EVERY non-zero matrix element). I have added to http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/projects.html 1) -draw_ps_monochrome (and corresponding function) to get black and white ps 2) -draw_ps_filename for allowing changing the name of the file. (volunteers?) Actually this also needs to be added to the FAQ, Satish could you add it with all the options you listed below as well as the postsript approach? Thanks! Barry The image looks fine, but I do think that Matlab's spy() function does generate a more visually appealing image. Heck they have graphics experts while I learned my limited postscript by reverse engineering postscript files in the early 90's :-( On Thu, 4 May 2006, Satish Balay wrote: > There are already tools to do get this dump > > xwd | xpr -device ps > foo.ps > [now when the curser changes - click on the window to be dumped] > > xv > [use the grab option, and then save as tiff/jpg/ps format] > > Perhaps there are other tools that can be used. Hence there is > no extra tools in PETSc to do the capture part. 
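For the vector permutation Matt describes in the partitioning thread above, a hedged sketch: entries of b selected by the index set "is" (the new ordering from the example) land in positions 0..n-1 of a duplicate vector. The VecScatterBegin/End argument order shown here is the later one; releases of this era pass the scatter context as the last argument instead.

------------------------------------------------------------------
#include "petscvec.h"

Vec        bperm;
IS         isto;
VecScatter scat;
PetscInt   rstart, rend;

VecDuplicate(b, &bperm);
VecGetOwnershipRange(bperm, &rstart, &rend);
ISCreateStride(PETSC_COMM_WORLD, rend - rstart, rstart, 1, &isto); /* local piece of 0..n-1 */
VecScatterCreate(b, is, bperm, isto, &scat);   /* from b at "is" to bperm at 0..n-1 */
VecScatterBegin(scat, b, bperm, INSERT_VALUES, SCATTER_FORWARD);
VecScatterEnd(scat, b, bperm, INSERT_VALUES, SCATTER_FORWARD);
VecScatterDestroy(scat);                       /* takes a pointer in later releases */
ISDestroy(isto);
------------------------------------------------------------------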
> > Satish > > On Thu, 4 May 2006, Harald Pfeiffer wrote: > >> Hello, >> >> That's a pity that this is not implemented. As soon as I saw Razi's question, >> I thought that this would be a really useful feature. I have found myself >> many times saying in talks that my matrix is non-symmetric for this-or-that >> reason. Having a picture of the matrix to make the point would be very >> powerful. >> >> I guess, outputting a picture with as many pixels as the matrix-size in a >> non-compressed format (tiff?) should be straightforward: You iterate through >> all matrix entries, if the entry is non-zero, you output "1", otherwise "0". >> You could even output different colors depending on the size or sign of an >> entry. This will result in huge files, but any image-software should be able >> to down-size and convert to more efficient formats. >> >> Harald >> >> >> Matthew Knepley wrote: >>> There is nothing built into PETSc to do this. It is also X-Windows >>> render commands inside. I used a screen capture program on >>> the Window it pops up. >>> >>> Thanks, >>> >>> Matt >>> >>> On 5/4/06, *abdul-rahman at tu-harburg.de * >>> > wrote: >>> >>> Hi all, >>> >>> I'd appreciate if anyone can point to a tutorial/example on how to >>> direct >>> the matrix pattern plot (one with -mat_view_draw) to a file >>> (preferrably >>> PNG for presentation and postscript for publications) >>> >>> I want to be able to capture the patterns in monochrome and also >>> resize >>> the graphics size. >>> >>> Thanks so much. >>> >>> >>> Razi >>> >>> >>> >>> >>> -- >>> "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec >>> Guiness >> >> > > From michal.wieja at gmail.com Fri May 5 06:18:33 2006 From: michal.wieja at gmail.com (=?ISO-8859-2?Q?Micha=B3_Wieja?=) Date: Fri, 5 May 2006 13:18:33 +0200 Subject: PETSc and MPICH2 Message-ID: <73b582b90605050418m26cf02ddxab30ca81970535e8@mail.gmail.com> Hi, I just have a quick question, does the PETSc 2.3.0 and 2.3.1 work with MPICH2. I'm going to use it for my project and I have MPICH2 system runing already, would be very useful to know before starting to work with PETSc. -- Micha? Wieja From bsmith at mcs.anl.gov Fri May 5 07:53:20 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 May 2006 07:53:20 -0500 (CDT) Subject: memory bleeding In-Reply-To: <200605051128.k45BSgo27156@mcs.anl.gov> References: <200605051128.k45BSgo27156@mcs.anl.gov> Message-ID: See http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html number 12 You can run for say a few iterations with the options -malloc_debug -malloc_dump (in PETSc 2.3.1. in early releases it had different names). At PetscFinalize() PETSc will print all the memory it has allocated that has not been freed and where it was allocated. This can help determine what objects are not being freed. Barry > > Hi, > > I use PETSc, and after 5000 (more or less) iteations in which i solve a little > matrix > (30x30) my swap is totally used!!! I'm sure that when I create my matrices I > destroy > them. Could you help me about it? What happens? What thing I do bad? > > Mat K > MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); > MatSetFromOptions(K); > MatSetOption(K, MAT_SYMMETRIC); > MatSetOption(K, MAT_ROW_ORIENTED); > MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); > MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); > > ( I use KSP solve, created and destroyed) > > MatDestroy(K); > > > Thanks, > jordi > ----------- > Jordi Marc?-Nogu? > Dept. 
Resist?ncia de Materials i Estructures a l'Enginyeria > Universitat Polit?cnica de Catalunya (UPC) > > Edifici T45 - despatx 137 > ETSEIAT (Terrassa) > > phone: +34 937 398 728 > mail: jordi.marce at upc.edu > > > From balay at mcs.anl.gov Fri May 5 08:29:23 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 May 2006 08:29:23 -0500 (CDT) Subject: PETSc and MPICH2 In-Reply-To: <73b582b90605050418m26cf02ddxab30ca81970535e8@mail.gmail.com> References: <73b582b90605050418m26cf02ddxab30ca81970535e8@mail.gmail.com> Message-ID: Yes, its the default version used by --download-mpich. [However on windows we still test with mich1] You should use --with-mpi-dir option with configure. Satish On Fri, 5 May 2006, Micha? Wieja wrote: > Hi, > > I just have a quick question, does the PETSc 2.3.0 and 2.3.1 work with > MPICH2. I'm going to use it for my project and I have MPICH2 system > runing already, would be very useful to know before starting to work > with PETSc. > > -- > Micha? Wieja > > From jordi.marce at upc.edu Fri May 5 06:05:56 2006 From: jordi.marce at upc.edu (Jordi =?iso-8859-1?b?TWFyY+kg?= =?iso-8859-1?b?Tm9ndek=?=) Date: Fri, 5 May 2006 13:05:56 +0200 Subject: My swap is full Message-ID: <1146827156.445b3194e8f2c@nobel.upc.es> Hi, I use PETSc, and after 5000 (more or less) iteations in which i solve a little matrix (30x30) my swap is totally used!!! I'm sure that when I create my matrices I destroy them. Could you help me about it? What happens? What thing I do bad? Mat K MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); MatSetFromOptions(K); MatSetOption(K, MAT_SYMMETRIC); MatSetOption(K, MAT_ROW_ORIENTED); MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); ( I use KSP solve, created and destroyed) MatDestroy(K); Thanks, jordi ----------- Jordi Marc?-Nogu? Dept. Resist?ncia de Materials i Estructures a l'Enginyeria Universitat Polit?cnica de Catalunya (UPC) Edifici T45 - despatx 137 ETSEIAT (Terrassa) phone: +34 937 398 728 mail: jordi.marce at upc.edu From jordi.marce at upc.edu Mon May 8 05:10:07 2006 From: jordi.marce at upc.edu (Jordi =?iso-8859-1?b?TWFyY+kg?= =?iso-8859-1?b?Tm9ndek=?=) Date: Mon, 8 May 2006 12:10:07 +0200 Subject: memory bleeding In-Reply-To: References: <200605051128.k45BSgo27156@mcs.anl.gov> Message-ID: <1147083007.445f18ffd5f69@nobel.upc.es> Thanks Barry, I'm using petsc 2.2.0. When I run my program ( " ./myprogram -trdump " or with -trmalloc, -trinfo...) I don't obtain anything... in the screen the code doesn't print anything. If I use (for example) "./myprogram -start_on_debugger", in my screen apers gdb and runs good... and with another options like -log_trace petsc writes messages in my screen. Would I have to activate something in my internal petsc code? best regards, jordi Missatge citat per Barry Smith : > > See > http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > number 12 > > You can run for say a few iterations with the options > -malloc_debug -malloc_dump (in PETSc 2.3.1. in early releases it had > different names). At PetscFinalize() PETSc will print all the memory > it has allocated that has not been freed and where it was allocated. > This can help determine what objects are not being freed. > > Barry > > > > > Hi, > > > > I use PETSc, and after 5000 (more or less) iteations in which i solve a > little > > matrix > > (30x30) my swap is totally used!!! I'm sure that when I create my matrices > I > > destroy > > them. Could you help me about it? What happens? 
What thing I do bad? > > > > Mat K > > MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); > > MatSetFromOptions(K); > > MatSetOption(K, MAT_SYMMETRIC); > > MatSetOption(K, MAT_ROW_ORIENTED); > > MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); > > MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); > > > > ( I use KSP solve, created and destroyed) > > > > MatDestroy(K); > > > > > > Thanks, > > jordi > > ----------- > > Jordi Marc?-Nogu? > > Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > > Universitat Polit?cnica de Catalunya (UPC) > > > > Edifici T45 - despatx 137 > > ETSEIAT (Terrassa) > > > > phone: +34 937 398 728 > > mail: jordi.marce at upc.edu > > > > > > From jadelman at OCF.Berkeley.EDU Mon May 8 11:05:36 2006 From: jadelman at OCF.Berkeley.EDU (Joshua L. Adelman) Date: Mon, 8 May 2006 09:05:36 -0700 Subject: Program Design Question Message-ID: I am a new user of PETsc and was hoping that a more experienced member of the group could give me some insight as to what would be the proper formulation of the following problem in PETsc. I am attempting to solve the simple ODE: d{rho}/dt = K*rho Where K is a large rate matrix that doesn't depend on time within the simulation (i.e K(x)) and rho is a vector of densities. K is usually stiff as it contains terms that reflect both diffusive and chemical transitions, and its entries are sparse. I am interested in the evolution of the system in time as well as the steady-state behavior. I have already implemented a version of the code in Matlab and am looking to write a PETsc version that can be run in parallel on a cluster. In my matlab code, setting up K is fast and the rate limiting step is actually doing the solve. Is the appropriate approach using the Backward Euler TS to solve the problem? Also it is unclear to me whether I need to employ Distributed Arrays (DA). I can provide more information about the nature of the simulation if necessary if it helps in answering my questions. Any suggestions/insight would be most appreciated. Josh From balay at mcs.anl.gov Mon May 8 12:42:33 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 8 May 2006 12:42:33 -0500 (CDT) Subject: memory bleeding In-Reply-To: <1147083007.445f18ffd5f69@nobel.upc.es> References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> Message-ID: If you don't get any output with -trdump - it could mean that all objects are getting properly destroyed. But the swap usage is a bit unusual. You could try the option -trmalloc_log to see how memory is allocate on the PETSc side. And also -log_summary to see the summary of memory usage. Also - is it possible you have malloc() calls in your side of the code - that could be leaking memory? you could also comment out all MatSetOption() calls and see if it makes a difference. BTW: 2.2.0 is a very old version. You might want to upgrade to the latest 2.3.1 Satish On Mon, 8 May 2006, Jordi Marc? Nogu? wrote: > Thanks Barry, > > I'm using petsc 2.2.0. When I run my program ( " ./myprogram -trdump " or with > -trmalloc, -trinfo...) I don't obtain anything... in the screen the code doesn't > print > anything. > > If I use (for example) "./myprogram -start_on_debugger", in my screen apers gdb > > and runs good... and with another options like -log_trace petsc writes messages > in > my screen. Would I have to activate something in my internal petsc code? 
> > best regards, > jordi > > > > > > > Missatge citat per Barry Smith : > > > > > See > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > > number 12 > > > > You can run for say a few iterations with the options > > -malloc_debug -malloc_dump (in PETSc 2.3.1. in early releases it had > > different names). At PetscFinalize() PETSc will print all the memory > > it has allocated that has not been freed and where it was allocated. > > This can help determine what objects are not being freed. > > > > Barry > > > > > > > > Hi, > > > > > > I use PETSc, and after 5000 (more or less) iteations in which i solve a > > little > > > matrix > > > (30x30) my swap is totally used!!! I'm sure that when I create my matrices > > I > > > destroy > > > them. Could you help me about it? What happens? What thing I do bad? > > > > > > Mat K > > > MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); > > > MatSetFromOptions(K); > > > MatSetOption(K, MAT_SYMMETRIC); > > > MatSetOption(K, MAT_ROW_ORIENTED); > > > MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); > > > MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); > > > > > > ( I use KSP solve, created and destroyed) > > > > > > MatDestroy(K); > > > > > > > > > Thanks, > > > jordi > > > ----------- > > > Jordi Marc?-Nogu? > > > Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > > > Universitat Polit?cnica de Catalunya (UPC) > > > > > > Edifici T45 - despatx 137 > > > ETSEIAT (Terrassa) > > > > > > phone: +34 937 398 728 > > > mail: jordi.marce at upc.edu > > > > > > > > > > > > > > From shma7099 at student.uu.se Mon May 8 13:36:48 2006 From: shma7099 at student.uu.se (Sh.M) Date: Mon, 8 May 2006 20:36:48 +0200 Subject: PetscMemoryGetMaximumUsage References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> Message-ID: <001601c672ce$5e5fbd80$2516e055@bredbandsbolaget.se> Hi, If I want to check the maximum amount of memory PETSc has used during a program run, is PetscMemoryGetMaximumUsage the function to use? I take it a call to this function will print out how much Process ID X has used, is this correct? So If I want to see the total maximum amount of memory used during a program run, each process should call this function and then I add them to get the total amount, correct? Thanks in advance. With best regards, Shaman Mahmoudi From bsmith at mcs.anl.gov Mon May 8 14:31:52 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 8 May 2006 14:31:52 -0500 (CDT) Subject: PetscMemoryGetMaximumUsage In-Reply-To: <001601c672ce$5e5fbd80$2516e055@bredbandsbolaget.se> References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <001601c672ce$5e5fbd80$2516e055@bredbandsbolaget.se> Message-ID: Unfortunately that routine is not currently "wired"; getting actual memory usage is not portable and is a pain. You can use PetscMallocGetMaximumUsage() to see the maximum amount of memory PETSc has allocated at any one time (in all the PETSc objects). Barry On Mon, 8 May 2006, Sh.M wrote: > Hi, > > If I want to check the maximum amount of memory PETSc has used during a > program run, is PetscMemoryGetMaximumUsage the function to use? I take it a > call to this function will print out how much Process ID X has used, is this > correct? So If I want to see the total maximum amount of memory used during > a program run, each process should call this function and then I add them to > get the total amount, correct? > > Thanks in advance. 
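A sketch of turning the per-process figure Barry mentions above into the summed total Shaman asked about. PetscLogDouble is a plain double, so a straight MPI reduction works; note that only memory obtained through PETSc's own malloc wrappers is counted, and the per-process peaks need not occur at the same instant.

------------------------------------------------------------------
#include "petsc.h"

PetscLogDouble local_max, total;
PetscMallocGetMaximumUsage(&local_max);
MPI_Allreduce(&local_max, &total, 1, MPI_DOUBLE, MPI_SUM, PETSC_COMM_WORLD);
PetscPrintf(PETSC_COMM_WORLD, "summed PETSc malloc high-water mark: %g bytes\n", total);
------------------------------------------------------------------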
> > With best regards, Shaman Mahmoudi > > From hzhang at mcs.anl.gov Mon May 8 14:33:17 2006 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 8 May 2006 14:33:17 -0500 (CDT) Subject: Program Design Question In-Reply-To: References: Message-ID: Josh, You may take a look at ~/src/ts/examples/tests/ex1.c, ex2.c Let us know if you still have trouble. Hong On Mon, 8 May 2006, Joshua L. Adelman wrote: > I am a new user of PETsc and was hoping that a more experienced > member of the group could give me some insight as to what would be > the proper formulation of the following problem in PETsc. I am > attempting to solve the simple ODE: > d{rho}/dt = K*rho > > Where K is a large rate matrix that doesn't depend on time within the > simulation (i.e K(x)) and rho is a vector of densities. K is usually > stiff as it contains terms that reflect both diffusive and chemical > transitions, and its entries are sparse. I am interested in the > evolution of the system in time as well as the steady-state behavior. > I have already implemented a version of the code in Matlab and am > looking to write a PETsc version that can be run in parallel on a > cluster. In my matlab code, setting up K is fast and the rate > limiting step is actually doing the solve. > > Is the appropriate approach using the Backward Euler TS to solve the > problem? Also it is unclear to me whether I need to employ > Distributed Arrays (DA). I can provide more information about the > nature of the simulation if necessary if it helps in answering my > questions. > > Any suggestions/insight would be most appreciated. > > Josh > > From bsmith at mcs.anl.gov Mon May 8 14:36:04 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 8 May 2006 14:36:04 -0500 (CDT) Subject: Program Design Question In-Reply-To: References: Message-ID: On Mon, 8 May 2006, Joshua L. Adelman wrote: > I am a new user of PETsc and was hoping that a more experienced member of the > group could give me some insight as to what would be the proper formulation > of the following problem in PETsc. I am attempting to solve the simple ODE: > d{rho}/dt = K*rho > Is d{rho}/dt implemented simply as {{rho}^{n+1} - {rho}^{n})/dt or do you have a mass matrix involved? > Where K is a large rate matrix that doesn't depend on time within the > simulation (i.e K(x)) and rho is a vector of densities. K is usually stiff as > it contains terms that reflect both diffusive and chemical transitions, and > its entries are sparse. I am interested in the evolution of the system in > time as well as the steady-state behavior. I have already implemented a > version of the code in Matlab and am looking to write a PETsc version that > can be run in parallel on a cluster. In my matlab code, setting up K is fast > and the rate limiting step is actually doing the solve. > > Is the appropriate approach using the Backward Euler TS to solve the problem? If there is not mass matrix then yes, you want to use that. > Also it is unclear to me whether I need to employ Distributed Arrays (DA). If your ODE comes from discretizing a PDE on a structured (rectangular) grid in 2 or 3d then the DA may be helpful to organize the partitioning of the domain. But so long as you have a way of computing K in parallel (or computing it fast sequentially (say in Matlab) and then loading it in in parallel then you have no reason to us DA. Barry >I > can provide more information about the nature of the simulation if necessary > if it helps in answering my questions. > > Any suggestions/insight would be most appreciated. 
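Since K does not depend on time, backward Euler for d{rho}/dt = K*rho reduces to solving (I - dt*K) rho_new = rho_old with the same operator every step, which is exactly what TS with type TSBEULER automates. A hand-rolled sketch with a single KSP is below; the names dt, nsteps, K and rho are assumed to exist already, the scalar/matrix argument order of MatScale()/MatShift() was flipped in releases before 2.3, and no error checking is shown.

------------------------------------------------------------------
#include "petscksp.h"

Mat      A;
Vec      rhs;
KSP      ksp;
PetscInt step;

MatDuplicate(K, MAT_COPY_VALUES, &A);
MatScale(A, -dt);                 /* A = -dt*K    */
MatShift(A, 1.0);                 /* A = I - dt*K */
VecDuplicate(rho, &rhs);

KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);  /* MatStructure flag dropped in later releases */
KSPSetFromOptions(ksp);

for (step = 0; step < nsteps; step++) {
  VecCopy(rho, rhs);              /* right-hand side is the previous state */
  KSPSolve(ksp, rhs, rho);        /* overwrites rho with rho_new           */
}
------------------------------------------------------------------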
> > Josh > From yaronkretchmer at gmail.com Tue May 9 01:56:23 2006 From: yaronkretchmer at gmail.com (Yaron Kretchmer) Date: Mon, 8 May 2006 23:56:23 -0700 Subject: nullspaces Message-ID: Hi all I have a matrix representing electrical currents (KCL). In my design, there are some elements which are totally isolated from the rest of the circuit, which means that they form the matrix's nullspace. My questions are: *) Can I use Petsc to derive the nodes which compose the nullspace *) Can I then remove them from the KSP? If so, how? Thanks much Yaron -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 08:35:35 2006 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2006 08:35:35 -0500 Subject: nullspaces In-Reply-To: References: Message-ID: On 5/9/06, Yaron Kretchmer wrote: > > Hi all > I have a matrix representing electrical currents (KCL). In my design, > there are some elements which are totally isolated from the rest of the > circuit, which means that they form the matrix's nullspace. > > My questions are: > *) Can I use Petsc to derive the nodes which compose the nullspace > We do not have QR type algorithms in PETSc, but you could use LAPACK or PLAPACK for dense matrices. *) Can I then remove them from the KSP? If so, how? > You can use the MatNullSpace architecture to remove these components from a solve. Matt Thanks much > Yaron > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From abdul-rahman at tu-harburg.de Tue May 9 10:10:01 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Tue, 9 May 2006 17:10:01 +0200 (METDST) Subject: help in capturing matrix patterns to files In-Reply-To: References: <445A24FB.7050903@tapir.caltech.edu> Message-ID: Many thanks everyone for the inputs/alternatives. On Thu, 4 May 2006, Barry Smith wrote: > > You can simply use -mat_view_draw -draw_type ps and it will save the > image in defaultps0.ps > > Draw backs: > 1) only generates color postscript > > 2) silly file name defaultps0.ps > > 3) the file can get large, megabytes for moderately large matrices (cause it > draws EVERY non-zero matrix element). > > I have added to > http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/projects.html > 1) -draw_ps_monochrome (and corresponding function) to get black and white ps > > 2) -draw_ps_filename for allowing changing the name of the file. Will try this out one of these days. > The image looks fine, but I do think that Matlab's spy() function > does generate a more visually appealing image. Heck they have graphics > experts while I learned my limited postscript by reverse engineering > postscript files in the early 90's :-( For what it's worth, I tried to push how big a matrix matlab can take. My ascii file in Matlab format is 130M in size, with n~17k, nnz~2.2M (complex). Matlab simply could not read it. :). Well, i guess it could, but I didn't want to wait for a couple hours just to "spy" a matrix. I think for now the X tool suffices me. I never knew it exists. Thanks for pointing out. Many thanks again! Razi From sean at trialphaenergy.com Tue May 9 10:22:44 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Tue, 09 May 2006 11:22:44 -0400 Subject: polar coordinate singularity Message-ID: <4460B3C4.5050600@trialphaenergy.com> Hi, I am solving Poisson's equation Div. 
Grad(Phi)=rho in the (R,Theta) plane with KSP, and it is working well if I avoid the axis. But I would like to put a point at R=0. Is there a recommended way to do that? Thanks, Sean From F.Boulahya at brgm.fr Tue May 9 10:12:57 2006 From: F.Boulahya at brgm.fr (=?iso-8859-1?Q?Boulahya_Fa=EFza?=) Date: Tue, 9 May 2006 17:12:57 +0200 Subject: Petsc + BlockSolve95 Message-ID: Hi All, Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : PETSC ERROR: To use incomplete Cholesky preconditioning with a MATMPIROWBS matrix you must declare it to be symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c PETSC ERROR: No support for this operation for this object type! PETSC ERROR: unknown option! PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c In advance thanks, Fa?za Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at trialphaenergy.com Tue May 9 10:30:31 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Tue, 09 May 2006 11:30:31 -0400 Subject: polar coordinate singularity In-Reply-To: <4460B3C4.5050600@trialphaenergy.com> References: <4460B3C4.5050600@trialphaenergy.com> Message-ID: <4460B597.7050009@trialphaenergy.com> Sean Dettrick wrote: > Hi, > I am solving Poisson's equation > Div. Grad(Phi)=rho > in the (R,Theta) plane with KSP, and it is working well if I avoid the > axis. But I would like to put a point at R=0. Is there a > recommended way to do that? > Thanks, > Sean > > By the way I forgot to mention I am using finite differencing, not finite elements. Thanks, Sean From jordi.marce at upc.edu Tue May 9 11:09:28 2006 From: jordi.marce at upc.edu (=?ISO-8859-1?Q?Jordi_Marc=E9_Nogu=E9?=) Date: Tue, 09 May 2006 18:09:28 +0200 Subject: memory bleeding In-Reply-To: References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> Message-ID: <4460BEB8.3010909@upc.edu> Hi, I find my problem with the memory. Using PetscTrDump(PETSC_NULL) I've cornered it and I've find the lines which generate it. I've discovered that the functions like MatTranspose(), MatMatMult() or similar functions creates extra memory in my code and these don't destroy. I explain: I create my matrices ----->>>> 3000 b are created (for example) I use MatMatMult --------->>>> 2000 bytes are created (for example) I destroy my matrices ---->>>> 3000 b are destroyed. 
How can I free the memory created in the function MaMatMult()? The final result is that if I use this function in a iterative process I full the memory of the box. If I multiply the matrix in a own function, I don't have this problem. best regards, jordi En/na Satish Balay ha escrit: > If you don't get any output with -trdump - it could mean that all > objects are getting properly destroyed. But the swap usage is a bit > unusual. > > You could try the option -trmalloc_log to see how memory is allocate > on the PETSc side. And also -log_summary to see the summary of memory > usage. > > Also - is it possible you have malloc() calls in your side of the code > - that could be leaking memory? > > you could also comment out all MatSetOption() calls and see if it > makes a difference. > > BTW: 2.2.0 is a very old version. You might want to upgrade to the > latest 2.3.1 > > Satish > > On Mon, 8 May 2006, Jordi Marc? Nogu? wrote: > > >>Thanks Barry, >> >>I'm using petsc 2.2.0. When I run my program ( " ./myprogram -trdump " or with >>-trmalloc, -trinfo...) I don't obtain anything... in the screen the code doesn't >>print >>anything. >> >>If I use (for example) "./myprogram -start_on_debugger", in my screen apers gdb >> >>and runs good... and with another options like -log_trace petsc writes messages >>in >>my screen. Would I have to activate something in my internal petsc code? >> >>best regards, >>jordi >> >> >> >> >> >> >>Missatge citat per Barry Smith : >> >> >>> >>> See >>> >> >>http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> >>>number 12 >>> >>> You can run for say a few iterations with the options >>>-malloc_debug -malloc_dump (in PETSc 2.3.1. in early releases it had >>>different names). At PetscFinalize() PETSc will print all the memory >>>it has allocated that has not been freed and where it was allocated. >>>This can help determine what objects are not being freed. >>> >>> Barry >>> >>> >>>>Hi, >>>> >>>>I use PETSc, and after 5000 (more or less) iteations in which i solve a >>> >>>little >>> >>>>matrix >>>>(30x30) my swap is totally used!!! I'm sure that when I create my matrices >>> >>>I >>> >>>>destroy >>>>them. Could you help me about it? What happens? What thing I do bad? >>>> >>>>Mat K >>>>MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); >>>>MatSetFromOptions(K); >>>>MatSetOption(K, MAT_SYMMETRIC); >>>>MatSetOption(K, MAT_ROW_ORIENTED); >>>>MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); >>>>MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); >>>> >>>>( I use KSP solve, created and destroyed) >>>> >>>>MatDestroy(K); >>>> >>>> >>>>Thanks, >>>>jordi >>>>----------- >>>>Jordi Marc?-Nogu? >>>>Dept. Resist?ncia de Materials i Estructures a l'Enginyeria >>>>Universitat Polit?cnica de Catalunya (UPC) >>>> >>>>Edifici T45 - despatx 137 >>>>ETSEIAT (Terrassa) >>>> >>>>phone: +34 937 398 728 >>>>mail: jordi.marce at upc.edu >>>> >>>> >>>> >> >> >> >> >> -- Jordi Marc?-Nogu? Dept. Resist?ncia de Materials i Estructures a l'Enginyeria Universitat Polit?cnica de Catalunya (UPC) Edifici T45 - despatx 137 ETSEIAT (Terrassa) phone: +34 937 398 728 mail: jordi.marce at upc.edu From knepley at gmail.com Tue May 9 11:19:19 2006 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2006 11:19:19 -0500 Subject: Petsc + BlockSolve95 In-Reply-To: References: Message-ID: I believe there is a problem with the option that you specified. All these are integers, and it is complaining that the integer does not match MAT_SYMMETRIC. 
I will fix the error message to print the offending option, but please check the code. Thanks, Matt On 5/9/06, Boulahya Fa?za wrote: > > Hi All, > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I use > MATMPIROWBS for the storage of the matrice. However I get this message : > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you must > declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > So I tried adding this option (even if in the namual it is written that it > is not required). Then I obtained this message > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > In advance thanks, > > > Fa?za > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 11:27:25 2006 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2006 11:27:25 -0500 Subject: memory bleeding In-Reply-To: <4460BEB8.3010909@upc.edu> References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <4460BEB8.3010909@upc.edu> Message-ID: Are you destroying the matrix created by MatMatMult()? From the webpage: Notes C will be created and must be destroyed by the user with MatDestroy(). Unless scall is MAT_REUSE_MATRIX Thanks, Matt On 5/9/06, Jordi Marc? Nogu? wrote: > > Hi, > > I find my problem with the memory. Using PetscTrDump(PETSC_NULL) I've > cornered it and I've find the lines which generate it. > > I've discovered that the functions like MatTranspose(), MatMatMult() or > similar functions creates extra memory in my code and these don't destroy. > > I explain: > > I create my matrices ----->>>> 3000 b are created (for example) > I use MatMatMult --------->>>> 2000 bytes are created (for example) > I destroy my matrices ---->>>> 3000 b are destroyed. > > How can I free the memory created in the function MaMatMult()? > > The final result is that if I use this function in a iterative process I > full the memory of the box. If I multiply the matrix in a own function, > I don't have this problem. > > > best regards, > jordi > > > > > En/na Satish Balay ha escrit: > > If you don't get any output with -trdump - it could mean that all > > objects are getting properly destroyed. But the swap usage is a bit > > unusual. 
> > > > You could try the option -trmalloc_log to see how memory is allocate > > on the PETSc side. And also -log_summary to see the summary of memory > > usage. > > > > Also - is it possible you have malloc() calls in your side of the code > > - that could be leaking memory? > > > > you could also comment out all MatSetOption() calls and see if it > > makes a difference. > > > > BTW: 2.2.0 is a very old version. You might want to upgrade to the > > latest 2.3.1 > > > > Satish > > > > On Mon, 8 May 2006, Jordi Marc? Nogu? wrote: > > > > > >>Thanks Barry, > >> > >>I'm using petsc 2.2.0. When I run my program ( " ./myprogram -trdump " > or with > >>-trmalloc, -trinfo...) I don't obtain anything... in the screen the code > doesn't > >>print > >>anything. > >> > >>If I use (for example) "./myprogram -start_on_debugger", in my screen > apers gdb > >> > >>and runs good... and with another options like -log_trace petsc writes > messages > >>in > >>my screen. Would I have to activate something in my internal petsc code? > >> > >>best regards, > >>jordi > >> > >> > >> > >> > >> > >> > >>Missatge citat per Barry Smith : > >> > >> > >>> > >>> See > >>> > >> > >> > http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > >> > >>>number 12 > >>> > >>> You can run for say a few iterations with the options > >>>-malloc_debug -malloc_dump (in PETSc 2.3.1. in early releases it had > >>>different names). At PetscFinalize() PETSc will print all the memory > >>>it has allocated that has not been freed and where it was allocated. > >>>This can help determine what objects are not being freed. > >>> > >>> Barry > >>> > >>> > >>>>Hi, > >>>> > >>>>I use PETSc, and after 5000 (more or less) iteations in which i solve > a > >>> > >>>little > >>> > >>>>matrix > >>>>(30x30) my swap is totally used!!! I'm sure that when I create my > matrices > >>> > >>>I > >>> > >>>>destroy > >>>>them. Could you help me about it? What happens? What thing I do bad? > >>>> > >>>>Mat K > >>>>MatCreateSeqAIJ(PETSC_COMM_SELF, dim, dim, 9 , PETSC_NULL, &K); > >>>>MatSetFromOptions(K); > >>>>MatSetOption(K, MAT_SYMMETRIC); > >>>>MatSetOption(K, MAT_ROW_ORIENTED); > >>>>MatSetOption(K, MAT_IGNORE_ZERO_ENTRIES); > >>>>MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR); > >>>> > >>>>( I use KSP solve, created and destroyed) > >>>> > >>>>MatDestroy(K); > >>>> > >>>> > >>>>Thanks, > >>>>jordi > >>>>----------- > >>>>Jordi Marc?-Nogu? > >>>>Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > >>>>Universitat Polit?cnica de Catalunya (UPC) > >>>> > >>>>Edifici T45 - despatx 137 > >>>>ETSEIAT (Terrassa) > >>>> > >>>>phone: +34 937 398 728 > >>>>mail: jordi.marce at upc.edu > >>>> > >>>> > >>>> > >> > >> > >> > >> > >> > > > -- > Jordi Marc?-Nogu? > Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > Universitat Polit?cnica de Catalunya (UPC) > > Edifici T45 - despatx 137 > ETSEIAT (Terrassa) > > phone: +34 937 398 728 > mail: jordi.marce at upc.edu > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From gay-pierre at numericable.fr Tue May 9 16:06:08 2006 From: gay-pierre at numericable.fr (Pierre Gay) Date: Tue, 09 May 2006 23:06:08 +0200 Subject: Linear system partitioning In-Reply-To: References: <1146778921.9697.21.camel@aneu> Message-ID: <1147208768.3006.27.camel@aneu> Many thanks for your advice. 
Another little question: Are there alternate methods for renumbering the matrix (instead of MatGetSubMatrix())? Thanks again for your time. Pierre. From knepley at gmail.com Tue May 9 18:20:03 2006 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2006 18:20:03 -0500 Subject: Linear system partitioning In-Reply-To: <1147208768.3006.27.camel@aneu> References: <1146778921.9697.21.camel@aneu> <1147208768.3006.27.camel@aneu> Message-ID: No, however we would be willing to take contributions. I assume you mean permute, not renumber. Thanks, Matt On 5/9/06, Pierre Gay wrote: > > > Many thanks for your advice. > > Another little question: > > Are there alternate methods for renumbering the matrix (instead of > MatGetSubMatrix())? > > Thanks again for your time. > > Pierre. > > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From F.Boulahya at brgm.fr Wed May 10 03:38:32 2006 From: F.Boulahya at brgm.fr (=?iso-8859-1?Q?Boulahya_Fa=EFza?=) Date: Wed, 10 May 2006 10:38:32 +0200 Subject: Petsc + BlockSolve95 Message-ID: Thanks. I tried something else : - creation of the matrix with MatCreateMPIAIJ - initialization - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL When solving in sequential CG + ICC, everything is ok. When I tried in parallel the same code the options lead to the same error : [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: unknown option! [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c Can MatSetOption be only used in sequential? Fa?za _____ De : Matthew Knepley [mailto:knepley at gmail.com] Envoy? : mardi 9 mai 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + BlockSolve95 I believe there is a problem with the option that you specified. All these are integers, and it is complaining that the integer does not match MAT_SYMMETRIC. I will fix the error message to print the offending option, but please check the code. Thanks, Matt On 5/9/06, Boulahya Fa?za > wrote: Hi All, Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : PETSC ERROR: To use incomplete Cholesky preconditioning with a MATMPIROWBS matrix you must declare it to be symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c PETSC ERROR: No support for this operation for this object type! PETSC ERROR: unknown option! PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c In advance thanks, Fa?za Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. 
If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordi.marce at upc.edu Wed May 10 04:23:06 2006 From: jordi.marce at upc.edu (=?ISO-8859-1?Q?Jordi_Marc=E9_Nogu=E9?=) Date: Wed, 10 May 2006 11:23:06 +0200 Subject: memory bleeding In-Reply-To: References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <4460BEB8.3010909@upc.edu> Message-ID: <4461B0FA.3040700@upc.edu> Yes, of course I destroy the matrix.... The scheme of my code when I create a diagonal Mass-lumping matrix is the code below. In MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); the number is 6 because in a general coordinates it's possible this matrix changes in a 6x6 full matrix. "element3D *p_element = lfiber[i].getElement(e)" is a internal procedure to obtain information about the element ------------------------------------------------------------------ Mat M_aux1, M_aux2; MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux1); MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux2); MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); MatSeqAIJSetPreallocation(M_aux2,6,PETSC_NULL); MatSetFromOptions(M_aux1); MatSetOption(M_aux1, MAT_SYMMETRIC); MatSetOption(M_aux1, MAT_IGNORE_ZERO_ENTRIES); MatSetFromOptions(M_aux2); MatSetOption(M_aux2, MAT_SYMMETRIC); MatSetOption(M_aux2, MAT_IGNORE_ZERO_ENTRIES); for(uint32_t i=0; iupdateTs(); MatSetValue(M_aux1, 0, 0, p_element->L0 * value.pho / 6., INSERT_VALUES); MatSetValue(M_aux1, 1, 1, p_element->L0 * value.pho / 6., INSERT_VALUES); MatSetValue(M_aux1, 2, 2, p_element->L0 * value.pho / 6., INSERT_VALUES); MatSetValue(M_aux1, 3, 3, p_element->L0 * value.pho / 6., INSERT_VALUES); MatSetValue(M_aux1, 4, 4, p_element->L0 * value.pho / 6., INSERT_VALUES); MatSetValue(M_aux1, 5, 5, p_element->L0 * value.pho / 6., INSERT_VALUES); MatAssemblyBegin(M_aux1,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(M_aux1,MAT_FINAL_ASSEMBLY); MatMatMult(M_aux1,p_element->T,&M_aux2); MatMatMult(p_element->TT,M_aux2,&M_aux1); } // here I work with the matrix M_aux2, but in this point the memory // is constant. The code doesn't waste memory } MatDestroy(M_aux1); MatDestroy(M_aux2); ------------------------ Thanks, best regards -- Jordi Marc?-Nogu? Dept. Resist?ncia de Materials i Estructures a l'Enginyeria Universitat Polit?cnica de Catalunya (UPC) Edifici T45 - despatx 137 ETSEIAT (Terrassa) phone: +34 937 398 728 mail: jordi.marce at upc.edu From G.Vaz at marin.nl Wed May 10 04:37:11 2006 From: G.Vaz at marin.nl (Vaz, Guilherme) Date: Wed, 10 May 2006 11:37:11 +0200 Subject: Petsc + BlockSolve95 Message-ID: <5D9143EF9FADE942BEF6F2A636A861170271092F@MAR150CV1.marin.local> Hello people, I have exactly the same problem as Faiza... In sequential it runs ok but in parallel not. Greetings. Guilherme -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Boulahya Fa?za Sent: 10 May 2006 10:39 To: 'petsc-users at mcs.anl.gov' Subject: RE: Petsc + BlockSolve95 Thanks. I tried something else : - creation of the matrix with MatCreateMPIAIJ - initialization - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL When solving in sequential CG + ICC, everything is ok. 
When I tried in parallel the same code the options lead to the same error : [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: unknown option! [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c Can MatSetOption be only used in sequential? Fa?za ________________________________ De : Matthew Knepley [mailto:knepley at gmail.com] Envoy? : mardi 9 mai 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + BlockSolve95 I believe there is a problem with the option that you specified. All these are integers, and it is complaining that the integer does not match MAT_SYMMETRIC. I will fix the error message to print the offending option, but please check the code. Thanks, Matt On 5/9/06, Boulahya Fa?za wrote: Hi All, Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : PETSC ERROR: To use incomplete Cholesky preconditioning with a MATMPIROWBS matrix you must declare it to be symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c PETSC ERROR: No support for this operation for this object type! PETSC ERROR: unknown option! PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c In advance thanks, Fa?za Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 10 08:24:10 2006 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 10 May 2006 08:24:10 -0500 Subject: memory bleeding In-Reply-To: <4461B0FA.3040700@upc.edu> References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <4460BEB8.3010909@upc.edu> <4461B0FA.3040700@upc.edu> Message-ID: The code you have there will create a matrix for every loop iteration, and will only destroy one at the end. The new MatMatMult() interface has an argument which allows the user to pass in a matrix. Matt On 5/10/06, Jordi Marc? Nogu? wrote: > > Yes, of course I destroy the matrix.... The scheme of my code when I > create a diagonal Mass-lumping matrix is the code below. > > In MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); the number is 6 > because in a general coordinates it's possible this matrix changes in a > 6x6 full matrix. 
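A sketch of the reuse pattern Matt is referring to, assuming the MatReuse variant of MatMatMult() available in PETSc 2.3.x, MatMatMult(A, B, scall, fill, &C); here A, B and nsteps are placeholders standing in for M_aux1, the transformation matrices and the loop bounds in the code quoted below:

    Mat      C = PETSC_NULL;
    PetscInt i, nsteps = 10;   /* placeholder loop count */

    /* first product: let PETSc allocate C */
    MatMatMult(A, B, MAT_INITIAL_MATRIX, 1.0, &C);

    for (i = 1; i < nsteps; i++) {
      /* ... update the entries of A and B (same nonzero pattern) ... */

      /* subsequent products: reuse the already allocated C */
      MatMatMult(A, B, MAT_REUSE_MATRIX, 1.0, &C);
    }
    MatDestroy(C);

Calling a form of MatMatMult() that allocates a new product inside the loop -- as the three-argument calls in the quoted code do -- creates one matrix per iteration, which is the leak Matt describes.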
> > "element3D *p_element = lfiber[i].getElement(e)" is a internal > procedure to obtain information about the element > > ------------------------------------------------------------------ > > Mat M_aux1, M_aux2; > > MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux1); > MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux2); > > MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); > MatSeqAIJSetPreallocation(M_aux2,6,PETSC_NULL); > > MatSetFromOptions(M_aux1); > MatSetOption(M_aux1, MAT_SYMMETRIC); > MatSetOption(M_aux1, MAT_IGNORE_ZERO_ENTRIES); > > MatSetFromOptions(M_aux2); > MatSetOption(M_aux2, MAT_SYMMETRIC); > MatSetOption(M_aux2, MAT_IGNORE_ZERO_ENTRIES); > > > for(uint32_t i=0; i { > for(uint32_t e=0; e { > > MatZeroentries(M_aux1); > MatZeroentries(M_aux2); > > element3D *p_element = lfiber[i].getElement(e); > > p_element->updateTs(); > > MatSetValue(M_aux1, 0, 0, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 1, 1, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 2, 2, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 3, 3, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 4, 4, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 5, 5, p_element->L0 * value.pho / 6., > INSERT_VALUES); > > > MatAssemblyBegin(M_aux1,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(M_aux1,MAT_FINAL_ASSEMBLY); > > > MatMatMult(M_aux1,p_element->T,&M_aux2); > MatMatMult(p_element->TT,M_aux2,&M_aux1); > > } > > // here I work with the matrix M_aux2, but in this point the memory > // is constant. The code doesn't waste memory > } > > MatDestroy(M_aux1); > MatDestroy(M_aux2); > > ------------------------ > > > Thanks, > > best regards > > > -- > Jordi Marc?-Nogu? > Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > Universitat Polit?cnica de Catalunya (UPC) > > Edifici T45 - despatx 137 > ETSEIAT (Terrassa) > > phone: +34 937 398 728 > mail: jordi.marce at upc.edu > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 10 08:29:30 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 10 May 2006 08:29:30 -0500 (CDT) Subject: Petsc + BlockSolve95 In-Reply-To: References: Message-ID: No, MatSetOption() should always work and accept these arguments. Are you sure the values of MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL are being set (by including the right Fortran include file) before you use them? Barry BTW: is this PETSc 2.3.1? On Wed, 10 May 2006, Boulahya Fa?za wrote: > Thanks. > > I tried something else : > - creation of the matrix with MatCreateMPIAIJ > - initialization > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > When solving in sequential CG + ICC, everything is ok. When I tried in > parallel the same code the options lead to the same error : > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? > > > Fa?za > > > _____ > > De : Matthew Knepley [mailto:knepley at gmail.com] > Envoy? : mardi 9 mai 2006 18:19 > ? : petsc-users at mcs.anl.gov > Objet : Re: Petsc + BlockSolve95 > > > I believe there is a problem with the option that you specified. 
All these > are integers, and it is complaining that the integer does not match > MAT_SYMMETRIC. I will fix the error message to print the offending > option, but please check the code. > > Thanks, > > Matt > > > On 5/9/06, Boulahya Fa?za > > wrote: > > Hi All, > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I use > MATMPIROWBS for the storage of the matrice. However I get this message : > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you must > declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > > So I tried adding this option (even if in the namual it is written that it > is not required). Then I obtained this message > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > In advance thanks, > > > Fa?za > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** From bsmith at mcs.anl.gov Wed May 10 08:37:34 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 10 May 2006 08:37:34 -0500 (CDT) Subject: Petsc + BlockSolve95 In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170271092F@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170271092F@MAR150CV1.marin.local> Message-ID: Hmm, can anyone reproduce this in the debugger? What is the value of op in MatSetOption_MPIAIJ() that is not accepted? Is this also Fortran? Barry BTW: I have updated petsc-dev to print the numerical value of any op that it claims is "unknown", instead of providing NO useful information like it use to. On Wed, 10 May 2006, Vaz, Guilherme wrote: > Hello people, > > > > I have exactly the same problem as Faiza... > > In sequential it runs ok but in parallel not. > > > > Greetings. > > > > Guilherme > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Boulahya Fa?za > Sent: 10 May 2006 10:39 > To: 'petsc-users at mcs.anl.gov' > Subject: RE: Petsc + BlockSolve95 > > > > Thanks. > > > > I tried something else : > > - creation of the matrix with MatCreateMPIAIJ > > - initialization > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > When solving in sequential CG + ICC, everything is ok. When I tried in parallel the same code the options lead to the same error : > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: unknown option! 
> [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? > > > > > > Fa?za > > > > > > ________________________________ > > De : Matthew Knepley [mailto:knepley at gmail.com] > Envoy? : mardi 9 mai 2006 18:19 > ? : petsc-users at mcs.anl.gov > Objet : Re: Petsc + BlockSolve95 > > I believe there is a problem with the option that you specified. All these > are integers, and it is complaining that the integer does not match > MAT_SYMMETRIC. I will fix the error message to print the offending > option, but please check the code. > > Thanks, > > Matt > > On 5/9/06, Boulahya Fa?za wrote: > > Hi All, > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : > > > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you must declare it to be > symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! > > > > So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > In advance thanks, > > > > > > Fa?za > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** > From hzhang at mcs.anl.gov Wed May 10 09:43:23 2006 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 10 May 2006 09:43:23 -0500 (CDT) Subject: Petsc + BlockSolve95 In-Reply-To: References: <5D9143EF9FADE942BEF6F2A636A861170271092F@MAR150CV1.marin.local> Message-ID: I reproduced this crash and am working on it. Hong On Wed, 10 May 2006, Barry Smith wrote: > > Hmm, can anyone reproduce this in the debugger? What is the value > of op in MatSetOption_MPIAIJ() that is not accepted? > > Is this also Fortran? > > Barry > > BTW: I have updated petsc-dev to print the numerical value of any > op that it claims is "unknown", instead of providing NO useful information > like it use to. > > On Wed, 10 May 2006, Vaz, Guilherme wrote: > > > Hello people, > > > > > > > > I have exactly the same problem as Faiza... > > > > In sequential it runs ok but in parallel not. > > > > > > > > Greetings. > > > > > > > > Guilherme > > > > > > > > -----Original Message----- > > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Boulahya Fa?za > > Sent: 10 May 2006 10:39 > > To: 'petsc-users at mcs.anl.gov' > > Subject: RE: Petsc + BlockSolve95 > > > > > > > > Thanks. 
> > > > > > > > I tried something else : > > > > - creation of the matrix with MatCreateMPIAIJ > > > > - initialization > > > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > > > > > When solving in sequential CG + ICC, everything is ok. When I tried in parallel the same code the options lead to the same error : > > > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: unknown option! > > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > Can MatSetOption be only used in sequential? > > > > > > > > > > > > Fa?za > > > > > > > > > > > > ________________________________ > > > > De : Matthew Knepley [mailto:knepley at gmail.com] > > Envoy? : mardi 9 mai 2006 18:19 > > ? : petsc-users at mcs.anl.gov > > Objet : Re: Petsc + BlockSolve95 > > > > I believe there is a problem with the option that you specified. All these > > are integers, and it is complaining that the integer does not match > > MAT_SYMMETRIC. I will fix the error message to print the offending > > option, but please check the code. > > > > Thanks, > > > > Matt > > > > On 5/9/06, Boulahya Fa?za wrote: > > > > Hi All, > > > > > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : > > > > > > > > PETSC ERROR: To use incomplete Cholesky > > preconditioning with a MATMPIROWBS matrix you must declare it to be > > symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! > > > > > > > > So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message > > > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c > > PETSC ERROR: No support for this operation for this object type! > > PETSC ERROR: unknown option! > > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > > > > > In advance thanks, > > > > > > > > > > > > Fa?za > > > > > > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > > > *** > > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > > v?rifier l'absence de corruption ? sa r?ception. > > > > The contents of this email and any attachments are confidential. They are intended for > > the named recipient(s) only. If you have received this email in error please notify the > > system manager or the sender immediately and do not disclose the contents to > > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > > content. > > *** > > From hzhang at mcs.anl.gov Wed May 10 10:01:00 2006 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 10 May 2006 10:01:00 -0500 (CDT) Subject: Petsc + BlockSolve95 In-Reply-To: References: Message-ID: > > I tried something else : > - creation of the matrix with MatCreateMPIAIJ > - initialization > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > When solving in sequential CG + ICC, everything is ok. 
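For reference, a sketch of the "sequential CG + ICC" setup through the KSP interface, assuming the 2.3.x calling sequence; A, b and x are placeholders for the assembled operator and vectors:

    KSP ksp;
    PC  pc;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
    KSPSetType(ksp, KSPCG);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCICC);      /* incomplete Cholesky (sequential) */
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

The command-line equivalent is -ksp_type cg -pc_type icc; the parallel recipe suggested later in this thread swaps the preconditioner options for -pc_type bjacobi -sub_pc_type icc.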
When I tried in > parallel the same code the options lead to the same error : > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? MatSetOption can be used for MPIAIJ matrix. We do not have support for parallel icc, but it can used as CG + BJACOBI + ICC. The runtime option is '-ksp_type cg -pc_type bjacobi -sub_pc_type icc'. Hong > > > Fa?za > > > _____ > > De : Matthew Knepley [mailto:knepley at gmail.com] > Envoy? : mardi 9 mai 2006 18:19 > ? : petsc-users at mcs.anl.gov > Objet : Re: Petsc + BlockSolve95 > > > I believe there is a problem with the option that you specified. All these > are integers, and it is complaining that the integer does not match > MAT_SYMMETRIC. I will fix the error message to print the offending > option, but please check the code. > > Thanks, > > Matt > > > On 5/9/06, Boulahya Fa?za > > wrote: > > Hi All, > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I use > MATMPIROWBS for the storage of the matrice. However I get this message : > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you must > declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > > So I tried adding this option (even if in the namual it is written that it > is not required). Then I obtained this message > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > In advance thanks, > > > Fa?za > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** From bsmith at mcs.anl.gov Wed May 10 16:16:43 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 10 May 2006 16:16:43 -0500 (CDT) Subject: polar coordinate singularity In-Reply-To: <4460B3C4.5050600@trialphaenergy.com> References: <4460B3C4.5050600@trialphaenergy.com> Message-ID: A simple check in google of finite differences polar singularity gave lots of possible references. One near the top that may be relevent was http://enstrophy.colorado.edu/~mohseni/PSpdf/MyPapers/JCP2000.pdf PETSc itself doesn't address the nitty-gritty details of differencing schemes so I don't think has any particular tool to help manage the differencing. Barry On Tue, 9 May 2006, Sean Dettrick wrote: > Hi, > I am solving Poisson's equation > Div. 
Grad(Phi)=rho > in the (R,Theta) plane with KSP, and it is working well if I avoid the axis. > But I would like to put a point at R=0. Is there a recommended way to do > that? > Thanks, > Sean > > From sanjay at ce.berkeley.edu Wed May 10 21:50:29 2006 From: sanjay at ce.berkeley.edu (Sanjay Govindjee) Date: Wed, 10 May 2006 19:50:29 -0700 Subject: polar coordinate singularity In-Reply-To: References: <4460B3C4.5050600@trialphaenergy.com> Message-ID: <4462A675.90600@ce.berkeley.edu> There should be nothing special needed to do this in general for finite difference (or finite element) methods other than the usual need for one sided derivatives at the boundary of the domain. -sg Barry Smith wrote: > > A simple check in google of finite differences polar singularity > gave lots of possible references. One near the top that may be > relevent was > http://enstrophy.colorado.edu/~mohseni/PSpdf/MyPapers/JCP2000.pdf > > PETSc itself doesn't address the nitty-gritty details of > differencing schemes > so I don't think has any particular tool to help manage the differencing. > > Barry > > > On Tue, 9 May 2006, Sean Dettrick wrote: > >> Hi, >> I am solving Poisson's equation >> Div. Grad(Phi)=rho >> in the (R,Theta) plane with KSP, and it is working well if I avoid >> the axis. But I would like to put a point at R=0. Is there a >> recommended way to do that? >> Thanks, >> Sean >> >> From jordi.marce at upc.edu Thu May 11 11:23:50 2006 From: jordi.marce at upc.edu (=?ISO-8859-1?Q?Jordi_Marc=E9_Nogu=E9?=) Date: Thu, 11 May 2006 18:23:50 +0200 Subject: memory bleeding In-Reply-To: References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <4460BEB8.3010909@upc.edu> <4461B0FA.3040700@upc.edu> Message-ID: <44636516.8030601@upc.edu> Ok! thanks! It runs so good!! I supposed it when I read yours fisrts mails but I tried it last day and I didn't write the code correctly. When I have readed this mail I've tried again and I've found the error! jordi En/na Matthew Knepley ha escrit: > The code you have there will create a matrix for every loop iteration, > and will only destroy one at the end. The new MatMatMult() interface > has an argument which allows the user to pass in a matrix. > > Matt > > On 5/10/06, *Jordi Marc? Nogu?* > wrote: > > Yes, of course I destroy the matrix.... The scheme of my code when I > create a diagonal Mass-lumping matrix is the code below. > > In MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); the number is 6 > because in a general coordinates it's possible this matrix changes in a > 6x6 full matrix. 
> > "element3D *p_element = lfiber[i].getElement(e)" is a internal > procedure to obtain information about the element > > ------------------------------------------------------------------ > > Mat M_aux1, M_aux2; > > MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux1); > MatCreateSeqAIJ(PETSC_COMM_SELF, 6, 6, 6, PETSC_NULL, &M_aux2); > > MatSeqAIJSetPreallocation(M_aux1,6,PETSC_NULL); > MatSeqAIJSetPreallocation(M_aux2,6,PETSC_NULL); > > MatSetFromOptions(M_aux1); > MatSetOption(M_aux1, MAT_SYMMETRIC); > MatSetOption(M_aux1, MAT_IGNORE_ZERO_ENTRIES); > > MatSetFromOptions(M_aux2); > MatSetOption(M_aux2, MAT_SYMMETRIC); > MatSetOption(M_aux2, MAT_IGNORE_ZERO_ENTRIES); > > > for(uint32_t i=0; i { > for(uint32_t e=0; e { > > MatZeroentries(M_aux1); > MatZeroentries(M_aux2); > > element3D *p_element = lfiber[i].getElement(e); > > p_element->updateTs(); > > MatSetValue(M_aux1, 0, 0, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 1, 1, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 2, 2, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 3, 3, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 4, 4, p_element->L0 * value.pho / 6., > INSERT_VALUES); > MatSetValue(M_aux1, 5, 5, p_element->L0 * value.pho / 6., > INSERT_VALUES); > > > MatAssemblyBegin(M_aux1,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(M_aux1,MAT_FINAL_ASSEMBLY); > > > MatMatMult(M_aux1,p_element->T,&M_aux2); > MatMatMult(p_element->TT,M_aux2,&M_aux1); > > } > > // here I work with the matrix M_aux2, but in this point the > memory > // is constant. The code doesn't waste memory > } > > MatDestroy(M_aux1); > MatDestroy(M_aux2); > > ------------------------ > > > Thanks, > > best regards > > > -- > Jordi Marc?-Nogu? > Dept. Resist?ncia de Materials i Estructures a l'Enginyeria > Universitat Polit?cnica de Catalunya (UPC) > > Edifici T45 - despatx 137 > ETSEIAT (Terrassa) > > phone: +34 937 398 728 > mail: jordi.marce at upc.edu > > > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir > Alec Guiness -- Jordi Marc?-Nogu? Dept. Resist?ncia de Materials i Estructures a l'Enginyeria Universitat Polit?cnica de Catalunya (UPC) Edifici T45 - despatx 137 ETSEIAT (Terrassa) phone: +34 937 398 728 mail: jordi.marce at upc.edu From jadelman at ocf.berkeley.edu Thu May 11 17:54:28 2006 From: jadelman at ocf.berkeley.edu (Joshua Adelman) Date: Thu, 11 May 2006 15:54:28 -0700 Subject: PETSc Problem with fgets Message-ID: I am trying to read some parameters into my PETSc simulation using fgets, and am getting a strange error. It appears that fgets and sscanf are working since the proper values are read into the simulation, but it throws an error anyways. This problem only arises when I am using the parallel version of the code (i.e. setting -n in mpiexec to a value greater than 1). Basically if I comment everything out and then start uncommenting one line at a time, the error appears once I've uncommented the line with fgets. One strange thing to note is that the error happens after the program has already finished executing the function where the problem is. 
Here's a snippet of the code that seems to be causing the problem: PetscErrorCode DataReadParams(SimData *sdata) { int rank, size; PetscErrorCode ierr; int i; FILE *fd; char str[256]; char line[100]; int max = 100; PetscFunctionBegin; ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); /* Read data from files */ sprintf(str,"%s%s",sdata->simname,".param"); ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); if (!rank && !fd) { SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); } // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); fgets(line,max,fd); sscanf(line,"%d",&sdata->NP); The error that is kicked out is: [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal memory access [1]PETSC ERROR: Try option -start_in_debugger or - on_error_attach_debugger [1]PETSC ERROR: or try http://valgrind.org on linux to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------- Stack Frames --------------- [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c [1]PETSC ERROR: [1] SimInit line 19 siminit.c [1]PETSC ERROR: -------------------------------------------- [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [1]PETSC ERROR: Signal received! [1]PETSC ERROR: ! Any insight would be most appreciated. Josh ------------------------------------------------------------------------ ------------------------------ Joshua L. Adelman Biophysics Graduate Group Lab: 510.643.2159 218 Wellman Hall Fax: 510.642.7428 University of California, Berkeley http://www.ocf.berkeley.edu/ ~jadelman Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu ------------------------------------------------------------------------ ------------------------------ From shma7099 at student.uu.se Thu May 11 19:56:30 2006 From: shma7099 at student.uu.se (Sh.M) Date: Fri, 12 May 2006 02:56:30 +0200 Subject: PETSc Problem with fgets References: Message-ID: <000801c6755e$e8b6af60$2516e055@bredbandsbolaget.se> Hi, I apologize in advance if I am confusing you or down right saying something that is wrong. When I initially checked you code, it looks like CPU O is opening the file... but the rest of the CPUs try to read it aswell even though they havent opened(dont have a correct file pointer to the file) it yet... This would cause a segmentation fault. fgets is using the fd variable, wich in case of anything but CPU 0 are pointing to some garbage and not the file itself. With best regards, Shaman Mahmoudi ----- Original Message ----- From: "Joshua Adelman" To: Sent: Friday, May 12, 2006 12:54 AM Subject: PETSc Problem with fgets > I am trying to read some parameters into my PETSc simulation using > fgets, and am getting a strange error. It appears that fgets and > sscanf are working since the proper values are read into the > simulation, but it throws an error anyways. This problem only arises > when I am using the parallel version of the code (i.e. setting -n in > mpiexec to a value greater than 1). Basically if I comment everything > out and then start uncommenting one line at a time, the error appears > once I've uncommented the line with fgets. 
One strange thing to note > is that the error happens after the program has already finished > executing the function where the problem is. Here's a snippet of the > code that seems to be causing the problem: > > PetscErrorCode DataReadParams(SimData *sdata) { > int rank, size; > PetscErrorCode ierr; > int i; > FILE *fd; > char str[256]; > char line[100]; > int max = 100; > > PetscFunctionBegin; > > ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); > > > /* Read data from files */ > sprintf(str,"%s%s",sdata->simname,".param"); > ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); > if (!rank && !fd) { > SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); > } > // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); > fgets(line,max,fd); > sscanf(line,"%d",&sdata->NP); > > The error that is kicked out is: > > > [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly > illegal memory access > [1]PETSC ERROR: Try option -start_in_debugger or - > on_error_attach_debugger > [1]PETSC ERROR: or try http://valgrind.org on linux to find memory > corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------- Stack Frames --------------- > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the > function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c > [1]PETSC ERROR: [1] SimInit line 19 siminit.c > [1]PETSC ERROR: -------------------------------------------- > [1]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: ! > > > Any insight would be most appreciated. > > Josh > > ------------------------------------------------------------------------ > ------------------------------ > Joshua L. Adelman > Biophysics Graduate Group Lab: 510.643.2159 > 218 Wellman Hall Fax: 510.642.7428 > University of California, Berkeley http://www.ocf.berkeley.edu/ > ~jadelman > Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu > ------------------------------------------------------------------------ > ------------------------------ > > > From joel.schaerer at insa-lyon.fr Thu May 11 18:47:18 2006 From: joel.schaerer at insa-lyon.fr (Joel Schaerer) Date: Fri, 12 May 2006 01:47:18 +0200 Subject: Installation on windows Message-ID: <1147391238.4463cd06716b7@webmail.insa-lyon.fr> Hi all, I've been using Petsc on linux successfully for some time. I'm now trying to port my code under windows, and it turns out, installing petsc on windows is not easy! If possible, I would like to use petsc with Visual Studio (my version is .NET 2003). I tried various config/configure.py options, but to no avail, petsc still uses gcc to compile itself... If that is not possible, is it possible to compile petsc with gcc and link the resulting libraries to Visual Studio compiled code? I've only been able to produce .a libraries, and if I understand well, I can't use .a libraries with visual studio. So: has anyone compiled petsc with the visual studio compilers? If yes, how (configure options, PETSC_ARCH used, etc.)? Also, is there any recent documentation about petsc on windows available? I've only been able to find outdated stuff... Thanks a lot! 
Joel Schaerer From jadelman at ocf.berkeley.edu Thu May 11 20:28:59 2006 From: jadelman at ocf.berkeley.edu (Joshua Adelman) Date: Thu, 11 May 2006 18:28:59 -0700 Subject: PETSc Problem with fgets In-Reply-To: <000801c6755e$e8b6af60$2516e055@bredbandsbolaget.se> References: <000801c6755e$e8b6af60$2516e055@bredbandsbolaget.se> Message-ID: <7C1B5D3E-6D73-42F3-A766-578C338B8E8D@ocf.berkeley.edu> OK, so I changed it to PETSC_COMM_SELF, which I believe takes care of the problem. Thanks for your help. Josh On May 11, 2006, at 5:56 PM, Sh.M wrote: > Hi, > > I apologize in advance if I am confusing you or down right saying > something > that is wrong. > > When I initially checked you code, it looks like CPU O is opening the > file... but the rest of the CPUs try to read it aswell even though > they > havent opened(dont have a correct file pointer to the file) it > yet... This > would cause a segmentation fault. fgets is using the fd variable, > wich in > case of anything but CPU 0 are pointing to some garbage and not the > file > itself. > > With best regards, Shaman Mahmoudi > > ----- Original Message ----- > From: "Joshua Adelman" > To: > Sent: Friday, May 12, 2006 12:54 AM > Subject: PETSc Problem with fgets > > >> I am trying to read some parameters into my PETSc simulation using >> fgets, and am getting a strange error. It appears that fgets and >> sscanf are working since the proper values are read into the >> simulation, but it throws an error anyways. This problem only arises >> when I am using the parallel version of the code (i.e. setting -n in >> mpiexec to a value greater than 1). Basically if I comment everything >> out and then start uncommenting one line at a time, the error appears >> once I've uncommented the line with fgets. One strange thing to note >> is that the error happens after the program has already finished >> executing the function where the problem is. Here's a snippet of the >> code that seems to be causing the problem: >> >> PetscErrorCode DataReadParams(SimData *sdata) { >> int rank, size; >> PetscErrorCode ierr; >> int i; >> FILE *fd; >> char str[256]; >> char line[100]; >> int max = 100; >> >> PetscFunctionBegin; >> >> ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); >> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); >> >> >> /* Read data from files */ >> sprintf(str,"%s%s",sdata->simname,".param"); >> ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); >> if (!rank && !fd) { >> SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); >> } >> // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); >> fgets(line,max,fd); >> sscanf(line,"%d",&sdata->NP); >> >> The error that is kicked out is: >> >> >> [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >> illegal memory access >> [1]PETSC ERROR: Try option -start_in_debugger or - >> on_error_attach_debugger >> [1]PETSC ERROR: or try http://valgrind.org on linux to find memory >> corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------- Stack Frames --------------- >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the >> function >> [1]PETSC ERROR: is given. 
>> [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c >> [1]PETSC ERROR: [1] SimInit line 19 siminit.c >> [1]PETSC ERROR: -------------------------------------------- >> [1]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> [1]PETSC ERROR: Signal received! >> [1]PETSC ERROR: ! >> >> >> Any insight would be most appreciated. >> >> Josh ------------------------------------------------------------------------ ------------------------------ Joshua L. Adelman Biophysics Graduate Group Lab: 510.643.2159 218 Wellman Hall Fax: 510.642.7428 University of California, Berkeley http://www.ocf.berkeley.edu/ ~jadelman Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu ------------------------------------------------------------------------ ------------------------------ From balay at mcs.anl.gov Thu May 11 20:28:08 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 11 May 2006 20:28:08 -0500 (CDT) Subject: Installation on windows In-Reply-To: <1147391238.4463cd06716b7@webmail.insa-lyon.fr> References: <1147391238.4463cd06716b7@webmail.insa-lyon.fr> Message-ID: If you have problems installing PETSc - send us the relavent logfiles to petsc-maint at mcs.anl.gov [the relavent log file here is configure.log] [assuming you don't need fortran & mpi] - you should be able to run configure as: ./config/configure.py --with-cc='win32fe cl' --download-c-blas-lapack=1 --with-mpi=0 Satish On Fri, 12 May 2006, Joel Schaerer wrote: > Hi all, > > I've been using Petsc on linux successfully for some time. I'm now trying to > port my code under windows, and it turns out, installing petsc on windows is > not easy! > If possible, I would like to use petsc with Visual Studio (my version is .NET > 2003). I tried various config/configure.py options, but to no avail, petsc > still uses gcc to compile itself... > > If that is not possible, is it possible to compile petsc with gcc and link the > resulting libraries to Visual Studio compiled code? I've only been able to > produce .a libraries, and if I understand well, I can't use .a libraries with > visual studio. > > So: has anyone compiled petsc with the visual studio compilers? If yes, how > (configure options, PETSC_ARCH used, etc.)? Also, is there any recent > documentation about petsc on windows available? I've only been able to find > outdated stuff... > > Thanks a lot! > > Joel Schaerer > > From bsmith at mcs.anl.gov Fri May 12 07:59:14 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 May 2006 07:59:14 -0500 (CDT) Subject: PETSc Problem with fgets In-Reply-To: References: Message-ID: Joshua, PetscFOpen() only produces a valid fd on process 0. Thus fgets() can only be used on process 0. It is crashing on process 1 because fd is garbage there. You need to MPI_Scatter the data to the other processes. There is no PetscFGets(), though come to think of it there should be! I will write one and send it to you. Barry On Thu, 11 May 2006, Joshua Adelman wrote: > I am trying to read some parameters into my PETSc simulation using fgets, and > am getting a strange error. It appears that fgets and sscanf are working > since the proper values are read into the simulation, but it throws an error > anyways. This problem only arises when I am using the parallel version of the > code (i.e. setting -n in mpiexec to a value greater than 1). Basically if I > comment everything out and then start uncommenting one line at a time, the > error appears once I've uncommented the line with fgets. 
One strange thing to > note is that the error happens after the program has already finished > executing the function where the problem is. Here's a snippet of the code > that seems to be causing the problem: > > PetscErrorCode DataReadParams(SimData *sdata) { > int rank, size; > PetscErrorCode ierr; > int i; > FILE *fd; > char str[256]; > char line[100]; > int max = 100; > PetscFunctionBegin; > > ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); > > /* Read data from files */ > sprintf(str,"%s%s",sdata->simname,".param"); > ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); > if (!rank && !fd) { > SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); > } > // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); > fgets(line,max,fd); > sscanf(line,"%d",&sdata->NP); > > The error that is kicked out is: > > > [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal > memory access > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or try http://valgrind.org on linux to find memory corruption > errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------- Stack Frames --------------- > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c > [1]PETSC ERROR: [1] SimInit line 19 siminit.c > [1]PETSC ERROR: -------------------------------------------- > [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown > file > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: ! > > > Any insight would be most appreciated. > > Josh > > ------------------------------------------------------------------------------------------------------ > Joshua L. Adelman > Biophysics Graduate Group Lab: 510.643.2159 > 218 Wellman Hall Fax: 510.642.7428 > University of California, Berkeley http://www.ocf.berkeley.edu/~jadelman > Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu > ------------------------------------------------------------------------------------------------------ > > > From bsmith at mcs.anl.gov Fri May 12 13:23:09 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 May 2006 13:23:09 -0500 (CDT) Subject: PETSc Problem with fgets In-Reply-To: References: Message-ID: Stupid me, I don't know my own code. You can use PetscSynchronizedFGets() exactly for this purpose. Barry I have updated the manual page for PetscFOpen() to seealso: PetscSynchronizedFGets() On Fri, 12 May 2006, Barry Smith wrote: > > Joshua, > > PetscFOpen() only produces a valid fd on process 0. Thus fgets() > can only be used on process 0. It is crashing on process 1 because > fd is garbage there. > > You need to MPI_Scatter the data to the other processes. There is > no PetscFGets(), though come to think of it there should be! I will write > one and send it to you. > > Barry > > > > On Thu, 11 May 2006, Joshua Adelman wrote: > >> I am trying to read some parameters into my PETSc simulation using fgets, >> and am getting a strange error. It appears that fgets and sscanf are >> working since the proper values are read into the simulation, but it throws >> an error anyways. This problem only arises when I am using the parallel >> version of the code (i.e. setting -n in mpiexec to a value greater than 1). 
>> Basically if I comment everything out and then start uncommenting one line >> at a time, the error appears once I've uncommented the line with fgets. One >> strange thing to note is that the error happens after the program has >> already finished executing the function where the problem is. Here's a >> snippet of the code that seems to be causing the problem: >> >> PetscErrorCode DataReadParams(SimData *sdata) { >> int rank, size; >> PetscErrorCode ierr; >> int i; >> FILE *fd; >> char str[256]; >> char line[100]; >> int max = 100; >> PetscFunctionBegin; >> >> ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); >> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); >> >> /* Read data from files */ >> sprintf(str,"%s%s",sdata->simname,".param"); >> ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); >> if (!rank && !fd) { >> SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); >> } >> // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); >> fgets(line,max,fd); >> sscanf(line,"%d",&sdata->NP); >> >> The error that is kicked out is: >> >> >> [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal >> memory access >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or try http://valgrind.org on linux to find memory >> corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------- Stack Frames --------------- >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c >> [1]PETSC ERROR: [1] SimInit line 19 siminit.c >> [1]PETSC ERROR: -------------------------------------------- >> [1]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> [1]PETSC ERROR: Signal received! >> [1]PETSC ERROR: ! >> >> >> Any insight would be most appreciated. >> >> Josh >> >> >> ------------------------------------------------------------------------------------------------------ >> Joshua L. Adelman >> Biophysics Graduate Group Lab: 510.643.2159 >> 218 Wellman Hall Fax: 510.642.7428 >> University of California, Berkeley >> http://www.ocf.berkeley.edu/~jadelman >> Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu >> >> ------------------------------------------------------------------------------------------------------ >> >> >> > From jadelman at OCF.Berkeley.EDU Fri May 12 13:38:22 2006 From: jadelman at OCF.Berkeley.EDU (Joshua L. Adelman) Date: Fri, 12 May 2006 11:38:22 -0700 Subject: PETSc Problem with fgets In-Reply-To: References: Message-ID: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> Just to clarify, it appears from the documentation that this causes several processors to get the same line from a file. If I just want to have a single processor get a line and then parse that to some variables within a struct, it would seem like I don't need all of my processors reading the line. Would I just use PETSC_WORLD_SELF in the PetscFOpen() command to have a single processor open the file, and then use _SELF again in the synchronized fgets to have a single processor take care of retrieving the line? Thanks again for your help. Josh On May 12, 2006, at 11:23 AM, Barry Smith wrote: > > Stupid me, I don't know my own code. > > You can use PetscSynchronizedFGets() exactly for this purpose. 
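A sketch of that suggestion applied to the DataReadParams() snippet from earlier in the thread, assuming PetscSynchronizedFGets() takes (communicator, file, buffer length, buffer); every process receives the same line and can parse it locally:

    FILE *fd;
    char line[100];

    ierr = PetscFOpen(PETSC_COMM_WORLD, str, "r", &fd); CHKERRQ(ierr);
    ierr = PetscSynchronizedFGets(PETSC_COMM_WORLD, fd, 100, line); CHKERRQ(ierr);
    sscanf(line, "%d", &sdata->NP);   /* str and sdata as in the original snippet */
    ierr = PetscFClose(PETSC_COMM_WORLD, fd); CHKERRQ(ierr);

If only rank 0 actually needs the values, the alternative described below is plain fopen()/fgets() inside an if (!rank) block, with an MPI scatter or broadcast for anything the other ranks still need.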
> > Barry > > I have updated the manual page for PetscFOpen() to seealso: > PetscSynchronizedFGets() > > > On Fri, 12 May 2006, Barry Smith wrote: > >> >> Joshua, >> >> PetscFOpen() only produces a valid fd on process 0. Thus fgets() >> can only be used on process 0. It is crashing on process 1 because >> fd is garbage there. >> >> You need to MPI_Scatter the data to the other processes. There is >> no PetscFGets(), though come to think of it there should be! I >> will write >> one and send it to you. >> >> Barry >> >> >> >> On Thu, 11 May 2006, Joshua Adelman wrote: >> >>> I am trying to read some parameters into my PETSc simulation >>> using fgets, and am getting a strange error. It appears that >>> fgets and sscanf are working since the proper values are read >>> into the simulation, but it throws an error anyways. This problem >>> only arises when I am using the parallel version of the code >>> (i.e. setting -n in mpiexec to a value greater than 1). Basically >>> if I comment everything out and then start uncommenting one line >>> at a time, the error appears once I've uncommented the line with >>> fgets. One strange thing to note is that the error happens after >>> the program has already finished executing the function where the >>> problem is. Here's a snippet of the code that seems to be causing >>> the problem: >>> PetscErrorCode DataReadParams(SimData *sdata) { >>> int rank, size; >>> PetscErrorCode ierr; >>> int i; >>> FILE *fd; >>> char str[256]; >>> char line[100]; >>> int max = 100; >>> PetscFunctionBegin; >>> >>> ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); >>> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); >>> >>> /* Read data from files */ >>> sprintf(str,"%s%s",sdata->simname,".param"); >>> ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); >>> if (!rank && !fd) { >>> SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); >>> } >>> // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); >>> fgets(line,max,fd); >>> sscanf(line,"%d",&sdata->NP); >>> The error that is kicked out is: >>> [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>> illegal memory access >>> [1]PETSC ERROR: Try option -start_in_debugger or - >>> on_error_attach_debugger >>> [1]PETSC ERROR: or try http://valgrind.org on linux to find >>> memory corruption errors >>> [1]PETSC ERROR: likely location of problem given in stack below >>> [1]PETSC ERROR: --------------- Stack Frames --------------- >>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>> available, >>> [1]PETSC ERROR: INSTEAD the line number of the start of the >>> function >>> [1]PETSC ERROR: is given. >>> [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c >>> [1]PETSC ERROR: [1] SimInit line 19 siminit.c >>> [1]PETSC ERROR: -------------------------------------------- >>> [1]PETSC ERROR: User provided function() line 0 in unknown >>> directory unknown file >>> [1]PETSC ERROR: Signal received! >>> [1]PETSC ERROR: ! >>> Any insight would be most appreciated. >>> Josh ------------------------------------------------------------------------ ------------------------------ Joshua L. 
Adelman Biophysics Graduate Group Lab: 510.643.2159 218 Wellman Hall Fax: 510.642.7428 University of California, Berkeley http://www.ocf.berkeley.edu/ ~jadelman Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu ------------------------------------------------------------------------ ------------------------------ From bsmith at mcs.anl.gov Fri May 12 13:58:06 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 May 2006 13:58:06 -0500 (CDT) Subject: PETSc Problem with fgets In-Reply-To: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> Message-ID: On Fri, 12 May 2006, Joshua L. Adelman wrote: > Just to clarify, it appears from the documentation that this causes several > processors to get the same line from a file. Yes > If I just want to have a single > processor get a line and then parse that to some variables within a struct, > it would seem like I don't need all of my processors reading the line. > Yes, you don't need > Would I just use PETSC_WORLD_SELF in the PetscFOpen() command to have a > single processor open the file, and then use _SELF again in the synchronized > fgets to have a single processor take care of retrieving the line? Yes, you can just use SELF, but you need to put all of this stuff INSIDE the test of if (!rank) { In fact, if you are only having process 0 open the file, read the data and parse it then you don't need to use PetscFOpen(), etc AT ALL, you can just use plan old fopen(), fgets() inside the if (!rank). On the other hand if you want all the information from the file on all the processes you can just use the PetscFOpen(), PetscFGets(), PetscFClose() with NO if (!rank) { code needed. Barry > > Thanks again for your help. > > Josh > > On May 12, 2006, at 11:23 AM, Barry Smith wrote: > >> >> Stupid me, I don't know my own code. >> >> You can use PetscSynchronizedFGets() exactly for this purpose. >> >> Barry >> >> I have updated the manual page for PetscFOpen() to seealso: >> PetscSynchronizedFGets() >> >> >> On Fri, 12 May 2006, Barry Smith wrote: >> >>> >>> Joshua, >>> >>> PetscFOpen() only produces a valid fd on process 0. Thus fgets() >>> can only be used on process 0. It is crashing on process 1 because >>> fd is garbage there. >>> >>> You need to MPI_Scatter the data to the other processes. There is >>> no PetscFGets(), though come to think of it there should be! I will write >>> one and send it to you. >>> >>> Barry >>> >>> >>> >>> On Thu, 11 May 2006, Joshua Adelman wrote: >>> >>>> I am trying to read some parameters into my PETSc simulation using fgets, >>>> and am getting a strange error. It appears that fgets and sscanf are >>>> working since the proper values are read into the simulation, but it >>>> throws an error anyways. This problem only arises when I am using the >>>> parallel version of the code (i.e. setting -n in mpiexec to a value >>>> greater than 1). Basically if I comment everything out and then start >>>> uncommenting one line at a time, the error appears once I've uncommented >>>> the line with fgets. One strange thing to note is that the error happens >>>> after the program has already finished executing the function where the >>>> problem is. 
Here's a snippet of the code that seems to be causing the >>>> problem: >>>> PetscErrorCode DataReadParams(SimData *sdata) { >>>> int rank, size; >>>> PetscErrorCode ierr; >>>> int i; >>>> FILE *fd; >>>> char str[256]; >>>> char line[100]; >>>> int max = 100; >>>> PetscFunctionBegin; >>>> >>>> ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); >>>> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); >>>> >>>> /* Read data from files */ >>>> sprintf(str,"%s%s",sdata->simname,".param"); >>>> ierr = PetscFOpen(PETSC_COMM_WORLD,str,"r",&fd); CHKERRQ(ierr); >>>> if (!rank && !fd) { >>>> SETERRQ1(PETSC_ERR_FILE_OPEN,"Cannot open %s\n",str); >>>> } >>>> // PetscPrintf(PETSC_COMM_WORLD,"Opening file %s\n",str); >>>> fgets(line,max,fd); >>>> sscanf(line,"%d",&sdata->NP); >>>> The error that is kicked out is: >>>> [1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly illegal >>>> memory access >>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [1]PETSC ERROR: or try http://valgrind.org on linux to find memory >>>> corruption errors >>>> [1]PETSC ERROR: likely location of problem given in stack below >>>> [1]PETSC ERROR: --------------- Stack Frames --------------- >>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>> available, >>>> [1]PETSC ERROR: INSTEAD the line number of the start of the >>>> function >>>> [1]PETSC ERROR: is given. >>>> [1]PETSC ERROR: [1] DataReadParams line 23 datareadparams.c >>>> [1]PETSC ERROR: [1] SimInit line 19 siminit.c >>>> [1]PETSC ERROR: -------------------------------------------- >>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> [1]PETSC ERROR: Signal received! >>>> [1]PETSC ERROR: ! >>>> Any insight would be most appreciated. >>>> Josh > > ------------------------------------------------------------------------------------------------------ > Joshua L. Adelman > Biophysics Graduate Group Lab: 510.643.2159 > 218 Wellman Hall Fax: 510.642.7428 > University of California, Berkeley http://www.ocf.berkeley.edu/~jadelman > Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu > ------------------------------------------------------------------------------------------------------ > > > From Stephen.R.Ball at awe.co.uk Mon May 15 04:46:34 2006 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Mon, 15 May 2006 10:46:34 +0100 Subject: Extracting the Hypre preconditioner type used and PGI compiler su pport for VecGetArrayF90() and VecRestoreArrayF90() Message-ID: <65FAo7021667@awe.co.uk> Hi Can you tell me if there is any way I can extract the name of the Hypre preconditioner type used i.e. boomeramg, euclid, parasails or pilut? Failing that is there any chance you could create a routine (with a Fortran interface) called PCHYPREGetType() that returns the Hypre preconditioner type used? Also, can you tell me if there is there any likelihood in the near future for Portland Group Inc (PGI) compiler support to be provided for routines VecGetArrayF90() and VecRestoreArrayF90()? Regards Stephen R. Ball Parallel Technology Support HPC Rm: G17 Bldg: E1.1 AWE(A) Aldermaston Reading Berkshire ENGLAND RG7 4PR Tel: +44 (0)118 982 4528 e-mail: stephen.r.ball at awe.co.uk -- _______________________________________________________________________________ The information in this email and in any attachment(s) is commercial in confidence. 
If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR From knepley at gmail.com Mon May 15 08:45:59 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2006 08:45:59 -0500 Subject: Extracting the Hypre preconditioner type used and PGI compiler su pport for VecGetArrayF90() and VecRestoreArrayF90() In-Reply-To: <65FAo7021667@awe.co.uk> References: <65FAo7021667@awe.co.uk> Message-ID: On 5/15/06, Stephen R Ball wrote: > > > Hi > > Can you tell me if there is any way I can extract the name of the Hypre > preconditioner type used i.e. boomeramg, euclid, parasails or pilut? > Failing > that is there any chance you could create a routine (with a Fortran > interface) called PCHYPREGetType() that returns the Hypre preconditioner > type used? I agree that this is an oversight. I will put it in the development version, and it will appear in the next release. Also, can you tell me if there is there any likelihood in the near future > for Portland Group Inc (PGI) compiler support to be provided for routines > VecGetArrayF90() and VecRestoreArrayF90()? This is more problematic. Most F90 compilers have an array descriptor that is a variant of the NAG format. However PGF90 does not, and these refuse to tell us the layout. Thus we cannot add those routines. Thanks, Matt Regards > > Stephen R. Ball > Parallel Technology Support > HPC > Rm: G17 > Bldg: E1.1 > AWE(A) > Aldermaston > Reading > Berkshire > ENGLAND > RG7 4PR > Tel: +44 (0)118 982 4528 > e-mail: stephen.r.ball at awe.co.uk > -- > > _______________________________________________________________________________ > > The information in this email and in any attachment(s) is commercial in > confidence. If you are not the named addressee(s) or if you receive this > email in error then any distribution, copying or use of this communication > or the information in it is strictly prohibited. Please notify us > immediately by email at admin.internet(at)awe.co.uk, and then delete this > message from your computer. While attachments are virus checked, AWE plc > does not accept any liability in respect of any virus which is not detected. > > AWE Plc > Registered in England and Wales > Registration No 02763902 > AWE, Aldermaston, Reading, RG7 4PR > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at trialphaenergy.com Mon May 15 21:14:31 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Mon, 15 May 2006 22:14:31 -0400 Subject: DAcreate2d process layout order Message-ID: <44693587.6010907@trialphaenergy.com> Hi, I'm trying to use DACreate2d and KSP in my existing MPI application. I already have a Cartesian communicator established, and I set PETSC_COMM_WORLD equal to it and then call PetscInitialize. This works fine on a prime number of CPUs, because there is only one possible ordered MPI layout in one dimension. 
But with a non-prime number there are two possible ordered layouts and it just happens that my 2D CPU layout (determined by MPI_Cart_create) is the transpose of the PETSc 2D CPU layout. Is there a way to organize the DA layout more explicitly than with DACreate2d? Or to tell PETSc to transpose its CPU order? I also wonder about the 3D case. thanks Sean From shma7099 at student.uu.se Tue May 16 00:23:08 2006 From: shma7099 at student.uu.se (Sh.M) Date: Tue, 16 May 2006 07:23:08 +0200 Subject: SBAIJ + hypre preconditioners does not work? References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> Message-ID: <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> Hi, So far I have been using AIJ to construct my matrices and this has been due to a legacy more than anything else. However now I do need the extra memory as my problems are getting bigger and bigger. So I have tried to use the SBAIJ matrix format with block size = 1. My matrices are read from a file and are symmetric but also "full" due to a legacy. However as I understand I can treat the full matrix as if it was just the upper triangle of it if I use command line -mat_ignore_lower_triangular. It has worked just fine when I for example use solver = CG and preconditioner = Jacobi. However I am getting some errors when using hypre preconditioners, for exampel BoomerAMG. I do know or atleast think by my own past experience with Hypre that symmetric sparse matrix support has not been trivial in Hypre eventhough the support was there. Can there be a case of PETSc + Hypre in combinaton simple not supporting the SBAIJ matrix format? I am getting this error message and I have tried with different matrices ranging from 54x54 to 5Mx5M and 1-32 CPUs. It crashes when it calls the solve function: [0]PETSC ERROR: MatGetRow_SeqSBAIJ() line 207 in src/mat/impls/sbaij/seq/sbaij.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: MatGetRow is not supported for SBAIJ matrix format. Getting the upper trian\ gular part of row, run with -mat_getrow_uppertriangular, call MatSetOption(mat,MAT_GETROW_U\ PPERTRIANGULAR) or MatGetRowUpperTriangular()! [0]PETSC ERROR: MatGetRow_MPISBAIJ() line 1056 in src/mat/impls/sbaij/mpi/mpisbaij.c [0]PETSC ERROR: MatGetRow() line 168 in src/mat/interface/matrix.c [0]PETSC ERROR: MatCholeskyCheckShift_inline() line 45 in src/mat/impls/hypre/mhyp.c [0]PETSC ERROR: PCSetUp_HYPRE() line 95 in src/ksp/pc/impls/hypre/hyppilut.c [0]PETSC ERROR: PCSetUp() line 798 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 334 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided function() line 683 in petscSolver.c [2509 MPI_COMM_WORLD 0] Process exited without calling MPI_Finalize. Fatal error, aborting. With best regards, Shaman Mahmoudi From shma7099 at student.uu.se Tue May 16 00:52:35 2006 From: shma7099 at student.uu.se (Sh.M) Date: Tue, 16 May 2006 07:52:35 +0200 Subject: SBAIJ + hypre preconditioners does not work? References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> Message-ID: <004d01c678ac$ee708f60$2516e055@bredbandsbolaget.se> Ohh I forgot to use the flag -mat_getrow_uppertriangular? It seems to be the case. And the Hypre preconditioner will understand that I am only providing the upper triangular part of the matrix? 
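For readers new to the symmetric format, here is a minimal sketch of the assembly pattern being discussed in this thread: an MPISBAIJ matrix with block size 1 in which only the diagonal and the entries above it are set, solved with CG plus Jacobi (the combination reported earlier in the thread to work). The tridiagonal test matrix, its size N and the preallocation counts are illustrative assumptions rather than code from the thread, and the calling sequences follow the 2.3-era C API (e.g. MatDestroy() takes the object itself, KSPSetOperators() takes a MatStructure flag).

#include "petscksp.h"

int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PC             pc;
  PetscInt       i,j,Istart,Iend,N = 100;
  PetscScalar    diag = 2.0,off = -1.0,one = 1.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);

  /* SBAIJ with block size 1: only entries with column >= row are stored */
  ierr = MatCreateMPISBAIJ(PETSC_COMM_WORLD,1,PETSC_DECIDE,PETSC_DECIDE,N,N,
                           2,PETSC_NULL,1,PETSC_NULL,&A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    ierr = MatSetValues(A,1,&i,1,&i,&diag,INSERT_VALUES);CHKERRQ(ierr);
    if (i < N-1) {   /* upper off-diagonal only; the lower one is implied by symmetry */
      j = i + 1;
      ierr = MatSetValues(A,1,&i,1,&j,&off,INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,N,&b);CHKERRQ(ierr);
  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
  ierr = VecSet(b,one);CHKERRQ(ierr);

  /* CG + Jacobi; both can still be overridden from the command line */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPCG);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCJACOBI);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

Loading a full symmetric matrix with -mat_ignore_lower_triangular, as described earlier in the thread, should end up with the same stored structure: lower-triangular insertions are simply dropped, so they never occupy memory.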
With best regards, Shaman Mahmoudi ----- Original Message ----- From: "Sh.M" To: Sent: Tuesday, May 16, 2006 7:23 AM Subject: SBAIJ + hypre preconditioners does not work? > Hi, > > So far I have been using AIJ to construct my matrices and this has been due > to a legacy more than anything else. > However now I do need the extra memory as my problems are getting bigger and > bigger. So I have tried to use the > SBAIJ matrix format with block size = 1. My matrices are read from a file > and are symmetric but also "full" due to a legacy. > However as I understand I can treat the full matrix as if it was just the > upper triangle of it if I use command line > -mat_ignore_lower_triangular. > > It has worked just fine when I for example use solver = CG and > preconditioner = Jacobi. > However I am getting some errors when using hypre preconditioners, for > exampel BoomerAMG. > I do know or atleast think by my own past experience with Hypre that > symmetric sparse matrix support has not been trivial in Hypre eventhough the > support was there. Can there be a case of PETSc + Hypre in combinaton simple > not supporting the SBAIJ matrix format? > > I am getting this error message and I have tried with different matrices > ranging from 54x54 to 5Mx5M and 1-32 CPUs. > > It crashes when it calls the solve function: > > [0]PETSC ERROR: MatGetRow_SeqSBAIJ() line 207 in > src/mat/impls/sbaij/seq/sbaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: MatGetRow is not supported for SBAIJ matrix format. Getting > the upper trian\ > gular part of row, run with -mat_getrow_uppertriangular, call > MatSetOption(mat,MAT_GETROW_U\ > PPERTRIANGULAR) or MatGetRowUpperTriangular()! > [0]PETSC ERROR: MatGetRow_MPISBAIJ() line 1056 in > src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: MatGetRow() line 168 in src/mat/interface/matrix.c > [0]PETSC ERROR: MatCholeskyCheckShift_inline() line 45 in > src/mat/impls/hypre/mhyp.c > [0]PETSC ERROR: PCSetUp_HYPRE() line 95 in src/ksp/pc/impls/hypre/hyppilut.c > [0]PETSC ERROR: PCSetUp() line 798 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 334 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: User provided function() line 683 in petscSolver.c > [2509 MPI_COMM_WORLD 0] Process exited without calling MPI_Finalize. > Fatal error, aborting. > > With best regards, Shaman Mahmoudi > From shma7099 at student.uu.se Tue May 16 03:21:18 2006 From: shma7099 at student.uu.se (Sh.M) Date: Tue, 16 May 2006 10:21:18 +0200 Subject: PetscMemoryGetMaximumUsage References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <001601c672ce$5e5fbd80$2516e055@bredbandsbolaget.se> Message-ID: <000c01c678c1$b5673920$2516e055@bredbandsbolaget.se> Hi again, I am using the function and when solving a system with 32 CPUs I get that CPU 0 - max memory usage 10GB(this is mostly due to it reading the matrix from file(5GB file), constructing it and then assembling/distributing it) CPU 1-31 = max memory usage 675MB per CPU. The maximum amount of memory used per CPU is during solve. Total memory usage: 31GB The OS does not release the 10GB memory CPU 0 is occupying wich is a bit annoying and I can attest to that as when I do operations with CPU 0 as in reading another matrix, the memory usage that the OS is reporting is not increasing, it stays at 10GB even though I just read another matrix into the memory. 
So while the memory is definitely "freed", it is not released by CPU 0 and This seems to be an OS configuration thing I guess. So my problem is that I have no idea how much memory CPU 0 is actually using during the solve. Can you give a hint or advice on how I should calculate CPU 0 memory usage during solve? Would it be roughly 675MB as the other CPUs aswell? Would the total memory usage during solve actually be closer to 22GB(32*675MB) rather than the 31GB(20GB+31*675MB) that I naively calculated? With best regards, Shaman Mahmoudi ----- Original Message ----- From: "Barry Smith" To: Sent: Monday, May 08, 2006 9:31 PM Subject: Re: PetscMemoryGetMaximumUsage > > Unfortunately that routine is not currently "wired"; getting actual > memory usage is not portable and is a pain. > > You can use PetscMallocGetMaximumUsage() to see the maximum amount > of memory PETSc has allocated at any one time (in all the PETSc objects). > > Barry > > > On Mon, 8 May 2006, Sh.M wrote: > > > Hi, > > > > If I want to check the maximum amount of memory PETSc has used during a > > program run, is PetscMemoryGetMaximumUsage the function to use? I take it a > > call to this function will print out how much Process ID X has used, is this > > correct? So If I want to see the total maximum amount of memory used during > > a program run, each process should call this function and then I add them to > > get the total amount, correct? > > > > Thanks in advance. > > > > With best regards, Shaman Mahmoudi > > > > > From bsmith at mcs.anl.gov Tue May 16 07:25:18 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 May 2006 07:25:18 -0500 (CDT) Subject: PetscMemoryGetMaximumUsage In-Reply-To: <000c01c678c1$b5673920$2516e055@bredbandsbolaget.se> References: <200605051128.k45BSgo27156@mcs.anl.gov> <1147083007.445f18ffd5f69@nobel.upc.es> <001601c672ce$5e5fbd80$2516e055@bredbandsbolaget.se> <000c01c678c1$b5673920$2516e055@bredbandsbolaget.se> Message-ID: Shaman, OS's generally never "release" memory that has been allocated to a process. But if that large chunk of memory is no longer used by your process eventually its pages will get swapped out and other processes will be able to use the physical memory. The memory usage for the solve on the first process should be roughly the same amount used as on the other processes so yes 675 MB. Barry On Tue, 16 May 2006, Sh.M wrote: > Hi again, > > I am using the function and when solving a system with 32 CPUs I get that > > CPU 0 - max memory usage 10GB(this is mostly due to it reading the matrix > from file(5GB file), constructing it and then assembling/distributing it) > > CPU 1-31 = max memory usage 675MB per CPU. The maximum amount of memory used > per CPU is during solve. > > Total memory usage: 31GB > > The OS does not release the 10GB memory CPU 0 is occupying wich is a bit > annoying and I can attest to that as when I do operations with CPU 0 as in > reading another matrix, the memory usage that the OS is reporting is not > increasing, it stays at 10GB even though I just read another matrix into the > memory. So while the memory is definitely "freed", it is not released by CPU > 0 and This seems to be an OS configuration thing I guess. > > So my problem is that I have no idea how much memory CPU 0 is actually using > during the solve. > > Can you give a hint or advice on how I should calculate CPU 0 memory usage > during solve? Would it be roughly 675MB as the other CPUs aswell? 
> > Would the total memory usage during solve actually be closer to > 22GB(32*675MB) rather than the 31GB(20GB+31*675MB) that I naively > calculated? > > With best regards, Shaman Mahmoudi > > ----- Original Message ----- > From: "Barry Smith" > To: > Sent: Monday, May 08, 2006 9:31 PM > Subject: Re: PetscMemoryGetMaximumUsage > > >> >> Unfortunately that routine is not currently "wired"; getting actual >> memory usage is not portable and is a pain. >> >> You can use PetscMallocGetMaximumUsage() to see the maximum amount >> of memory PETSc has allocated at any one time (in all the PETSc objects). >> >> Barry >> >> >> On Mon, 8 May 2006, Sh.M wrote: >> >>> Hi, >>> >>> If I want to check the maximum amount of memory PETSc has used during a >>> program run, is PetscMemoryGetMaximumUsage the function to use? I take > it a >>> call to this function will print out how much Process ID X has used, is > this >>> correct? So If I want to see the total maximum amount of memory used > during >>> a program run, each process should call this function and then I add > them to >>> get the total amount, correct? >>> >>> Thanks in advance. >>> >>> With best regards, Shaman Mahmoudi >>> >>> >> > > From bsmith at mcs.anl.gov Tue May 16 07:30:03 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 May 2006 07:30:03 -0500 (CDT) Subject: SBAIJ + hypre preconditioners does not work? In-Reply-To: <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> Message-ID: I have not thought about hypre when only the upper-half of the matrix is provided? Please tell me which of the 4 hypre preconditioners support symmetric storage of the matrix and how I indicate that to hypre that I am only providing half the matrix and I will add support for you. Barry On Tue, 16 May 2006, Sh.M wrote: > Hi, > > So far I have been using AIJ to construct my matrices and this has been due > to a legacy more than anything else. > However now I do need the extra memory as my problems are getting bigger and > bigger. So I have tried to use the > SBAIJ matrix format with block size = 1. My matrices are read from a file > and are symmetric but also "full" due to a legacy. > However as I understand I can treat the full matrix as if it was just the > upper triangle of it if I use command line > -mat_ignore_lower_triangular. > > It has worked just fine when I for example use solver = CG and > preconditioner = Jacobi. > However I am getting some errors when using hypre preconditioners, for > exampel BoomerAMG. > I do know or atleast think by my own past experience with Hypre that > symmetric sparse matrix support has not been trivial in Hypre eventhough the > support was there. Can there be a case of PETSc + Hypre in combinaton simple > not supporting the SBAIJ matrix format? > > I am getting this error message and I have tried with different matrices > ranging from 54x54 to 5Mx5M and 1-32 CPUs. > > It crashes when it calls the solve function: > > [0]PETSC ERROR: MatGetRow_SeqSBAIJ() line 207 in > src/mat/impls/sbaij/seq/sbaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: MatGetRow is not supported for SBAIJ matrix format. Getting > the upper trian\ > gular part of row, run with -mat_getrow_uppertriangular, call > MatSetOption(mat,MAT_GETROW_U\ > PPERTRIANGULAR) or MatGetRowUpperTriangular()! 
> [0]PETSC ERROR: MatGetRow_MPISBAIJ() line 1056 in > src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: MatGetRow() line 168 in src/mat/interface/matrix.c > [0]PETSC ERROR: MatCholeskyCheckShift_inline() line 45 in > src/mat/impls/hypre/mhyp.c > [0]PETSC ERROR: PCSetUp_HYPRE() line 95 in src/ksp/pc/impls/hypre/hyppilut.c > [0]PETSC ERROR: PCSetUp() line 798 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 334 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: User provided function() line 683 in petscSolver.c > [2509 MPI_COMM_WORLD 0] Process exited without calling MPI_Finalize. > Fatal error, aborting. > > With best regards, Shaman Mahmoudi > > From F.Boulahya at brgm.fr Tue May 16 08:02:25 2006 From: F.Boulahya at brgm.fr (=?iso-8859-1?Q?Boulahya_Fa=EFza?=) Date: Tue, 16 May 2006 15:02:25 +0200 Subject: Petsc + BlockSolve95 Message-ID: Hi, Is there something new about it? Regards, Fa?za _____ De : Vaz, Guilherme [mailto:G.Vaz at marin.nl] Envoy? : mercredi 10 mai 2006 11:37 ? : petsc-users at mcs.anl.gov Objet : RE: Petsc + BlockSolve95 Hello people, I have exactly the same problem as Faiza... In sequential it runs ok but in parallel not. Greetings. Guilherme -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Boulahya Fa?za Sent: 10 May 2006 10:39 To: 'petsc-users at mcs.anl.gov' Subject: RE: Petsc + BlockSolve95 Thanks. I tried something else : - creation of the matrix with MatCreateMPIAIJ - initialization - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL When solving in sequential CG + ICC, everything is ok. When I tried in parallel the same code the options lead to the same error : [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: unknown option! [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c Can MatSetOption be only used in sequential? Fa?za _____ De : Matthew Knepley [mailto:knepley at gmail.com] Envoy? : mardi 9 mai 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + BlockSolve95 I believe there is a problem with the option that you specified. All these are integers, and it is complaining that the integer does not match MAT_SYMMETRIC. I will fix the error message to print the offending option, but please check the code. Thanks, Matt On 5/9/06, Boulahya Fa?za > wrote: Hi All, Has anyone used Conjugate Gradient Solver + Icomplete Cholesky Preconditionner in parallel case? I tried as said in the manual : I use MATMPIROWBS for the storage of the matrice. However I get this message : PETSC ERROR: To use incomplete Cholesky preconditioning with a MATMPIROWBS matrix you must declare it to be symmetric using the option MatSetOption(A,MAT_SYMMETRIC)! So I tried adding this option (even if in the namual it is written that it is not required). Then I obtained this message PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c PETSC ERROR: No support for this operation for this object type! PETSC ERROR: unknown option! PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c In advance thanks, Fa?za Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). 
En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue May 16 08:39:12 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 16 May 2006 08:39:12 -0500 (CDT) Subject: Petsc + BlockSolve95 In-Reply-To: References: Message-ID: > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: No support for this operation for this object type! AIJ matrix does not support parallell ICC. BlockSolve provides parallel ICC - and to use it one must use MatType MATMPIROWBS Satish On Tue, 16 May 2006, Boulahya Fa?za wrote: > Is there something new about it? > De : Vaz, Guilherme [mailto:G.Vaz at marin.nl] > I have exactly the same problem as Faiza... > In sequential it runs ok but in parallel not. > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] > On Behalf Of Boulahya Fa?za > I tried something else : > > - creation of the matrix with MatCreateMPIAIJ > > - initialization > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > When solving in sequential CG + ICC, everything is ok. When I tried in > parallel the same code the options lead to the same error : > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? > > > > > > Fa?za > > > > > > _____ > > De : Matthew Knepley [mailto:knepley at gmail.com] > Envoy? : mardi 9 mai 2006 18:19 > ? : petsc-users at mcs.anl.gov > Objet : Re: Petsc + BlockSolve95 > > I believe there is a problem with the option that you specified. All these > are integers, and it is complaining that the integer does not match > MAT_SYMMETRIC. I will fix the error message to print the offending > option, but please check the code. 
> > Thanks, > > Matt > > On 5/9/06, Boulahya Fa?za > > wrote: > > Hi All, > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I use > MATMPIROWBS for the storage of the matrice. However I get this message : > > > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you must > declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > > > So I tried adding this option (even if in the namual it is written that it > is not required). Then I obtained this message > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > In advance thanks, > > > > > > Fa?za > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage > exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le > contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are > intended for > the named recipient(s) only. If you have received this email in error please > notify the > system manager or the sender immediately and do not disclose the contents > to > anyone or make copies. eSafe scanned this email for viruses, vandals and > malicious > content. > *** > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** From hzhang at mcs.anl.gov Tue May 16 09:34:04 2006 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 16 May 2006 09:34:04 -0500 (CDT) Subject: SBAIJ + hypre preconditioners does not work? In-Reply-To: <004d01c678ac$ee708f60$2516e055@bredbandsbolaget.se> References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> <004d01c678ac$ee708f60$2516e055@bredbandsbolaget.se> Message-ID: On Tue, 16 May 2006, Sh.M wrote: > Ohh I forgot to use the flag -mat_getrow_uppertriangular? The flg '-mat_getrow_uppertriangular' or MatSetOption(mat,MAT_GETROW_UPPERTRIANGULAR) are used by Petsc function MatGetRow(). > It seems to be the case. And the Hypre preconditioner will understand that I > am only providing the upper triangular part of the matrix? '-mat_getrow_uppertriangular' is not written for Hypre interface. 
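To make that point concrete, here is a small sketch of the one place those flags do matter: calling MatGetRow() on an assembled SBAIJ matrix from user code. The helper name PrintUpperRows is made up for illustration, the calling sequences again follow the 2.3-era API (one-argument PetscSynchronizedFlush()), and the exact const-ness of the MatGetRow() output arguments differs between releases; the MatSetOption() call is the one named in the error message quoted above.

#include "petscmat.h"

/* Print how many entries MatGetRow() returns for each locally owned row of an
   assembled SBAIJ matrix; only the stored (upper-triangular) part comes back.
   Without the MatSetOption() call, MatGetRow() on SBAIJ fails with the
   "No support for this operation" error seen in this thread. */
PetscErrorCode PrintUpperRows(Mat A)
{
  PetscInt          i,ncols,Istart,Iend;
  const PetscInt    *cols;
  const PetscScalar *vals;
  PetscErrorCode    ierr;

  PetscFunctionBegin;
  ierr = MatSetOption(A,MAT_GETROW_UPPERTRIANGULAR);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    ierr = MatGetRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"row %d: %d stored entries\n",(int)i,(int)ncols);CHKERRQ(ierr);
    ierr = MatRestoreRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);
  }
  ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Setting the option affects only MatGetRow() behaviour; as noted above, it does not make SBAIJ a supported input for the hypre preconditioners.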
Hong > With best regards, Shaman Mahmoudi > > ----- Original Message ----- > From: "Sh.M" > To: > Sent: Tuesday, May 16, 2006 7:23 AM > Subject: SBAIJ + hypre preconditioners does not work? > > > > Hi, > > > > So far I have been using AIJ to construct my matrices and this has been > due > > to a legacy more than anything else. > > However now I do need the extra memory as my problems are getting bigger > and > > bigger. So I have tried to use the > > SBAIJ matrix format with block size = 1. My matrices are read from a file > > and are symmetric but also "full" due to a legacy. > > However as I understand I can treat the full matrix as if it was just the > > upper triangle of it if I use command line > > -mat_ignore_lower_triangular. > > > > It has worked just fine when I for example use solver = CG and > > preconditioner = Jacobi. > > However I am getting some errors when using hypre preconditioners, for > > exampel BoomerAMG. > > I do know or atleast think by my own past experience with Hypre that > > symmetric sparse matrix support has not been trivial in Hypre eventhough > the > > support was there. Can there be a case of PETSc + Hypre in combinaton > simple > > not supporting the SBAIJ matrix format? > > > > I am getting this error message and I have tried with different matrices > > ranging from 54x54 to 5Mx5M and 1-32 CPUs. > > > > It crashes when it calls the solve function: > > > > [0]PETSC ERROR: MatGetRow_SeqSBAIJ() line 207 in > > src/mat/impls/sbaij/seq/sbaij.c > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: MatGetRow is not supported for SBAIJ matrix format. > Getting > > the upper trian\ > > gular part of row, run with -mat_getrow_uppertriangular, call > > MatSetOption(mat,MAT_GETROW_U\ > > PPERTRIANGULAR) or MatGetRowUpperTriangular()! > > [0]PETSC ERROR: MatGetRow_MPISBAIJ() line 1056 in > > src/mat/impls/sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: MatGetRow() line 168 in src/mat/interface/matrix.c > > [0]PETSC ERROR: MatCholeskyCheckShift_inline() line 45 in > > src/mat/impls/hypre/mhyp.c > > [0]PETSC ERROR: PCSetUp_HYPRE() line 95 in > src/ksp/pc/impls/hypre/hyppilut.c > > [0]PETSC ERROR: PCSetUp() line 798 in src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: KSPSolve() line 334 in src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: User provided function() line 683 in petscSolver.c > > [2509 MPI_COMM_WORLD 0] Process exited without calling MPI_Finalize. > > Fatal error, aborting. > > > > With best regards, Shaman Mahmoudi > > > > From shma7099 at student.uu.se Tue May 16 12:21:53 2006 From: shma7099 at student.uu.se (Sh.M) Date: Tue, 16 May 2006 19:21:53 +0200 Subject: SBAIJ + hypre preconditioners does not work? References: <5055E535-5B7A-47DD-92FD-379B307802E8@ocf.berkeley.edu> <004101c678a8$d1f87e00$2516e055@bredbandsbolaget.se> Message-ID: <005701c6790d$39ee6ba0$2516e055@bredbandsbolaget.se> Hi, I asked Rob F. from the Hypre team and IJ does not support symmetric storage and will probably not in the near future. And he explained to me that the memory saving when using AMG with a symmetric storage is not much, so no big loss for me I guess. Thanks for all the help Barry and Hong! With best regards, Shaman Mahmoudi ----- Original Message ----- From: "Barry Smith" To: Sent: Tuesday, May 16, 2006 2:30 PM Subject: Re: SBAIJ + hypre preconditioners does not work? > > I have not thought about hypre when only the upper-half of the > matrix is provided? 
Please tell me which of the 4 hypre preconditioners > support symmetric storage of the matrix and how I indicate that to > hypre that I am only providing half the matrix and I will add > support for you. > > Barry > > > On Tue, 16 May 2006, Sh.M wrote: > > > Hi, > > > > So far I have been using AIJ to construct my matrices and this has been due > > to a legacy more than anything else. > > However now I do need the extra memory as my problems are getting bigger and > > bigger. So I have tried to use the > > SBAIJ matrix format with block size = 1. My matrices are read from a file > > and are symmetric but also "full" due to a legacy. > > However as I understand I can treat the full matrix as if it was just the > > upper triangle of it if I use command line > > -mat_ignore_lower_triangular. > > > > It has worked just fine when I for example use solver = CG and > > preconditioner = Jacobi. > > However I am getting some errors when using hypre preconditioners, for > > exampel BoomerAMG. > > I do know or atleast think by my own past experience with Hypre that > > symmetric sparse matrix support has not been trivial in Hypre eventhough the > > support was there. Can there be a case of PETSc + Hypre in combinaton simple > > not supporting the SBAIJ matrix format? > > > > I am getting this error message and I have tried with different matrices > > ranging from 54x54 to 5Mx5M and 1-32 CPUs. > > > > It crashes when it calls the solve function: > > > > [0]PETSC ERROR: MatGetRow_SeqSBAIJ() line 207 in > > src/mat/impls/sbaij/seq/sbaij.c > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: MatGetRow is not supported for SBAIJ matrix format. Getting > > the upper trian\ > > gular part of row, run with -mat_getrow_uppertriangular, call > > MatSetOption(mat,MAT_GETROW_U\ > > PPERTRIANGULAR) or MatGetRowUpperTriangular()! > > [0]PETSC ERROR: MatGetRow_MPISBAIJ() line 1056 in > > src/mat/impls/sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: MatGetRow() line 168 in src/mat/interface/matrix.c > > [0]PETSC ERROR: MatCholeskyCheckShift_inline() line 45 in > > src/mat/impls/hypre/mhyp.c > > [0]PETSC ERROR: PCSetUp_HYPRE() line 95 in src/ksp/pc/impls/hypre/hyppilut.c > > [0]PETSC ERROR: PCSetUp() line 798 in src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: KSPSolve() line 334 in src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: User provided function() line 683 in petscSolver.c > > [2509 MPI_COMM_WORLD 0] Process exited without calling MPI_Finalize. > > Fatal error, aborting. > > > > With best regards, Shaman Mahmoudi > > > > > From F.Boulahya at brgm.fr Wed May 17 05:58:28 2006 From: F.Boulahya at brgm.fr (=?iso-8859-1?Q?Boulahya_Fa=EFza?=) Date: Wed, 17 May 2006 12:58:28 +0200 Subject: Petsc + BlockSolve95 Message-ID: Ok but the error does not come from parallell ICC. It's from the MatSetOption(). Moreover, when I tried with MATMPIROWBS , it's the same problem : [0]PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: unknown option! [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c Fa?za -----Message d'origine----- De : Satish Balay [mailto:balay at mcs.anl.gov] Envoy? : mardi 16 mai 2006 15:39 ? 
: 'petsc-users at mcs.anl.gov' Objet : RE: Petsc + BlockSolve95 > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this operation for this object type! AIJ matrix does not support parallell ICC. BlockSolve provides parallel ICC - and to use it one must use MatType MATMPIROWBS Satish On Tue, 16 May 2006, Boulahya Fa?za wrote: > Is there something new about it? > De : Vaz, Guilherme [mailto:G.Vaz at marin.nl] > I have exactly the same problem as Faiza... > In sequential it runs ok but in parallel not. > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] > On Behalf Of Boulahya Fa?za > I tried something else : > > - creation of the matrix with MatCreateMPIAIJ > > - initialization > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > When solving in sequential CG + ICC, everything is ok. When I tried in > parallel the same code the options lead to the same error : > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this > operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? > > > > > > Fa?za > > > > > > _____ > > De : Matthew Knepley [mailto:knepley at gmail.com] Envoy? : mardi 9 mai > 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + > BlockSolve95 > > I believe there is a problem with the option that you specified. All > these are integers, and it is complaining that the integer does not > match MAT_SYMMETRIC. I will fix the error message to print the > offending option, but please check the code. > > Thanks, > > Matt > > On 5/9/06, Boulahya Fa?za > > wrote: > > Hi All, > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I > use MATMPIROWBS for the storage of the matrice. However I get this message : > > > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you > must declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > > > So I tried adding this option (even if in the namual it is written > that it is not required). Then I obtained this message > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > In advance thanks, > > > > > > Fa?za > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? > l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et > ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They > are intended for the named recipient(s) only. If you have received > this email in error please notify the system manager or the sender > immediately and do not disclose the contents to anyone or make copies. > eSafe scanned this email for viruses, vandals and malicious content. > *** > Pensez ? 
visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? > l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They > are intended for the named recipient(s) only. If you have received > this email in error please notify the system manager or the sender > immediately and do not disclose the contents to anyone or make copies. > eSafe scanned this email for viruses, vandals and malicious content. > *** Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From geenen at gmail.com Wed May 17 05:08:08 2006 From: geenen at gmail.com (Thomas Geenen) Date: Wed, 17 May 2006 12:08:08 +0200 Subject: common nodla points on neighbouring subdomains Message-ID: <200605171208.08589.geenen@gmail.com> Dear Petsc users, I am solving a system of equations Ax=b resulting from a Finite element package. The parallelization strategy for this program is to minimize communication. therefore each subdomain has its own copy of common nodal points between subdomains (local numbering). This strategy is already implemented and I don't want to change that I fill the matrix and right hand side with a call to MatSetValuesLocal and VecSetValuesLocal. Some entries are filled on different subdomains (of course with the same value) The solution for the first subdomain is correct. however the solution vector on the second subdomain does not contain the solution on the common nodal points it shares with subdomain 1. This is probably a feature but it is not consistent with my program setup. The question is: how can i tell petsc to return a solution for all the positions i filled with VecSetValuesLocal? Thanks in advance Thomas Geenen ps I got the impression that the assemble routines also exchange common nodal point info. It would be nice if this could be skipped as well. Of course some extra communication could solve my problem but I would prefer a more elegant solution if one exists From knepley at gmail.com Wed May 17 06:13:37 2006 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2006 06:13:37 -0500 Subject: common nodla points on neighbouring subdomains In-Reply-To: <200605171208.08589.geenen@gmail.com> References: <200605171208.08589.geenen@gmail.com> Message-ID: It is unclear how you have actually mapped unknowns into your linear system. 
However, a global PETSc Mat would naively have only one copy of a given unknown, and thus the duplicate unknowns would indeed be out of synch. You have to do extra work to communicate the values after the solve. However, all this is handled for you if you use the DA object. It only handles logically Cartesian grids, but does all the ghosting. We will soon have an unstructured counterpart. Matt On 5/17/06, Thomas Geenen wrote: > > Dear Petsc users, > > I am solving a system of equations Ax=b resulting from a Finite element > package. The parallelization strategy for this program is to minimize > communication. therefore each subdomain has its own copy of common nodal > points between subdomains (local numbering). This strategy is already > implemented and I don't want to change that > > I fill the matrix and right hand side with a call to MatSetValuesLocal and > VecSetValuesLocal. Some entries are filled on different subdomains (of > course > with the same value) > > The solution for the first subdomain is correct. however the solution > vector > on the second subdomain does not contain the solution on the common nodal > points it shares with subdomain 1. > This is probably a feature but it is not consistent with my program setup. > The question is: > how can i tell petsc to return a solution for all the positions i filled > with > VecSetValuesLocal? > > Thanks in advance > Thomas Geenen > ps I got the impression that the assemble routines also exchange common > nodal > point info. It would be nice if this could be skipped as well. > Of course some extra communication could solve my problem but I would > prefer a > more elegant solution if one exists > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 17 06:18:11 2006 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 May 2006 06:18:11 -0500 Subject: Petsc + BlockSolve95 In-Reply-To: References: Message-ID: My guess is somehow you are passing the wrong integer for the option because I am looking at the code and it is fine. However, it is easy to clear up. Go to $PETSC_DIR/src/mat/impls/rowbs/mpi/mpirowbs.c:1411 and replace it with SETERRQ1(PETSC_ERR_SUP,"unknown option %d",op); and then in that directory make make shared This will tell us what integer was passed in for the option. I have already made this fix in the dev version. Matt On 5/17/06, Boulahya Fa?za wrote: > > Ok but the error does not come from parallell ICC. It's from the > MatSetOption(). > > Moreover, when I tried with MATMPIROWBS , it's the same problem : > [0]PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > Fa?za > > -----Message d'origine----- > De : Satish Balay [mailto:balay at mcs.anl.gov ] > Envoy? : mardi 16 mai 2006 15:39 > > ? : 'petsc-users at mcs.anl.gov' > Objet : RE: Petsc + BlockSolve95 > > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this > operation for this object type! > > AIJ matrix does not support parallell ICC. BlockSolve provides parallel > ICC - and to use it one must use MatType MATMPIROWBS > > Satish > > On Tue, 16 May 2006, Boulahya Fa?za wrote: > > > Is there something new about it? 
> > > De : Vaz, Guilherme [mailto:G.Vaz at marin.nl ] > > > I have exactly the same problem as Faiza... > > In sequential it runs ok but in parallel not. > > > -----Original Message----- > > From: owner-petsc-users at mcs.anl.gov > > [mailto:owner-petsc-users at mcs.anl.gov ] > > On Behalf Of Boulahya Fa?za > > > I tried something else : > > > > - creation of the matrix with MatCreateMPIAIJ > > > > - initialization > > > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > > > > > When solving in sequential CG + ICC, everything is ok. When I tried in > > parallel the same code the options lead to the same error : > > > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this > > operation for this object type! > > [0]PETSC ERROR: unknown option! > > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > Can MatSetOption be only used in sequential? > > > > > > > > > > > > Fa?za > > > > > > > > > > > > _____ > > > > De : Matthew Knepley [mailto:knepley at gmail.com ] > Envoy? : mardi 9 mai > > 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + > > BlockSolve95 > > > > I believe there is a problem with the option that you specified. All > > these are integers, and it is complaining that the integer does not > > match MAT_SYMMETRIC. I will fix the error message to print the > > offending option, but please check the code. > > > > Thanks, > > > > Matt > > > > On 5/9/06, Boulahya Fa?za > > > > > wrote: > > > > Hi All, > > > > > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > > Preconditionner in parallel case? I tried as said in the manual : I > > use MATMPIROWBS for the storage of the matrice. However I get this > message : > > > > > > > > PETSC ERROR: To use incomplete Cholesky > > preconditioning with a MATMPIROWBS matrix you > > must declare it to be > > symmetric using the option > > MatSetOption(A,MAT_SYMMETRIC)! > > > > > > > > So I tried adding this option (even if in the namual it is written > > that it is not required). Then I obtained this message > > > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > > src/mat/impls/rowbs/mpi/mpirowbs.c > > PETSC ERROR: No support for this operation for this object type! > > PETSC ERROR: unknown option! > > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > > > > > In advance thanks, > > > > > > > > > > > > Fa?za > > > > > > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > > > *** > > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? > > l'usage exclusif du > > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et > > ne pas en divulguer le contenu. > > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > > > The contents of this email and any attachments are confidential. They > > are intended for the named recipient(s) only. If you have received > > this email in error please notify the system manager or the sender > > immediately and do not disclose the contents to anyone or make copies. > > eSafe scanned this email for viruses, vandals and malicious content. > > *** > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > > > *** > > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? 
> > l'usage exclusif du > > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne > pas en divulguer le contenu. > > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > > > The contents of this email and any attachments are confidential. They > > are intended for the named recipient(s) only. If you have received > > this email in error please notify the system manager or the sender > > immediately and do not disclose the contents to anyone or make copies. > > eSafe scanned this email for viruses, vandals and malicious content. > > *** > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet > e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de > v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They are intended for > the named recipient(s) only. If you have received this email in error please notify the > system manager or the sender immediately and do not disclose the contents to > anyone or make copies. eSafe scanned this email for viruses, vandals and malicious > content. > *** > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 17 07:53:00 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 May 2006 07:53:00 -0500 (CDT) Subject: common nodla points on neighbouring subdomains In-Reply-To: <200605171208.08589.geenen@gmail.com> References: <200605171208.08589.geenen@gmail.com> Message-ID: Thomas, I think you might misunderstand the function of XXXSetValuesLocal(). What happens with them is the index is mapped to the global index and the value is then put into the "correct" global location, so some values will be shipped off to the "owning" process. 1) If two processes that "share" a vertex both generate the entire stiffness for that vertex (that is, each process loops over all the elements containing the vertex summing the stiffness for that vertex) then the result will be exactly twice the expected value (since one process will send the result to the "owning" process). I THINK this is what you are doing. 2) if, on the other hand, each process generates only the part of the stiffness for the vertex from elements that it "owns" then the global matrix will be correct, BUT the communication of matrix elements to the "owning" process is done. If you DO NOT want the communication to the owning process but want to keep part of the stiffness on each process (for a shared vertex) I would call this the unassembled form. This is done by some people but is not really supported by PETSc. I think you may want to consider NOT having "shared" vertices (that is either 1 if you are using that or 2). This may not require much change to what you do not want to change but will allow you to easily use all of the PETSc solvers. 
The problem with "unassembled" approaches is that they save a bit of communication in generating the stiffness but then (since the matrix is never assembled) cannot form much of anything in terms of preconditioners, thus you do lots of linear solver iterations with communication. Communicating the stiffness initially (if you have a good partitioning of elements) is really not much communication and IMHO is the wrong place to optimize. I hope my response is not too confusing, I may have misunderstood parts of your question. Barry On Wed, 17 May 2006, Thomas Geenen wrote: > Dear Petsc users, > > I am solving a system of equations Ax=b resulting from a Finite element > package. The parallelization strategy for this program is to minimize > communication. therefore each subdomain has its own copy of common nodal > points between subdomains (local numbering). This strategy is already > implemented and I don't want to change that > > I fill the matrix and right hand side with a call to MatSetValuesLocal and > VecSetValuesLocal. Some entries are filled on different subdomains (of course > with the same value) > > The solution for the first subdomain is correct. however the solution vector > on the second subdomain does not contain the solution on the common nodal > points it shares with subdomain 1. > This is probably a feature but it is not consistent with my program setup. > The question is: > how can i tell petsc to return a solution for all the positions i filled with > VecSetValuesLocal? > > Thanks in advance > Thomas Geenen > ps I got the impression that the assemble routines also exchange common nodal > point info. It would be nice if this could be skipped as well. > Of course some extra communication could solve my problem but I would prefer a > more elegant solution if one exists > > From geenen at gmail.com Wed May 17 08:41:07 2006 From: geenen at gmail.com (Thomas Geenen) Date: Wed, 17 May 2006 15:41:07 +0200 Subject: common nodla points on neighbouring subdomains In-Reply-To: References: <200605171208.08589.geenen@gmail.com> Message-ID: <200605171541.07788.geenen@gmail.com> On Wednesday 17 May 2006 14:53, Barry Smith wrote: > Thomas, > > I think you might misunderstand the function of XXXSetValuesLocal(). > What happens with them is the index is mapped to the global index and > the value is then put into the "correct" global location, so some values > will be shipped off to the "owning" process. this happens in the assemble phase right? > > 1) If two processes that > "share" a vertex both generate the entire stiffness for that vertex > (that is, each process loops over all the elements containing the > vertex summing the stiffness for that vertex) then the result will > be exactly twice the expected value (since one process will send the > result to the "owning" process). I THINK this is what you are doing. I intercept the matrix and rhs just before the "old" call to the solver routine just before the call the value for the rhs of common nodalpoints are calculated globally and broadcasted to the owning processes. In this setup more than one process can own a nodal point. 
I construct a local2global mapping and apply it to the matrix and vector ISLocalToGlobalMapping ltog; ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, nc, local2global, <og); MatSetLocalToGlobalMapping(A, ltog); > > 2) if, on the other hand, each process generates only the part of the > stiffness for the vertex from elements that it "owns" then the global > matrix will be correct, BUT the communication of matrix elements to the > "owning" process is done. If you DO NOT want the communication to the > owning process but want to keep part of the stiffness on each process (for > a shared vertex) I would call this the unassembled form. This is done by > some people but is not really supported by PETSc. > > I think you may want to consider NOT having "shared" vertices (that is > either 1 if you are using that or 2). This may not require much change to > what you do not want to change but will allow you to easily use all of the > PETSc solvers. The problem with "unassembled" approaches is that they save > a bit of communication in generating the stiffness but then (since the > matrix is never assembled) cannot form much of anything in terms of > preconditioners, thus you do lots of linear solver iterations with > communication. Communicating the stiffness initially (if you have a good > partitioning of elements) is really not much communication and IMHO is the > wrong place to optimize. basically what happens in my code is that each process owns all nodalpoints of the element it owns. I tell petsc the global numbering of the matrix and vector entries so I can use all the preconditioners and solvers etc. However in the original setup the program expects on return to have the solution vector for all nodes of the elements it owns . Petsc gives the solution to the "owner process" of the nodal point. To correct this I would have to send parts of the solution to processes that don't "own" that part of the solution but do expect it to be there. I thought It would be elegant if Petsc could return a solution vector that corresponds to the local filling of the vector not to the "owners" of the nodal point. thanks for you fast response Thomas Geenen > > I hope my response is not too confusing, I may have misunderstood > parts of your question. > > Barry > > On Wed, 17 May 2006, Thomas Geenen wrote: > > Dear Petsc users, > > > > I am solving a system of equations Ax=b resulting from a Finite element > > package. The parallelization strategy for this program is to minimize > > communication. therefore each subdomain has its own copy of common nodal > > points between subdomains (local numbering). This strategy is already > > implemented and I don't want to change that > > > > I fill the matrix and right hand side with a call to MatSetValuesLocal > > and VecSetValuesLocal. Some entries are filled on different subdomains > > (of course with the same value) > > > > The solution for the first subdomain is correct. however the solution > > vector on the second subdomain does not contain the solution on the > > common nodal points it shares with subdomain 1. > > This is probably a feature but it is not consistent with my program > > setup. The question is: > > how can i tell petsc to return a solution for all the positions i filled > > with VecSetValuesLocal? > > > > Thanks in advance > > Thomas Geenen > > ps I got the impression that the assemble routines also exchange common > > nodal point info. It would be nice if this could be skipped as well. 
> > Of course some extra communication could solve my problem but I would > > prefer a more elegant solution if one exists From F.Boulahya at brgm.fr Wed May 17 08:43:08 2006 From: F.Boulahya at brgm.fr (=?iso-8859-1?Q?Boulahya_Fa=EFza?=) Date: Wed, 17 May 2006 15:43:08 +0200 Subject: Petsc + BlockSolve95 Message-ID: Doing this, I realize that I wrote MAT SYMMETRIC instead of MAT_SYMMETRIC. Everything is ok. Sorry for all those mails... Fa?za _____ De : Matthew Knepley [mailto:knepley at gmail.com] Envoy? : mercredi 17 mai 2006 13:18 ? : petsc-users at mcs.anl.gov Cc : Boulahya Faiza Objet : Re: Petsc + BlockSolve95 My guess is somehow you are passing the wrong integer for the option because I am looking at the code and it is fine. However, it is easy to clear up. Go to $PETSC_DIR/src/mat/impls/rowbs/mpi/mpirowbs.c:1411 and replace it with SETERRQ1(PETSC_ERR_SUP,"unknown option %d",op); and then in that directory make make shared This will tell us what integer was passed in for the option. I have already made this fix in the dev version. Matt On 5/17/06, Boulahya Fa?za > wrote: Ok but the error does not come from parallell ICC. It's from the MatSetOption(). Moreover, when I tried with MATMPIROWBS , it's the same problem : [0]PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in src/mat/impls/rowbs/mpi/mpirowbs.c [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: unknown option! [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c Fa?za -----Message d'origine----- De : Satish Balay [mailto:balay at mcs.anl.gov ] Envoy? : mardi 16 mai 2006 15:39 ? : 'petsc-users at mcs.anl.gov ' Objet : RE: Petsc + BlockSolve95 > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this operation for this object type! AIJ matrix does not support parallell ICC. BlockSolve provides parallel ICC - and to use it one must use MatType MATMPIROWBS Satish On Tue, 16 May 2006, Boulahya Fa?za wrote: > Is there something new about it? > De : Vaz, Guilherme [mailto:G.Vaz at marin.nl ] > I have exactly the same problem as Faiza... > In sequential it runs ok but in parallel not. > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov ] > On Behalf Of Boulahya Fa?za > I tried something else : > > - creation of the matrix with MatCreateMPIAIJ > > - initialization > > - adding of the options MAT_SYMMETRIC and MAT_SYMMETRY_ETERNAL > > > > When solving in sequential CG + ICC, everything is ok. When I tried in > parallel the same code the options lead to the same error : > > [0]PETSC ERROR: MatSetOption_MPIAIJ() line 1251 in > src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: No support for this > operation for this object type! > [0]PETSC ERROR: unknown option! > [0]PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > Can MatSetOption be only used in sequential? > > > > > > Fa?za > > > > > > _____ > > De : Matthew Knepley [mailto:knepley at gmail.com ] Envoy? : mardi 9 mai > 2006 18:19 ? : petsc-users at mcs.anl.gov Objet : Re: Petsc + > BlockSolve95 > > I believe there is a problem with the option that you specified. All > these are integers, and it is complaining that the integer does not > match MAT_SYMMETRIC. I will fix the error message to print the > offending option, but please check the code. 
> > Thanks, > > Matt > > On 5/9/06, Boulahya Fa?za > > > > wrote: > > Hi All, > > > > Has anyone used Conjugate Gradient Solver + Icomplete Cholesky > Preconditionner in parallel case? I tried as said in the manual : I > use MATMPIROWBS for the storage of the matrice. However I get this message : > > > > PETSC ERROR: To use incomplete Cholesky > preconditioning with a MATMPIROWBS matrix you > must declare it to be > symmetric using the option > MatSetOption(A,MAT_SYMMETRIC)! > > > > So I tried adding this option (even if in the namual it is written > that it is not required). Then I obtained this message > > PETSC ERROR: MatSetOption_MPIRowbs() line 1411 in > src/mat/impls/rowbs/mpi/mpirowbs.c > PETSC ERROR: No support for this operation for this object type! > PETSC ERROR: unknown option! > PETSC ERROR: MatSetOption() line 4137 in src/mat/interface/matrix.c > > > > In advance thanks, > > > > > > Fa?za > > > > > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? > l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et > ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They > are intended for the named recipient(s) only. If you have received > this email in error please notify the system manager or the sender > immediately and do not disclose the contents to anyone or make copies. > eSafe scanned this email for viruses, vandals and malicious content. > *** > Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ > > *** > Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? > l'usage exclusif du > (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de > r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. > L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient > n?anmoins de v?rifier l'absence de corruption ? sa r?ception. > > The contents of this email and any attachments are confidential. They > are intended for the named recipient(s) only. If you have received > this email in error please notify the system manager or the sender > immediately and do not disclose the contents to anyone or make copies. > eSafe scanned this email for viruses, vandals and malicious content. > *** Pensez ? visiter le site BRGM sur.... http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness Pensez ? visiter le site BRGM sur.... 
http://www.brgm.fr/ *** Le contenu de cet e-mail et de ses pi?ces jointes est destin? ? l'usage exclusif du (des) destinataire(s) express?ment d?sign?(s) comme tel(s). En cas de r?ception de cet e-mail par erreur, le signaler ? son exp?diteur et ne pas en divulguer le contenu. L'absence de virus a ?t? v?rifi? ? l'?mission du message. Il convient n?anmoins de v?rifier l'absence de corruption ? sa r?ception. The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. eSafe scanned this email for viruses, vandals and malicious content. *** -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 17 13:43:23 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 May 2006 13:43:23 -0500 (CDT) Subject: common nodla points on neighbouring subdomains In-Reply-To: <200605171541.07788.geenen@gmail.com> References: <200605171208.08589.geenen@gmail.com> <200605171541.07788.geenen@gmail.com> Message-ID: Thomas, Ok, things are clearer. Are you using the VecCreateGhost() routines? If you are not using them you should use them (see manual page http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Vec/VecCreateGhost.html) Once the solver is done you will call VecGhostUpdateBegin(v,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateEnd(v,INSERT_VALUES,SCATTER_FORWARD); and use VecGhostGetLocalRepresentation() to access the vector plus its ghost points. (note that the "local" and "global" representations share the same data in the same memory, the difference is that the "local" is a VecSeq (with ghost points at the end) and the "global" is a VecMPI that is the one actually passed to the solver and used with VecSetValues(). Good luck, Barry On Wed, 17 May 2006, Thomas Geenen wrote: > On Wednesday 17 May 2006 14:53, Barry Smith wrote: >> Thomas, >> >> I think you might misunderstand the function of XXXSetValuesLocal(). >> What happens with them is the index is mapped to the global index and >> the value is then put into the "correct" global location, so some values >> will be shipped off to the "owning" process. > this happens in the assemble phase right? >> >> 1) If two processes that >> "share" a vertex both generate the entire stiffness for that vertex >> (that is, each process loops over all the elements containing the >> vertex summing the stiffness for that vertex) then the result will >> be exactly twice the expected value (since one process will send the >> result to the "owning" process). I THINK this is what you are doing. > I intercept the matrix and rhs just before the "old" call to the solver > routine just before the call the value for the rhs of common nodalpoints are > calculated globally and broadcasted to the owning processes. In this setup > more than one process can own a nodal point. I construct a local2global > mapping and apply it to the matrix and vector > > ISLocalToGlobalMapping ltog; > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, nc, local2global, <og); > MatSetLocalToGlobalMapping(A, ltog); > >> >> 2) if, on the other hand, each process generates only the part of the >> stiffness for the vertex from elements that it "owns" then the global >> matrix will be correct, BUT the communication of matrix elements to the >> "owning" process is done. 
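In C, a compact sketch of the ghosted-vector workflow Barry outlines above (nlocal, nghost and ghost_global[] are placeholders for the number of owned points, the number of shared points, and their global indices; the local representation is accessed here through VecGhostGetLocalForm()/VecGhostRestoreLocalForm(), the names used on the VecCreateGhost manual page):

    #include "petscksp.h"

    PetscErrorCode SolveWithGhosts(KSP ksp, Vec b, PetscInt nlocal, PetscInt nghost,
                                   PetscInt ghost_global[])
    {
      PetscErrorCode ierr;
      Vec            x, xlocal;
      PetscScalar    *vals;
      PetscInt       n;

      /* the solution vector itself is created as a ghosted vector */
      ierr = VecCreateGhost(PETSC_COMM_WORLD, nlocal, PETSC_DECIDE, nghost, ghost_global, &x);CHKERRQ(ierr);

      /* to the solver this is an ordinary parallel vector */
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

      /* scatter the owned values into the ghost slots on the neighbours */
      ierr = VecGhostUpdateBegin(x, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
      ierr = VecGhostUpdateEnd(x, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

      /* local form = the owned entries followed by the ghost entries */
      ierr = VecGhostGetLocalForm(x, &xlocal);CHKERRQ(ierr);
      ierr = VecGetLocalSize(xlocal, &n);CHKERRQ(ierr);
      ierr = VecGetArray(xlocal, &vals);CHKERRQ(ierr);
      /* ... copy vals[0..n-1] back into the application's element-wise storage ... */
      ierr = VecRestoreArray(xlocal, &vals);CHKERRQ(ierr);
      ierr = VecGhostRestoreLocalForm(x, &xlocal);CHKERRQ(ierr);
      ierr = VecDestroy(x);CHKERRQ(ierr);
      return 0;
    }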
If you DO NOT want the communication to the >> owning process but want to keep part of the stiffness on each process (for >> a shared vertex) I would call this the unassembled form. This is done by >> some people but is not really supported by PETSc. >> >> I think you may want to consider NOT having "shared" vertices (that is >> either 1 if you are using that or 2). This may not require much change to >> what you do not want to change but will allow you to easily use all of the >> PETSc solvers. The problem with "unassembled" approaches is that they save >> a bit of communication in generating the stiffness but then (since the >> matrix is never assembled) cannot form much of anything in terms of >> preconditioners, thus you do lots of linear solver iterations with >> communication. Communicating the stiffness initially (if you have a good >> partitioning of elements) is really not much communication and IMHO is the >> wrong place to optimize. > basically what happens in my code is that each process owns all nodalpoints of > the element it owns. I tell petsc the global numbering of the matrix and > vector entries so I can use all the preconditioners and solvers etc. However > in the original setup the program expects on return to have the solution > vector for all nodes of the elements it owns . Petsc gives the solution to > the "owner process" of the nodal point. To correct this I would have to send > parts of the solution to processes that don't "own" that part of the solution > but do expect it to be there. I thought It would be elegant if Petsc could > return a solution vector that corresponds to the local filling of the vector > not to the "owners" of the nodal point. > > thanks for you fast response > > Thomas Geenen > >> >> I hope my response is not too confusing, I may have misunderstood >> parts of your question. >> >> Barry >> >> On Wed, 17 May 2006, Thomas Geenen wrote: >>> Dear Petsc users, >>> >>> I am solving a system of equations Ax=b resulting from a Finite element >>> package. The parallelization strategy for this program is to minimize >>> communication. therefore each subdomain has its own copy of common nodal >>> points between subdomains (local numbering). This strategy is already >>> implemented and I don't want to change that >>> >>> I fill the matrix and right hand side with a call to MatSetValuesLocal >>> and VecSetValuesLocal. Some entries are filled on different subdomains >>> (of course with the same value) >>> >>> The solution for the first subdomain is correct. however the solution >>> vector on the second subdomain does not contain the solution on the >>> common nodal points it shares with subdomain 1. >>> This is probably a feature but it is not consistent with my program >>> setup. The question is: >>> how can i tell petsc to return a solution for all the positions i filled >>> with VecSetValuesLocal? >>> >>> Thanks in advance >>> Thomas Geenen >>> ps I got the impression that the assemble routines also exchange common >>> nodal point info. It would be nice if this could be skipped as well. 
>>> Of course some extra communication could solve my problem but I would >>> prefer a more elegant solution if one exists > > From geenen at gmail.com Wed May 17 14:46:25 2006 From: geenen at gmail.com (Thomas Geenen) Date: Wed, 17 May 2006 21:46:25 +0200 Subject: common nodla points on neighbouring subdomains In-Reply-To: References: <200605171208.08589.geenen@gmail.com> <200605171541.07788.geenen@gmail.com> Message-ID: <200605172146.25370.geenen@gmail.com> On Wednesday 17 May 2006 20:43, Barry Smith wrote: > Thomas, > > Ok, things are clearer. Are you using the VecCreateGhost() routines? > If you are not using them you should use them (see manual page > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/man >ualpages/Vec/VecCreateGhost.html) Once the solver is done you will call > VecGhostUpdateBegin(v,INSERT_VALUES,SCATTER_FORWARD); > VecGhostUpdateEnd(v,INSERT_VALUES,SCATTER_FORWARD); > and use VecGhostGetLocalRepresentation() to access the vector plus its > ghost points. (note that the "local" and "global" representations > share the same data in the same memory, the difference is that the "local" > is a VecSeq (with ghost points at the end) and the "global" is a VecMPI > that is the one actually passed to the solver and used with VecSetValues(). > that sounds exactly as what I need!! I am starting to like petsc more and more :) thanks for the help Thomas > Good luck, > > Barry > > On Wed, 17 May 2006, Thomas Geenen wrote: > > On Wednesday 17 May 2006 14:53, Barry Smith wrote: > >> Thomas, > >> > >> I think you might misunderstand the function of > >> XXXSetValuesLocal(). What happens with them is the index is mapped to > >> the global index and the value is then put into the "correct" global > >> location, so some values will be shipped off to the "owning" process. > > > > this happens in the assemble phase right? > > > >> 1) If two processes that > >> "share" a vertex both generate the entire stiffness for that vertex > >> (that is, each process loops over all the elements containing the > >> vertex summing the stiffness for that vertex) then the result will > >> be exactly twice the expected value (since one process will send the > >> result to the "owning" process). I THINK this is what you are doing. > > > > I intercept the matrix and rhs just before the "old" call to the solver > > routine just before the call the value for the rhs of common nodalpoints > > are calculated globally and broadcasted to the owning processes. In this > > setup more than one process can own a nodal point. I construct a > > local2global mapping and apply it to the matrix and vector > > > > ISLocalToGlobalMapping ltog; > > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, nc, local2global, <og); > > MatSetLocalToGlobalMapping(A, ltog); > > > >> 2) if, on the other hand, each process generates only the part of the > >> stiffness for the vertex from elements that it "owns" then the global > >> matrix will be correct, BUT the communication of matrix elements to the > >> "owning" process is done. If you DO NOT want the communication to the > >> owning process but want to keep part of the stiffness on each process > >> (for a shared vertex) I would call this the unassembled form. This is > >> done by some people but is not really supported by PETSc. > >> > >> I think you may want to consider NOT having "shared" vertices (that is > >> either 1 if you are using that or 2). 
This may not require much change > >> to what you do not want to change but will allow you to easily use all > >> of the PETSc solvers. The problem with "unassembled" approaches is that > >> they save a bit of communication in generating the stiffness but then > >> (since the matrix is never assembled) cannot form much of anything in > >> terms of preconditioners, thus you do lots of linear solver iterations > >> with communication. Communicating the stiffness initially (if you have a > >> good partitioning of elements) is really not much communication and IMHO > >> is the wrong place to optimize. > > > > basically what happens in my code is that each process owns all > > nodalpoints of the element it owns. I tell petsc the global numbering of > > the matrix and vector entries so I can use all the preconditioners and > > solvers etc. However in the original setup the program expects on return > > to have the solution vector for all nodes of the elements it owns . Petsc > > gives the solution to the "owner process" of the nodal point. To correct > > this I would have to send parts of the solution to processes that don't > > "own" that part of the solution but do expect it to be there. I thought > > It would be elegant if Petsc could return a solution vector that > > corresponds to the local filling of the vector not to the "owners" of the > > nodal point. > > > > thanks for you fast response > > > > Thomas Geenen > > > >> I hope my response is not too confusing, I may have misunderstood > >> parts of your question. > >> > >> Barry > >> > >> On Wed, 17 May 2006, Thomas Geenen wrote: > >>> Dear Petsc users, > >>> > >>> I am solving a system of equations Ax=b resulting from a Finite element > >>> package. The parallelization strategy for this program is to minimize > >>> communication. therefore each subdomain has its own copy of common > >>> nodal points between subdomains (local numbering). This strategy is > >>> already implemented and I don't want to change that > >>> > >>> I fill the matrix and right hand side with a call to MatSetValuesLocal > >>> and VecSetValuesLocal. Some entries are filled on different subdomains > >>> (of course with the same value) > >>> > >>> The solution for the first subdomain is correct. however the solution > >>> vector on the second subdomain does not contain the solution on the > >>> common nodal points it shares with subdomain 1. > >>> This is probably a feature but it is not consistent with my program > >>> setup. The question is: > >>> how can i tell petsc to return a solution for all the positions i > >>> filled with VecSetValuesLocal? > >>> > >>> Thanks in advance > >>> Thomas Geenen > >>> ps I got the impression that the assemble routines also exchange common > >>> nodal point info. It would be nice if this could be skipped as well. > >>> Of course some extra communication could solve my problem but I would > >>> prefer a more elegant solution if one exists From jadelman at OCF.Berkeley.EDU Thu May 18 10:31:00 2006 From: jadelman at OCF.Berkeley.EDU (Joshua L. Adelman) Date: Thu, 18 May 2006 08:31:00 -0700 Subject: PETSc vs Matlab Performance Message-ID: <2EDE0DDF-F783-41C0-88F6-EE34F10562F2@ocf.berkeley.edu> Having just finished writing my first modest PETSc-based application, I was amazed how much faster the code was than my Matlab prototype, even running in sequentially on a single processor. The code basically involves propagating a time series using Backward Euler, where the rate matrix K, is static. 
Granted I might be doing something horribly inefficient in my Matlab code, since I had code the Backward Euler, but given that the bulk of the time is spent in the matlab solver (either qmr or gmres), it seems that seems unlikely. Running with a single processor, the actual solver part of my PETSc code runs 15-25x faster than the equivalent part in my matlab code, and this is without using any optimization flags on the compiler and having compiled PETSc with debugger options on (the - log_summary output says that that would slow things down by a factor of 2-3x). I had always been told that for doing any sort of Linear Algebra sort of stuff, like inverting a matrix, Matlab's algorithms were as fast as anything you could hand code in C or Fortran (although when they say 'you', they might mean 'me' specifically and not the people who write PETSc). Have there been any official benchmarks, pitting PETSc again Matlab? And for the CS novice, what is the underlying reason for the difference? Thanks, Josh ------------------------------------------------------------------------ ------------------------------ Joshua L. Adelman Biophysics Graduate Group Lab: 510.643.2159 218 Wellman Hall Fax: 510.642.7428 University of California, Berkeley http://www.ocf.berkeley.edu/ ~jadelman Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu ------------------------------------------------------------------------ ------------------------------ From shma7099 at student.uu.se Thu May 18 11:09:12 2006 From: shma7099 at student.uu.se (Sh.M) Date: Thu, 18 May 2006 18:09:12 +0200 Subject: PETSc vs Matlab Performance References: <2EDE0DDF-F783-41C0-88F6-EE34F10562F2@ocf.berkeley.edu> Message-ID: <000801c67a95$677b7de0$2516e055@bredbandsbolaget.se> Hi, Matlab interpretates your code when it runs it and does not compile it to machine code. So there is another program in the background that parses thru your code and translates to quadruples(some form of assembly code that is not CPU specifik), then it runs that interpreted code in a high leven language written interpretator. In C however the quadruples are converted to assembly code specifik to the target platform and then converted to machine code by an assembler. Are you using loops(for; while;) or do you so to speak "vectorize" your data? Looping is quite slow in matlab. I naively used a for loop to extract the first block of a triblock diagonal matrix and it took several minutes until it finished, and I just wanted to extract 4000x4000 elements from it. In C it took less than a second to do it, a lot thanks to fast memory addressing. Other than that I believe that you are using indirect addressing, something as far as I know is hard to optimize? Thus tight loops are hard to optimize and you are left with a good chunk of the code being unoptimized by matlab. I am sure some other peoples can extend what I have said and not said. With best regards, Shaman Mahmoudi ----- Original Message ----- From: "Joshua L. Adelman" To: Sent: Thursday, May 18, 2006 5:31 PM Subject: PETSc vs Matlab Performance > Having just finished writing my first modest PETSc-based application, > I was amazed how much faster the code was than my Matlab prototype, > even running in sequentially on a single processor. The code > basically involves propagating a time series using Backward Euler, > where the rate matrix K, is static. 
Granted I might be doing > something horribly inefficient in my Matlab code, since I had code > the Backward Euler, but given that the bulk of the time is spent in > the matlab solver (either qmr or gmres), it seems that seems > unlikely. Running with a single processor, the actual solver part of > my PETSc code runs 15-25x faster than the equivalent part in my > matlab code, and this is without using any optimization flags on the > compiler and having compiled PETSc with debugger options on (the - > log_summary output says that that would slow things down by a factor > of 2-3x). I had always been told that for doing any sort of Linear > Algebra sort of stuff, like inverting a matrix, Matlab's algorithms > were as fast as anything you could hand code in C or Fortran > (although when they say 'you', they might mean 'me' specifically and > not the people who write PETSc). > > Have there been any official benchmarks, pitting PETSc again Matlab? > And for the CS novice, what is the underlying reason for the difference? > > Thanks, > Josh > > > ------------------------------------------------------------------------ > ------------------------------ > Joshua L. Adelman > Biophysics Graduate Group Lab: 510.643.2159 > 218 Wellman Hall Fax: 510.642.7428 > University of California, Berkeley http://www.ocf.berkeley.edu/ > ~jadelman > Berkeley, CA 94720 USA jadelman at ocf.berkeley.edu > ------------------------------------------------------------------------ > ------------------------------ > > > From bsmith at mcs.anl.gov Thu May 18 15:36:23 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 May 2006 15:36:23 -0500 (CDT) Subject: DAcreate2d process layout order In-Reply-To: <44693587.6010907@trialphaenergy.com> References: <44693587.6010907@trialphaenergy.com> Message-ID: Sean, I don't completely understand what goes wrong. Is it because YOUR application orders the processors related to geometry in the following way? ^ y direction | 2 5 8 1 4 7 0 3 6 -> x direction Or is this something inherent in MPI_Cart_create? PETSc does it so ^ y direction | 6 7 8 3 4 5 0 1 2 -> x direction If you want to "force" the PETSc DA to match the first case you could just "lie" to PETSc and treat the x direction as the y direction and the x as the y. Thus you would flip the i and j indices. You could do the same trick in 3d. There is no easy way to change the DA to do the ordering as the first case above or support both approaches (possible? yes, but ugly duplicate code to handle the two cases). I've cc:ed to Bill since he would know more about the details of MPI_cart_create(). Barry On Mon, 15 May 2006, Sean Dettrick wrote: > Hi, > > I'm trying to use DACreate2d and KSP in my existing MPI application. I > already have a Cartesian communicator established, and I set PETSC_COMM_WORLD > equal to it and then call PetscInitialize. > > This works fine on a prime number of CPUs, because there is only one possible > ordered MPI layout in one dimension. But with a non-prime number there are > two possible ordered layouts and it just happens that my 2D CPU layout > (determined by MPI_Cart_create) is the transpose of the PETSc 2D CPU layout. > Is there a way to organize the DA layout more explicitly than with > DACreate2d? Or to tell PETSc to transpose its CPU order? I also wonder > about the 3D case. 
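A sketch of the "flip i and j" work-around suggested above, in C with the DACreate2d argument order of the PETSc 2.3 manual page (Mx, My are the application's grid sizes and px, py its process counts per direction, all placeholders):

    #include "petscda.h"

    PetscErrorCode CreateTransposedDA(PetscInt Mx, PetscInt My, PetscInt px, PetscInt py, DA *da)
    {
      PetscErrorCode ierr;

      /* pass the y quantities where PETSc expects x, and vice versa, so the
         DA's process layout lines up with the MPI_Cart_create ordering */
      ierr = DACreate2d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_STAR,
                        My, Mx,            /* global sizes, transposed            */
                        py, px,            /* processes per direction, transposed */
                        1, 1, PETSC_NULL, PETSC_NULL, da);CHKERRQ(ierr);
      /* everything indexed through this DA must then also swap indices:
         the application's (i,j) becomes PETSc's (j,i), and likewise in 3d */
      return 0;
    }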
> > thanks > Sean > > From sean at trialphaenergy.com Thu May 18 18:28:11 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Thu, 18 May 2006 19:28:11 -0400 Subject: DAcreate2d process layout order In-Reply-To: References: <44693587.6010907@trialphaenergy.com> Message-ID: <446D030B.90906@trialphaenergy.com> Hi Barry, the order is determined by MPI_Cart_create. I've tried to swap x and y in the DA to make the problem go away but haven't succeeded yet. Not enough brain cells I suspect ... still thinking about it. Thanks for the response, Sean Barry Smith wrote: > > > Sean, > > I don't completely understand what goes wrong. Is it because YOUR > application orders the processors related to geometry in the following > way? > > ^ y direction > | > 2 5 8 > 1 4 7 > 0 3 6 > > -> x direction > > Or is this something inherent in MPI_Cart_create? > > PETSc does it so > > ^ y direction > | > 6 7 8 > 3 4 5 > 0 1 2 > > -> x direction > > If you want to "force" the PETSc DA to match the first case you > could just > "lie" to PETSc and treat the x direction as the y direction and the x > as the y. > Thus you would flip the i and j indices. You could do the same trick in > 3d. > > There is no easy way to change the DA to do the ordering as the first > case above or support both approaches (possible? yes, but ugly > duplicate code to handle the two cases). > > I've cc:ed to Bill since he would know more about the details of > MPI_cart_create(). > > Barry > > On Mon, 15 May 2006, Sean Dettrick wrote: > >> Hi, >> >> I'm trying to use DACreate2d and KSP in my existing MPI application. >> I already have a Cartesian communicator established, and I set >> PETSC_COMM_WORLD equal to it and then call PetscInitialize. >> >> This works fine on a prime number of CPUs, because there is only one >> possible ordered MPI layout in one dimension. But with a non-prime >> number there are two possible ordered layouts and it just happens >> that my 2D CPU layout (determined by MPI_Cart_create) is the >> transpose of the PETSc 2D CPU layout. Is there a way to organize the >> DA layout more explicitly than with DACreate2d? Or to tell PETSc to >> transpose its CPU order? I also wonder about the 3D case. >> >> thanks >> Sean >> >> > > From bsmith at mcs.anl.gov Thu May 18 19:12:34 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 May 2006 19:12:34 -0500 (CDT) Subject: DAcreate2d process layout order In-Reply-To: <446D030B.90906@trialphaenergy.com> References: <44693587.6010907@trialphaenergy.com> <446D030B.90906@trialphaenergy.com> Message-ID: On Thu, 18 May 2006, Sean Dettrick wrote: > Hi Barry, > the order is determined by MPI_Cart_create. Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) fastest and then the first (x-axis)? Hmmm, maybe we should change the DA? Changing it once and for all (not supporting both) is probably not a big deal and shouldn't break much (I hope). Bill, what do you think? Barry > I've tried to swap x and y in > the DA to make the problem go away but haven't succeeded yet. Not enough > brain cells I suspect ... still thinking about it. > Thanks for the response, > Sean > > Barry Smith wrote: > >> >> >> Sean, >> >> I don't completely understand what goes wrong. Is it because YOUR >> application orders the processors related to geometry in the following way? >> >> ^ y direction >> | >> 2 5 8 >> 1 4 7 >> 0 3 6 >> >> -> x direction >> >> Or is this something inherent in MPI_Cart_create? 
>> >> PETSc does it so >> >> ^ y direction >> | >> 6 7 8 >> 3 4 5 >> 0 1 2 >> >> -> x direction >> >> If you want to "force" the PETSc DA to match the first case you could >> just >> "lie" to PETSc and treat the x direction as the y direction and the x as >> the y. >> Thus you would flip the i and j indices. You could do the same trick in >> 3d. >> >> There is no easy way to change the DA to do the ordering as the first >> case above or support both approaches (possible? yes, but ugly >> duplicate code to handle the two cases). >> >> I've cc:ed to Bill since he would know more about the details of >> MPI_cart_create(). >> >> Barry >> >> On Mon, 15 May 2006, Sean Dettrick wrote: >> >>> Hi, >>> >>> I'm trying to use DACreate2d and KSP in my existing MPI application. I >>> already have a Cartesian communicator established, and I set >>> PETSC_COMM_WORLD equal to it and then call PetscInitialize. >>> >>> This works fine on a prime number of CPUs, because there is only one >>> possible ordered MPI layout in one dimension. But with a non-prime number >>> there are two possible ordered layouts and it just happens that my 2D CPU >>> layout (determined by MPI_Cart_create) is the transpose of the PETSc 2D >>> CPU layout. Is there a way to organize the DA layout more explicitly than >>> with DACreate2d? Or to tell PETSc to transpose its CPU order? I also >>> wonder about the 3D case. >>> >>> thanks >>> Sean >>> >>> >> >> > > From sean at trialphaenergy.com Thu May 18 19:23:30 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Thu, 18 May 2006 20:23:30 -0400 Subject: DAcreate2d process layout order In-Reply-To: References: <44693587.6010907@trialphaenergy.com> <446D030B.90906@trialphaenergy.com> Message-ID: <446D1002.3020407@trialphaenergy.com> Barry Smith wrote: > > > > > On Thu, 18 May 2006, Sean Dettrick wrote: > >> Hi Barry, >> the order is determined by MPI_Cart_create. > > > Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) > fastest and then the first (x-axis)? I am not sure that my interpretation is correct. I'll send you a small demo, so you can see for yourself if you agree with it. Sean > Hmmm, maybe we should change the > DA? Changing it once and for all (not supporting both) is probably > not a big deal and shouldn't break much (I hope). > Bill, what do you think? > > Barry > >> I've tried to swap x and y in the DA to make the problem go away but >> haven't succeeded yet. Not enough brain cells I suspect ... still >> thinking about it. >> Thanks for the response, >> Sean >> >> Barry Smith wrote: >> >>> >>> >>> Sean, >>> >>> I don't completely understand what goes wrong. Is it because YOUR >>> application orders the processors related to geometry in the >>> following way? >>> >>> ^ y direction >>> | >>> 2 5 8 >>> 1 4 7 >>> 0 3 6 >>> >>> -> x direction >>> >>> Or is this something inherent in MPI_Cart_create? >>> >>> PETSc does it so >>> >>> ^ y direction >>> | >>> 6 7 8 >>> 3 4 5 >>> 0 1 2 >>> >>> -> x direction >>> >>> If you want to "force" the PETSc DA to match the first case you >>> could just >>> "lie" to PETSc and treat the x direction as the y direction and the >>> x as the y. >>> Thus you would flip the i and j indices. You could do the same trick in >>> 3d. >>> >>> There is no easy way to change the DA to do the ordering as the first >>> case above or support both approaches (possible? yes, but ugly >>> duplicate code to handle the two cases). >>> >>> I've cc:ed to Bill since he would know more about the details of >>> MPI_cart_create(). 
>>> >>> Barry >>> >>> On Mon, 15 May 2006, Sean Dettrick wrote: >>> >>>> Hi, >>>> >>>> I'm trying to use DACreate2d and KSP in my existing MPI >>>> application. I already have a Cartesian communicator established, >>>> and I set PETSC_COMM_WORLD equal to it and then call PetscInitialize. >>>> >>>> This works fine on a prime number of CPUs, because there is only >>>> one possible ordered MPI layout in one dimension. But with a >>>> non-prime number there are two possible ordered layouts and it just >>>> happens that my 2D CPU layout (determined by MPI_Cart_create) is >>>> the transpose of the PETSc 2D CPU layout. Is there a way to >>>> organize the DA layout more explicitly than with DACreate2d? Or >>>> to tell PETSc to transpose its CPU order? I also wonder about the >>>> 3D case. >>>> >>>> thanks >>>> Sean >>>> >>>> >>> >>> >> >> > > From sean at trialphaenergy.com Thu May 18 21:36:13 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Thu, 18 May 2006 22:36:13 -0400 Subject: DAcreate2d process layout order Message-ID: <446D2F1D.9040201@trialphaenergy.com> Barry Smith wrote: > On Thu, 18 May 2006, Sean Dettrick wrote: > >> Hi Barry, >> the order is determined by MPI_Cart_create. > > > Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) > fastest and then the first (x-axis)? Hmmm, maybe we should change the > DA? Changing it once and for all (not supporting both) is probably > not a big deal and shouldn't break much (I hope). Hi Barry, it depends, what do you call x and what do you call y? MPI_Cart_coords returns a vector, coords - I tend to say x is coords[0], y is coords[1] and z is coords[2]. For what it's worth, there's a short code appended to this email, which produces: rank = 0 has Cartesian coords = { 0, 0 } rank = 1 has Cartesian coords = { 0, 1 } rank = 2 has Cartesian coords = { 1, 0 } rank = 3 has Cartesian coords = { 1, 1 } rank = 0 has DA range x=[0,50) and y=[0,50) rank = 1 has DA range x=[50,100) and y=[0,50) rank = 2 has DA range x=[0,50) and y=[50,100) rank = 3 has DA range x=[50,100) and y=[50,100) >>> I don't completely understand what goes wrong. Is it because YOUR >>> application orders the processors related to geometry in the >>> following way? >>> >>> ^ y direction >>> | >>> 2 5 8 >>> 1 4 7 >>> 0 3 6 >>> >>> -> x direction >>> >>> Or is this something inherent in MPI_Cart_create? >> For my interpretation of x and y, MPI_Cart_create produces the above layout. But if I said x=coords[1] and y=coords[0], then it would match the one below. >>> >>> PETSc does it so >>> >>> ^ y direction >>> | >>> 6 7 8 >>> 3 4 5 >>> 0 1 2 >>> >>> -> x direction >>> >>> >> Code and makefile attached ... hopefully within the message size limit. Just make cartcommtest. Sean -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: makefile URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cartcommtest.c Type: text/x-csrc Size: 2946 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu May 18 21:45:17 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 May 2006 21:45:17 -0500 (CDT) Subject: DAcreate2d process layout order In-Reply-To: <446D2F1D.9040201@trialphaenergy.com> References: <446D2F1D.9040201@trialphaenergy.com> Message-ID: I would say your original interpretation is correct. The DA distribution style is not compatible with the way MPI_cart_create() works. 
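For reference, a small standalone C program (not the scrubbed cartcommtest.c, which the archive did not preserve) showing the row-major ordering MPI_Cart_create produces:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int      rank, size, coords[2];
      int      dims[2] = {0, 0}, periods[2] = {0, 0};
      MPI_Comm cart;

      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      MPI_Dims_create(size, 2, dims);                 /* e.g. 2 x 2 on 4 processes */
      MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);
      MPI_Comm_rank(cart, &rank);
      MPI_Cart_coords(cart, rank, 2, coords);
      /* the rank varies fastest along the LAST dimension: on 4 processes,
         rank 1 is (0,1) and rank 2 is (1,0), as in Sean's output below */
      printf("rank = %d has Cartesian coords = { %d, %d }\n", rank, coords[0], coords[1]);
      MPI_Comm_free(&cart);
      MPI_Finalize();
      return 0;
    }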
Once I get Bill's confirmation I'll change the DA; for now I think my suggestion of just swapping the meaning of x and y (i and j) when dealing with DAs will have the same effect. Barry Bill, does the MPI standard dictate this decomposition or could different implementations do it the opposite way? Then we'd have to make the DA logic a bit more complicated. On Thu, 18 May 2006, Sean Dettrick wrote: > Barry Smith wrote: > >> On Thu, 18 May 2006, Sean Dettrick wrote: >> >>> Hi Barry, >>> the order is determined by MPI_Cart_create. >> >> >> Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) >> fastest and then the first (x-axis)? Hmmm, maybe we should change the >> DA? Changing it once and for all (not supporting both) is probably >> not a big deal and shouldn't break much (I hope). > > Hi Barry, > > it depends, what do you call x and what do you call y? > MPI_Cart_coords returns a vector, coords - I tend to say x is coords[0], y is > coords[1] and z is coords[2]. > For what it's worth, there's a short code appended to this email, which > produces: > > rank = 0 has Cartesian coords = { 0, 0 } > rank = 1 has Cartesian coords = { 0, 1 } > rank = 2 has Cartesian coords = { 1, 0 } > rank = 3 has Cartesian coords = { 1, 1 } > rank = 0 has DA range x=[0,50) and y=[0,50) > rank = 1 has DA range x=[50,100) and y=[0,50) > rank = 2 has DA range x=[0,50) and y=[50,100) > rank = 3 has DA range x=[50,100) and y=[50,100) > >>>> I don't completely understand what goes wrong. Is it because YOUR >>>> application orders the processors related to geometry in the following >>>> way? >>>> >>>> ^ y direction >>>> | >>>> 2 5 8 >>>> 1 4 7 >>>> 0 3 6 >>>> >>>> -> x direction >>>> >>>> Or is this something inherent in MPI_Cart_create? >>> > > For my interpretation of x and y, MPI_Cart_create produces the above layout. > But if I said x=coords[1] and y=coords[0], then it would match the one below. > >>>> >>>> PETSc does it so >>>> >>>> ^ y direction >>>> | >>>> 6 7 8 >>>> 3 4 5 >>>> 0 1 2 >>>> >>>> -> x direction >>>> >>>> >>> > > Code and makefile attached ... hopefully within the message size limit. > Just make cartcommtest. > > Sean > > > > From sean at trialphaenergy.com Thu May 18 22:06:33 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Thu, 18 May 2006 23:06:33 -0400 Subject: DAcreate2d process layout order In-Reply-To: References: <446D2F1D.9040201@trialphaenergy.com> Message-ID: <446D3639.9070306@trialphaenergy.com> Barry Smith wrote: > > Bill, does the MPI standard dictate this decomposition or > could different implementations do it the opposite way? > Then we'd have to make the DA logic a bit more complicated. I don't have a copy of the standard, but to quote page 255 of "MPI, the complete reference" by Snir et al: "Row-major numbering is always used for the processes in a Cartesian structure". Their diagram in figure 6.1 matches my code output for coords couplets (i,j): 0 1 2 3 (0,0) (0,1) (0,2) (0,3) 4 5 6 7 (1,0) (1,1) (1,2) (1,3) 8 9 10 11 (2,0) (2,1) (2,2) (2,3) By the way I agree with you, I *should* be able to swap the x and y myself. Just haven't had much luck yet in that regard. Sean > > > On Thu, 18 May 2006, Sean Dettrick wrote: > >> Barry Smith wrote: >> >>> On Thu, 18 May 2006, Sean Dettrick wrote: >>> >>>> Hi Barry, >>>> the order is determined by MPI_Cart_create. >>> >>> >>> >>> Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) >>> fastest and then the first (x-axis)? Hmmm, maybe we should change the >>> DA? 
Changing it once and for all (not supporting both) is probably >>> not a big deal and shouldn't break much (I hope). >> >> >> Hi Barry, >> >> it depends, what do you call x and what do you call y? >> MPI_Cart_coords returns a vector, coords - I tend to say x is >> coords[0], y is coords[1] and z is coords[2]. For what it's worth, >> there's a short code appended to this email, which produces: >> >> rank = 0 has Cartesian coords = { 0, 0 } >> rank = 1 has Cartesian coords = { 0, 1 } >> rank = 2 has Cartesian coords = { 1, 0 } >> rank = 3 has Cartesian coords = { 1, 1 } >> rank = 0 has DA range x=[0,50) and y=[0,50) >> rank = 1 has DA range x=[50,100) and y=[0,50) >> rank = 2 has DA range x=[0,50) and y=[50,100) >> rank = 3 has DA range x=[50,100) and y=[50,100) >> >>>>> I don't completely understand what goes wrong. Is it because YOUR >>>>> application orders the processors related to geometry in the >>>>> following way? >>>>> >>>>> ^ y direction >>>>> | >>>>> 2 5 8 >>>>> 1 4 7 >>>>> 0 3 6 >>>>> >>>>> -> x direction >>>>> >>>>> Or is this something inherent in MPI_Cart_create? >>>> >>>> >> >> For my interpretation of x and y, MPI_Cart_create produces the above >> layout. But if I said x=coords[1] and y=coords[0], then it would >> match the one below. >> >>>>> >>>>> PETSc does it so >>>>> >>>>> ^ y direction >>>>> | >>>>> 6 7 8 >>>>> 3 4 5 >>>>> 0 1 2 >>>>> >>>>> -> x direction >>>>> >>>>> >>>> >> >> Code and makefile attached ... hopefully within the message size limit. >> Just make cartcommtest. >> >> Sean >> >> >> >> > > From bsmith at mcs.anl.gov Fri May 19 07:53:16 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 19 May 2006 07:53:16 -0500 (CDT) Subject: DAcreate2d process layout order In-Reply-To: <446D3639.9070306@trialphaenergy.com> References: <446D2F1D.9040201@trialphaenergy.com> <446D3639.9070306@trialphaenergy.com> Message-ID: Ok, so it is laid-out like a matrix, not like x, y coordinates. Will put the change to DA on the list of things to do. Barry On Thu, 18 May 2006, Sean Dettrick wrote: > Barry Smith wrote: > >> >> Bill, does the MPI standard dictate this decomposition or >> could different implementations do it the opposite way? >> Then we'd have to make the DA logic a bit more complicated. > > I don't have a copy of the standard, but to quote page 255 of "MPI, the > complete reference" by Snir et al: > "Row-major numbering is always used for the processes in a Cartesian > structure". > Their diagram in figure 6.1 matches my code output for coords couplets (i,j): > > 0 1 2 3 > (0,0) (0,1) (0,2) (0,3) > > 4 5 6 7 > (1,0) (1,1) (1,2) (1,3) > > 8 9 10 11 > (2,0) (2,1) (2,2) (2,3) > > > By the way I agree with you, I *should* be able to swap the x and y myself. > Just haven't had much luck yet in that regard. > > Sean > >> >> >> On Thu, 18 May 2006, Sean Dettrick wrote: >> >>> Barry Smith wrote: >>> >>>> On Thu, 18 May 2006, Sean Dettrick wrote: >>>> >>>>> Hi Barry, >>>>> the order is determined by MPI_Cart_create. >>>> >>>> >>>> >>>> Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) >>>> fastest and then the first (x-axis)? Hmmm, maybe we should change the >>>> DA? Changing it once and for all (not supporting both) is probably >>>> not a big deal and shouldn't break much (I hope). >>> >>> >>> Hi Barry, >>> >>> it depends, what do you call x and what do you call y? >>> MPI_Cart_coords returns a vector, coords - I tend to say x is coords[0], y >>> is coords[1] and z is coords[2]. 
For what it's worth, there's a short code >>> appended to this email, which produces: >>> >>> rank = 0 has Cartesian coords = { 0, 0 } >>> rank = 1 has Cartesian coords = { 0, 1 } >>> rank = 2 has Cartesian coords = { 1, 0 } >>> rank = 3 has Cartesian coords = { 1, 1 } >>> rank = 0 has DA range x=[0,50) and y=[0,50) >>> rank = 1 has DA range x=[50,100) and y=[0,50) >>> rank = 2 has DA range x=[0,50) and y=[50,100) >>> rank = 3 has DA range x=[50,100) and y=[50,100) >>> >>>>>> I don't completely understand what goes wrong. Is it because YOUR >>>>>> application orders the processors related to geometry in the following >>>>>> way? >>>>>> >>>>>> ^ y direction >>>>>> | >>>>>> 2 5 8 >>>>>> 1 4 7 >>>>>> 0 3 6 >>>>>> >>>>>> -> x direction >>>>>> >>>>>> Or is this something inherent in MPI_Cart_create? >>>>> >>>>> >>> >>> For my interpretation of x and y, MPI_Cart_create produces the above >>> layout. But if I said x=coords[1] and y=coords[0], then it would match the >>> one below. >>> >>>>>> >>>>>> PETSc does it so >>>>>> >>>>>> ^ y direction >>>>>> | >>>>>> 6 7 8 >>>>>> 3 4 5 >>>>>> 0 1 2 >>>>>> >>>>>> -> x direction >>>>>> >>>>>> >>>>> >>> >>> Code and makefile attached ... hopefully within the message size limit. >>> Just make cartcommtest. >>> >>> Sean >>> >>> >>> >>> >> >> > > From bsmith at mcs.anl.gov Fri May 19 12:31:42 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 19 May 2006 12:31:42 -0500 (CDT) Subject: Petsc Development Question In-Reply-To: <446DEF6E.9020700@cfdrc.com> References: <446DEF6E.9020700@cfdrc.com> Message-ID: Paul, This is a very appropriate question at this time since multi-core seems to be the wave of the future. The PETSc model will be the same, each "process" will continue have its own "processor" (core) and each process will have its OWN address space (memory it can access directly from the programming language). That is, the OpenMP model will NOT come into play. The "standard" MPI model will continue (with one MPI rank per process (core)). BUT: for performance benefits from multi-core, the two "parallel communication" operations that dominate in PETSc: global reductions (like VecNorm) and neighbor to neighbor communication (VecScatter) will "by-pass" the overhead of traditional MPI call overhead WITHIN a node (all the cores on the same chip). This will be completely transparent to PETSc users and in fact almost completely transparent to PETSc (I have discussed these issues with the MPICH group at ANL). I feel any other approach is completely unrealistic, there is no way we could recode PETSc to use OpenMP on a node and require users to do so as well! MOST IMPORTANTLY that approach would NOT offer significent improvement over our approach (if I am wrong about this, only after I have seen a real demonstration of real timings on a real application on a real machine, will my thinking change.). Barry General note: sparse matrix and vector computations will not benefit much at all from multi-core (that is two cores will not give anywhere near twice the performance of one core) if the memory-bandwidth to the cores is not high enough. So when looking at multi-core, look at the bandwidth PER core, unless that is high, don't buy the system, you'll be buying cores that don't do much for you but heat up the machine. On Fri, 19 May 2006, Paul Dionne wrote: > Barry, > > You visited us a while back at CFD Research Corporation, and we are finally > getting around to starting the project for which we will be using Petsc. 
I > have a couple of questions about your potential future work. My parallel > background is fairly weak, so I'll apologize in advance if these questions > are either silly or worded wrong. > > Do you have any plans on development for exploiting multi-core processors? > One of the guys here was saying that a parallel code could be sped up > considerably if communication was arranged to take advantage of the faster > communication within a computer compared to communication between computers. > > If you have plans for this, or would be considering it in the future, would I > be correct in assuming that this is something that would be controlled > entirely by Petsc? In other words, we could develop our code with Petsc as > it is, and if a multi-core version came out later there is nothing additional > that we would have to do to gain the additional speed. > > Thanks, > Paul Dionne > > From sean at trialphaenergy.com Fri May 19 15:04:28 2006 From: sean at trialphaenergy.com (Sean Dettrick) Date: Fri, 19 May 2006 16:04:28 -0400 Subject: DAcreate2d process layout order In-Reply-To: References: <446D2F1D.9040201@trialphaenergy.com> <446D3639.9070306@trialphaenergy.com> Message-ID: <446E24CC.7030806@trialphaenergy.com> Barry, just to follow up, I created a cartesian communicator workaround, without changing the order of x and y in the DA calls. Instead I permuted my periodicity orders, created the 2D cartesian communicator, and then created with MPI_Cart_sub an array of 1D cartesian communicators, one for each dimension, with the dimension orders permuted. Now I can use, say, Cartcomm1D[0] to send in the 0th (x) direction of the DA, with the periodicity matching DA_XPERIODIC. I still can't use the 2D communicator - it still has confused dimension order and periodicity. But the 1D communicators are a good compromise. Thanks for your help, Sean Barry Smith wrote: > > Ok, so it is laid-out like a matrix, not like x, y coordinates. > > Will put the change to DA on the list of things to do. > > > Barry > > On Thu, 18 May 2006, Sean Dettrick wrote: > >> Barry Smith wrote: >> >>> >>> Bill, does the MPI standard dictate this decomposition or >>> could different implementations do it the opposite way? >>> Then we'd have to make the DA logic a bit more complicated. >> >> >> I don't have a copy of the standard, but to quote page 255 of "MPI, >> the complete reference" by Snir et al: >> "Row-major numbering is always used for the processes in a Cartesian >> structure". >> Their diagram in figure 6.1 matches my code output for coords >> couplets (i,j): >> >> 0 1 2 3 >> (0,0) (0,1) (0,2) (0,3) >> >> 4 5 6 7 >> (1,0) (1,1) (1,2) (1,3) >> >> 8 9 10 11 >> (2,0) (2,1) (2,2) (2,3) >> >> >> By the way I agree with you, I *should* be able to swap the x and y >> myself. Just haven't had much luck yet in that regard. >> >> Sean >> >>> >>> >>> On Thu, 18 May 2006, Sean Dettrick wrote: >>> >>>> Barry Smith wrote: >>>> >>>>> On Thu, 18 May 2006, Sean Dettrick wrote: >>>>> >>>>>> Hi Barry, >>>>>> the order is determined by MPI_Cart_create. >>>>> >>>>> >>>>> >>>>> >>>>> Do you mean that MPI_Cart_create() orders across the 2nd (y-axis) >>>>> fastest and then the first (x-axis)? Hmmm, maybe we should change the >>>>> DA? Changing it once and for all (not supporting both) is probably >>>>> not a big deal and shouldn't break much (I hope). >>>> >>>> >>>> >>>> Hi Barry, >>>> >>>> it depends, what do you call x and what do you call y? 
>>>> MPI_Cart_coords returns a vector, coords - I tend to say x is >>>> coords[0], y is coords[1] and z is coords[2]. For what it's worth, >>>> there's a short code appended to this email, which produces: >>>> >>>> rank = 0 has Cartesian coords = { 0, 0 } >>>> rank = 1 has Cartesian coords = { 0, 1 } >>>> rank = 2 has Cartesian coords = { 1, 0 } >>>> rank = 3 has Cartesian coords = { 1, 1 } >>>> rank = 0 has DA range x=[0,50) and y=[0,50) >>>> rank = 1 has DA range x=[50,100) and y=[0,50) >>>> rank = 2 has DA range x=[0,50) and y=[50,100) >>>> rank = 3 has DA range x=[50,100) and y=[50,100) >>>> >>>>>>> I don't completely understand what goes wrong. Is it because YOUR >>>>>>> application orders the processors related to geometry in the >>>>>>> following way? >>>>>>> >>>>>>> ^ y direction >>>>>>> | >>>>>>> 2 5 8 >>>>>>> 1 4 7 >>>>>>> 0 3 6 >>>>>>> >>>>>>> -> x direction >>>>>>> >>>>>>> Or is this something inherent in MPI_Cart_create? >>>>>> >>>>>> >>>>>> >>>> >>>> For my interpretation of x and y, MPI_Cart_create produces the >>>> above layout. But if I said x=coords[1] and y=coords[0], then it >>>> would match the one below. >>>> >>>>>>> >>>>>>> PETSc does it so >>>>>>> >>>>>>> ^ y direction >>>>>>> | >>>>>>> 6 7 8 >>>>>>> 3 4 5 >>>>>>> 0 1 2 >>>>>>> >>>>>>> -> x direction >>>>>>> >>>>>>> >>>>>> >>>> >>>> Code and makefile attached ... hopefully within the message size >>>> limit. >>>> Just make cartcommtest. >>>> >>>> Sean >>>> >>>> >>>> >>>> >>> >>> >> >> > > From randy at geosystem.us Mon May 22 10:41:16 2006 From: randy at geosystem.us (Randall Mackie) Date: Mon, 22 May 2006 08:41:16 -0700 Subject: symmetric matrices Message-ID: <4471DB9C.2080601@geosystem.us> I'm trying to modify my code to take advantage of the fact that my system matrix is symmetric. It seemed from reading the manual pages that I could call MatCreateMPISBAIJ to create a symmetric data matrix. However, when I tried this, I got the error: [1]PETSC ERROR: Lower triangular value cannot be set for sbaij format. Ignoring these value s, run with -mat_ignore_lower_triangular or call MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGUL AR)! So I tried including the line: call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) But when I tried to compile the code, I then got this message: fortcom: Error: csem3dfwd.F, line 771: This name does not have a type, and must have an explicit type. [MAT_IGNORE_LOWER_TRIANGULAR] call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) --------------------------^ So, can anyone point me in the right direction for taking advantage of the fact that my matrices are symmetric? (they are actually complex symmetric, and NOT Hermitian). Thanks, Randy -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Mon May 22 10:47:47 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2006 10:47:47 -0500 Subject: symmetric matrices In-Reply-To: <4471DB9C.2080601@geosystem.us> References: <4471DB9C.2080601@geosystem.us> Message-ID: The Fortran interface fell behind. I will fix it in dev, but for now you can just use the value, 91. You can also use the command line option. Thanks, Matt On 5/22/06, Randall Mackie wrote: > > I'm trying to modify my code to take advantage of the fact that my system > matrix > is symmetric. > > It seemed from reading the manual pages that I could call > MatCreateMPISBAIJ > to create a symmetric data matrix. 
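In C, a sketch of the SBAIJ setup being discussed (block size 1, which is the symmetric counterpart of MPIAIJ; the preallocation numbers are placeholders, and under-preallocating is the usual reason assembly appears to hang, which running with -info makes visible as repeated mallocs):

    #include "petscmat.h"

    PetscErrorCode CreateSymmetricMatrix(PetscInt mlocal, PetscInt Mglobal, Mat *A)
    {
      PetscErrorCode ierr;

      /* only the upper triangle (including the diagonal) is stored */
      ierr = MatCreateMPISBAIJ(PETSC_COMM_WORLD, 1, mlocal, mlocal, Mglobal, Mglobal,
                               7, PETSC_NULL, 3, PETSC_NULL, A);CHKERRQ(ierr);

      /* drop any lower-triangular entries the assembly code still generates,
         the same effect as -mat_ignore_lower_triangular */
      ierr = MatSetOption(*A, MAT_IGNORE_LOWER_TRIANGULAR);CHKERRQ(ierr);

      /* ... then MatSetValues()/MatSetValuesLocal() with column index >= row index,
         and MatAssemblyBegin/End as usual ... */
      return 0;
    }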
> > However, when I tried this, I got the error: > > [1]PETSC ERROR: Lower triangular value cannot be set for sbaij format. > Ignoring these value > s, run with -mat_ignore_lower_triangular or call > MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGUL > AR)! > > So I tried including the line: > > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > > But when I tried to compile the code, I then got this message: > > fortcom: Error: csem3dfwd.F, line 771: This name does not have a type, and > must have an explicit type. [MAT_IGNORE_LOWER_TRIANGULAR] > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > --------------------------^ > > > So, can anyone point me in the right direction for taking advantage of the > fact that > my matrices are symmetric? (they are actually complex symmetric, and NOT > Hermitian). > > Thanks, Randy > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From randy at geosystem.us Mon May 22 10:59:47 2006 From: randy at geosystem.us (Randall Mackie) Date: Mon, 22 May 2006 08:59:47 -0700 Subject: symmetric matrices In-Reply-To: References: <4471DB9C.2080601@geosystem.us> Message-ID: <4471DFF3.3030304@geosystem.us> Hi Matt, Thanks, that fixed the compilation problem, but then the program just seemed to hang at the place where it was creating/assembling the matrix. Maybe I do not understand how to set bs, the block size for the call to MatCreateMPISBAIJ, any advice? Are symmetric MPI BAIJ matrices my only choice? Is there not a symmetric MPIAIJ matrix format? Thanks, Randy Matthew Knepley wrote: > The Fortran interface fell behind. I will fix it in dev, but for now you can > just use the value, 91. You can also use the command line option. > > Thanks, > > Matt > > On 5/22/06, *Randall Mackie* > wrote: > > I'm trying to modify my code to take advantage of the fact that my > system matrix > is symmetric. > > It seemed from reading the manual pages that I could call > MatCreateMPISBAIJ > to create a symmetric data matrix. > > However, when I tried this, I got the error: > > [1]PETSC ERROR: Lower triangular value cannot be set for sbaij > format. Ignoring these value > s, run with -mat_ignore_lower_triangular or call > MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGUL > AR)! > > So I tried including the line: > > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > > But when I tried to compile the code, I then got this message: > > fortcom: Error: csem3dfwd.F, line 771: This name does not have a > type, and must have an explicit type. [MAT_IGNORE_LOWER_TRIANGULAR] > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > --------------------------^ > > > So, can anyone point me in the right direction for taking advantage > of the fact that > my matrices are symmetric? (they are actually complex symmetric, and > NOT Hermitian). > > Thanks, Randy > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir > Alec Guiness -- Randall Mackie GSY-USA, Inc. 
PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Mon May 22 11:11:13 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2006 11:11:13 -0500 Subject: symmetric matrices In-Reply-To: <4471DFF3.3030304@geosystem.us> References: <4471DB9C.2080601@geosystem.us> <4471DFF3.3030304@geosystem.us> Message-ID: On 5/22/06, Randall Mackie wrote: > > Hi Matt, > > Thanks, that fixed the compilation problem, but then the program just > seemed to > hang at the place where it was creating/assembling the matrix. My guess is that it is not hanging, but rather allocating all over the place. If you give -info, it shouldlet you know how manty mallocs were used. Maybe I do not understand how to set bs, the block size for the call > to MatCreateMPISBAIJ, any advice? If you do not know the block size, just use 1 to start. Are symmetric MPI BAIJ matrices my only choice? Is there not a symmetric > MPIAIJ > matrix format? That is really block size 1. Matt Thanks, Randy > > > Matthew Knepley wrote: > > The Fortran interface fell behind. I will fix it in dev, but for now you > can > > just use the value, 91. You can also use the command line option. > > > > Thanks, > > > > Matt > > > > On 5/22/06, *Randall Mackie* > > wrote: > > > > I'm trying to modify my code to take advantage of the fact that my > > system matrix > > is symmetric. > > > > It seemed from reading the manual pages that I could call > > MatCreateMPISBAIJ > > to create a symmetric data matrix. > > > > However, when I tried this, I got the error: > > > > [1]PETSC ERROR: Lower triangular value cannot be set for sbaij > > format. Ignoring these value > > s, run with -mat_ignore_lower_triangular or call > > MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGUL > > AR)! > > > > So I tried including the line: > > > > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > > > > But when I tried to compile the code, I then got this message: > > > > fortcom: Error: csem3dfwd.F, line 771: This name does not have a > > type, and must have an explicit type. > [MAT_IGNORE_LOWER_TRIANGULAR] > > call MatSetOption(A,MAT_IGNORE_LOWER_TRIANGULAR,ierr) > > --------------------------^ > > > > > > So, can anyone point me in the right direction for taking advantage > > of the fact that > > my matrices are symmetric? (they are actually complex symmetric, and > > NOT Hermitian). > > > > Thanks, Randy > > > > -- > > Randall Mackie > > GSY-USA, Inc. > > PMB# 643 > > 2261 Market St., > > San Francisco, CA 94114-1600 > > Tel (415) 469-8649 > > Fax (415) 469-5044 > > > > California Registered Geophysicist > > License No. GP 1034 > > > > > > > > > > -- > > "Failure has a thousand explanations. Success doesn't need one" -- Sir > > Alec Guiness > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From mep4gk01 at ucy.ac.cy Mon May 22 14:18:17 2006 From: mep4gk01 at ucy.ac.cy (mep4gk01 at ucy.ac.cy) Date: Mon, 22 May 2006 22:18:17 +0300 Subject: Laplacian solver Message-ID: Dear Sirs, My name is George Katsambas and I am an MSc student at the University of Cyprus (mechanical engineering department). 
I am trying to parallelize a Laplacian solver using the BiCGStab method (in Fortran). After a lot of effort I created, using MPI, the system Ax=b, but it was very difficult for me to parallelize the serial solver, so I downloaded and installed PETSc. I read the manual and ran the tutorials, but I cannot understand how I can pass my data into the PETSc routines in order to prepare the data for the KSP solver. Can you please give some guidelines on how I can do this, and whether it is possible? Your help will be very valuable for me. Best regards, George Katsambas
From bsmith at mcs.anl.gov Mon May 22 17:09:36 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 May 2006 17:09:36 -0500 (CDT) Subject: Laplacian solver In-Reply-To: References: Message-ID: George, First, is it a structured grid in 2 or 3 dimensions, or is it unstructured with finite elements? If a structured grid, you should use an example like src/ksp/ksp/examples/tutorials/ex29.c or ex22f.F or ex22.c or ex34.c. If it is finite elements on an unstructured grid then it is much more difficult. We do not yet have the tools for easily parallelizing all the mesh management (which is the hard part). For the solver all you need to do is loop over the elements, compute the stiffness for each element, and call MatSetValues() or MatSetValuesLocal() to insert it into the matrix. Similar for the load vector. Good luck, Barry On Mon, 22 May 2006, mep4gk01 at ucy.ac.cy wrote: > Dear Sirs, > > My name is George Katsambas and I am an MSc student at the University of Cyprus (mechanical engineering department). I am trying to parallelize a Laplacian solver using the BiCGStab method (in Fortran). After a lot of effort I created, using MPI, the system Ax=b, but it was very difficult for me to parallelize the serial solver, so I downloaded and installed PETSc. I read the manual and ran the tutorials, but I cannot understand how I can pass my data into the PETSc routines in order to prepare the data for the KSP solver. Can you please give some guidelines on how I can do this, and whether it is possible? Your help will be very valuable for me. > > Best regards, > George Katsambas > >
From knepley at gmail.com Mon May 22 17:09:50 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2006 17:09:50 -0500 Subject: Laplacian solver In-Reply-To: References: Message-ID: I suggest looking at the examples. For instance, this is the FD Laplacian: http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html If you have a logically Cartesian grid, I advise you to use the DA object: http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex5.c.html Matt On 5/22/06, mep4gk01 at ucy.ac.cy wrote: > > Dear Sirs, > > My name is George Katsambas and I am an MSc student at the University of > Cyprus (mechanical engineering department). I am trying to parallelize a > Laplacian solver using the BiCGStab method (in Fortran). After a lot of effort I > created, using MPI, the system Ax=b, but it was very difficult for me to > parallelize the serial solver, so I downloaded and installed PETSc. I read the > manual and ran the tutorials, but I cannot understand how I can pass my > data into the PETSc routines in order to prepare the data for the KSP > solver. Can you please give some guidelines on how I can do this, and > whether it is possible? Your help will be very valuable for me. > > Best regards, > George Katsambas > > -- "Failure has a thousand explanations.
Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 22 17:11:36 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 May 2006 17:11:36 -0500 (CDT) Subject: Laplacian solver In-Reply-To: References: Message-ID: Please do NOT look at this example. It is too simple and doesn't make clear at all how to properly parallelize the data for the matrix. Barry On Mon, 22 May 2006, Matthew Knepley wrote: > I suggest looking at the examples. For instance, this is the FD Laplacian: > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html > > If you have a logically Cartesian grid, I advise you to use the DA object: > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex5.c.html > > Matt > > On 5/22/06, mep4gk01 at ucy.ac.cy wrote: >> >> Dear Sirs, >> >> My name is George Katsambas and I am an MSc student at the University of >> Cyprus (mechanical engineering department). I am trying to parallelize a >> Laplacian sovler using BiCGStab method (in Fortran). After a lot of effort >> I >> create, using MPI, the system Ax=b but it was very difficult for me, to >> parallelize the serial solver so I download and installed PETSc. I read >> the >> manual and I run the tutorials but I cannot understand how I can pass my >> data into the PETSc routines in order to prepare the data for the KSP >> solver. Can you please give some quitelines how I can do this and if it is >> possible to be done. Your help will be very valuable for me. >> >> Best regards, >> George Katsambas >> >> > > > From jbakosi at gmu.edu Wed May 24 12:52:27 2006 From: jbakosi at gmu.edu (Jozsef Bakosi) Date: Wed, 24 May 2006 13:52:27 -0400 Subject: MatGetColumnVector() Message-ID: <20060524175227.GB5406@debian> Hi all, I'm trying to use MatGetColumnVector() to extract a column from a matrix, but even if I put a MatAssemblyBegin/MatAssemblyEnd pair right before it, I always get the error message during run: [0]PETSC ERROR: MatGetRow() line 163 in src/mat/interface/matrix.c [0]PETSC ERROR: Object is in wrong state! [0]PETSC ERROR: Not for unassembled matrix! [0]PETSC ERROR: MatGetColumnVector() line 61 in src/mat/utils/getcolv.c What can cause this problem? I'm using PETSc 2.3.1. Jozsef -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 191 bytes Desc: Digital signature URL: From bsmith at mcs.anl.gov Wed May 24 21:06:43 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 May 2006 21:06:43 -0500 (CDT) Subject: MatGetColumnVector() In-Reply-To: <20060524175227.GB5406@debian> References: <20060524175227.GB5406@debian> Message-ID: Jozsef, Strange. What matrix type? Parallel or sequential? Please send code or code segment. Barry On Wed, 24 May 2006, Jozsef Bakosi wrote: > Hi all, > > I'm trying to use MatGetColumnVector() to extract a column from a > matrix, but even if I put a MatAssemblyBegin/MatAssemblyEnd pair > right before it, I always get the error message during run: > > [0]PETSC ERROR: MatGetRow() line 163 in src/mat/interface/matrix.c > [0]PETSC ERROR: Object is in wrong state! > [0]PETSC ERROR: Not for unassembled matrix! > [0]PETSC ERROR: MatGetColumnVector() line 61 in src/mat/utils/getcolv.c > > What can cause this problem? > > I'm using PETSc 2.3.1. 
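To make the calling sequence around MatGetColumnVector() concrete, here is a small self-contained sketch (written in Fortran to match the rest of this digest; the matrix type, sizes and names are made up, and the same sequence applies from C). The key point for the error quoted above is that the matrix has to be in the assembled state when the column is extracted: a common way to lose that state is to call MatSetValues() again, or to use only MAT_FLUSH_ASSEMBLY, after what was meant to be the final assembly.

#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscmat.h"

      program getcolsketch
      implicit none
      Mat            A
      Vec            v
      PetscInt       n, i, i0, i1, col, rstart, rend, mloc, nloc
      PetscScalar    val
      PetscErrorCode ierr

      call PetscInitialize(PETSC_NULL_CHARACTER,ierr)
      n  = 20
      i0 = 0
      i1 = 1

!     small diagonal test matrix spread across the processes
      call MatCreateMPIAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,
     &     n,n,i1,PETSC_NULL_INTEGER,i0,PETSC_NULL_INTEGER,A,ierr)
      call MatGetOwnershipRange(A,rstart,rend,ierr)
      do i = rstart, rend-1
         val = i + 1
         call MatSetValues(A,i1,i,i1,i,val,INSERT_VALUES,ierr)
      enddo

!     the matrix must be assembled with MAT_FINAL_ASSEMBLY before the
!     column is extracted; setting more values afterwards would make
!     it "unassembled" again
      call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
      call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)

!     the extracted column has the same parallel layout as the rows
      call MatGetLocalSize(A,mloc,nloc,ierr)
      call VecCreateMPI(PETSC_COMM_WORLD,mloc,PETSC_DETERMINE,v,ierr)
      col = 0
      call MatGetColumnVector(A,v,col,ierr)

      call VecDestroy(v,ierr)
      call MatDestroy(A,ierr)
      call PetscFinalize(ierr)
      end

If the error persists even with this ordering, it is worth checking that the Mat handle passed to MatGetColumnVector() is the same object that was actually assembled.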
> > Jozsef > From randy at geosystem.us Sat May 27 17:00:18 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 15:00:18 -0700 Subject: bombing out writing large scratch files Message-ID: <4478CBF2.20600@geosystem.us> In my PETSc based modeling code, I write out intermediate results to a scratch file, and then read them back later. This has worked fine up until today, when for a large model, this seems to be causing my program to crash with errors like: ------------------------------------------------------------------------ [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range I've tracked down the offending code to: IF (rank == 0) THEN irec=(iper-1)*2+ipol write(7,rec=irec) (xvec(i),i=1,np) END IF It writes out xvec for the first record, but then on the second record my program is crashing. The record length (from an inquire statement) is recl 22626552 The size of the scratch file when my program crashes is 98M. PETSc is compiled using the intel compilers (v9.0 for fortran), and the users manual says that you can have record lengths of up to 2 billion bytes. I'm kind of stuck as to what might be the cause. Any ideas from anyone would be greatly appreciated. Randy Mackie ps. I've tried both the optimized and debugging versions of the PETSc libraries, with the same result. -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Sat May 27 17:17:37 2006 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 May 2006 17:17:37 -0500 Subject: bombing out writing large scratch files In-Reply-To: <4478CBF2.20600@geosystem.us> References: <4478CBF2.20600@geosystem.us> Message-ID: The best thing to do here is get a stack trace from the debugger. From the description, it is hard to tell what statement is trying to access which illegal memory. Matt On 5/27/06, Randall Mackie wrote: > > In my PETSc based modeling code, I write out intermediate results to a > scratch > file, and then read them back later. This has worked fine up until today, > when for a large model, this seems to be causing my program to crash with > errors like: > > ------------------------------------------------------------------------ > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > I've tracked down the offending code to: > > IF (rank == 0) THEN > irec=(iper-1)*2+ipol > write(7,rec=irec) (xvec(i),i=1,np) > END IF > > It writes out xvec for the first record, but then on the second > record my program is crashing. > > The record length (from an inquire statement) is recl 22626552 > > The size of the scratch file when my program crashes is 98M. > > PETSc is compiled using the intel compilers (v9.0 for fortran), > and the users manual says that you can have record lengths of > up to 2 billion bytes. > > I'm kind of stuck as to what might be the cause. Any ideas from anyone > would be greatly appreciated. > > Randy Mackie > > ps. I've tried both the optimized and debugging versions of the PETSc > libraries, with the same result. > > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. 
Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 27 17:32:03 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 May 2006 17:32:03 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4478CBF2.20600@geosystem.us> References: <4478CBF2.20600@geosystem.us> Message-ID: Randy, The only "PETSc" related reason for this is that xvec(i), i=1,np is accessing out of range. What is xvec and is it of length 1 to np? Barry On Sat, 27 May 2006, Randall Mackie wrote: > In my PETSc based modeling code, I write out intermediate results to a > scratch > file, and then read them back later. This has worked fine up until today, > when for a large model, this seems to be causing my program to crash with > errors like: > > ------------------------------------------------------------------------ > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > I've tracked down the offending code to: > > IF (rank == 0) THEN > irec=(iper-1)*2+ipol > write(7,rec=irec) (xvec(i),i=1,np) > END IF > > It writes out xvec for the first record, but then on the second > record my program is crashing. > > The record length (from an inquire statement) is recl 22626552 > > The size of the scratch file when my program crashes is 98M. > > PETSc is compiled using the intel compilers (v9.0 for fortran), > and the users manual says that you can have record lengths of > up to 2 billion bytes. > > I'm kind of stuck as to what might be the cause. Any ideas from anyone > would be greatly appreciated. > > Randy Mackie > > ps. I've tried both the optimized and debugging versions of the PETSc > libraries, with the same result. > > > From randy at geosystem.us Sat May 27 17:36:45 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 15:36:45 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> Message-ID: <4478D47D.6070004@geosystem.us> xvec is a double precision complex vector that is dynamically allocated once np is known. I've printed out the np value and it is correct. This works on the first pass, but not the second. This PETSc program has been working just fine for a couple years now, the only difference this time is the size of the model I'm working with, which is substantially larger than typical. I'm going to try to run this in the debugger and see if I can get anymore information. Randy Barry Smith wrote: > > Randy, > > The only "PETSc" related reason for this is that > xvec(i), i=1,np is accessing out of range. What is xvec > and is it of length 1 to np? > > Barry > > > On Sat, 27 May 2006, Randall Mackie wrote: > >> In my PETSc based modeling code, I write out intermediate results to a >> scratch >> file, and then read them back later. This has worked fine up until today, >> when for a large model, this seems to be causing my program to crash with >> errors like: >> >> ------------------------------------------------------------------------ >> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> >> I've tracked down the offending code to: >> >> IF (rank == 0) THEN >> irec=(iper-1)*2+ipol >> write(7,rec=irec) (xvec(i),i=1,np) >> END IF >> >> It writes out xvec for the first record, but then on the second >> record my program is crashing. 
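A concrete version of the length check being asked about, written around the fragment quoted just above (a sketch only; xvec, np, rank, iper and ipol are the names from that fragment, and the SIZE test assumes xvec is an ALLOCATABLE or POINTER array):

!     make sure the buffer really holds np values before the
!     direct-access write
      if (size(xvec) .lt. np) then
         print *, 'xvec holds ', size(xvec), ' values but np = ', np
         stop
      endif

      IF (rank == 0) THEN
         irec=(iper-1)*2+ipol
         write(7,rec=irec) (xvec(i),i=1,np)
      END IF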
>> >> The record length (from an inquire statement) is recl 22626552 >> >> The size of the scratch file when my program crashes is 98M. >> >> PETSc is compiled using the intel compilers (v9.0 for fortran), >> and the users manual says that you can have record lengths of >> up to 2 billion bytes. >> >> I'm kind of stuck as to what might be the cause. Any ideas from anyone >> would be greatly appreciated. >> >> Randy Mackie >> >> ps. I've tried both the optimized and debugging versions of the PETSc >> libraries, with the same result. >> >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From randy at geosystem.us Sat May 27 17:45:46 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 15:45:46 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> Message-ID: <4478D69A.5050002@geosystem.us> This is a stupid question, but how do I start in the debugger if I'm running on a cluster half-way around the world and I'm working on that cluster via ssh? Randy Matthew Knepley wrote: > The best thing to do here is get a stack trace from the debugger. From the > description, it is hard to tell what statement is trying to access which > illegal > memory. > > Matt > > On 5/27/06, *Randall Mackie* > wrote: > > In my PETSc based modeling code, I write out intermediate results to > a scratch > file, and then read them back later. This has worked fine up until > today, > when for a large model, this seems to be causing my program to crash > with > errors like: > > ------------------------------------------------------------------------ > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > I've tracked down the offending code to: > > IF (rank == 0) THEN > irec=(iper-1)*2+ipol > write(7,rec=irec) (xvec(i),i=1,np) > END IF > > It writes out xvec for the first record, but then on the second > record my program is crashing. > > The record length (from an inquire statement) is recl 22626552 > > The size of the scratch file when my program crashes is 98M. > > PETSc is compiled using the intel compilers ( v9.0 for fortran), > and the users manual says that you can have record lengths of > up to 2 billion bytes. > > I'm kind of stuck as to what might be the cause. Any ideas from anyone > would be greatly appreciated. > > Randy Mackie > > ps. I've tried both the optimized and debugging versions of the PETSc > libraries, with the same result. > > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir > Alec Guiness -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Sat May 27 17:47:53 2006 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 May 2006 17:47:53 -0500 Subject: bombing out writing large scratch files In-Reply-To: <4478D69A.5050002@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> Message-ID: 1) Make sure ssh is forwarding X (-Y I think) 2) -start_in_debugger 3) -display :0.0 should do it. 
Matt On 5/27/06, Randall Mackie wrote: > > This is a stupid question, but how do I start in the debugger if I'm > running > on a cluster half-way around the world and I'm working on that cluster > via ssh? > > Randy > > > Matthew Knepley wrote: > > The best thing to do here is get a stack trace from the debugger. From > the > > description, it is hard to tell what statement is trying to access which > > illegal > > memory. > > > > Matt > > > > On 5/27/06, *Randall Mackie* > > wrote: > > > > In my PETSc based modeling code, I write out intermediate results to > > a scratch > > file, and then read them back later. This has worked fine up until > > today, > > when for a large model, this seems to be causing my program to crash > > with > > errors like: > > > > > ------------------------------------------------------------------------ > > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > > Violation, probably memory access out of range > > > > > > I've tracked down the offending code to: > > > > IF (rank == 0) THEN > > irec=(iper-1)*2+ipol > > write(7,rec=irec) (xvec(i),i=1,np) > > END IF > > > > It writes out xvec for the first record, but then on the second > > record my program is crashing. > > > > The record length (from an inquire statement) is recl 22626552 > > > > The size of the scratch file when my program crashes is 98M. > > > > PETSc is compiled using the intel compilers ( v9.0 for fortran), > > and the users manual says that you can have record lengths of > > up to 2 billion bytes. > > > > I'm kind of stuck as to what might be the cause. Any ideas from > anyone > > would be greatly appreciated. > > > > Randy Mackie > > > > ps. I've tried both the optimized and debugging versions of the > PETSc > > libraries, with the same result. > > > > > > -- > > Randall Mackie > > GSY-USA, Inc. > > PMB# 643 > > 2261 Market St., > > San Francisco, CA 94114-1600 > > Tel (415) 469-8649 > > Fax (415) 469-5044 > > > > California Registered Geophysicist > > License No. GP 1034 > > > > > > > > > > -- > > "Failure has a thousand explanations. Success doesn't need one" -- Sir > > Alec Guiness > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 27 17:48:55 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 May 2006 17:48:55 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4478D47D.6070004@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D47D.6070004@geosystem.us> Message-ID: Sometimes a subtle memory bug can lurk under the covers and then appear in a big problem. You can try putting a CHKMEMQ right before the if (rank == ) in the code and run the debug version with -malloc_debug You could also consider valgrind (valgrind.org). Barry On Sat, 27 May 2006, Randall Mackie wrote: > xvec is a double precision complex vector that is dynamically allocated > once np is known. I've printed out the np value and it is correct. > This works on the first pass, but not the second. > > This PETSc program has been working just fine for a couple years now, > the only difference this time is the size of the model I'm working > with, which is substantially larger than typical. 
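Spelled out, the CHKMEMQ suggestion looks something like the following around the fragment in question (a sketch only: CHKMEMQ is a macro from the PETSc Fortran include files, so the source file has to go through the preprocessor, and the check is only useful with a debugging build of PETSc together with the -malloc_debug option):

#include "include/finclude/petsc.h"

!     ... inside the routine that writes the scratch records ...
      CHKMEMQ
      IF (rank == 0) THEN
!        ... the scratch-file write quoted earlier ...
      END IF
      CHKMEMQ

CHKMEMQ mainly validates memory that PETSc itself allocated, so a clean pass does not rule out corruption of buffers allocated directly by the application; valgrind, which Barry also mentions, covers those as well.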
> > I'm going to try to run this in the debugger and see if I can get > anymore information. > > Randy > > > Barry Smith wrote: >> >> Randy, >> >> The only "PETSc" related reason for this is that >> xvec(i), i=1,np is accessing out of range. What is xvec >> and is it of length 1 to np? >> >> Barry >> >> >> On Sat, 27 May 2006, Randall Mackie wrote: >> >>> In my PETSc based modeling code, I write out intermediate results to a >>> scratch >>> file, and then read them back later. This has worked fine up until today, >>> when for a large model, this seems to be causing my program to crash with >>> errors like: >>> >>> ------------------------------------------------------------------------ >>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> >>> >>> I've tracked down the offending code to: >>> >>> IF (rank == 0) THEN >>> irec=(iper-1)*2+ipol >>> write(7,rec=irec) (xvec(i),i=1,np) >>> END IF >>> >>> It writes out xvec for the first record, but then on the second >>> record my program is crashing. >>> >>> The record length (from an inquire statement) is recl 22626552 >>> >>> The size of the scratch file when my program crashes is 98M. >>> >>> PETSc is compiled using the intel compilers (v9.0 for fortran), >>> and the users manual says that you can have record lengths of >>> up to 2 billion bytes. >>> >>> I'm kind of stuck as to what might be the cause. Any ideas from anyone >>> would be greatly appreciated. >>> >>> Randy Mackie >>> >>> ps. I've tried both the optimized and debugging versions of the PETSc >>> libraries, with the same result. >>> >>> >>> >> > > From bsmith at mcs.anl.gov Sat May 27 17:50:26 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 May 2006 17:50:26 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> Message-ID: You can also add -debugger_nodes 9 to indicate you only want node 9 in the debugger; otherwise you get lots of xterms poping up. Barry On Sat, 27 May 2006, Matthew Knepley wrote: > 1) Make sure ssh is forwarding X (-Y I think) > > 2) -start_in_debugger > > 3) -display :0.0 > > should do it. > > Matt > > On 5/27/06, Randall Mackie wrote: >> >> This is a stupid question, but how do I start in the debugger if I'm >> running >> on a cluster half-way around the world and I'm working on that cluster >> via ssh? >> >> Randy >> >> >> Matthew Knepley wrote: >> > The best thing to do here is get a stack trace from the debugger. From >> the >> > description, it is hard to tell what statement is trying to access which >> > illegal >> > memory. >> > >> > Matt >> > >> > On 5/27/06, *Randall Mackie* > > > wrote: >> > >> > In my PETSc based modeling code, I write out intermediate results to >> > a scratch >> > file, and then read them back later. This has worked fine up until >> > today, >> > when for a large model, this seems to be causing my program to crash >> > with >> > errors like: >> > >> > >> ------------------------------------------------------------------------ >> > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> > Violation, probably memory access out of range >> > >> > >> > I've tracked down the offending code to: >> > >> > IF (rank == 0) THEN >> > irec=(iper-1)*2+ipol >> > write(7,rec=irec) (xvec(i),i=1,np) >> > END IF >> > >> > It writes out xvec for the first record, but then on the second >> > record my program is crashing. 
>> > >> > The record length (from an inquire statement) is recl 22626552 >> > >> > The size of the scratch file when my program crashes is 98M. >> > >> > PETSc is compiled using the intel compilers ( v9.0 for fortran), >> > and the users manual says that you can have record lengths of >> > up to 2 billion bytes. >> > >> > I'm kind of stuck as to what might be the cause. Any ideas from >> anyone >> > would be greatly appreciated. >> > >> > Randy Mackie >> > >> > ps. I've tried both the optimized and debugging versions of the >> PETSc >> > libraries, with the same result. >> > >> > >> > -- >> > Randall Mackie >> > GSY-USA, Inc. >> > PMB# 643 >> > 2261 Market St., >> > San Francisco, CA 94114-1600 >> > Tel (415) 469-8649 >> > Fax (415) 469-5044 >> > >> > California Registered Geophysicist >> > License No. GP 1034 >> > >> > >> > >> > >> > -- >> > "Failure has a thousand explanations. Success doesn't need one" -- Sir >> > Alec Guiness >> >> -- >> Randall Mackie >> GSY-USA, Inc. >> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. GP 1034 >> >> > > > From randy at geosystem.us Sat May 27 18:58:44 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 16:58:44 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> Message-ID: <4478E7B4.6070209@geosystem.us> I can't seem to get the debugger to pop up on my screen. When I'm logged into the cluster I'm working on, I can type xterm &, and an xterm pops up on my display. So I know I can get something from the remote cluster. Now, when I try this using PETSc, I'm getting the following error message, for example: ------------------------------------------------------------------------ [17]PETSC ERROR: PETSC: Attaching gdb to /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display 24.5.142.138:0.0 on machine compute-0-23.local ------------------------------------------------------------------------ I'm using this in my command file: source ~/.bashrc time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ -start_in_debugger \ -debugger_node 1 \ -display 24.5.142.138:0.0 \ -em_ksp_type bcgs \ -em_sub_pc_type ilu \ -em_sub_pc_factor_levels 8 \ -em_sub_pc_factor_fill 4 \ -em_sub_pc_factor_reuse_ordering \ -em_sub_pc_factor_reuse_fill \ -em_sub_pc_factor_mat_ordering_type rcm \ -divh_ksp_type cr \ -divh_sub_pc_type icc \ -ppc_sub_pc_type ilu \ << EOF ... Randy Matthew Knepley wrote: > 1) Make sure ssh is forwarding X (-Y I think) > > 2) -start_in_debugger > > 3) -display :0.0 > > should do it. > > Matt > > On 5/27/06, *Randall Mackie* > wrote: > > This is a stupid question, but how do I start in the debugger if I'm > running > on a cluster half-way around the world and I'm working on that cluster > via ssh? > > Randy > > > Matthew Knepley wrote: > > The best thing to do here is get a stack trace from the debugger. > From the > > description, it is hard to tell what statement is trying to > access which > > illegal > > memory. > > > > Matt > > > > On 5/27/06, *Randall Mackie* < randy at geosystem.us > > > >> wrote: > > > > In my PETSc based modeling code, I write out intermediate > results to > > a scratch > > file, and then read them back later. 
This has worked fine up > until > > today, > > when for a large model, this seems to be causing my program > to crash > > with > > errors like: > > > > > ------------------------------------------------------------------------ > > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > > Violation, probably memory access out of range > > > > > > I've tracked down the offending code to: > > > > IF (rank == 0) THEN > > irec=(iper-1)*2+ipol > > write(7,rec=irec) (xvec(i),i=1,np) > > END IF > > > > It writes out xvec for the first record, but then on the second > > record my program is crashing. > > > > The record length (from an inquire statement) is recl > 22626552 > > > > The size of the scratch file when my program crashes is 98M. > > > > PETSc is compiled using the intel compilers ( v9.0 for fortran), > > and the users manual says that you can have record lengths of > > up to 2 billion bytes. > > > > I'm kind of stuck as to what might be the cause. Any ideas > from anyone > > would be greatly appreciated. > > > > Randy Mackie > > > > ps. I've tried both the optimized and debugging versions of > the PETSc > > libraries, with the same result. > > > > > > -- > > Randall Mackie > > GSY-USA, Inc. > > PMB# 643 > > 2261 Market St., > > San Francisco, CA 94114-1600 > > Tel (415) 469-8649 > > Fax (415) 469-5044 > > > > California Registered Geophysicist > > License No. GP 1034 > > > > > > > > > > -- > > "Failure has a thousand explanations. Success doesn't need one" > -- Sir > > Alec Guiness > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir > Alec Guiness -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From bsmith at mcs.anl.gov Sat May 27 19:19:16 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 May 2006 19:19:16 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4478E7B4.6070209@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> Message-ID: That's the correct message; it is trying to do the right thing. My guess is that the cluster node doesn't have a path back to your display. If, for example, the MPI jobs are not started up with ssh with X forwarding. What happens when you do mpirun -np 1 -nolocal -machinefile machines xterm -display 24.5.142.138:0.0 does it open an xterm on your system? BTW: it is -debugger_nodes 1 not -debugger_node 1 How about not using the -nolocal and use -debugger_nodes 0? Barry On Sat, 27 May 2006, Randall Mackie wrote: > I can't seem to get the debugger to pop up on my screen. > > When I'm logged into the cluster I'm working on, I can > type xterm &, and an xterm pops up on my display. So I know > I can get something from the remote cluster. 
> > Now, when I try this using PETSc, I'm getting the following error > message, for example: > > ------------------------------------------------------------------------ > [17]PETSC ERROR: PETSC: Attaching gdb to > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > 24.5.142.138:0.0 on machine compute-0-23.local > ------------------------------------------------------------------------ > > I'm using this in my command file: > > source ~/.bashrc > time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > -start_in_debugger \ > -debugger_node 1 \ > -display 24.5.142.138:0.0 \ > -em_ksp_type bcgs \ > -em_sub_pc_type ilu \ > -em_sub_pc_factor_levels 8 \ > -em_sub_pc_factor_fill 4 \ > -em_sub_pc_factor_reuse_ordering \ > -em_sub_pc_factor_reuse_fill \ > -em_sub_pc_factor_mat_ordering_type rcm \ > -divh_ksp_type cr \ > -divh_sub_pc_type icc \ > -ppc_sub_pc_type ilu \ > << EOF > ... > > > Randy > > > Matthew Knepley wrote: >> 1) Make sure ssh is forwarding X (-Y I think) >> >> 2) -start_in_debugger >> >> 3) -display :0.0 >> >> should do it. >> >> Matt >> >> On 5/27/06, *Randall Mackie* > > wrote: >> >> This is a stupid question, but how do I start in the debugger if I'm >> running >> on a cluster half-way around the world and I'm working on that cluster >> via ssh? >> >> Randy >> >> >> Matthew Knepley wrote: >> > The best thing to do here is get a stack trace from the debugger. >> From the >> > description, it is hard to tell what statement is trying to >> access which >> > illegal >> > memory. >> > >> > Matt >> > >> > On 5/27/06, *Randall Mackie* < randy at geosystem.us >> >> > >> wrote: >> > >> > In my PETSc based modeling code, I write out intermediate >> results to >> > a scratch >> > file, and then read them back later. This has worked fine up >> until >> > today, >> > when for a large model, this seems to be causing my program >> to crash >> > with >> > errors like: >> > >> > >> ------------------------------------------------------------------------ >> > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> > Violation, probably memory access out of range >> > >> > >> > I've tracked down the offending code to: >> > >> > IF (rank == 0) THEN >> > irec=(iper-1)*2+ipol >> > write(7,rec=irec) (xvec(i),i=1,np) >> > END IF >> > >> > It writes out xvec for the first record, but then on the second >> > record my program is crashing. >> > >> > The record length (from an inquire statement) is recl >> 22626552 >> > >> > The size of the scratch file when my program crashes is 98M. >> > >> > PETSc is compiled using the intel compilers ( v9.0 for fortran), >> > and the users manual says that you can have record lengths of >> > up to 2 billion bytes. >> > >> > I'm kind of stuck as to what might be the cause. Any ideas >> from anyone >> > would be greatly appreciated. >> > >> > Randy Mackie >> > >> > ps. I've tried both the optimized and debugging versions of >> the PETSc >> > libraries, with the same result. >> > >> > >> > -- >> > Randall Mackie >> > GSY-USA, Inc. >> > PMB# 643 >> > 2261 Market St., >> > San Francisco, CA 94114-1600 >> > Tel (415) 469-8649 >> > Fax (415) 469-5044 >> > >> > California Registered Geophysicist >> > License No. GP 1034 >> > >> > >> > >> > >> > -- >> > "Failure has a thousand explanations. Success doesn't need one" >> -- Sir >> > Alec Guiness >> >> -- >> Randall Mackie >> GSY-USA, Inc. 
>> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. GP 1034 >> >> >> >> >> -- >> "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec >> Guiness > > From knepley at gmail.com Sat May 27 19:30:52 2006 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 May 2006 19:30:52 -0500 Subject: bombing out writing large scratch files In-Reply-To: <4478E7B4.6070209@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> Message-ID: What the error? It always shows the error when it cannot pop up the window. Sounds like a problem with some batch environment being different from the interactive node. Computer centers are the worst run thing in the world. Matt On 5/27/06, Randall Mackie wrote: > > I can't seem to get the debugger to pop up on my screen. > > When I'm logged into the cluster I'm working on, I can > type xterm &, and an xterm pops up on my display. So I know > I can get something from the remote cluster. > > Now, when I try this using PETSc, I'm getting the following error > message, for example: > > ------------------------------------------------------------------------ > [17]PETSC ERROR: PETSC: Attaching gdb to > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > 24.5.142.138:0.0 on > machine compute-0-23.local > ------------------------------------------------------------------------ > > I'm using this in my command file: > > source ~/.bashrc > time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > -start_in_debugger \ > -debugger_node 1 \ > -display 24.5.142.138:0.0 \ > -em_ksp_type bcgs \ > -em_sub_pc_type ilu \ > -em_sub_pc_factor_levels 8 \ > -em_sub_pc_factor_fill 4 \ > -em_sub_pc_factor_reuse_ordering \ > -em_sub_pc_factor_reuse_fill \ > -em_sub_pc_factor_mat_ordering_type rcm \ > -divh_ksp_type cr \ > -divh_sub_pc_type icc \ > -ppc_sub_pc_type ilu \ > << EOF > ... > > > Randy > > > Matthew Knepley wrote: > > 1) Make sure ssh is forwarding X (-Y I think) > > > > 2) -start_in_debugger > > > > 3) -display :0.0 > > > > should do it. > > > > Matt > > > > On 5/27/06, *Randall Mackie* > > wrote: > > > > This is a stupid question, but how do I start in the debugger if I'm > > running > > on a cluster half-way around the world and I'm working on that > cluster > > via ssh? > > > > Randy > > > > > > Matthew Knepley wrote: > > > The best thing to do here is get a stack trace from the debugger. > > From the > > > description, it is hard to tell what statement is trying to > > access which > > > illegal > > > memory. > > > > > > Matt > > > > > > On 5/27/06, *Randall Mackie* < randy at geosystem.us > > > > > >> wrote: > > > > > > In my PETSc based modeling code, I write out intermediate > > results to > > > a scratch > > > file, and then read them back later. 
This has worked fine up > > until > > > today, > > > when for a large model, this seems to be causing my program > > to crash > > > with > > > errors like: > > > > > > > > > ------------------------------------------------------------------------ > > > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > > > Violation, probably memory access out of range > > > > > > > > > I've tracked down the offending code to: > > > > > > IF (rank == 0) THEN > > > irec=(iper-1)*2+ipol > > > write(7,rec=irec) (xvec(i),i=1,np) > > > END IF > > > > > > It writes out xvec for the first record, but then on the > second > > > record my program is crashing. > > > > > > The record length (from an inquire statement) is recl > > 22626552 > > > > > > The size of the scratch file when my program crashes is 98M. > > > > > > PETSc is compiled using the intel compilers ( v9.0 for > fortran), > > > and the users manual says that you can have record lengths of > > > up to 2 billion bytes. > > > > > > I'm kind of stuck as to what might be the cause. Any ideas > > from anyone > > > would be greatly appreciated. > > > > > > Randy Mackie > > > > > > ps. I've tried both the optimized and debugging versions of > > the PETSc > > > libraries, with the same result. > > > > > > > > > -- > > > Randall Mackie > > > GSY-USA, Inc. > > > PMB# 643 > > > 2261 Market St., > > > San Francisco, CA 94114-1600 > > > Tel (415) 469-8649 > > > Fax (415) 469-5044 > > > > > > California Registered Geophysicist > > > License No. GP 1034 > > > > > > > > > > > > > > > -- > > > "Failure has a thousand explanations. Success doesn't need one" > > -- Sir > > > Alec Guiness > > > > -- > > Randall Mackie > > GSY-USA, Inc. > > PMB# 643 > > 2261 Market St., > > San Francisco, CA 94114-1600 > > Tel (415) 469-8649 > > Fax (415) 469-5044 > > > > California Registered Geophysicist > > License No. GP 1034 > > > > > > > > > > -- > > "Failure has a thousand explanations. Success doesn't need one" -- Sir > > Alec Guiness > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From randy at geosystem.us Sat May 27 19:38:56 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 17:38:56 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> Message-ID: <4478F120.6030700@geosystem.us> This is frustrating. Now I'm getting this Message: [randy at cluster Delta_May06]$ ./cmd_inv_petsc_3_3 Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. Warning: No xauth data; using fake authentication data for X11 forwarding. 
Warning: No xauth data; using fake authentication data for X11 forwarding. ------------------------------------------------------------------------ Petsc Release Version 2.3.1, Patch 13, Wed May 10 11:08:35 CDT 2006 BK revision: balay at asterix.mcs.anl.gov|ChangeSet|20060510160640|13832 See docs/changes/index.html for recent updates. See docs/faq.html for hints about trouble shooting. See docs/index.html for manual pages. ------------------------------------------------------------------------ /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc on a linux-gnu named cluster.geo.hpc by randy Sun May 28 02:26:04 2006 Libraries linked from /home/randy/SPARSE/petsc-2.3.1-p13/lib/linux-gnu Configure run at Sun May 28 00:21:24 2006 Configure options --with-fortran --with-fortran-kernels=generic --with-blas-lapack-dir=/opt/intel/mkl72cluster/lib/32 --with-scalar-type=complex --with-debugging=1 --with-mpi-dir=/opt/mpich/intel --with-shared=0 ------------------------------------------------------------------------ [0]PETSC ERROR: PETSC: Attaching gdb to /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 4861 on display 24.5.142.138:0.0 on machine cluster.geo.hpc I noticed, however, that in /opt/mpich/intel/bin, there are other mpirun commands, like mpirun_dbg.gdb, so I'll give that a try and see if that helps. Randy Barry Smith wrote: > > That's the correct message; it is trying to do the right > thing. My guess is that the cluster node doesn't have a > path back to your display. If, for example, the MPI jobs > are not started up with ssh with X forwarding. > > What happens when you do > mpirun -np 1 -nolocal -machinefile machines xterm -display > 24.5.142.138:0.0 > does it open an xterm on your system? > > BTW: it is -debugger_nodes 1 not -debugger_node 1 > > How about not using the -nolocal and use -debugger_nodes 0? > > Barry > > > On Sat, 27 May 2006, Randall Mackie wrote: > >> I can't seem to get the debugger to pop up on my screen. >> >> When I'm logged into the cluster I'm working on, I can >> type xterm &, and an xterm pops up on my display. So I know >> I can get something from the remote cluster. >> >> Now, when I try this using PETSc, I'm getting the following error >> message, for example: >> >> ------------------------------------------------------------------------ >> [17]PETSC ERROR: PETSC: Attaching gdb to >> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display >> 24.5.142.138:0.0 on machine compute-0-23.local >> ------------------------------------------------------------------------ >> >> I'm using this in my command file: >> >> source ~/.bashrc >> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ >> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ >> -start_in_debugger \ >> -debugger_node 1 \ >> -display 24.5.142.138:0.0 \ >> -em_ksp_type bcgs \ >> -em_sub_pc_type ilu \ >> -em_sub_pc_factor_levels 8 \ >> -em_sub_pc_factor_fill 4 \ >> -em_sub_pc_factor_reuse_ordering \ >> -em_sub_pc_factor_reuse_fill \ >> -em_sub_pc_factor_mat_ordering_type rcm \ >> -divh_ksp_type cr \ >> -divh_sub_pc_type icc \ >> -ppc_sub_pc_type ilu \ >> << EOF >> ... >> >> >> Randy >> >> >> Matthew Knepley wrote: >>> 1) Make sure ssh is forwarding X (-Y I think) >>> >>> 2) -start_in_debugger >>> >>> 3) -display :0.0 >>> >>> should do it. >>> >>> Matt >>> >>> On 5/27/06, *Randall Mackie* >> > wrote: >>> >>> This is a stupid question, but how do I start in the debugger if I'm >>> running >>> on a cluster half-way around the world and I'm working on that >>> cluster >>> via ssh? 
>>> >>> Randy >>> >>> >>> Matthew Knepley wrote: >>> > The best thing to do here is get a stack trace from the debugger. >>> From the >>> > description, it is hard to tell what statement is trying to >>> access which >>> > illegal >>> > memory. >>> > >>> > Matt >>> > >>> > On 5/27/06, *Randall Mackie* < randy at geosystem.us >>> >>> > >> wrote: >>> > >>> > In my PETSc based modeling code, I write out intermediate >>> results to >>> > a scratch >>> > file, and then read them back later. This has worked fine up >>> until >>> > today, >>> > when for a large model, this seems to be causing my program >>> to crash >>> > with >>> > errors like: >>> > >>> > >>> ------------------------------------------------------------------------ >>> > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>> > Violation, probably memory access out of range >>> > >>> > >>> > I've tracked down the offending code to: >>> > >>> > IF (rank == 0) THEN >>> > irec=(iper-1)*2+ipol >>> > write(7,rec=irec) (xvec(i),i=1,np) >>> > END IF >>> > >>> > It writes out xvec for the first record, but then on the >>> second >>> > record my program is crashing. >>> > >>> > The record length (from an inquire statement) is recl >>> 22626552 >>> > >>> > The size of the scratch file when my program crashes is 98M. >>> > >>> > PETSc is compiled using the intel compilers ( v9.0 for >>> fortran), >>> > and the users manual says that you can have record lengths of >>> > up to 2 billion bytes. >>> > >>> > I'm kind of stuck as to what might be the cause. Any ideas >>> from anyone >>> > would be greatly appreciated. >>> > >>> > Randy Mackie >>> > >>> > ps. I've tried both the optimized and debugging versions of >>> the PETSc >>> > libraries, with the same result. >>> > >>> > >>> > -- >>> > Randall Mackie >>> > GSY-USA, Inc. >>> > PMB# 643 >>> > 2261 Market St., >>> > San Francisco, CA 94114-1600 >>> > Tel (415) 469-8649 >>> > Fax (415) 469-5044 >>> > >>> > California Registered Geophysicist >>> > License No. GP 1034 >>> > >>> > >>> > >>> > >>> > -- >>> > "Failure has a thousand explanations. Success doesn't need one" >>> -- Sir >>> > Alec Guiness >>> >>> -- >>> Randall Mackie >>> GSY-USA, Inc. >>> PMB# 643 >>> 2261 Market St., >>> San Francisco, CA 94114-1600 >>> Tel (415) 469-8649 >>> Fax (415) 469-5044 >>> >>> California Registered Geophysicist >>> License No. GP 1034 >>> >>> >>> >>> >>> -- >>> "Failure has a thousand explanations. Success doesn't need one" -- >>> Sir Alec Guiness >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From randy at geosystem.us Sat May 27 21:51:18 2006 From: randy at geosystem.us (Randall Mackie) Date: Sat, 27 May 2006 19:51:18 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D47D.6070004@geosystem.us> Message-ID: <44791026.8030303@geosystem.us> If using valgrind, can you tell me how to do that with mpirun and a parallel petsc program? is it valgrind mpirun program, or mpirun valgrind program? Randy Barry Smith wrote: > > Sometimes a subtle memory bug can lurk under the covers and > then appear in a big problem. You can try putting a CHKMEMQ > right before the if (rank == ) in the code and run the debug > version with -malloc_debug > You could also consider valgrind (valgrind.org). 
> > Barry > > On Sat, 27 May 2006, Randall Mackie wrote: > >> xvec is a double precision complex vector that is dynamically allocated >> once np is known. I've printed out the np value and it is correct. >> This works on the first pass, but not the second. >> >> This PETSc program has been working just fine for a couple years now, >> the only difference this time is the size of the model I'm working >> with, which is substantially larger than typical. >> >> I'm going to try to run this in the debugger and see if I can get >> anymore information. >> >> Randy >> >> >> Barry Smith wrote: >>> >>> Randy, >>> >>> The only "PETSc" related reason for this is that >>> xvec(i), i=1,np is accessing out of range. What is xvec >>> and is it of length 1 to np? >>> >>> Barry >>> >>> >>> On Sat, 27 May 2006, Randall Mackie wrote: >>> >>>> In my PETSc based modeling code, I write out intermediate results to >>>> a scratch >>>> file, and then read them back later. This has worked fine up until >>>> today, >>>> when for a large model, this seems to be causing my program to crash >>>> with >>>> errors like: >>>> >>>> ------------------------------------------------------------------------ >>>> >>>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>> Violation, probably memory access out of range >>>> >>>> >>>> I've tracked down the offending code to: >>>> >>>> IF (rank == 0) THEN >>>> irec=(iper-1)*2+ipol >>>> write(7,rec=irec) (xvec(i),i=1,np) >>>> END IF >>>> >>>> It writes out xvec for the first record, but then on the second >>>> record my program is crashing. >>>> >>>> The record length (from an inquire statement) is recl 22626552 >>>> >>>> The size of the scratch file when my program crashes is 98M. >>>> >>>> PETSc is compiled using the intel compilers (v9.0 for fortran), >>>> and the users manual says that you can have record lengths of >>>> up to 2 billion bytes. >>>> >>>> I'm kind of stuck as to what might be the cause. Any ideas from anyone >>>> would be greatly appreciated. >>>> >>>> Randy Mackie >>>> >>>> ps. I've tried both the optimized and debugging versions of the PETSc >>>> libraries, with the same result. >>>> >>>> >>>> >>> >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From bsmith at mcs.anl.gov Sat May 27 21:53:54 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 May 2006 21:53:54 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <44791026.8030303@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D47D.6070004@geosystem.us> <44791026.8030303@geosystem.us> Message-ID: mpirun -np 2 valgrind --tool=memcheck executable On Sat, 27 May 2006, Randall Mackie wrote: > If using valgrind, can you tell me how to do that with mpirun and > a parallel petsc program? > > is it valgrind mpirun program, or mpirun valgrind program? > > Randy > > > Barry Smith wrote: >> >> Sometimes a subtle memory bug can lurk under the covers and >> then appear in a big problem. You can try putting a CHKMEMQ >> right before the if (rank == ) in the code and run the debug >> version with -malloc_debug >> You could also consider valgrind (valgrind.org). >> >> Barry >> >> On Sat, 27 May 2006, Randall Mackie wrote: >> >>> xvec is a double precision complex vector that is dynamically allocated >>> once np is known. I've printed out the np value and it is correct. >>> This works on the first pass, but not the second. 
>>> >>> This PETSc program has been working just fine for a couple years now, >>> the only difference this time is the size of the model I'm working >>> with, which is substantially larger than typical. >>> >>> I'm going to try to run this in the debugger and see if I can get >>> anymore information. >>> >>> Randy >>> >>> >>> Barry Smith wrote: >>>> >>>> Randy, >>>> >>>> The only "PETSc" related reason for this is that >>>> xvec(i), i=1,np is accessing out of range. What is xvec >>>> and is it of length 1 to np? >>>> >>>> Barry >>>> >>>> >>>> On Sat, 27 May 2006, Randall Mackie wrote: >>>> >>>>> In my PETSc based modeling code, I write out intermediate results to a >>>>> scratch >>>>> file, and then read them back later. This has worked fine up until >>>>> today, >>>>> when for a large model, this seems to be causing my program to crash >>>>> with >>>>> errors like: >>>>> >>>>> ------------------------------------------------------------------------ >>>>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>> probably memory access out of range >>>>> >>>>> >>>>> I've tracked down the offending code to: >>>>> >>>>> IF (rank == 0) THEN >>>>> irec=(iper-1)*2+ipol >>>>> write(7,rec=irec) (xvec(i),i=1,np) >>>>> END IF >>>>> >>>>> It writes out xvec for the first record, but then on the second >>>>> record my program is crashing. >>>>> >>>>> The record length (from an inquire statement) is recl 22626552 >>>>> >>>>> The size of the scratch file when my program crashes is 98M. >>>>> >>>>> PETSc is compiled using the intel compilers (v9.0 for fortran), >>>>> and the users manual says that you can have record lengths of >>>>> up to 2 billion bytes. >>>>> >>>>> I'm kind of stuck as to what might be the cause. Any ideas from anyone >>>>> would be greatly appreciated. >>>>> >>>>> Randy Mackie >>>>> >>>>> ps. I've tried both the optimized and debugging versions of the PETSc >>>>> libraries, with the same result. >>>>> >>>>> >>>>> >>>> >>> >>> >> > > From balay at mcs.anl.gov Sat May 27 23:03:07 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 27 May 2006 23:03:07 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D47D.6070004@geosystem.us> <44791026.8030303@geosystem.us> Message-ID: On Sat, 27 May 2006, Barry Smith wrote: > mpirun -np 2 valgrind --tool=memcheck executable Note: this works with mpich2 - not mpich1 And its probably best to install a different version of PETSc libraries with mpich2 for such debugging on a non-cluster machine. Satish From balay at mcs.anl.gov Sat May 27 23:33:41 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 27 May 2006 23:33:41 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4478E7B4.6070209@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> Message-ID: Looks like you have direct access to all the cluster nodes. Perhaps you have admin access? 
You can do either of the following: * if the cluster frontend/compute nodes have common filesystem [i.e all machines can see the same file for ~/.Xauthority] and you can get 'sshd' settings on the frontend changed - then: - configure sshd with 'X11UseLocalhost no' - this way xterms on the compute-nodes can connect to the 'ssh-x11' port on the frontend - run the PETSc app with: '-display frontend:ssh-x11-port' * However if the above is not possible - but you can ssh directly to all the the compute nodes [perhaps from the frontend] then you can cascade x11 forwarding with: - ssh from desktop to frontend - ssh from frontend to node-9 [if you know which machine is node9 from the machine file.] - If you don't know which one is the node-9 - then ssh from frontend to all the nodes :). Mostlikely all nodes will get a display 'localhost:l0.0' - so now you can run the executable with the option -display localhost:10.0 The other alternative that might work [for interactive runs] is: -start_in_debugger noxterm -debugger_nodes 9 Satish On Sat, 27 May 2006, Randall Mackie wrote: > I can't seem to get the debugger to pop up on my screen. > > When I'm logged into the cluster I'm working on, I can > type xterm &, and an xterm pops up on my display. So I know > I can get something from the remote cluster. > > Now, when I try this using PETSc, I'm getting the following error > message, for example: > > ------------------------------------------------------------------------ > [17]PETSC ERROR: PETSC: Attaching gdb to > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > 24.5.142.138:0.0 on machine compute-0-23.local > ------------------------------------------------------------------------ > > I'm using this in my command file: > > source ~/.bashrc > time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > -start_in_debugger \ > -debugger_node 1 \ > -display 24.5.142.138:0.0 \ > -em_ksp_type bcgs \ > -em_sub_pc_type ilu \ > -em_sub_pc_factor_levels 8 \ > -em_sub_pc_factor_fill 4 \ > -em_sub_pc_factor_reuse_ordering \ > -em_sub_pc_factor_reuse_fill \ > -em_sub_pc_factor_mat_ordering_type rcm \ > -divh_ksp_type cr \ > -divh_sub_pc_type icc \ > -ppc_sub_pc_type ilu \ > << EOF From randy at geosystem.us Sun May 28 09:22:49 2006 From: randy at geosystem.us (Randall Mackie) Date: Sun, 28 May 2006 07:22:49 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> Message-ID: <4479B239.6010408@geosystem.us> Satish, Thanks, using method (2) worked. However, when I run a bt in gdb, I get the following output: Loaded symbols for /lib/libnss_files.so.2 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 2063 call VecAssemblyBegin(xyz,ierr) (gdb) cont Continuing. Program received signal SIGUSR1, User defined signal 1. [Switching to Thread 1082952160 (LWP 23496)] 0x088cd729 in _intel_fast_memcpy.J () Current language: auto; currently fortran (gdb) bt #0 0x088cd729 in _intel_fast_memcpy.J () #1 0x40620628 in for_write_dir_xmit () from /opt/intel_fc_80/lib/libifcore.so.5 #2 0xbfffa6b0 in ?? () #3 0x00000008 in ?? () #4 0xbfff986c in ?? () #5 0xbfff9890 in ?? () #6 0x406873a8 in __dtors_list_end () from /opt/intel_fc_80/lib/libifcore.so.5 #7 0x00000002 in ?? () #8 0x00000000 in ?? () (gdb) This all makes me think this is an INTEL compiler bug, and has nothing to do with my code. Any ideas? 
Randy Satish Balay wrote: > Looks like you have direct access to all the cluster nodes. Perhaps > you have admin access? You can do either of the following: > > * if the cluster frontend/compute nodes have common filesystem [i.e > all machines can see the same file for ~/.Xauthority] and you can get > 'sshd' settings on the frontend changed - then: > > - configure sshd with 'X11UseLocalhost no' - this way xterms on the > compute-nodes can connect to the 'ssh-x11' port on the frontend > - run the PETSc app with: '-display frontend:ssh-x11-port' > > * However if the above is not possible - but you can ssh directly to > all the the compute nodes [perhaps from the frontend] then you can > cascade x11 forwarding with: > > - ssh from desktop to frontend > - ssh from frontend to node-9 [if you know which machine is node9 > from the machine file.] > - If you don't know which one is the node-9 - then ssh from frontend > to all the nodes :). Mostlikely all nodes will get a display 'localhost:l0.0' > - so now you can run the executable with the option > -display localhost:10.0 > > The other alternative that might work [for interactive runs] is: > > -start_in_debugger noxterm -debugger_nodes 9 > > Satish > > On Sat, 27 May 2006, Randall Mackie wrote: > >> I can't seem to get the debugger to pop up on my screen. >> >> When I'm logged into the cluster I'm working on, I can >> type xterm &, and an xterm pops up on my display. So I know >> I can get something from the remote cluster. >> >> Now, when I try this using PETSc, I'm getting the following error >> message, for example: >> >> ------------------------------------------------------------------------ >> [17]PETSC ERROR: PETSC: Attaching gdb to >> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display >> 24.5.142.138:0.0 on machine compute-0-23.local >> ------------------------------------------------------------------------ >> >> I'm using this in my command file: >> >> source ~/.bashrc >> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ >> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ >> -start_in_debugger \ >> -debugger_node 1 \ >> -display 24.5.142.138:0.0 \ >> -em_ksp_type bcgs \ >> -em_sub_pc_type ilu \ >> -em_sub_pc_factor_levels 8 \ >> -em_sub_pc_factor_fill 4 \ >> -em_sub_pc_factor_reuse_ordering \ >> -em_sub_pc_factor_reuse_fill \ >> -em_sub_pc_factor_mat_ordering_type rcm \ >> -divh_ksp_type cr \ >> -divh_sub_pc_type icc \ >> -ppc_sub_pc_type ilu \ >> << EOF > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From balay at mcs.anl.gov Sun May 28 09:39:33 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 28 May 2006 09:39:33 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4479B239.6010408@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> <4479B239.6010408@geosystem.us> Message-ID: - Not sure what SIGUSR1 means in this context. - The stack doesn't show any PETSc/user code. Was this code compiled with debug version of PETSc? - it could be that gdb is unable to look at intel compilers stack [normally gdb should work]. If thats the case - you could run with '-start_in_debugger idb' - It appears that this breakage is from usercode which calls fortran I/O [for_write_dir_xmit()]. There is no fortran I/O from PETSc side of the code. I think it could still be a bug in the usercode. 
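One concrete way to test that is Barry's earlier suggestion: bracket the
suspect write with CHKMEMQ and run the debug build with -malloc_debug, so
that corruption of PETSc-allocated memory gets reported before the write
itself blows up. Roughly - an untested sketch based on the code fragment
posted earlier:

      IF (rank == 0) THEN
         irec = (iper-1)*2 + ipol
!        check PETSc-allocated memory for corruption before the write
         CHKMEMQ
         write(7,rec=irec) (xvec(i), i=1,np)
!        ... and again right after it
         CHKMEMQ
      END IF

If the first CHKMEMQ trips, something earlier has already scribbled over
the heap; if neither trips but the write still crashes, the corruption is
probably in memory PETSc does not track (the I/O buffers, or stack space
such as large automatic arrays).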
However PETSc does try to detect the availability of _intel_fast_memcpy() and use it from C side. I don't think this is the cause. But to verify you could remove the flag PETSC_HAVE__INTEL_FAST_MEMCPY from petscconf.h and rebuild libraries. Satish On Sun, 28 May 2006, Randall Mackie wrote: > Satish, > > Thanks, using method (2) worked. However, when I run a bt in gdb, > I get the following output: > > Loaded symbols for /lib/libnss_files.so.2 > 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 > 2063 call VecAssemblyBegin(xyz,ierr) > (gdb) cont > Continuing. > > Program received signal SIGUSR1, User defined signal 1. > [Switching to Thread 1082952160 (LWP 23496)] > 0x088cd729 in _intel_fast_memcpy.J () > Current language: auto; currently fortran > (gdb) bt > #0 0x088cd729 in _intel_fast_memcpy.J () > #1 0x40620628 in for_write_dir_xmit () > from /opt/intel_fc_80/lib/libifcore.so.5 > #2 0xbfffa6b0 in ?? () > #3 0x00000008 in ?? () > #4 0xbfff986c in ?? () > #5 0xbfff9890 in ?? () > #6 0x406873a8 in __dtors_list_end () from /opt/intel_fc_80/lib/libifcore.so.5 > #7 0x00000002 in ?? () > #8 0x00000000 in ?? () > (gdb) > > This all makes me think this is an INTEL compiler bug, and has nothing to > do with my code. > > Any ideas? > > Randy > > > Satish Balay wrote: > > Looks like you have direct access to all the cluster nodes. Perhaps > > you have admin access? You can do either of the following: > > > > * if the cluster frontend/compute nodes have common filesystem [i.e > > all machines can see the same file for ~/.Xauthority] and you can get > > 'sshd' settings on the frontend changed - then: > > > > - configure sshd with 'X11UseLocalhost no' - this way xterms on the > > compute-nodes can connect to the 'ssh-x11' port on the frontend - run > > the PETSc app with: '-display frontend:ssh-x11-port' > > > > * However if the above is not possible - but you can ssh directly to > > all the the compute nodes [perhaps from the frontend] then you can > > cascade x11 forwarding with: > > > > - ssh from desktop to frontend > > - ssh from frontend to node-9 [if you know which machine is node9 > > from the machine file.] > > - If you don't know which one is the node-9 - then ssh from frontend > > to all the nodes :). Mostlikely all nodes will get a display > > 'localhost:l0.0' > > - so now you can run the executable with the option > > -display localhost:10.0 > > > > The other alternative that might work [for interactive runs] is: > > > > -start_in_debugger noxterm -debugger_nodes 9 > > > > Satish > > > > On Sat, 27 May 2006, Randall Mackie wrote: > > > > > I can't seem to get the debugger to pop up on my screen. > > > > > > When I'm logged into the cluster I'm working on, I can > > > type xterm &, and an xterm pops up on my display. So I know > > > I can get something from the remote cluster. 
> > > > > > Now, when I try this using PETSc, I'm getting the following error > > > message, for example: > > > > > > ------------------------------------------------------------------------ > > > [17]PETSC ERROR: PETSC: Attaching gdb to > > > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > > > 24.5.142.138:0.0 on machine compute-0-23.local > > > ------------------------------------------------------------------------ > > > > > > I'm using this in my command file: > > > > > > source ~/.bashrc > > > time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ > > > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > > > -start_in_debugger \ > > > -debugger_node 1 \ > > > -display 24.5.142.138:0.0 \ > > > -em_ksp_type bcgs \ > > > -em_sub_pc_type ilu \ > > > -em_sub_pc_factor_levels 8 \ > > > -em_sub_pc_factor_fill 4 \ > > > -em_sub_pc_factor_reuse_ordering \ > > > -em_sub_pc_factor_reuse_fill \ > > > -em_sub_pc_factor_mat_ordering_type rcm \ > > > -divh_ksp_type cr \ > > > -divh_sub_pc_type icc \ > > > -ppc_sub_pc_type ilu \ > > > << EOF > > > > From randy at geosystem.us Sun May 28 09:46:23 2006 From: randy at geosystem.us (Randall Mackie) Date: Sun, 28 May 2006 07:46:23 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> <4479B239.6010408@geosystem.us> Message-ID: <4479B7BF.4010606@geosystem.us> Satish, Yes, PETSc was compiled in debug mode. Since I'm simply storing vectors in a temporary file, could I get around this by using VecView and writing each vector to the same Viewer in binary format, then reading them later? In other words: do loop=1,n call VecView (xvec(:,loop).....) end do then later do loop=1,n call VecLoad (xvec(:,loop)....) end do Randy ps. I'll try your other suggestions as well. However, this code has worked flawlessly until now, with a model much much larger than I've used in the past. Satish Balay wrote: > - Not sure what SIGUSR1 means in this context. > > - The stack doesn't show any PETSc/user code. Was > this code compiled with debug version of PETSc? > > - it could be that gdb is unable to look at intel compilers stack > [normally gdb should work]. If thats the case - you could run with > '-start_in_debugger idb'] > > - It appears that this breakage is from usercode which calls fortran > I/O [for_write_dir_xmit()]. There is no fortran I/O from PETSc side > of the code. I think it could still be a bug in the usercode. > > However PETSc does try to detect the availability of > _intel_fast_memcpy() and use it from C side. I don't think this is the > cause. But to verify you could remove the flag > PETSC_HAVE__INTEL_FAST_MEMCPY from petscconf.h and rebuild libraries. > > Satish > > > On Sun, 28 May 2006, Randall Mackie wrote: > >> Satish, >> >> Thanks, using method (2) worked. However, when I run a bt in gdb, >> I get the following output: >> >> Loaded symbols for /lib/libnss_files.so.2 >> 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 >> 2063 call VecAssemblyBegin(xyz,ierr) >> (gdb) cont >> Continuing. >> >> Program received signal SIGUSR1, User defined signal 1. >> [Switching to Thread 1082952160 (LWP 23496)] >> 0x088cd729 in _intel_fast_memcpy.J () >> Current language: auto; currently fortran >> (gdb) bt >> #0 0x088cd729 in _intel_fast_memcpy.J () >> #1 0x40620628 in for_write_dir_xmit () >> from /opt/intel_fc_80/lib/libifcore.so.5 >> #2 0xbfffa6b0 in ?? () >> #3 0x00000008 in ?? () >> #4 0xbfff986c in ?? 
() >> #5 0xbfff9890 in ?? () >> #6 0x406873a8 in __dtors_list_end () from /opt/intel_fc_80/lib/libifcore.so.5 >> #7 0x00000002 in ?? () >> #8 0x00000000 in ?? () >> (gdb) >> >> This all makes me think this is an INTEL compiler bug, and has nothing to >> do with my code. >> >> Any ideas? >> >> Randy >> >> >> Satish Balay wrote: >>> Looks like you have direct access to all the cluster nodes. Perhaps >>> you have admin access? You can do either of the following: >>> >>> * if the cluster frontend/compute nodes have common filesystem [i.e >>> all machines can see the same file for ~/.Xauthority] and you can get >>> 'sshd' settings on the frontend changed - then: >>> >>> - configure sshd with 'X11UseLocalhost no' - this way xterms on the >>> compute-nodes can connect to the 'ssh-x11' port on the frontend - run >>> the PETSc app with: '-display frontend:ssh-x11-port' >>> >>> * However if the above is not possible - but you can ssh directly to >>> all the the compute nodes [perhaps from the frontend] then you can >>> cascade x11 forwarding with: >>> >>> - ssh from desktop to frontend >>> - ssh from frontend to node-9 [if you know which machine is node9 >>> from the machine file.] >>> - If you don't know which one is the node-9 - then ssh from frontend >>> to all the nodes :). Mostlikely all nodes will get a display >>> 'localhost:l0.0' >>> - so now you can run the executable with the option >>> -display localhost:10.0 >>> >>> The other alternative that might work [for interactive runs] is: >>> >>> -start_in_debugger noxterm -debugger_nodes 9 >>> >>> Satish >>> >>> On Sat, 27 May 2006, Randall Mackie wrote: >>> >>>> I can't seem to get the debugger to pop up on my screen. >>>> >>>> When I'm logged into the cluster I'm working on, I can >>>> type xterm &, and an xterm pops up on my display. So I know >>>> I can get something from the remote cluster. >>>> >>>> Now, when I try this using PETSc, I'm getting the following error >>>> message, for example: >>>> >>>> ------------------------------------------------------------------------ >>>> [17]PETSC ERROR: PETSC: Attaching gdb to >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display >>>> 24.5.142.138:0.0 on machine compute-0-23.local >>>> ------------------------------------------------------------------------ >>>> >>>> I'm using this in my command file: >>>> >>>> source ~/.bashrc >>>> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ >>>> -start_in_debugger \ >>>> -debugger_node 1 \ >>>> -display 24.5.142.138:0.0 \ >>>> -em_ksp_type bcgs \ >>>> -em_sub_pc_type ilu \ >>>> -em_sub_pc_factor_levels 8 \ >>>> -em_sub_pc_factor_fill 4 \ >>>> -em_sub_pc_factor_reuse_ordering \ >>>> -em_sub_pc_factor_reuse_fill \ >>>> -em_sub_pc_factor_mat_ordering_type rcm \ >>>> -divh_ksp_type cr \ >>>> -divh_sub_pc_type icc \ >>>> -ppc_sub_pc_type ilu \ >>>> << EOF >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. 
GP 1034 From balay at mcs.anl.gov Sun May 28 09:49:44 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 28 May 2006 09:49:44 -0500 (CDT) Subject: bombing out writing large scratch files In-Reply-To: <4479B7BF.4010606@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> <4479B239.6010408@geosystem.us> <4479B7BF.4010606@geosystem.us> Message-ID: Yes - VecView()/VecLoad() should work. And it should be reasonably efficient. Satish On Sun, 28 May 2006, Randall Mackie wrote: > Satish, > > Yes, PETSc was compiled in debug mode. > > Since I'm simply storing vectors in a temporary file, could I get > around this by using VecView and writing each vector to the > same Viewer in binary format, then reading them later? > > In other words: > > do loop=1,n > > call VecView (xvec(:,loop).....) > > end do > > > > then later > > > do loop=1,n > call VecLoad (xvec(:,loop)....) > > end do > > > Randy > > ps. I'll try your other suggestions as well. However, this code has worked > flawlessly until now, with a model much much larger than I've used in the > past. > > > > Satish Balay wrote: > > - Not sure what SIGUSR1 means in this context. > > > > - The stack doesn't show any PETSc/user code. Was > > this code compiled with debug version of PETSc? > > > > - it could be that gdb is unable to look at intel compilers stack > > [normally gdb should work]. If thats the case - you could run with > > '-start_in_debugger idb'] > > > > - It appears that this breakage is from usercode which calls fortran > > I/O [for_write_dir_xmit()]. There is no fortran I/O from PETSc side > > of the code. I think it could still be a bug in the usercode. > > > > However PETSc does try to detect the availability of > > _intel_fast_memcpy() and use it from C side. I don't think this is the > > cause. But to verify you could remove the flag > > PETSC_HAVE__INTEL_FAST_MEMCPY from petscconf.h and rebuild libraries. > > > > Satish > > > > > > On Sun, 28 May 2006, Randall Mackie wrote: > > > > > Satish, > > > > > > Thanks, using method (2) worked. However, when I run a bt in gdb, > > > I get the following output: > > > > > > Loaded symbols for /lib/libnss_files.so.2 > > > 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 > > > 2063 call VecAssemblyBegin(xyz,ierr) > > > (gdb) cont > > > Continuing. > > > > > > Program received signal SIGUSR1, User defined signal 1. > > > [Switching to Thread 1082952160 (LWP 23496)] > > > 0x088cd729 in _intel_fast_memcpy.J () > > > Current language: auto; currently fortran > > > (gdb) bt > > > #0 0x088cd729 in _intel_fast_memcpy.J () > > > #1 0x40620628 in for_write_dir_xmit () > > > from /opt/intel_fc_80/lib/libifcore.so.5 > > > #2 0xbfffa6b0 in ?? () > > > #3 0x00000008 in ?? () > > > #4 0xbfff986c in ?? () > > > #5 0xbfff9890 in ?? () > > > #6 0x406873a8 in __dtors_list_end () from > > > /opt/intel_fc_80/lib/libifcore.so.5 > > > #7 0x00000002 in ?? () > > > #8 0x00000000 in ?? () > > > (gdb) > > > > > > This all makes me think this is an INTEL compiler bug, and has nothing to > > > do with my code. > > > > > > Any ideas? > > > > > > Randy > > > > > > > > > Satish Balay wrote: > > > > Looks like you have direct access to all the cluster nodes. Perhaps > > > > you have admin access? 
You can do either of the following: > > > > > > > > * if the cluster frontend/compute nodes have common filesystem [i.e > > > > all machines can see the same file for ~/.Xauthority] and you can get > > > > 'sshd' settings on the frontend changed - then: > > > > > > > > - configure sshd with 'X11UseLocalhost no' - this way xterms on the > > > > compute-nodes can connect to the 'ssh-x11' port on the frontend - > > > > run > > > > the PETSc app with: '-display frontend:ssh-x11-port' > > > > > > > > * However if the above is not possible - but you can ssh directly to > > > > all the the compute nodes [perhaps from the frontend] then you can > > > > cascade x11 forwarding with: > > > > > > > > - ssh from desktop to frontend > > > > - ssh from frontend to node-9 [if you know which machine is node9 > > > > from the machine file.] > > > > - If you don't know which one is the node-9 - then ssh from frontend > > > > to all the nodes :). Mostlikely all nodes will get a display > > > > 'localhost:l0.0' > > > > - so now you can run the executable with the option > > > > -display localhost:10.0 > > > > > > > > The other alternative that might work [for interactive runs] is: > > > > > > > > -start_in_debugger noxterm -debugger_nodes 9 > > > > > > > > Satish > > > > > > > > On Sat, 27 May 2006, Randall Mackie wrote: > > > > > > > > > I can't seem to get the debugger to pop up on my screen. > > > > > > > > > > When I'm logged into the cluster I'm working on, I can > > > > > type xterm &, and an xterm pops up on my display. So I know > > > > > I can get something from the remote cluster. > > > > > > > > > > Now, when I try this using PETSc, I'm getting the following error > > > > > message, for example: > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > [17]PETSC ERROR: PETSC: Attaching gdb to > > > > > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > > > > > 24.5.142.138:0.0 on machine compute-0-23.local > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > I'm using this in my command file: > > > > > > > > > > source ~/.bashrc > > > > > time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines > > > > > \ > > > > > /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > > > > > -start_in_debugger \ > > > > > -debugger_node 1 \ > > > > > -display 24.5.142.138:0.0 \ > > > > > -em_ksp_type bcgs \ > > > > > -em_sub_pc_type ilu \ > > > > > -em_sub_pc_factor_levels 8 \ > > > > > -em_sub_pc_factor_fill 4 \ > > > > > -em_sub_pc_factor_reuse_ordering \ > > > > > -em_sub_pc_factor_reuse_fill \ > > > > > -em_sub_pc_factor_mat_ordering_type rcm \ > > > > > -divh_ksp_type cr \ > > > > > -divh_sub_pc_type icc \ > > > > > -ppc_sub_pc_type ilu \ > > > > > << EOF > > > > > > > From knepley at gmail.com Sun May 28 09:56:27 2006 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 28 May 2006 09:56:27 -0500 Subject: bombing out writing large scratch files In-Reply-To: <4479B7BF.4010606@geosystem.us> References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> <4479B239.6010408@geosystem.us> <4479B7BF.4010606@geosystem.us> Message-ID: Arrghhh, some bad compilers use SIGUSR1 to communicate with themselves. I have had this. Just keep typing 'cont' until the SEGV. Matt On 5/28/06, Randall Mackie wrote: > > Satish, > > Yes, PETSc was compiled in debug mode. 
> > Since I'm simply storing vectors in a temporary file, could I get > around this by using VecView and writing each vector to the > same Viewer in binary format, then reading them later? > > In other words: > > do loop=1,n > > call VecView (xvec(:,loop).....) > > end do > > > > then later > > > do loop=1,n > > call VecLoad (xvec(:,loop)....) > > end do > > > Randy > > ps. I'll try your other suggestions as well. However, this code has worked > flawlessly until now, with a model much much larger than I've used in the > past. > > > > Satish Balay wrote: > > - Not sure what SIGUSR1 means in this context. > > > > - The stack doesn't show any PETSc/user code. Was > > this code compiled with debug version of PETSc? > > > > - it could be that gdb is unable to look at intel compilers stack > > [normally gdb should work]. If thats the case - you could run with > > '-start_in_debugger idb'] > > > > - It appears that this breakage is from usercode which calls fortran > > I/O [for_write_dir_xmit()]. There is no fortran I/O from PETSc side > > of the code. I think it could still be a bug in the usercode. > > > > However PETSc does try to detect the availability of > > _intel_fast_memcpy() and use it from C side. I don't think this is the > > cause. But to verify you could remove the flag > > PETSC_HAVE__INTEL_FAST_MEMCPY from petscconf.h and rebuild libraries. > > > > Satish > > > > > > On Sun, 28 May 2006, Randall Mackie wrote: > > > >> Satish, > >> > >> Thanks, using method (2) worked. However, when I run a bt in gdb, > >> I get the following output: > >> > >> Loaded symbols for /lib/libnss_files.so.2 > >> 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 > >> 2063 call VecAssemblyBegin(xyz,ierr) > >> (gdb) cont > >> Continuing. > >> > >> Program received signal SIGUSR1, User defined signal 1. > >> [Switching to Thread 1082952160 (LWP 23496)] > >> 0x088cd729 in _intel_fast_memcpy.J () > >> Current language: auto; currently fortran > >> (gdb) bt > >> #0 0x088cd729 in _intel_fast_memcpy.J () > >> #1 0x40620628 in for_write_dir_xmit () > >> from /opt/intel_fc_80/lib/libifcore.so.5 > >> #2 0xbfffa6b0 in ?? () > >> #3 0x00000008 in ?? () > >> #4 0xbfff986c in ?? () > >> #5 0xbfff9890 in ?? () > >> #6 0x406873a8 in __dtors_list_end () from > /opt/intel_fc_80/lib/libifcore.so.5 > >> #7 0x00000002 in ?? () > >> #8 0x00000000 in ?? () > >> (gdb) > >> > >> This all makes me think this is an INTEL compiler bug, and has nothing > to > >> do with my code. > >> > >> Any ideas? > >> > >> Randy > >> > >> > >> Satish Balay wrote: > >>> Looks like you have direct access to all the cluster nodes. Perhaps > >>> you have admin access? You can do either of the following: > >>> > >>> * if the cluster frontend/compute nodes have common filesystem [i.e > >>> all machines can see the same file for ~/.Xauthority] and you can get > >>> 'sshd' settings on the frontend changed - then: > >>> > >>> - configure sshd with 'X11UseLocalhost no' - this way xterms on the > >>> compute-nodes can connect to the 'ssh-x11' port on the frontend - > run > >>> the PETSc app with: '-display frontend:ssh-x11-port' > >>> > >>> * However if the above is not possible - but you can ssh directly to > >>> all the the compute nodes [perhaps from the frontend] then you can > >>> cascade x11 forwarding with: > >>> > >>> - ssh from desktop to frontend > >>> - ssh from frontend to node-9 [if you know which machine is node9 > >>> from the machine file.] > >>> - If you don't know which one is the node-9 - then ssh from frontend > >>> to all the nodes :). 
Mostlikely all nodes will get a display > >>> 'localhost:l0.0' > >>> - so now you can run the executable with the option > >>> -display localhost:10.0 > >>> > >>> The other alternative that might work [for interactive runs] is: > >>> > >>> -start_in_debugger noxterm -debugger_nodes 9 > >>> > >>> Satish > >>> > >>> On Sat, 27 May 2006, Randall Mackie wrote: > >>> > >>>> I can't seem to get the debugger to pop up on my screen. > >>>> > >>>> When I'm logged into the cluster I'm working on, I can > >>>> type xterm &, and an xterm pops up on my display. So I know > >>>> I can get something from the remote cluster. > >>>> > >>>> Now, when I try this using PETSc, I'm getting the following error > >>>> message, for example: > >>>> > >>>> > ------------------------------------------------------------------------ > >>>> [17]PETSC ERROR: PETSC: Attaching gdb to > >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display > >>>> 24.5.142.138:0.0 on machine compute-0-23.local > >>>> > ------------------------------------------------------------------------ > >>>> > >>>> I'm using this in my command file: > >>>> > >>>> source ~/.bashrc > >>>> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile > machines \ > >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ > >>>> -start_in_debugger \ > >>>> -debugger_node 1 \ > >>>> -display 24.5.142.138:0.0 \ > >>>> -em_ksp_type bcgs \ > >>>> -em_sub_pc_type ilu \ > >>>> -em_sub_pc_factor_levels 8 \ > >>>> -em_sub_pc_factor_fill 4 \ > >>>> -em_sub_pc_factor_reuse_ordering \ > >>>> -em_sub_pc_factor_reuse_fill \ > >>>> -em_sub_pc_factor_mat_ordering_type rcm \ > >>>> -divh_ksp_type cr \ > >>>> -divh_sub_pc_type icc \ > >>>> -ppc_sub_pc_type ilu \ > >>>> << EOF > >> > > > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From randy at geosystem.us Sun May 28 12:00:57 2006 From: randy at geosystem.us (Randall Mackie) Date: Sun, 28 May 2006 10:00:57 -0700 Subject: bombing out writing large scratch files In-Reply-To: References: <4478CBF2.20600@geosystem.us> <4478D69A.5050002@geosystem.us> <4478E7B4.6070209@geosystem.us> <4479B239.6010408@geosystem.us> Message-ID: <4479D749.5070005@geosystem.us> Thanks to everybody who helped me struggle with this problem. I've learned a lot about debugging MPI programs on a cluster half-way around the world. It turns out that the problem was not a bug in the sense that I had written x(i) when it should have been x(i-1), for example. Rather, in one of my subroutines, I was using automatic arrays, and I believe I was bumping up against the hard limit for stack memory (where automatic arrays are put). I've rewritten the code to make those allocatable arrays, and now it all runs okay, although I realize I have some more reprogramming to do to properly do it in parallel, but at least that problem is solved. Thanks again, Randy Satish Balay wrote: > - Not sure what SIGUSR1 means in this context. > > - The stack doesn't show any PETSc/user code. Was > this code compiled with debug version of PETSc? > > - it could be that gdb is unable to look at intel compilers stack > [normally gdb should work]. 
If thats the case - you could run with > '-start_in_debugger idb' > > - It appears that this breakage is from usercode which calls fortran > I/O [for_write_dir_xmit()]. There is no fortran I/O from PETSc side > of the code. I think it could still be a bug in the usercode. > > However PETSc does try to detect the availability of > _intel_fast_memcpy() and use it from C side. I don't think this is the > cause. But to verify you could remove the flag > PETSC_HAVE__INTEL_FAST_MEMCPY from petscconf.h and rebuild libraries. > > Satish > > > On Sun, 28 May 2006, Randall Mackie wrote: > >> Satish, >> >> Thanks, using method (2) worked. However, when I run a bt in gdb, >> I get the following output: >> >> Loaded symbols for /lib/libnss_files.so.2 >> 0x080b2631 in d3inv_3_3 () at d3inv_3_3.F:2063 >> 2063 call VecAssemblyBegin(xyz,ierr) >> (gdb) cont >> Continuing. >> >> Program received signal SIGUSR1, User defined signal 1. >> [Switching to Thread 1082952160 (LWP 23496)] >> 0x088cd729 in _intel_fast_memcpy.J () >> Current language: auto; currently fortran >> (gdb) bt >> #0 0x088cd729 in _intel_fast_memcpy.J () >> #1 0x40620628 in for_write_dir_xmit () >> from /opt/intel_fc_80/lib/libifcore.so.5 >> #2 0xbfffa6b0 in ?? () >> #3 0x00000008 in ?? () >> #4 0xbfff986c in ?? () >> #5 0xbfff9890 in ?? () >> #6 0x406873a8 in __dtors_list_end () from /opt/intel_fc_80/lib/libifcore.so.5 >> #7 0x00000002 in ?? () >> #8 0x00000000 in ?? () >> (gdb) >> >> This all makes me think this is an INTEL compiler bug, and has nothing to >> do with my code. >> >> Any ideas? >> >> Randy >> >> >> Satish Balay wrote: >>> Looks like you have direct access to all the cluster nodes. Perhaps >>> you have admin access? You can do either of the following: >>> >>> * if the cluster frontend/compute nodes have common filesystem [i.e >>> all machines can see the same file for ~/.Xauthority] and you can get >>> 'sshd' settings on the frontend changed - then: >>> >>> - configure sshd with 'X11UseLocalhost no' - this way xterms on the >>> compute-nodes can connect to the 'ssh-x11' port on the frontend - run >>> the PETSc app with: '-display frontend:ssh-x11-port' >>> >>> * However if the above is not possible - but you can ssh directly to >>> all the the compute nodes [perhaps from the frontend] then you can >>> cascade x11 forwarding with: >>> >>> - ssh from desktop to frontend >>> - ssh from frontend to node-9 [if you know which machine is node9 >>> from the machine file.] >>> - If you don't know which one is the node-9 - then ssh from frontend >>> to all the nodes :). Mostlikely all nodes will get a display >>> 'localhost:l0.0' >>> - so now you can run the executable with the option >>> -display localhost:10.0 >>> >>> The other alternative that might work [for interactive runs] is: >>> >>> -start_in_debugger noxterm -debugger_nodes 9 >>> >>> Satish >>> >>> On Sat, 27 May 2006, Randall Mackie wrote: >>> >>>> I can't seem to get the debugger to pop up on my screen. >>>> >>>> When I'm logged into the cluster I'm working on, I can >>>> type xterm &, and an xterm pops up on my display. So I know >>>> I can get something from the remote cluster. 
>>>> >>>> Now, when I try this using PETSc, I'm getting the following error >>>> message, for example: >>>> >>>> ------------------------------------------------------------------------ >>>> [17]PETSC ERROR: PETSC: Attaching gdb to >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display >>>> 24.5.142.138:0.0 on machine compute-0-23.local >>>> ------------------------------------------------------------------------ >>>> >>>> I'm using this in my command file: >>>> >>>> source ~/.bashrc >>>> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \ >>>> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \ >>>> -start_in_debugger \ >>>> -debugger_node 1 \ >>>> -display 24.5.142.138:0.0 \ >>>> -em_ksp_type bcgs \ >>>> -em_sub_pc_type ilu \ >>>> -em_sub_pc_factor_levels 8 \ >>>> -em_sub_pc_factor_fill 4 \ >>>> -em_sub_pc_factor_reuse_ordering \ >>>> -em_sub_pc_factor_reuse_fill \ >>>> -em_sub_pc_factor_mat_ordering_type rcm \ >>>> -divh_ksp_type cr \ >>>> -divh_sub_pc_type icc \ >>>> -ppc_sub_pc_type ilu \ >>>> << EOF >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From berend at chalmers.se Sun May 28 15:03:09 2006 From: berend at chalmers.se (Berend van Wachem) Date: Sun, 28 May 2006 22:03:09 +0200 Subject: Using valgrind on petsc projects Message-ID: <447A01FD.1020102@chalmers.se> Hi, I am using Petsc with Open-Mpi and have started to use valgrind to find a bug. One of the things that I see is that there is memory lost in PetscInitialize by calling MPI_Init and in DACreate (which I do quite often in my project). The amounts of memory lost aren't huge; should I be alarmed? To give an example, I have attached the output of valgrind on $PETSC_DIR/src/snes/examples/tutorials Thanks, Berend. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: valgrindout URL: From balay at mcs.anl.gov Sun May 28 17:52:48 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 28 May 2006 17:52:48 -0500 (CDT) Subject: Using valgrind on petsc projects In-Reply-To: <447A01FD.1020102@chalmers.se> References: <447A01FD.1020102@chalmers.se> Message-ID: >>>> asterix:/home/balay/spetsc/src/snes/examples/tutorials>mpiexec -np 2 valgrind -q --tool=memcheck ./ex5 Number of Newton iterations = 4 asterix:/home/balay/spetsc/src/snes/examples/tutorials> >>>> The above is with MPICH2 and valgrind doesn't find any problems. So I'll sugest installing another build of PETSc with --download-mpich=1 and retrying your code with it [you can install with a different PETSC_ARCH so that what you've curently installed is still useable] Mostlikely the issues you've encountered are OpenMPI issues. You could try reproducing this problem with a simple MPI code. [Ideally OpenMPI code should be fixed to be valgrind clean..] However as this one is a minor leak - you you could ignore this issue. There is a way in valgrind to create a supression file - for known issues that you'd like to ignore. Satish On Sun, 28 May 2006, Berend van Wachem wrote: > Hi, > > I am using Petsc with Open-Mpi and have started to use valgrind to find a bug. > One of the things that I see is that there is memory lost in PetscInitialize > by calling MPI_Init and in DACreate (which I do quite often in my project). > The amounts of memory lost aren't huge; should I be alarmed? 
>
> To give an example, I have attached the output of valgrind on
> $PETSC_DIR/src/snes/examples/tutorials
>
> Thanks,
>
> Berend.
>

From gudik at ae.metu.edu.tr Wed May 31 02:39:51 2006
From: gudik at ae.metu.edu.tr (Evrim Dizemen)
Date: Wed, 31 May 2006 10:39:51 +0300
Subject: Matrix input file format
Message-ID: <447D4847.5030007@ae.metu.edu.tr>

Dear all,

I'm using PETSc 2.3.1 for my master's thesis. I want to solve a linear
problem Ax=b in parallel, using MPICH, with complex variables. I can get
the A matrix from my Fortran code, but I cannot find the format in which
I must write it so that PETSc can read it. I'm really not familiar with
C programming, so I'll be happy for a solution in Fortran.

THANKS,

Evrim

From bsmith at mcs.anl.gov Wed May 31 08:01:53 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 31 May 2006 08:01:53 -0500 (CDT)
Subject: Matrix input file format
In-Reply-To: <447D4847.5030007@ae.metu.edu.tr>
References: <447D4847.5030007@ae.metu.edu.tr>
Message-ID: 

   Evrim,

   From the manual page for MatLoad()
   http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatLoad.html

   Most users should not need to know the details of the binary storage
format, since MatLoad() and MatView() completely hide these details.
But for anyone who's interested, the standard binary matrix storage
format is

    int           MAT_FILE_COOKIE
    int           number of rows
    int           number of columns
    int           total number of nonzeros
    int           *number of nonzeros in each row
    int           *column indices of all nonzeros (starting index is zero)
    PetscScalar   *values of all nonzeros

where MAT_FILE_COOKIE is 1211216.

   BTW: we never recommend saving a matrix to a file and then loading it
with a PETSc program to solve the system; this is not parallel programming.

   Barry

On Wed, 31 May 2006, Evrim Dizemen wrote:

> Dear all,
>
> I'm using PETSc 2.3.1 for my master's thesis. I want to solve a linear
> problem Ax=b in parallel, using MPICH, with complex variables. I can get
> the A matrix from my Fortran code, but I cannot find the format in which
> I must write it so that PETSc can read it. I'm really not familiar with
> C programming, so I'll be happy for a solution in Fortran.
>
> THANKS,
>
> Evrim
>

From randy at geosystem.us Wed May 31 09:19:13 2006
From: randy at geosystem.us (Randall Mackie)
Date: Wed, 31 May 2006 07:19:13 -0700
Subject: Matrix input file format
In-Reply-To: <447D4847.5030007@ae.metu.edu.tr>
References: <447D4847.5030007@ae.metu.edu.tr>
Message-ID: <447DA5E1.6090108@geosystem.us>

Evrim,

I was in your spot several years ago, wanting to convert a serial code
for complex systems into a parallel one, and PETSc seemed like a good
solution. It was a steep learning curve for me, but well worth it.

Barry told you how to read a file with PETSc, but you may want to
consider generating your matrix and assembling it in parallel instead.
This has many advantages, one being that no single node has to store the
entire matrix in memory, and another being that no single node has to do
all the calculations to generate the matrix.
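As an aside, if you do end up writing the matrix to a file the way Barry
described, you don't have to reproduce that binary layout by hand: you
can let a small PETSc program build the matrix once, dump it with
MatView() on a binary viewer, and read it back later with MatLoad().
Very roughly - this is an untested sketch, the file name is just a
placeholder, and you should check the exact Fortran calling sequence of
MatLoad() against the manual pages for your PETSc version, since it has
changed over time:

!     viewer is declared as:  PetscViewer viewer
!     write the assembled matrix A to a binary file
      call PetscViewerBinaryOpen(PETSC_COMM_WORLD,'amat.bin',
     .     FILE_MODE_WRITE,viewer,ierr)
      call MatView(A,viewer,ierr)
      call PetscViewerDestroy(viewer,ierr)

!     ... and later read it back; MatLoad creates the parallel matrix
      call PetscViewerBinaryOpen(PETSC_COMM_WORLD,'amat.bin',
     .     FILE_MODE_READ,viewer,ierr)
      call MatLoad(viewer,MATMPIAIJ,A,ierr)
      call PetscViewerDestroy(viewer,ierr)

Keep in mind that for complex variables PETSc itself must be configured
with complex scalars. But as Barry says, generating and assembling the
matrix in parallel scales much better, so that is what I'd concentrate on.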
There are different ways to assemble the matrix in parallel with PETSc,
but a simple way that doesn't take too much effort would be as follows:

Say your system is np x np in size. First create parallel vectors for
the right-hand side (b) and the solution xsol:

      call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,np,b,ierr)
      call VecDuplicate(b,xsol,ierr)
      call VecGetLocalSize(b,mloc,ierr)
      call VecGetOwnershipRange(b,Istart,Iend,ierr)

      do i=Istart+1,Iend
         loc(i)=i-1
      end do

Istart and Iend determine the rows in the global system that are owned
by a particular node. The variable loc(i) just converts to 0-based
indexing. I use it for setting the vector values:

      call VecSetValues(xsol,mloc,loc(Istart+1),xvec(Istart+1),
     .     INSERT_VALUES,ierr)
      call VecAssemblyBegin(xsol,ierr)
      call VecAssemblyEnd(xsol,ierr)

Then you create the parallel matrix (but you need to know something
about its structure in order to do an efficient matrix assembly, and for
that you'll need to read the manual):

      call MatCreateMPIAIJ(PETSC_COMM_WORLD,mloc,mloc,np,np,
     .     PETSC_NULL_INTEGER, d_nnz, PETSC_NULL_INTEGER,
     .     o_nnz,A,ierr)

Then, in your program that computes the elements of the coefficient
matrix, each node computes only those rows that it owns (from Istart and
Iend), something like:

      do jj=1,np
         IF (jj >= Istart+1 .and. jj <= Iend) THEN
            compute elements....
            call MatSetValues(A,i1,row,ic,col,v,INSERT_VALUES,
     .           ierr)
         END IF
      end do

After that, you assemble the matrix:

      call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
      call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)

You can do something similar to set up the right-hand side (b), and then
it's a rather simple matter to have PETSc solve the system. From this,
you should be able to fill in the details by reading the manual and
looking at the examples.

Good luck,

Randy Mackie

Evrim Dizemen wrote:
> Dear all,
>
> I'm using PETSc 2.3.1 for my master's thesis. I want to solve a linear
> problem Ax=b in parallel, using MPICH, with complex variables. I can get
> the A matrix from my Fortran code, but I cannot find the format in which
> I must write it so that PETSc can read it. I'm really not familiar with
> C programming, so I'll be happy for a solution in Fortran.
>
> THANKS,
>
> Evrim
>

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034

From mafunk at nmsu.edu Wed May 31 16:46:40 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Wed, 31 May 2006 15:46:40 -0600
Subject: building libraries
In-Reply-To: <447DA5E1.6090108@geosystem.us>
References: <447D4847.5030007@ae.metu.edu.tr> <447DA5E1.6090108@geosystem.us>
Message-ID: <200605311546.42002.mafunk@nmsu.edu>

Hi,

I need to build PETSc on a Scyld machine. Basically, I need to have MPI
support in PETSc, but I need to switch from using mpicc to gcc.

I was wondering if someone could point me to how I can do that. (Is
there an option that I overlooked?)

thanks
mat

From knepley at gmail.com Wed May 31 17:08:44 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 31 May 2006 17:08:44 -0500
Subject: building libraries
In-Reply-To: <200605311546.42002.mafunk@nmsu.edu>
References: <447D4847.5030007@ae.metu.edu.tr> <447DA5E1.6090108@geosystem.us> <200605311546.42002.mafunk@nmsu.edu>
Message-ID: 

You can have PETSc install MPI using --download-mpich. You can make sure
that it does not use mpicc using --with-mpi-compilers=0.

   Matt

On 5/31/06, Matt Funk wrote:
>
> Hi,
>
> I need to build PETSc on a Scyld machine.
> Basically, I need to have MPI support in PETSc, but I need to switch
> from using mpicc to gcc.
>
> I was wondering if someone could point me to how I can do that. (Is
> there an option that I overlooked?)
>
> thanks
> mat
>

-- 
"Failure has a thousand explanations. Success doesn't need one"
-- Sir Alec Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From balay at mcs.anl.gov Wed May 31 17:08:50 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 31 May 2006 17:08:50 -0500 (CDT)
Subject: building libraries
In-Reply-To: <200605311546.42002.mafunk@nmsu.edu>
References: <447D4847.5030007@ae.metu.edu.tr> <447DA5E1.6090108@geosystem.us> <200605311546.42002.mafunk@nmsu.edu>
Message-ID: 

On Wed, 31 May 2006, Matt Funk wrote:

> Hi,
>
> I need to build PETSc on a Scyld machine. Basically, I need to have MPI
> support in PETSc, but I need to switch from using mpicc to gcc.

Why use gcc over mpicc? [mpicc should internally be using gcc, so it
should satisfy the gcc requirement.]

> I was wondering if someone could point me to how I can do that. (Is
> there an option that I overlooked?)

Look at 'mpicc -show' to determine the include & library options
required to use this MPI - and then configure PETSc with:

--with-cc=gcc --with-fc=g77
--with-mpi-include=/foo/bar/mpi/include
--with-mpi-lib=[/foo/bar/mpi/lib/libmpi.a,anotherlib.a,anotherlib.a]

'mpicc -show' should give the path to libmpi.a and its dependent libs.

Satish
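For illustration only - the paths and library names below are made up,
so substitute whatever your own 'mpicc -show' actually reports - the
pieces fit together roughly like this with a PETSc 2.3.x-style configure
invocation:

  ./config/configure.py --with-cc=gcc --with-fc=g77 \
      --with-mpi-include=/usr/local/mpich/include \
      --with-mpi-lib=[/usr/local/mpich/lib/libmpich.a,/usr/local/mpich/lib/libpmpich.a]

If 'mpicc -show' lists additional libraries, append them to the
--with-mpi-lib list in the same comma-separated form.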