From Lars.Rindorf at teknologisk.dk Mon Jun 2 03:31:37 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Mon, 2 Jun 2008 10:31:37 +0200 Subject: SV: SV: Slow MatSetValues In-Reply-To: <20080530123109.GB3835@brakk.ethz.ch> Message-ID: Calling MatMPIAIJSetPreallocation again after MatSetFromOptions fixed the problem! Thanks. KR, Lars -----Oprindelig meddelelse----- Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Jed Brown Sendt: 30. maj 2008 14:31 Til: petsc-users at mcs.anl.gov Emne: Re: SV: Slow MatSetValues On Fri 2008-05-30 13:44, Lars Rindorf wrote: > Hi everybody > > Thanks for all the suggestions and help. The problem is of a bit different nature. I use only direct solvers, so I give the options "-ksp_type preonly -pc_type lu" to make a standard LU factorization. This works fine without any problems. If I additionally set "-mat_type umfpack" to use umfpack then MatSetValues is very, very slow (about 50 times slower). If, as a test, I call MatAssemblyBegin and MatAssemblyEnd before MatSetValues, and only use the lu (no umfpack) then the performance is very similarly slow. I've seen this problem when preallocation information is lost by changing the matrix type. Try putting MatSeqAIJSetPreallocation() and/or (it doesn't hurt to do both) MatMPIAIJSetPreallocation() after MatSetFromOptions(). This will preallocate for any matrix type that inherits from these two types (which I think is anything you might use). Jed From Lars.Rindorf at teknologisk.dk Mon Jun 2 07:17:09 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Mon, 2 Jun 2008 14:17:09 +0200 Subject: Error codes from external packages In-Reply-To: Message-ID: Dear all If I want to know the error codes of an external package that crashes, then what can I do? The problem arises with umfpack when the size of the matrix is more than 118000x118000, corresponding to 2.4 Gb memory consumption. It simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. Has anybody else experienced this? KR, Lars From knepley at gmail.com Mon Jun 2 08:07:24 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 2 Jun 2008 08:07:24 -0500 Subject: Error codes from external packages In-Reply-To: References: Message-ID: On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: > Dear all > > If I want to know the error codes of an external package that crashes, > then what can I do? We call umfpack_zl_report_status() with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. Matt > The problem arises with umfpack when the size of the matrix is more than > 118000x118000, corresponding to 2.4 Gb memory consumption. It simply > returns "umfpack_di_numeric" (factorization of a real matrix) failed. > Has anybody else experienced this? > > KR, Lars > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Lars.Rindorf at teknologisk.dk Mon Jun 2 09:22:22 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Mon, 2 Jun 2008 16:22:22 +0200 Subject: SV: Error codes from external packages In-Reply-To: Message-ID: Hi Matthew I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. 
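For reference, the status value returned by umfpack_di_numeric() is a plain integer defined in umfpack.h (UMFPACK_OK is 0, UMFPACK_ERROR_out_of_memory is -1, and so on), and UMFPACK can print a human-readable description of it. The standalone sketch below shows how the code can be reported when UMFPACK is called directly; the 2x2 matrix in it is purely illustrative and has nothing to do with the failing run, and the print level set in Control[UMFPACK_PRL] is the same knob that -mat_umfpack_prl adjusts inside PETSc.

#include <stdio.h>
#include "umfpack.h"

int main(void)
{
  /* Illustrative 2x2 matrix in compressed sparse column form. */
  int    Ap[] = {0, 2, 4};
  int    Ai[] = {0, 1, 0, 1};
  double Ax[] = {2.0, 1.0, 1.0, 3.0};
  double Control[UMFPACK_CONTROL], Info[UMFPACK_INFO];
  void  *Symbolic = NULL, *Numeric = NULL;
  int    status;

  umfpack_di_defaults(Control);
  Control[UMFPACK_PRL] = 4;   /* raise the print level (what -mat_umfpack_prl adjusts) */

  umfpack_di_symbolic(2, 2, Ap, Ai, Ax, &Symbolic, Control, Info);
  status = umfpack_di_numeric(Ap, Ai, Ax, Symbolic, &Numeric, Control, Info);

  printf("umfpack_di_numeric returned %d\n", status);  /* e.g. -1 means out of memory */
  umfpack_di_report_status(Control, status);           /* human-readable description  */
  umfpack_di_report_info(Control, Info);               /* statistics, peak memory use  */

  umfpack_di_free_symbolic(&Symbolic);
  if (status == UMFPACK_OK) umfpack_di_free_numeric(&Numeric);
  return 0;
}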
Here is what petsc returns (with/without -mat_umfpack_prl option): [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . E n d P r e - P r o c e s s i n g P r o c e s s i n g . . . Operation : Generate[A] Info : Setting System {A,b} to zero Resources : cpu 6.41402 s Operation : Solve[A] PETSc : N: 136188 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Error in external library! [0]PETSC ERROR: umfpack_di_numeric failed! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: patch_notch.pro on a linux-gnu named localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: Libraries linked from /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in src/mat/impls/aij/seq/umfpack/umfpack.c [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 347 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided function() line 1472 in unknowndirectory/LinAlg_PETSC.c [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 KR, Lars -----Oprindelig meddelelse----- Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley Sendt: 2. juni 2008 15:07 Til: petsc-users at mcs.anl.gov Emne: Re: Error codes from external packages On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: > Dear all > > If I want to know the error codes of an external package that crashes, > then what can I do? We call umfpack_zl_report_status() with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. Matt > The problem arises with umfpack when the size of the matrix is more > than 118000x118000, corresponding to 2.4 Gb memory consumption. It > simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. > Has anybody else experienced this? > > KR, Lars > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From knepley at gmail.com Mon Jun 2 09:32:48 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 2 Jun 2008 09:32:48 -0500 Subject: Error codes from external packages In-Reply-To: References: Message-ID: On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: > Hi Matthew > > I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. Matt > Here is what petsc returns (with/without -mat_umfpack_prl option): > [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl > P r e - P r o c e s s i n g . . . > E n d P r e - P r o c e s s i n g > P r o c e s s i n g . . . > Operation : Generate[A] > Info : Setting System {A,b} to zero > Resources : cpu 6.41402 s > Operation : Solve[A] > PETSc : N: 136188 > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Error in external library! > [0]PETSC ERROR: umfpack_di_numeric failed! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: patch_notch.pro on a linux-gnu named localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 > [0]PETSC ERROR: Libraries linked from /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in src/mat/impls/aij/seq/umfpack/umfpack.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_LU() line 280 in src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 347 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: User provided function() line 1472 in unknowndirectory/LinAlg_PETSC.c > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 > > KR, Lars > > > -----Oprindelig meddelelse----- > Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley > Sendt: 2. juni 2008 15:07 > Til: petsc-users at mcs.anl.gov > Emne: Re: Error codes from external packages > > On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >> Dear all >> >> If I want to know the error codes of an external package that crashes, >> then what can I do? 
> > We call > > umfpack_zl_report_status() > > with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. > > Matt > >> The problem arises with umfpack when the size of the matrix is more >> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >> Has anybody else experienced this? >> >> KR, Lars >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Lars.Rindorf at teknologisk.dk Mon Jun 2 09:51:04 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Mon, 2 Jun 2008 16:51:04 +0200 Subject: SV: Error codes from external packages In-Reply-To: Message-ID: Hi again I solved the problem! I changed the petsc source code, to show the error code. It is a memory allocation error in malloc inside umfpack. Thanks anyway. Lars -----Oprindelig meddelelse----- Fra: Lars Rindorf Sendt: 2. juni 2008 16:22 Til: 'petsc-users at mcs.anl.gov' Emne: SV: Error codes from external packages Hi Matthew I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. Here is what petsc returns (with/without -mat_umfpack_prl option): [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . E n d P r e - P r o c e s s i n g P r o c e s s i n g . . . Operation : Generate[A] Info : Setting System {A,b} to zero Resources : cpu 6.41402 s Operation : Solve[A] PETSc : N: 136188 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Error in external library! [0]PETSC ERROR: umfpack_di_numeric failed! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: patch_notch.pro on a linux-gnu named localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: Libraries linked from /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in src/mat/impls/aij/seq/umfpack/umfpack.c [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 347 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided function() line 1472 in unknowndirectory/LinAlg_PETSC.c [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 KR, Lars -----Oprindelig meddelelse----- Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley Sendt: 2. juni 2008 15:07 Til: petsc-users at mcs.anl.gov Emne: Re: Error codes from external packages On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: > Dear all > > If I want to know the error codes of an external package that crashes, > then what can I do? We call umfpack_zl_report_status() with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. Matt > The problem arises with umfpack when the size of the matrix is more > than 118000x118000, corresponding to 2.4 Gb memory consumption. It > simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. > Has anybody else experienced this? > > KR, Lars > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Lars.Rindorf at teknologisk.dk Tue Jun 3 06:53:14 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Tue, 3 Jun 2008 13:53:14 +0200 Subject: SV: Error codes from external packages In-Reply-To: Message-ID: Hi Matthew I've a couple of questions regarding umfpack and petsc. I have a version of the program that I use. This program uses complex scalars and umfpack in petsc, but when I try myself to compile petsc with umfpack and complex numbers, petsc gives an error saying that umfpack and complex scalars is not yet implemented. Is that correct? Is there a version of petsc that allows complex scalars and umfpack? Secondly, I'm still having problems with umfpack running out of memory (umfpack error -1). I have played around with petsc memory allocation in MatSetAIJSetPreallocation and with umfpack Control[UMFPACK_ALLOC_INIT] and it makes no difference. I have try it also with both Intel mkl blas and petsc default blas, and that made no difference either. Do you have any ideas where to look for the error? 
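For a rough sense of scale: error -1 is UMFPACK's out-of-memory code, and with the matrix size and nonzero count shown in the -ksp_view output below, even moderate LU fill-in approaches what a 32-bit process can allocate. The sketch below is back-of-the-envelope arithmetic only; the 20x fill ratio is an assumed, illustrative value, not a measurement from this run.

#include <stdio.h>

int main(void)
{
  /* Figures taken from the -ksp_view output below. */
  double nnzA  = 4377120.0;   /* nonzeros in A, rows = cols = 118636            */
  /* Assumed, illustrative numbers (not measured for this matrix):              */
  double fill  = 20.0;        /* LU fill-in ratio; depends on ordering/pattern  */
  double entry = 8.0 + 4.0;   /* 8-byte double value plus 4-byte integer index  */

  printf("estimated LU factor storage: %.2f GB\n", nnzA * fill * entry / 1e9);
  /* Roughly 1 GB for the factor alone; adding UMFPACK's working arrays, the
     original matrix and the rest of the application, a 32-bit process (about
     2-3 GB of usable address space) can fail during the numeric factorization,
     which is consistent with the malloc failure reported above. */
  return 0;
}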
The largest succesful simulation gives the following petsc output: KSP Object: type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 0 Factored matrix follows Matrix Object: type=umfpack, rows=118636, cols=118636 total: nonzeros=0, allocated nonzeros=118636 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1 Control[UMFPACK_STRATEGY]: 0 Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10 Control[UMFPACK_BLOCK_SIZE]: 32 Control[UMFPACK_2BY2_TOLERANCE]: 0.01 Control[UMFPACK_FIXQ]: 0 Control[UMFPACK_AGGRESSIVE]: 1 Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1 Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0 Control[UMFPACK_IRSTEP]: 0 UMFPACK default matrix ordering is used (not the PETSc matrix ordering) linear system matrix = precond matrix: Matrix Object: type=umfpack, rows=118636, cols=118636 total: nonzeros=4377120, allocated nonzeros=29659000 using I-node routines: found 105980 nodes, limit used is 5 KR, Lars -----Oprindelig meddelelse----- Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley Sendt: 2. juni 2008 16:33 Til: petsc-users at mcs.anl.gov Emne: Re: Error codes from external packages On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: > Hi Matthew > > I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. Matt > Here is what petsc returns (with/without -mat_umfpack_prl option): > [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh > patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . > E n d P r e - P r o c e s s i n g > P r o c e s s i n g . . . > Operation : Generate[A] > Info : Setting System {A,b} to zero > Resources : cpu 6.41402 s > Operation : Solve[A] > PETSc : N: 136188 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Error in external library! > [0]PETSC ERROR: umfpack_di_numeric failed! > [0]PETSC ERROR: > ---------------------------------------------------------------------- > -- [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 > 17:29:26 CDT 2008 HG revision: > 4466c6289a0922df26e20626fd4a0b4dd03c8124 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ---------------------------------------------------------------------- > -- [0]PETSC ERROR: patch_notch.pro on a linux-gnu named > localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: > Libraries linked from > /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC > ERROR: Configure options --with-cc=gcc --with-fc=g77 > --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ > --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 > [0]PETSC ERROR: > ---------------------------------------------------------------------- > -- [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in > src/mat/impls/aij/seq/umfpack/umfpack.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in > src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in > src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 in > src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in > src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 347 in > src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided > function() line 1472 in unknowndirectory/LinAlg_PETSC.c > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 > > KR, Lars > > > -----Oprindelig meddelelse----- > Fra: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley > Sendt: 2. juni 2008 15:07 > Til: petsc-users at mcs.anl.gov > Emne: Re: Error codes from external packages > > On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >> Dear all >> >> If I want to know the error codes of an external package that >> crashes, then what can I do? > > We call > > umfpack_zl_report_status() > > with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. > > Matt > >> The problem arises with umfpack when the size of the matrix is more >> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >> Has anybody else experienced this? >> >> KR, Lars >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Stephen.R.Ball at awe.co.uk Tue Jun 3 08:11:31 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Tue, 3 Jun 2008 14:11:31 +0100 Subject: MatGetArray() not supporting Mat type mpiaij. Message-ID: <863ELA018348@awe.co.uk> Hi I have been trying to extract an array containing the local matrix values using MatGetArray() via the Fortran interface but get the error message that Mat type mpiaij is not supported with this routine. All I want to do is to extract the local matrix values so that I can output them to file in the format I want rather than via use of the MatView() routine. Can you suggest a way of how I can go about extracting the local matrix values? Thanks Stephen From knepley at gmail.com Tue Jun 3 09:28:37 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 3 Jun 2008 09:28:37 -0500 Subject: MatGetArray() not supporting Mat type mpiaij. 
In-Reply-To: <863ELA018348@awe.co.uk> References: <863ELA018348@awe.co.uk> Message-ID: On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball wrote: > Hi > > I have been trying to extract an array containing the local matrix > values using MatGetArray() via the Fortran interface but get the error > message that Mat type mpiaij is not supported with this routine. All I > want to do is to extract the local matrix values so that I can output > them to file in the format I want rather than via use of the MatView() > routine. Can you suggest a way of how I can go about extracting the > local matrix values? There is no "local matrix". The Mat interface is supposed to be data structure neutral so we can optimize for different architectures. If you want the values directly, I would use MatGetRow() for each row. Matt > Thanks > > Stephen > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Tue Jun 3 09:32:14 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 3 Jun 2008 09:32:14 -0500 Subject: Error codes from external packages In-Reply-To: References: Message-ID: On Tue, Jun 3, 2008 at 6:53 AM, Lars Rindorf wrote: > Hi Matthew > > I've a couple of questions regarding umfpack and petsc. I have a version of the program that I use. This program uses complex scalars and umfpack in petsc, but when I try myself to compile petsc with umfpack and complex numbers, petsc gives an error saying that umfpack and complex scalars is not yet implemented. Is that correct? Is there a version of petsc that allows complex scalars and umfpack? This question came up before on this list. We do not support complex with UMFPACK. I cannot remember the reason, but there was a problem with the complex extension. > Secondly, I'm still having problems with umfpack running out of memory (umfpack error -1). I have played around with petsc memory allocation in MatSetAIJSetPreallocation and with umfpack Control[UMFPACK_ALLOC_INIT] and it makes no difference. I have try it also with both Intel mkl blas and petsc default blas, and that made no difference either. Do you have any ideas where to look for the error? Are you trying to go beyond 32-bits? If so, you would need an OS that will allocate more than 2G to a process, like 64-bit Linux.
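A minimal sketch of the MatGetRow() loop suggested above for extracting values: the helper name DumpLocalRows is made up for illustration, the prototypes are taken from the PETSc manual pages, and the same routines are callable from the Fortran interface. It assumes the usual PETSc layout in which each process owns a contiguous block of rows, so MatGetOwnershipRange() returns exactly the rows this process may pass to MatGetRow().

#include "petscmat.h"

/* Each process prints the entries of the rows it owns as "row column value"
   triples. Assumes the standard contiguous row distribution, so the range
   from MatGetOwnershipRange() is exactly what may be queried locally. */
PetscErrorCode DumpLocalRows(Mat A)
{
  PetscErrorCode     ierr;
  PetscInt           rstart, rend, row, ncols, j;
  const PetscInt    *cols;
  const PetscScalar *vals;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (row = rstart; row < rend; row++) {
    ierr = MatGetRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
    for (j = 0; j < ncols; j++) {
      ierr = PetscPrintf(PETSC_COMM_SELF, "%d %d %g\n", (int)row, (int)cols[j],
                         (double)PetscRealPart(vals[j]));CHKERRQ(ierr);
    }
    ierr = MatRestoreRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
  }
  return 0;
}

As Barry notes later in the thread, the contiguous-row assumption behind MatGetOwnershipRange() holds for all built-in PETSc matrix types.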
Matt > The largest succesful simulation gives the following petsc output: > KSP Object: > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 0 > Factored matrix follows > Matrix Object: > type=umfpack, rows=118636, cols=118636 > total: nonzeros=0, allocated nonzeros=118636 > not using I-node routines > UMFPACK run parameters: > Control[UMFPACK_PRL]: 1 > Control[UMFPACK_STRATEGY]: 0 > Control[UMFPACK_DENSE_COL]: 0.2 > Control[UMFPACK_DENSE_ROW]: 0.2 > Control[UMFPACK_AMD_DENSE]: 10 > Control[UMFPACK_BLOCK_SIZE]: 32 > Control[UMFPACK_2BY2_TOLERANCE]: 0.01 > Control[UMFPACK_FIXQ]: 0 > Control[UMFPACK_AGGRESSIVE]: 1 > Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 > Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 > Control[UMFPACK_SCALE]: 1 > Control[UMFPACK_ALLOC_INIT]: 0.7 > Control[UMFPACK_DROPTOL]: 0 > Control[UMFPACK_IRSTEP]: 0 > UMFPACK default matrix ordering is used (not the PETSc matrix ordering) > linear system matrix = precond matrix: > Matrix Object: > type=umfpack, rows=118636, cols=118636 > total: nonzeros=4377120, allocated nonzeros=29659000 > using I-node routines: found 105980 nodes, limit used is 5 > > KR, Lars > > -----Oprindelig meddelelse----- > Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley > Sendt: 2. juni 2008 16:33 > Til: petsc-users at mcs.anl.gov > Emne: Re: Error codes from external packages > > On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: >> Hi Matthew >> >> I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. > > Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. > > Matt > >> Here is what petsc returns (with/without -mat_umfpack_prl option): >> [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh >> patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . >> E n d P r e - P r o c e s s i n g >> P r o c e s s i n g . . . >> Operation : Generate[A] >> Info : Setting System {A,b} to zero >> Resources : cpu 6.41402 s >> Operation : Solve[A] >> PETSc : N: 136188 >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Error in external library! >> [0]PETSC ERROR: umfpack_di_numeric failed! >> [0]PETSC ERROR: >> ---------------------------------------------------------------------- >> -- [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 >> 17:29:26 CDT 2008 HG revision: >> 4466c6289a0922df26e20626fd4a0b4dd03c8124 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ---------------------------------------------------------------------- >> -- [0]PETSC ERROR: patch_notch.pro on a linux-gnu named >> localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: >> Libraries linked from >> /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug >> [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC >> ERROR: Configure options --with-cc=gcc --with-fc=g77 >> --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ >> --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 >> [0]PETSC ERROR: >> ---------------------------------------------------------------------- >> -- [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in >> src/mat/impls/aij/seq/umfpack/umfpack.c >> [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in >> src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in >> src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 in >> src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in >> src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 347 in >> src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided >> function() line 1472 in unknowndirectory/LinAlg_PETSC.c >> [unset]: aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 >> >> KR, Lars >> >> >> -----Oprindelig meddelelse----- >> Fra: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >> Sendt: 2. juni 2008 15:07 >> Til: petsc-users at mcs.anl.gov >> Emne: Re: Error codes from external packages >> >> On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >>> Dear all >>> >>> If I want to know the error codes of an external package that >>> crashes, then what can I do? >> >> We call >> >> umfpack_zl_report_status() >> >> with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. >> >> Matt >> >>> The problem arises with umfpack when the size of the matrix is more >>> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >>> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >>> Has anybody else experienced this? >>> >>> KR, Lars >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From David.Colignon at ulg.ac.be Tue Jun 3 10:50:39 2008 From: David.Colignon at ulg.ac.be (David Colignon) Date: Tue, 03 Jun 2008 17:50:39 +0200 Subject: Error codes from external packages In-Reply-To: References: Message-ID: <4845684F.1090404@ulg.ac.be> Hi, my colleague Ch. Geuzaine added complex arithmetic support and 64 bit addressing for umfpack last year. See http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-dev/2007/06/msg00000.html It has been included in petsc-dev. Cheers, Dave -- David Colignon, Ph.D. Collaborateur Logistique F.R.S.-FNRS (Equipements de Calcul Intensif) ACE - Applied & Computational Electromagnetics Institut Montefiore B28 Universit? 
de Li?ge 4000 Li?ge - BELGIQUE T?l: +32 (0)4 366 37 32 Fax: +32 (0)4 366 29 10 WWW: http://www.montefiore.ulg.ac.be/personnel.php?op=detail&id=898 Agenda: http://www.google.com/calendar/embed?src=david.colignon%40gmail.com Matthew Knepley wrote: > On Tue, Jun 3, 2008 at 6:53 AM, Lars Rindorf > wrote: >> Hi Matthew >> >> I've a couple of questions regarding umfpack and petsc. I have a version of the program that I use. This program uses complex scalars and umfpack in petsc, but when I try myself to compile petsc with umfpack and complex numbers, petsc gives an error saying that umfpack and complex scalars is not yet implemented. Is that correct? Is there a version of petsc that allows complex scalars and umfpack? > > This question came up before on this list. We do not support complex > with UMFPACK. I cannot remember the reason, but there was a problem > with the complex extension. > >> Secondly, I'm still having problems with umfpack running out of memory (umfpack error -1). I have played around with petsc memory allocation in MatSetAIJSetPreallocation and with umfpack Control[UMFPACK_ALLOC_INIT] and it makes no difference. I have try it also with both Intel mkl blas and petsc default blas, and that made no difference either. Do you have any ideas where to look for the error? > > Are you trying to go beyond 32-bits? If so, you would need an OS that > will allocate more than 2G to a process, like 64-bit Linux. > > Matt > >> The largest succesful simulation gives the following petsc output: >> KSP Object: >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 0 >> Factored matrix follows >> Matrix Object: >> type=umfpack, rows=118636, cols=118636 >> total: nonzeros=0, allocated nonzeros=118636 >> not using I-node routines >> UMFPACK run parameters: >> Control[UMFPACK_PRL]: 1 >> Control[UMFPACK_STRATEGY]: 0 >> Control[UMFPACK_DENSE_COL]: 0.2 >> Control[UMFPACK_DENSE_ROW]: 0.2 >> Control[UMFPACK_AMD_DENSE]: 10 >> Control[UMFPACK_BLOCK_SIZE]: 32 >> Control[UMFPACK_2BY2_TOLERANCE]: 0.01 >> Control[UMFPACK_FIXQ]: 0 >> Control[UMFPACK_AGGRESSIVE]: 1 >> Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 >> Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 >> Control[UMFPACK_SCALE]: 1 >> Control[UMFPACK_ALLOC_INIT]: 0.7 >> Control[UMFPACK_DROPTOL]: 0 >> Control[UMFPACK_IRSTEP]: 0 >> UMFPACK default matrix ordering is used (not the PETSc matrix ordering) >> linear system matrix = precond matrix: >> Matrix Object: >> type=umfpack, rows=118636, cols=118636 >> total: nonzeros=4377120, allocated nonzeros=29659000 >> using I-node routines: found 105980 nodes, limit used is 5 >> >> KR, Lars >> >> -----Oprindelig meddelelse----- >> Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >> Sendt: 2. juni 2008 16:33 >> Til: petsc-users at mcs.anl.gov >> Emne: Re: Error codes from external packages >> >> On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: >>> Hi Matthew >>> >>> I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. >> Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. 
We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. >> >> Matt >> >>> Here is what petsc returns (with/without -mat_umfpack_prl option): >>> [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh >>> patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . >>> E n d P r e - P r o c e s s i n g >>> P r o c e s s i n g . . . >>> Operation : Generate[A] >>> Info : Setting System {A,b} to zero >>> Resources : cpu 6.41402 s >>> Operation : Solve[A] >>> PETSc : N: 136188 >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Error in external library! >>> [0]PETSC ERROR: umfpack_di_numeric failed! >>> [0]PETSC ERROR: >>> ---------------------------------------------------------------------- >>> -- [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 >>> 17:29:26 CDT 2008 HG revision: >>> 4466c6289a0922df26e20626fd4a0b4dd03c8124 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ---------------------------------------------------------------------- >>> -- [0]PETSC ERROR: patch_notch.pro on a linux-gnu named >>> localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: >>> Libraries linked from >>> /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug >>> [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC >>> ERROR: Configure options --with-cc=gcc --with-fc=g77 >>> --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ >>> --download-mpich=ifneeded --download-umfpack=ifneeded --with-shared=0 >>> [0]PETSC ERROR: >>> ---------------------------------------------------------------------- >>> -- [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in >>> src/mat/impls/aij/seq/umfpack/umfpack.c >>> [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in >>> src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in >>> src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 in >>> src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in >>> src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 347 in >>> src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided >>> function() line 1472 in unknowndirectory/LinAlg_PETSC.c >>> [unset]: aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 >>> >>> KR, Lars >>> >>> >>> -----Oprindelig meddelelse----- >>> Fra: owner-petsc-users at mcs.anl.gov >>> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >>> Sendt: 2. juni 2008 15:07 >>> Til: petsc-users at mcs.anl.gov >>> Emne: Re: Error codes from external packages >>> >>> On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >>>> Dear all >>>> >>>> If I want to know the error codes of an external package that >>>> crashes, then what can I do? >>> We call >>> >>> umfpack_zl_report_status() >>> >>> with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. >>> >>> Matt >>> >>>> The problem arises with umfpack when the size of the matrix is more >>>> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >>>> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >>>> Has anybody else experienced this? 
>>>> >>>> KR, Lars >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> > > > From Stephen.R.Ball at awe.co.uk Wed Jun 4 06:33:44 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Wed, 4 Jun 2008 12:33:44 +0100 Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. Message-ID: <864CZa100408@awe.co.uk> Ok, I am looking into using MatGetRow(). However this requires the global row number for input. I was looking to use MatGetOwnershipRange() to obtain the range of global row numbers owned by each processor but the documentation states that this routine assumes that the matrix is laid out with the first n1 rows on the first processor, the next n2 rows on the second, etc and that for certain parallel layouts this range may not be well defined. This is the case for me. Do you have a routine where I can specify a global row number and it will tell me the rank of the processor that owns it? This is to ensure that MatGetRow() only gets called by the owner processor for each global row number. Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: 03 June 2008 15:29 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball wrote: > Hi > > I have been trying to extract an array containing the local matrix > values using MatGetArray() via the Fortran interface but get the error > message that Mat type mpiaij is not supported with this routine. All I > want to do is to extract the local matrix values so that I can output > them to file in the format I want rather than via use of the MatView() > routine. Can you suggest a way of how I can go about extracting the > local matrix values? This is no "local matrix". The Mat interface is supposed to be data structure neutral so we can optimize for different architectures. If you want the values directly, I would use MatGetRow() for each row. Matt > Thanks > > Stephen > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Lars.Rindorf at teknologisk.dk Wed Jun 4 07:39:17 2008 From: Lars.Rindorf at teknologisk.dk (Lars Rindorf) Date: Wed, 4 Jun 2008 14:39:17 +0200 Subject: SV: Error codes from external packages In-Reply-To: <4845684F.1090404@ulg.ac.be> Message-ID: Hi David Thanks for the answer and the reference I'll look into it. I have since my email tried mumps. It is considerably faster than umfpack when running real scalars and default solver settings (43/53 secs), which I had not expected. I'll make some more detailed comparisons of the performance of mumps and umfpack on the xeon quad core after the summer vacation. KR, Lars -----Oprindelig meddelelse----- Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af David Colignon Sendt: 3. juni 2008 17:51 Til: petsc-users at mcs.anl.gov Emne: Re: Error codes from external packages Hi, my colleague Ch. 
Geuzaine added complex arithmetic support and 64 bit addressing for umfpack last year. See http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-dev/2007/06/msg00000.html It has been included in petsc-dev. Cheers, Dave -- David Colignon, Ph.D. Collaborateur Logistique F.R.S.-FNRS (Equipements de Calcul Intensif) ACE - Applied & Computational Electromagnetics Institut Montefiore B28 Universit? de Li?ge 4000 Li?ge - BELGIQUE T?l: +32 (0)4 366 37 32 Fax: +32 (0)4 366 29 10 WWW: http://www.montefiore.ulg.ac.be/personnel.php?op=detail&id=898 Agenda: http://www.google.com/calendar/embed?src=david.colignon%40gmail.com Matthew Knepley wrote: > On Tue, Jun 3, 2008 at 6:53 AM, Lars Rindorf > wrote: >> Hi Matthew >> >> I've a couple of questions regarding umfpack and petsc. I have a version of the program that I use. This program uses complex scalars and umfpack in petsc, but when I try myself to compile petsc with umfpack and complex numbers, petsc gives an error saying that umfpack and complex scalars is not yet implemented. Is that correct? Is there a version of petsc that allows complex scalars and umfpack? > > This question came up before on this list. We do not support complex > with UMFPACK. I cannot remember the reason, but there was a problem > with the complex extension. > >> Secondly, I'm still having problems with umfpack running out of memory (umfpack error -1). I have played around with petsc memory allocation in MatSetAIJSetPreallocation and with umfpack Control[UMFPACK_ALLOC_INIT] and it makes no difference. I have try it also with both Intel mkl blas and petsc default blas, and that made no difference either. Do you have any ideas where to look for the error? > > Are you trying to go beyond 32-bits? If so, you would need an OS that > will allocate more than 2G to a process, like 64-bit Linux. > > Matt > >> The largest succesful simulation gives the following petsc output: >> KSP Object: >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left >> preconditioning PC Object: >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 0 >> Factored matrix follows >> Matrix Object: >> type=umfpack, rows=118636, cols=118636 >> total: nonzeros=0, allocated nonzeros=118636 >> not using I-node routines >> UMFPACK run parameters: >> Control[UMFPACK_PRL]: 1 >> Control[UMFPACK_STRATEGY]: 0 >> Control[UMFPACK_DENSE_COL]: 0.2 >> Control[UMFPACK_DENSE_ROW]: 0.2 >> Control[UMFPACK_AMD_DENSE]: 10 >> Control[UMFPACK_BLOCK_SIZE]: 32 >> Control[UMFPACK_2BY2_TOLERANCE]: 0.01 >> Control[UMFPACK_FIXQ]: 0 >> Control[UMFPACK_AGGRESSIVE]: 1 >> Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 >> Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 >> Control[UMFPACK_SCALE]: 1 >> Control[UMFPACK_ALLOC_INIT]: 0.7 >> Control[UMFPACK_DROPTOL]: 0 >> Control[UMFPACK_IRSTEP]: 0 >> UMFPACK default matrix ordering is used (not the PETSc >> matrix ordering) linear system matrix = precond matrix: >> Matrix Object: >> type=umfpack, rows=118636, cols=118636 >> total: nonzeros=4377120, allocated nonzeros=29659000 >> using I-node routines: found 105980 nodes, limit used is 5 >> >> KR, Lars >> >> -----Oprindelig meddelelse----- >> Fra: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >> Sendt: 2. 
juni 2008 16:33 >> Til: petsc-users at mcs.anl.gov >> Emne: Re: Error codes from external packages >> >> On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: >>> Hi Matthew >>> >>> I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. >> Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. >> >> Matt >> >>> Here is what petsc returns (with/without -mat_umfpack_prl option): >>> [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh >>> patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . >>> E n d P r e - P r o c e s s i n g >>> P r o c e s s i n g . . . >>> Operation : Generate[A] >>> Info : Setting System {A,b} to zero >>> Resources : cpu 6.41402 s >>> Operation : Solve[A] >>> PETSc : N: 136188 >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Error in external library! >>> [0]PETSC ERROR: umfpack_di_numeric failed! >>> [0]PETSC ERROR: >>> -------------------------------------------------------------------- >>> -- >>> -- [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 >>> 17:29:26 CDT 2008 HG revision: >>> 4466c6289a0922df26e20626fd4a0b4dd03c8124 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> -------------------------------------------------------------------- >>> -- >>> -- [0]PETSC ERROR: patch_notch.pro on a linux-gnu named >>> localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: >>> Libraries linked from >>> /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug >>> [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC >>> ERROR: Configure options --with-cc=gcc --with-fc=g77 >>> --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ >>> --download-mpich=ifneeded --download-umfpack=ifneeded >>> --with-shared=0 [0]PETSC ERROR: >>> -------------------------------------------------------------------- >>> -- >>> -- [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in >>> src/mat/impls/aij/seq/umfpack/umfpack.c >>> [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in >>> src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in >>> src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 >>> in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 >>> in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line >>> 347 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided >>> function() line 1472 in unknowndirectory/LinAlg_PETSC.c >>> [unset]: aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 >>> >>> KR, Lars >>> >>> >>> -----Oprindelig meddelelse----- >>> Fra: owner-petsc-users at mcs.anl.gov >>> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >>> Sendt: 2. 
juni 2008 15:07 >>> Til: petsc-users at mcs.anl.gov >>> Emne: Re: Error codes from external packages >>> >>> On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >>>> Dear all >>>> >>>> If I want to know the error codes of an external package that >>>> crashes, then what can I do? >>> We call >>> >>> umfpack_zl_report_status() >>> >>> with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. >>> >>> Matt >>> >>>> The problem arises with umfpack when the size of the matrix is more >>>> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >>>> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >>>> Has anybody else experienced this? >>>> >>>> KR, Lars >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> > > > From dave.mayhem23 at gmail.com Wed Jun 4 07:42:44 2008 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 4 Jun 2008 22:42:44 +1000 Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. In-Reply-To: <864CZa100408@awe.co.uk> References: <864CZa100408@awe.co.uk> Message-ID: <956373f0806040542q1154dcdax13e8af2ac1403cad@mail.gmail.com> You could use MatGetOwnershipRanges() http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatGetOwnershipRanges.html and then traverse the list "ranges[]" to determine which rank owns the global row number of interest. Cheers, Dave On Wed, Jun 4, 2008 at 9:33 PM, Stephen R Ball wrote: > Ok, I am looking into using MatGetRow(). However this requires the > global row number for input. I was looking to use MatGetOwnershipRange() > to obtain the range of global row numbers owned by each processor but > the documentation states that this routine assumes that the matrix is > laid out with the first n1 rows on the first processor, the next n2 rows > on the second, etc and that for certain parallel layouts this range may > not be well defined. > > This is the case for me. Do you have a routine where I can specify a > global row number and it will tell me the rank of the processor that > owns it? This is to ensure that MatGetRow() only gets called by the > owner processor for each global row number. > > Regards > > Stephen > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: 03 June 2008 15:29 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. > > On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball > wrote: > > Hi > > > > I have been trying to extract an array containing the local matrix > > values using MatGetArray() via the Fortran interface but get the error > > message that Mat type mpiaij is not supported with this routine. All I > > want to do is to extract the local matrix values so that I can output > > them to file in the format I want rather than via use of the MatView() > > routine. Can you suggest a way of how I can go about extracting the > > local matrix values? > > This is no "local matrix". 
The Mat interface is supposed to be data > structure > neutral so we can optimize for different architectures. If you want the > values > directly, I would use MatGetRow() for each row. > > Matt > > > Thanks > > > > Stephen > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Jun 4 09:11:33 2008 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 4 Jun 2008 09:11:33 -0500 (CDT) Subject: SV: Error codes from external packages In-Reply-To: References: Message-ID: Lars, On Wed, 4 Jun 2008, Lars Rindorf wrote: > Hi David > > Thanks for the answer and the reference I'll look into it. > > I have since my email tried mumps. It is considerably faster than umfpack when running real scalars and default solver settings (43/53 secs), which I had not expected. I'll make some more detailed comparisons of the performance of mumps and umfpack on the xeon quad core after the summer vacation. Petsc-MUMPS interface supports complex arithmetic as well. We also interface with superlu with the complex arithmetic support. You can try both without changing your application code. Let us know the performance. Thanks, Hong > > KR, Lars > > -----Oprindelig meddelelse----- > Fra: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af David Colignon > Sendt: 3. juni 2008 17:51 > Til: petsc-users at mcs.anl.gov > Emne: Re: Error codes from external packages > > Hi, > > my colleague Ch. Geuzaine added complex arithmetic support and 64 bit addressing for umfpack last year. > > See http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-dev/2007/06/msg00000.html > > It has been included in petsc-dev. > > Cheers, > > Dave > > -- > David Colignon, Ph.D. > Collaborateur Logistique F.R.S.-FNRS (Equipements de Calcul Intensif) ACE - Applied & Computational Electromagnetics Institut Montefiore B28 Universit? de Li?ge 4000 Li?ge - BELGIQUE > T?l: +32 (0)4 366 37 32 > Fax: +32 (0)4 366 29 10 > WWW: http://www.montefiore.ulg.ac.be/personnel.php?op=detail&id=898 > Agenda: http://www.google.com/calendar/embed?src=david.colignon%40gmail.com > > > Matthew Knepley wrote: >> On Tue, Jun 3, 2008 at 6:53 AM, Lars Rindorf >> wrote: >>> Hi Matthew >>> >>> I've a couple of questions regarding umfpack and petsc. I have a version of the program that I use. This program uses complex scalars and umfpack in petsc, but when I try myself to compile petsc with umfpack and complex numbers, petsc gives an error saying that umfpack and complex scalars is not yet implemented. Is that correct? Is there a version of petsc that allows complex scalars and umfpack? >> >> This question came up before on this list. We do not support complex >> with UMFPACK. I cannot remember the reason, but there was a problem >> with the complex extension. >> >>> Secondly, I'm still having problems with umfpack running out of memory (umfpack error -1). I have played around with petsc memory allocation in MatSetAIJSetPreallocation and with umfpack Control[UMFPACK_ALLOC_INIT] and it makes no difference. I have try it also with both Intel mkl blas and petsc default blas, and that made no difference either. Do you have any ideas where to look for the error? >> >> Are you trying to go beyond 32-bits? 
If so, you would need an OS that >> will allocate more than 2G to a process, like 64-bit Linux. >> >> Matt >> >>> The largest succesful simulation gives the following petsc output: >>> KSP Object: >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left >>> preconditioning PC Object: >>> type: lu >>> LU: out-of-place factorization >>> matrix ordering: nd >>> LU: tolerance for zero pivot 1e-12 >>> LU: factor fill ratio needed 0 >>> Factored matrix follows >>> Matrix Object: >>> type=umfpack, rows=118636, cols=118636 >>> total: nonzeros=0, allocated nonzeros=118636 >>> not using I-node routines >>> UMFPACK run parameters: >>> Control[UMFPACK_PRL]: 1 >>> Control[UMFPACK_STRATEGY]: 0 >>> Control[UMFPACK_DENSE_COL]: 0.2 >>> Control[UMFPACK_DENSE_ROW]: 0.2 >>> Control[UMFPACK_AMD_DENSE]: 10 >>> Control[UMFPACK_BLOCK_SIZE]: 32 >>> Control[UMFPACK_2BY2_TOLERANCE]: 0.01 >>> Control[UMFPACK_FIXQ]: 0 >>> Control[UMFPACK_AGGRESSIVE]: 1 >>> Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 >>> Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 >>> Control[UMFPACK_SCALE]: 1 >>> Control[UMFPACK_ALLOC_INIT]: 0.7 >>> Control[UMFPACK_DROPTOL]: 0 >>> Control[UMFPACK_IRSTEP]: 0 >>> UMFPACK default matrix ordering is used (not the PETSc >>> matrix ordering) linear system matrix = precond matrix: >>> Matrix Object: >>> type=umfpack, rows=118636, cols=118636 >>> total: nonzeros=4377120, allocated nonzeros=29659000 >>> using I-node routines: found 105980 nodes, limit used is 5 >>> >>> KR, Lars >>> >>> -----Oprindelig meddelelse----- >>> Fra: owner-petsc-users at mcs.anl.gov >>> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >>> Sendt: 2. juni 2008 16:33 >>> Til: petsc-users at mcs.anl.gov >>> Emne: Re: Error codes from external packages >>> >>> On Mon, Jun 2, 2008 at 9:22 AM, Lars Rindorf wrote: >>>> Hi Matthew >>>> >>>> I have included the -mat_umfpack_prl parameter, but it does not make any difference. I have checked the spelling with petsc manual. When umfpack crashes it (umfpack_di_numeric) returns an error code. I want to access that code. >>> Ah, I was checking solve(), not factor(). This is an oversight in the code. I am fixing it in the dev version. We will have a release fairly soon, but you can always get dev for these kinds of bugs fixes quickly. >>> >>> Matt >>> >>>> Here is what petsc returns (with/without -mat_umfpack_prl option): >>>> [lhr at localhost notch_patch]$ getdp patch_notch.pro -msh >>>> patch_notch_29000.msh -pre res_static -cal -ksp_type preonly -pc_type lu -mat_type umfpack -mat_umfpack_prl P r e - P r o c e s s i n g . . . >>>> E n d P r e - P r o c e s s i n g >>>> P r o c e s s i n g . . . >>>> Operation : Generate[A] >>>> Info : Setting System {A,b} to zero >>>> Resources : cpu 6.41402 s >>>> Operation : Solve[A] >>>> PETSc : N: 136188 >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Error in external library! >>>> [0]PETSC ERROR: umfpack_di_numeric failed! >>>> [0]PETSC ERROR: >>>> -------------------------------------------------------------------- >>>> -- >>>> -- [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 13, Thu May 15 >>>> 17:29:26 CDT 2008 HG revision: >>>> 4466c6289a0922df26e20626fd4a0b4dd03c8124 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: >>>> -------------------------------------------------------------------- >>>> -- >>>> -- [0]PETSC ERROR: patch_notch.pro on a linux-gnu named >>>> localhost.localdomain by lhr Mon Jun 2 16:03:50 2008 [0]PETSC ERROR: >>>> Libraries linked from >>>> /home/lhr/Desktop/getdp/petsc/petsc-2.3.3-p13/lib/linux-gnu-c-debug >>>> [0]PETSC ERROR: Configure run at Mon Jun 2 15:58:42 2008 [0]PETSC >>>> ERROR: Configure options --with-cc=gcc --with-fc=g77 >>>> --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014/lib/em64t/ >>>> --download-mpich=ifneeded --download-umfpack=ifneeded >>>> --with-shared=0 [0]PETSC ERROR: >>>> -------------------------------------------------------------------- >>>> -- >>>> -- [0]PETSC ERROR: MatLUFactorNumeric_UMFPACK() line 129 in >>>> src/mat/impls/aij/seq/umfpack/umfpack.c >>>> [0]PETSC ERROR: MatLUFactorNumeric() line 2227 in >>>> src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_LU() line 280 in >>>> src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: PCSetUp() line 787 >>>> in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 >>>> in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line >>>> 347 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: User provided >>>> function() line 1472 in unknowndirectory/LinAlg_PETSC.c >>>> [unset]: aborting job: >>>> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0 >>>> >>>> KR, Lars >>>> >>>> >>>> -----Oprindelig meddelelse----- >>>> Fra: owner-petsc-users at mcs.anl.gov >>>> [mailto:owner-petsc-users at mcs.anl.gov] P? vegne af Matthew Knepley >>>> Sendt: 2. juni 2008 15:07 >>>> Til: petsc-users at mcs.anl.gov >>>> Emne: Re: Error codes from external packages >>>> >>>> On Mon, Jun 2, 2008 at 7:17 AM, Lars Rindorf wrote: >>>>> Dear all >>>>> >>>>> If I want to know the error codes of an external package that >>>>> crashes, then what can I do? >>>> We call >>>> >>>> umfpack_zl_report_status() >>>> >>>> with the UMFPACK status code in the event of a failure. Printing for this is controlled by the -mat_umfpack_prl option I believe. >>>> >>>> Matt >>>> >>>>> The problem arises with umfpack when the size of the matrix is more >>>>> than 118000x118000, corresponding to 2.4 Gb memory consumption. It >>>>> simply returns "umfpack_di_numeric" (factorization of a real matrix) failed. >>>>> Has anybody else experienced this? >>>>> >>>>> KR, Lars >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> > From bsmith at mcs.anl.gov Wed Jun 4 09:30:20 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 4 Jun 2008 09:30:20 -0500 Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. In-Reply-To: <864CZa100408@awe.co.uk> References: <864CZa100408@awe.co.uk> Message-ID: <3F2D556E-2F5E-44CF-9289-D43DE52ED94D@mcs.anl.gov> Stephan, For all built in PETSc matrix types, the rows are stored contiguously so you can use MatGetOwnershipRange() to get the global indices for the local rows and thus call MatGetRow(). Barry On Jun 4, 2008, at 6:33 AM, Stephen R Ball wrote: > Ok, I am looking into using MatGetRow(). However this requires the > global row number for input. 
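A minimal Fortran sketch of the loop Barry describes: take the locally owned row range from MatGetOwnershipRange() and call MatGetRow()/MatRestoreRow() on each global row in that range. The subroutine wrapper, the assumed bound of 1000 nonzeros per row, and the placeholder comment for the file output are illustrative assumptions, not code from this thread; in the Fortran interface the cols()/vals() arrays must be dimensioned at least as large as the longest row.

      subroutine dump_local_rows(A,ierr)
      implicit none
#include "include/finclude/petsc.h"
#include "include/finclude/petscmat.h"
      Mat            A
      PetscErrorCode ierr
      PetscInt       rstart,rend,row,ncols
      PetscInt       cols(1000)
      PetscScalar    vals(1000)

!     rstart..rend-1 are the global indices of the rows stored locally
      call MatGetOwnershipRange(A,rstart,rend,ierr)
      do row = rstart, rend-1
         call MatGetRow(A,row,ncols,cols,vals,ierr)
!        ... write row, cols(1:ncols), vals(1:ncols) in any format ...
         call MatRestoreRow(A,row,ncols,cols,vals,ierr)
      end do
      end

Each MatGetRow() must be matched by MatRestoreRow() before the next row is requested, which is why the restore sits inside the loop.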
I was looking to use > MatGetOwnershipRange() > to obtain the range of global row numbers owned by each processor but > the documentation states that this routine assumes that the matrix is > laid out with the first n1 rows on the first processor, the next n2 > rows > on the second, etc and that for certain parallel layouts this range > may > not be well defined. > > This is the case for me. Do you have a routine where I can specify a > global row number and it will tell me the rank of the processor that > owns it? This is to ensure that MatGetRow() only gets called by the > owner processor for each global row number. > > Regards > > Stephen > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: 03 June 2008 15:29 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. > > On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball > wrote: >> Hi >> >> I have been trying to extract an array containing the local matrix >> values using MatGetArray() via the Fortran interface but get the >> error >> message that Mat type mpiaij is not supported with this routine. >> All I >> want to do is to extract the local matrix values so that I can output >> them to file in the format I want rather than via use of the >> MatView() >> routine. Can you suggest a way of how I can go about extracting the >> local matrix values? > > This is no "local matrix". The Mat interface is supposed to be data > structure > neutral so we can optimize for different architectures. If you want > the > values > directly, I would use MatGetRow() for each row. > > Matt > >> Thanks >> >> Stephen >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > From Stephen.R.Ball at awe.co.uk Wed Jun 4 10:15:59 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Wed, 4 Jun 2008 16:15:59 +0100 Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. Message-ID: <864GKT015225@awe.co.uk> I am using PETSc via the Fortran interface. I don't think MatGetOwnershipRanges() is available via Fortran. ? ________________________________________ From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Dave May Sent: 04 June 2008 13:43 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. You could use MatGetOwnershipRanges() http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatGetOwnershipRanges.html and then traverse the list "ranges[]" to determine which rank owns the global row number of interest. Cheers, ??? Dave On Wed, Jun 4, 2008 at 9:33 PM, Stephen R Ball wrote: Ok, I am looking into using MatGetRow(). However this requires the global row number for input. I was looking to use MatGetOwnershipRange() to obtain the range of global row numbers owned by each processor but the documentation states that this routine assumes that the matrix is laid out with the first n1 rows on the first processor, the next n2 rows on the second, etc and that for certain parallel layouts this range may not be well defined. This is the case for me. Do you have a routine where I can specify a global row number and it will tell me the rank of the processor that owns it? 
This is to ensure that MatGetRow() only gets called by the owner processor for each global row number. Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: 03 June 2008 15:29 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball wrote: > Hi > > I have been trying to extract an array containing the local matrix > values using MatGetArray() via the Fortran interface but get the error > message that Mat type mpiaij is not supported with this routine. All I > want to do is to extract the local matrix values so that I can output > them to file in the format I want rather than via use of the MatView() > routine. Can you suggest a way of how I can go about extracting the > local matrix values? This is no "local matrix". The Mat interface is supposed to be data structure neutral so we can optimize for different architectures. If you want the values directly, I would use MatGetRow() for each row. ?Matt > Thanks > > Stephen > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Wed Jun 4 11:23:43 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 4 Jun 2008 11:23:43 -0500 Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. In-Reply-To: <864GKT015225@awe.co.uk> References: <864GKT015225@awe.co.uk> Message-ID: On Wed, Jun 4, 2008 at 10:15 AM, Stephen R Ball wrote: > I am using PETSc via the Fortran interface. I don't think MatGetOwnershipRanges() is available via Fortran. As Barry noted, you can use MatGetOwnershipRange() Matt > ________________________________________ > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Dave May > Sent: 04 June 2008 13:43 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. > > You could use MatGetOwnershipRanges() > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatGetOwnershipRanges.html > and then traverse the list "ranges[]" to determine which rank owns the global row number of interest. > > Cheers, > Dave > On Wed, Jun 4, 2008 at 9:33 PM, Stephen R Ball wrote: > Ok, I am looking into using MatGetRow(). However this requires the > global row number for input. I was looking to use MatGetOwnershipRange() > to obtain the range of global row numbers owned by each processor but > the documentation states that this routine assumes that the matrix is > laid out with the first n1 rows on the first processor, the next n2 rows > on the second, etc and that for certain parallel layouts this range may > not be well defined. > > This is the case for me. Do you have a routine where I can specify a > global row number and it will tell me the rank of the processor that > owns it? This is to ensure that MatGetRow() only gets called by the > owner processor for each global row number. > > Regards > > Stephen > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: 03 June 2008 15:29 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. 
> > On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball > wrote: >> Hi >> >> I have been trying to extract an array containing the local matrix >> values using MatGetArray() via the Fortran interface but get the error >> message that Mat type mpiaij is not supported with this routine. All I >> want to do is to extract the local matrix values so that I can output >> them to file in the format I want rather than via use of the MatView() >> routine. Can you suggest a way of how I can go about extracting the >> local matrix values? > > This is no "local matrix". The Mat interface is supposed to be data > structure > neutral so we can optimize for different architectures. If you want the > values > directly, I would use MatGetRow() for each row. > > Matt > >> Thanks >> >> Stephen >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From etienne.perchat at transvalor.com Thu Jun 5 05:35:09 2008 From: etienne.perchat at transvalor.com (Etienne PERCHAT) Date: Thu, 5 Jun 2008 12:35:09 +0200 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 Message-ID: <9113A52E1096EB41B1F88DD94C4369D52168A5@EXCHSRV.transvalor.com> Hello, I'm solving MPIBAIJI matrices using KSPCR preconditioned by a PCBJACOBI between sub domains and PCILU with a fill ratio of 1 on each sub domain. Solving exactly the same linear systems, I notice that the solution is slower using v2.3.3p8 than with v2.3.0. The behavior is not very clear: v2.3.3p8 is generally slower (10 % or more) but on some systems it happen to be really faster (may be 40 %). Unhappily in my case at the end I'm always slower ... Did somebody noticed such behavior? Is there changes if I upgrade to current 2.3.3p13 version (I've seen a fix:" fix error with using mpirowbs & icc & ksp_view") ? Thanks, Etienne Perchat From dalcinl at gmail.com Thu Jun 5 10:39:20 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 5 Jun 2008 12:39:20 -0300 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 In-Reply-To: <9113A52E1096EB41B1F88DD94C4369D52168A5@EXCHSRV.transvalor.com> References: <9113A52E1096EB41B1F88DD94C4369D52168A5@EXCHSRV.transvalor.com> Message-ID: On 6/5/08, Etienne PERCHAT wrote: > The behavior is not very clear: v2.3.3p8 is generally slower (10 % or > more) but on some systems it happen to be really faster (may be 40 %). Which compilers are you using in your different systems? If they are the same, Are their versions the same? I've noticed surprising differences for different versions of GCC in builtin PETSc factorization routines, more specifically LU. 
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Jun 5 13:28:07 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 5 Jun 2008 13:28:07 -0500 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 In-Reply-To: <9113A52E1096EB41B1F88DD94C4369D52168A5@EXCHSRV.transvalor.com> References: <9113A52E1096EB41B1F88DD94C4369D52168A5@EXCHSRV.transvalor.com> Message-ID: <06B5B34E-5AF3-49A1-AE86-C70C3597DE08@mcs.anl.gov> This could also be the result of changes in the way we test for convergence. Try with -ksp_truemonitor and see if it is using the same number of iterations to converge. Barry On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: > Hello, > > I'm solving MPIBAIJI matrices using KSPCR preconditioned by a > PCBJACOBI > between sub domains and PCILU with a fill ratio of 1 on each sub > domain. > > Solving exactly the same linear systems, I notice that the solution is > slower using v2.3.3p8 than with v2.3.0. > The behavior is not very clear: v2.3.3p8 is generally slower (10 % or > more) but on some systems it happen to be really faster (may be 40 %). > > Unhappily in my case at the end I'm always slower ... > > Did somebody noticed such behavior? Is there changes if I upgrade to > current 2.3.3p13 version (I've seen a fix:" fix error with using > mpirowbs & icc & ksp_view") ? > > Thanks, > Etienne Perchat > > From etienne.perchat at transvalor.com Fri Jun 6 03:03:31 2008 From: etienne.perchat at transvalor.com (Etienne PERCHAT) Date: Fri, 6 Jun 2008 10:03:31 +0200 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 Message-ID: <9113A52E1096EB41B1F88DD94C4369D5216916@EXCHSRV.transvalor.com> Hi Lisandro, I'm using exactly the same compilers. Etienne -----Message d'origine----- De?: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] De la part de Lisandro Dalcin Envoy??: jeudi 5 juin 2008 17:39 ??: petsc-users at mcs.anl.gov Objet?: Re: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 On 6/5/08, Etienne PERCHAT wrote: > The behavior is not very clear: v2.3.3p8 is generally slower (10 % or > more) but on some systems it happen to be really faster (may be 40 %). Which compilers are you using in your different systems? If they are the same, Are their versions the same? I've noticed surprising differences for different versions of GCC in builtin PETSc factorization routines, more specifically LU. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From etienne.perchat at transvalor.com Fri Jun 6 06:27:07 2008 From: etienne.perchat at transvalor.com (Etienne PERCHAT) Date: Fri, 6 Jun 2008 13:27:07 +0200 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 Message-ID: <9113A52E1096EB41B1F88DD94C4369D5216948@EXCHSRV.transvalor.com> Hi Barry, Using -ksp_truemonitor does not change anything. 
I have the same number of iterations in the 3 cases (v2.3.3p8 + true_monitor, v2.3.3, v2.3.0) and the same final residual norm but with v2.3.0 the code is slightly faster. I have to stress that I also monitor convergence with KSPMonitorSet in order to store the "best residual" and the "best solution" (I did not test if it is still useful in 2.3.3). Etienne -----Message d'origine----- De?: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] De la part de Barry Smith Envoy??: jeudi 5 juin 2008 20:28 ??: petsc-users at mcs.anl.gov Objet?: Re: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 This could also be the result of changes in the way we test for convergence. Try with -ksp_truemonitor and see if it is using the same number of iterations to converge. Barry On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: > Hello, > > I'm solving MPIBAIJI matrices using KSPCR preconditioned by a > PCBJACOBI > between sub domains and PCILU with a fill ratio of 1 on each sub > domain. > > Solving exactly the same linear systems, I notice that the solution is > slower using v2.3.3p8 than with v2.3.0. > The behavior is not very clear: v2.3.3p8 is generally slower (10 % or > more) but on some systems it happen to be really faster (may be 40 %). > > Unhappily in my case at the end I'm always slower ... > > Did somebody noticed such behavior? Is there changes if I upgrade to > current 2.3.3p13 version (I've seen a fix:" fix error with using > mpirowbs & icc & ksp_view") ? > > Thanks, > Etienne Perchat > > From bsmith at mcs.anl.gov Fri Jun 6 09:39:04 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 6 Jun 2008 09:39:04 -0500 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 In-Reply-To: <9113A52E1096EB41B1F88DD94C4369D5216948@EXCHSRV.transvalor.com> References: <9113A52E1096EB41B1F88DD94C4369D5216948@EXCHSRV.transvalor.com> Message-ID: <4EB13749-479B-4405-9F33-74D63C387D9E@mcs.anl.gov> If you run both cases with -log_summary you'll see where it is spending more time. Mail along the -log_summary output. Barry On Jun 6, 2008, at 6:27 AM, Etienne PERCHAT wrote: > Hi Barry, > > Using -ksp_truemonitor does not change anything. I have the same > number of iterations in the 3 cases (v2.3.3p8 + true_monitor, > v2.3.3, v2.3.0) and the same final residual norm but with v2.3.0 the > code is slightly faster. > > I have to stress that I also monitor convergence with KSPMonitorSet > in order to store the "best residual" and the "best solution" (I did > not test if it is still useful in 2.3.3). > > Etienne > > -----Message d'origine----- > De : owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] De la part de Barry Smith > Envoy? : jeudi 5 juin 2008 20:28 > ? : petsc-users at mcs.anl.gov > Objet : Re: KSPCR with PCILU generally slower with v2.3.3 than with > 2.3.0 > > > This could also be the result of changes in the way we test for > convergence. > Try with -ksp_truemonitor and see if it is using the same number of > iterations > to converge. > > Barry > > On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: > >> Hello, >> >> I'm solving MPIBAIJI matrices using KSPCR preconditioned by a >> PCBJACOBI >> between sub domains and PCILU with a fill ratio of 1 on each sub >> domain. >> >> Solving exactly the same linear systems, I notice that the solution >> is >> slower using v2.3.3p8 than with v2.3.0. 
>> The behavior is not very clear: v2.3.3p8 is generally slower (10 % or >> more) but on some systems it happen to be really faster (may be 40 >> %). >> >> Unhappily in my case at the end I'm always slower ... >> >> Did somebody noticed such behavior? Is there changes if I upgrade to >> current 2.3.3p13 version (I've seen a fix:" fix error with using >> mpirowbs & icc & ksp_view") ? >> >> Thanks, >> Etienne Perchat >> >> > > > From zonexo at gmail.com Fri Jun 6 20:07:04 2008 From: zonexo at gmail.com (Ben Tay) Date: Sat, 07 Jun 2008 09:07:04 +0800 Subject: Analysis of performance of parallel code as processors increase Message-ID: <4849DF38.6070104@gmail.com> Hi, I have coded in parallel using PETSc and Hypre. I found that going from 1 to 4 processors gives an almost 4 times increase. However from 4 to 8 processors only increase performance by 1.2-1.5 instead of 2. Is the slowdown due to the size of the matrix being not large enough? Currently I am using 600x2160 to do the benchmark. Even when increase the matrix size to 900x3240 or 1200x2160, the performance increase is also not much. Is it possible to use -log_summary find out the error? I have attached the log file comparison for the 4 and 8 processors, I found that some event like VecScatterEnd, VecNorm and MatAssemblyBegin have much higher ratios. Does it indicate something? Another strange thing is that MatAssemblyBegin for the 4 pros has a much higher ratio than the 8pros. I thought there should be less communications for the 4 pros case, and so the ratio should be lower. Does it mean there's some communication problem at that time? Thank you very much. Regards -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test4_600_29min.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test8_600_19min.txt URL: From bsmith at mcs.anl.gov Fri Jun 6 21:23:15 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 6 Jun 2008 21:23:15 -0500 Subject: Analysis of performance of parallel code as processors increase In-Reply-To: <4849DF38.6070104@gmail.com> References: <4849DF38.6070104@gmail.com> Message-ID: You are not using hypre, you are using block Jacobi with ILU on the blocks. The number of iterations goes from around 4000 to around 5000 in going from 4 to 8 processes, this is why you do not see such a great speedup. Barry On Jun 6, 2008, at 8:07 PM, Ben Tay wrote: > Hi, > > I have coded in parallel using PETSc and Hypre. I found that going > from 1 to 4 processors gives an almost 4 times increase. However > from 4 to 8 processors only increase performance by 1.2-1.5 instead > of 2. > > Is the slowdown due to the size of the matrix being not large > enough? Currently I am using 600x2160 to do the benchmark. Even when > increase the matrix size to 900x3240 or 1200x2160, the performance > increase is also not much. Is it possible to use -log_summary find > out the error? I have attached the log file comparison for the 4 and > 8 processors, I found that some event like VecScatterEnd, VecNorm > and MatAssemblyBegin have much higher ratios. Does it indicate > something? Another strange thing is that MatAssemblyBegin for the 4 > pros has a much higher ratio than the 8pros. I thought there should > be less communications for the 4 pros case, and so the ratio should > be lower. Does it mean there's some communication problem at that > time? > > Thank you very much. 
> > Regards > > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript - > r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > ./a.out on a atlas3-mp named atlas3-c43 with 4 processors, by > g0306332 Fri Jun 6 17:29:26 2008 > Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST > 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b > > Max Max/Min Avg Total > Time (sec): 1.750e+03 1.00043 1.750e+03 > Objects: 4.200e+01 1.00000 4.200e+01 > Flops: 6.961e+10 1.00074 6.959e+10 2.784e+11 > Flops/sec: 3.980e+07 1.00117 3.978e+07 1.591e+08 > MPI Messages: 8.168e+03 2.00000 6.126e+03 2.450e+04 > MPI Message Lengths: 5.525e+07 2.00000 6.764e+03 1.658e+08 > MPI Reductions: 3.203e+03 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of > length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.7495e+03 100.0% 2.7837e+11 100.0% 2.450e+04 > 100.0% 6.764e+03 100.0% 1.281e+04 100.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops/sec: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all > processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with > PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max > time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was run without the PreLoadBegin() # > # macros. To get timing results we always recommend # > # preloading. otherwise timing numbers may be # > # meaningless. 
# > ########################################################## > > > Event Count Time (sec) Flops/ > sec --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 4082 1.0 8.2037e+01 1.5 4.67e+08 1.5 2.4e+04 6.8e > +03 0.0e+00 4 37100100 0 4 37100100 0 1240 > MatSolve 1976 1.0 1.3250e+02 1.5 2.52e+08 1.5 0.0e+00 0.0e > +00 0.0e+00 6 31 0 0 0 6 31 0 0 0 655 > MatLUFactorNum 300 1.0 3.8260e+01 1.2 2.07e+08 1.2 0.0e+00 0.0e > +00 0.0e+00 2 9 0 0 0 2 9 0 0 0 668 > MatILUFactorSym 1 1.0 2.2550e-01 2.7 0.00e+00 0.0 0.0e+00 0.0e > +00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 1 1.0 2.9182e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 301 1.0 1.0776e+021228.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 6.0e+02 4 0 0 0 5 4 0 0 0 5 0 > MatAssemblyEnd 301 1.0 9.6146e+00 1.1 0.00e+00 0.0 1.2e+01 3.6e > +03 3.1e+02 1 0 0 0 2 1 0 0 0 2 0 > MatGetRow 324000 1.0 1.2161e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 1.0 5.0068e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 2.1279e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e > +00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetup 601 1.0 2.5108e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 600 1.0 1.2353e+03 1.0 5.64e+07 1.0 2.4e+04 6.8e > +03 8.3e+03 71100100100 65 71100100100 65 225 > PCSetUp 601 1.0 4.0116e+01 1.2 1.96e+08 1.2 0.0e+00 0.0e > +00 5.0e+00 2 9 0 0 0 2 9 0 0 0 637 > PCSetUpOnBlocks 300 1.0 3.8513e+01 1.2 2.06e+08 1.2 0.0e+00 0.0e > +00 3.0e+00 2 9 0 0 0 2 9 0 0 0 664 > PCApply 4682 1.0 1.0566e+03 1.0 2.12e+07 1.0 0.0e+00 0.0e > +00 0.0e+00 59 31 0 0 0 59 31 0 0 0 82 > VecDot 4812 1.0 8.2762e+00 1.1 4.00e+08 1.1 0.0e+00 0.0e > +00 4.8e+03 0 4 0 0 38 0 4 0 0 38 1507 > VecNorm 3479 1.0 9.2739e+01 8.3 3.15e+08 8.3 0.0e+00 0.0e > +00 3.5e+03 4 5 0 0 27 4 5 0 0 27 152 > VecCopy 900 1.0 2.0819e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5882 1.0 9.4626e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 5585 1.0 1.5397e+01 1.5 4.67e+08 1.5 0.0e+00 0.0e > +00 0.0e+00 1 7 0 0 0 1 7 0 0 0 1273 > VecAYPX 2879 1.0 1.0303e+01 1.6 4.45e+08 1.6 0.0e+00 0.0e > +00 0.0e+00 0 4 0 0 0 0 4 0 0 0 1146 > VecWAXPY 2406 1.0 7.7902e+00 1.6 3.14e+08 1.6 0.0e+00 0.0e > +00 0.0e+00 0 2 0 0 0 0 2 0 0 0 801 > VecAssemblyBegin 1200 1.0 8.4259e+00 3.8 0.00e+00 0.0 0.0e+00 0.0e > +00 3.6e+03 0 0 0 0 28 0 0 0 0 28 0 > VecAssemblyEnd 1200 1.0 2.4173e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 4082 1.0 1.2512e-01 1.5 0.00e+00 0.0 2.4e+04 6.8e > +03 0.0e+00 0 0100100 0 0 0100100 0 0 > VecScatterEnd 4082 1.0 2.0954e+0153.3 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. 
> > --- Event Stage 0: Main Stage > > Matrix 7 7 321241092 0 > Krylov Solver 3 3 8 0 > Preconditioner 3 3 528 0 > Index Set 7 7 7785600 0 > Vec 20 20 46685344 0 > Vec Scatter 2 2 0 0 > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > ====================================================================== > Average time to get PetscTime(): 1.90735e-07 > Average time for MPI_Barrier(): 1.45912e-05 > Average time for zero size MPI_Send(): 7.27177e-06 > OptionTable: -log_summary test4_600 > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Tue Jan 8 22:22:08 2008 > Configure options: --with-memcmp-ok --sizeof_char=1 -- > sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 -- > sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 -- > bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with- > vendor-compilers=intel --with-x=0 --with-hypre-dir=/home/enduser/ > g0306332/lib/hypre --with-debugging=0 --with-batch=1 --with-mpi- > shared=0 --with-mpi-include=/usr/local/topspin/mpi/mpich/include -- > with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a --with- > mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun --with-blas-lapack- > dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 > ----------------------------------------- > Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 > Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP Wed > Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux > Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 > Using PETSc arch: atlas3-mpi > ----------------------------------------- > Using C compiler: mpicc -fPIC -O > Using Fortran compiler: mpif90 -I. -fPIC -O > ----------------------------------------- > Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 -I/ > nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi -I/nfs/ > home/enduser/g0306332/petsc-2.3.3-p8/include -I/home/enduser/ > g0306332/lib/hypre/include -I/usr/local/topspin/mpi/mpich/include > ------------------------------------------ > Using C linker: mpicc -fPIC -O > Using Fortran linker: mpif90 -I. 
-fPIC -O > Using libraries: -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3- > p8/lib/atlas3-mpi -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/ > atlas3-mpi -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat - > lpetscvec -lpetsc -Wl,-rpath,/home/enduser/g0306332/lib/hypre/ > lib -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE -Wl,-rpath,/opt/ > mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,-rpath,/ > opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat- > linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,-rpath,/ > opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- > rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/usr/local/ > topspin/mpi/mpich/lib -L/usr/local/topspin/mpi/mpich/lib -lmpich - > Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t -L/opt/intel/cmkl/8.1.1/ > lib/em64t -lmkl_lapack -lmkl_em64t -lguide -lpthread -Wl,-rpath,/usr/ > local/ofed/lib64 -L/usr/local/ofed/lib64 -Wl,-rpath,/opt/mvapich/ > 0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich -libverbs - > libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/ > intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ > 3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s - > lmpichf90nc -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- > rpath,/opt/intel/fce/9.1.045/lib -L/opt/intel/fce/9.1.045/lib - > lifport -lifcore -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,- > rpath,/usr/local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib - > Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc+ > + -lcxaguard -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- > rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 - > Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- > rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 - > Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/ > 0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich -Wl,- > rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs - > libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/ > intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ > 3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -ldl -lc > ------------------------------------------ > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript - > r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > ./a.out on a atlas3-mp named atlas3-c18 with 8 processors, by > g0306332 Fri Jun 6 17:23:25 2008 > Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST > 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b > > Max Max/Min Avg Total > Time (sec): 1.140e+03 1.00019 1.140e+03 > Objects: 4.200e+01 1.00000 4.200e+01 > Flops: 4.620e+10 1.00158 4.619e+10 3.695e+11 > Flops/sec: 4.053e+07 1.00177 4.051e+07 3.241e+08 > MPI Messages: 9.954e+03 2.00000 8.710e+03 6.968e+04 > MPI Message Lengths: 7.224e+07 2.00000 7.257e+03 5.057e+08 > MPI Reductions: 1.716e+03 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of > length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.1402e+03 100.0% 3.6953e+11 100.0% 6.968e+04 > 100.0% 7.257e+03 100.0% 1.372e+04 100.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops/sec: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all > processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with > PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max > time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was run without the PreLoadBegin() # > # macros. To get timing results we always recommend # > # preloading. otherwise timing numbers may be # > # meaningless. 
# > ########################################################## > > > Event Count Time (sec) Flops/ > sec --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 4975 1.0 7.8154e+01 1.9 4.19e+08 1.9 7.0e+04 7.3e > +03 0.0e+00 5 38100100 0 5 38100100 0 1798 > MatSolve 2855 1.0 1.0870e+02 1.8 2.57e+08 1.8 0.0e+00 0.0e > +00 0.0e+00 7 34 0 0 0 7 34 0 0 0 1153 > MatLUFactorNum 300 1.0 2.3238e+01 1.5 2.07e+08 1.5 0.0e+00 0.0e > +00 0.0e+00 2 7 0 0 0 2 7 0 0 0 1099 > MatILUFactorSym 1 1.0 6.1973e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e > +00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 1 1.0 1.4168e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 301 1.0 6.9683e+01 8.6 0.00e+00 0.0 0.0e+00 0.0e > +00 6.0e+02 4 0 0 0 4 4 0 0 0 4 0 > MatAssemblyEnd 301 1.0 6.2247e+00 1.2 0.00e+00 0.0 2.8e+01 3.6e > +03 3.1e+02 0 0 0 0 2 0 0 0 0 2 0 > MatGetRow 162000 1.0 6.0330e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 1.0 9.0599e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 5.6710e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e > +00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetup 601 1.0 1.5631e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 600 1.0 8.1668e+02 1.0 5.66e+07 1.0 7.0e+04 7.3e > +03 9.2e+03 72100100100 67 72100100100 67 452 > PCSetUp 601 1.0 2.4372e+01 1.5 1.93e+08 1.5 0.0e+00 0.0e > +00 5.0e+00 2 7 0 0 0 2 7 0 0 0 1048 > PCSetUpOnBlocks 300 1.0 2.3303e+01 1.5 2.07e+08 1.5 0.0e+00 0.0e > +00 3.0e+00 2 7 0 0 0 2 7 0 0 0 1096 > PCApply 5575 1.0 6.5344e+02 1.1 2.57e+07 1.1 0.0e+00 0.0e > +00 0.0e+00 55 34 0 0 0 55 34 0 0 0 192 > VecDot 4840 1.0 6.8932e+00 1.3 3.07e+08 1.3 0.0e+00 0.0e > +00 4.8e+03 1 3 0 0 35 1 3 0 0 35 1820 > VecNorm 4365 1.0 1.2250e+02 3.6 6.82e+07 3.6 0.0e+00 0.0e > +00 4.4e+03 8 5 0 0 32 8 5 0 0 32 153 > VecCopy 900 1.0 1.4297e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 6775 1.0 8.1405e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 6485 1.0 1.0003e+01 1.9 5.73e+08 1.9 0.0e+00 0.0e > +00 0.0e+00 1 7 0 0 0 1 7 0 0 0 2420 > VecAYPX 3765 1.0 7.8289e+00 2.0 5.17e+08 2.0 0.0e+00 0.0e > +00 0.0e+00 0 4 0 0 0 0 4 0 0 0 2092 > VecWAXPY 2420 1.0 3.8504e+00 1.9 3.80e+08 1.9 0.0e+00 0.0e > +00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1629 > VecAssemblyBegin 1200 1.0 9.2808e+00 3.4 0.00e+00 0.0 0.0e+00 0.0e > +00 3.6e+03 1 0 0 0 26 1 0 0 0 26 0 > VecAssemblyEnd 1200 1.0 2.3313e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 4975 1.0 2.2727e-01 2.6 0.00e+00 0.0 7.0e+04 7.3e > +03 0.0e+00 0 0100100 0 0 0100100 0 0 > VecScatterEnd 4975 1.0 2.7557e+0168.1 0.00e+00 0.0 0.0e+00 0.0e > +00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. 
> > --- Event Stage 0: Main Stage > > Matrix 7 7 160595412 0 > Krylov Solver 3 3 8 0 > Preconditioner 3 3 528 0 > Index Set 7 7 3897600 0 > Vec 20 20 23357344 0 > Vec Scatter 2 2 0 0 > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > ====================================================================== > Average time to get PetscTime(): 1.19209e-07 > Average time for MPI_Barrier(): 2.10285e-05 > Average time for zero size MPI_Send(): 7.59959e-06 > OptionTable: -log_summary test8_600 > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Tue Jan 8 22:22:08 2008 > Configure options: --with-memcmp-ok --sizeof_char=1 -- > sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 -- > sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 -- > bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with- > vendor-compilers=intel --with-x=0 --with-hypre-dir=/home/enduser/ > g0306332/lib/hypre --with-debugging=0 --with-batch=1 --with-mpi- > shared=0 --with-mpi-include=/usr/local/topspin/mpi/mpich/include -- > with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a --with- > mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun --with-blas-lapack- > dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 > ----------------------------------------- > Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 > Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP Wed > Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux > Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 > Using PETSc arch: atlas3-mpi > ----------------------------------------- > Using C compiler: mpicc -fPIC -O > Using Fortran compiler: mpif90 -I. -fPIC -O > ----------------------------------------- > Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 -I/ > nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi -I/nfs/ > home/enduser/g0306332/petsc-2.3.3-p8/include -I/home/enduser/ > g0306332/lib/hypre/include -I/usr/local/topspin/mpi/mpich/include > ------------------------------------------ > Using C linker: mpicc -fPIC -O > Using Fortran linker: mpif90 -I. 
-fPIC -O > Using libraries: -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3- > p8/lib/atlas3-mpi -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/ > atlas3-mpi -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat - > lpetscvec -lpetsc -Wl,-rpath,/home/enduser/g0306332/lib/hypre/ > lib -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE -Wl,-rpath,/opt/ > mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,-rpath,/ > opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat- > linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,-rpath,/ > opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- > rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/usr/local/ > topspin/mpi/mpich/lib -L/usr/local/topspin/mpi/mpich/lib -lmpich - > Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t -L/opt/intel/cmkl/8.1.1/ > lib/em64t -lmkl_lapack -lmkl_em64t -lguide -lpthread -Wl,-rpath,/usr/ > local/ofed/lib64 -L/usr/local/ofed/lib64 -Wl,-rpath,/opt/mvapich/ > 0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich -libverbs - > libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/ > intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ > 3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s - > lmpichf90nc -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- > rpath,/opt/intel/fce/9.1.045/lib -L/opt/intel/fce/9.1.045/lib - > lifport -lifcore -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,- > rpath,/usr/local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib - > Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc+ > + -lcxaguard -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ > local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ > usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- > rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 - > Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- > rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 - > Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- > redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/ > 0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich -Wl,- > rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs - > libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/ > intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ > 3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/ > lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -ldl -lc > ------------------------------------------ From acolombi at gmail.com Sat Jun 7 13:39:34 2008 From: acolombi at gmail.com (Andrew Colombi) Date: Sat, 7 Jun 2008 13:39:34 -0500 Subject: MatCreateMPISBAIJ & Non-Square Matrix Message-ID: <9dc10d950806071139n55c30c60o78bcf4cfb0a0b0e1@mail.gmail.com> I am wondering how the d_nz and o_nz are treated for a matrix where N > M (i.e. more columns than rows). 
Also, if I understand the documentation correctly, it seems the "diagonal part" of the matrix depends only on the number of rows each processor has. If I let PETSc decide how can I know this before it's too late? Should I use MatMPISBAIJSetPreallocation after Creating the matrix? And finally, is d_nz the number of non zeros after having saved 1/2 for storing in a symmetric manner? or should I pretend I actually need twice the nz than will actually be stored. I'd also like to describe the problem I am trying to solve as I'm sure it's essentially the most trivial use of PETSc, and yet I am having some difficulty expressing it. Maybe someone can point out the obviously correct thing to do that I am missing. I am solving a linear elliptic PDE using linear FEM with Dirichlet boundary conditions. To handle boundary conditions using dense matrices in Matlab my strategy was to create an M x N matrix, A, where M is the number of unknowns, and N is unknowns + boundary conditions (i.e. N > M). Then I'd multiply A by a vector of 0s at unknowns and boundary conditions at knowns, and subtract this from my RHS, b. Finally I'd mask away the columns associated with the boundary conditions of A to produce a square matrix A' = A(:, unknowns). Then x = A' \ b. Scaling up to PETSc here are my concerns: * How do I best represent A? It is a symmetric matrix, but I'm not sure how to find d_nz and o_nz. nz. The trouble is I don't have any control over the bandwidth of the matrix (at least at the present). Though I have no estimate for d_nz, I do have one for nz; maybe I'll find better performance with an MPIAIJ? If such is the case would I still be able to use KSPCG? * How do I best produce A' from A? Ideally this could be done without any copying, since it's just a simple mask (i.e. remove the last N-M columns). Do Sub matrix commands perform a copy? Is there a way to avoid the copy? Thanks! -Andrew From bsmith at mcs.anl.gov Sat Jun 7 15:22:59 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 7 Jun 2008 15:22:59 -0500 Subject: MatCreateMPISBAIJ & Non-Square Matrix In-Reply-To: <9dc10d950806071139n55c30c60o78bcf4cfb0a0b0e1@mail.gmail.com> References: <9dc10d950806071139n55c30c60o78bcf4cfb0a0b0e1@mail.gmail.com> Message-ID: To mimic what you have done before use MatGetSubMatrix() to "mask out" the part you do not want (MatGetSubMatrix() is very fast). Start with MPIAIJ and get the preallocation correct. And successfully solve your problem. Then and only then, consider switching to SBAIJ to save some memory. Once you have the preallocation correct for MPIAIJ you can reuse it with a twist of discarding below diagonal counts for SBAIJ. Barry On Jun 7, 2008, at 1:39 PM, Andrew Colombi wrote: > I am wondering how the d_nz and o_nz are treated for a matrix where N >> M (i.e. more columns than rows). Also, if I understand the > documentation correctly, it seems the "diagonal part" of the matrix > depends only on the number of rows each processor has. If I let PETSc > decide how can I know this before it's too late? Should I use > MatMPISBAIJSetPreallocation after Creating the matrix? And finally, > is d_nz the number of non zeros after having saved 1/2 for storing in > a symmetric manner? or should I pretend I actually need twice the nz > than will actually be stored. > > I'd also like to describe the problem I am trying to solve as I'm sure > it's essentially the most trivial use of PETSc, and yet I am having > some difficulty expressing it. 
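The masking step Barry recommends is a single MatGetSubMatrix() call (its exact index-set and sizing arguments are easiest to take from the man page of the installed release); the step that usually needs the most care is the MPIAIJ preallocation he mentions. Below is a minimal Fortran sketch of creating the rectangular M x N matrix with preallocation. The subroutine wrapper and the per-row estimates d_nz/o_nz are illustrative assumptions, not numbers from this thread. Note that the "diagonal" block being preallocated is the local-rows by local-columns block, so for a rectangular matrix it depends on the column layout as well as on the row count.

      subroutine create_rect_aij(mloc,nloc,Mglob,Nglob,A,ierr)
      implicit none
#include "include/finclude/petsc.h"
#include "include/finclude/petscmat.h"
      PetscInt       mloc,nloc,Mglob,Nglob
      Mat            A
      PetscErrorCode ierr
      PetscInt       d_nz,o_nz

!     Illustrative per-row estimates; replace with counts from the mesh
      d_nz = 12
      o_nz = 6
!     mloc/nloc are the local row/column sizes; PETSC_DECIDE may be
!     passed for them if PETSc should choose the layout
      call MatCreateMPIAIJ(PETSC_COMM_WORLD,mloc,nloc,Mglob,Nglob,
     &     d_nz,PETSC_NULL_INTEGER,o_nz,PETSC_NULL_INTEGER,A,ierr)
      end

With the preallocation right, assembly stays cheap, and the same counts can later be reused (minus the below-diagonal entries) if the matrix is switched to SBAIJ as Barry suggests.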
Maybe someone can point out the > obviously correct thing to do that I am missing. > > I am solving a linear elliptic PDE using linear FEM with Dirichlet > boundary conditions. To handle boundary conditions using dense > matrices in Matlab my strategy was to create an M x N matrix, A, where > M is the number of unknowns, and N is unknowns + boundary conditions > (i.e. N > M). Then I'd multiply A by a vector of 0s at unknowns and > boundary conditions at knowns, and subtract this from my RHS, b. > Finally I'd mask away the columns associated with the boundary > conditions of A to produce a square matrix A' = A(:, unknowns). Then > x = A' \ b. > > Scaling up to PETSc here are my concerns: > > * How do I best represent A? It is a symmetric matrix, but I'm not > sure how to find d_nz and o_nz. nz. The trouble is I don't have any > control over the bandwidth of the matrix (at least at the present). > Though I have no estimate for d_nz, I do have one for nz; maybe I'll > find better performance with an MPIAIJ? If such is the case would I > still be able to use KSPCG? > > * How do I best produce A' from A? Ideally this could be done without > any copying, since it's just a simple mask (i.e. remove the last N-M > columns). Do Sub matrix commands perform a copy? Is there a way to > avoid the copy? > > Thanks! > -Andrew > > From zonexo at gmail.com Mon Jun 9 21:46:50 2008 From: zonexo at gmail.com (Ben Tay) Date: Tue, 10 Jun 2008 10:46:50 +0800 Subject: Error using BoomerAMG with eqns Message-ID: <484DEB1A.7010400@gmail.com> Hi, I tried to use BoomerAMG as the preconditioner. When I use ./a.out -pc_type hypre -pc_hypre_type boomeramg, I got KSP Object: type: richardson Richardson: damping factor=1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 However, when I add the BoomerAMG into my code, I got [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[3]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [3]PETSC ERROR: to get more information on the crash. [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Signal received! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. 
[3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./a.out on a atlas3-mp named atlas3-c36 by g0306332 Tue Jun 10 10:36:34 2008 [3]PETSC ERROR: Libraries linked from /nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi [3]PETSC ERROR: Configure run at Tue Jan 8 22:22:08 2008 Here's part of my code: call MatAssemblyBegin(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) call VecAssemblyBegin(b_rhs_uv,ierr) call VecAssemblyEnd(b_rhs_uv,ierr) call VecAssemblyBegin(xx_uv,ierr) call VecAssemblyEnd(xx_uv,ierr) call KSPSetOperators(ksp_uv,A_mat_uv,A_mat_uv,SAME_NONZERO_PATTERN,ierr) call KSPGetPC(ksp_uv,pc_uv,ierr) ksptype=KSPRICHARDSON call KSPSetType(ksp_uv,ksptype,ierr) call PCSetType(pc,'hypre',ierr) call PCHYPRESetType(pc,'boomeramg',ierr) call KSPSetFromOptions(ksp_uv,ierr) tol=1.e-5 call KSPSetTolerances(ksp_uv,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr) call KSPSolve(ksp_uv,b_rhs_uv,xx_uv,ierr) call KSPGetConvergedReason(ksp_uv,reason,ierr) The matrix changes every timestep so I have to call the preconditioner every time. So what did I do wrong? Btw, the matrix is A_mat_uv, RHS is b_rhs_uv and the answer is xx_uv. Thank you very much. Regards. From balay at mcs.anl.gov Mon Jun 9 21:59:36 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 9 Jun 2008 21:59:36 -0500 (CDT) Subject: Error using BoomerAMG with eqns In-Reply-To: <484DEB1A.7010400@gmail.com> References: <484DEB1A.7010400@gmail.com> Message-ID: you should just use a debugger to determine where/why you get a segv Satish On Tue, 10 Jun 2008, Ben Tay wrote: > Hi, > > I tried to use BoomerAMG as the preconditioner. When I use ./a.out -pc_type > hypre -pc_hypre_type boomeramg, I got > > KSP Object: > type: richardson > Richardson: damping factor=1 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > > However, when I add the BoomerAMG into my code, I got > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably > memory access out of range > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[3]PETSC > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find > memory corruption errors > [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [3]PETSC ERROR: to get more information on the crash. > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Signal received! 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST > 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b > [3]PETSC ERROR: See docs/changes/index.html for recent updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [3]PETSC ERROR: See docs/index.html for manual pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: ./a.out on a atlas3-mp named atlas3-c36 by g0306332 Tue Jun 10 > 10:36:34 2008 > [3]PETSC ERROR: Libraries linked from > /nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi > [3]PETSC ERROR: Configure run at Tue Jan 8 22:22:08 2008 > > > Here's part of my code: > > call MatAssemblyBegin(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) > > call VecAssemblyBegin(b_rhs_uv,ierr) > call VecAssemblyEnd(b_rhs_uv,ierr) > > call VecAssemblyBegin(xx_uv,ierr) > call VecAssemblyEnd(xx_uv,ierr) > > call KSPSetOperators(ksp_uv,A_mat_uv,A_mat_uv,SAME_NONZERO_PATTERN,ierr) > > call KSPGetPC(ksp_uv,pc_uv,ierr) > > ksptype=KSPRICHARDSON > > call KSPSetType(ksp_uv,ksptype,ierr) > > call PCSetType(pc,'hypre',ierr) > > call PCHYPRESetType(pc,'boomeramg',ierr) > > call KSPSetFromOptions(ksp_uv,ierr) > > tol=1.e-5 > > call > KSPSetTolerances(ksp_uv,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr) > > call KSPSolve(ksp_uv,b_rhs_uv,xx_uv,ierr) > > call KSPGetConvergedReason(ksp_uv,reason,ierr) > > > The matrix changes every timestep so I have to call the preconditioner every > time. So what did I do wrong? Btw, the matrix is A_mat_uv, RHS is b_rhs_uv and > the answer is xx_uv. > > Thank you very much. > > Regards. > > > From bsmith at mcs.anl.gov Mon Jun 9 22:50:28 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Jun 2008 22:50:28 -0500 Subject: Error using BoomerAMG with eqns In-Reply-To: <484DEB1A.7010400@gmail.com> References: <484DEB1A.7010400@gmail.com> Message-ID: You use KSPGetPC(ksp_uv,pc_uv,ierr) but call PCSetType(pc,'hypre',ierr) this is why you should ALWAYS, always use implicit none Barry On Jun 9, 2008, at 9:46 PM, Ben Tay wrote: > Hi, > > I tried to use BoomerAMG as the preconditioner. 
When I use ./a.out - > pc_type hypre -pc_hypre_type boomeramg, I got > > KSP Object: > type: richardson > Richardson: damping factor=1 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > > However, when I add the BoomerAMG into my code, I got > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > [3]PETSC ERROR: Try option -start_in_debugger or - > on_error_attach_debugger > [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal > [3]PETSC ERROR: or try http://valgrind.org on linux or man > libgmalloc on Apple to find memory corruption errors > [3]PETSC ERROR: configure using --with-debugging=yes, recompile, > link, and run > [3]PETSC ERROR: to get more information on the crash. > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Signal received! > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 > 17:03:40 CST 2007 HG revision: > 414581156e67e55c761739b0deb119f7590d0f4b > [3]PETSC ERROR: See docs/changes/index.html for recent updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [3]PETSC ERROR: See docs/index.html for manual pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: ./a.out on a atlas3-mp named atlas3-c36 by g0306332 > Tue Jun 10 10:36:34 2008 > [3]PETSC ERROR: Libraries linked from /nfs/home/enduser/g0306332/ > petsc-2.3.3-p8/lib/atlas3-mpi > [3]PETSC ERROR: Configure run at Tue Jan 8 22:22:08 2008 > > > Here's part of my code: > > call MatAssemblyBegin(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) > > call VecAssemblyBegin(b_rhs_uv,ierr) > call VecAssemblyEnd(b_rhs_uv,ierr) > > call VecAssemblyBegin(xx_uv,ierr) > call VecAssemblyEnd(xx_uv,ierr) > > call > KSPSetOperators(ksp_uv,A_mat_uv,A_mat_uv,SAME_NONZERO_PATTERN,ierr) > > call KSPGetPC(ksp_uv,pc_uv,ierr) > > ksptype=KSPRICHARDSON > > call KSPSetType(ksp_uv,ksptype,ierr) > > call PCSetType(pc,'hypre',ierr) > > call PCHYPRESetType(pc,'boomeramg',ierr) > > call KSPSetFromOptions(ksp_uv,ierr) > > tol=1.e-5 > > call > KSPSetTolerances > (ksp_uv > ,tol > ,PETSC_DEFAULT_DOUBLE_PRECISION > ,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr) > > call KSPSolve(ksp_uv,b_rhs_uv,xx_uv,ierr) > > call KSPGetConvergedReason(ksp_uv,reason,ierr) > > > The matrix changes every timestep so I have to call the > preconditioner every time. So what did I do wrong? Btw, the matrix > is A_mat_uv, RHS is b_rhs_uv and the answer is xx_uv. > > Thank you very much. > > Regards. 
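For reference, a minimal corrected sketch of the preconditioner setup quoted above. It assumes the same surrounding declarations as the original code (KSP ksp_uv, PC pc_uv, Mat A_mat_uv, the ksptype string, PetscErrorCode ierr); the only change is that the PC calls are given pc_uv, the object actually obtained from KSPGetPC, instead of the undeclared name pc:

  call KSPSetOperators(ksp_uv,A_mat_uv,A_mat_uv,SAME_NONZERO_PATTERN,ierr)
  call KSPGetPC(ksp_uv,pc_uv,ierr)
  ksptype=KSPRICHARDSON
  call KSPSetType(ksp_uv,ksptype,ierr)
  call PCSetType(pc_uv,'hypre',ierr)            ! was: PCSetType(pc,...)
  call PCHYPRESetType(pc_uv,'boomeramg',ierr)   ! was: PCHYPRESetType(pc,...)
  call KSPSetFromOptions(ksp_uv,ierr)
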
> > From zonexo at gmail.com Tue Jun 10 09:21:17 2008 From: zonexo at gmail.com (Ben Tay) Date: Tue, 10 Jun 2008 22:21:17 +0800 Subject: Analysis of performance of parallel code as processors increase In-Reply-To: References: <4849DF38.6070104@gmail.com> Message-ID: <484E8DDD.9020209@gmail.com> Hi Barry, I found that when I use hypre, it is about twice as slow. I guess hypre does not work well with the linearised momentum eqn. I tried to use PCILU and PCICC and I got the error: No support for this operation for this object type! [1]PETSC ERROR: Matrix type mpiaij symbolic ICC! PCASM performs even worse. It seems like block jacobi is still the best. where did you find the no. of iterations? Are you saying that if I increase the no. of processors, the iteration nos must go down? Btw, I'm using the richardson solver. Other combi such as bcgs + hypre is much worse. Does it mean there are some other problems present and hence my code does not scale properly? Thank you very much. Regards Barry Smith wrote: > > You are not using hypre, you are using block Jacobi with ILU on the > blocks. > > The number of iterations goes from around 4000 to around 5000 in > going from 4 to 8 processes, > this is why you do not see such a great speedup. > > Barry > > On Jun 6, 2008, at 8:07 PM, Ben Tay wrote: > >> Hi, >> >> I have coded in parallel using PETSc and Hypre. I found that going >> from 1 to 4 processors gives an almost 4 times increase. However from >> 4 to 8 processors only increase performance by 1.2-1.5 instead of 2. >> >> Is the slowdown due to the size of the matrix being not large enough? >> Currently I am using 600x2160 to do the benchmark. Even when increase >> the matrix size to 900x3240 or 1200x2160, the performance increase >> is also not much. Is it possible to use -log_summary find out the >> error? I have attached the log file comparison for the 4 and 8 >> processors, I found that some event like VecScatterEnd, VecNorm and >> MatAssemblyBegin have much higher ratios. Does it indicate something? >> Another strange thing is that MatAssemblyBegin for the 4 pros has a >> much higher ratio than the 8pros. I thought there should be less >> communications for the 4 pros case, and so the ratio should be lower. >> Does it mean there's some communication problem at that time? >> >> Thank you very much. >> >> Regards >> >> >> ************************************************************************************************************************ >> >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./a.out on a atlas3-mp named atlas3-c43 with 4 processors, by >> g0306332 Fri Jun 6 17:29:26 2008 >> Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST >> 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >> >> Max Max/Min Avg Total >> Time (sec): 1.750e+03 1.00043 1.750e+03 >> Objects: 4.200e+01 1.00000 4.200e+01 >> Flops: 6.961e+10 1.00074 6.959e+10 2.784e+11 >> Flops/sec: 3.980e+07 1.00117 3.978e+07 1.591e+08 >> MPI Messages: 8.168e+03 2.00000 6.126e+03 2.450e+04 >> MPI Message Lengths: 5.525e+07 2.00000 6.764e+03 1.658e+08 >> MPI Reductions: 3.203e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length >> N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 1.7495e+03 100.0% 2.7837e+11 100.0% 2.450e+04 >> 100.0% 6.764e+03 100.0% 1.281e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) >> Flops/sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> --- Event Stage 0: Main Stage >> >> MatMult 4082 1.0 8.2037e+01 1.5 4.67e+08 1.5 2.4e+04 >> 6.8e+03 0.0e+00 4 37100100 0 4 37100100 0 1240 >> MatSolve 1976 1.0 1.3250e+02 1.5 2.52e+08 1.5 0.0e+00 >> 0.0e+00 0.0e+00 6 31 0 0 0 6 31 0 0 0 655 >> MatLUFactorNum 300 1.0 3.8260e+01 1.2 2.07e+08 1.2 0.0e+00 >> 0.0e+00 0.0e+00 2 9 0 0 0 2 9 0 0 0 668 >> MatILUFactorSym 1 1.0 2.2550e-01 2.7 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 1 1.0 2.9182e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyBegin 301 1.0 1.0776e+021228.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 6.0e+02 4 0 0 0 5 4 0 0 0 5 0 >> MatAssemblyEnd 301 1.0 9.6146e+00 1.1 0.00e+00 0.0 1.2e+01 >> 3.6e+03 3.1e+02 1 0 0 0 2 1 0 0 0 2 0 >> MatGetRow 324000 1.0 1.2161e-01 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 1.0 5.0068e-06 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 1 1.0 2.1279e-02 2.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 601 1.0 2.5108e-02 1.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 1.2353e+03 1.0 5.64e+07 1.0 2.4e+04 >> 6.8e+03 8.3e+03 71100100100 65 71100100100 65 225 >> PCSetUp 601 1.0 4.0116e+01 1.2 1.96e+08 1.2 0.0e+00 >> 0.0e+00 5.0e+00 2 9 0 0 0 2 9 0 0 0 637 >> PCSetUpOnBlocks 300 1.0 3.8513e+01 1.2 2.06e+08 1.2 0.0e+00 >> 0.0e+00 3.0e+00 2 9 0 0 0 2 9 0 0 0 664 >> PCApply 4682 1.0 1.0566e+03 1.0 2.12e+07 1.0 0.0e+00 >> 0.0e+00 0.0e+00 59 31 0 0 0 59 31 0 0 0 82 >> VecDot 4812 1.0 8.2762e+00 1.1 4.00e+08 1.1 0.0e+00 >> 0.0e+00 4.8e+03 0 4 0 0 38 0 4 0 0 38 1507 >> VecNorm 3479 1.0 9.2739e+01 8.3 3.15e+08 8.3 0.0e+00 >> 0.0e+00 3.5e+03 4 5 0 0 27 4 5 0 0 27 152 >> VecCopy 900 1.0 2.0819e+00 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 5882 1.0 9.4626e+00 1.5 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 5585 1.0 1.5397e+01 1.5 4.67e+08 1.5 0.0e+00 >> 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 1273 >> VecAYPX 2879 1.0 1.0303e+01 1.6 4.45e+08 1.6 0.0e+00 >> 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 1146 >> VecWAXPY 2406 1.0 7.7902e+00 1.6 3.14e+08 1.6 0.0e+00 >> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 801 >> VecAssemblyBegin 1200 1.0 8.4259e+00 3.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 0 0 0 0 28 0 0 0 0 28 0 >> VecAssemblyEnd 1200 1.0 2.4173e-03 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 4082 1.0 1.2512e-01 1.5 0.00e+00 0.0 2.4e+04 >> 6.8e+03 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 4082 1.0 2.0954e+0153.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' >> Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Matrix 7 7 321241092 0 >> Krylov Solver 3 3 8 0 >> Preconditioner 3 3 528 0 >> Index Set 7 7 7785600 0 >> Vec 20 20 46685344 0 >> Vec Scatter 2 2 0 0 >> ======================================================================================================================== >> >> Average time to get PetscTime(): 1.90735e-07 >> Average time for MPI_Barrier(): 1.45912e-05 >> Average time for zero size MPI_Send(): 7.27177e-06 >> OptionTable: -log_summary test4_600 >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 >> Configure run at: Tue Jan 8 22:22:08 2008 >> Configure options: --with-memcmp-ok --sizeof_char=1 --sizeof_void_p=8 >> --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8 >> --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 >> --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-vendor-compilers=intel >> --with-x=0 --with-hypre-dir=/home/enduser/g0306332/lib/hypre >> --with-debugging=0 --with-batch=1 --with-mpi-shared=0 >> --with-mpi-include=/usr/local/topspin/mpi/mpich/include >> --with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a >> --with-mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun >> --with-blas-lapack-dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 >> ----------------------------------------- >> Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 >> Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP Wed >> Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux >> Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 >> Using PETSc arch: atlas3-mpi >> ----------------------------------------- >> Using C compiler: mpicc -fPIC -O >> Using Fortran compiler: mpif90 -I. -fPIC -O >> ----------------------------------------- >> Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 >> -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi >> -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/include >> -I/home/enduser/g0306332/lib/hypre/include >> -I/usr/local/topspin/mpi/mpich/include >> ------------------------------------------ >> Using C linker: mpicc -fPIC -O >> Using Fortran linker: mpif90 -I. 
-fPIC -O >> Using libraries: >> -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi >> -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi -lpetscts >> -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc >> -Wl,-rpath,/home/enduser/g0306332/lib/hypre/lib >> -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/usr/local/topspin/mpi/mpich/lib >> -L/usr/local/topspin/mpi/mpich/lib -lmpich >> -Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t >> -L/opt/intel/cmkl/8.1.1/lib/em64t -lmkl_lapack -lmkl_em64t -lguide >> -lpthread -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib >> -ldl -lmpich -libverbs -libumad -lpthread -lrt >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 >> -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -lmpichf90nc >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/intel/fce/9.1.045/lib >> -L/opt/intel/fce/9.1.045/lib -lifport -lifcore -lm >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich >> -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs >> -libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -L/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 >> -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -ldl -lc >> ------------------------------------------ >> ************************************************************************************************************************ >> >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./a.out on a atlas3-mp named atlas3-c18 with 8 processors, by >> g0306332 Fri Jun 6 17:23:25 2008 >> Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST >> 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >> >> Max Max/Min Avg Total >> Time (sec): 1.140e+03 1.00019 1.140e+03 >> Objects: 4.200e+01 1.00000 4.200e+01 >> Flops: 4.620e+10 1.00158 4.619e+10 3.695e+11 >> Flops/sec: 4.053e+07 1.00177 4.051e+07 3.241e+08 >> MPI Messages: 9.954e+03 2.00000 8.710e+03 6.968e+04 >> MPI Message Lengths: 7.224e+07 2.00000 7.257e+03 5.057e+08 >> MPI Reductions: 1.716e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length >> N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 1.1402e+03 100.0% 3.6953e+11 100.0% 6.968e+04 >> 100.0% 7.257e+03 100.0% 1.372e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) >> Flops/sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> --- Event Stage 0: Main Stage >> >> MatMult 4975 1.0 7.8154e+01 1.9 4.19e+08 1.9 7.0e+04 >> 7.3e+03 0.0e+00 5 38100100 0 5 38100100 0 1798 >> MatSolve 2855 1.0 1.0870e+02 1.8 2.57e+08 1.8 0.0e+00 >> 0.0e+00 0.0e+00 7 34 0 0 0 7 34 0 0 0 1153 >> MatLUFactorNum 300 1.0 2.3238e+01 1.5 2.07e+08 1.5 0.0e+00 >> 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 1099 >> MatILUFactorSym 1 1.0 6.1973e-02 1.5 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 1 1.0 1.4168e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyBegin 301 1.0 6.9683e+01 8.6 0.00e+00 0.0 0.0e+00 >> 0.0e+00 6.0e+02 4 0 0 0 4 4 0 0 0 4 0 >> MatAssemblyEnd 301 1.0 6.2247e+00 1.2 0.00e+00 0.0 2.8e+01 >> 3.6e+03 3.1e+02 0 0 0 0 2 0 0 0 0 2 0 >> MatGetRow 162000 1.0 6.0330e-02 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 1.0 9.0599e-06 3.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 1 1.0 5.6710e-03 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 601 1.0 1.5631e-02 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 8.1668e+02 1.0 5.66e+07 1.0 7.0e+04 >> 7.3e+03 9.2e+03 72100100100 67 72100100100 67 452 >> PCSetUp 601 1.0 2.4372e+01 1.5 1.93e+08 1.5 0.0e+00 >> 0.0e+00 5.0e+00 2 7 0 0 0 2 7 0 0 0 1048 >> PCSetUpOnBlocks 300 1.0 2.3303e+01 1.5 2.07e+08 1.5 0.0e+00 >> 0.0e+00 3.0e+00 2 7 0 0 0 2 7 0 0 0 1096 >> PCApply 5575 1.0 6.5344e+02 1.1 2.57e+07 1.1 0.0e+00 >> 0.0e+00 0.0e+00 55 34 0 0 0 55 34 0 0 0 192 >> VecDot 4840 1.0 6.8932e+00 1.3 3.07e+08 1.3 0.0e+00 >> 0.0e+00 4.8e+03 1 3 0 0 35 1 3 0 0 35 1820 >> VecNorm 4365 1.0 1.2250e+02 3.6 6.82e+07 3.6 0.0e+00 >> 0.0e+00 4.4e+03 8 5 0 0 32 8 5 0 0 32 153 >> VecCopy 900 1.0 1.4297e+00 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 6775 1.0 8.1405e+00 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 6485 1.0 1.0003e+01 1.9 5.73e+08 1.9 0.0e+00 >> 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 2420 >> VecAYPX 3765 1.0 7.8289e+00 2.0 5.17e+08 2.0 0.0e+00 >> 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 2092 >> VecWAXPY 2420 1.0 3.8504e+00 1.9 3.80e+08 1.9 0.0e+00 >> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1629 >> VecAssemblyBegin 1200 1.0 9.2808e+00 3.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 1 0 0 0 26 1 0 0 0 26 0 >> VecAssemblyEnd 1200 1.0 2.3313e-03 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 4975 1.0 2.2727e-01 2.6 0.00e+00 0.0 7.0e+04 >> 7.3e+03 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 4975 1.0 2.7557e+0168.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' >> Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Matrix 7 7 160595412 0 >> Krylov Solver 3 3 8 0 >> Preconditioner 3 3 528 0 >> Index Set 7 7 3897600 0 >> Vec 20 20 23357344 0 >> Vec Scatter 2 2 0 0 >> ======================================================================================================================== >> >> Average time to get PetscTime(): 1.19209e-07 >> Average time for MPI_Barrier(): 2.10285e-05 >> Average time for zero size MPI_Send(): 7.59959e-06 >> OptionTable: -log_summary test8_600 >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 >> Configure run at: Tue Jan 8 22:22:08 2008 >> Configure options: --with-memcmp-ok --sizeof_char=1 --sizeof_void_p=8 >> --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 --sizeof_long_long=8 >> --sizeof_float=4 --sizeof_double=8 --bits_per_byte=8 >> --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with-vendor-compilers=intel >> --with-x=0 --with-hypre-dir=/home/enduser/g0306332/lib/hypre >> --with-debugging=0 --with-batch=1 --with-mpi-shared=0 >> --with-mpi-include=/usr/local/topspin/mpi/mpich/include >> --with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a >> --with-mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun >> --with-blas-lapack-dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 >> ----------------------------------------- >> Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 >> Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP Wed >> Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux >> Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 >> Using PETSc arch: atlas3-mpi >> ----------------------------------------- >> Using C compiler: mpicc -fPIC -O >> Using Fortran compiler: mpif90 -I. -fPIC -O >> ----------------------------------------- >> Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 >> -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi >> -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/include >> -I/home/enduser/g0306332/lib/hypre/include >> -I/usr/local/topspin/mpi/mpich/include >> ------------------------------------------ >> Using C linker: mpicc -fPIC -O >> Using Fortran linker: mpif90 -I. 
-fPIC -O >> Using libraries: >> -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi >> -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi -lpetscts >> -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc >> -Wl,-rpath,/home/enduser/g0306332/lib/hypre/lib >> -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/usr/local/topspin/mpi/mpich/lib >> -L/usr/local/topspin/mpi/mpich/lib -lmpich >> -Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t >> -L/opt/intel/cmkl/8.1.1/lib/em64t -lmkl_lapack -lmkl_em64t -lguide >> -lpthread -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib >> -ldl -lmpich -libverbs -libumad -lpthread -lrt >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 >> -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -lmpichf90nc >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/intel/fce/9.1.045/lib >> -L/opt/intel/fce/9.1.045/lib -lifport -lifcore -lm >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lm -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard >> -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -Wl,-rpath,/usr/local/ofed/lib64 >> -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib >> -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich >> -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs >> -libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib >> -L/opt/intel/cce/9.1.049/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ >> -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 >> -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s -ldl -lc >> ------------------------------------------ > > From zonexo at gmail.com Tue Jun 10 09:53:42 2008 From: zonexo at gmail.com (Ben Tay) Date: Tue, 10 Jun 2008 22:53:42 +0800 Subject: Error using BoomerAMG with eqns In-Reply-To: References: <484DEB1A.7010400@gmail.com> Message-ID: <484E9576.20400@gmail.com> Ya Thank you Barry. I also realised it. 
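With implicit none and explicit declarations, the pc/pc_uv mix-up becomes a compile-time error rather than a runtime segmentation fault. A minimal sketch of the relevant declarations, assuming the include layout used by the PETSc 2.3.3 Fortran examples and the variable names from the code quoted below:

      implicit none
#include "include/finclude/petsc.h"
#include "include/finclude/petscksp.h"
#include "include/finclude/petscpc.h"
      KSP            ksp_uv
      PC             pc_uv
      PetscErrorCode ierr
!     any reference to an undeclared name such as "pc" now fails to compile
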
Barry Smith wrote: > > You use KSPGetPC(ksp_uv,pc_uv,ierr) but call PCSetType(pc,'hypre',ierr) > > this is why you should ALWAYS, always use implicit none > > Barry > > > > On Jun 9, 2008, at 9:46 PM, Ben Tay wrote: > >> Hi, >> >> I tried to use BoomerAMG as the preconditioner. When I use ./a.out >> -pc_type hypre -pc_hypre_type boomeramg, I got >> >> KSP Object: >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: hypre >> HYPRE BoomerAMG preconditioning >> HYPRE BoomerAMG: Cycle type V >> HYPRE BoomerAMG: Maximum number of levels 25 >> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >> HYPRE BoomerAMG: Interpolation truncation factor 0 >> HYPRE BoomerAMG: Interpolation: max elements per row 0 >> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >> >> However, when I add the BoomerAMG into my code, I got >> >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> [3]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[3]PETSC >> ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple >> to find memory corruption errors >> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, >> link, and run >> [3]PETSC ERROR: to get more information on the crash. >> [3]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [3]PETSC ERROR: Signal received! >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> [3]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 >> 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >> [3]PETSC ERROR: See docs/changes/index.html for recent updates. >> [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [3]PETSC ERROR: See docs/index.html for manual pages. 
>> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> [3]PETSC ERROR: ./a.out on a atlas3-mp named atlas3-c36 by g0306332 >> Tue Jun 10 10:36:34 2008 >> [3]PETSC ERROR: Libraries linked from >> /nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/atlas3-mpi >> [3]PETSC ERROR: Configure run at Tue Jan 8 22:22:08 2008 >> >> >> Here's part of my code: >> >> call MatAssemblyBegin(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) >> call MatAssemblyEnd(A_mat_uv,MAT_FINAL_ASSEMBLY,ierr) >> >> call VecAssemblyBegin(b_rhs_uv,ierr) >> call VecAssemblyEnd(b_rhs_uv,ierr) >> >> call VecAssemblyBegin(xx_uv,ierr) >> call VecAssemblyEnd(xx_uv,ierr) >> >> call KSPSetOperators(ksp_uv,A_mat_uv,A_mat_uv,SAME_NONZERO_PATTERN,ierr) >> >> call KSPGetPC(ksp_uv,pc_uv,ierr) >> >> ksptype=KSPRICHARDSON >> >> call KSPSetType(ksp_uv,ksptype,ierr) >> >> call PCSetType(pc,'hypre',ierr) >> >> call PCHYPRESetType(pc,'boomeramg',ierr) >> >> call KSPSetFromOptions(ksp_uv,ierr) >> >> tol=1.e-5 >> >> call >> KSPSetTolerances(ksp_uv,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr) >> >> >> call KSPSolve(ksp_uv,b_rhs_uv,xx_uv,ierr) >> >> call KSPGetConvergedReason(ksp_uv,reason,ierr) >> >> >> The matrix changes every timestep so I have to call the >> preconditioner every time. So what did I do wrong? Btw, the matrix is >> A_mat_uv, RHS is b_rhs_uv and the answer is xx_uv. >> >> Thank you very much. >> >> Regards. >> >> > > From bsmith at mcs.anl.gov Tue Jun 10 15:37:47 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jun 2008 15:37:47 -0500 Subject: Analysis of performance of parallel code as processors increase In-Reply-To: <484E8DDD.9020209@gmail.com> References: <4849DF38.6070104@gmail.com> <484E8DDD.9020209@gmail.com> Message-ID: <36D407D4-AD2D-4D39-8FD3-6389ACF3DB2F@mcs.anl.gov> On Jun 10, 2008, at 9:21 AM, Ben Tay wrote: > Hi Barry, > > I found that when I use hypre, it is about twice as slow. I guess > hypre does not work well with the linearised momentum eqn. I tried > to use PCILU and PCICC and I got the error: hypre has several preconditioners; run with -pc_type hypre -help to get a list you set the type of hypre preconditioner with - pc_hypre_type then one of "pilut","parasails","boomeramg","euclid" > > > No support for this operation for this object type! > [1]PETSC ERROR: Matrix type mpiaij symbolic ICC! PETSc does not have its own parallel ICC or ILU > > > PCASM performs even worse. It seems like block jacobi is still the > best. where did you find the no. of iterations? Are you saying that > if I increase the no. of processors, the iteration nos must go down? With block Jacobi and ASM the number of iterations will INCREASE with more processes. Depending on the problem they may increase a tiny bit or they may increase A LOT. > > > Btw, I'm using the richardson solver. Other combi such as bcgs + > hypre is much worse. Adding any Krylov method like bcgs should pretty much always decrease the number of iterations needed and usually decrease the time over Richardson. > > > Does it mean there are some other problems present and hence my code > does not scale properly? There is no way to tell this. > > > Thank you very much. > > Regards > > > Barry Smith wrote: >> >> You are not using hypre, you are using block Jacobi with ILU on >> the blocks. >> >> The number of iterations goes from around 4000 to around 5000 in >> going from 4 to 8 processes, >> this is why you do not see such a great speedup. 
>> >> Barry >> >> On Jun 6, 2008, at 8:07 PM, Ben Tay wrote: >> >>> Hi, >>> >>> I have coded in parallel using PETSc and Hypre. I found that going >>> from 1 to 4 processors gives an almost 4 times increase. However >>> from 4 to 8 processors only increase performance by 1.2-1.5 >>> instead of 2. >>> >>> Is the slowdown due to the size of the matrix being not large >>> enough? Currently I am using 600x2160 to do the benchmark. Even >>> when increase the matrix size to 900x3240 or 1200x2160, the >>> performance increase is also not much. Is it possible to use - >>> log_summary find out the error? I have attached the log file >>> comparison for the 4 and 8 processors, I found that some event >>> like VecScatterEnd, VecNorm and MatAssemblyBegin have much higher >>> ratios. Does it indicate something? Another strange thing is that >>> MatAssemblyBegin for the 4 pros has a much higher ratio than the >>> 8pros. I thought there should be less communications for the 4 >>> pros case, and so the ratio should be lower. Does it mean there's >>> some communication problem at that time? >>> >>> Thank you very much. >>> >>> Regards >>> >>> >>> ************************************************************************************************************************ >>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use >>> 'enscript -r -fCourier9' to print this document *** >>> ************************************************************************************************************************ >>> >>> ---------------------------------------------- PETSc Performance >>> Summary: ---------------------------------------------- >>> >>> ./a.out on a atlas3-mp named atlas3-c43 with 4 processors, by >>> g0306332 Fri Jun 6 17:29:26 2008 >>> Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 >>> CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.750e+03 1.00043 1.750e+03 >>> Objects: 4.200e+01 1.00000 4.200e+01 >>> Flops: 6.961e+10 1.00074 6.959e+10 2.784e+11 >>> Flops/sec: 3.980e+07 1.00117 3.978e+07 1.591e+08 >>> MPI Messages: 8.168e+03 2.00000 6.126e+03 2.450e+04 >>> MPI Message Lengths: 5.525e+07 2.00000 6.764e+03 1.658e+08 >>> MPI Reductions: 3.203e+03 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of >>> length N --> 2N flops >>> and VecAXPY() for complex vectors of >>> length N --> 8N flops >>> >>> Summary of Stages: ----- Time ------ ----- Flops ----- --- >>> Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total counts >>> %Total Avg %Total counts %Total >>> 0: Main Stage: 1.7495e+03 100.0% 2.7837e+11 100.0% 2.450e >>> +04 100.0% 6.764e+03 100.0% 1.281e+04 100.0% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flops/sec: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all >>> processors >>> Mess: number of messages sent >>> Avg. len: average message length >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with >>> PetscLogStagePush() and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flops in >>> this phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >>> time over all processors) >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> >>> ########################################################## >>> # # >>> # WARNING!!! # >>> # # >>> # This code was run without the PreLoadBegin() # >>> # macros. To get timing results we always recommend # >>> # preloading. otherwise timing numbers may be # >>> # meaningless. # >>> ########################################################## >>> >>> >>> Event Count Time (sec) Flops/ >>> sec --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio Mess Avg >>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> MatMult 4082 1.0 8.2037e+01 1.5 4.67e+08 1.5 2.4e+04 >>> 6.8e+03 0.0e+00 4 37100100 0 4 37100100 0 1240 >>> MatSolve 1976 1.0 1.3250e+02 1.5 2.52e+08 1.5 0.0e+00 >>> 0.0e+00 0.0e+00 6 31 0 0 0 6 31 0 0 0 655 >>> MatLUFactorNum 300 1.0 3.8260e+01 1.2 2.07e+08 1.2 0.0e+00 >>> 0.0e+00 0.0e+00 2 9 0 0 0 2 9 0 0 0 668 >>> MatILUFactorSym 1 1.0 2.2550e-01 2.7 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatConvert 1 1.0 2.9182e-01 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAssemblyBegin 301 1.0 1.0776e+021228.9 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 6.0e+02 4 0 0 0 5 4 0 0 0 5 0 >>> MatAssemblyEnd 301 1.0 9.6146e+00 1.1 0.00e+00 0.0 1.2e+01 >>> 3.6e+03 3.1e+02 1 0 0 0 2 1 0 0 0 2 0 >>> MatGetRow 324000 1.0 1.2161e-01 1.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetRowIJ 3 1.0 5.0068e-06 1.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 1 1.0 2.1279e-02 2.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetup 601 1.0 2.5108e-02 1.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 600 1.0 1.2353e+03 1.0 5.64e+07 1.0 2.4e+04 >>> 6.8e+03 8.3e+03 71100100100 65 71100100100 65 225 >>> PCSetUp 601 1.0 4.0116e+01 1.2 1.96e+08 1.2 0.0e+00 >>> 0.0e+00 5.0e+00 2 9 0 0 0 2 9 0 0 0 637 >>> PCSetUpOnBlocks 300 1.0 3.8513e+01 1.2 2.06e+08 1.2 0.0e+00 >>> 0.0e+00 3.0e+00 2 9 0 0 0 2 9 0 0 0 664 >>> PCApply 4682 1.0 1.0566e+03 1.0 2.12e+07 1.0 0.0e+00 >>> 0.0e+00 0.0e+00 59 31 0 0 0 59 31 0 0 0 82 >>> VecDot 4812 1.0 8.2762e+00 1.1 4.00e+08 1.1 0.0e+00 >>> 0.0e+00 4.8e+03 0 4 0 0 38 0 4 0 0 38 1507 >>> VecNorm 3479 1.0 9.2739e+01 8.3 3.15e+08 8.3 0.0e+00 >>> 0.0e+00 3.5e+03 4 5 0 0 27 4 5 0 0 27 152 >>> VecCopy 900 1.0 2.0819e+00 1.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 5882 1.0 9.4626e+00 1.5 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 5585 1.0 1.5397e+01 1.5 4.67e+08 1.5 0.0e+00 >>> 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 1273 >>> VecAYPX 2879 1.0 1.0303e+01 1.6 4.45e+08 1.6 0.0e+00 >>> 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 1146 >>> VecWAXPY 2406 1.0 7.7902e+00 1.6 3.14e+08 1.6 0.0e+00 >>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 801 >>> VecAssemblyBegin 1200 1.0 8.4259e+00 3.8 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 3.6e+03 0 0 0 0 28 0 0 0 0 28 0 >>> VecAssemblyEnd 1200 1.0 2.4173e-03 1.3 
0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecScatterBegin 4082 1.0 1.2512e-01 1.5 0.00e+00 0.0 2.4e+04 >>> 6.8e+03 0.0e+00 0 0100100 0 0 0100100 0 0 >>> VecScatterEnd 4082 1.0 2.0954e+0153.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory >>> Descendants' Mem. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 7 7 321241092 0 >>> Krylov Solver 3 3 8 0 >>> Preconditioner 3 3 528 0 >>> Index Set 7 7 7785600 0 >>> Vec 20 20 46685344 0 >>> Vec Scatter 2 2 0 0 >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> ==================================================================== >>> Average time to get PetscTime(): 1.90735e-07 >>> Average time for MPI_Barrier(): 1.45912e-05 >>> Average time for zero size MPI_Send(): 7.27177e-06 >>> OptionTable: -log_summary test4_600 >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 >>> Configure run at: Tue Jan 8 22:22:08 2008 >>> Configure options: --with-memcmp-ok --sizeof_char=1 -- >>> sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 -- >>> sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 -- >>> bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with- >>> vendor-compilers=intel --with-x=0 --with-hypre-dir=/home/enduser/ >>> g0306332/lib/hypre --with-debugging=0 --with-batch=1 --with-mpi- >>> shared=0 --with-mpi-include=/usr/local/topspin/mpi/mpich/include -- >>> with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a --with- >>> mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun --with-blas-lapack- >>> dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 >>> ----------------------------------------- >>> Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 >>> Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP >>> Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux >>> Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 >>> Using PETSc arch: atlas3-mpi >>> ----------------------------------------- >>> Using C compiler: mpicc -fPIC -O >>> Using Fortran compiler: mpif90 -I. -fPIC -O >>> ----------------------------------------- >>> Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 - >>> I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi -I/nfs/ >>> home/enduser/g0306332/petsc-2.3.3-p8/include -I/home/enduser/ >>> g0306332/lib/hypre/include -I/usr/local/topspin/mpi/mpich/include >>> ------------------------------------------ >>> Using C linker: mpicc -fPIC -O >>> Using Fortran linker: mpif90 -I. 
-fPIC -O >>> Using libraries: -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3- >>> p8/lib/atlas3-mpi -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/ >>> atlas3-mpi -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat - >>> lpetscvec -lpetsc -Wl,-rpath,/home/enduser/g0306332/lib/ >>> hypre/lib -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ - >>> lcxaguard -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ >>> local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ >>> usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- >>> rpath,/usr/local/topspin/mpi/mpich/lib -L/usr/local/topspin/mpi/ >>> mpich/lib -lmpich -Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t -L/ >>> opt/intel/cmkl/8.1.1/lib/em64t -lmkl_lapack -lmkl_em64t -lguide - >>> lpthread -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 - >>> Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/ >>> lib -ldl -lmpich -libverbs -libumad -lpthread -lrt -Wl,-rpath,/opt/ >>> intel/cce/9.1.049/lib -L/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/ >>> lib/gcc/x86_64-redhat-linux/3.4.6/ -L/usr/lib/gcc/x86_64-redhat- >>> linux/3.4.6/ -Wl,-rpath,/usr/lib64 -L/usr/lib64 -lsvml -limf - >>> lipgo -lirc -lgcc_s -lirc_s -lmpichf90nc -Wl,-rpath,/opt/mvapich/ >>> 0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,-rpath,/opt/ >>> intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ >>> 3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/intel/fce/9.1.045/lib >>> -L/opt/intel/fce/9.1.045/lib -lifport -lifcore -lm -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lm -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich - >>> Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs - >>> libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/ >>> opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat- >>> linux/3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/ >>> usr/lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s - >>> ldl -lc >>> ------------------------------------------ >>> ************************************************************************************************************************ >>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use >>> 'enscript -r -fCourier9' to print this document *** >>> ************************************************************************************************************************ >>> >>> ---------------------------------------------- PETSc Performance >>> Summary: ---------------------------------------------- >>> >>> ./a.out on a atlas3-mp named atlas3-c18 with 8 processors, by >>> g0306332 Fri Jun 6 17:23:25 2008 >>> Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 >>> CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.140e+03 1.00019 1.140e+03 >>> Objects: 4.200e+01 1.00000 4.200e+01 >>> Flops: 4.620e+10 1.00158 4.619e+10 3.695e+11 >>> Flops/sec: 4.053e+07 1.00177 4.051e+07 3.241e+08 >>> MPI Messages: 9.954e+03 2.00000 8.710e+03 6.968e+04 >>> MPI Message Lengths: 7.224e+07 2.00000 7.257e+03 5.057e+08 >>> MPI Reductions: 1.716e+03 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of >>> length N --> 2N flops >>> and VecAXPY() for complex vectors of >>> length N --> 8N flops >>> >>> Summary of Stages: ----- Time ------ ----- Flops ----- --- >>> Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total counts >>> %Total Avg %Total counts %Total >>> 0: Main Stage: 1.1402e+03 100.0% 3.6953e+11 100.0% 6.968e >>> +04 100.0% 7.257e+03 100.0% 1.372e+04 100.0% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flops/sec: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all >>> processors >>> Mess: number of messages sent >>> Avg. len: average message length >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with >>> PetscLogStagePush() and PetscLogStagePop(). >>> %T - percent time in this phase %F - percent flops in >>> this phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >>> time over all processors) >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> >>> ########################################################## >>> # # >>> # WARNING!!! # >>> # # >>> # This code was run without the PreLoadBegin() # >>> # macros. To get timing results we always recommend # >>> # preloading. otherwise timing numbers may be # >>> # meaningless. 
# >>> ########################################################## >>> >>> >>> Event Count Time (sec) Flops/ >>> sec --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio Mess Avg >>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> MatMult 4975 1.0 7.8154e+01 1.9 4.19e+08 1.9 7.0e+04 >>> 7.3e+03 0.0e+00 5 38100100 0 5 38100100 0 1798 >>> MatSolve 2855 1.0 1.0870e+02 1.8 2.57e+08 1.8 0.0e+00 >>> 0.0e+00 0.0e+00 7 34 0 0 0 7 34 0 0 0 1153 >>> MatLUFactorNum 300 1.0 2.3238e+01 1.5 2.07e+08 1.5 0.0e+00 >>> 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 1099 >>> MatILUFactorSym 1 1.0 6.1973e-02 1.5 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatConvert 1 1.0 1.4168e-01 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAssemblyBegin 301 1.0 6.9683e+01 8.6 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 6.0e+02 4 0 0 0 4 4 0 0 0 4 0 >>> MatAssemblyEnd 301 1.0 6.2247e+00 1.2 0.00e+00 0.0 2.8e+01 >>> 3.6e+03 3.1e+02 0 0 0 0 2 0 0 0 0 2 0 >>> MatGetRow 162000 1.0 6.0330e-02 1.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetRowIJ 3 1.0 9.0599e-06 3.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 1 1.0 5.6710e-03 1.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetup 601 1.0 1.5631e-02 1.1 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 600 1.0 8.1668e+02 1.0 5.66e+07 1.0 7.0e+04 >>> 7.3e+03 9.2e+03 72100100100 67 72100100100 67 452 >>> PCSetUp 601 1.0 2.4372e+01 1.5 1.93e+08 1.5 0.0e+00 >>> 0.0e+00 5.0e+00 2 7 0 0 0 2 7 0 0 0 1048 >>> PCSetUpOnBlocks 300 1.0 2.3303e+01 1.5 2.07e+08 1.5 0.0e+00 >>> 0.0e+00 3.0e+00 2 7 0 0 0 2 7 0 0 0 1096 >>> PCApply 5575 1.0 6.5344e+02 1.1 2.57e+07 1.1 0.0e+00 >>> 0.0e+00 0.0e+00 55 34 0 0 0 55 34 0 0 0 192 >>> VecDot 4840 1.0 6.8932e+00 1.3 3.07e+08 1.3 0.0e+00 >>> 0.0e+00 4.8e+03 1 3 0 0 35 1 3 0 0 35 1820 >>> VecNorm 4365 1.0 1.2250e+02 3.6 6.82e+07 3.6 0.0e+00 >>> 0.0e+00 4.4e+03 8 5 0 0 32 8 5 0 0 32 153 >>> VecCopy 900 1.0 1.4297e+00 1.8 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 6775 1.0 8.1405e+00 1.8 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 6485 1.0 1.0003e+01 1.9 5.73e+08 1.9 0.0e+00 >>> 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 2420 >>> VecAYPX 3765 1.0 7.8289e+00 2.0 5.17e+08 2.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 2092 >>> VecWAXPY 2420 1.0 3.8504e+00 1.9 3.80e+08 1.9 0.0e+00 >>> 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1629 >>> VecAssemblyBegin 1200 1.0 9.2808e+00 3.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 3.6e+03 1 0 0 0 26 1 0 0 0 26 0 >>> VecAssemblyEnd 1200 1.0 2.3313e-03 1.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecScatterBegin 4975 1.0 2.2727e-01 2.6 0.00e+00 0.0 7.0e+04 >>> 7.3e+03 0.0e+00 0 0100100 0 0 0100100 0 0 >>> VecScatterEnd 4975 1.0 2.7557e+0168.1 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory >>> Descendants' Mem. 
>>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 7 7 160595412 0 >>> Krylov Solver 3 3 8 0 >>> Preconditioner 3 3 528 0 >>> Index Set 7 7 3897600 0 >>> Vec 20 20 23357344 0 >>> Vec Scatter 2 2 0 0 >>> ======================================================================================================================== >>> Average time to get PetscTime(): 1.19209e-07 >>> Average time for MPI_Barrier(): 2.10285e-05 >>> Average time for zero size MPI_Send(): 7.59959e-06 >>> OptionTable: -log_summary test8_600 >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 >>> Configure run at: Tue Jan 8 22:22:08 2008 >>> Configure options: --with-memcmp-ok --sizeof_char=1 -- >>> sizeof_void_p=8 --sizeof_short=2 --sizeof_int=4 --sizeof_long=8 -- >>> sizeof_long_long=8 --sizeof_float=4 --sizeof_double=8 -- >>> bits_per_byte=8 --sizeof_MPI_Comm=4 --sizeof_MPI_Fint=4 --with- >>> vendor-compilers=intel --with-x=0 --with-hypre-dir=/home/enduser/ >>> g0306332/lib/hypre --with-debugging=0 --with-batch=1 --with-mpi- >>> shared=0 --with-mpi-include=/usr/local/topspin/mpi/mpich/include -- >>> with-mpi-lib=/usr/local/topspin/mpi/mpich/lib/libmpich.a --with- >>> mpirun=/usr/local/topspin/mpi/mpich/bin/mpirun --with-blas-lapack- >>> dir=/opt/intel/cmkl/8.1.1/lib/em64t --with-shared=0 >>> ----------------------------------------- >>> Libraries compiled on Tue Jan 8 22:34:13 SGT 2008 on atlas3-c01 >>> Machine characteristics: Linux atlas3-c01 2.6.9-42.ELsmp #1 SMP >>> Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux >>> Using PETSc directory: /nfs/home/enduser/g0306332/petsc-2.3.3-p8 >>> Using PETSc arch: atlas3-mpi >>> ----------------------------------------- >>> Using C compiler: mpicc -fPIC -O >>> Using Fortran compiler: mpif90 -I. -fPIC -O >>> ----------------------------------------- >>> Using include paths: -I/nfs/home/enduser/g0306332/petsc-2.3.3-p8 - >>> I/nfs/home/enduser/g0306332/petsc-2.3.3-p8/bmake/atlas3-mpi -I/nfs/ >>> home/enduser/g0306332/petsc-2.3.3-p8/include -I/home/enduser/ >>> g0306332/lib/hypre/include -I/usr/local/topspin/mpi/mpich/include >>> ------------------------------------------ >>> Using C linker: mpicc -fPIC -O >>> Using Fortran linker: mpif90 -I. 
-fPIC -O >>> Using libraries: -Wl,-rpath,/nfs/home/enduser/g0306332/petsc-2.3.3- >>> p8/lib/atlas3-mpi -L/nfs/home/enduser/g0306332/petsc-2.3.3-p8/lib/ >>> atlas3-mpi -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat - >>> lpetscvec -lpetsc -Wl,-rpath,/home/enduser/g0306332/lib/ >>> hypre/lib -L/home/enduser/g0306332/lib/hypre/lib -lHYPRE -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ - >>> lcxaguard -Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/ >>> local/ofed/lib64 -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/ >>> usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,- >>> rpath,/usr/local/topspin/mpi/mpich/lib -L/usr/local/topspin/mpi/ >>> mpich/lib -lmpich -Wl,-rpath,/opt/intel/cmkl/8.1.1/lib/em64t -L/ >>> opt/intel/cmkl/8.1.1/lib/em64t -lmkl_lapack -lmkl_em64t -lguide - >>> lpthread -Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 - >>> Wl,-rpath,/opt/mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/ >>> lib -ldl -lmpich -libverbs -libumad -lpthread -lrt -Wl,-rpath,/opt/ >>> intel/cce/9.1.049/lib -L/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/ >>> lib/gcc/x86_64-redhat-linux/3.4.6/ -L/usr/lib/gcc/x86_64-redhat- >>> linux/3.4.6/ -Wl,-rpath,/usr/lib64 -L/usr/lib64 -lsvml -limf - >>> lipgo -lirc -lgcc_s -lirc_s -lmpichf90nc -Wl,-rpath,/opt/mvapich/ >>> 0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,-rpath,/opt/ >>> intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/ >>> 3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/intel/fce/9.1.045/lib >>> -L/opt/intel/fce/9.1.045/lib -lifport -lifcore -lm -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lm -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 -Wl,- >>> rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64- >>> redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -lstdc++ -lcxaguard -Wl,- >>> rpath,/opt/mvapich/0.9.9/gen2/lib -Wl,-rpath,/usr/local/ofed/lib64 >>> -Wl,-rpath,/opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/ >>> x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/usr/lib64 -Wl,-rpath,/opt/ >>> mvapich/0.9.9/gen2/lib -L/opt/mvapich/0.9.9/gen2/lib -ldl -lmpich - >>> Wl,-rpath,/usr/local/ofed/lib64 -L/usr/local/ofed/lib64 -libverbs - >>> libumad -lpthread -lrt -Wl,-rpath,/opt/intel/cce/9.1.049/lib -L/ >>> opt/intel/cce/9.1.049/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat- >>> linux/3.4.6/ -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/ -Wl,-rpath,/ >>> usr/lib64 -L/usr/lib64 -lsvml -limf -lipgo -lirc -lgcc_s -lirc_s - >>> ldl -lc >>> ------------------------------------------ >> >> > > From rlmackie862 at gmail.com Tue Jun 10 17:23:21 2008 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 10 Jun 2008 15:23:21 -0700 Subject: question on Warning messages Message-ID: <484EFED9.5030201@gmail.com> I was running my PETSc code with -info, and I noticed a 
bunch of warnings that said: Efficiency warning, copying array in XXXGetArray() due to alignment differences between C and Fortran. My code is written in Fortran, and these must be coming from all the VecGetArray calls I make, but is this a serious issue and is there some way to get proper alignment between C and Fortran? Thanks, Randy From balay at mcs.anl.gov Tue Jun 10 17:31:13 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 10 Jun 2008 17:31:13 -0500 (CDT) Subject: question on Warning messages In-Reply-To: <484EFED9.5030201@gmail.com> References: <484EFED9.5030201@gmail.com> Message-ID: On Tue, 10 Jun 2008, Randall Mackie wrote: > I was running my PETSc code with -info, and I noticed a bunch of > warnings that said: > > Efficiency warning, copying array in XXXGetArray() due to alignment > differences between C and Fortran. > > My code is written in Fortran, and these must be coming from all > the VecGetArray calls I make, but is this a serious issue and is > there some way to get proper alignment between C and Fortran? The problem here is: the variable you specify for offset is not aligned the same way as the array. And there is no way to specify [to compiler] how variables should be alighned. You can use VecGetArrayF90() [with a f90 compiler] - to avoid this problem. Satish From bsmith at mcs.anl.gov Tue Jun 10 23:27:13 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 10 Jun 2008 23:27:13 -0500 Subject: question on Warning messages In-Reply-To: References: <484EFED9.5030201@gmail.com> Message-ID: <4DD8941B-37D2-46EE-97DC-39766F9FD71F@mcs.anl.gov> Likely you want to switch to using VecGetArrayF90() anyways. All Fortran compilers are F90 now and it makes codes cleaner. In fact, in our next release we will make F90 support the default and F77 the optional. Barry On Jun 10, 2008, at 5:31 PM, Satish Balay wrote: > On Tue, 10 Jun 2008, Randall Mackie wrote: > >> I was running my PETSc code with -info, and I noticed a bunch of >> warnings that said: >> >> Efficiency warning, copying array in XXXGetArray() due to alignment >> differences between C and Fortran. >> >> My code is written in Fortran, and these must be coming from all >> the VecGetArray calls I make, but is this a serious issue and is >> there some way to get proper alignment between C and Fortran? > > The problem here is: the variable you specify for offset is not > aligned the same way as the array. And there is no way to specify [to > compiler] how variables should be alighned. > > You can use VecGetArrayF90() [with a f90 compiler] - to avoid this > problem. > > Satish > > From etienne.perchat at transvalor.com Wed Jun 11 11:18:42 2008 From: etienne.perchat at transvalor.com (Etienne PERCHAT) Date: Wed, 11 Jun 2008 18:18:42 +0200 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 Message-ID: <9113A52E1096EB41B1F88DD94C4369D5216A35@EXCHSRV.transvalor.com> Hi Barry, I don't understand why but -log_summary does not produce anything. I've checked that the code has been compiled with DPETSC_USE_LOG. Any tips ? Thanks, Etienne -----Message d'origine----- De?: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] De la part de Barry Smith Envoy??: vendredi 6 juin 2008 16:39 ??: petsc-users at mcs.anl.gov Objet?: Re: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 If you run both cases with -log_summary you'll see where it is spending more time. Mail along the -log_summary output. 
Barry On Jun 6, 2008, at 6:27 AM, Etienne PERCHAT wrote: > Hi Barry, > > Using -ksp_truemonitor does not change anything. I have the same > number of iterations in the 3 cases (v2.3.3p8 + true_monitor, > v2.3.3, v2.3.0) and the same final residual norm but with v2.3.0 the > code is slightly faster. > > I have to stress that I also monitor convergence with KSPMonitorSet > in order to store the "best residual" and the "best solution" (I did > not test if it is still useful in 2.3.3). > > Etienne > > -----Message d'origine----- > De : owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] De la part de Barry Smith > Envoy? : jeudi 5 juin 2008 20:28 > ? : petsc-users at mcs.anl.gov > Objet : Re: KSPCR with PCILU generally slower with v2.3.3 than with > 2.3.0 > > > This could also be the result of changes in the way we test for > convergence. > Try with -ksp_truemonitor and see if it is using the same number of > iterations > to converge. > > Barry > > On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: > >> Hello, >> >> I'm solving MPIBAIJI matrices using KSPCR preconditioned by a >> PCBJACOBI >> between sub domains and PCILU with a fill ratio of 1 on each sub >> domain. >> >> Solving exactly the same linear systems, I notice that the solution >> is >> slower using v2.3.3p8 than with v2.3.0. >> The behavior is not very clear: v2.3.3p8 is generally slower (10 % or >> more) but on some systems it happen to be really faster (may be 40 >> %). >> >> Unhappily in my case at the end I'm always slower ... >> >> Did somebody noticed such behavior? Is there changes if I upgrade to >> current 2.3.3p13 version (I've seen a fix:" fix error with using >> mpirowbs & icc & ksp_view") ? >> >> Thanks, >> Etienne Perchat >> >> > > > From bsmith at mcs.anl.gov Wed Jun 11 11:24:17 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 11 Jun 2008 11:24:17 -0500 Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 In-Reply-To: <9113A52E1096EB41B1F88DD94C4369D5216A35@EXCHSRV.transvalor.com> References: <9113A52E1096EB41B1F88DD94C4369D5216A35@EXCHSRV.transvalor.com> Message-ID: <35EA1C45-4014-412D-8CF4-14D4C811BA56@mcs.anl.gov> Make sure you have a PetscFinalize() that the code reaches. Barry On Jun 11, 2008, at 11:18 AM, Etienne PERCHAT wrote: > Hi Barry, > > I don't understand why but -log_summary does not produce anything. > I've checked that the code has been compiled with DPETSC_USE_LOG. > > Any tips ? > Thanks, > > Etienne > > -----Message d'origine----- > De : owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] De la part de Barry Smith > Envoy? : vendredi 6 juin 2008 16:39 > ? : petsc-users at mcs.anl.gov > Objet : Re: KSPCR with PCILU generally slower with v2.3.3 than with > 2.3.0 > > > If you run both cases with -log_summary you'll see where it is > spending more time. > Mail along the -log_summary output. > > Barry > > On Jun 6, 2008, at 6:27 AM, Etienne PERCHAT wrote: > >> Hi Barry, >> >> Using -ksp_truemonitor does not change anything. I have the same >> number of iterations in the 3 cases (v2.3.3p8 + true_monitor, >> v2.3.3, v2.3.0) and the same final residual norm but with v2.3.0 the >> code is slightly faster. >> >> I have to stress that I also monitor convergence with KSPMonitorSet >> in order to store the "best residual" and the "best solution" (I did >> not test if it is still useful in 2.3.3). 
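A minimal sketch of the KSPMonitorSet() pattern described above (keeping the smallest residual norm seen so far and the corresponding iterate); the context and routine names BestCtx and MyBestMonitor are illustrative only, not taken from Etienne's code:

  #include "petscksp.h"

  /* user context: smallest residual norm seen so far plus a copy of that iterate */
  typedef struct {
    PetscReal best_rnorm;
    Vec       best_x;
  } BestCtx;

  /* called by the KSP at every iteration */
  static PetscErrorCode MyBestMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
  {
    BestCtx        *b = (BestCtx*)ctx;
    Vec             x;
    PetscErrorCode  ierr;

    PetscFunctionBegin;
    if (rnorm < b->best_rnorm) {
      b->best_rnorm = rnorm;
      ierr = KSPBuildSolution(ksp, PETSC_NULL, &x);CHKERRQ(ierr);  /* current iterate */
      ierr = VecCopy(x, b->best_x);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }

The monitor is registered once, e.g. KSPMonitorSet(ksp, MyBestMonitor, &bctx, PETSC_NULL), with bctx.best_rnorm initialised to a huge value and bctx.best_x created with VecDuplicate() from the solution vector.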
>> >> Etienne >> >> -----Message d'origine----- >> De : owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov >> ] De la part de Barry Smith >> Envoy? : jeudi 5 juin 2008 20:28 >> ? : petsc-users at mcs.anl.gov >> Objet : Re: KSPCR with PCILU generally slower with v2.3.3 than with >> 2.3.0 >> >> >> This could also be the result of changes in the way we test for >> convergence. >> Try with -ksp_truemonitor and see if it is using the same number of >> iterations >> to converge. >> >> Barry >> >> On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: >> >>> Hello, >>> >>> I'm solving MPIBAIJI matrices using KSPCR preconditioned by a >>> PCBJACOBI >>> between sub domains and PCILU with a fill ratio of 1 on each sub >>> domain. >>> >>> Solving exactly the same linear systems, I notice that the solution >>> is >>> slower using v2.3.3p8 than with v2.3.0. >>> The behavior is not very clear: v2.3.3p8 is generally slower (10 % >>> or >>> more) but on some systems it happen to be really faster (may be 40 >>> %). >>> >>> Unhappily in my case at the end I'm always slower ... >>> >>> Did somebody noticed such behavior? Is there changes if I upgrade to >>> current 2.3.3p13 version (I've seen a fix:" fix error with using >>> mpirowbs & icc & ksp_view") ? >>> >>> Thanks, >>> Etienne Perchat >>> >>> >> >> >> > > > From balay at mcs.anl.gov Wed Jun 11 11:24:12 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Jun 2008 11:24:12 -0500 (CDT) Subject: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 In-Reply-To: <9113A52E1096EB41B1F88DD94C4369D5216A35@EXCHSRV.transvalor.com> References: <9113A52E1096EB41B1F88DD94C4369D5216A35@EXCHSRV.transvalor.com> Message-ID: Does -log_summary with an example code - say src/ksp/ksp/examples/tutorials/ex2.c or ex2f.F Perhaps you are missing a call to PetscFinalize() in your code? Satish On Wed, 11 Jun 2008, Etienne PERCHAT wrote: > Hi Barry, > > I don't understand why but -log_summary does not produce anything. > I've checked that the code has been compiled with DPETSC_USE_LOG. > > Any tips ? > Thanks, > > Etienne > > -----Message d'origine----- > De?: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] De la part de Barry Smith > Envoy??: vendredi 6 juin 2008 16:39 > ??: petsc-users at mcs.anl.gov > Objet?: Re: KSPCR with PCILU generally slower with v2.3.3 than with 2.3.0 > > > If you run both cases with -log_summary you'll see where it is > spending more time. > Mail along the -log_summary output. > > Barry > > On Jun 6, 2008, at 6:27 AM, Etienne PERCHAT wrote: > > > Hi Barry, > > > > Using -ksp_truemonitor does not change anything. I have the same > > number of iterations in the 3 cases (v2.3.3p8 + true_monitor, > > v2.3.3, v2.3.0) and the same final residual norm but with v2.3.0 the > > code is slightly faster. > > > > I have to stress that I also monitor convergence with KSPMonitorSet > > in order to store the "best residual" and the "best solution" (I did > > not test if it is still useful in 2.3.3). > > > > Etienne > > > > -----Message d'origine----- > > De : owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > > ] De la part de Barry Smith > > Envoy? : jeudi 5 juin 2008 20:28 > > ? : petsc-users at mcs.anl.gov > > Objet : Re: KSPCR with PCILU generally slower with v2.3.3 than with > > 2.3.0 > > > > > > This could also be the result of changes in the way we test for > > convergence. > > Try with -ksp_truemonitor and see if it is using the same number of > > iterations > > to converge. 
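On the -log_summary question above: the summary is printed from inside PetscFinalize(), so the option only produces output when the program actually reaches that call. A minimal skeleton:

  #include "petscksp.h"

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);

    /* ... create the Mat/Vec/KSP objects, solve, destroy them ... */

    ierr = PetscFinalize();CHKERRQ(ierr);  /* -log_summary output is written here */
    return 0;
  }

If the run aborts (for example via MPI_Abort()) or returns before PetscFinalize(), no summary appears even though the library was built with PETSC_USE_LOG.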
> > > > Barry > > > > On Jun 5, 2008, at 5:35 AM, Etienne PERCHAT wrote: > > > >> Hello, > >> > >> I'm solving MPIBAIJI matrices using KSPCR preconditioned by a > >> PCBJACOBI > >> between sub domains and PCILU with a fill ratio of 1 on each sub > >> domain. > >> > >> Solving exactly the same linear systems, I notice that the solution > >> is > >> slower using v2.3.3p8 than with v2.3.0. > >> The behavior is not very clear: v2.3.3p8 is generally slower (10 % or > >> more) but on some systems it happen to be really faster (may be 40 > >> %). > >> > >> Unhappily in my case at the end I'm always slower ... > >> > >> Did somebody noticed such behavior? Is there changes if I upgrade to > >> current 2.3.3p13 version (I've seen a fix:" fix error with using > >> mpirowbs & icc & ksp_view") ? > >> > >> Thanks, > >> Etienne Perchat > >> > >> > > > > > > > > > > From tsjb00 at hotmail.com Wed Jun 11 15:42:24 2008 From: tsjb00 at hotmail.com (tsjb00) Date: Wed, 11 Jun 2008 20:42:24 +0000 Subject: question about Petsc_Real In-Reply-To: References: <484EFED9.5030201@gmail.com> Message-ID: Have a stupid question about the precision of Petsc_Real (float, double, long double...) Is it defined when installing the PETSC package? How? Once the PETSC is installed, can I still change the precision of Petsc_Real for different application code? How? Do I have to recompile the PETSC package? Many thanks in advance! BJ _________________________________________________________________ ???MSN??????????????????? http://mobile.msn.com.cn/ From balay at mcs.anl.gov Wed Jun 11 16:08:17 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Jun 2008 16:08:17 -0500 (CDT) Subject: question about Petsc_Real In-Reply-To: References: <484EFED9.5030201@gmail.com> Message-ID: On Wed, 11 Jun 2008, tsjb00 wrote: > > Have a stupid question about the precision of Petsc_Real (float, double, long double...) > Is it defined when installing the PETSC package? How? yes with the configure option --with-precision=double. However - precisions other than double are not well tested. > Once the PETSC is installed, can I still change the precision of > Petsc_Real for different application code? How? nope. > Do I have to recompile the PETSC package? yes you have to recompile. However you can use a different PETSC_ARCH for this build - so that both builds are useable. [you can switch between them by using the correct PETSC_ARCH value with make - when compiling your code] Satish From tsjb00 at hotmail.com Wed Jun 11 17:10:35 2008 From: tsjb00 at hotmail.com (tsjb00) Date: Wed, 11 Jun 2008 22:10:35 +0000 Subject: question about Petsc_Real In-Reply-To: References: <484EFED9.5030201@gmail.com> Message-ID: Thanks for the reply! ---------------------------------------- > Date: Wed, 11 Jun 2008 16:08:17 -0500 > From: balay at mcs.anl.gov > To: petsc-users at mcs.anl.gov > Subject: Re: question about Petsc_Real > > On Wed, 11 Jun 2008, tsjb00 wrote: > >> >> Have a stupid question about the precision of Petsc_Real (float, double, long double...) > > >> Is it defined when installing the PETSC package? How? > > yes with the configure option --with-precision=double. However - > precisions other than double are not well tested. > >> Once the PETSC is installed, can I still change the precision of >> Petsc_Real for different application code? How? > > nope. > >> Do I have to recompile the PETSC package? > > yes you have to recompile. However you can use a different PETSC_ARCH > for this build - so that both builds are useable. 
[you can switch > between them by using the correct PETSC_ARCH value with make - when > compiling your code] > > Satish > _________________________________________________________________ Windows Live Photo gallery ????????????????????????????? http://get.live.cn/product/photo.html From zonexo at gmail.com Wed Jun 11 19:33:21 2008 From: zonexo at gmail.com (Ben Tay) Date: Thu, 12 Jun 2008 08:33:21 +0800 Subject: Getting BlockSolve95 Message-ID: <48506ED1.7060903@gmail.com> Hi, I am trying to install PETSc with BlockSolve95. Due to firewall issues, I need to download separately BlockSolve95. However, when I logged into ftp://ftp.mcs.anl.gov/pub/petsc/externalpackages/, I can't find BlockSolve95. So do I download it from the official website? Cos I'm worried that there's some difference between the official and PETSc's copy. Thank you very much. Regards. From bsmith at mcs.anl.gov Wed Jun 11 19:50:55 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 11 Jun 2008 19:50:55 -0500 Subject: Getting BlockSolve95 In-Reply-To: <48506ED1.7060903@gmail.com> References: <48506ED1.7060903@gmail.com> Message-ID: <9F6324EA-0CFC-4844-BCB0-594342551726@mcs.anl.gov> The is no PETSc install of BlockSolve95, you must install BlockSolve95 yourself (with the MPI and compilers you will use with PETSc) then config/configure.py PETSc to use that BlockSolve95. Barry On Jun 11, 2008, at 7:33 PM, Ben Tay wrote: > Hi, > > I am trying to install PETSc with BlockSolve95. Due to firewall > issues, I need to download separately BlockSolve95. However, when I > logged into ftp://ftp.mcs.anl.gov/pub/petsc/externalpackages/, I > can't find BlockSolve95. So do I download it from the official > website? Cos I'm worried that there's some difference between the > official and PETSc's copy. > > Thank you very much. > > Regards. > > From neckel at in.tum.de Thu Jun 12 06:07:52 2008 From: neckel at in.tum.de (Tobias Neckel) Date: Thu, 12 Jun 2008 13:07:52 +0200 Subject: Question on petsc-2.3.3-p13 Message-ID: <48510388.9060409@in.tum.de> Hello, we recently switched from petsc-2.3.2-p10 to petsc-2.3.3-p13 (on a Linux 32 bit architecture, with gcc 4.1.2). Installation and tests of petsc worked all well. Besides some new naming conventions, we encountered a strange compile error when using the new version 2.3.3 in our project: In the file petsc-2.3.3-p13/include/private/vecimpl.h, we get the following error in the lines 278 and 300: petsc-2.3.3-p13/include/private/vecimpl.h error: expected primary-expression before ',' token. We are not including vecimpl.h directly, but possibly indirectly by including kspimpl.h, pcimpl.h, and matimpl.h. When comparing this implementation to the old release, we learnt that the methods VecStashValue_Private() and VecStashValuesBlocked_Private(), which have been simple defines before, are now sth. like inline functions. We could eliminate the compile error by simply commenting the two corresponding calls of CHKERRQ(ierr) manually in vecimpl.h. So the ',' token seems to be one in the define of CHKERRQ. Is this a known problem? Are there any known solutions (besides this workaround)? Or are we misusing/omitting something? Thanks in advance Best regards Tobias Neckel -- Dipl.-Tech. Math. Tobias Neckel Institut f?r Informatik V, TU M?nchen Boltzmannstr. 
3, 85748 Garching Tel.: 089/289-18602 Email: neckel at in.tum.de URL: http://www5.in.tum.de/persons/neckel.html From knepley at gmail.com Thu Jun 12 07:12:07 2008 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 12 Jun 2008 07:12:07 -0500 Subject: Question on petsc-2.3.3-p13 In-Reply-To: <48510388.9060409@in.tum.de> References: <48510388.9060409@in.tum.de> Message-ID: On Thu, Jun 12, 2008 at 6:07 AM, Tobias Neckel wrote: > Hello, > > we recently switched from petsc-2.3.2-p10 to petsc-2.3.3-p13 (on a Linux 32 > bit architecture, with gcc 4.1.2). Installation and tests of petsc worked > all well. > > Besides some new naming conventions, we encountered a strange compile error > when using the new version 2.3.3 in our project: > > In the file petsc-2.3.3-p13/include/private/vecimpl.h, we get the following > error in the lines 278 and 300: > petsc-2.3.3-p13/include/private/vecimpl.h error: expected primary-expression > before ',' token. > > We are not including vecimpl.h directly, but possibly indirectly by > including kspimpl.h, pcimpl.h, and matimpl.h. > > When comparing this implementation to the old release, we learnt that the > methods VecStashValue_Private() and VecStashValuesBlocked_Private(), which > have been simple defines before, are now sth. like inline functions. > > We could eliminate the compile error by simply commenting the two > corresponding calls of CHKERRQ(ierr) manually in vecimpl.h. So the ',' token > seems to be one in the define of CHKERRQ. > > Is this a known problem? Are there any known solutions (besides this > workaround)? Or are we misusing/omitting something? I have not seen this problem before. I looked through the code. My only guess is that the compiler is complaining about one of the __*__ variables in the CHKERRQ() definition. I have not seen this before (they have always been defined). What compilers are you using? Matt > Thanks in advance > Best regards > Tobias Neckel > > -- > Dipl.-Tech. Math. Tobias Neckel > > Institut f?r Informatik V, TU M?nchen > Boltzmannstr. 3, 85748 Garching > > Tel.: 089/289-18602 > Email: neckel at in.tum.de > URL: http://www5.in.tum.de/persons/neckel.html > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From neckel at in.tum.de Thu Jun 12 08:43:09 2008 From: neckel at in.tum.de (Tobias Neckel) Date: Thu, 12 Jun 2008 15:43:09 +0200 Subject: Question on petsc-2.3.3-p13 In-Reply-To: References: <48510388.9060409@in.tum.de> Message-ID: <485127ED.7010907@in.tum.de> Hi Matthew, > I have not seen this problem before. I looked through the code. My only guess > is that the compiler is complaining about one of the __*__ variables in the > CHKERRQ() definition. I have not seen this before (they have always > been defined). > What compilers are you using? Currently, we are using g++ (GCC) 4.1.1 (Gentoo 4.1.1-r3) with the usual debug flags (-g3 -O0) and some warnings (-Wall -Werror -pedantic -pedantic-errors). Best regards Tobias From balay at mcs.anl.gov Thu Jun 12 08:52:31 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 12 Jun 2008 08:52:31 -0500 (CDT) Subject: Question on petsc-2.3.3-p13 In-Reply-To: <485127ED.7010907@in.tum.de> References: <48510388.9060409@in.tum.de> <485127ED.7010907@in.tum.de> Message-ID: Do you gOn Thu, 12 Jun 2008, Tobias Neckel wrote: > Hi Matthew, > > > I have not seen this problem before. I looked through the code. 
My only > > guess > > is that the compiler is complaining about one of the __*__ variables in the > > CHKERRQ() definition. I have not seen this before (they have always > > been defined). > > What compilers are you using? > > Currently, we are using g++ (GCC) 4.1.1 (Gentoo 4.1.1-r3) with the usual debug > flags (-g3 -O0) and some warnings (-Wall -Werror -pedantic -pedantic-errors). Do you get these errors with PETSc example codes aswell - or just your application code? Satish From neckel at in.tum.de Thu Jun 12 09:56:40 2008 From: neckel at in.tum.de (Tobias Neckel) Date: Thu, 12 Jun 2008 16:56:40 +0200 Subject: Question on petsc-2.3.3-p13 In-Reply-To: References: <48510388.9060409@in.tum.de> <485127ED.7010907@in.tum.de> Message-ID: <48513928.4040403@in.tum.de> > Do you get these errors with PETSc example codes aswell - or just your > application code? The PETSc examples (e.g. src/ksp/ksp/examples/tutorials/ex1.c) work well, so it is just our code ;-) It seems that we do sth. wrong/weird with the __SDIR__ setting ... I will investigate that in more detail. Thanks already, Tobias From Stephen.R.Ball at awe.co.uk Fri Jun 13 10:23:17 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Fri, 13 Jun 2008 16:23:17 +0100 Subject: Using MatGetRow() to extract matrix values. Message-ID: <86DGPC012618@awe.co.uk> Hi I am now using MatGetRow() to extract the matrix values and column numbers for each row. However, I have noticed that the ordering of the matrix values is sometimes different to that which was originally inserted. For example below are the matrix values for one row. Original row: Column no. Matrix Value 0 185693206.9234681 1 4.3135145491066673E-002 2 -1.5858782157409282E-010 3 8.5489320680905649E-002 4 -7.5011130237855266E-011 5 -3.7505565118927637E-011 6 -4.3135145489119342E-002 7 -8.6270290978446909E-002 After matrix assembly, extracted via MatGetRow():- Column no. Matrix Value 0 -7.5011130237855266E-011 1 -3.7505565118927637E-011 2 4.3135145491066673E-002 3 -8.6270290978446909E-002 4 185693206.9234681 5 -4.3135145489119342E-002 6 -1.5858782157409282E-010 7 8.5489320680905649E-002 Is this what I should expect to happen? As I mentioned in an earlier email, my purpose for extracting the matrix values is so that I can output the locally owned values to file in the format I want rather than via use of the MatView() routine. I would prefer to output the values in their original ordering. The reason I extract the values after matrix assembly rather than just output the original CSR matrix array is because, in parallel, some matrix entries are partial contributions to the true values and it is these true values that I wish to output. I get these true values after inserting them via MatSetValues() using the ADD_VALUES flag and then using MatGetRow(). Can I obtain the matrix values in their original row and column ordering after matrix assembly? Doesn't MatView() output the matrix with its original ordering? Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: 04 June 2008 15:30 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. Stephan, For all built in PETSc matrix types, the rows are stored contiguously so you can use MatGetOwnershipRange() to get the global indices for the local rows and thus call MatGetRow(). Barry On Jun 4, 2008, at 6:33 AM, Stephen R Ball wrote: > Ok, I am looking into using MatGetRow(). 
However this requires the > global row number for input. I was looking to use > MatGetOwnershipRange() > to obtain the range of global row numbers owned by each processor but > the documentation states that this routine assumes that the matrix is > laid out with the first n1 rows on the first processor, the next n2 > rows > on the second, etc and that for certain parallel layouts this range > may > not be well defined. > > This is the case for me. Do you have a routine where I can specify a > global row number and it will tell me the rank of the processor that > owns it? This is to ensure that MatGetRow() only gets called by the > owner processor for each global row number. > > Regards > > Stephen > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: 03 June 2008 15:29 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. > > On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball > wrote: >> Hi >> >> I have been trying to extract an array containing the local matrix >> values using MatGetArray() via the Fortran interface but get the >> error >> message that Mat type mpiaij is not supported with this routine. >> All I >> want to do is to extract the local matrix values so that I can output >> them to file in the format I want rather than via use of the >> MatView() >> routine. Can you suggest a way of how I can go about extracting the >> local matrix values? > > This is no "local matrix". The Mat interface is supposed to be data > structure > neutral so we can optimize for different architectures. If you want > the > values > directly, I would use MatGetRow() for each row. > > Matt > >> Thanks >> >> Stephen >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > From knepley at gmail.com Fri Jun 13 13:58:52 2008 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 13 Jun 2008 13:58:52 -0500 Subject: Using MatGetRow() to extract matrix values. In-Reply-To: <86DGPC012618@awe.co.uk> References: <86DGPC012618@awe.co.uk> Message-ID: On Fri, Jun 13, 2008 at 10:23 AM, Stephen R Ball wrote: > Hi > > I am now using MatGetRow() to extract the matrix values and column > numbers for each row. However, I have noticed that the ordering of the > matrix values is sometimes different to that which was originally > inserted. For example below are the matrix values for one row. I assume by "column no" you mean the order in the array returned, not the global column index. We sort the rows by global column index, and thus have no memory of the original insertion order. Matt > Original row: > > Column no. Matrix Value > 0 185693206.9234681 > 1 4.3135145491066673E-002 > 2 -1.5858782157409282E-010 > 3 8.5489320680905649E-002 > 4 -7.5011130237855266E-011 > 5 -3.7505565118927637E-011 > 6 -4.3135145489119342E-002 > 7 -8.6270290978446909E-002 > > > After matrix assembly, extracted via MatGetRow():- > > Column no. Matrix Value > 0 -7.5011130237855266E-011 > 1 -3.7505565118927637E-011 > 2 4.3135145491066673E-002 > 3 -8.6270290978446909E-002 > 4 185693206.9234681 > 5 -4.3135145489119342E-002 > 6 -1.5858782157409282E-010 > 7 8.5489320680905649E-002 > > > Is this what I should expect to happen? 
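A minimal sketch of the row-extraction loop this thread converges on: each process visits only the rows it owns, and, as noted above, MatGetRow() returns the columns sorted by global column index rather than in insertion order. The routine name DumpLocalRows is illustrative:

  #include "petscmat.h"

  PetscErrorCode DumpLocalRows(Mat A)   /* A is assumed to be assembled */
  {
    PetscInt           rstart, rend, row, ncols, j;
    const PetscInt    *cols;
    const PetscScalar *vals;
    PetscErrorCode     ierr;

    PetscFunctionBegin;
    ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
    for (row = rstart; row < rend; row++) {
      ierr = MatGetRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
      for (j = 0; j < ncols; j++) {
        /* cols[] holds global column indices, already sorted */
        ierr = PetscPrintf(PETSC_COMM_SELF, "row %d col %d val %g\n",
                           row, cols[j], (double)PetscRealPart(vals[j]));CHKERRQ(ierr);
      }
      ierr = MatRestoreRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }

In a real code the PetscPrintf() call would be replaced by whatever per-process file output is wanted; MatRestoreRow() must be called before moving on to the next row.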
> > As I mentioned in an earlier email, my purpose for extracting the matrix > values is so that I can output the locally owned values to file in the > format I want rather than via use of the MatView() routine. I would > prefer to output the values in their original ordering. > > The reason I extract the values after matrix assembly rather than just > output the original CSR matrix array is because, in parallel, some > matrix entries are partial contributions to the true values and it is > these true values that I wish to output. I get these true values after > inserting them via MatSetValues() using the ADD_VALUES flag and then > using MatGetRow(). > > Can I obtain the matrix values in their original row and column ordering > after matrix assembly? Doesn't MatView() output the matrix with its > original ordering? > > Regards > > Stephen > > > > > > > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: 04 June 2008 15:30 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: EXTERNAL: Re: MatGetArray() not supporting Mat > type mpiaij. > > > Stephan, > > For all built in PETSc matrix types, the rows are stored > contiguously so > you can use MatGetOwnershipRange() to get the global indices for the > local > rows and thus call MatGetRow(). > > Barry > > > On Jun 4, 2008, at 6:33 AM, Stephen R Ball wrote: > >> Ok, I am looking into using MatGetRow(). However this requires the >> global row number for input. I was looking to use >> MatGetOwnershipRange() >> to obtain the range of global row numbers owned by each processor but >> the documentation states that this routine assumes that the matrix is >> laid out with the first n1 rows on the first processor, the next n2 >> rows >> on the second, etc and that for certain parallel layouts this range >> may >> not be well defined. >> >> This is the case for me. Do you have a routine where I can specify a >> global row number and it will tell me the rank of the processor that >> owns it? This is to ensure that MatGetRow() only gets called by the >> owner processor for each global row number. >> >> Regards >> >> Stephen >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley >> Sent: 03 June 2008 15:29 >> To: petsc-users at mcs.anl.gov >> Subject: EXTERNAL: Re: MatGetArray() not supporting Mat type mpiaij. >> >> On Tue, Jun 3, 2008 at 8:11 AM, Stephen R Ball >> wrote: >>> Hi >>> >>> I have been trying to extract an array containing the local matrix >>> values using MatGetArray() via the Fortran interface but get the >>> error >>> message that Mat type mpiaij is not supported with this routine. >>> All I >>> want to do is to extract the local matrix values so that I can output >>> them to file in the format I want rather than via use of the >>> MatView() >>> routine. Can you suggest a way of how I can go about extracting the >>> local matrix values? >> >> This is no "local matrix". The Mat interface is supposed to be data >> structure >> neutral so we can optimize for different architectures. If you want >> the >> values >> directly, I would use MatGetRow() for each row. >> >> Matt >> >>> Thanks >>> >>> Stephen >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. 
>> -- Norbert Wiener >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From keita at cray.com Tue Jun 17 13:01:51 2008 From: keita at cray.com (Keita Teranishi) Date: Tue, 17 Jun 2008 13:01:51 -0500 Subject: ParMetis and MatPartitioningApply Message-ID: <925346A443D4E340BEB20248BAFCDBDF05FBDD49@CFEVS1-IP.americas.cray.com> Hi, In petsc-2.3.3 installed with ParMetis, does MatPartitioningApply always call ParMetis routines? Thanks, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 17 16:33:28 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jun 2008 16:33:28 -0500 Subject: ParMetis and MatPartitioningApply In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF05FBDD49@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF05FBDD49@CFEVS1-IP.americas.cray.com> Message-ID: On Tue, Jun 17, 2008 at 1:01 PM, Keita Teranishi wrote: > Hi, > > > > In petsc-2.3.3 installed with ParMetis, does MatPartitioningApply always > call ParMetis routines? It is the default. Matt > Thanks, > > > > ================================ > Keita Teranishi > Math Software Group > Cray, Inc. > keita at cray.com > ================================ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From keita at cray.com Tue Jun 17 17:15:42 2008 From: keita at cray.com (Keita Teranishi) Date: Tue, 17 Jun 2008 17:15:42 -0500 Subject: Using two different data types for KSPSetOperators Message-ID: <925346A443D4E340BEB20248BAFCDBDF05FBE06D@CFEVS1-IP.americas.cray.com> Hi, I'd like to make sure if it is legal to put different Matrix data type for Amat and Pmat in KSPSetOperators call. For example, Amat is BAIJ and Pmat is AIJ. This situation could happen because BAIJ's MatMult is faster than AIJ's. KSPSetOperators(KSP ksp,Mat Amat,Mat Pmat,MatStructure flag) Thanks, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 17 20:33:00 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 17 Jun 2008 20:33:00 -0500 Subject: Using two different data types for KSPSetOperators In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF05FBE06D@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF05FBE06D@CFEVS1-IP.americas.cray.com> Message-ID: Yes. Matt On Tue, Jun 17, 2008 at 5:15 PM, Keita Teranishi wrote: > Hi, > > > > I'd like to make sure if it is legal to put different Matrix data type for > Amat and Pmat in KSPSetOperators call. For example, Amat is BAIJ and Pmat > is AIJ. > > This situation could happen because BAIJ's MatMult is faster than AIJ's. > > > > KSPSetOperators(KSP ksp,Mat Amat,Mat Pmat,MatStructure flag) > > > > Thanks, > > ================================ > Keita Teranishi > Math Software Group > Cray, Inc. > keita at cray.com > ================================ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From rlmackie862 at gmail.com Wed Jun 18 15:32:33 2008 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 18 Jun 2008 13:32:33 -0700 Subject: compiling PETSc with Intel MKL 10.0.1.14 Message-ID: <485970E1.1090104@gmail.com> We've upgraded Intel MKL to version 10.0, but in this version, Intel has changed how libraries are suppose to be linked. For example, the libmkl_lapack.a is a dummy library, but that's what the PETSc configure script looks for. The documentation says, for example, to compile LAPACK in the static case, use libmkl_lapack.a libmkl_em64t.a and in the layered pure case to use libmkl_intel_lp64.a libmkl_intel_thread.a libmkl_core.a However, the PETSC configuration wants -lmkl_lapack -lmkl -lguide -lpthread Any suggestions are appreciated. Randy From balay at mcs.anl.gov Wed Jun 18 15:46:47 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 18 Jun 2008 15:46:47 -0500 (CDT) Subject: compiling PETSc with Intel MKL 10.0.1.14 In-Reply-To: <485970E1.1090104@gmail.com> References: <485970E1.1090104@gmail.com> Message-ID: Intels' MKL is always confusing to me. [balay at login02 em64t]$ cat libmkl_lapack.a GROUP (libmkl_intel_lp64.a libmkl_intel_thread.a libmkl_core.a) So linking with libmkl_lapack.a should work fine - and is correct [as per the layered pure case mentioned below]. satish On Wed, 18 Jun 2008, Randall Mackie wrote: > We've upgraded Intel MKL to version 10.0, but in this version, Intel has > changed how libraries are suppose to be linked. For example, the > libmkl_lapack.a > is a dummy library, but that's what the PETSc configure script looks for. > > The documentation says, for example, to compile LAPACK in the static case, > use libmkl_lapack.a libmkl_em64t.a > > and in the layered pure case to use > libmkl_intel_lp64.a libmkl_intel_thread.a libmkl_core.a > > However, the PETSC configuration wants -lmkl_lapack -lmkl -lguide -lpthread > > Any suggestions are appreciated. > > Randy > > From bsmith at mcs.anl.gov Wed Jun 18 16:29:42 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 18 Jun 2008 16:29:42 -0500 Subject: compiling PETSc with Intel MKL 10.0.1.14 In-Reply-To: <485970E1.1090104@gmail.com> References: <485970E1.1090104@gmail.com> Message-ID: <381F69EF-1C27-420C-8337-0EAEB70093E0@mcs.anl.gov> Could you email to petsc-maint at mcs.anl.gov ALL the messages as to what goes wrong with our current linking so we can fix it? Thanks Barry On Jun 18, 2008, at 3:32 PM, Randall Mackie wrote: > We've upgraded Intel MKL to version 10.0, but in this version, Intel > has > changed how libraries are suppose to be linked. For example, the > libmkl_lapack.a > is a dummy library, but that's what the PETSc configure script looks > for. > > The documentation says, for example, to compile LAPACK in the static > case, > use libmkl_lapack.a libmkl_em64t.a > > and in the layered pure case to use > libmkl_intel_lp64.a libmkl_intel_thread.a libmkl_core.a > > However, the PETSC configuration wants -lmkl_lapack -lmkl -lguide - > lpthread > > Any suggestions are appreciated. > > Randy > From acolombi at gmail.com Thu Jun 19 11:12:31 2008 From: acolombi at gmail.com (Andrew Colombi) Date: Thu, 19 Jun 2008 11:12:31 -0500 Subject: MatAssemblyBegin/End Message-ID: <9dc10d950806190912w5c62f552uce8935909f2af355@mail.gmail.com> I'm trying to overlap as much as possible MatAssembly with other computations, and I'm finding a confusing result. If I follow my AssemblyBegin with an immediate AssemblyEnd it takes 31 seconds to assemble the matrix. 
If I interleave a 10 minute computation between AssemblyBegin and AssemblyEnd I find that executing only AssemblyEnd still takes 27 seconds. So it takes 31 seconds to complete the entire transaction, or after 10 minutes of compute I still find myself stuck with 27 seconds of wait time. Now clearly, from the standpoint of optimization, 27 seconds in the presence of 10 minute computations is not something to waste brain cycles on. Nevertheless, I'm always curious about discrepancies between the world in my head and the actual world ;-) Here are some points that may be of interest: * I'm using a debug compile of PETSc. I wouldn't guess this makes much difference as long as BLAS and LAPACK are optimized. * One node does not participate in the computation, instead it acts as a work queue; doling out work whenever a "worker" processor becomes available. As such the "server" node makes a lot of calls to MPI_Iprobe. Could this be interfering with PETSc's background use of MPI? Thanks, -Andrew From bsmith at mcs.anl.gov Thu Jun 19 12:49:17 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 19 Jun 2008 12:49:17 -0500 Subject: MatAssemblyBegin/End In-Reply-To: <9dc10d950806190912w5c62f552uce8935909f2af355@mail.gmail.com> References: <9dc10d950806190912w5c62f552uce8935909f2af355@mail.gmail.com> Message-ID: Overlapping communication and computation is largely a myth. The CPU is still involved in packing and unpacking each message (and thus is unavailable for computation during that time). I expect exactly the behavior you have seen. Barry The only way I know to reduce the time in these stages is 1) make sure that the time spent on each process in creating the matrix entries and calling MatSetValues() is evenly balanced between processes 2) make sure that "most" matrix entries are created on the process where they will eventually live so less data must be moved in the MatAssemblyBegin/End() calls. These are both pretty hard to tune perfectly. I would just live with the 27 secs out of 10 minutes. On Jun 19, 2008, at 11:12 AM, Andrew Colombi wrote: > I'm trying to overlap as much as possible MatAssembly with other > computations, and I'm finding a confusing result. If I follow my > AssemblyBegin with an immediate AssemblyEnd it takes 31 seconds to > assemble the matrix. If I interleave a 10 minute computation between > AssemblyBegin and AssemblyEnd I find that executing only AssemblyEnd > still takes 27 seconds. So it takes 31 seconds to complete the entire > transaction, or after 10 minutes of compute I still find myself stuck > with 27 seconds of wait time. > > Now clearly, from the standpoint of optimization, 27 seconds in the > presence of 10 minute computations is not something to waste brain > cycles on. Nevertheless, I'm always curious about discrepancies > between the world in my head and the actual world ;-) Here are some > points that may be of interest: > > * I'm using a debug compile of PETSc. I wouldn't guess this makes > much difference as long as BLAS and LAPACK are optimized. > * One node does not participate in the computation, instead it acts as > a work queue; doling out work whenever a "worker" processor becomes > available. As such the "server" node makes a lot of calls to > MPI_Iprobe. Could this be interfering with PETSc's background use of > MPI? 
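A sketch of the interleaved assembly pattern under discussion; as Barry notes, the communication is not genuinely overlapped, so the practical levers are generating entries on the process that owns them and balancing the MatSetValues() work. The routine name AssembleInterleaved is illustrative:

  #include "petscmat.h"

  PetscErrorCode AssembleInterleaved(Mat A)
  {
    PetscInt       rstart, rend, i;
    PetscScalar    v = 1.0;                   /* placeholder entry */
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
    for (i = rstart; i < rend; i++) {         /* set mostly locally owned rows */
      ierr = MatSetValues(A, 1, &i, 1, &i, &v, ADD_VALUES);CHKERRQ(ierr);
    }

    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    /* independent local work can go here, but the packing/unpacking of
       off-process entries still happens inside MatAssemblyEnd(), so that
       call does not shrink to zero */
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }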
> > Thanks, > -Andrew > From grs2103 at columbia.edu Sun Jun 22 00:17:55 2008 From: grs2103 at columbia.edu (Gideon Simpson) Date: Sun, 22 Jun 2008 01:17:55 -0400 Subject: x11 on os x 10.5.x Message-ID: <9808DF45-9C50-40D3-9170-E6C570213CDF@columbia.edu> Does anyone know what flags to feed petsc 2.3.3 to get x11 functionality working with OS X 10.5.x? The following error shows up in my configure log: TEST configureLibrary from PETSc.packages.X11(/opt/petsc-2.3.3-p13/ python/PETSc/packages/X11.py:93) TESTING: configureLibrary from PETSc.packages.X11(python/PETSc/ packages/X11.py:93) Checks for X windows, sets PETSC_HAVE_X11 if found, and defines X_CFLAGS, X_PRE_LIBS, X_LIBS, and X_EXTRA_LIBS sh: xmkmf Executing: xmkmf sh: imake -DUseInstalled -I/usr/X11/lib/X11/config sh: /usr/bin/make acfindx Executing: /usr/bin/make acfindx sh: X_INCLUDE_ROOT = /usr/X11/include X_USR_LIB_DIR = /usr/X11/lib X_LIB_DIR = /usr/X11/lib/X11 Pushing language C sh: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite-strings -Wno- strict-aliasing -g3 conftest.c Executing: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite- strings -Wno-strict-aliasing -g3 conftest.c sh: Possible ERROR while running compiler: error message = {conftest.c: In function ?main?: conftest.c:5: warning: implicit declaration of function ?XSetWMName? } Source: #include "confdefs.h" #include "conffix.h" int main() { XSetWMName(); ; return 0; } Pushing language C Popping language C sh: /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno-strict- aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple-darwin9/4.2.1 - Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 - Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,-rpath,/usr/lib/i686- apple-darwin9/4.2.1 -L/usr/lib/i686-apple-darwin9/4.2.1 -Wl,-rpath,/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -L/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,- rpath,. -L. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -L/usr/ lib/gcc/i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- apple-darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/ lib/gcc/i686-apple-darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 Executing: /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno-strict- aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple-darwin9/4.2.1 - Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 - Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. 
-lstdc++ -lcrt1.10.5.o -Wl,-rpath,/usr/lib/i686- apple-darwin9/4.2.1 -L/usr/lib/i686-apple-darwin9/4.2.1 -Wl,-rpath,/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -L/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,- rpath,. -L. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -L/usr/ lib/gcc/i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- apple-darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/ lib/gcc/i686-apple-darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 sh: Possible ERROR while running linker: ld: library not found for -lX11 collect2: ld returned 1 exit status output: ret = 256 error message = {ld: library not found for -lX11 collect2: ld returned 1 exit status } Pushing language C Popping language C in /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno-strict- aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple-darwin9/4.2.1 - Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 - Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,-rpath,/usr/lib/i686- apple-darwin9/4.2.1 -L/usr/lib/i686-apple-darwin9/4.2.1 -Wl,-rpath,/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -L/ Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,- rpath,. -L. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -L/usr/ lib/gcc/i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- apple-darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/ lib/gcc/i686-apple-darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 Source: #include "confdefs.h" #include "conffix.h" int main() { XSetWMName(); ; return 0; } Popping language C Could not find X11 libraries From bsmith at mcs.anl.gov Sun Jun 22 11:25:06 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 22 Jun 2008 11:25:06 -0500 Subject: x11 on os x 10.5.x In-Reply-To: <9808DF45-9C50-40D3-9170-E6C570213CDF@columbia.edu> References: <9808DF45-9C50-40D3-9170-E6C570213CDF@columbia.edu> Message-ID: <19B42C14-00C9-43EA-AC4D-0CD0FB8448AF@mcs.anl.gov> Please try editing python/PETSc/packages/X11.py and locate the line for ext in ['.a', '.so', '.sl', '.dll.a']: replace with for ext in ['.a', '.so', '.sl', '.dll.a','.dylib']: Barry Why this is not a PETSc patch I have no clue! On Jun 22, 2008, at 12:17 AM, Gideon Simpson wrote: > Does anyone know what flags to feed petsc 2.3.3 to get x11 > functionality working with OS X 10.5.x? 
The following error shows > up in my configure log: > > TEST configureLibrary from PETSc.packages.X11(/opt/petsc-2.3.3-p13/ > python/PETSc/packages/X11.py:93) > TESTING: configureLibrary from PETSc.packages.X11(python/PETSc/ > packages/X11.py:93) > Checks for X windows, sets PETSC_HAVE_X11 if found, and defines > X_CFLAGS, X_PRE_LIBS, X_LIBS, and X_EXTRA_LIBS > sh: xmkmf > Executing: xmkmf > sh: imake -DUseInstalled -I/usr/X11/lib/X11/config > > sh: /usr/bin/make acfindx > Executing: /usr/bin/make acfindx > sh: X_INCLUDE_ROOT = /usr/X11/include > X_USR_LIB_DIR = /usr/X11/lib > X_LIB_DIR = /usr/X11/lib/X11 > > Pushing language C > sh: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite-strings - > Wno-strict-aliasing -g3 conftest.c > Executing: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite- > strings -Wno-strict-aliasing -g3 conftest.c > sh: > Possible ERROR while running compiler: error message = {conftest.c: > In function ?main?: > conftest.c:5: warning: implicit declaration of function ?XSetWMName? > } > Source: > #include "confdefs.h" > #include "conffix.h" > > int main() { > XSetWMName(); > ; > return 0; > } > Pushing language C > Popping language C > sh: /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno- > strict-aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple- > darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/i686- > apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/ > gcc/i686-apple-darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,- > rpath,/usr/lib/i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple- > darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -L/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -Wl,-rpath,. -L. -Wl,-rpath,/usr/lib/gcc/ > i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1 - > Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- > darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686- > apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 > Executing: /opt/bin/mpicc -o conftest -Wl,- > multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,- > multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,- > multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -fPIC - > Wall -Wwrite-strings -Wno-strict-aliasing -g3 conftest.o -Wl,- > rpath,/usr/lib/i686-apple-darwin9/4.2.1 -Wl,-rpath,/Library/ > Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,- > rpath,. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -Wl,-rpath,/ > usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- > darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,-rpath,/usr/lib/ > i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple-darwin9/4.2.1 -Wl,- > rpath,/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ > lib/32 -L/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ > lib/32 -Wl,-rpath,. -L. 
-Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1 -Wl,-rpath,/ > usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- > darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686- > apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 > sh: > Possible ERROR while running linker: ld: library not found for -lX11 > collect2: ld returned 1 exit status > output: ret = 256 > error message = {ld: library not found for -lX11 > collect2: ld returned 1 exit status > } > Pushing language C > Popping language C > in /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- > multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno- > strict-aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple- > darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/i686- > apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/ > gcc/i686-apple-darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,- > rpath,/usr/lib/i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple- > darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -L/Library/Frameworks/Intel_MKL.framework/ > Versions/10.0.2.018/lib/32 -Wl,-rpath,. -L. -Wl,-rpath,/usr/lib/gcc/ > i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1 - > Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- > darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686- > apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- > darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 > Source: > #include "confdefs.h" > #include "conffix.h" > > int main() { > XSetWMName(); > ; > return 0; > } > Popping language C > Could not find X11 libraries > From grs2103 at columbia.edu Sun Jun 22 13:55:05 2008 From: grs2103 at columbia.edu (Gideon Simpson) Date: Sun, 22 Jun 2008 14:55:05 -0400 Subject: x11 on os x 10.5.x In-Reply-To: <19B42C14-00C9-43EA-AC4D-0CD0FB8448AF@mcs.anl.gov> References: <9808DF45-9C50-40D3-9170-E6C570213CDF@columbia.edu> <19B42C14-00C9-43EA-AC4D-0CD0FB8448AF@mcs.anl.gov> Message-ID: <6C5AF533-5494-44FC-A1BF-7C4A1B0B54F4@columbia.edu> that fixed it. thanks. -gideon On Jun 22, 2008, at 12:25 PM, Barry Smith wrote: > > Please try editing python/PETSc/packages/X11.py and locate the line > for ext in ['.a', '.so', '.sl', '.dll.a']: > replace with > for ext in ['.a', '.so', '.sl', '.dll.a','.dylib']: > > Barry > > Why this is not a PETSc patch I have no clue! > > On Jun 22, 2008, at 12:17 AM, Gideon Simpson wrote: > >> Does anyone know what flags to feed petsc 2.3.3 to get x11 >> functionality working with OS X 10.5.x? 
The following error shows >> up in my configure log: >> >> TEST configureLibrary from PETSc.packages.X11(/opt/petsc-2.3.3-p13/ >> python/PETSc/packages/X11.py:93) >> TESTING: configureLibrary from PETSc.packages.X11(python/PETSc/ >> packages/X11.py:93) >> Checks for X windows, sets PETSC_HAVE_X11 if found, and defines >> X_CFLAGS, X_PRE_LIBS, X_LIBS, and X_EXTRA_LIBS >> sh: xmkmf >> Executing: xmkmf >> sh: imake -DUseInstalled -I/usr/X11/lib/X11/config >> >> sh: /usr/bin/make acfindx >> Executing: /usr/bin/make acfindx >> sh: X_INCLUDE_ROOT = /usr/X11/include >> X_USR_LIB_DIR = /usr/X11/lib >> X_LIB_DIR = /usr/X11/lib/X11 >> >> Pushing language C >> sh: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite-strings - >> Wno-strict-aliasing -g3 conftest.c >> Executing: /opt/bin/mpicc -c -o conftest.o -fPIC -Wall -Wwrite- >> strings -Wno-strict-aliasing -g3 conftest.c >> sh: >> Possible ERROR while running compiler: error message = {conftest.c: >> In function ?main?: >> conftest.c:5: warning: implicit declaration of function ?XSetWMName? >> } >> Source: >> #include "confdefs.h" >> #include "conffix.h" >> >> int main() { >> XSetWMName(); >> ; >> return 0; >> } >> Pushing language C >> Popping language C >> sh: /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress - >> Wl,-multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress - >> Wl,-multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress - >> Wl,-multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno- >> strict-aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple- >> darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ >> Versions/10.0.2.018/lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/ >> i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/ >> gcc/i686-apple-darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,- >> rpath,/usr/lib/i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple- >> darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ >> Versions/10.0.2.018/lib/32 -L/Library/Frameworks/ >> Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,-rpath,. -L. - >> Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/ >> usr/lib/gcc/i686-apple-darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 >> Executing: /opt/bin/mpicc -o conftest -Wl,- >> multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,- >> multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,- >> multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -fPIC >> -Wall -Wwrite-strings -Wno-strict-aliasing -g3 conftest.o -Wl,- >> rpath,/usr/lib/i686-apple-darwin9/4.2.1 -Wl,-rpath,/Library/ >> Frameworks/Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,- >> rpath,. -Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -Wl,- >> rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- >> darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,-rpath,/usr/lib/ >> i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple-darwin9/4.2.1 -Wl,- >> rpath,/Library/Frameworks/Intel_MKL.framework/Versions/10.0.2.018/ >> lib/32 -L/Library/Frameworks/Intel_MKL.framework/Versions/ >> 10.0.2.018/lib/32 -Wl,-rpath,. -L. 
-Wl,-rpath,/usr/lib/gcc/i686- >> apple-darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1 -Wl,- >> rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686-apple- >> darwin9/4.2.1 -L/usr/lib/gcc/i686-apple-darwin9/4.2.1/../../../i686- >> apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 >> sh: >> Possible ERROR while running linker: ld: library not found for -lX11 >> collect2: ld returned 1 exit status >> output: ret = 256 >> error message = {ld: library not found for -lX11 >> collect2: ld returned 1 exit status >> } >> Pushing language C >> Popping language C >> in /opt/bin/mpicc -o conftest -Wl,-multiply_defined,suppress -Wl,- >> multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- >> multiply_defined -Wl,suppress -Wl,-multiply_defined,suppress -Wl,- >> multiply_defined -Wl,suppress -fPIC -Wall -Wwrite-strings -Wno- >> strict-aliasing -g3 conftest.o -Wl,-rpath,/usr/lib/i686-apple- >> darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ >> Versions/10.0.2.018/lib/32 -Wl,-rpath,. -Wl,-rpath,/usr/lib/gcc/ >> i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/ >> gcc/i686-apple-darwin9/4.2.1/../../.. -lstdc++ -lcrt1.10.5.o -Wl,- >> rpath,/usr/lib/i686-apple-darwin9/4.2.1 -L/usr/lib/i686-apple- >> darwin9/4.2.1 -Wl,-rpath,/Library/Frameworks/Intel_MKL.framework/ >> Versions/10.0.2.018/lib/32 -L/Library/Frameworks/ >> Intel_MKL.framework/Versions/10.0.2.018/lib/32 -Wl,-rpath,. -L. - >> Wl,-rpath,/usr/lib/gcc/i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1 -Wl,-rpath,/usr/lib/gcc/i686-apple- >> darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1/../../../i686-apple-darwin9/4.2.1 -Wl,-rpath,/ >> usr/lib/gcc/i686-apple-darwin9/4.2.1/../../.. -L/usr/lib/gcc/i686- >> apple-darwin9/4.2.1/../../.. -ldl -lgcc_s.10.5 -lSystem -ldl -lX11 >> Source: >> #include "confdefs.h" >> #include "conffix.h" >> >> int main() { >> XSetWMName(); >> ; >> return 0; >> } >> Popping language C >> Could not find X11 libraries >> > From geenen at gmail.com Mon Jun 23 04:52:50 2008 From: geenen at gmail.com (Thomas Geenen) Date: Mon, 23 Jun 2008 11:52:50 +0200 Subject: using MUMPS with mg Message-ID: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> dear Petsc users, when using mg with Petsc (ml to be precise) the default solver on the coarsest level is "redundant" (I could not find much info in either the usermanual or de online manuals about this solver) it seems to be a sequential LU factorization running on all cpu's at the same time? doing the same thing? i try to use MUMPS instead. i try to invoke mumps by calling ierr = PetscOptionsSetValue("-mg_coarse_ksp_type", "preonly"); ierr = PetscOptionsSetValue("-mg_coarse_mat_type", "aijmumps"); ierr = PetscOptionsSetValue("-mg_coarse_pc_type", "lu"); this gives me [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix type mpiaij symbolic so apparently Petsc did not convert the matrix on the coarsest level. if i remove the pc_type the conversion option is ignored and Petsc uses redundant. 
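As a side note, one way to confirm which solver actually ends up on each level (including the coarsest one) is to view the KSP after it has been set up. A minimal sketch, assuming the outer solver object is called ksp (that variable name is an assumption, not taken from the code above):

    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSetUp(ksp);CHKERRQ(ierr);
    ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);  /* same information as running with -ksp_view */

The output lists the ksp/pc type used on every multigrid level, so it shows directly whether the coarse level is still using the redundant LU or the requested external package.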
cheers Thomas From bsmith at mcs.anl.gov Mon Jun 23 09:08:09 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2008 09:08:09 -0500 Subject: using MUMPS with mg In-Reply-To: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> References: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> Message-ID: On Jun 23, 2008, at 4:52 AM, Thomas Geenen wrote: > dear Petsc users, > > when using mg with Petsc (ml to be precise) > the default solver on the coarsest level is "redundant" > (I could not find much info in either the usermanual or de online > manuals about this solver) > > it seems to be a sequential LU factorization running on all cpu's at > the same time? > doing the same thing? Correct. > > > i try to use MUMPS instead. > i try to invoke mumps by calling > ierr = PetscOptionsSetValue("-mg_coarse_ksp_type", "preonly"); > ierr = PetscOptionsSetValue("-mg_coarse_mat_type", "aijmumps"); > ierr = PetscOptionsSetValue("-mg_coarse_pc_type", "lu"); > > this gives me > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix type mpiaij symbolic > > so apparently Petsc did not convert the matrix on the coarsest level. > if i remove the pc_type the conversion option is ignored and Petsc > uses redundant. How are you generating the coarse grid matrix? Do you get it with DAGetMatrix()? Or MatCreateMPIAIJ()? Do you set the matrix prefix to -mg_coarse? It is not set automatically. You need to either call MatSetFromOptions() on the coarse matrix or directly call MatSetType() with aijmumps to set it the type. Barry > > > > > cheers > Thomas > From balay at mcs.anl.gov Mon Jun 23 09:24:33 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 23 Jun 2008 09:24:33 -0500 (CDT) Subject: solve PDEs in spherical coordinates Message-ID: ---------- Forwarded message ---------- Date: Mon, 23 Jun 2008 11:37:14 +0200 (CEST) From: Mark Cheeseman Reply-To: mpch at cscs.ch To: petsc-users at mcs.anl.gov Subject: solve PDEs in spherical coordinates Hello, I would like to solve/integrate a system of 4 PDEs over /inside a spherical domain.?? How can I accomplish this in PETSc??? Do I need convert the equations into Matrix-vector form first (along with all the messy metrics)? Or can PETSc do this for me??? My goal is to be integrate this system of PDEs for a spherical domain of arbitrary radius. Thanks, Mark From knepley at gmail.com Mon Jun 23 09:41:39 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 23 Jun 2008 09:41:39 -0500 Subject: solve PDEs in spherical coordinates In-Reply-To: References: Message-ID: PETSc currently handles only linear algebra, so you would need to first produce a linear (or nonlinear) system. Thanks, Matt On Mon, Jun 23, 2008 at 9:24 AM, Satish Balay wrote: > > ---------- Forwarded message ---------- > Date: Mon, 23 Jun 2008 11:37:14 +0200 (CEST) > From: Mark Cheeseman > Reply-To: mpch at cscs.ch > To: petsc-users at mcs.anl.gov > Subject: solve PDEs in spherical coordinates > > Hello, > > I would like to solve/integrate a system of 4 PDEs over /inside a > spherical domain.?? How can I accomplish this in PETSc??? Do I need > convert the equations into Matrix-vector form first (along with all the > messy metrics)? Or can PETSc do this for me??? My goal is to be integrate > this system of PDEs for a spherical domain of arbitrary radius. 
> > Thanks, > Mark > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From geenen at gmail.com Mon Jun 23 10:14:49 2008 From: geenen at gmail.com (Thomas Geenen) Date: Mon, 23 Jun 2008 17:14:49 +0200 Subject: using MUMPS with mg In-Reply-To: References: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> Message-ID: <200806231714.49613.geenen@gmail.com> On Monday 23 June 2008 16:08, Barry Smith wrote: > On Jun 23, 2008, at 4:52 AM, Thomas Geenen wrote: > > dear Petsc users, > > > > when using mg with Petsc (ml to be precise) > > the default solver on the coarsest level is "redundant" > > (I could not find much info in either the usermanual or de online > > manuals about this solver) > > > > it seems to be a sequential LU factorization running on all cpu's at > > the same time? > > doing the same thing? > > Correct. > > > i try to use MUMPS instead. > > i try to invoke mumps by calling > > ierr = PetscOptionsSetValue("-mg_coarse_ksp_type", "preonly"); > > ierr = PetscOptionsSetValue("-mg_coarse_mat_type", "aijmumps"); > > ierr = PetscOptionsSetValue("-mg_coarse_pc_type", "lu"); > > > > this gives me > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: Matrix type mpiaij symbolic > > > > so apparently Petsc did not convert the matrix on the coarsest level. > > if i remove the pc_type the conversion option is ignored and Petsc > > uses redundant. > > How are you generating the coarse grid matrix? Do you get it with > DAGetMatrix()? its handled in ml.c from ml.c } else { /* convert ML P and R into shell format, ML A into mpiaij format */ for (mllevel=1; mllevelPmat[mllevel]); ierr = MatWrapML_SHELL(mlmat,reuse,&gridctx[level].P);CHKERRQ(ierr); mlmat = &(ml_object->Rmat[mllevel-1]); ierr = MatWrapML_SHELL(mlmat,reuse,&gridctx[level].R);CHKERRQ(ierr); mlmat = &(ml_object->Amat[mllevel]); if (reuse){ ierr = MatDestroy(gridctx[level].A);CHKERRQ(ierr); } ierr = MatWrapML_MPIAIJ(mlmat,&gridctx[level].A);CHKERRQ(ierr); level--; } } from MatWrapML_MPIAIJ( ierr = MatCreate(mlmat->comm->USR_comm,&A);CHKERRQ(ierr); ierr = MatSetSizes(A,m,n,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr); > Or MatCreateMPIAIJ()? Do you set the matrix prefix to -mg_coarse? > It is not set automatically. You need to either call MatSetFromOptions() > on the coarse matrix or directly call MatSetType() with aijmumps to set > it the type. i will check if just adding ierr = MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX,&A);CHKERRQ(ierr); will do the trick > > Barry > > > cheers > > Thomas From bsmith at mcs.anl.gov Mon Jun 23 12:45:14 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 23 Jun 2008 12:45:14 -0500 Subject: using MUMPS with mg In-Reply-To: <200806231714.49613.geenen@gmail.com> References: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> <200806231714.49613.geenen@gmail.com> Message-ID: > ierr = MatCreate(mlmat->comm->USR_comm,&A);CHKERRQ(ierr); > ierr = MatSetSizes(A,m,n,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); > ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr); > > >> Or MatCreateMPIAIJ()? Do you set the matrix prefix to -mg_coarse? >> It is not set automatically. 
You need to either call >> MatSetFromOptions() >> on the coarse matrix or directly call MatSetType() with aijmumps to >> set >> it the type. > > i will check if just adding > > ierr = MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX,&A);CHKERRQ(ierr); > > will do the trick Better just to try err = MatSetType(A,MATAIJMUMPS);CHKERRQ(ierr); instead of err = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr); Barry Thankfully selecting direct solvers will be much cleaner in the next release. On Jun 23, 2008, at 10:14 AM, Thomas Geenen wrote: > On Monday 23 June 2008 16:08, Barry Smith wrote: >> On Jun 23, 2008, at 4:52 AM, Thomas Geenen wrote: >>> dear Petsc users, >>> >>> when using mg with Petsc (ml to be precise) >>> the default solver on the coarsest level is "redundant" >>> (I could not find much info in either the usermanual or de online >>> manuals about this solver) >>> >>> it seems to be a sequential LU factorization running on all cpu's at >>> the same time? >>> doing the same thing? >> >> Correct. >> >>> i try to use MUMPS instead. >>> i try to invoke mumps by calling >>> ierr = PetscOptionsSetValue("-mg_coarse_ksp_type", "preonly"); >>> ierr = PetscOptionsSetValue("-mg_coarse_mat_type", "aijmumps"); >>> ierr = PetscOptionsSetValue("-mg_coarse_pc_type", "lu"); >>> >>> this gives me >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: No support for this operation for this object type! >>> [0]PETSC ERROR: Matrix type mpiaij symbolic >>> >>> so apparently Petsc did not convert the matrix on the coarsest >>> level. >>> if i remove the pc_type the conversion option is ignored and Petsc >>> uses redundant. >> >> How are you generating the coarse grid matrix? Do you get it with >> DAGetMatrix()? > > its handled in ml.c > > from ml.c > } else { /* convert ML P and R into shell format, ML A into mpiaij > format */ > for (mllevel=1; mllevel mlmat = &(ml_object->Pmat[mllevel]); > ierr = > MatWrapML_SHELL(mlmat,reuse,&gridctx[level].P);CHKERRQ(ierr); > mlmat = &(ml_object->Rmat[mllevel-1]); > ierr = > MatWrapML_SHELL(mlmat,reuse,&gridctx[level].R);CHKERRQ(ierr); > > mlmat = &(ml_object->Amat[mllevel]); > if (reuse){ > ierr = MatDestroy(gridctx[level].A);CHKERRQ(ierr); > } > ierr = MatWrapML_MPIAIJ(mlmat,&gridctx[level].A);CHKERRQ(ierr); > level--; > } > } > > from MatWrapML_MPIAIJ( > > ierr = MatCreate(mlmat->comm->USR_comm,&A);CHKERRQ(ierr); > ierr = MatSetSizes(A,m,n,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr); > ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr); > > >> Or MatCreateMPIAIJ()? Do you set the matrix prefix to -mg_coarse? >> It is not set automatically. You need to either call >> MatSetFromOptions() >> on the coarse matrix or directly call MatSetType() with aijmumps to >> set >> it the type. > > i will check if just adding > > ierr = MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX,&A);CHKERRQ(ierr); > > will do the trick > >> >> Barry >> >>> cheers >>> Thomas > From dalcinl at gmail.com Mon Jun 23 13:16:48 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 23 Jun 2008 15:16:48 -0300 Subject: using MUMPS with mg In-Reply-To: <200806231714.49613.geenen@gmail.com> References: <8aa042e10806230252k7d072701oadb5ed742d16023e@mail.gmail.com> <200806231714.49613.geenen@gmail.com> Message-ID: On 6/23/08, Thomas Geenen wrote: > i will check if just adding > > ierr = MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX,&A);CHKERRQ(ierr); > > will do the trick > Indeed. That would do the trick. 
But perhaps better is to add: MatConvert(A, MATSAME, MAT_REUSE_MATRIX,&A) Then, you can pass '-matconvert_type aijmumps' (check for the actual option name, too busy right now to look at the source) to actually use MATAIJMUMPS. If you do not pass the option, then the MatConvert() call is just a no-op. Perhaps this is a candidate for petsc-dev? Or MatSetSolverType() will handle this in the near future? -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From enjoywm at cs.wm.edu Wed Jun 25 16:29:31 2008 From: enjoywm at cs.wm.edu (enjoywm at cs.wm.edu) Date: Wed, 25 Jun 2008 17:29:31 -0400 (EDT) Subject: how to resolve linear equations of system Message-ID: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> Hi, I have a linear equations of system such as Kx=u. 1. If K is singular, can I use KSPSolve to solve it? 2. If K is non-singular I also can use x= inverse(K)u to do it. Which one is better? Best, Yixun From acolombi at gmail.com Wed Jun 25 16:48:23 2008 From: acolombi at gmail.com (Andrew Colombi) Date: Wed, 25 Jun 2008 16:48:23 -0500 Subject: how to resolve linear equations of system In-Reply-To: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> References: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> Message-ID: <9dc10d950806251448o6438bd87yb4d3dd1001e07ccd@mail.gmail.com> > 1. In general the answer is no, as there is no solution. But, there are some situations where an acceptable solution is found using appropriate pre-conditioning. It all depends on the exact circumstance you are in. (For example, if the singularity is due to a zero row in K, then using the Jacobi pre-conditioner will make it an identity row. This may or may not be acceptable depending on your application.) > 2. Solving the linear system is almost always preferred as finding the inverse to K explicitly boils down to repeatedly solving Kx=u where u are individual columns of I. Even if you are interested in solving for many u's it is preferable to save K's factorization rather than K's inverse. Read Matrix Computations by Gene Golub, or an introductory Scientific Computing book. Many of these questions are addressed there. -Andrew On Wed, Jun 25, 2008 at 4:29 PM, wrote: > Hi, > I have a linear equations of system such as Kx=u. > 1. If K is singular, can I use KSPSolve to solve it? > 2. If K is non-singular I also can use x= inverse(K)u to do it. > Which one is better? > > Best, > Yixun > > From bsmith at mcs.anl.gov Wed Jun 25 16:53:16 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 25 Jun 2008 16:53:16 -0500 Subject: how to resolve linear equations of system In-Reply-To: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> References: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> Message-ID: <61FDE792-588D-46AA-9E45-5F747229FA50@mcs.anl.gov> On Jun 25, 2008, at 4:29 PM, enjoywm at cs.wm.edu wrote: > Hi, > I have a linear equations of system such as Kx=u. > 1. If K is singular, can I use KSPSolve to solve it? If you know the null space of K you may be able to use the PETSc iterative solvers. If K is rectangular then likely you would not benefit from using PETSc If your linear systems are sparse and less than 50 to 100,000 unknowns then I recommend using Matlab for your computations.
The learning curve for PETSc means it is reserved for problems so large that simpler techniques/packages do not work. > > 2. If K is non-singular I also can use x= inverse(K)u to do it. Suggest you read a good book on Iterative methods for linear systems like by Saad. Barry > > Which one is better? > > Best, > Yixun > From enjoywm at cs.wm.edu Wed Jun 25 19:06:15 2008 From: enjoywm at cs.wm.edu (enjoywm at cs.wm.edu) Date: Wed, 25 Jun 2008 20:06:15 -0400 (EDT) Subject: how to resolve linear equations of system In-Reply-To: <61FDE792-588D-46AA-9E45-5F747229FA50@mcs.anl.gov> References: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> <61FDE792-588D-46AA-9E45-5F747229FA50@mcs.anl.gov> Message-ID: <50749.68.13.203.37.1214438775.squirrel@mail.cs.wm.edu> Thank you very much! > > On Jun 25, 2008, at 4:29 PM, enjoywm at cs.wm.edu wrote: > >> Hi, >> I have a linear equations of system such as Kx=u. >> 1. If K is singular, can I use KSPSolve to solve it? > > If you know the null space of K you may be able to use the PETSc > iterative solvers. > If K is rectangular then likely you would not benefit from using PETSc > > If your linear systems are sparse and less than 50 to 100,000 unknowns > then I recommend > using Matlab for your computations. The learning curve for PETSc means > it > is reserved for problems so large that simpler techniques/packages do > not work. > >> >> 2. If K is non-singular I also can use x= inverse(K)u to do it. > > Suggest you read a good book on Iterative methods for linear > systems like > by Saad. > > Barry > > >> >> Which one is better? >> >> Best, >> Yixun >> > > From enjoywm at cs.wm.edu Wed Jun 25 19:06:37 2008 From: enjoywm at cs.wm.edu (enjoywm at cs.wm.edu) Date: Wed, 25 Jun 2008 20:06:37 -0400 (EDT) Subject: how to resolve linear equations of system In-Reply-To: <9dc10d950806251448o6438bd87yb4d3dd1001e07ccd@mail.gmail.com> References: <2557.128.239.2.78.1214429371.squirrel@mail.cs.wm.edu> <9dc10d950806251448o6438bd87yb4d3dd1001e07ccd@mail.gmail.com> Message-ID: <50754.68.13.203.37.1214438797.squirrel@mail.cs.wm.edu> Thank you very much! >> 1. > > In general the answer is no, as there is no solution. But, there are > some situations where an acceptable solution is found using > appropriate pre-conditioning. It all depends on the exact > circumstance you are in. (For example, if the singularity is due to a > zero row in K, then using the Jacobi pre-conditioner will make it an > identity row. This may or may not be acceptable depending on your > application.) > >> 2. > > Solving the linear system is almost always preferred as finding the > inverse to K explicitly boils down to repeatedly solving Kx=u where u > are individual columns of I. Even if you are interested in solving > for many u's it is preferable to save K's factorization rather than > K's inverse. > > Read Matrix Computations by Gene Golub, or an introductory Scientific > Computing book. Many of these questions are addressed there. > > -Andrew > > On Wed, Jun 25, 2008 at 4:29 PM, wrote: >> Hi, >> I have a linear equations of system such as Kx=u. >> 1. If K is singular, can I use KSPSolve to solve it? >> 2. If K is non-singular I also can use x= inverse(K)u to do it. >> Which one is better? 
>> >> Best, >> Yixun >> >> > > From w_subber at yahoo.com Sat Jun 28 12:03:45 2008 From: w_subber at yahoo.com (Waad Subber) Date: Sat, 28 Jun 2008 10:03:45 -0700 (PDT) Subject: PETSc installed without X windows Message-ID: <618747.19121.qm@web38201.mail.mud.yahoo.com> Hi, When I try to view a matrix with the option -mat_view_draw, or call MatView in side the code . I get : ? [0]PETSC ERROR: PETSc installed without X windows on this machine proceeding without graphics I reconfigure Petsc with the option --with-X=1, but the problem doesn't solved Thanks Waad -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Jun 28 12:13:45 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 28 Jun 2008 12:13:45 -0500 (CDT) Subject: PETSc installed without X windows In-Reply-To: <618747.19121.qm@web38201.mail.mud.yahoo.com> References: <618747.19121.qm@web38201.mail.mud.yahoo.com> Message-ID: What OS is this? Do you have X11 [include, library files] installed on this machine? [if linux - then they can be in libX11-devel or equivalent package] Satish On Sat, 28 Jun 2008, Waad Subber wrote: > Hi, > > When I try to view a matrix with the option -mat_view_draw, or call MatView in side the code . I get : > ? > [0]PETSC ERROR: PETSc installed without X windows on this machine > proceeding without graphics > > I reconfigure Petsc with the option --with-X=1, but the problem doesn't solved > > Thanks > > Waad > > > > > From w_subber at yahoo.com Sat Jun 28 12:28:53 2008 From: w_subber at yahoo.com (Waad Subber) Date: Sat, 28 Jun 2008 10:28:53 -0700 (PDT) Subject: PETSc installed without X windows In-Reply-To: Message-ID: <19625.46402.qm@web38202.mail.mud.yahoo.com> Thank you Satish My system is ubuntu 8.04 LTS , and Yes I have X11 running on my machine Waad --- On Sat, 6/28/08, Satish Balay wrote: From: Satish Balay Subject: Re: PETSc installed without X windows To: petsc-users at mcs.anl.gov Date: Saturday, June 28, 2008, 1:13 PM What OS is this? Do you have X11 [include, library files] installed on this machine? [if linux - then they can be in libX11-devel or equivalent package] Satish On Sat, 28 Jun 2008, Waad Subber wrote: > Hi, > > When I try to view a matrix with the option -mat_view_draw, or call MatView in side the code . I get : > ? > [0]PETSC ERROR: PETSc installed without X windows on this machine > proceeding without graphics > > I reconfigure Petsc with the option --with-X=1, but the problem doesn't solved > > Thanks > > Waad > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jun 28 12:34:27 2008 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 28 Jun 2008 12:34:27 -0500 Subject: PETSc installed without X windows In-Reply-To: <19625.46402.qm@web38202.mail.mud.yahoo.com> References: <19625.46402.qm@web38202.mail.mud.yahoo.com> Message-ID: On Sat, Jun 28, 2008 at 12:28 PM, Waad Subber wrote: > Thank you Satish > > My system is ubuntu 8.04 LTS , and Yes I have X11 running on my machine Then you must install the libX11-dev package. Matt > Waad > --- On Sat, 6/28/08, Satish Balay wrote: > > From: Satish Balay > Subject: Re: PETSc installed without X windows > To: petsc-users at mcs.anl.gov > Date: Saturday, June 28, 2008, 1:13 PM > > What OS is this? Do you have X11 [include, library files] installed on this > machine? 
> > [if linux - then they can be in libX11-devel or equivalent package] > > Satish > > On Sat, 28 Jun 2008, Waad Subber wrote: > >> Hi, >> >> When I try to view a matrix with the option -mat_view_draw, or call > MatView in side the code . I get : >> >> [0]PETSC > ERROR: PETSc installed without X windows on this machine >> proceeding without graphics >> >> I reconfigure Petsc with the option --with-X=1, but the problem > doesn't solved >> >> Thanks >> >> Waad >> >> >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From w_subber at yahoo.com Sat Jun 28 12:38:11 2008 From: w_subber at yahoo.com (Waad Subber) Date: Sat, 28 Jun 2008 10:38:11 -0700 (PDT) Subject: PETSc installed without X windows In-Reply-To: Message-ID: <458871.10268.qm@web38205.mail.mud.yahoo.com> Thank you Matt --- On Sat, 6/28/08, Matthew Knepley wrote: From: Matthew Knepley Subject: Re: PETSc installed without X windows To: petsc-users at mcs.anl.gov Date: Saturday, June 28, 2008, 1:34 PM On Sat, Jun 28, 2008 at 12:28 PM, Waad Subber wrote: > Thank you Satish > > My system is ubuntu 8.04 LTS , and Yes I have X11 running on my machine Then you must install the libX11-dev package. Matt > Waad > --- On Sat, 6/28/08, Satish Balay wrote: > > From: Satish Balay > Subject: Re: PETSc installed without X windows > To: petsc-users at mcs.anl.gov > Date: Saturday, June 28, 2008, 1:13 PM > > What OS is this? Do you have X11 [include, library files] installed on this > machine? > > [if linux - then they can be in libX11-devel or equivalent package] > > Satish > > On Sat, 28 Jun 2008, Waad Subber wrote: > >> Hi, >> >> When I try to view a matrix with the option -mat_view_draw, or call > MatView in side the code . I get : >> >> [0]PETSC > ERROR: PETSc installed without X windows on this machine >> proceeding without graphics >> >> I reconfigure Petsc with the option --with-X=1, but the problem > doesn't solved >> >> Thanks >> >> Waad >> >> >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From acolombi at gmail.com Mon Jun 30 12:25:21 2008 From: acolombi at gmail.com (Andrew Colombi) Date: Mon, 30 Jun 2008 12:25:21 -0500 Subject: Matrix Allocation Message-ID: <9dc10d950806301025i7a615f6ak19d2ccd84fc03e9f@mail.gmail.com> I am a little confused by some behavior. I've been trying to understand some poor performance numbers and I noticed that if I doubled the preallocation numbers (presently I'm setting d_nz and o_nz to constants, no effort has been made to get the numbers _right_) performance improved dramatically (about 20X). This isn't so surprising except for the report from MatGetInfo. Even with the lower preallocation numbers I'm getting 0 in info.mallocs for MatInfo info; MatGetInfo(pc_A, MAT_LOCAL, &info); // done and reported on each machine individually So, my question is.... well.. why? If I'm seeing 0 mallocs before doubling prealloation shouldn't that mean I've preallocated enough? Or are their some switches I need to use to enable malloc counting? 
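For reference, the per-row preallocation that would replace the constant d_nz/o_nz guesses looks roughly like the sketch below. This is only a sketch: the counts used here are made-up placeholders (in practice they come from the mesh or stencil), and A stands for the already-created MPIAIJ matrix.

    PetscErrorCode ierr;
    PetscInt       i, m_local, *d_nnz, *o_nnz;

    ierr = MatGetLocalSize(A,&m_local,PETSC_NULL);CHKERRQ(ierr);        /* rows owned by this process */
    ierr = PetscMalloc(m_local*sizeof(PetscInt),&d_nnz);CHKERRQ(ierr);
    ierr = PetscMalloc(m_local*sizeof(PetscInt),&o_nnz);CHKERRQ(ierr);
    for (i=0; i < m_local; i++) {
      d_nnz[i] = 7;   /* placeholder: nonzeros of row i inside the diagonal block */
      o_nnz[i] = 2;   /* placeholder: nonzeros of row i in the off-diagonal block */
    }
    ierr = MatMPIAIJSetPreallocation(A,0,d_nnz,0,o_nnz);CHKERRQ(ierr);  /* per-row arrays override the constant counts */
    ierr = PetscFree(d_nnz);CHKERRQ(ierr);
    ierr = PetscFree(o_nnz);CHKERRQ(ierr);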
Also you can see (according to the same call to MetGetInfo) I'm wasting a lot of memory: // after doubling preallocation nz_alloc 6.2704e+07 nz_used 1.9125e+07 nz_unneed 4.3579e+07 (these are print outs of info.nz_allocated, info.nz_used and info.nzunneeded). Any thoughts? For now memory is not a bottleneck, so I guess I'll be satisfied with guessing big numbers for d_nz and o_nz. Still, I spent a lot of time scratching my head since guessing higher numbers didn't seem likely to have an effect. Thanks, -Andrew From knepley at gmail.com Mon Jun 30 13:38:36 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 30 Jun 2008 13:38:36 -0500 Subject: Matrix Allocation In-Reply-To: <9dc10d950806301025i7a615f6ak19d2ccd84fc03e9f@mail.gmail.com> References: <9dc10d950806301025i7a615f6ak19d2ccd84fc03e9f@mail.gmail.com> Message-ID: 1) Could you be calling MatGetInfo() with the wrong matrix? 2) Can you use -info to check the numbers? 3) Can you try MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, 1)? Matt On Mon, Jun 30, 2008 at 12:25 PM, Andrew Colombi wrote: > I am a little confused by some behavior. > > I've been trying to understand some poor performance numbers and I > noticed that if I doubled the preallocation numbers (presently I'm > setting d_nz and o_nz to constants, no effort has been made to get the > numbers _right_) performance improved dramatically (about 20X). > > This isn't so surprising except for the report from MatGetInfo. Even > with the lower preallocation numbers I'm getting 0 in info.mallocs for > > MatInfo info; > MatGetInfo(pc_A, MAT_LOCAL, &info); // done and reported on each > machine individually > > So, my question is.... well.. why? If I'm seeing 0 mallocs before > doubling prealloation shouldn't that mean I've preallocated enough? > Or are their some switches I need to use to enable malloc counting? > Also you can see (according to the same call to MetGetInfo) I'm > wasting a lot of memory: > > // after doubling preallocation > nz_alloc 6.2704e+07 nz_used 1.9125e+07 nz_unneed 4.3579e+07 > > (these are print outs of info.nz_allocated, info.nz_used and info.nzunneeded). > > Any thoughts? For now memory is not a bottleneck, so I guess I'll be > satisfied with guessing big numbers for d_nz and o_nz. Still, I spent > a lot of time scratching my head since guessing higher numbers didn't > seem likely to have an effect. > > Thanks, > -Andrew -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From acolombi at gmail.com Mon Jun 30 15:15:12 2008 From: acolombi at gmail.com (Andrew Colombi) Date: Mon, 30 Jun 2008 15:15:12 -0500 Subject: Matrix Allocation In-Reply-To: References: <9dc10d950806301025i7a615f6ak19d2ccd84fc03e9f@mail.gmail.com> Message-ID: <9dc10d950806301315y4eb734baw455f1f1ba8bba977@mail.gmail.com> > 1) Could you be calling MatGetInfo() with the wrong matrix? Nice try, but I only have one matrix ;-) I call it A. We get along. > 2) Can you use -info to check the numbers? > 3) Can you try MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, 1)? Yes and yes. In fact, I even wrote a small test app just for this purpose which is revealing some interesting behavior. Results: Additional memory is being malloced; however, MatGetInfo does not appear to be reporting this. Here I've grepped the output for "malloc": $ rjq ./make -info | grep malloc [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. 
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1 [ 1] mallocs 0 memory 64 [ 0] mallocs 0 memory 64 The initial messages are from -info, and the last two are from my MatGetInfo snippet. Clearly two mallocs are happening that are not being reported. Maybe this is reason: those mallocs don't happen in MatAssembly_MPIAIJ? The reason for this is that each processor only computes the rows it's reponsible for. Thoughts? -Andrew On Mon, Jun 30, 2008 at 1:38 PM, Matthew Knepley wrote: > 1) Could you be calling MatGetInfo() with the wrong matrix? > > 2) Can you use -info to check the numbers? > > 3) Can you try MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, 1)? > > Matt > > On Mon, Jun 30, 2008 at 12:25 PM, Andrew Colombi wrote: >> I am a little confused by some behavior. >> >> I've been trying to understand some poor performance numbers and I >> noticed that if I doubled the preallocation numbers (presently I'm >> setting d_nz and o_nz to constants, no effort has been made to get the >> numbers _right_) performance improved dramatically (about 20X). >> >> This isn't so surprising except for the report from MatGetInfo. Even >> with the lower preallocation numbers I'm getting 0 in info.mallocs for >> >> MatInfo info; >> MatGetInfo(pc_A, MAT_LOCAL, &info); // done and reported on each >> machine individually >> >> So, my question is.... well.. why? If I'm seeing 0 mallocs before >> doubling prealloation shouldn't that mean I've preallocated enough? >> Or are their some switches I need to use to enable malloc counting? >> Also you can see (according to the same call to MetGetInfo) I'm >> wasting a lot of memory: >> >> // after doubling preallocation >> nz_alloc 6.2704e+07 nz_used 1.9125e+07 nz_unneed 4.3579e+07 >> >> (these are print outs of info.nz_allocated, info.nz_used and info.nzunneeded). >> >> Any thoughts? For now memory is not a bottleneck, so I guess I'll be >> satisfied with guessing big numbers for d_nz and o_nz. Still, I spent >> a lot of time scratching my head since guessing higher numbers didn't >> seem likely to have an effect. >> >> Thanks, >> -Andrew > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > From bsmith at mcs.anl.gov Mon Jun 30 15:26:49 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 30 Jun 2008 15:26:49 -0500 Subject: Matrix Allocation In-Reply-To: <9dc10d950806301315y4eb734baw455f1f1ba8bba977@mail.gmail.com> References: <9dc10d950806301025i7a615f6ak19d2ccd84fc03e9f@mail.gmail.com> <9dc10d950806301315y4eb734baw455f1f1ba8bba977@mail.gmail.com> Message-ID: <15A7211B-29BE-499D-AF6D-46D04E969668@mcs.anl.gov> I found the cause, from MatAssemblyEnd_SeqAIJ() ierr = MatMarkDiagonal_SeqAIJ(A);CHKERRQ(ierr); ierr = PetscInfo4(A,"Matrix size: %D X %D; storage space: %D unneeded,%D used\n",m,A->cmap.n,fshift,a->nz);CHKERRQ(ierr); ierr = PetscInfo1(A,"Number of mallocs during MatSetValues() is %D \n",a->reallocs);CHKERRQ(ierr); ierr = PetscInfo1(A,"Maximum nonzeros in any row is %D \n",rmax);CHKERRQ(ierr); a->reallocs = 0; A->info.nz_unneeded = (double)fshift; a->rmax = rmax; It prints the correct value and then sets the value to zero! Presumably this is done to get an accurate count of new reallocations if the user does a bunch more MatSetValues(). But this is silly. How to fix this? We could simply remove the zeroing and have this count be cumulative over all the previous MatSetValues/MatAssembly... calls. Or introduce two counters, a cumulative and recent counter. I would suggest going with the simple fix. Barry On Jun 30, 2008, at 3:15 PM, Andrew Colombi wrote: >> 1) Could you be calling MatGetInfo() with the wrong matrix? > > Nice try, but I only have one matrix ;-) I call it A. We get along. > >> 2) Can you use -info to check the numbers? >> 3) Can you try MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, 1)? > > Yes and yes. In fact, I even wrote a small test app just for this > purpose which is revealing some interesting behavior. Results: > Additional memory is being malloced; however, MatGetInfo does not > appear to be reporting this. Here I've grepped the output for > "malloc": > > $ rjq ./make -info | grep malloc > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() > is 0 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() > is 1 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() > is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() > is 1 > [ 1] mallocs 0 memory 64 > [ 0] mallocs 0 memory 64 > > The initial messages are from -info, and the last two are from my > MatGetInfo snippet. Clearly two mallocs are happening that are not > being reported. Maybe this is reason: those mallocs don't happen in > MatAssembly_MPIAIJ? The reason for this is that each processor only > computes the rows it's reponsible for. > > Thoughts? > -Andrew > > On Mon, Jun 30, 2008 at 1:38 PM, Matthew Knepley > wrote: >> 1) Could you be calling MatGetInfo() with the wrong matrix? >> >> 2) Can you use -info to check the numbers? >> >> 3) Can you try MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, 1)? >> >> Matt >> >> On Mon, Jun 30, 2008 at 12:25 PM, Andrew Colombi >> wrote: >>> I am a little confused by some behavior. >>> >>> I've been trying to understand some poor performance numbers and I >>> noticed that if I doubled the preallocation numbers (presently I'm >>> setting d_nz and o_nz to constants, no effort has been made to get >>> the >>> numbers _right_) performance improved dramatically (about 20X). 
>>> >>> This isn't so surprising except for the report from MatGetInfo. >>> Even >>> with the lower preallocation numbers I'm getting 0 in info.mallocs >>> for >>> >>> MatInfo info; >>> MatGetInfo(pc_A, MAT_LOCAL, &info); // done and reported on each >>> machine individually >>> >>> So, my question is.... well.. why? If I'm seeing 0 mallocs before >>> doubling prealloation shouldn't that mean I've preallocated enough? >>> Or are their some switches I need to use to enable malloc counting? >>> Also you can see (according to the same call to MetGetInfo) I'm >>> wasting a lot of memory: >>> >>> // after doubling preallocation >>> nz_alloc 6.2704e+07 nz_used 1.9125e+07 nz_unneed 4.3579e+07 >>> >>> (these are print outs of info.nz_allocated, info.nz_used and >>> info.nzunneeded). >>> >>> Any thoughts? For now memory is not a bottleneck, so I guess I'll >>> be >>> satisfied with guessing big numbers for d_nz and o_nz. Still, I >>> spent >>> a lot of time scratching my head since guessing higher numbers >>> didn't >>> seem likely to have an effect. >>> >>> Thanks, >>> -Andrew >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> >
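The check being discussed reduces to something like the following sketch (the matrix name A and the error-checking style are assumptions; the fields are the ones defined in MatInfo):

    MatInfo info;
    ierr = MatGetInfo(A,MAT_LOCAL,&info);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF,"mallocs %g nz_allocated %g nz_used %g nz_unneeded %g\n",
                       info.mallocs,info.nz_allocated,info.nz_used,info.nz_unneeded);CHKERRQ(ierr);

With the current source, info.mallocs reads 0 when queried after MatAssemblyEnd() because the per-matrix counter is reset there, which is exactly the behavior reported above; the -info output printed during assembly still shows the true number of mallocs, so it remains the reliable way to judge whether the preallocation was sufficient.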