From z.sheng at ewi.tudelft.nl Thu Nov 1 07:45:57 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 01 Nov 2007 13:45:57 +0100 Subject: memory usage of the unstructured grid In-Reply-To: <830561.29161.qm@web36201.mail.mud.yahoo.com> References: <830561.29161.qm@web36201.mail.mud.yahoo.com> Message-ID: <4729CA85.3070908@ewi.tudelft.nl> Shi Jin wrote: > Hi, > > I have observed that the memory usage of the petsc mesh is much higher > than my previous code, if both were to be run serially. > For example, for a simple cubic box with 750,000 tetrahedral elements, > my old code takes about 200MB for the whole array, including all the > mappings required for later use such as the inverse connectivity > table. For the same mesh, my PETSc code takes about 4GB for the mesh > alone. > > The same can be found in the provided examples. I made a few changes > to the navierStokes code to output the virtual memory usage and got > ./navierStokes -dim 3 -generate -structured 0 -refinement_limit 1e-6 > 109,283 elements, 139,030 edges , 21,523 vertexes > [0]:after mesh created:mem=574.46 MB > This is consistent with my Petsc code. > > I understand that for the mesh to scale in parallel, extra > information needs to be stored. But the current cost seems too > expensive. I am wondering if there is a way to cut the memory usage > for the mesh. > Thank you very much. > > Shi > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com Hi I am not expert on it... but I have used some FEM library beside mine, and I found out that some libraries create neighboring element, edge, and node list for every element. This is not necessary if you do use them. Similar thing could have happened in your case. Best Zhifeng From z.sheng at ewi.tudelft.nl Thu Nov 1 07:52:58 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 01 Nov 2007 13:52:58 +0100 Subject: Dynamic allocation of Non zeros? Message-ID: <4729CC2A.6010703@ewi.tudelft.nl> Dear all It is always not easy to determine the number of nonzeros in a row without any cost. I wonder if I can make dynamic allocation, for instance, if one found the nonzeros of a row overflow the pre-allocated space, then the space shall be doubled. This may waste some memory but it may not be that bad. Or maybe this is already done in Petsc? Could please tell what happens if I assign the nozero number in rows to be zeros and let the programme fills in the nonzeros ? Best regards Zhifeng Sheng From hzhang at mcs.anl.gov Thu Nov 1 09:32:31 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 1 Nov 2007 09:32:31 -0500 (CDT) Subject: Dynamic allocation of Non zeros? In-Reply-To: <4729CC2A.6010703@ewi.tudelft.nl> References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: On Thu, 1 Nov 2007, Zhifeng Sheng wrote: > Dear all > > It is always not easy to determine the number of nonzeros in a row without > any cost. > > I wonder if I can make dynamic allocation, for instance, if one found the > nonzeros of a row overflow the pre-allocated space, then the space shall be > doubled. > > This may waste some memory but it may not be that bad. Or maybe this is > already done in Petsc? This is implemented in petsc, and is the reasons that MatAssembly may become extremely slow when user does not preallocate memory. > > Could please tell what happens if I assign the nozero number in rows to be > zeros and let the programme fills in the nonzeros ? 
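For reference, the two usages being asked about look roughly like this. This is only a sketch against the 2.3.x C interface; the per-row bound of 30 and the helper name build_matrix are made-up placeholders, not recommendations.

    /* Sketch: preallocated vs. non-preallocated SeqAIJ creation. */
    #include "petscmat.h"

    PetscErrorCode build_matrix(PetscInt n, Mat *A)
    {
      PetscErrorCode ierr;
      PetscInt       i, *nnz;

      PetscFunctionBegin;
      /* Variant 1: no preallocation.  Rows that outgrow the reserved space
         force PETSc to enlarge the storage during MatSetValues(), which is
         the slowdown mentioned above.
         ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 0, PETSC_NULL, A);CHKERRQ(ierr); */

      /* Variant 2: pass an upper bound per row.  Some memory may be wasted,
         but no mallocs happen while the values are inserted. */
      ierr = PetscMalloc(n*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
      for (i = 0; i < n; i++) nnz[i] = 30;            /* placeholder upper bound */
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 0, nnz, A);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);

      /* ... MatSetValues() loop, MatAssemblyBegin()/MatAssemblyEnd() ... */
      PetscFunctionReturn(0);
    }
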
It would be better that you assign a max nozero number than zero to reduce memory malloc. Hong > > Best regards > Zhifeng Sheng > > From bsmith at mcs.anl.gov Thu Nov 1 16:08:25 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Nov 2007 16:08:25 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <4729CC2A.6010703@ewi.tudelft.nl> References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: Zhifeng, We already do this; it helps with the time, but still results in very inefficient runs if your preallocation is zero entries. Perhaps you could describe how your matrix entries are generated and we can suggest a preallocation scheme? Barry On Nov 1, 2007, at 7:52 AM, Zhifeng Sheng wrote: > Dear all > > It is always not easy to determine the number of nonzeros in a row > without any cost. > > I wonder if I can make dynamic allocation, for instance, if one > found the nonzeros of a row overflow the pre-allocated space, then > the space shall be doubled. > > This may waste some memory but it may not be that bad. Or maybe this > is already done in Petsc? > > Could please tell what happens if I assign the nozero number in rows > to be zeros and let the programme fills in the nonzeros ? > > Best regards > Zhifeng Sheng > From berend at chalmers.se Thu Nov 1 16:17:02 2007 From: berend at chalmers.se (Berend van Wachem) Date: Thu, 01 Nov 2007 22:17:02 +0100 Subject: Dynamic allocation of Non zeros? In-Reply-To: References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: <472A424E.1000608@chalmers.se> Hi Barry, > Perhaps you could describe how your matrix entries are generated and > we can suggest a > preallocation scheme? Are there examples of preallocation schemes for PETSc? For instance, if I would like to solve Poisson equation on an irregular grid (9 point stencil in 2d and 27 in 3d) on n processors? Thanks, Berend. From bsmith at mcs.anl.gov Thu Nov 1 16:25:15 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Nov 2007 16:25:15 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <472A424E.1000608@chalmers.se> References: <4729CC2A.6010703@ewi.tudelft.nl> <472A424E.1000608@chalmers.se> Message-ID: <5028F943-1EF9-4378-9F3D-64A3D3910789@mcs.anl.gov> On Nov 1, 2007, at 4:17 PM, Berend van Wachem wrote: > Hi Barry, > > >> Perhaps you could describe how your matrix entries are generated >> and we can suggest a >> preallocation scheme? > > Are there examples of preallocation schemes for PETSc? For instance, > if I would like to solve Poisson equation on an irregular grid (9 > point stencil in 2d and 27 in 3d) on n processors? This would depend on how you store your grid information; is it treated as unstructured? For structured or semi-structured take a look at DAGetMatrix2d_MPIAIJ() in src/dm/da/utils/fdda.c Barry > > > Thanks, > Berend. > From knepley at gmail.com Thu Nov 1 18:05:07 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 1 Nov 2007 18:05:07 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <472A424E.1000608@chalmers.se> References: <4729CC2A.6010703@ewi.tudelft.nl> <472A424E.1000608@chalmers.se> Message-ID: The short answer is that preallocation looks exactly like an assembly, but individual values are not calculated, just sizes. You can look at the routines I have for this for the Mesh class in preallocateOperator() in src/dm/mesh/mesh.c. Matt On Nov 1, 2007 4:17 PM, Berend van Wachem wrote: > Hi Barry, > > > > Perhaps you could describe how your matrix entries are generated and > > we can suggest a > > preallocation scheme? 
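The "preallocation looks exactly like an assembly" pattern can be sketched as a counting pass over the same element loop used for assembly. Everything named here (nelem, nen, the connectivity array elem) is a hypothetical stand-in for whatever the application actually stores.

    /* Counting pass: same loop structure as the assembly, but only row
       counts are accumulated.  Duplicates are counted, so the result is
       an upper bound, which is all preallocation needs. */
    #include "petscmat.h"

    PetscErrorCode preallocate_from_elements(PetscInt nrows, PetscInt nelem,
                                             PetscInt nen, const PetscInt *elem,
                                             Mat *A)
    {
      PetscErrorCode ierr;
      PetscInt       e, a, b, i, *nnz;

      PetscFunctionBegin;
      ierr = PetscMalloc(nrows*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
      for (i = 0; i < nrows; i++) nnz[i] = 0;
      for (e = 0; e < nelem; e++)
        for (a = 0; a < nen; a++)
          for (b = 0; b < nen; b++)
            nnz[elem[e*nen + a]]++;
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, nrows, nrows, 0, nnz, A);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

Counting duplicates this way only overestimates the row lengths, which is harmless for preallocation purposes.
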
> > Are there examples of preallocation schemes for PETSc? For instance, if > I would like to solve Poisson equation on an irregular grid (9 point > stencil in 2d and 27 in 3d) on n processors? > > Thanks, > Berend. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Fri Nov 2 07:23:40 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:23:40 +0100 Subject: Dynamic allocation of Non zeros? In-Reply-To: References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: <472B16CC.7030906@ewi.tudelft.nl> Barry Smith wrote: > > > Zhifeng, > > We already do this; it helps with the time, but still results in very > inefficient runs if your preallocation is zero entries. > > Perhaps you could describe how your matrix entries are generated and > we can suggest a > preallocation scheme? > > Barry > > > On Nov 1, 2007, at 7:52 AM, Zhifeng Sheng wrote: > >> Dear all >> >> It is always not easy to determine the number of nonzeros in a row >> without any cost. >> >> I wonder if I can make dynamic allocation, for instance, if one found >> the nonzeros of a row overflow the pre-allocated space, then the >> space shall be doubled. >> >> This may waste some memory but it may not be that bad. Or maybe this >> is already done in Petsc? >> >> Could please tell what happens if I assign the nozero number in rows >> to be zeros and let the programme fills in the nonzeros ? >> >> Best regards >> Zhifeng Sheng >> > In that case I'd better do some preallocation. I am working on 3D FEM code with unstructured tetrahedron mesh, and on each node, we can have 3 to arround 12 unknowns, at this moment I assigned 50 nonzeros to each rows, and it is sufficient... just half of the memory assigned went wasted. Then I tried 25 nonzeros in a row, but this is not enough and the performance is terrible. So... do I really need to compute the nonzero pattern? Ps. my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the nonzeros in a row, does it mean the actually nonzeros numbers or the memory that is needed? (for instance, for SeqSBAIJ, the actual nonzeros in a row would twice as much as memory needed) Thank you Best regards Zhifeng From z.sheng at ewi.tudelft.nl Fri Nov 2 07:54:28 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:54:28 +0100 Subject: not enough memory?! Message-ID: <472B1E04.1090906@ewi.tudelft.nl> Dear all I have a problem for memory management. I implemented 3d FEM code and at this moment SeqSBAIJ is used to store the system matrix, then I used CG method and ICC(k) preconditioner. The test configuration is not very big, I succeeded in constructing the system matrix and construct the preconditioner. But when I need to setup the KSP solver, I got the error message below ( I also dumped some information about solver and system matrix): KSP Object: type: cg maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: icc ICC: 2 levels of fill ICC: factor fill ratio allocated 1 linear system matrix = precond matrix: Matrix Object: type=seqsbaij, rows=435450, cols=435450 total: nonzeros=9470450, allocated nonzeros=21772580 block size is 1 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. 
This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. [0]PETSC ERROR: Memory requested 3108507040! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 13:26:39 2007 [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/zhifeng/install --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in src/mat/impls/sbaij/seq/sbaijfact2.c [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c =====================the solver convergence info====================== Convergence in 0 iterations. ===================================================================== writing solution to file temp_H.vtk ... number of unknowns >>435450 finishing the numeric solver ... LSFIM hybrid method for four domain problem deallocating memory of domain class ... I debug the code, and the error is dumped by the funciton KSPSetUp() which should not take so much memory, but still 3108507040 memory was requested .... I really could not figure out where I need them... Any help will be appreciated. Thank you all Best regards Zhifeng From z.sheng at ewi.tudelft.nl Fri Nov 2 07:56:22 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:56:22 +0100 Subject: PCFactorSetLevels does not work?! In-Reply-To: References: <925346A443D4E340BEB20248BAFCDBDF02D8501F@CFEVS1-IP.americas.cray.com> <4728AC10.4040500@ewi.tudelft.nl> Message-ID: <472B1E76.1010200@ewi.tudelft.nl> Barry Smith wrote: > Are you calling the PCFactorSetLevels() on the prec where the >ICC is being used. For example if your * below means >-sub_pc_factor_levels then you need to use >PCBJacobiGetSubKSP() then get the PC out of that that subksp >and call the PCFactorSetLevels() on that > > Barry > > >On Wed, 31 Oct 2007, Zhifeng Sheng wrote: > > > >>Dear all >> >>I have a weird problem. >> >>In my programme, I try to use ICC(K), while I set the factor level with >> >>PCFactorSetLevels(prec, 4); >> >>However this does not work. 
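A sketch of the block-Jacobi case Barry describes, assuming ksp already exists and its outer preconditioner is PCBJACOBI; the level count 4 just mirrors the value in the question.

    /* Set ICC(4) on each sub-KSP's PC rather than on the outer PC. */
    KSP            *subksp;
    PC             pc, subpc;
    PetscInt       nlocal, first, i;
    PetscErrorCode ierr;

    ierr = KSPSetUp(ksp);CHKERRQ(ierr);         /* sub-KSPs exist only after setup */
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCBJacobiGetSubKSP(pc, &nlocal, &first, &subksp);CHKERRQ(ierr);
    for (i = 0; i < nlocal; i++) {
      ierr = KSPGetPC(subksp[i], &subpc);CHKERRQ(ierr);
      ierr = PCSetType(subpc, PCICC);CHKERRQ(ierr);
      ierr = PCFactorSetLevels(subpc, 4);CHKERRQ(ierr);
    }

For a purely sequential run with ICC as the top-level preconditioner, the same PCFactorSetLevels() call goes on the PC obtained directly from KSPGetPC(), before the solve.
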
>> >>Then I run the executable with the option *-pc_factor_levels 4, it works... >>Can someone tell me why this is happenning? >> >>Thank you >> >>Best regards >>Zhifeng Sheng >>* >> >> >> >> > > > Dear all I figured out the reason .... DO NOT use KSPSetUP() before you configurate you solver and preconditioners, otherwise, your configuration shall be ignored. Best regards Zhifeng From knepley at gmail.com Fri Nov 2 08:25:59 2007 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Nov 2007 08:25:59 -0500 Subject: not enough memory?! In-Reply-To: <472B1E04.1090906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: The routine which calculates ICC requested 3G, so it seems that the matrix is too dense or large to be factored on this machine. Matt On Nov 2, 2007 7:54 AM, Zhifeng Sheng wrote: > Dear all > > I have a problem for memory management. > > I implemented 3d FEM code and at this moment SeqSBAIJ is used to store > the system matrix, then I used CG method and ICC(k) preconditioner. > > The test configuration is not very big, I succeeded in constructing the > system matrix and construct the preconditioner. But when I need to setup > the KSP solver, I got the error message below ( I also dumped some > information about solver and system matrix): > > KSP Object: > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: icc > ICC: 2 levels of fill > ICC: factor fill ratio allocated 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqsbaij, rows=435450, cols=435450 > total: nonzeros=9470450, allocated nonzeros=21772580 > block size is 1 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 3108507040! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 > 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri > Nov 2 13:26:39 2007 > [0]PETSC ERROR: Libraries linked from > /u/01/01/zhifeng/install/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 > --download-f-blas-lapack=1 --download-mpich=1 > --prefix=/u/01/01/zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c > [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in > src/mat/impls/sbaij/seq/sbaijfact2.c > [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c > [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c > =====================the solver convergence info====================== > Convergence in 0 iterations. > ===================================================================== > writing solution to file temp_H.vtk ... > number of unknowns >>435450 > finishing the numeric solver ... > LSFIM hybrid method for four domain problem > deallocating memory of domain class ... > > > I debug the code, and the error is dumped by the funciton KSPSetUp() > which should not take so much memory, but still 3108507040 memory was > requested .... I really could not figure out where I need them... > > Any help will be appreciated. > > Thank you all > Best regards > Zhifeng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Fri Nov 2 09:07:09 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 15:07:09 +0100 Subject: not enough memory?! In-Reply-To: References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: <472B2F0D.4010906@ewi.tudelft.nl> Matthew Knepley wrote: >The routine which calculates ICC requested 3G, so it seems that the matrix >is too dense or large to be factored on this machine. > > Matt > >On Nov 2, 2007 7:54 AM, Zhifeng Sheng wrote: > > >>Dear all >> >>I have a problem for memory management. >> >>I implemented 3d FEM code and at this moment SeqSBAIJ is used to store >>the system matrix, then I used CG method and ICC(k) preconditioner. >> >>The test configuration is not very big, I succeeded in constructing the >>system matrix and construct the preconditioner. 
But when I need to setup >>the KSP solver, I got the error message below ( I also dumped some >>information about solver and system matrix): >> >>KSP Object: >> type: cg >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000 >> left preconditioning >>PC Object: >> type: icc >> ICC: 2 levels of fill >> ICC: factor fill ratio allocated 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqsbaij, rows=435450, cols=435450 >> total: nonzeros=9470450, allocated nonzeros=21772580 >> block size is 1 >>[0]PETSC ERROR: --------------------- Error Message >>------------------------------------ >>[0]PETSC ERROR: Out of memory. This could be due to allocating >>[0]PETSC ERROR: too large an object or bleeding by not properly >>[0]PETSC ERROR: destroying unneeded objects. >>[0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 >>[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. >>[0]PETSC ERROR: Memory requested 3108507040! >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 >>19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 >>[0]PETSC ERROR: See docs/changes/index.html for recent updates. >>[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>[0]PETSC ERROR: See docs/index.html for manual pages. >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri >>Nov 2 13:26:39 2007 >>[0]PETSC ERROR: Libraries linked from >>/u/01/01/zhifeng/install/lib/linux-gnu-c-debug >>[0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 >>[0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 >>--download-f-blas-lapack=1 --download-mpich=1 >>--prefix=/u/01/01/zhifeng/install --with-shared=0 >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c >>[0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c >>[0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c >>[0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in >>src/mat/impls/sbaij/seq/sbaijfact2.c >>[0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in >>src/mat/interface/matrix.c >>[0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c >>[0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c >>[0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c >>[0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c >>=====================the solver convergence info====================== >>Convergence in 0 iterations. >>===================================================================== >>writing solution to file temp_H.vtk ... >>number of unknowns >>435450 >>finishing the numeric solver ... >>LSFIM hybrid method for four domain problem >>deallocating memory of domain class ... >> >> >>I debug the code, and the error is dumped by the funciton KSPSetUp() >>which should not take so much memory, but still 3108507040 memory was >>requested .... I really could not figure out where I need them... >> >>Any help will be appreciated. >> >>Thank you all >>Best regards >>Zhifeng >> >> >> >> > > > > > So ... 
computation of the preconditioner is actually called by KSPSetUp() instead of PCGetFactoredMatrix(prec,&M)? Could you please tell me when the system matrix be factorized (incompletely)? thank you. Zhifeng From hzhang at mcs.anl.gov Fri Nov 2 09:22:36 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 2 Nov 2007 09:22:36 -0500 (CDT) Subject: not enough memory?! In-Reply-To: <472B2F0D.4010906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> Message-ID: k>>> >>> >> >> >> >> > So ... computation of the preconditioner is actually called by KSPSetUp() > instead of PCGetFactoredMatrix(prec,&M)? Yes. > > Could you please tell me when the system matrix be factorized (incompletely)? > thank you. During a call of KSPSetUp(). Hong > > Zhifeng > > From z.sheng at ewi.tudelft.nl Fri Nov 2 10:04:48 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 16:04:48 +0100 Subject: not enough memory?! In-Reply-To: References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> Message-ID: <472B3C90.2020906@ewi.tudelft.nl> Hong Zhang wrote: > > k>>> > >>>> >>> >>> >>> >>> >> So ... computation of the preconditioner is actually called by >> KSPSetUp() instead of PCGetFactoredMatrix(prec,&M)? > > > Yes. > >> >> Could you please tell me when the system matrix be factorized >> (incompletely)? thank you. > > > During a call of KSPSetUp(). > > Hong > >> >> Zhifeng >> >> > Thank you for your reply, since the problem I have is ICC took too much memory, but I think I can fix it with reorderring the matrix. My matrix is SeqSBAIJ and ICC is used as preconditioner, I think the default orderring is "natural", while I think nested-dissection is what I need. However, when I choose nested dissection with - *pc_factor_mat_ordering_type nd, this is what comes out: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] SPARSEPACKfndsep line 50 src/mat/order/fndsep.c [0]PETSC ERROR: [0] SPARSEPACKgennd line 70 src/mat/order/gennd.c [0]PETSC ERROR: [0] MatOrdering_ND line 18 src/mat/order/spnd.c [0]PETSC ERROR: [0] MatGetOrdering line 187 src/mat/order/sorder.c [0]PETSC ERROR: [0] PCSetup_ICC line 113 src/ksp/pc/impls/factor/icc/icc.c [0]PETSC ERROR: [0] PCSetUp line 778 src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSolve line 305 src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 16:00:58 2007 [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/zhifeng/install --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 PS: which reorderring schemas are symmetric? Thanks alot Best regards Zhifeng Sheng * From jens.madsen at risoe.dk Fri Nov 2 10:37:27 2007 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Fri, 02 Nov 2007 16:37:27 +0100 Subject: Larger stencils in DA Message-ID: Hi Do you have examples where stencils are wider than the DA types (box and star)? I need a ghost region twice the normal size for a fourth order derivative. Are there any plans of introducing a stencil width into DA? Kind Regards Jens MAdsen -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Nov 2 10:47:47 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 2 Nov 2007 10:47:47 -0500 (CDT) Subject: not enough memory?! In-Reply-To: <472B3C90.2020906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> <472B3C90.2020906@ewi.tudelft.nl> Message-ID: >>> >> > Thank you for your reply, since the problem I have is ICC took too much > memory, but I think I can fix it with reorderring the matrix. > > My matrix is SeqSBAIJ and ICC is used as preconditioner, I think the default > orderring is "natural", while I think nested-dissection is what I need. > > However, when I choose nested dissection with - *pc_factor_mat_ordering_type > nd, this is what comes out: We do not support Matrix reordering for sbaij matrix format because reordering sbaij matrix changes matrix data structure - very expensive. You can use aij matrix, which enables efficient matrix reordering. We support icc for aij matrix. Note: using aij matrix should not increase much of memory, only additional half of original sparse matrix entries are added. All matrix operations for aij matrix can only be faster than sbaij's. Hong > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find > memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] SPARSEPACKfndsep line 50 src/mat/order/fndsep.c > [0]PETSC ERROR: [0] SPARSEPACKgennd line 70 src/mat/order/gennd.c > [0]PETSC ERROR: [0] MatOrdering_ND line 18 src/mat/order/spnd.c > [0]PETSC ERROR: [0] MatGetOrdering line 187 src/mat/order/sorder.c > [0]PETSC ERROR: [0] PCSetup_ICC line 113 src/ksp/pc/impls/factor/icc/icc.c > [0]PETSC ERROR: [0] PCSetUp line 778 src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSPSolve line 305 src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 > CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 > 16:00:58 2007 > [0]PETSC ERROR: Libraries linked from > /u/01/01/zhifeng/install/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 > --download-f-blas-lapack=1 --download-mpich=1 > --prefix=/u/01/01/zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown > file > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > PS: which reorderring schemas are symmetric? > > Thanks alot > > Best regards > Zhifeng Sheng > * > > > > > > From bsmith at mcs.anl.gov Fri Nov 2 10:53:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Nov 2007 10:53:52 -0500 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: Jens, This is the argument called s in the calls to DACreateXXX() I hope this serves your needs. Barry Note: with a wider stencil we still only support box and star so you may have some points that the DA thinks are in your stencil but that you do not need to use. This is harmless. On Nov 2, 2007, at 10:37 AM, jens.madsen at risoe.dk wrote: > Hi > > Do you have examples where stencils are wider than the DA types (box > and star)? I need a ghost region twice the normal size for a fourth > order derivative. > > Are there any plans of introducing a stencil width into DA? > > Kind Regards > > Jens MAdsen -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbostandoust at yahoo.com Fri Nov 2 10:50:12 2007 From: mbostandoust at yahoo.com (Mehdi Bostandoost) Date: Fri, 2 Nov 2007 08:50:12 -0700 (PDT) Subject: Larger stencils in DA In-Reply-To: Message-ID: <930783.14701.qm@web33507.mail.mud.yahoo.com> hi box or star are the type of overlapping. if you need more ghost points,you can specify it with parameter s in DACreate. mehdi jens.madsen at risoe.dk wrote: Hi Do you have examples where stencils are wider than the DA types (box and star)? I need a ghost region twice the normal size for a fourth order derivative. Are there any plans of introducing a stencil width into DA? Kind Regards Jens MAdsen __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! 
Mail has the best spam protection around http://mail.yahoo.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Nov 2 11:01:18 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Nov 2007 11:01:18 -0500 Subject: not enough memory?! In-Reply-To: <472B1E04.1090906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: <94D44493-F62A-48CD-AC21-FFE6B3A14631@mcs.anl.gov> In the output: ICC: factor fill ratio allocated 1 but ICC: 2 levels of fill You should use the option -pc_factor_fill (or -sub_pc_factor_fill if using block Jacobi or ASM in parallel) and set to be the expected ratio of nonzeros in the factored matrix and the unfactored matrix; I suggest starting with at least 3 since you are using 2 levels of fill. See the manual page for PCFactorSetFill() Barry On Nov 2, 2007, at 7:54 AM, Zhifeng Sheng wrote: > Dear all > > I have a problem for memory management. > > I implemented 3d FEM code and at this moment SeqSBAIJ is used to > store the system matrix, then I used CG method and ICC(k) > preconditioner. > > The test configuration is not very big, I succeeded in constructing > the system matrix and construct the preconditioner. But when I need > to setup the KSP solver, I got the error message below ( I also > dumped some information about solver and system matrix): > > KSP Object: > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: icc > ICC: 2 levels of fill > ICC: factor fill ratio allocated 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqsbaij, rows=435450, cols=435450 > total: nonzeros=9470450, allocated nonzeros=21772580 > block size is 1 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 375435120 Memory used by process > 428597248 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 3108507040! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 > 19:13:22 CDT 2007 HG revision: > d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng > Fri Nov 2 13:26:39 2007 > [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/ > linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 -- > download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/ > zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/ > freespace.c > [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in src/mat/ > impls/sbaij/seq/sbaijfact2.c > [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/ > icc/icc.c > [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c > =====================the solver convergence info====================== > Convergence in 0 iterations. > ===================================================================== > writing solution to file temp_H.vtk ... > number of unknowns >>435450 > finishing the numeric solver ... > LSFIM hybrid method for four domain problem > deallocating memory of domain class ... > > > I debug the code, and the error is dumped by the funciton KSPSetUp() > which should not take so much memory, but still 3108507040 memory > was requested .... I really could not figure out where I need them... > > Any help will be appreciated. > > Thank you all > Best regards > Zhifeng > From z.sheng at ewi.tudelft.nl Tue Nov 6 03:29:04 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Tue, 06 Nov 2007 10:29:04 +0100 Subject: reordering does not work for ICC? Message-ID: <473033E0.6020802@ewi.tudelft.nl> Dear all I tried to reorder my the preconditioner with different reordering schema. All worked well for ILU, but does not make any difference on ICC. It seems that the reordering schema does not work for ICC at all.... Is it supposed to be like this? PS: I have a symmetric matrix and I would like to save some memory. I used SBAIJ with block=1, but some told me it's not efficient ... So... what can I do to save some memory on matrix and preconditioner? Thank you Best regards Zhifeng Sheng From bsmith at mcs.anl.gov Tue Nov 6 08:26:23 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Nov 2007 08:26:23 -0600 Subject: reordering does not work for ICC? In-Reply-To: <473033E0.6020802@ewi.tudelft.nl> References: <473033E0.6020802@ewi.tudelft.nl> Message-ID: <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: > Dear all > > I tried to reorder my the preconditioner with different reordering > schema. > > All worked well for ILU, but does not make any difference on ICC. It > seems that the reordering schema does not work for ICC at all.... > > Is it supposed to be like this? Yes, with sbaij only the upper triangular part of the matrix is stored; hence reordering doesn't make sense since the values needed in the reordered form are not available. 
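For the symmetric problem in this thread, the AIJ-based alternative looks roughly like the sketch below; A, b and x are assumed to exist already, and the level count, fill ratio and run-time ordering are example values echoing the suggestions earlier in the thread, not tuned recommendations.

    /* CG + ICC(k) on an AIJ matrix, so that a fill-reducing ordering can
       be applied during the symbolic factorization. */
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPCreate(PETSC_COMM_SELF, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCICC);CHKERRQ(ierr);
    ierr = PCFactorSetLevels(pc, 2);CHKERRQ(ierr);     /* example value             */
    ierr = PCFactorSetFill(pc, 3.0);CHKERRQ(ierr);     /* >= 3 suggested for ICC(2) */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);       /* e.g. -pc_factor_mat_ordering_type rcm */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
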
You can use the SeqAIJ format if you want to do reorderings with the ICC. Barry > > > PS: I have a symmetric matrix and I would like to save some memory. > I used SBAIJ with block=1, but some told me it's not efficient ... SBAIJ with block=1 is just as efficient as AIJ! The is seperate code for each block size. Barry > > So... what can I do to save some memory on matrix and preconditioner? > > > Thank you > Best regards > Zhifeng Sheng > From knepley at gmail.com Tue Nov 6 08:47:23 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Nov 2007 08:47:23 -0600 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: On Oct 31, 2007 8:38 AM, wrote: > > > > > Hi > > > > Do you have examples where stencils not fitting the DA types (box and star) > are being used? We have extensions for multicomponent problems, like DASetBlockFills(). > Are there any plans of introducing a stencil width into DA? We have one. It is an argument to DACreate2D(). Matt > > Kind Regards > > > > Jens MAdsen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From jens.madsen at risoe.dk Tue Nov 6 09:03:34 2007 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Tue, 06 Nov 2007 16:03:34 +0100 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: Hi Thank you for all your answer to my up to date most stupid question ever :-) I somehow missed the "s" parameter in the documentation. Sorry Jens Madsen Ph.d.-studerende Phone direct +45 4677 4560 Mobile jens.madsen at risoe.dk Optics and Plasma Research Department Ris? National Laboratory Technical University of Denmark - DTU Building 128, P.O. Box 49 DK-4000 Roskilde, Denmark Tel +45 4677 4500 Fax +45 4677 4565 www.risoe.dk >From 1 January 2007, Ris? National Laboratory, the Danish Institute for Food and Veterinary Research, the Danish Institute for Fisheries Research, the Danish National Space Center and the Danish Transport Research Institute have been merged with the Technical University of Denmark (DTU) with DTU as the continuing unit. -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Tuesday, November 06, 2007 3:47 PM To: petsc-users at mcs.anl.gov Cc: petsc-maint Subject: Re: Larger stencils in DA On Oct 31, 2007 8:38 AM, wrote: > > > > > Hi > > > > Do you have examples where stencils not fitting the DA types (box and star) > are being used? We have extensions for multicomponent problems, like DASetBlockFills(). > Are there any plans of introducing a stencil width into DA? We have one. It is an argument to DACreate2D(). Matt > > Kind Regards > > > > Jens MAdsen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Tue Nov 6 09:29:43 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Tue, 06 Nov 2007 16:29:43 +0100 Subject: reordering does not work for ICC? In-Reply-To: <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> References: <473033E0.6020802@ewi.tudelft.nl> <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> Message-ID: <47308867.1010309@ewi.tudelft.nl> Barry Smith wrote: > > On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: > >> Dear all >> >> I tried to reorder my the preconditioner with different reordering >> schema. 
>> >> All worked well for ILU, but does not make any difference on ICC. It >> seems that the reordering schema does not work for ICC at all.... >> >> Is it supposed to be like this? > > > Yes, with sbaij only the upper triangular part of the matrix is > stored; hence reordering doesn't make sense > since the values needed in the reordered form are not available. You > can use the SeqAIJ format if you want > to do reorderings with the ICC. > > Barry > >> >> >> PS: I have a symmetric matrix and I would like to save some memory. I >> used SBAIJ with block=1, but some told me it's not efficient ... > > > SBAIJ with block=1 is just as efficient as AIJ! The is seperate code > for each block size. > > Barry > >> >> So... what can I do to save some memory on matrix and preconditioner? >> >> >> Thank you >> Best regards >> Zhifeng Sheng >> > Thank you for your answer. Could you please tell me if I can first assemble the matrix with a sufficiently large nonzero number per row and then release the redundent memory after the assembly is done? and what does MatCompress do? I tried it on my matrix, nothing happened ... Thank you Best regards From bsmith at mcs.anl.gov Tue Nov 6 10:29:29 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Nov 2007 10:29:29 -0600 Subject: reordering does not work for ICC? In-Reply-To: <47308867.1010309@ewi.tudelft.nl> References: <473033E0.6020802@ewi.tudelft.nl> <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> <47308867.1010309@ewi.tudelft.nl> Message-ID: There is no mechanism to release that extra memory. Thus it is important to preallocate well for large problems. Barry On Nov 6, 2007, at 9:29 AM, Zhifeng Sheng wrote: > Barry Smith wrote: > >> >> On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: >> >>> Dear all >>> >>> I tried to reorder my the preconditioner with different reordering >>> schema. >>> >>> All worked well for ILU, but does not make any difference on ICC. >>> It seems that the reordering schema does not work for ICC at all.... >>> >>> Is it supposed to be like this? >> >> >> Yes, with sbaij only the upper triangular part of the matrix is >> stored; hence reordering doesn't make sense >> since the values needed in the reordered form are not available. >> You can use the SeqAIJ format if you want >> to do reorderings with the ICC. >> >> Barry >> >>> >>> >>> PS: I have a symmetric matrix and I would like to save some >>> memory. I used SBAIJ with block=1, but some told me it's not >>> efficient ... >> >> >> SBAIJ with block=1 is just as efficient as AIJ! The is seperate >> code for each block size. >> >> Barry >> >>> >>> So... what can I do to save some memory on matrix and >>> preconditioner? >>> >>> >>> Thank you >>> Best regards >>> Zhifeng Sheng >>> >> > Thank you for your answer. > > Could you please tell me if I can first assemble the matrix with a > sufficiently large nonzero number per row and then release the > redundent memory after the assembly is done? > > and what does MatCompress do? I tried it on my matrix, nothing > happened ... > > Thank you > Best regards > From z.sheng at ewi.tudelft.nl Mon Nov 12 10:36:54 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Mon, 12 Nov 2007 17:36:54 +0100 Subject: Nonzeros of SBAIJ Message-ID: <47388126.2010607@ewi.tudelft.nl> Dear all my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the nonzeros in a row, does it mean the actually nonzeros numbers or the memory that is needed? 
(for instance, for SeqSBAIJ, the actual nonzeros in a row would twice as much as memory needed) Thank you Best regards Zhifeng From hzhang at mcs.anl.gov Mon Nov 12 11:16:06 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 12 Nov 2007 11:16:06 -0600 (CST) Subject: Nonzeros of SBAIJ In-Reply-To: <47388126.2010607@ewi.tudelft.nl> References: <47388126.2010607@ewi.tudelft.nl> Message-ID: On Mon, 12 Nov 2007, Zhifeng Sheng wrote: > Dear all > > my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the > nonzeros in a row, does it mean the actually nonzeros numbers or the memory > that is needed? (for instance, for SeqSBAIJ, the actual nonzeros in a row > would twice as much as memory needed) No, you specify the nonzeros of upper triangular part. Hong > > Thank you > Best regards > Zhifeng > > From bsmith at mcs.anl.gov Mon Nov 12 13:21:51 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 12 Nov 2007 13:21:51 -0600 Subject: Nonzeros of SBAIJ In-Reply-To: References: <47388126.2010607@ewi.tudelft.nl> Message-ID: <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> Plus 1 for each row for the diagonal. Barry On Nov 12, 2007, at 11:16 AM, Hong Zhang wrote: > > > On Mon, 12 Nov 2007, Zhifeng Sheng wrote: > >> Dear all >> >> my matrix is SeqSBAIJ with block size 1, so I am wondering when I >> specify the nonzeros in a row, does it mean the actually nonzeros >> numbers or the memory that is needed? (for instance, for SeqSBAIJ, >> the actual nonzeros in a row would twice as much as memory needed) > > No, you specify the nonzeros of upper triangular part. > > Hong > >> >> Thank you >> Best regards >> Zhifeng >> >> > From grs2103 at columbia.edu Tue Nov 13 10:06:29 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Tue, 13 Nov 2007 11:06:29 -0500 Subject: multi core os x machines Message-ID: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Has anyone had any success in getting good performance on multi-core intel os x machines with petsc? What's the right way to get MPICH up and running for such a thing? -Gideon Simpson Department of Applied Physics and Applied Mathematics Columbia University -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 13 10:14:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 Nov 2007 10:14:17 -0600 Subject: multi core os x machines In-Reply-To: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: Not possible. The problem is that with one process it uses all the memory bandwidth, when you change to use 2 processes (2 cores) each core now gets only half the memory bandwidth and hence essentially half the speed. Barry Barry On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > Has anyone had any success in getting good performance on multi-core > intel os x machines with petsc? What's the right way to get MPICH > up and running for such a thing? > > -Gideon Simpson > Department of Applied Physics and Applied Mathematics > Columbia University > > From grs2103 at columbia.edu Tue Nov 13 10:23:01 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Tue, 13 Nov 2007 11:23:01 -0500 Subject: multi core os x machines In-Reply-To: References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: This is also true for a multi-processor machine, or its unique to multi-core machines? -gideon On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > > Not possible. 
The problem is that with one process it uses all > the memory > bandwidth, when you change to use 2 processes (2 cores) each core > now gets only half the memory bandwidth and hence essentially half > the speed. > > Barry > > > Barry > > On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > >> Has anyone had any success in getting good performance on multi- >> core intel os x machines with petsc? What's the right way to get >> MPICH up and running for such a thing? >> >> -Gideon Simpson >> Department of Applied Physics and Applied Mathematics >> Columbia University >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 13 10:31:05 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 Nov 2007 10:31:05 -0600 Subject: multi core os x machines In-Reply-To: References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> It depends on how the memory is connected to the individual cores or CPUS; for example the AMD has a different approach than Intel. If the different processors/cores have SEPERATE paths to memory then you will not see this terrible effect. Barry On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > This is also true for a multi-processor machine, or its unique to > multi-core machines? > -gideon > > On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > >> >> Not possible. The problem is that with one process it uses all >> the memory >> bandwidth, when you change to use 2 processes (2 cores) each core >> now gets only half the memory bandwidth and hence essentially half >> the speed. >> >> Barry >> >> >> Barry >> >> On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: >> >>> Has anyone had any success in getting good performance on multi- >>> core intel os x machines with petsc? What's the right way to get >>> MPICH up and running for such a thing? >>> >>> -Gideon Simpson >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Nov 13 10:57:07 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Nov 2007 10:57:07 -0600 (CST) Subject: multi core os x machines In-Reply-To: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> Message-ID: Actually the new intels are pretty similar to the AMDs these days. There are 2 things here: - multiple-cores per chip - and multiple chips For eg: one can buy 2-chip-dual-core = 4CPU machine. [with AMD each chip has a separate memory bank. With intel, there is a single controller with multiple banks. But when 1 chip is used - only half the memory banks are accessed - or something like that] So in both AMD and Intel, when both chips [each chip - a dual-core] are used, MB available scales up - as compared to 1 chip usage. However within a chip [i.e dual core] - the MB from main memory to cpu/cache is same irrespective of both cores being used or only one. So when both are used - the effective memory bandwith is not scaling up. So to get best parallel speedup - one should choose 'np' as 'no_of_memory banks' - not 'no_of_cpus'. 
So, on this 2x2 = 4CPU machine, I suspect the best performance scaling can be seen only with '-np 2' Wrt MPICH on SMP, we were sugested to use the following MPICH configure options: --with-pm=gforker --device=ch3:nemesis --enable-fast Satish On Tue, 13 Nov 2007, Barry Smith wrote: > > It depends on how the memory is connected to the individual cores or CPUS; > for example the AMD has a different approach than Intel. If the different > processors/cores > have SEPERATE paths to memory then you will not see this terrible effect. > > Barry > > > > On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > > > This is also true for a multi-processor machine, or its unique to multi-core > > machines? > > -gideon > > > > On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > > > > > > > > Not possible. The problem is that with one process it uses all the memory > > > bandwidth, when you change to use 2 processes (2 cores) each core > > > now gets only half the memory bandwidth and hence essentially half > > > the speed. > > > > > > Barry > > > > > > > > > Barry > > > > > > On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > > > > > > > Has anyone had any success in getting good performance on multi-core > > > > intel os x machines with petsc? What's the right way to get MPICH up > > > > and running for such a thing? > > > > > > > > -Gideon Simpson > > > > Department of Applied Physics and Applied Mathematics > > > > Columbia University > > > > > > > > > > > > > > From randy at geosystem.us Tue Nov 13 11:01:47 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 13 Nov 2007 09:01:47 -0800 Subject: multi core os x machines In-Reply-To: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> Message-ID: <4739D87B.4060300@geosystem.us> We have a 64 node cluster, each node being a quad core Intel Xeon chip, so we have a total of 256 cpus. i'm not quite sure of the chip architecture and the memory paths. With infiniband, each cpu can go at full 100% during a PETSc execution. The key for us was the infiniband and the special mpi that is tuned for the infiniband - without them, performance was much worse (ie, using mpich). Randy M. Barry Smith wrote: > > It depends on how the memory is connected to the individual cores or CPUS; > for example the AMD has a different approach than Intel. If the > different processors/cores > have SEPERATE paths to memory then you will not see this terrible effect. > > Barry > > > > On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > >> This is also true for a multi-processor machine, or its unique to >> multi-core machines? >> -gideon >> >> On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: >> >>> >>> Not possible. The problem is that with one process it uses all the >>> memory >>> bandwidth, when you change to use 2 processes (2 cores) each core >>> now gets only half the memory bandwidth and hence essentially half >>> the speed. >>> >>> Barry >>> >>> >>> Barry >>> >>> On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: >>> >>>> Has anyone had any success in getting good performance on multi-core >>>> intel os x machines with petsc? What's the right way to get MPICH >>>> up and running for such a thing? >>>> >>>> -Gideon Simpson >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> >>>> >>> >> > -- Randall Mackie GSY-USA, Inc. 
PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From jiaxun_hou at yahoo.com.cn Wed Nov 14 04:12:48 2007 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Wed, 14 Nov 2007 18:12:48 +0800 (CST) Subject: about MatGetRow/MatRestoreRow Message-ID: <412086.81470.qm@web15808.mail.cnb.yahoo.com> Dear all, Does anyone have examples of using MatGetRow/MatRestoreRow? I failed in using them. My code is: PetscInt ncols_A; const PetscInt** cols_A_point; const PetscScalar **vals_A_point; for (i=0;i From knepley at gmail.com Wed Nov 14 06:28:20 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 06:28:20 -0600 Subject: about MatGetRow/MatRestoreRow In-Reply-To: <412086.81470.qm@web15808.mail.cnb.yahoo.com> References: <412086.81470.qm@web15808.mail.cnb.yahoo.com> Message-ID: On Nov 14, 2007 4:12 AM, jiaxun hou wrote: > Dear all, > > Does anyone have examples of using MatGetRow/MatRestoreRow? > I failed in using them. > > My code is: > > PetscInt ncols_A; > const PetscInt** cols_A_point; > const PetscScalar **vals_A_point; This is not proper C usage since you never allocate space for the pointers. You want const PetscInt *cols; const PetscScalar *vals; > for (i=0;i ierr = MatGetRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); MatGetRow(A,i,&ncols,&cols,&vals); and so on. Matt > //do something > ierr = MatRestoreRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); > } > > and it gets errors as below: > > Petsc Release Version 2.3.1, Patch 10, Thu Mar 9 22:48:00 CST 2006 > BK revision: balay at asterix.mcs.anrank 0 in job 61 lab_43825 caused > collective abort of all ranks > exit status of rank 0: killed by signal 9 > [cli_0]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 64) - process 0 > l.gov|ChangeSet|20060310044535|22333 > See docs/changes/index.html for recent updates. > See docs/faq.html for hints about > trouble shooting. > See docs/index.html for manual pages. > ------------------------------------------------------------------------ > ./mytest1 on a linux-gnu named lab by root Wed Nov 14 17:57:00 2007 > Libraries linked from > /home/software/petsc-2.3.1-p10/lib/linux-gnu-cxx-complex-debug > Configure run at Thu Jun 15 13:08:29 2006 > Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack=1 --with-mpi-dir=/home/software/mpich2 > --with-scalar-type=complex --with-shared=0 > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscObjectDestroy() line 88 in src/sys/objects/destroy.c > [0]PETSC ERROR: Corrupt argument: see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Corrupt! > [0]PETSC ERROR: Invalid type of object: Parameter # 1! > [0]PETSC ERROR: PetscObjectRegisterDestroyAll() line 228 in > src/sys/objects/destroy.c > [0]PETSC ERROR: PetscFinalize() line 599 in > src/sys/objects/pinit.c > [0]PETSC ERROR: main() line 329 in /home/myprogram/mypro/mytest1.c > make: [runmytest1] Error 137 (ignored) > > Can anyone tell me where is wrong? THX > > > > ________________________________ > ?????????? -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
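A minimal C sketch of the corrected loop Matt describes above: the two pointers are only declared (never allocated by the caller) and are passed by address to MatGetRow(). The loop body is a placeholder and the usual ierr/CHKERRQ error checking is kept; names are illustrative.

    #include "petscmat.h"

    PetscErrorCode DumpRows(Mat A)
    {
      PetscInt          i,m,n,ncols;
      const PetscInt    *cols;
      const PetscScalar *vals;
      PetscErrorCode    ierr;

      ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
      for (i=0; i<m; i++) {
        ierr = MatGetRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);      /* cols/vals point into PETSc's own storage */
        /* ... do something with ncols, cols[0..ncols-1], vals[0..ncols-1] ... */
        ierr = MatRestoreRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);  /* restore before requesting the next row */
      }
      return 0;
    }

The arrays handed back by MatGetRow() belong to PETSc, so the caller allocates and frees nothing; MatRestoreRow() has to be called for each row before the next MatGetRow().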
-- Norbert Wiener From jiaxun_hou at yahoo.com.cn Wed Nov 14 07:00:15 2007 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Wed, 14 Nov 2007 21:00:15 +0800 (CST) Subject: =?gb2312?q?Re=A3=BA=20Re:=20about=20MatGetRow/MatRestoreRow?= In-Reply-To: Message-ID: <39757.46716.qm@web15814.mail.cnb.yahoo.com> Thanks a lot! Matthew Knepley ??? On Nov 14, 2007 4:12 AM, jiaxun hou wrote: > Dear all, > > Does anyone have examples of using MatGetRow/MatRestoreRow? > I failed in using them. > > My code is: > > PetscInt ncols_A; > const PetscInt** cols_A_point; > const PetscScalar **vals_A_point; This is not proper C usage since you never allocate space for the pointers. You want const PetscInt *cols; const PetscScalar *vals; > for (i=0;i > ierr = MatGetRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); MatGetRow(A,i,&ncols,&cols,&vals); and so on. Matt > //do something > ierr = MatRestoreRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); > } > > and it gets errors as below: > > Petsc Release Version 2.3.1, Patch 10, Thu Mar 9 22:48:00 CST 2006 > BK revision: balay at asterix.mcs.anrank 0 in job 61 lab_43825 caused > collective abort of all ranks > exit status of rank 0: killed by signal 9 > [cli_0]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 64) - process 0 > l.gov|ChangeSet|20060310044535|22333 > See docs/changes/index.html for recent updates. > See docs/faq.html for hints about > trouble shooting. > See docs/index.html for manual pages. > ------------------------------------------------------------------------ > ./mytest1 on a linux-gnu named lab by root Wed Nov 14 17:57:00 2007 > Libraries linked from > /home/software/petsc-2.3.1-p10/lib/linux-gnu-cxx-complex-debug > Configure run at Thu Jun 15 13:08:29 2006 > Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack=1 --with-mpi-dir=/home/software/mpich2 > --with-scalar-type=complex --with-shared=0 > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscObjectDestroy() line 88 in src/sys/objects/destroy.c > [0]PETSC ERROR: Corrupt argument: see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Corrupt! > [0]PETSC ERROR: Invalid type of object: Parameter # 1! > [0]PETSC ERROR: PetscObjectRegisterDestroyAll() line 228 in > src/sys/objects/destroy.c > [0]PETSC ERROR: PetscFinalize() line 599 in > src/sys/objects/pinit.c > [0]PETSC ERROR: main() line 329 in /home/myprogram/mypro/mytest1.c > make: [runmytest1] Error 137 (ignored) > > Can anyone tell me where is wrong? THX > > > > ________________________________ > ?????????? -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener --------------------------------- ?????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.stitt at ichec.ie Wed Nov 14 08:13:28 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 14:13:28 +0000 Subject: AX=B Fortran Petsc Code Message-ID: <473B0288.2060002@ichec.ie> Dear PETSc Users/Developers, I have the following sequential Fortran PETSc code that I have been developing (on and off) based on the kind advice given by members of this list, with respect to solving an inverse sparse matrix problem. 
Essentially, the code reads in a square double complex matrix from external file of size (order x order) and then proceeds to do a MatMatSolve(), where A is the sparse matrix to invert, B is a dense identity matrix and X is the resultant dense matrix....hope that makes sense. My main problem is that the code stalls on the MatSetValues() for the sparse matrix A. With a trivial test matrix of (224 x 224) the program terminates successfully (by successfully I mean all instructions execute...I am not interested in the validity of X right now). Unfortunately, when I move up to a (2352 x 2352) matrix the MatSetValues() routine for matrix A is still in progress after 15 minutes on one processor of our AMD Opteron IBM Cluster. I know that people will be screaming "preallocation"...but I have tried to take this into account by running a loop over the rows in A and counting the non-zero values explicitly prior to creation. I then pass this vector into the creation routine for the nnz argument. For the large (2352 x 2352) problem that seems to be taking forever to set...at most there are only 200 elements per row that are non-zero according to the counts. Can anyone explain why the MatSetValues() routine is taking such a long time. Maybe this expected for this specific task...although it seems very long? I did notice that on the trivial (224 x 224) run that I was still getting mallocs (approx 2000) for the A assembly when I used the -info command line parameter. I thought that it should be 0 if my preallocation counts were exact? Does this hint that I am doing something wrong. I have checked the code but don't see any obvious problems in the logic...not that means anything. I would be grateful if someone could advise on this matter. Also, if you have a few seconds to spare I would be grateful if some experts could scan the remaining logic of the code (not in fine detail) to make sure that I am doing all that I need to do to get this calculation working...assuming I can resolve the MatSetValues() problem. Once again many thanks in advance, Tim. ! Initialise the PETSc MPI Harness call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) ! Read in Matrix open(321,file='Hamiltonian.bin',form='unformatted') read(321) order if (ID==0) then print * print *,processes," Processing Elements being used" print * print *,"Matrix has order ",order," rows by ",order," columns" print * end if allocate(matrix(order,order)) read(321) matrix close(321) ! Allocate array for nnz allocate(numberZero(order)) ! Count number of non-zero elements in each matrix row do row=1,order count=0 do column=1,order if (matrix(row,column).ne.(0,0)) count=count+1 end do numberZero(row)=count end do ! Declare a PETSc Matrices call MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) call MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) call MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) call MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) ! Set up zero-based array indexing for use in MatSetValues allocate(columnIndices(order)) do column=1,order columnIndices(column)=column-1 end do ! Need to transpose values array as row-major arrays are used. 
call MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) ! Assemble Matrix A call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) deallocate(matrix) ! Create Index Sets for Factorisation call ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) call MatFactorInfoInitialize(info,error);CHKERRQ(error) call ISSetPermutation(indexSet,error);CHKERRQ(error) call MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) ! A no-longer needed call MatDestroy(A,error);CHKERRQ(error) one=(1,0) ! Set Diagonal elements in Identity Matrix B do row=0,order-1 call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) end do ! Assemble B call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) ! Assemble X call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) ! Solve AX=B call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) ! Deallocate Storage deallocate(columnIndices) call MatDestroy(factorMat,error);CHKERRQ(error) call MatDestroy(B,error);CHKERRQ(error) call MatDestroy(X,error);CHKERRQ(error) call PetscFinalize(error) -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Wed Nov 14 08:29:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 08:29:21 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B0288.2060002@ichec.ie> References: <473B0288.2060002@ichec.ie> Message-ID: You appear to be setting every value in the sparse matrix. We do not throw out 0 values (since sometimes they are necessary for structural reasons). Thus you are allocating a ton of times. You need to remove the 0 values before calling MatSetValues (and their associated column entires as well). Matt On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > Dear PETSc Users/Developers, > > I have the following sequential Fortran PETSc code that I have been > developing (on and off) based on the kind advice given by members of > this list, with respect to solving an inverse sparse matrix problem. > Essentially, the code reads in a square double complex matrix from > external file of size (order x order) and then proceeds to do a > MatMatSolve(), where A is the sparse matrix to invert, B is a dense > identity matrix and X is the resultant dense matrix....hope that makes > sense. > > My main problem is that the code stalls on the MatSetValues() for the > sparse matrix A. With a trivial test matrix of (224 x 224) the program > terminates successfully (by successfully I mean all instructions > execute...I am not interested in the validity of X right now). > Unfortunately, when I move up to a (2352 x 2352) matrix the > MatSetValues() routine for matrix A is still in progress after 15 > minutes on one processor of our AMD Opteron IBM Cluster. I know that > people will be screaming "preallocation"...but I have tried to take this > into account by running a loop over the rows in A and counting the > non-zero values explicitly prior to creation. I then pass this vector > into the creation routine for the nnz argument. 
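Referring to Matt's suggestion below to strip the explicit zeros before insertion, here is a minimal C sketch of the whole sequence: preallocate from a count of the true nonzeros, then insert one row at a time so that nothing outside that count is ever stored. All names are illustrative; the Fortran interface follows the same pattern.

    #include "petscmat.h"

    /* Build a SeqAIJ matrix from a dense order x order array 'a' stored by rows. */
    PetscErrorCode BuildSparseFromDense(PetscInt order,const PetscScalar *a,Mat *A)
    {
      PetscInt       i,j,n,*nnz,*cols;
      PetscScalar    *vals;
      PetscErrorCode ierr;

      ierr = PetscMalloc(order*sizeof(PetscInt),&nnz);CHKERRQ(ierr);
      for (i=0; i<order; i++) {                       /* count the true nonzeros per row */
        nnz[i] = 0;
        for (j=0; j<order; j++) if (PetscAbsScalar(a[i*order+j]) > 0.0) nnz[i]++;
      }
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,nnz,A);CHKERRQ(ierr);
      ierr = PetscMalloc(order*sizeof(PetscInt),&cols);CHKERRQ(ierr);
      ierr = PetscMalloc(order*sizeof(PetscScalar),&vals);CHKERRQ(ierr);
      for (i=0; i<order; i++) {                       /* insert row by row, zeros skipped */
        n = 0;
        for (j=0; j<order; j++) {
          if (PetscAbsScalar(a[i*order+j]) > 0.0) { cols[n] = j; vals[n] = a[i*order+j]; n++; }
        }
        ierr = MatSetValues(*A,1,&i,n,cols,vals,INSERT_VALUES);CHKERRQ(ierr);
      }
      ierr = MatAssemblyBegin(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);
      ierr = PetscFree(cols);CHKERRQ(ierr);
      ierr = PetscFree(vals);CHKERRQ(ierr);
      return 0;
    }

With the per-row counts and the inserted entries matching exactly, -info should report no mallocs during assembly.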
For the large (2352 x > 2352) problem that seems to be taking forever to set...at most there are > only 200 elements per row that are non-zero according to the counts. > > Can anyone explain why the MatSetValues() routine is taking such a long > time. Maybe this expected for this specific task...although it seems > very long? > > I did notice that on the trivial (224 x 224) run that I was still > getting mallocs (approx 2000) for the A assembly when I used the -info > command line parameter. I thought that it should be 0 if my > preallocation counts were exact? Does this hint that I am doing > something wrong. I have checked the code but don't see any obvious > problems in the logic...not that means anything. > > I would be grateful if someone could advise on this matter. Also, if you > have a few seconds to spare I would be grateful if some experts could > scan the remaining logic of the code (not in fine detail) to make sure > that I am doing all that I need to do to get this calculation > working...assuming I can resolve the MatSetValues() problem. > > Once again many thanks in advance, > > Tim. > > ! Initialise the PETSc MPI Harness > call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > > call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > > ! Read in Matrix > open(321,file='Hamiltonian.bin',form='unformatted') > read(321) order > if (ID==0) then > print * > print *,processes," Processing Elements being used" > print * > print *,"Matrix has order ",order," rows by ",order," columns" > print * > end if > > allocate(matrix(order,order)) > read(321) matrix > close(321) > > ! Allocate array for nnz > allocate(numberZero(order)) > > ! Count number of non-zero elements in each matrix row > do row=1,order > count=0 > do column=1,order > if (matrix(row,column).ne.(0,0)) count=count+1 > end do > numberZero(row)=count > end do > > ! Declare a PETSc Matrices > > call > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > call > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > call > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > call > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > > ! Set up zero-based array indexing for use in MatSetValues > allocate(columnIndices(order)) > > do column=1,order > columnIndices(column)=column-1 > end do > > ! Need to transpose values array as row-major arrays are used. > call > MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > > ! Assemble Matrix A > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > deallocate(matrix) > > ! Create Index Sets for Factorisation > call > ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > call MatFactorInfoInitialize(info,error);CHKERRQ(error) > call ISSetPermutation(indexSet,error);CHKERRQ(error) > call > MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > > ! A no-longer needed > call MatDestroy(A,error);CHKERRQ(error) > > one=(1,0) > > ! Set Diagonal elements in Identity Matrix B > do row=0,order-1 > call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > end do > > ! 
Assemble B > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > ! Assemble X > call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > ! Solve AX=B > call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > > ! Deallocate Storage > deallocate(columnIndices) > > call MatDestroy(factorMat,error);CHKERRQ(error) > call MatDestroy(B,error);CHKERRQ(error) > call MatDestroy(X,error);CHKERRQ(error) > > call PetscFinalize(error) > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Wed Nov 14 08:47:01 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 14:47:01 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: References: <473B0288.2060002@ichec.ie> Message-ID: <473B0A65.4090607@ichec.ie> Matthew, OK...I see what you are saying. I initially set A a row at a time but for performance reasons I thought doing it at once would be better. I overlooked the fact that the logical 2D matrix input to MatSetValues() is non-zero values only. With hindsight I now remember that was the case for each individual row. Thanks for pointing that out.... Regards. Matthew Knepley wrote: > You appear to be setting every value in the sparse matrix. We do not > throw out 0 values (since sometimes they are necessary for structural > reasons). Thus you are allocating a ton of times. You need to remove > the 0 values before calling MatSetValues (and their associated > column entires as well). > > Matt > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > >> Dear PETSc Users/Developers, >> >> I have the following sequential Fortran PETSc code that I have been >> developing (on and off) based on the kind advice given by members of >> this list, with respect to solving an inverse sparse matrix problem. >> Essentially, the code reads in a square double complex matrix from >> external file of size (order x order) and then proceeds to do a >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >> identity matrix and X is the resultant dense matrix....hope that makes >> sense. >> >> My main problem is that the code stalls on the MatSetValues() for the >> sparse matrix A. With a trivial test matrix of (224 x 224) the program >> terminates successfully (by successfully I mean all instructions >> execute...I am not interested in the validity of X right now). >> Unfortunately, when I move up to a (2352 x 2352) matrix the >> MatSetValues() routine for matrix A is still in progress after 15 >> minutes on one processor of our AMD Opteron IBM Cluster. I know that >> people will be screaming "preallocation"...but I have tried to take this >> into account by running a loop over the rows in A and counting the >> non-zero values explicitly prior to creation. I then pass this vector >> into the creation routine for the nnz argument. For the large (2352 x >> 2352) problem that seems to be taking forever to set...at most there are >> only 200 elements per row that are non-zero according to the counts. >> >> Can anyone explain why the MatSetValues() routine is taking such a long >> time. 
Maybe this expected for this specific task...although it seems >> very long? >> >> I did notice that on the trivial (224 x 224) run that I was still >> getting mallocs (approx 2000) for the A assembly when I used the -info >> command line parameter. I thought that it should be 0 if my >> preallocation counts were exact? Does this hint that I am doing >> something wrong. I have checked the code but don't see any obvious >> problems in the logic...not that means anything. >> >> I would be grateful if someone could advise on this matter. Also, if you >> have a few seconds to spare I would be grateful if some experts could >> scan the remaining logic of the code (not in fine detail) to make sure >> that I am doing all that I need to do to get this calculation >> working...assuming I can resolve the MatSetValues() problem. >> >> Once again many thanks in advance, >> >> Tim. >> >> ! Initialise the PETSc MPI Harness >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >> >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >> >> ! Read in Matrix >> open(321,file='Hamiltonian.bin',form='unformatted') >> read(321) order >> if (ID==0) then >> print * >> print *,processes," Processing Elements being used" >> print * >> print *,"Matrix has order ",order," rows by ",order," columns" >> print * >> end if >> >> allocate(matrix(order,order)) >> read(321) matrix >> close(321) >> >> ! Allocate array for nnz >> allocate(numberZero(order)) >> >> ! Count number of non-zero elements in each matrix row >> do row=1,order >> count=0 >> do column=1,order >> if (matrix(row,column).ne.(0,0)) count=count+1 >> end do >> numberZero(row)=count >> end do >> >> ! Declare a PETSc Matrices >> >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >> >> ! Set up zero-based array indexing for use in MatSetValues >> allocate(columnIndices(order)) >> >> do column=1,order >> columnIndices(column)=column-1 >> end do >> >> ! Need to transpose values array as row-major arrays are used. >> call >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >> >> ! Assemble Matrix A >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> deallocate(matrix) >> >> ! Create Index Sets for Factorisation >> call >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >> call ISSetPermutation(indexSet,error);CHKERRQ(error) >> call >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >> >> ! A no-longer needed >> call MatDestroy(A,error);CHKERRQ(error) >> >> one=(1,0) >> >> ! Set Diagonal elements in Identity Matrix B >> do row=0,order-1 >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >> end do >> >> ! Assemble B >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! 
Assemble X >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! Solve AX=B >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >> >> ! Deallocate Storage >> deallocate(columnIndices) >> >> call MatDestroy(factorMat,error);CHKERRQ(error) >> call MatDestroy(B,error);CHKERRQ(error) >> call MatDestroy(X,error);CHKERRQ(error) >> >> call PetscFinalize(error) >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Wed Nov 14 09:20:48 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 14 Nov 2007 09:20:48 -0600 (CST) Subject: AX=B Fortran Petsc Code In-Reply-To: <473B0A65.4090607@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B0A65.4090607@ichec.ie> Message-ID: You can use: MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) So subsequent MatSetValues() will ignore these zero entries. Satish On Wed, 14 Nov 2007, Tim Stitt wrote: > Matthew, > > OK...I see what you are saying. > > I initially set A a row at a time but for performance reasons I thought doing > it at once would be better. I overlooked the fact that the logical 2D matrix > input to MatSetValues() is non-zero values only. With hindsight I now remember > that was the case for each individual row. > > Thanks for pointing that out.... > > Regards. > > Matthew Knepley wrote: > > You appear to be setting every value in the sparse matrix. We do not > > throw out 0 values (since sometimes they are necessary for structural > > reasons). Thus you are allocating a ton of times. You need to remove > > the 0 values before calling MatSetValues (and their associated > > column entires as well). > > > > Matt > > > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > > > > > Dear PETSc Users/Developers, > > > > > > I have the following sequential Fortran PETSc code that I have been > > > developing (on and off) based on the kind advice given by members of > > > this list, with respect to solving an inverse sparse matrix problem. > > > Essentially, the code reads in a square double complex matrix from > > > external file of size (order x order) and then proceeds to do a > > > MatMatSolve(), where A is the sparse matrix to invert, B is a dense > > > identity matrix and X is the resultant dense matrix....hope that makes > > > sense. > > > > > > My main problem is that the code stalls on the MatSetValues() for the > > > sparse matrix A. With a trivial test matrix of (224 x 224) the program > > > terminates successfully (by successfully I mean all instructions > > > execute...I am not interested in the validity of X right now). > > > Unfortunately, when I move up to a (2352 x 2352) matrix the > > > MatSetValues() routine for matrix A is still in progress after 15 > > > minutes on one processor of our AMD Opteron IBM Cluster. I know that > > > people will be screaming "preallocation"...but I have tried to take this > > > into account by running a loop over the rows in A and counting the > > > non-zero values explicitly prior to creation. I then pass this vector > > > into the creation routine for the nnz argument. 
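The MatSetOption() call Satish points to above offers an alternative to filtering by hand: with the option set, whole rows (zeros included) can be passed to MatSetValues() and only the true nonzeros are stored, so a preallocation that counted only true nonzeros still holds. A short C fragment, assuming the same order, nnz and zero-based columnIndices as in the code above, with denseVals holding the full order x order input stored by rows (all names illustrative):

    ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,nnz,&A);CHKERRQ(ierr);
    ierr = MatSetOption(A,MAT_IGNORE_ZERO_ENTRIES);CHKERRQ(ierr);  /* 2.3-era two-argument form; newer releases add a flag argument */
    for (i=0; i<order; i++) {
      /* the explicit zeros in this full row are now dropped rather than stored */
      ierr = MatSetValues(A,1,&i,order,columnIndices,denseVals+i*order,INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);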
For the large (2352 x > > > 2352) problem that seems to be taking forever to set...at most there are > > > only 200 elements per row that are non-zero according to the counts. > > > > > > Can anyone explain why the MatSetValues() routine is taking such a long > > > time. Maybe this expected for this specific task...although it seems > > > very long? > > > > > > I did notice that on the trivial (224 x 224) run that I was still > > > getting mallocs (approx 2000) for the A assembly when I used the -info > > > command line parameter. I thought that it should be 0 if my > > > preallocation counts were exact? Does this hint that I am doing > > > something wrong. I have checked the code but don't see any obvious > > > problems in the logic...not that means anything. > > > > > > I would be grateful if someone could advise on this matter. Also, if you > > > have a few seconds to spare I would be grateful if some experts could > > > scan the remaining logic of the code (not in fine detail) to make sure > > > that I am doing all that I need to do to get this calculation > > > working...assuming I can resolve the MatSetValues() problem. > > > > > > Once again many thanks in advance, > > > > > > Tim. > > > > > > ! Initialise the PETSc MPI Harness > > > call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > > > > > > call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > > > call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > > > > > > ! Read in Matrix > > > open(321,file='Hamiltonian.bin',form='unformatted') > > > read(321) order > > > if (ID==0) then > > > print * > > > print *,processes," Processing Elements being used" > > > print * > > > print *,"Matrix has order ",order," rows by ",order," columns" > > > print * > > > end if > > > > > > allocate(matrix(order,order)) > > > read(321) matrix > > > close(321) > > > > > > ! Allocate array for nnz > > > allocate(numberZero(order)) > > > > > > ! Count number of non-zero elements in each matrix row > > > do row=1,order > > > count=0 > > > do column=1,order > > > if (matrix(row,column).ne.(0,0)) count=count+1 > > > end do > > > numberZero(row)=count > > > end do > > > > > > ! Declare a PETSc Matrices > > > > > > call > > > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > > > call > > > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > > > call > > > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > > > call > > > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > > > > > > ! Set up zero-based array indexing for use in MatSetValues > > > allocate(columnIndices(order)) > > > > > > do column=1,order > > > columnIndices(column)=column-1 > > > end do > > > > > > ! Need to transpose values array as row-major arrays are used. > > > call > > > MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > > > > > > ! Assemble Matrix A > > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > deallocate(matrix) > > > > > > ! 
Create Index Sets for Factorisation > > > call > > > ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > > > call MatFactorInfoInitialize(info,error);CHKERRQ(error) > > > call ISSetPermutation(indexSet,error);CHKERRQ(error) > > > call > > > MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > > > call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > > > > > > ! A no-longer needed > > > call MatDestroy(A,error);CHKERRQ(error) > > > > > > one=(1,0) > > > > > > ! Set Diagonal elements in Identity Matrix B > > > do row=0,order-1 > > > call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > > > end do > > > > > > ! Assemble B > > > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > ! Assemble X > > > call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > ! Solve AX=B > > > call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > > > > > > ! Deallocate Storage > > > deallocate(columnIndices) > > > > > > call MatDestroy(factorMat,error);CHKERRQ(error) > > > call MatDestroy(B,error);CHKERRQ(error) > > > call MatDestroy(X,error);CHKERRQ(error) > > > > > > call PetscFinalize(error) > > > > > > -- > > > Dr. Timothy Stitt > > > HPC Application Consultant - ICHEC (www.ichec.ie) > > > > > > Dublin Institute for Advanced Studies > > > 5 Merrion Square - Dublin 2 - Ireland > > > > > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > > > > > > > > > > > > > > > > > > > From z.sheng at ewi.tudelft.nl Wed Nov 14 10:08:43 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Wed, 14 Nov 2007 17:08:43 +0100 Subject: Nonzeros of SBAIJ In-Reply-To: <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> References: <47388126.2010607@ewi.tudelft.nl> <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> Message-ID: <473B1D8B.30900@ewi.tudelft.nl> Barry Smith wrote: > > Plus 1 for each row for the diagonal. > > Barry > > > On Nov 12, 2007, at 11:16 AM, Hong Zhang wrote: > >> >> >> On Mon, 12 Nov 2007, Zhifeng Sheng wrote: >> >>> Dear all >>> >>> my matrix is SeqSBAIJ with block size 1, so I am wondering when I >>> specify the nonzeros in a row, does it mean the actually nonzeros >>> numbers or the memory that is needed? (for instance, for SeqSBAIJ, >>> the actual nonzeros in a row would twice as much as memory needed) >> >> >> No, you specify the nonzeros of upper triangular part. >> >> Hong >> >>> >>> Thank you >>> Best regards >>> Zhifeng >>> >>> >> > Dear all I allocated the memory exactly with symbolic computation, and tried preallocation on AIJ and SBAIJ, it works well for AIJ, no additional allocaiton is needed in assembly... then I did it on SBAIJ ( nonzeros of upper triangular part + 1), the performance is much worse , it seems that memory allocation was still needed... does anyone have such problem before? 
Thank you Best regards Zhifeng Sheng From hzhang at mcs.anl.gov Wed Nov 14 10:17:28 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 14 Nov 2007 10:17:28 -0600 (CST) Subject: Nonzeros of SBAIJ In-Reply-To: <473B1D8B.30900@ewi.tudelft.nl> References: <47388126.2010607@ewi.tudelft.nl> <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> <473B1D8B.30900@ewi.tudelft.nl> Message-ID: >> > Dear all > > I allocated the memory exactly with symbolic computation, and tried > preallocation on AIJ and SBAIJ, it works well for AIJ, no additional > allocaiton is needed in assembly... then I did it on SBAIJ ( nonzeros of > upper triangular part + 1), the performance is much worse , it seems that > memory allocation was still needed... does anyone have such problem before? May I have the segment of your code that implements the memory allocation? I'll test it and see what is the problem. BTW, you may send this type of request to petsc-maint at mcs.anl.gov instead of petsc-users. Hong > > Thank you > Best regards > Zhifeng Sheng > > > > > > > From timothy.stitt at ichec.ie Wed Nov 14 10:37:43 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 16:37:43 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: References: <473B0288.2060002@ichec.ie> Message-ID: <473B2457.1050104@ichec.ie> Can I just ask a question about MatLUFactorSymbolic() in this context? What sizes should the 'row' and 'col' index sets be? Should they span all global rows/columns in A? Matthew Knepley wrote: > You appear to be setting every value in the sparse matrix. We do not > throw out 0 values (since sometimes they are necessary for structural > reasons). Thus you are allocating a ton of times. You need to remove > the 0 values before calling MatSetValues (and their associated > column entires as well). > > Matt > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > >> Dear PETSc Users/Developers, >> >> I have the following sequential Fortran PETSc code that I have been >> developing (on and off) based on the kind advice given by members of >> this list, with respect to solving an inverse sparse matrix problem. >> Essentially, the code reads in a square double complex matrix from >> external file of size (order x order) and then proceeds to do a >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >> identity matrix and X is the resultant dense matrix....hope that makes >> sense. >> >> My main problem is that the code stalls on the MatSetValues() for the >> sparse matrix A. With a trivial test matrix of (224 x 224) the program >> terminates successfully (by successfully I mean all instructions >> execute...I am not interested in the validity of X right now). >> Unfortunately, when I move up to a (2352 x 2352) matrix the >> MatSetValues() routine for matrix A is still in progress after 15 >> minutes on one processor of our AMD Opteron IBM Cluster. I know that >> people will be screaming "preallocation"...but I have tried to take this >> into account by running a loop over the rows in A and counting the >> non-zero values explicitly prior to creation. I then pass this vector >> into the creation routine for the nnz argument. For the large (2352 x >> 2352) problem that seems to be taking forever to set...at most there are >> only 200 elements per row that are non-zero according to the counts. >> >> Can anyone explain why the MatSetValues() routine is taking such a long >> time. Maybe this expected for this specific task...although it seems >> very long? 
>> >> I did notice that on the trivial (224 x 224) run that I was still >> getting mallocs (approx 2000) for the A assembly when I used the -info >> command line parameter. I thought that it should be 0 if my >> preallocation counts were exact? Does this hint that I am doing >> something wrong. I have checked the code but don't see any obvious >> problems in the logic...not that means anything. >> >> I would be grateful if someone could advise on this matter. Also, if you >> have a few seconds to spare I would be grateful if some experts could >> scan the remaining logic of the code (not in fine detail) to make sure >> that I am doing all that I need to do to get this calculation >> working...assuming I can resolve the MatSetValues() problem. >> >> Once again many thanks in advance, >> >> Tim. >> >> ! Initialise the PETSc MPI Harness >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >> >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >> >> ! Read in Matrix >> open(321,file='Hamiltonian.bin',form='unformatted') >> read(321) order >> if (ID==0) then >> print * >> print *,processes," Processing Elements being used" >> print * >> print *,"Matrix has order ",order," rows by ",order," columns" >> print * >> end if >> >> allocate(matrix(order,order)) >> read(321) matrix >> close(321) >> >> ! Allocate array for nnz >> allocate(numberZero(order)) >> >> ! Count number of non-zero elements in each matrix row >> do row=1,order >> count=0 >> do column=1,order >> if (matrix(row,column).ne.(0,0)) count=count+1 >> end do >> numberZero(row)=count >> end do >> >> ! Declare a PETSc Matrices >> >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >> >> ! Set up zero-based array indexing for use in MatSetValues >> allocate(columnIndices(order)) >> >> do column=1,order >> columnIndices(column)=column-1 >> end do >> >> ! Need to transpose values array as row-major arrays are used. >> call >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >> >> ! Assemble Matrix A >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> deallocate(matrix) >> >> ! Create Index Sets for Factorisation >> call >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >> call ISSetPermutation(indexSet,error);CHKERRQ(error) >> call >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >> >> ! A no-longer needed >> call MatDestroy(A,error);CHKERRQ(error) >> >> one=(1,0) >> >> ! Set Diagonal elements in Identity Matrix B >> do row=0,order-1 >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >> end do >> >> ! Assemble B >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! 
Assemble X >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! Solve AX=B >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >> >> ! Deallocate Storage >> deallocate(columnIndices) >> >> call MatDestroy(factorMat,error);CHKERRQ(error) >> call MatDestroy(B,error);CHKERRQ(error) >> call MatDestroy(X,error);CHKERRQ(error) >> >> call PetscFinalize(error) >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Wed Nov 14 10:53:46 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 10:53:46 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B2457.1050104@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> Message-ID: On Nov 14, 2007 10:37 AM, Tim Stitt wrote: > Can I just ask a question about MatLUFactorSymbolic() in this context? > What sizes should the 'row' and 'col' index sets be? Should they span > all global rows/columns in A? Yes, the matrix is permuted as a whole. Matt > Matthew Knepley wrote: > > You appear to be setting every value in the sparse matrix. We do not > > throw out 0 values (since sometimes they are necessary for structural > > reasons). Thus you are allocating a ton of times. You need to remove > > the 0 values before calling MatSetValues (and their associated > > column entires as well). > > > > Matt > > > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > > > >> Dear PETSc Users/Developers, > >> > >> I have the following sequential Fortran PETSc code that I have been > >> developing (on and off) based on the kind advice given by members of > >> this list, with respect to solving an inverse sparse matrix problem. > >> Essentially, the code reads in a square double complex matrix from > >> external file of size (order x order) and then proceeds to do a > >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense > >> identity matrix and X is the resultant dense matrix....hope that makes > >> sense. > >> > >> My main problem is that the code stalls on the MatSetValues() for the > >> sparse matrix A. With a trivial test matrix of (224 x 224) the program > >> terminates successfully (by successfully I mean all instructions > >> execute...I am not interested in the validity of X right now). > >> Unfortunately, when I move up to a (2352 x 2352) matrix the > >> MatSetValues() routine for matrix A is still in progress after 15 > >> minutes on one processor of our AMD Opteron IBM Cluster. I know that > >> people will be screaming "preallocation"...but I have tried to take this > >> into account by running a loop over the rows in A and counting the > >> non-zero values explicitly prior to creation. I then pass this vector > >> into the creation routine for the nnz argument. For the large (2352 x > >> 2352) problem that seems to be taking forever to set...at most there are > >> only 200 elements per row that are non-zero according to the counts. > >> > >> Can anyone explain why the MatSetValues() routine is taking such a long > >> time. Maybe this expected for this specific task...although it seems > >> very long? 
> >> > >> I did notice that on the trivial (224 x 224) run that I was still > >> getting mallocs (approx 2000) for the A assembly when I used the -info > >> command line parameter. I thought that it should be 0 if my > >> preallocation counts were exact? Does this hint that I am doing > >> something wrong. I have checked the code but don't see any obvious > >> problems in the logic...not that means anything. > >> > >> I would be grateful if someone could advise on this matter. Also, if you > >> have a few seconds to spare I would be grateful if some experts could > >> scan the remaining logic of the code (not in fine detail) to make sure > >> that I am doing all that I need to do to get this calculation > >> working...assuming I can resolve the MatSetValues() problem. > >> > >> Once again many thanks in advance, > >> > >> Tim. > >> > >> ! Initialise the PETSc MPI Harness > >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > >> > >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > >> > >> ! Read in Matrix > >> open(321,file='Hamiltonian.bin',form='unformatted') > >> read(321) order > >> if (ID==0) then > >> print * > >> print *,processes," Processing Elements being used" > >> print * > >> print *,"Matrix has order ",order," rows by ",order," columns" > >> print * > >> end if > >> > >> allocate(matrix(order,order)) > >> read(321) matrix > >> close(321) > >> > >> ! Allocate array for nnz > >> allocate(numberZero(order)) > >> > >> ! Count number of non-zero elements in each matrix row > >> do row=1,order > >> count=0 > >> do column=1,order > >> if (matrix(row,column).ne.(0,0)) count=count+1 > >> end do > >> numberZero(row)=count > >> end do > >> > >> ! Declare a PETSc Matrices > >> > >> call > >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > >> call > >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > >> call > >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > >> call > >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > >> > >> ! Set up zero-based array indexing for use in MatSetValues > >> allocate(columnIndices(order)) > >> > >> do column=1,order > >> columnIndices(column)=column-1 > >> end do > >> > >> ! Need to transpose values array as row-major arrays are used. > >> call > >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > >> > >> ! Assemble Matrix A > >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> deallocate(matrix) > >> > >> ! Create Index Sets for Factorisation > >> call > >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) > >> call ISSetPermutation(indexSet,error);CHKERRQ(error) > >> call > >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > >> > >> ! A no-longer needed > >> call MatDestroy(A,error);CHKERRQ(error) > >> > >> one=(1,0) > >> > >> ! Set Diagonal elements in Identity Matrix B > >> do row=0,order-1 > >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > >> end do > >> > >> ! 
Assemble B > >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> ! Assemble X > >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> ! Solve AX=B > >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > >> > >> ! Deallocate Storage > >> deallocate(columnIndices) > >> > >> call MatDestroy(factorMat,error);CHKERRQ(error) > >> call MatDestroy(B,error);CHKERRQ(error) > >> call MatDestroy(X,error);CHKERRQ(error) > >> > >> call PetscFinalize(error) > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Wed Nov 14 11:17:07 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Nov 2007 11:17:07 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B2457.1050104@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> Message-ID: <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> For sequential codes the index set is as large as the matrix. In parallel the factor codes do not use the ordering, they do the ordering internally. Barry On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: > Can I just ask a question about MatLUFactorSymbolic() in this > context? What sizes should the 'row' and 'col' index sets be? Should > they span all global rows/columns in A? > > Matthew Knepley wrote: >> You appear to be setting every value in the sparse matrix. We do not >> throw out 0 values (since sometimes they are necessary for structural >> reasons). Thus you are allocating a ton of times. You need to remove >> the 0 values before calling MatSetValues (and their associated >> column entires as well). >> >> Matt >> >> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >> >>> Dear PETSc Users/Developers, >>> >>> I have the following sequential Fortran PETSc code that I have been >>> developing (on and off) based on the kind advice given by members of >>> this list, with respect to solving an inverse sparse matrix problem. >>> Essentially, the code reads in a square double complex matrix from >>> external file of size (order x order) and then proceeds to do a >>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>> identity matrix and X is the resultant dense matrix....hope that >>> makes >>> sense. >>> >>> My main problem is that the code stalls on the MatSetValues() for >>> the >>> sparse matrix A. With a trivial test matrix of (224 x 224) the >>> program >>> terminates successfully (by successfully I mean all instructions >>> execute...I am not interested in the validity of X right now). >>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>> MatSetValues() routine for matrix A is still in progress after 15 >>> minutes on one processor of our AMD Opteron IBM Cluster. 
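As Barry notes above, in the sequential case both index sets span the whole matrix, i.e. each is a permutation of all rows/columns. One hedged C sketch of obtaining such full-length permutations is MatGetOrdering(), rather than building an identity IS by hand (macro names as in the 2.3 series; A and ierr as elsewhere):

    IS rperm,cperm;   /* each IS is a permutation of 0..order-1, as long as the matrix */

    ierr = MatGetOrdering(A,MATORDERING_NATURAL,&rperm,&cperm);CHKERRQ(ierr);
    /* MATORDERING_ND or MATORDERING_RCM usually give less fill in the LU factors
       than the natural (identity) ordering built by hand in the code above.     */

The resulting rperm/cperm are then passed to MatLUFactorSymbolic() exactly where the hand-built indexSet is used above.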
I know that >>> people will be screaming "preallocation"...but I have tried to >>> take this >>> into account by running a loop over the rows in A and counting the >>> non-zero values explicitly prior to creation. I then pass this >>> vector >>> into the creation routine for the nnz argument. For the large >>> (2352 x >>> 2352) problem that seems to be taking forever to set...at most >>> there are >>> only 200 elements per row that are non-zero according to the counts. >>> >>> Can anyone explain why the MatSetValues() routine is taking such a >>> long >>> time. Maybe this expected for this specific task...although it seems >>> very long? >>> >>> I did notice that on the trivial (224 x 224) run that I was still >>> getting mallocs (approx 2000) for the A assembly when I used the - >>> info >>> command line parameter. I thought that it should be 0 if my >>> preallocation counts were exact? Does this hint that I am doing >>> something wrong. I have checked the code but don't see any obvious >>> problems in the logic...not that means anything. >>> >>> I would be grateful if someone could advise on this matter. Also, >>> if you >>> have a few seconds to spare I would be grateful if some experts >>> could >>> scan the remaining logic of the code (not in fine detail) to make >>> sure >>> that I am doing all that I need to do to get this calculation >>> working...assuming I can resolve the MatSetValues() problem. >>> >>> Once again many thanks in advance, >>> >>> Tim. >>> >>> ! Initialise the PETSc MPI Harness >>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>> >>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>> >>> ! Read in Matrix >>> open(321,file='Hamiltonian.bin',form='unformatted') >>> read(321) order >>> if (ID==0) then >>> print * >>> print *,processes," Processing Elements being used" >>> print * >>> print *,"Matrix has order ",order," rows by ",order," columns" >>> print * >>> end if >>> >>> allocate(matrix(order,order)) >>> read(321) matrix >>> close(321) >>> >>> ! Allocate array for nnz >>> allocate(numberZero(order)) >>> >>> ! Count number of non-zero elements in each matrix row >>> do row=1,order >>> count=0 >>> do column=1,order >>> if (matrix(row,column).ne.(0,0)) count=count+1 >>> end do >>> numberZero(row)=count >>> end do >>> >>> ! Declare a PETSc Matrices >>> >>> call >>> MatCreateSeqAIJ >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>> call >>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order, >>> 0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>> call >>> MatCreateSeqDense >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>> call >>> MatCreateSeqDense >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>> >>> ! Set up zero-based array indexing for use in MatSetValues >>> allocate(columnIndices(order)) >>> >>> do column=1,order >>> columnIndices(column)=column-1 >>> end do >>> >>> ! Need to transpose values array as row-major arrays are used. >>> call >>> MatSetValues >>> (A >>> ,order >>> ,columnIndices >>> ,order >>> ,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>> >>> ! Assemble Matrix A >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> deallocate(matrix) >>> >>> ! 
Create Index Sets for Factorisation >>> call >>> ISCreateGeneral >>> (PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>> call >>> MatLUFactorSymbolic >>> (A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>> >>> ! A no-longer needed >>> call MatDestroy(A,error);CHKERRQ(error) >>> >>> one=(1,0) >>> >>> ! Set Diagonal elements in Identity Matrix B >>> do row=0,order-1 >>> call >>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>> end do >>> >>> ! Assemble B >>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> ! Assemble X >>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> ! Solve AX=B >>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>> >>> ! Deallocate Storage >>> deallocate(columnIndices) >>> >>> call MatDestroy(factorMat,error);CHKERRQ(error) >>> call MatDestroy(B,error);CHKERRQ(error) >>> call MatDestroy(X,error);CHKERRQ(error) >>> >>> call PetscFinalize(error) >>> >>> -- >>> Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >>> >>> >> >> >> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Wed Nov 14 12:04:21 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 18:04:21 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> Message-ID: <473B38A5.8080602@ichec.ie> OK...everything is working well now and I am getting the results I expect. Much appreciated. Saying that...I am trying to now satisfy the PC FACTOR FILL suggestion provided by the -info parameter on my sample sparse matrices. In my case I am getting: [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.56568 [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 or use [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. So I run my code with ./foo -pc_factor_fill 2.56568 but I continually get WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-pc_factor_fill value: 2.56568 Can someone suggest how I can improve performance with the pc_factor_fill parameter in my case? As my code stands there is no MatSetFromOptions() as I set everything explicitly in the code. Thanks again, Tim. Barry Smith wrote: > > For sequential codes the index set is as large as the matrix. > > In parallel the factor codes do not use the ordering, they do the > ordering > internally. > > Barry > > On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: > >> Can I just ask a question about MatLUFactorSymbolic() in this >> context? What sizes should the 'row' and 'col' index sets be? Should >> they span all global rows/columns in A? 
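One likely reason the option is reported as unused: -pc_factor_fill (and PCFactorSetFill) are read by a PC object such as PCLU inside a KSP, and the code above factors the matrix directly with MatLUFactorSymbolic()/MatLUFactorNumeric(), so no PC ever queries the option. When factoring directly, the fill estimate can be supplied through the MatFactorInfo argument instead. A minimal C sketch in the 2.3-era calling sequence used above (rperm/cperm are the ordering index sets from the previous sketch; F is the factor matrix created by the symbolic step; the Fortran interface carries the same fill entry on its MatFactorInfo argument):

    MatFactorInfo info;
    Mat           F;

    ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
    info.fill = 2.57;   /* expected nonzeros(factor)/nonzeros(A), here taken from the -info report */
    ierr = MatLUFactorSymbolic(A,rperm,cperm,&info,&F);CHKERRQ(ierr);
    ierr = MatLUFactorNumeric(A,&info,&F);CHKERRQ(ierr);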
>> >> Matthew Knepley wrote: >>> You appear to be setting every value in the sparse matrix. We do not >>> throw out 0 values (since sometimes they are necessary for structural >>> reasons). Thus you are allocating a ton of times. You need to remove >>> the 0 values before calling MatSetValues (and their associated >>> column entires as well). >>> >>> Matt >>> >>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>> >>>> Dear PETSc Users/Developers, >>>> >>>> I have the following sequential Fortran PETSc code that I have been >>>> developing (on and off) based on the kind advice given by members of >>>> this list, with respect to solving an inverse sparse matrix problem. >>>> Essentially, the code reads in a square double complex matrix from >>>> external file of size (order x order) and then proceeds to do a >>>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>>> identity matrix and X is the resultant dense matrix....hope that makes >>>> sense. >>>> >>>> My main problem is that the code stalls on the MatSetValues() for the >>>> sparse matrix A. With a trivial test matrix of (224 x 224) the program >>>> terminates successfully (by successfully I mean all instructions >>>> execute...I am not interested in the validity of X right now). >>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>> MatSetValues() routine for matrix A is still in progress after 15 >>>> minutes on one processor of our AMD Opteron IBM Cluster. I know that >>>> people will be screaming "preallocation"...but I have tried to take >>>> this >>>> into account by running a loop over the rows in A and counting the >>>> non-zero values explicitly prior to creation. I then pass this vector >>>> into the creation routine for the nnz argument. For the large (2352 x >>>> 2352) problem that seems to be taking forever to set...at most >>>> there are >>>> only 200 elements per row that are non-zero according to the counts. >>>> >>>> Can anyone explain why the MatSetValues() routine is taking such a >>>> long >>>> time. Maybe this expected for this specific task...although it seems >>>> very long? >>>> >>>> I did notice that on the trivial (224 x 224) run that I was still >>>> getting mallocs (approx 2000) for the A assembly when I used the -info >>>> command line parameter. I thought that it should be 0 if my >>>> preallocation counts were exact? Does this hint that I am doing >>>> something wrong. I have checked the code but don't see any obvious >>>> problems in the logic...not that means anything. >>>> >>>> I would be grateful if someone could advise on this matter. Also, >>>> if you >>>> have a few seconds to spare I would be grateful if some experts could >>>> scan the remaining logic of the code (not in fine detail) to make >>>> sure >>>> that I am doing all that I need to do to get this calculation >>>> working...assuming I can resolve the MatSetValues() problem. >>>> >>>> Once again many thanks in advance, >>>> >>>> Tim. >>>> >>>> ! Initialise the PETSc MPI Harness >>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>> >>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>> >>>> ! 
Read in Matrix >>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>> read(321) order >>>> if (ID==0) then >>>> print * >>>> print *,processes," Processing Elements being used" >>>> print * >>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>> print * >>>> end if >>>> >>>> allocate(matrix(order,order)) >>>> read(321) matrix >>>> close(321) >>>> >>>> ! Allocate array for nnz >>>> allocate(numberZero(order)) >>>> >>>> ! Count number of non-zero elements in each matrix row >>>> do row=1,order >>>> count=0 >>>> do column=1,order >>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>> end do >>>> numberZero(row)=count >>>> end do >>>> >>>> ! Declare a PETSc Matrices >>>> >>>> call >>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>> >>>> >>>> ! Set up zero-based array indexing for use in MatSetValues >>>> allocate(columnIndices(order)) >>>> >>>> do column=1,order >>>> columnIndices(column)=column-1 >>>> end do >>>> >>>> ! Need to transpose values array as row-major arrays are used. >>>> call >>>> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>> >>>> >>>> ! Assemble Matrix A >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> deallocate(matrix) >>>> >>>> ! Create Index Sets for Factorisation >>>> call >>>> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>>> >>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>> call >>>> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>> >>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>> >>>> ! A no-longer needed >>>> call MatDestroy(A,error);CHKERRQ(error) >>>> >>>> one=(1,0) >>>> >>>> ! Set Diagonal elements in Identity Matrix B >>>> do row=0,order-1 >>>> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>> end do >>>> >>>> ! Assemble B >>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> ! Assemble X >>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> ! Solve AX=B >>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>> >>>> ! Deallocate Storage >>>> deallocate(columnIndices) >>>> >>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>> call MatDestroy(B,error);CHKERRQ(error) >>>> call MatDestroy(X,error);CHKERRQ(error) >>>> >>>> call PetscFinalize(error) >>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>> >>> >>> >>> >> >> >> --Dr. 
Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Thu Nov 15 08:20:19 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 14:20:19 +0000 Subject: SUPERLU Type and MatMatSolve() Message-ID: <473C55A3.1070504@ichec.ie> Hi, Just wondering if it is possible to use the SUPERLU matrix type with the MatMatSolve() routine. I changed a working code (which uses MatMatSolve()) by the setting the matrix type to superlu (using MatSetType()) and now I get the following runtime errors in MatMatSolve(): [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 1! Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From hzhang at mcs.anl.gov Thu Nov 15 08:39:08 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Nov 2007 08:39:08 -0600 (CST) Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C55A3.1070504@ichec.ie> References: <473C55A3.1070504@ichec.ie> Message-ID: On Thu, 15 Nov 2007, Tim Stitt wrote: > Hi, > > Just wondering if it is possible to use the SUPERLU matrix type with the > MatMatSolve() routine. > > I changed a working code (which uses MatMatSolve()) by the setting the matrix > type to superlu (using MatSetType()) and now I get the following runtime > errors in MatMatSolve(): The current petsc-SUPERLU interface doesn't support MatMatSolve(). Hong > > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > From timothy.stitt at ichec.ie Thu Nov 15 10:26:37 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 16:26:37 +0000 Subject: SUPERLU Type and MatMatSolve() In-Reply-To: References: <473C55A3.1070504@ichec.ie> Message-ID: <473C733D.7090306@ichec.ie> Hong, Does MatSolve() support SuperLU and MUMPS? Hong Zhang wrote: > > > On Thu, 15 Nov 2007, Tim Stitt wrote: > >> Hi, >> >> Just wondering if it is possible to use the SUPERLU matrix type with >> the MatMatSolve() routine. >> >> I changed a working code (which uses MatMatSolve()) by the setting >> the matrix type to superlu (using MatSetType()) and now I get the >> following runtime errors in MatMatSolve(): > > The current petsc-SUPERLU interface doesn't support MatMatSolve(). > > Hong > >> >> [0]PETSC ERROR: Null argument, when expecting valid pointer! >> [0]PETSC ERROR: Null Object: Parameter # 1! >> >> Thanks, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> > -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Thu Nov 15 10:59:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Nov 2007 10:59:21 -0600 Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C733D.7090306@ichec.ie> References: <473C55A3.1070504@ichec.ie> <473C733D.7090306@ichec.ie> Message-ID: Yes. Matt On Nov 15, 2007 10:26 AM, Tim Stitt wrote: > Hong, > > Does MatSolve() support SuperLU and MUMPS? > > Hong Zhang wrote: > > > > > > On Thu, 15 Nov 2007, Tim Stitt wrote: > > > >> Hi, > >> > >> Just wondering if it is possible to use the SUPERLU matrix type with > >> the MatMatSolve() routine. > >> > >> I changed a working code (which uses MatMatSolve()) by the setting > >> the matrix type to superlu (using MatSetType()) and now I get the > >> following runtime errors in MatMatSolve(): > > > > The current petsc-SUPERLU interface doesn't support MatMatSolve(). > > > > Hong > > > >> > >> [0]PETSC ERROR: Null argument, when expecting valid pointer! > >> [0]PETSC ERROR: Null Object: Parameter # 1! > >> > >> Thanks, > >> > >> Tim. > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From hzhang at mcs.anl.gov Thu Nov 15 11:03:30 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Nov 2007 11:03:30 -0600 (CST) Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C733D.7090306@ichec.ie> References: <473C55A3.1070504@ichec.ie> <473C733D.7090306@ichec.ie> Message-ID: On Thu, 15 Nov 2007, Tim Stitt wrote: > Hong, > > Does MatSolve() support SuperLU and MUMPS? Yes. Hong > > Hong Zhang wrote: >> >> >> On Thu, 15 Nov 2007, Tim Stitt wrote: >> >>> Hi, >>> >>> Just wondering if it is possible to use the SUPERLU matrix type with the >>> MatMatSolve() routine. >>> >>> I changed a working code (which uses MatMatSolve()) by the setting the >>> matrix type to superlu (using MatSetType()) and now I get the following >>> runtime errors in MatMatSolve(): >> >> The current petsc-SUPERLU interface doesn't support MatMatSolve(). >> >> Hong >> >>> >>> [0]PETSC ERROR: Null argument, when expecting valid pointer! >>> [0]PETSC ERROR: Null Object: Parameter # 1! >>> >>> Thanks, >>> >>> Tim. >>> >>> -- >>> Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >>> >> > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > From bsmith at mcs.anl.gov Thu Nov 15 13:43:50 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Nov 2007 13:43:50 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B38A5.8080602@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> <473B38A5.8080602@ichec.ie> Message-ID: <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> Tim, There is an field in MatFactorInfo that contains this fill factor called fill set it with 2.5 and you should be all set. Barry On Nov 14, 2007, at 12:04 PM, Tim Stitt wrote: > OK...everything is working well now and I am getting the results I > expect. Much appreciated. > > Saying that...I am trying to now satisfy the PC FACTOR FILL > suggestion provided by the -info parameter on my sample sparse > matrices. > > In my case I am getting: > > [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.56568 > [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 > or use > [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); > [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > So I run my code with ./foo -pc_factor_fill 2.56568 > > but I continually get > > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-pc_factor_fill value: 2.56568 > > Can someone suggest how I can improve performance with the > pc_factor_fill parameter in my case? As my code stands there is no > MatSetFromOptions() as I set everything explicitly in the code. > > Thanks again, > > Tim. > > Barry Smith wrote: >> >> For sequential codes the index set is as large as the matrix. >> >> In parallel the factor codes do not use the ordering, they do the >> ordering >> internally. >> >> Barry >> >> On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: >> >>> Can I just ask a question about MatLUFactorSymbolic() in this >>> context? What sizes should the 'row' and 'col' index sets be? >>> Should they span all global rows/columns in A? >>> >>> Matthew Knepley wrote: >>>> You appear to be setting every value in the sparse matrix. We do >>>> not >>>> throw out 0 values (since sometimes they are necessary for >>>> structural >>>> reasons). Thus you are allocating a ton of times. You need to >>>> remove >>>> the 0 values before calling MatSetValues (and their associated >>>> column entires as well). >>>> >>>> Matt >>>> >>>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>>> >>>>> Dear PETSc Users/Developers, >>>>> >>>>> I have the following sequential Fortran PETSc code that I have >>>>> been >>>>> developing (on and off) based on the kind advice given by >>>>> members of >>>>> this list, with respect to solving an inverse sparse matrix >>>>> problem. >>>>> Essentially, the code reads in a square double complex matrix from >>>>> external file of size (order x order) and then proceeds to do a >>>>> MatMatSolve(), where A is the sparse matrix to invert, B is a >>>>> dense >>>>> identity matrix and X is the resultant dense matrix....hope that >>>>> makes >>>>> sense. >>>>> >>>>> My main problem is that the code stalls on the MatSetValues() >>>>> for the >>>>> sparse matrix A. 
With a trivial test matrix of (224 x 224) the >>>>> program >>>>> terminates successfully (by successfully I mean all instructions >>>>> execute...I am not interested in the validity of X right now). >>>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>>> MatSetValues() routine for matrix A is still in progress after 15 >>>>> minutes on one processor of our AMD Opteron IBM Cluster. I know >>>>> that >>>>> people will be screaming "preallocation"...but I have tried to >>>>> take this >>>>> into account by running a loop over the rows in A and counting the >>>>> non-zero values explicitly prior to creation. I then pass this >>>>> vector >>>>> into the creation routine for the nnz argument. For the large >>>>> (2352 x >>>>> 2352) problem that seems to be taking forever to set...at most >>>>> there are >>>>> only 200 elements per row that are non-zero according to the >>>>> counts. >>>>> >>>>> Can anyone explain why the MatSetValues() routine is taking such >>>>> a long >>>>> time. Maybe this expected for this specific task...although it >>>>> seems >>>>> very long? >>>>> >>>>> I did notice that on the trivial (224 x 224) run that I was still >>>>> getting mallocs (approx 2000) for the A assembly when I used the >>>>> -info >>>>> command line parameter. I thought that it should be 0 if my >>>>> preallocation counts were exact? Does this hint that I am doing >>>>> something wrong. I have checked the code but don't see any obvious >>>>> problems in the logic...not that means anything. >>>>> >>>>> I would be grateful if someone could advise on this matter. >>>>> Also, if you >>>>> have a few seconds to spare I would be grateful if some experts >>>>> could >>>>> scan the remaining logic of the code (not in fine detail) to >>>>> make sure >>>>> that I am doing all that I need to do to get this calculation >>>>> working...assuming I can resolve the MatSetValues() problem. >>>>> >>>>> Once again many thanks in advance, >>>>> >>>>> Tim. >>>>> >>>>> ! Initialise the PETSc MPI Harness >>>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>>> >>>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>>> >>>>> ! Read in Matrix >>>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>>> read(321) order >>>>> if (ID==0) then >>>>> print * >>>>> print *,processes," Processing Elements being used" >>>>> print * >>>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>>> print * >>>>> end if >>>>> >>>>> allocate(matrix(order,order)) >>>>> read(321) matrix >>>>> close(321) >>>>> >>>>> ! Allocate array for nnz >>>>> allocate(numberZero(order)) >>>>> >>>>> ! Count number of non-zero elements in each matrix row >>>>> do row=1,order >>>>> count=0 >>>>> do column=1,order >>>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>>> end do >>>>> numberZero(row)=count >>>>> end do >>>>> >>>>> ! Declare a PETSc Matrices >>>>> >>>>> call >>>>> MatCreateSeqAIJ >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order, >>>>> 0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqDense >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqDense >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>>> >>>>> ! 
Set up zero-based array indexing for use in MatSetValues >>>>> allocate(columnIndices(order)) >>>>> >>>>> do column=1,order >>>>> columnIndices(column)=column-1 >>>>> end do >>>>> >>>>> ! Need to transpose values array as row-major arrays are used. >>>>> call >>>>> MatSetValues >>>>> (A >>>>> ,order >>>>> ,columnIndices >>>>> ,order >>>>> ,columnIndices >>>>> ,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>>> >>>>> ! Assemble Matrix A >>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> deallocate(matrix) >>>>> >>>>> ! Create Index Sets for Factorisation >>>>> call >>>>> ISCreateGeneral >>>>> (PETSC_COMM_SELF >>>>> ,order,columnIndices,indexSet,error);CHKERRQ(error) >>>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>>> call >>>>> MatLUFactorSymbolic >>>>> (A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>>> >>>>> ! A no-longer needed >>>>> call MatDestroy(A,error);CHKERRQ(error) >>>>> >>>>> one=(1,0) >>>>> >>>>> ! Set Diagonal elements in Identity Matrix B >>>>> do row=0,order-1 >>>>> call >>>>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>>> end do >>>>> >>>>> ! Assemble B >>>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> ! Assemble X >>>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> ! Solve AX=B >>>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>>> >>>>> ! Deallocate Storage >>>>> deallocate(columnIndices) >>>>> >>>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>>> call MatDestroy(B,error);CHKERRQ(error) >>>>> call MatDestroy(X,error);CHKERRQ(error) >>>>> >>>>> call PetscFinalize(error) >>>>> >>>>> -- >>>>> Dr. Timothy Stitt >>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>> >>>>> Dublin Institute for Advanced Studies >>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>> >>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>> >>>>> >>>>> >>>> >>>> >>>> >>>> >>> >>> >>> --Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Thu Nov 15 14:41:23 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 20:41:23 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> <473B38A5.8080602@ichec.ie> <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> Message-ID: <473CAEF3.1030903@ichec.ie> Thanks Barry. Barry Smith wrote: > > Tim, > > There is an field in MatFactorInfo that contains this fill factor > called > fill set it with 2.5 and you should be all set. > > Barry > > On Nov 14, 2007, at 12:04 PM, Tim Stitt wrote: > >> OK...everything is working well now and I am getting the results I >> expect. Much appreciated. 
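A minimal sketch of what Barry's suggestion could look like in the code above. How the 2.3.3-era Fortran interface exposes the fill member of MatFactorInfo is an assumption here (shown as an indexed array with a MAT_FACTORINFO_FILL entry; in C the field is simply info.fill):

      ! Illustrative sketch only -- set the expected fill ratio before the
      ! symbolic factorization so no reallocations are needed.
      call MatFactorInfoInitialize(info,error);CHKERRQ(error)
      info(MAT_FACTORINFO_FILL) = 2.5
      call MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error)
      call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error)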
>> >> Saying that...I am trying to now satisfy the PC FACTOR FILL >> suggestion provided by the -info parameter on my sample sparse matrices. >> >> In my case I am getting: >> >> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.56568 >> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 or >> use >> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); >> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> So I run my code with ./foo -pc_factor_fill 2.56568 >> >> but I continually get >> >> WARNING! There are options you set that were not used! >> WARNING! could be spelling mistake, etc! >> Option left: name:-pc_factor_fill value: 2.56568 >> >> Can someone suggest how I can improve performance with the >> pc_factor_fill parameter in my case? As my code stands there is no >> MatSetFromOptions() as I set everything explicitly in the code. >> >> Thanks again, >> >> Tim. >> >> Barry Smith wrote: >>> >>> For sequential codes the index set is as large as the matrix. >>> >>> In parallel the factor codes do not use the ordering, they do the >>> ordering >>> internally. >>> >>> Barry >>> >>> On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: >>> >>>> Can I just ask a question about MatLUFactorSymbolic() in this >>>> context? What sizes should the 'row' and 'col' index sets be? >>>> Should they span all global rows/columns in A? >>>> >>>> Matthew Knepley wrote: >>>>> You appear to be setting every value in the sparse matrix. We do not >>>>> throw out 0 values (since sometimes they are necessary for structural >>>>> reasons). Thus you are allocating a ton of times. You need to remove >>>>> the 0 values before calling MatSetValues (and their associated >>>>> column entires as well). >>>>> >>>>> Matt >>>>> >>>>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>>>> >>>>>> Dear PETSc Users/Developers, >>>>>> >>>>>> I have the following sequential Fortran PETSc code that I have been >>>>>> developing (on and off) based on the kind advice given by members of >>>>>> this list, with respect to solving an inverse sparse matrix problem. >>>>>> Essentially, the code reads in a square double complex matrix from >>>>>> external file of size (order x order) and then proceeds to do a >>>>>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>>>>> identity matrix and X is the resultant dense matrix....hope that >>>>>> makes >>>>>> sense. >>>>>> >>>>>> My main problem is that the code stalls on the MatSetValues() for >>>>>> the >>>>>> sparse matrix A. With a trivial test matrix of (224 x 224) the >>>>>> program >>>>>> terminates successfully (by successfully I mean all instructions >>>>>> execute...I am not interested in the validity of X right now). >>>>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>>>> MatSetValues() routine for matrix A is still in progress after 15 >>>>>> minutes on one processor of our AMD Opteron IBM Cluster. I know that >>>>>> people will be screaming "preallocation"...but I have tried to >>>>>> take this >>>>>> into account by running a loop over the rows in A and counting the >>>>>> non-zero values explicitly prior to creation. I then pass this >>>>>> vector >>>>>> into the creation routine for the nnz argument. For the large >>>>>> (2352 x >>>>>> 2352) problem that seems to be taking forever to set...at most >>>>>> there are >>>>>> only 200 elements per row that are non-zero according to the counts. 
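A minimal sketch of the row-by-row insertion Matt describes above, in which only the nonzero entries of each row are passed to MatSetValues() so the assembly matches the preallocated counts exactly. The work arrays packedColumns, packedValues (length order) and rowIndex(1) are hypothetical and would have to be declared and allocated alongside the existing variables:

      ! Illustrative sketch only -- pack the nonzeros of one row at a time
      ! and insert just those, instead of the full dense row.
      do row=1,order
         count=0
         do column=1,order
            if (matrix(row,column).ne.(0,0)) then
               count=count+1
               packedColumns(count)=column-1   ! zero-based column index
               packedValues(count)=matrix(row,column)
            end if
         end do
         rowIndex(1)=row-1                     ! zero-based row index
         call MatSetValues(A,1,rowIndex,count,packedColumns,packedValues,INSERT_VALUES,error);CHKERRQ(error)
      end do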
>>>>>> >>>>>> Can anyone explain why the MatSetValues() routine is taking such >>>>>> a long >>>>>> time. Maybe this expected for this specific task...although it seems >>>>>> very long? >>>>>> >>>>>> I did notice that on the trivial (224 x 224) run that I was still >>>>>> getting mallocs (approx 2000) for the A assembly when I used the >>>>>> -info >>>>>> command line parameter. I thought that it should be 0 if my >>>>>> preallocation counts were exact? Does this hint that I am doing >>>>>> something wrong. I have checked the code but don't see any obvious >>>>>> problems in the logic...not that means anything. >>>>>> >>>>>> I would be grateful if someone could advise on this matter. Also, >>>>>> if you >>>>>> have a few seconds to spare I would be grateful if some experts >>>>>> could >>>>>> scan the remaining logic of the code (not in fine detail) to >>>>>> make sure >>>>>> that I am doing all that I need to do to get this calculation >>>>>> working...assuming I can resolve the MatSetValues() problem. >>>>>> >>>>>> Once again many thanks in advance, >>>>>> >>>>>> Tim. >>>>>> >>>>>> ! Initialise the PETSc MPI Harness >>>>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>>>> >>>>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>>>> >>>>>> ! Read in Matrix >>>>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>>>> read(321) order >>>>>> if (ID==0) then >>>>>> print * >>>>>> print *,processes," Processing Elements being used" >>>>>> print * >>>>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>>>> print * >>>>>> end if >>>>>> >>>>>> allocate(matrix(order,order)) >>>>>> read(321) matrix >>>>>> close(321) >>>>>> >>>>>> ! Allocate array for nnz >>>>>> allocate(numberZero(order)) >>>>>> >>>>>> ! Count number of non-zero elements in each matrix row >>>>>> do row=1,order >>>>>> count=0 >>>>>> do column=1,order >>>>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>>>> end do >>>>>> numberZero(row)=count >>>>>> end do >>>>>> >>>>>> ! Declare a PETSc Matrices >>>>>> >>>>>> call >>>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>>>> >>>>>> >>>>>> ! Set up zero-based array indexing for use in MatSetValues >>>>>> allocate(columnIndices(order)) >>>>>> >>>>>> do column=1,order >>>>>> columnIndices(column)=column-1 >>>>>> end do >>>>>> >>>>>> ! Need to transpose values array as row-major arrays are used. >>>>>> call >>>>>> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>>>> >>>>>> >>>>>> ! Assemble Matrix A >>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> deallocate(matrix) >>>>>> >>>>>> ! 
Create Index Sets for Factorisation >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>>>>> >>>>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>>>> call >>>>>> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> ! A no-longer needed >>>>>> call MatDestroy(A,error);CHKERRQ(error) >>>>>> >>>>>> one=(1,0) >>>>>> >>>>>> ! Set Diagonal elements in Identity Matrix B >>>>>> do row=0,order-1 >>>>>> call >>>>>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>>>> end do >>>>>> >>>>>> ! Assemble B >>>>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> ! Assemble X >>>>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> ! Solve AX=B >>>>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>>>> >>>>>> ! Deallocate Storage >>>>>> deallocate(columnIndices) >>>>>> >>>>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>>>> call MatDestroy(B,error);CHKERRQ(error) >>>>>> call MatDestroy(X,error);CHKERRQ(error) >>>>>> >>>>>> call PetscFinalize(error) >>>>>> >>>>>> --Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> --Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Fri Nov 16 09:12:15 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Fri, 16 Nov 2007 15:12:15 +0000 Subject: Linking Static Library to PETSc code Message-ID: <473DB34F.4060803@ichec.ie> PETSc Developers, I am trying to link a self-written static library to my PETSc code but during the compile and link phase I keep getting the following link error with my library: "could not read symbols: Bad value" Can anyone suggest how I can call external routines from my PETSc code which are packaged in an external static library without getting this error? Thanks, Tim. -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From mfhoel at ifi.uio.no Fri Nov 16 10:24:06 2007 From: mfhoel at ifi.uio.no (Mads Hoel) Date: Fri, 16 Nov 2007 17:24:06 +0100 Subject: Linking Static Library to PETSc code In-Reply-To: <473DB34F.4060803@ichec.ie> References: <473DB34F.4060803@ichec.ie> Message-ID: On Fri, 16 Nov 2007 16:12:15 +0100, Tim Stitt wrote: > could not read symbols: Bad value I haven't seen that error message before, but i looked it up in a search engine and got 4 cases that might be the solution to your problem, suggesting to recompile the static library with -fPIC: http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml?part=1&chap=3 -- Mads Hoel From balay at mcs.anl.gov Fri Nov 16 12:10:49 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 16 Nov 2007 12:10:49 -0600 (CST) Subject: Linking Static Library to PETSc code In-Reply-To: References: <473DB34F.4060803@ichec.ie> Message-ID: On Fri, 16 Nov 2007, Mads Hoel wrote: > On Fri, 16 Nov 2007 16:12:15 +0100, Tim Stitt wrote: > > > could not read symbols: Bad value > > I haven't seen that error message before, but i looked it up in a search > engine and got 4 cases that might be the solution to your problem, suggesting > to recompile the static library with -fPIC: > http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml?part=1&chap=3 Can you post the complete log [compiler, compiler options etc..] of: - how this static library was built - how you are attempting to link it with PETSc -complete error message Also is this linux or linux64? This additional info could give us clues as to whats going wrong. Satish From bknaepen at ulb.ac.be Sun Nov 18 03:32:23 2007 From: bknaepen at ulb.ac.be (Bernard Knaepen) Date: Sun, 18 Nov 2007 10:32:23 +0100 Subject: problem compiling PETSC on MacOS Leopard Message-ID: Hello, I would like to compile PETSC on Leopard but I am encountering a problem during configuration. The scripts stops with: dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx = = = = = = = = = ======================================================================== Configuring PETSc to compile on your system = = = = = = = = = ======================================================================== TESTING: checkFortranCompiler from config.setCompilers(python/ BuildSystem/config/setCompilers.py: 708 ) ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- Fortran compiler you provided with --with-fc=mpif90 does not work ********************************************************************************* My MPI installation is mpich2 1.0.6p1 and I have the latest ifort compiler installed (10.0.20). I have test mpif90 and it is working ok. I copy below the configure.log file. Any help would be appreciated, thanks, Bernard. 
Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/ config/packages/config.guess Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/ BuildSystem/config/packages/config.guess sh: i686-apple-darwin9.1.0 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/ config/packages/config.sub i686-apple-darwin9.1.0 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/ BuildSystem/config/packages/config.sub i686-apple-darwin9.1.0 sh: i686-apple-darwin9.1.0 = = = = = = = = ======================================================================== = = = = = = = = ======================================================================== Starting Configure Run at Sun Nov 18 10:29:29 2007 Configure Options: --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-shared=0 --configModules=PETSc.Configure -- optionsModule=PETSc.compilerOptions Working directory: /Users/bknaepen/Unix/petsc-2.3.3-p8 Python version: 2.5.1 (r251:54863, Oct 5 2007, 21:08:09) [GCC 4.0.1 (Apple Inc. build 5465)] = = = = = = = = ======================================================================== Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC = = = = = = = = ======================================================================== TEST configureExternalPackagesDir from config.framework(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py:807) TESTING: configureExternalPackagesDir from config.framework(python/ BuildSystem/config/framework.py:807) = = = = = = = = ======================================================================== TEST configureLibrary from PETSc.packages.PVODE(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/packages/PVODE.py:10) TESTING: configureLibrary from PETSc.packages.PVODE(python/PETSc/ packages/PVODE.py:10) Find a PVODE installation and check if it can work with PETSc = = = = = = = = ======================================================================== TEST configureLibrary from PETSc.packages.NetCDF(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/packages/NetCDF.py:10) TESTING: configureLibrary from PETSc.packages.NetCDF(python/PETSc/ packages/NetCDF.py:10) Find a NetCDF installation and check if it can work with PETSc = = = = = = = = ======================================================================== TEST configureMercurial from config.sourceControl(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:23) TESTING: configureMercurial from config.sourceControl(python/ BuildSystem/config/sourceControl.py:23) Find the Mercurial executable Checking for program /opt/intel/fc/10.0.020/bin/hg...not found Checking for program /usr/X11R6/bin/hg...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/hg...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/hg...not found Checking for program /bin/hg...not found Checking for program /sbin/hg...not found Checking for program /usr/bin/hg...not found Checking for program /usr/sbin/hg...not found Checking for program /usr/local/bin/hg...not found Checking for program /usr/texbin/hg...not found Checking for program /Users/bknaepen/hg...not found = = = = = = = = ======================================================================== TEST configureCVS from config.sourceControl(/Users/bknaepen/Unix/ 
petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:30) TESTING: configureCVS from config.sourceControl(python/BuildSystem/ config/sourceControl.py:30) Find the CVS executable Checking for program /opt/intel/fc/10.0.020/bin/cvs...not found Checking for program /usr/X11R6/bin/cvs...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cvs...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cvs...not found Checking for program /bin/cvs...not found Checking for program /sbin/cvs...not found Checking for program /usr/bin/cvs...found Defined make macro "CVS" to "cvs" = = = = = = = = ======================================================================== TEST configureSubversion from config.sourceControl(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:35) TESTING: configureSubversion from config.sourceControl(python/ BuildSystem/config/sourceControl.py:35) Find the Subversion executable Checking for program /opt/intel/fc/10.0.020/bin/svn...not found Checking for program /usr/X11R6/bin/svn...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/svn...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/svn...not found Checking for program /bin/svn...not found Checking for program /sbin/svn...not found Checking for program /usr/bin/svn...found Defined make macro "SVN" to "svn" = = = = = = = = ======================================================================== TEST configureMkdir from config.programs(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/programs.py:21) TESTING: configureMkdir from config.programs(python/BuildSystem/config/ programs.py:21) Make sure we can have mkdir automatically make intermediate directories Checking for program /opt/intel/fc/10.0.020/bin/mkdir...not found Checking for program /usr/X11R6/bin/mkdir...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mkdir...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mkdir...not found Checking for program /bin/mkdir...found sh: /bin/mkdir -p .conftest/tmp Executing: /bin/mkdir -p .conftest/tmp sh: Adding -p flag to /bin/mkdir -p to automatically create directories Defined make macro "MKDIR" to "/bin/mkdir -p" = = = = = = = = ======================================================================== TEST configurePrograms from config.programs(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/programs.py:43) TESTING: configurePrograms from config.programs(python/BuildSystem/ config/programs.py:43) Check for the programs needed to build and run PETSc Checking for program /opt/intel/fc/10.0.020/bin/sh...not found Checking for program /usr/X11R6/bin/sh...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sh...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sh...not found Checking for program /bin/sh...found Defined make macro "SHELL" to "/bin/sh" Checking for program /opt/intel/fc/10.0.020/bin/sed...not found Checking for program /usr/X11R6/bin/sed...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sed...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sed...not found Checking for program /bin/sed...not found Checking for program /sbin/sed...not found Checking for program /usr/bin/sed...found Defined make macro "SED" to "/usr/bin/sed" Checking for program /opt/intel/fc/10.0.020/bin/mv...not found Checking for program /usr/X11R6/bin/mv...not found Checking for program 
/opt/toolworks/totalview.8.3.0-0/bin/mv...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mv...not found Checking for program /bin/mv...found Defined make macro "MV" to "/bin/mv" Checking for program /opt/intel/fc/10.0.020/bin/cp...not found Checking for program /usr/X11R6/bin/cp...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cp...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cp...not found Checking for program /bin/cp...found Defined make macro "CP" to "/bin/cp" Checking for program /opt/intel/fc/10.0.020/bin/grep...not found Checking for program /usr/X11R6/bin/grep...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/grep...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/grep...not found Checking for program /bin/grep...not found Checking for program /sbin/grep...not found Checking for program /usr/bin/grep...found Defined make macro "GREP" to "/usr/bin/grep" Checking for program /opt/intel/fc/10.0.020/bin/rm...not found Checking for program /usr/X11R6/bin/rm...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/rm...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/rm...not found Checking for program /bin/rm...found Defined make macro "RM" to "/bin/rm -f" Checking for program /opt/intel/fc/10.0.020/bin/diff...not found Checking for program /usr/X11R6/bin/diff...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/diff...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/diff...not found Checking for program /bin/diff...not found Checking for program /sbin/diff...not found Checking for program /usr/bin/diff...found sh: /usr/bin/diff -w diff1 diff2 Executing: /usr/bin/diff -w diff1 diff2 sh: Defined make macro "DIFF" to "/usr/bin/diff -w" Checking for program /usr/ucb/ps...not found Checking for program /usr/usb/ps...not found Checking for program /Users/bknaepen/ps...not found Checking for program /opt/intel/fc/10.0.020/bin/gzip...not found Checking for program /usr/X11R6/bin/gzip...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gzip...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gzip...not found Checking for program /bin/gzip...not found Checking for program /sbin/gzip...not found Checking for program /usr/bin/gzip...found Defined make macro "GZIP" to "/usr/bin/gzip" Defined "HAVE_GZIP" to "1" = = = = = = = = ======================================================================== TEST configureMake from PETSc.utilities.Make(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/utilities/Make.py:21) TESTING: configureMake from PETSc.utilities.Make(python/PETSc/ utilities/Make.py:21) Check various things about make Checking for program /opt/intel/fc/10.0.020/bin/make...not found Checking for program /usr/X11R6/bin/make...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/make...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/make...not found Checking for program /bin/make...not found Checking for program /sbin/make...not found Checking for program /usr/bin/make...found Defined make macro "MAKE" to "/usr/bin/make" sh: strings /usr/bin/make Executing: strings /usr/bin/make sh: attempt to use unsupported feature: `%s' touch: Archive `%s' does not exist touch: `%s' is not a valid archive touch: touch: Member `%s' does not exist in `%s' touch: Bad return code from ar_member_touch on `%s' ! 
ARFILENAMES/ $(MAKE) ${MAKE} *** [%s] Archive member `%s' may be bogus; not deleted *** Archive member `%s' may be bogus; not deleted *** [%s] Deleting file `%s' *** Deleting file `%s' unlink: kill # commands to execute (built-in): (from `%s', line %lu): %.*s GNUMAKE MAKEFILEPATH $(NEXT_ROOT)/Developer/Makefiles CHECKOUT,v +$(if $(wildcard $@),,$(CO) $(COFLAGS) $< $@) COFL... ... h: can't allocate %ld bytes for hash table: memory exhausted Load=%ld/%ld=%.0f%%, Rehash=%d, Collisions=%ld/%ld=%.0f%% $(VPATH) Can't do VPATH expansion on a null file. =|^();&<>*?[]:$`'"\ Using old-style VPATH substitution. Consider using automatic variable substitution instead. glob next != NULL /SourceCache/gnumake/gnumake-119/make/glob/glob.c alnum alpha blank cntrl digit graph lower print punct space upper xdigit .out .a .ln .o .c .cc .C .cpp .p .f .F .m .r .y .l .ym .lm .s .S .mod .sym .def .h .info .dvi .tex .texinfo .texi .txinfo .w .ch .web .sh .elc .el /bin/sh #;"*?[]&|<>(){}$`^~! Defined make macro "OMAKE " to "/usr/bin/make --no-print- directory" Defined make rule "libc" with dependencies "${LIBNAME}($ {OBJSC} ${SOBJSC})" and code [] Defined make rule "libf" with dependencies "${OBJSF}" and code -${AR} ${AR_FLAGS} ${LIBNAME} ${OBJSF} = = = = = = = = ======================================================================== TEST configureDebuggers from PETSc.utilities.debuggers(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/debuggers.py:22) TESTING: configureDebuggers from PETSc.utilities.debuggers(python/ PETSc/utilities/debuggers.py:22) Find a default debugger and determine its arguments Checking for program /opt/intel/fc/10.0.020/bin/gdb...not found Checking for program /usr/X11R6/bin/gdb...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gdb...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gdb...not found Checking for program /bin/gdb...not found Checking for program /sbin/gdb...not found Checking for program /usr/bin/gdb...found Defined make macro "GDB" to "/usr/bin/gdb" Checking for program /opt/intel/fc/10.0.020/bin/dbx...not found Checking for program /usr/X11R6/bin/dbx...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/dbx...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/dbx...not found Checking for program /bin/dbx...not found Checking for program /sbin/dbx...not found Checking for program /usr/bin/dbx...not found Checking for program /usr/sbin/dbx...not found Checking for program /usr/local/bin/dbx...not found Checking for program /usr/texbin/dbx...not found Checking for program /Users/bknaepen/dbx...not found Checking for program /opt/intel/fc/10.0.020/bin/xdb...not found Checking for program /usr/X11R6/bin/xdb...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/xdb...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/xdb...not found Checking for program /bin/xdb...not found Checking for program /sbin/xdb...not found Checking for program /usr/bin/xdb...not found Checking for program /usr/sbin/xdb...not found Checking for program /usr/local/bin/xdb...not found Checking for program /usr/texbin/xdb...not found Checking for program /Users/bknaepen/xdb...not found Defined "USE_GDB_DEBUGGER" to "1" = = = = = = = = ======================================================================== TEST configureCLanguage from PETSc.utilities.languages(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:43) TESTING: configureCLanguage from 
PETSc.utilities.languages(python/ PETSc/utilities/languages.py:43) Choose between C and C++ bindings = = = = = = = = ======================================================================== TEST configureLanguageSupport from PETSc.utilities.languages(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:49) TESTING: configureLanguageSupport from PETSc.utilities.languages(python/PETSc/utilities/languages.py:49) Check c-support c++-support and other misc tests Turning off C++ support Allowing C++ name mangling C language is C Defined "CLANGUAGE_C" to "1" = = = = = = = = ======================================================================== TEST configureExternC from PETSc.utilities.languages(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:66) TESTING: configureExternC from PETSc.utilities.languages(python/PETSc/ utilities/languages.py:66) Protect C bindings from C++ mangling Defined "USE_EXTERN_CXX" to " " = = = = = = = = ======================================================================== TEST configureFortranLanguage from PETSc.utilities.languages(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:72) TESTING: configureFortranLanguage from PETSc.utilities.languages(python/PETSc/utilities/languages.py:72) Turn on Fortran bindings Using Fortran = = = = = = = = ======================================================================== TEST configureDirectories from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:34) TESTING: configureDirectories from PETSc.utilities.petscdir(python/ PETSc/utilities/petscdir.py:34) Checks PETSC_DIR and sets if not set Version Information: #define PETSC_VERSION_RELEASE 1 #define PETSC_VERSION_MAJOR 2 #define PETSC_VERSION_MINOR 3 #define PETSC_VERSION_SUBMINOR 3 #define PETSC_VERSION_PATCH 8 #define PETSC_VERSION_DATE "May, 23, 2007" #define PETSC_VERSION_PATCH_DATE "Fri Nov 16 17:03:40 CST 2007" #define PETSC_VERSION_HG "414581156e67e55c761739b0deb119f7590d0f4b" Defined make macro "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3- p8" Defined "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/ config.guess sh: i686-apple-darwin9.1.0 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub i686-apple-darwin9.1.0 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/ config.sub i686-apple-darwin9.1.0 sh: i686-apple-darwin9.1.0 = = = = = = = = ======================================================================== TEST configureExternalPackagesDir from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:112) TESTING: configureExternalPackagesDir from PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:112) = = = = = = = = ======================================================================== TEST configureInstallationMethod from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:119) TESTING: configureInstallationMethod from PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:119) This is a tarball installation = = = = = = = = ======================================================================== TEST configureETags from PETSc.utilities.Etags(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/utilities/Etags.py:27) TESTING: configureETags from PETSc.utilities.Etags(python/PETSc/ 
utilities/Etags.py:27) Determine if etags files exist and try to create otherwise Found etags file = = = = = = = = ======================================================================== TEST getDatafilespath from PETSc.utilities.dataFilesPath(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/dataFilesPath.py:29) TESTING: getDatafilespath from PETSc.utilities.dataFilesPath(python/ PETSc/utilities/dataFilesPath.py:29) Checks what DATAFILESPATH should be Defined make macro "DATAFILESPATH" to "None" = = = = = = = = ======================================================================== TEST checkVendor from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:262) TESTING: checkVendor from config.setCompilers(python/BuildSystem/ config/setCompilers.py:262) Determine the compiler vendor Compiler vendor is "" = = = = = = = = ======================================================================== TEST checkInitialFlags from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:272) TESTING: checkInitialFlags from config.setCompilers(python/BuildSystem/ config/setCompilers.py:272) Initialize the compiler and linker flags Pushing language C Initialized CFLAGS to Initialized CFLAGS to Initialized LDFLAGS to Popping language C Pushing language Cxx Initialized CXXFLAGS to Initialized CXX_CXXFLAGS to Initialized LDFLAGS to Popping language Cxx Pushing language FC Initialized FFLAGS to Initialized FFLAGS to Initialized LDFLAGS to Popping language FC Initialized CPPFLAGS to Initialized executableFlags to [] Initialized sharedLibraryFlags to [] Initialized dynamicLibraryFlags to [] = = = = = = = = ======================================================================== TEST checkCCompiler from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:380) TESTING: checkCCompiler from config.setCompilers(python/BuildSystem/ config/setCompilers.py:380) Locate a functional C compiler Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found Checking for program /usr/X11R6/bin/mpicc...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found Defined make macro "CC" to "mpicc" Pushing language C sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC Pushing language C Popping language C sh: mpicc -o conftest conftest.o Executing: mpicc -o conftest conftest.o sh: sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: Pushing language C Popping language C sh: mpicc -o conftest conftest.o Executing: mpicc -o conftest conftest.o sh: Executing: ./conftest sh: ./conftest Executing: ./conftest sh: Popping language C = = = = = = = = ======================================================================== TEST checkCPreprocessor from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:437) TESTING: checkCPreprocessor from config.setCompilers(python/ BuildSystem/config/setCompilers.py:437) Locate a functional C preprocessor Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found Checking for program /usr/X11R6/bin/mpicc...not found Checking for 
program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found Defined make macro "CPP" to "mpicc -E" Pushing language C sh: mpicc -E conftest.c Executing: mpicc -E conftest.c sh: # 1 "conftest.c" # 1 "" # 1 "" # 1 "conftest.c" # 1 "confdefs.h" 1 # 2 "conftest.c" 2 # 1 "conffix.h" 1 # 3 "conftest.c" 2 # 1 "/usr/include/stdlib.h" 1 3 4 # 61 "/usr/include/stdlib.h" 3 4 # 1 "/usr/include/available.h" 1 3 4 # 62 "/usr/include/stdlib.h" 2 3 4 # 1 "/usr/include/_types.h" 1 3 4 # 27 "/usr/include/_types.h" 3 4 # 1 "/usr/include/sys/_types.h" 1 3 4 # 32 "/usr/include/sys/_types.h" 3 4 # 1 "/usr/include/sys/cdefs.h" 1 3 4 # 33 "/usr/include/sys/_types.h" 2 3 4 # 1 "/usr/include/machine/_types.h" 1 3 4 # 34 "/usr/include/machine/_types.h" 3 4 # 1 "/usr/inc... ... size_t, size_t, int (*)(const void *, const void *)); void qsort_r(void *, size_t, size_t, void *, int (*)(void *, const void *, const void *)); int radixsort(const unsigned char **, int, const unsigned char *, unsigned); void setprogname(const char *); int sradixsort(const unsigned char **, int, const unsigned char *, unsigned); void sranddev(void); void srandomdev(void); void *reallocf(void *, size_t); long long strtoq(const char *, char **, int); unsigned long long strtouq(const char *, char **, int); extern char *suboptarg; void *valloc(size_t); # 3 "conftest.c" 2 Popping language C = = = = = = = = ======================================================================== TEST checkCxxCompiler from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:541) TESTING: checkCxxCompiler from config.setCompilers(python/BuildSystem/ config/setCompilers.py:541) Locate a functional Cxx compiler = = = = = = = = ======================================================================== TEST checkFortranCompiler from config.setCompilers(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:708) TESTING: checkFortranCompiler from config.setCompilers(python/ BuildSystem/config/setCompilers.py:708) Locate a functional Fortran compiler Checking for program /opt/intel/fc/10.0.020/bin/mpif90...not found Checking for program /usr/X11R6/bin/mpif90...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpif90...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found Defined make macro "FC" to "mpif90" Pushing language FC sh: mpif90 -c -o conftest.o conftest.F Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret = 256 error message = {ifort: error #10106: Fatal error in /opt/intel/fc/ 10.0.020/bin/fpp, terminated by segmentation violation } Source: program main end Popping language FC Error testing Fortran compiler: Cannot compile FC with mpicc. MPI installation mpif90 is likely incorrect. Use --with-mpi-dir to indicate an alternate MPI. 
********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- Fortran compiler you provided with --with-fc=mpif90 does not work ********************************************************************************* File "./config/configure.py", line 190, in petsc_configure framework.configure(out = sys.stdout) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ framework.py", line 878, in configure child.configure() File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 1267, in configure self.executeTest(self.checkFortranCompiler) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ base.py", line 93, in executeTest return apply(test, args,kargs) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 714, in checkFortranCompiler for compiler in self.generateFortranCompilerGuesses(): File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 631, in generateFortranCompilerGuesses raise RuntimeError('Fortran compiler you provided with --with- fc='+self.framework.argDB['with-fc']+' does not work') -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun Nov 18 08:54:20 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 Nov 2007 08:54:20 -0600 (CST) Subject: problem compiling PETSC on MacOS Leopard In-Reply-To: References: Message-ID: Please direct these problems to petsc-maint instead of petsc-users. >From the log file Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found Defined make macro "FC" to "mpif90" Pushing language FC sh: mpif90 -c -o conftest.o conftest.F Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret =3D 256 error message =3D {ifort: error #10106: Fatal error in /opt/intel/fc/=20 10.0.020/bin/fpp, terminated by segmentation violation } Source: program main end So the mpif90 is crashing on a simple Fortran program with nothing in it. Can you try compiling exactly as above from the command line? Barry On Sun, 18 Nov 2007, Bernard Knaepen wrote: > Hello, > > I would like to compile PETSC on Leopard but I am encountering a problem > during configuration. The scripts stops with: > > dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx > > ================================================================================= > Configuring PETSc to compile on your system > ================================================================================= > TESTING: checkFortranCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:708) > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > --------------------------------------------------------------------------------------- > Fortran compiler you provided with --with-fc=mpif90 does not work > ********************************************************************************* > > > My MPI installation is mpich2 1.0.6p1 and I have the latest ifort compiler > installed (10.0.20). I have test mpif90 and it is working ok. 
I copy below > the configure.log file. > > > Any help would be appreciated, thanks, > > Bernard. > > > > > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > sh: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess > sh: i686-apple-darwin9.1.0 > > sh: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub > i686-apple-darwin9.1.0 > > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub > i686-apple-darwin9.1.0 > > sh: i686-apple-darwin9.1.0 > > > ================================================================================ > ================================================================================ > Starting Configure Run at Sun Nov 18 10:29:29 2007 > Configure Options: --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx > --with-shared=0 --configModules=PETSc.Configure > --optionsModule=PETSc.compilerOptions > Working directory: /Users/bknaepen/Unix/petsc-2.3.3-p8 > Python version: > 2.5.1 (r251:54863, Oct 5 2007, 21:08:09) > [GCC 4.0.1 (Apple Inc. build 5465)] > ================================================================================ > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > ================================================================================ > TEST configureExternalPackagesDir from > config.framework(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py:807) > TESTING: configureExternalPackagesDir from > config.framework(python/BuildSystem/config/framework.py:807) > ================================================================================ > TEST configureLibrary from > PETSc.packages.PVODE(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/PVODE.py:10) > TESTING: configureLibrary from > PETSc.packages.PVODE(python/PETSc/packages/PVODE.py:10) > Find a PVODE installation and check if it can work with PETSc > ================================================================================ > TEST configureLibrary from > PETSc.packages.NetCDF(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/NetCDF.py:10) > TESTING: configureLibrary from > PETSc.packages.NetCDF(python/PETSc/packages/NetCDF.py:10) > Find a NetCDF installation and check if it can work with PETSc > ================================================================================ > TEST configureMercurial from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:23) > TESTING: configureMercurial from > config.sourceControl(python/BuildSystem/config/sourceControl.py:23) > Find the Mercurial executable > Checking for program /opt/intel/fc/10.0.020/bin/hg...not found > Checking for program /usr/X11R6/bin/hg...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/hg...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/hg...not found > Checking for program /bin/hg...not found > Checking for program /sbin/hg...not found > Checking for program /usr/bin/hg...not found > Checking for program /usr/sbin/hg...not found > Checking for program /usr/local/bin/hg...not found > Checking for program /usr/texbin/hg...not found > Checking for program /Users/bknaepen/hg...not found > 
================================================================================ > TEST configureCVS from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:30) > TESTING: configureCVS from > config.sourceControl(python/BuildSystem/config/sourceControl.py:30) > Find the CVS executable > Checking for program /opt/intel/fc/10.0.020/bin/cvs...not found > Checking for program /usr/X11R6/bin/cvs...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cvs...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cvs...not found > Checking for program /bin/cvs...not found > Checking for program /sbin/cvs...not found > Checking for program /usr/bin/cvs...found > Defined make macro "CVS" to "cvs" > ================================================================================ > TEST configureSubversion from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:35) > TESTING: configureSubversion from > config.sourceControl(python/BuildSystem/config/sourceControl.py:35) > Find the Subversion executable > Checking for program /opt/intel/fc/10.0.020/bin/svn...not found > Checking for program /usr/X11R6/bin/svn...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/svn...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/svn...not found > Checking for program /bin/svn...not found > Checking for program /sbin/svn...not found > Checking for program /usr/bin/svn...found > Defined make macro "SVN" to "svn" > ================================================================================ > TEST configureMkdir from > config.programs(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/programs.py:21) > TESTING: configureMkdir from > config.programs(python/BuildSystem/config/programs.py:21) > Make sure we can have mkdir automatically make intermediate directories > Checking for program /opt/intel/fc/10.0.020/bin/mkdir...not found > Checking for program /usr/X11R6/bin/mkdir...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mkdir...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mkdir...not found > Checking for program /bin/mkdir...found > sh: /bin/mkdir -p .conftest/tmp > Executing: /bin/mkdir -p .conftest/tmp > sh: > Adding -p flag to /bin/mkdir -p to automatically create directories > Defined make macro "MKDIR" to "/bin/mkdir -p" > ================================================================================ > TEST configurePrograms from > config.programs(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/programs.py:43) > TESTING: configurePrograms from > config.programs(python/BuildSystem/config/programs.py:43) > Check for the programs needed to build and run PETSc > Checking for program /opt/intel/fc/10.0.020/bin/sh...not found > Checking for program /usr/X11R6/bin/sh...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sh...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sh...not found > Checking for program /bin/sh...found > Defined make macro "SHELL" to "/bin/sh" > Checking for program /opt/intel/fc/10.0.020/bin/sed...not found > Checking for program /usr/X11R6/bin/sed...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sed...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sed...not found > Checking for program /bin/sed...not found > Checking for program /sbin/sed...not found > Checking for 
program /usr/bin/sed...found > Defined make macro "SED" to "/usr/bin/sed" > Checking for program /opt/intel/fc/10.0.020/bin/mv...not found > Checking for program /usr/X11R6/bin/mv...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mv...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mv...not found > Checking for program /bin/mv...found > Defined make macro "MV" to "/bin/mv" > Checking for program /opt/intel/fc/10.0.020/bin/cp...not found > Checking for program /usr/X11R6/bin/cp...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cp...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cp...not found > Checking for program /bin/cp...found > Defined make macro "CP" to "/bin/cp" > Checking for program /opt/intel/fc/10.0.020/bin/grep...not found > Checking for program /usr/X11R6/bin/grep...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/grep...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/grep...not found > Checking for program /bin/grep...not found > Checking for program /sbin/grep...not found > Checking for program /usr/bin/grep...found > Defined make macro "GREP" to "/usr/bin/grep" > Checking for program /opt/intel/fc/10.0.020/bin/rm...not found > Checking for program /usr/X11R6/bin/rm...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/rm...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/rm...not found > Checking for program /bin/rm...found > Defined make macro "RM" to "/bin/rm -f" > Checking for program /opt/intel/fc/10.0.020/bin/diff...not found > Checking for program /usr/X11R6/bin/diff...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/diff...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/diff...not found > Checking for program /bin/diff...not found > Checking for program /sbin/diff...not found > Checking for program /usr/bin/diff...found > sh: /usr/bin/diff -w diff1 diff2 > Executing: /usr/bin/diff -w diff1 diff2 > sh: > Defined make macro "DIFF" to "/usr/bin/diff -w" > Checking for program /usr/ucb/ps...not found > Checking for program /usr/usb/ps...not found > Checking for program /Users/bknaepen/ps...not found > Checking for program /opt/intel/fc/10.0.020/bin/gzip...not found > Checking for program /usr/X11R6/bin/gzip...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gzip...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gzip...not found > Checking for program /bin/gzip...not found > Checking for program /sbin/gzip...not found > Checking for program /usr/bin/gzip...found > Defined make macro "GZIP" to "/usr/bin/gzip" > Defined "HAVE_GZIP" to "1" > ================================================================================ > TEST configureMake from > PETSc.utilities.Make(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/Make.py:21) > TESTING: configureMake from > PETSc.utilities.Make(python/PETSc/utilities/Make.py:21) > Check various things about make > Checking for program /opt/intel/fc/10.0.020/bin/make...not found > Checking for program /usr/X11R6/bin/make...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/make...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/make...not found > Checking for program /bin/make...not found > Checking for program /sbin/make...not found > Checking for program /usr/bin/make...found > Defined make macro "MAKE" to "/usr/bin/make" > sh: strings 
/usr/bin/make > Executing: strings /usr/bin/make > sh: attempt to use unsupported feature: `%s' > touch: Archive `%s' does not exist > touch: `%s' is not a valid archive > touch: > touch: Member `%s' does not exist in `%s' > touch: Bad return code from ar_member_touch on `%s' > ! > ARFILENAMES/ > $(MAKE) > ${MAKE} > *** [%s] Archive member `%s' may be bogus; not deleted > *** Archive member `%s' may be bogus; not deleted > *** [%s] Deleting file `%s' > *** Deleting file `%s' > unlink: > kill > # commands to execute > (built-in): > (from `%s', line %lu): > %.*s > GNUMAKE > MAKEFILEPATH > $(NEXT_ROOT)/Developer/Makefiles > CHECKOUT,v > +$(if $(wildcard $@),,$(CO) $(COFLAGS) $< $@) > COFL... > ... h: > can't allocate %ld bytes for hash table: memory exhausted > Load=%ld/%ld=%.0f%%, > Rehash=%d, > Collisions=%ld/%ld=%.0f%% > $(VPATH) > Can't do VPATH expansion on a null file. > =|^();&<>*?[]:$`'"\ > Using old-style VPATH substitution. > Consider using automatic variable substitution instead. > glob > next != NULL > /SourceCache/gnumake/gnumake-119/make/glob/glob.c > alnum > alpha > blank > cntrl > digit > graph > lower > print > punct > space > upper > xdigit > .out .a .ln .o .c .cc .C .cpp .p .f .F .m .r .y .l .ym .lm .s .S .mod .sym > .def .h .info .dvi .tex .texinfo .texi .txinfo .w .ch .web .sh .elc .el > /bin/sh > #;"*?[]&|<>(){}$`^~! > > Defined make macro "OMAKE " to "/usr/bin/make --no-print-directory" > Defined make rule "libc" with dependencies "${LIBNAME}(${OBJSC} > ${SOBJSC})" and code [] > Defined make rule "libf" with dependencies "${OBJSF}" and code -${AR} > ${AR_FLAGS} ${LIBNAME} ${OBJSF} > ================================================================================ > TEST configureDebuggers from > PETSc.utilities.debuggers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/debuggers.py:22) > TESTING: configureDebuggers from > PETSc.utilities.debuggers(python/PETSc/utilities/debuggers.py:22) > Find a default debugger and determine its arguments > Checking for program /opt/intel/fc/10.0.020/bin/gdb...not found > Checking for program /usr/X11R6/bin/gdb...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gdb...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gdb...not found > Checking for program /bin/gdb...not found > Checking for program /sbin/gdb...not found > Checking for program /usr/bin/gdb...found > Defined make macro "GDB" to "/usr/bin/gdb" > Checking for program /opt/intel/fc/10.0.020/bin/dbx...not found > Checking for program /usr/X11R6/bin/dbx...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/dbx...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/dbx...not found > Checking for program /bin/dbx...not found > Checking for program /sbin/dbx...not found > Checking for program /usr/bin/dbx...not found > Checking for program /usr/sbin/dbx...not found > Checking for program /usr/local/bin/dbx...not found > Checking for program /usr/texbin/dbx...not found > Checking for program /Users/bknaepen/dbx...not found > Checking for program /opt/intel/fc/10.0.020/bin/xdb...not found > Checking for program /usr/X11R6/bin/xdb...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/xdb...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/xdb...not found > Checking for program /bin/xdb...not found > Checking for program /sbin/xdb...not found > Checking for program /usr/bin/xdb...not found > Checking for program /usr/sbin/xdb...not found > Checking for program 
/usr/local/bin/xdb...not found > Checking for program /usr/texbin/xdb...not found > Checking for program /Users/bknaepen/xdb...not found > Defined "USE_GDB_DEBUGGER" to "1" > ================================================================================ > TEST configureCLanguage from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:43) > TESTING: configureCLanguage from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:43) > Choose between C and C++ bindings > ================================================================================ > TEST configureLanguageSupport from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:49) > TESTING: configureLanguageSupport from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:49) > Check c-support c++-support and other misc tests > Turning off C++ support > Allowing C++ name mangling > C language is C > Defined "CLANGUAGE_C" to "1" > ================================================================================ > TEST configureExternC from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:66) > TESTING: configureExternC from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:66) > Protect C bindings from C++ mangling > Defined "USE_EXTERN_CXX" to " " > ================================================================================ > TEST configureFortranLanguage from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:72) > TESTING: configureFortranLanguage from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:72) > Turn on Fortran bindings > Using Fortran > ================================================================================ > TEST configureDirectories from > PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:34) > TESTING: configureDirectories from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:34) > Checks PETSC_DIR and sets if not set > Version Information: > #define PETSC_VERSION_RELEASE 1 > #define PETSC_VERSION_MAJOR 2 > #define PETSC_VERSION_MINOR 3 > #define PETSC_VERSION_SUBMINOR 3 > #define PETSC_VERSION_PATCH 8 > #define PETSC_VERSION_DATE "May, 23, 2007" > #define PETSC_VERSION_PATCH_DATE "Fri Nov 16 17:03:40 CST 2007" > #define PETSC_VERSION_HG > "414581156e67e55c761739b0deb119f7590d0f4b" > Defined make macro "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" > Defined "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" > sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess > sh: i686-apple-darwin9.1.0 > > sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub > i686-apple-darwin9.1.0 > > Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub > i686-apple-darwin9.1.0 > > sh: i686-apple-darwin9.1.0 > > ================================================================================ > TEST configureExternalPackagesDir from > PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:112) > TESTING: configureExternalPackagesDir from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:112) > ================================================================================ > TEST configureInstallationMethod from > 
PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:119) > TESTING: configureInstallationMethod from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:119) > This is a tarball installation > ================================================================================ > TEST configureETags from > PETSc.utilities.Etags(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/Etags.py:27) > TESTING: configureETags from > PETSc.utilities.Etags(python/PETSc/utilities/Etags.py:27) > Determine if etags files exist and try to create otherwise > Found etags file > ================================================================================ > TEST getDatafilespath from > PETSc.utilities.dataFilesPath(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/dataFilesPath.py:29) > TESTING: getDatafilespath from > PETSc.utilities.dataFilesPath(python/PETSc/utilities/dataFilesPath.py:29) > Checks what DATAFILESPATH should be > Defined make macro "DATAFILESPATH" to "None" > ================================================================================ > TEST checkVendor from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:262) > TESTING: checkVendor from > config.setCompilers(python/BuildSystem/config/setCompilers.py:262) > Determine the compiler vendor > Compiler vendor is "" > ================================================================================ > TEST checkInitialFlags from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:272) > TESTING: checkInitialFlags from > config.setCompilers(python/BuildSystem/config/setCompilers.py:272) > Initialize the compiler and linker flags > Pushing language C > Initialized CFLAGS to > Initialized CFLAGS to > Initialized LDFLAGS to > Popping language C > Pushing language Cxx > Initialized CXXFLAGS to > Initialized CXX_CXXFLAGS to > Initialized LDFLAGS to > Popping language Cxx > Pushing language FC > Initialized FFLAGS to > Initialized FFLAGS to > Initialized LDFLAGS to > Popping language FC > Initialized CPPFLAGS to > Initialized executableFlags to [] > Initialized sharedLibraryFlags to [] > Initialized dynamicLibraryFlags to [] > ================================================================================ > TEST checkCCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:380) > TESTING: checkCCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:380) > Locate a functional C compiler > Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found > Checking for program /usr/X11R6/bin/mpicc...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found > Defined make macro "CC" to "mpicc" > Pushing language C > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > Pushing language C > Popping language C > sh: mpicc -o conftest conftest.o > Executing: mpicc -o conftest conftest.o > sh: > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > Pushing language C > Popping language C > sh: mpicc -o conftest 
conftest.o > Executing: mpicc -o conftest conftest.o > sh: > Executing: ./conftest > sh: ./conftest > Executing: ./conftest > sh: > Popping language C > ================================================================================ > TEST checkCPreprocessor from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:437) > TESTING: checkCPreprocessor from > config.setCompilers(python/BuildSystem/config/setCompilers.py:437) > Locate a functional C preprocessor > Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found > Checking for program /usr/X11R6/bin/mpicc...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found > Defined make macro "CPP" to "mpicc -E" > Pushing language C > sh: mpicc -E conftest.c > Executing: mpicc -E conftest.c > sh: # 1 "conftest.c" > # 1 "" > # 1 "" > # 1 "conftest.c" > # 1 "confdefs.h" 1 > # 2 "conftest.c" 2 > # 1 "conffix.h" 1 > # 3 "conftest.c" 2 > # 1 "/usr/include/stdlib.h" 1 3 4 > # 61 "/usr/include/stdlib.h" 3 4 > # 1 "/usr/include/available.h" 1 3 4 > # 62 "/usr/include/stdlib.h" 2 3 4 > # 1 "/usr/include/_types.h" 1 3 4 > # 27 "/usr/include/_types.h" 3 4 > # 1 "/usr/include/sys/_types.h" 1 3 4 > # 32 "/usr/include/sys/_types.h" 3 4 > # 1 "/usr/include/sys/cdefs.h" 1 3 4 > # 33 "/usr/include/sys/_types.h" 2 3 4 > # 1 "/usr/include/machine/_types.h" 1 3 4 > # 34 "/usr/include/machine/_types.h" 3 4 > # 1 "/usr/inc... > ... size_t, size_t, > int (*)(const void *, const void *)); > void qsort_r(void *, size_t, size_t, void *, > int (*)(void *, const void *, const void *)); > int radixsort(const unsigned char **, int, const unsigned char *, > unsigned); > void setprogname(const char *); > int sradixsort(const unsigned char **, int, const unsigned char *, > unsigned); > void sranddev(void); > void srandomdev(void); > void *reallocf(void *, size_t); > long long > strtoq(const char *, char **, int); > unsigned long long > strtouq(const char *, char **, int); > extern char *suboptarg; > void *valloc(size_t); > # 3 "conftest.c" 2 > > Popping language C > ================================================================================ > TEST checkCxxCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:541) > TESTING: checkCxxCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:541) > Locate a functional Cxx compiler > ================================================================================ > TEST checkFortranCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:708) > TESTING: checkFortranCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:708) > Locate a functional Fortran compiler > Checking for program /opt/intel/fc/10.0.020/bin/mpif90...not found > Checking for program /usr/X11R6/bin/mpif90...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpif90...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found > Defined make macro "FC" to "mpif90" > Pushing language FC > sh: mpif90 -c -o conftest.o conftest.F > Executing: mpif90 -c -o conftest.o conftest.F > sh: > Possible ERROR while running compiler: ret = 256 > error message = {ifort: error #10106: Fatal error in > /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation > } > Source: > program main > > end > Popping 
language FC > Error testing Fortran compiler: Cannot compile FC with mpicc. > MPI installation mpif90 is likely incorrect. > Use --with-mpi-dir to indicate an alternate MPI. > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > --------------------------------------------------------------------------------------- > Fortran compiler you provided with --with-fc=mpif90 does not work > ********************************************************************************* > File "./config/configure.py", line 190, in petsc_configure > framework.configure(out = sys.stdout) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py", > line 878, in configure > child.configure() > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 1267, in configure > self.executeTest(self.checkFortranCompiler) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/base.py", line > 93, in executeTest > return apply(test, args,kargs) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 714, in checkFortranCompiler > for compiler in self.generateFortranCompilerGuesses(): > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 631, in generateFortranCompilerGuesses > raise RuntimeError('Fortran compiler you provided with > --with-fc='+self.framework.argDB['with-fc']+' does not work') > From balay at mcs.anl.gov Sun Nov 18 09:37:49 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 18 Nov 2007 09:37:49 -0600 (CST) Subject: problem compiling PETSC on MacOS Leopard In-Reply-To: References: Message-ID: >>>>>>>> Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret = 256 error message = {ifort: error #10106: Fatal error in /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation >>>>>>> ifort is giving SEGV - hence configure failed. There must be some compatibility issue with ifort and Leopard. Satish On Sun, 18 Nov 2007, Bernard Knaepen wrote: > Hello, > > I would like to compile PETSC on Leopard but I am encountering a problem > during configuration. The scripts stops with: > > dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx From timothy.stitt at ichec.ie Sun Nov 18 11:22:21 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:22:21 +0000 Subject: Parallel ISCreateGeneral() Message-ID: <474074CD.7050509@ichec.ie> Hi all, Just wanted to know if the "the length of the index set" for a call to ISCreateGeneral() in a parallel code, is a global length, or the length of the local elements on each process? Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 11:27:01 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 11:27:01 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474074CD.7050509@ichec.ie> References: <474074CD.7050509@ichec.ie> Message-ID: IS are not really parallel, so all the lengths, etc. only refer to local things. 
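For concreteness, the rule Matt is describing works like this: the length handed to ISCreateGeneral() is the local count on each process, while the index entries themselves are written in the global numbering (he confirms the "local sizes, global numberings" convention further down the thread). A minimal sketch of that usage follows; it is only an illustration, assuming the 2.3.x C API (newer releases add an extra copy-mode argument to ISCreateGeneral()), and the helper name is made up:

#include "petscmat.h"
#include "petscis.h"

/* Sketch only: build an IS holding this process's rows of A, in global numbering. */
PetscErrorCode BuildLocalRowIS(Mat A, IS *rows)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend, nlocal, i, *idx;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);  /* this process owns rows [rstart,rend) */
  nlocal = rend - rstart;
  ierr = PetscMalloc(nlocal*sizeof(PetscInt), &idx);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) idx[i] = rstart + i;              /* entries use the GLOBAL numbering */
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nlocal, idx, rows);CHKERRQ(ierr);  /* length is the LOCAL count */
  ierr = PetscFree(idx);CHKERRQ(ierr);                           /* the indices are copied, so free our buffer */
  return 0;
}
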
Matt On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > Hi all, > > Just wanted to know if the "the length of the index set" for a call to > ISCreateGeneral() in a parallel code, is a global length, or the length > of the local elements on each process? > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 11:34:32 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:34:32 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> Message-ID: <474077A8.7080804@ichec.ie> OK..so I should be using the aggregate length returned by MatGetOwnershipRange() routine? Thanks Matt for you help. Matthew Knepley wrote: > IS are not really parallel, so all the lengths, etc. only refer to local things. > > Matt > > On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >> Hi all, >> >> Just wanted to know if the "the length of the index set" for a call to >> ISCreateGeneral() in a parallel code, is a global length, or the length >> of the local elements on each process? >> >> Thanks, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 11:37:10 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 11:37:10 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474077A8.7080804@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> Message-ID: On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > OK..so I should be using the aggregate length returned by > MatGetOwnershipRange() routine? If you are using it to permute a Mat, yes. Matt > Thanks Matt for you help. > > > Matthew Knepley wrote: > > IS are not really parallel, so all the lengths, etc. only refer to local things. > > > > Matt > > > > On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > > > >> Hi all, > >> > >> Just wanted to know if the "the length of the index set" for a call to > >> ISCreateGeneral() in a parallel code, is a global length, or the length > >> of the local elements on each process? > >> > >> Thanks, > >> > >> Tim. > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 11:52:32 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:52:32 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> Message-ID: <47407BE0.8050508@ichec.ie> Matt, It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require index sets. I have distributed my rows across the processes and now just a bit confused about the arguments to the ISCreateGeneral() routine to set up the IS sets used by the Factor routines in parallel. So my basic question is what in general is the length and integers that get passed to ISCreateGeneral() when doing this type of calculation in parallel? Are they local index values (0..#rows on process-1) or do they refer to the distributed indices of the global matrix? Tim. Matthew Knepley wrote: > On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > >> OK..so I should be using the aggregate length returned by >> MatGetOwnershipRange() routine? >> > > If you are using it to permute a Mat, yes. > > Matt > > >> Thanks Matt for you help. >> >> >> Matthew Knepley wrote: >> >>> IS are not really parallel, so all the lengths, etc. only refer to local things. >>> >>> Matt >>> >>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>> >>> >>>> Hi all, >>>> >>>> Just wanted to know if the "the length of the index set" for a call to >>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>> of the local elements on each process? >>>> >>>> Thanks, >>>> >>>> Tim. >>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 12:01:50 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 12:01:50 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <47407BE0.8050508@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> Message-ID: On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > Matt, > > It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls > which require index sets. I have distributed my rows across the > processes and now just a bit confused about the arguments to the > ISCreateGeneral() routine to set up the IS sets used by the Factor > routines in parallel. > > So my basic question is what in general is the length and integers that > get passed to ISCreateGeneral() when doing this type of calculation in > parallel? Are they local index values (0..#rows on process-1) or do they > refer to the distributed indices of the global matrix? To be consistent, these would be local sizes and global numberings. However, I am not sure why you would be doing this. I do not believe any of the parallel LU packages accept an ordering from the user (they calculate their own), and I would really only use them from a KSP (or PC at the least). Matt > Tim. 
> > > Matthew Knepley wrote: > > On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > > > >> OK..so I should be using the aggregate length returned by > >> MatGetOwnershipRange() routine? > >> > > > > If you are using it to permute a Mat, yes. > > > > Matt > > > > > >> Thanks Matt for you help. > >> > >> > >> Matthew Knepley wrote: > >> > >>> IS are not really parallel, so all the lengths, etc. only refer to local things. > >>> > >>> Matt > >>> > >>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >>> > >>> > >>>> Hi all, > >>>> > >>>> Just wanted to know if the "the length of the index set" for a call to > >>>> ISCreateGeneral() in a parallel code, is a global length, or the length > >>>> of the local elements on each process? > >>>> > >>>> Thanks, > >>>> > >>>> Tim. > >>>> > >>>> -- > >>>> Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 12:21:37 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 18:21:37 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> Message-ID: <474082B1.4060809@ichec.ie> Oh...ok I am now officially confused. I have developed a serial code for getting the first k rows of an inverted sparse matrix..thanks to PETSC users/developers help this past week. In that code I was calling MatLUFactorSymbolic() and MatLUFactorNumeric() to factor the sparse matrix and then calling MatSolve for each of the first k columns in the identity matrix as the RHS. I then varied the matrix type from the command line to test MUMPS, SUPERLU etc. for the best performance. Now I just want to translate the code into a parallel version...so I now assemble rows in a distributed fashion and now working on translating the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require index sets...hence my original question. Are you saying that I now shouldn't be calling those routines? Tim. Matthew Knepley wrote: > On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > >> Matt, >> >> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls >> which require index sets. I have distributed my rows across the >> processes and now just a bit confused about the arguments to the >> ISCreateGeneral() routine to set up the IS sets used by the Factor >> routines in parallel. >> >> So my basic question is what in general is the length and integers that >> get passed to ISCreateGeneral() when doing this type of calculation in >> parallel? Are they local index values (0..#rows on process-1) or do they >> refer to the distributed indices of the global matrix? 
>> > > To be consistent, these would be local sizes and global numberings. However, > I am not sure why you would be doing this. I do not believe any of the parallel > LU packages accept an ordering from the user (they calculate their own), > and I would really only use them from a KSP (or PC at the least). > > Matt > > >> Tim. >> >> >> Matthew Knepley wrote: >> >>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: >>> >>> >>>> OK..so I should be using the aggregate length returned by >>>> MatGetOwnershipRange() routine? >>>> >>>> >>> If you are using it to permute a Mat, yes. >>> >>> Matt >>> >>> >>> >>>> Thanks Matt for you help. >>>> >>>> >>>> Matthew Knepley wrote: >>>> >>>> >>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. >>>>> >>>>> Matt >>>>> >>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>>>> >>>>> >>>>> >>>>>> Hi all, >>>>>> >>>>>> Just wanted to know if the "the length of the index set" for a call to >>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>>>> of the local elements on each process? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Tim. >>>>>> >>>>>> -- >>>>>> Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 12:37:37 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 12:37:37 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474082B1.4060809@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> <474082B1.4060809@ichec.ie> Message-ID: On Nov 18, 2007 12:21 PM, Tim Stitt wrote: > Oh...ok I am now officially confused. > > I have developed a serial code for getting the first k rows of an > inverted sparse matrix..thanks to PETSC users/developers help this past > week. > > In that code I was calling MatLUFactorSymbolic() and > MatLUFactorNumeric() to factor the sparse matrix and then calling > MatSolve for each of the first k columns in the identity matrix as the > RHS. I then varied the matrix type from the command line to test MUMPS, > SUPERLU etc. for the best performance. > > Now I just want to translate the code into a parallel version...so I now > assemble rows in a distributed fashion and now working on translating > the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require > index sets...hence my original question. > > Are you saying that I now shouldn't be calling those routines? You can certainly do it that way, but it is much easier to just use a KSP. You set the Mat using KSPSetOperators, then KSPSetType(ksp, KSPPREONLY), and PCSetType(pc, PCLU) (or MUMPS or whatever). 
Then KSPSolve() with the identity columns. We handle everything else. Matt > Tim. > > Matthew Knepley wrote: > > On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > > > >> Matt, > >> > >> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls > >> which require index sets. I have distributed my rows across the > >> processes and now just a bit confused about the arguments to the > >> ISCreateGeneral() routine to set up the IS sets used by the Factor > >> routines in parallel. > >> > >> So my basic question is what in general is the length and integers that > >> get passed to ISCreateGeneral() when doing this type of calculation in > >> parallel? Are they local index values (0..#rows on process-1) or do they > >> refer to the distributed indices of the global matrix? > >> > > > > To be consistent, these would be local sizes and global numberings. However, > > I am not sure why you would be doing this. I do not believe any of the parallel > > LU packages accept an ordering from the user (they calculate their own), > > and I would really only use them from a KSP (or PC at the least). > > > > Matt > > > > > >> Tim. > >> > >> > >> Matthew Knepley wrote: > >> > >>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > >>> > >>> > >>>> OK..so I should be using the aggregate length returned by > >>>> MatGetOwnershipRange() routine? > >>>> > >>>> > >>> If you are using it to permute a Mat, yes. > >>> > >>> Matt > >>> > >>> > >>> > >>>> Thanks Matt for you help. > >>>> > >>>> > >>>> Matthew Knepley wrote: > >>>> > >>>> > >>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. > >>>>> > >>>>> Matt > >>>>> > >>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >>>>> > >>>>> > >>>>> > >>>>>> Hi all, > >>>>>> > >>>>>> Just wanted to know if the "the length of the index set" for a call to > >>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length > >>>>>> of the local elements on each process? > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Tim. > >>>>>> > >>>>>> -- > >>>>>> Dr. Timothy Stitt > >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>>>> > >>>>>> Dublin Institute for Advanced Studies > >>>>>> 5 Merrion Square - Dublin 2 - Ireland > >>>>>> > >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> -- > >>>> Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
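Picking up Matt's recipe above (KSPPREONLY with an LU preconditioner, then one KSPSolve() per identity column), a rough sketch of the parallel driver could look like the following. This is only an illustration of that recipe, assuming the 2.3.x calling conventions (the MatStructure argument to KSPSetOperators() and object-valued destroy calls); the routine name is made up, what you do with each solution column is left as a placeholder, and the factorization package (MUMPS, SuperLU, ...) is still chosen through the usual runtime options, just as in the serial code:

#include "petscksp.h"

/* Sketch only: compute the first k columns of inv(A), one direct solve per column. */
PetscErrorCode FirstKInverseColumns(Mat A, PetscInt k)
{
  PetscErrorCode ierr;
  KSP            ksp;
  PC             pc;
  Vec            b, x;
  PetscInt       j, low, high;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iteration, just apply the preconditioner */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);           /* direct LU factorization */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);        /* pick the factorization package at run time */

  ierr = MatGetVecs(A, &x, &b);CHKERRQ(ierr);         /* x: solution layout, b: right-hand-side layout */
  ierr = VecGetOwnershipRange(b, &low, &high);CHKERRQ(ierr);

  for (j = 0; j < k; j++) {
    ierr = VecSet(b, 0.0);CHKERRQ(ierr);              /* b = e_j, the j-th column of the identity */
    if (j >= low && j < high) {
      ierr = VecSetValue(b, j, 1.0, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(b);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(b);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);         /* x now holds the j-th column of inv(A) */
    /* ... use or store x here ... */
  }

  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  return 0;
}

The symbolic and numeric factorizations are performed once, on the first KSPSolve(), and reused for the remaining columns, which is what makes this loop cheap after the first solve.
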
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 12:46:49 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 18:46:49 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> <474082B1.4060809@ichec.ie> Message-ID: <47408899.6060001@ichec.ie> OK Matt...will try that out. Thanks. Matthew Knepley wrote: > On Nov 18, 2007 12:21 PM, Tim Stitt wrote: > >> Oh...ok I am now officially confused. >> >> I have developed a serial code for getting the first k rows of an >> inverted sparse matrix..thanks to PETSC users/developers help this past >> week. >> >> In that code I was calling MatLUFactorSymbolic() and >> MatLUFactorNumeric() to factor the sparse matrix and then calling >> MatSolve for each of the first k columns in the identity matrix as the >> RHS. I then varied the matrix type from the command line to test MUMPS, >> SUPERLU etc. for the best performance. >> >> Now I just want to translate the code into a parallel version...so I now >> assemble rows in a distributed fashion and now working on translating >> the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require >> index sets...hence my original question. >> >> Are you saying that I now shouldn't be calling those routines? >> > > You can certainly do it that way, but it is much easier to just use a KSP. > You set the Mat using KSPSetOperators, then KSPSetType(ksp, KSPPREONLY), > and PCSetType(pc, PCLU) (or MUMPS or whatever). Then KSPSolve() with the > identity columns. We handle everything else. > > Matt > > >> Tim. >> >> Matthew Knepley wrote: >> >>> On Nov 18, 2007 11:52 AM, Tim Stitt wrote: >>> >>> >>>> Matt, >>>> >>>> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls >>>> which require index sets. I have distributed my rows across the >>>> processes and now just a bit confused about the arguments to the >>>> ISCreateGeneral() routine to set up the IS sets used by the Factor >>>> routines in parallel. >>>> >>>> So my basic question is what in general is the length and integers that >>>> get passed to ISCreateGeneral() when doing this type of calculation in >>>> parallel? Are they local index values (0..#rows on process-1) or do they >>>> refer to the distributed indices of the global matrix? >>>> >>>> >>> To be consistent, these would be local sizes and global numberings. However, >>> I am not sure why you would be doing this. I do not believe any of the parallel >>> LU packages accept an ordering from the user (they calculate their own), >>> and I would really only use them from a KSP (or PC at the least). >>> >>> Matt >>> >>> >>> >>>> Tim. >>>> >>>> >>>> Matthew Knepley wrote: >>>> >>>> >>>>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: >>>>> >>>>> >>>>> >>>>>> OK..so I should be using the aggregate length returned by >>>>>> MatGetOwnershipRange() routine? >>>>>> >>>>>> >>>>>> >>>>> If you are using it to permute a Mat, yes. >>>>> >>>>> Matt >>>>> >>>>> >>>>> >>>>> >>>>>> Thanks Matt for you help. >>>>>> >>>>>> >>>>>> Matthew Knepley wrote: >>>>>> >>>>>> >>>>>> >>>>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. 
>>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Just wanted to know if the "the length of the index set" for a call to >>>>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>>>>>> of the local elements on each process? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Tim. >>>>>>>> >>>>>>>> -- >>>>>>>> Dr. Timothy Stitt >>>>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>>>> >>>>>>>> Dublin Institute for Advanced Studies >>>>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>>>> >>>>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> -- >>>>>> Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From zonexo at gmail.com Sun Nov 18 14:34:32 2007 From: zonexo at gmail.com (Ben Tay) Date: Sun, 18 Nov 2007 13:34:32 -0700 Subject: Dual core performance estimate Message-ID: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> Hi, someone was talking abt core 2 duo performance on os x in some previous email. it seems that due to memory issues, it's not possible to get 2x the performance. there's also some mention of amd vs intel dual core. for computation using PETSc, is there any reason to buy one instead of the other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what sort of performance increase can we expect as compared to PETSc + nompi on the same machine? or is that too difficult an answer to give since there are too many factors? thank you regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From aja2111 at columbia.edu Sun Nov 18 14:53:13 2007 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Sun, 18 Nov 2007 15:53:13 -0500 Subject: Dual core performance estimate In-Reply-To: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> Message-ID: <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> Hi Ben, You're asking a question that is very specific to the program you're running. I think the general consensus on this list has been that for the more common uses of PETSc, getting dual-cores will not speed up your performance as much as dual-processors. For OS X, dual-cores are pretty much the baseline now, so I wouldn't worry too much about it. ~A On Nov 18, 2007 3:34 PM, Ben Tay wrote: > Hi, > > someone was talking abt core 2 duo performance on os x in some previous > email. it seems that due to memory issues, it's not possible to get 2x the > performance. 
there's also some mention of amd vs intel dual core. > > for computation using PETSc, is there any reason to buy one instead of the > other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what sort > of performance increase can we expect as compared to PETSc + nompi on the > same machine? > > or is that too difficult an answer to give since there are too many factors? > > thank you > > regards From grs2103 at columbia.edu Sun Nov 18 18:59:44 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Sun, 18 Nov 2007 19:59:44 -0500 Subject: Dual core performance estimate In-Reply-To: <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> Message-ID: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> I asked the original question, and I have a follow up. Like it or not, multi-core CPUs have been thrust upon us by the manufacturers and many of us are more likely to have access to a shared memory, multi core/multi processor machine, than a properly built cluster with MPI in mind. So, two questions in this direction: 1. How feasible would it be to implement OpenMP in PETSc so that multi core CPUs could be properly used? 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting multi core up on is. To that end, what is the feasibility of merging MPI and OpenMP so that between nodes, we use MPI, but within each node, OpenMP is used to take advantage of the multiple cores. -gideon On Nov 18, 2007, at 3:53 PM, Aron Ahmadia wrote: > Hi Ben, > > You're asking a question that is very specific to the program you're > running. I think the general consensus on this list has been that for > the more common uses of PETSc, getting dual-cores will not speed up > your performance as much as dual-processors. For OS X, dual-cores are > pretty much the baseline now, so I wouldn't worry too much about it. > > ~A > > On Nov 18, 2007 3:34 PM, Ben Tay wrote: >> Hi, >> >> someone was talking abt core 2 duo performance on os x in some >> previous >> email. it seems that due to memory issues, it's not possible to >> get 2x the >> performance. there's also some mention of amd vs intel dual core. >> >> for computation using PETSc, is there any reason to buy one >> instead of the >> other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, >> what sort >> of performance increase can we expect as compared to PETSc + nompi >> on the >> same machine? >> >> or is that too difficult an answer to give since there are too >> many factors? >> >> thank you >> >> regards > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 18 20:00:30 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 Nov 2007 20:00:30 -0600 (CST) Subject: Dual core performance estimate In-Reply-To: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> Message-ID: Gideon, On Sun, 18 Nov 2007, Gideon Simpson wrote: > I asked the original question, and I have a follow up. Like it or not, > multi-core CPUs have been thrust upon us by the manufacturers and many of us > are more likely to have access to a shared memory, multi core/multi processor > machine, than a properly built cluster with MPI in mind. 
> > So, two questions in this direction: > > 1. How feasible would it be to implement OpenMP in PETSc so that multi core > CPUs could be properly used? > > 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting > multi core up on is. To that end, what is the feasibility of merging MPI and > OpenMP so that between nodes, we use MPI, but within each node, OpenMP is > used to take advantage of the multiple cores. > > -gideon > Unfortunately using MPI+OpenMP on multi-core systems for the iterative solution of linear systems will not help AT ALL. Sparse matrix algorithms (like matrix-vector production, triangular solves) are memory bandwidth limited. The speed of the memory is not enough to support 2 (or more) processes both trying to pull sparse matrices from memory at the same time; the details of the parallelism are not the issue. Now it is possible that other parts of a PETSc code; like evaluating nonlinear functions, evaluating Jacobians and other stuff may NOT be memory bandwidth limited. Those parts of the code might benefit by using OpenMP on those pieces of the code, while only using the single thread on the linear solvers. That is, you would run PETSc with one MPI process per node, then in parts of your code you would use OpenMP loop level parallelism or OpenMP task parallelism. Barry > On Nov 18, 2007, at 3:53 PM, Aron Ahmadia wrote: > >> Hi Ben, >> >> You're asking a question that is very specific to the program you're >> running. I think the general consensus on this list has been that for >> the more common uses of PETSc, getting dual-cores will not speed up >> your performance as much as dual-processors. For OS X, dual-cores are >> pretty much the baseline now, so I wouldn't worry too much about it. >> >> ~A >> >> On Nov 18, 2007 3:34 PM, Ben Tay wrote: >>> Hi, >>> >>> someone was talking abt core 2 duo performance on os x in some previous >>> email. it seems that due to memory issues, it's not possible to get 2x the >>> performance. there's also some mention of amd vs intel dual core. >>> >>> for computation using PETSc, is there any reason to buy one instead of the >>> other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what >>> sort >>> of performance increase can we expect as compared to PETSc + nompi on the >>> same machine? >>> >>> or is that too difficult an answer to give since there are too many >>> factors? >>> >>> thank you >>> >>> regards >> > From balay at mcs.anl.gov Sun Nov 18 20:03:27 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 18 Nov 2007 20:03:27 -0600 (CST) Subject: Dual core performance estimate In-Reply-To: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> Message-ID: On Sun, 18 Nov 2007, Gideon Simpson wrote: > I asked the original question, and I have a follow up. Like it or not, > multi-core CPUs have been thrust upon us by the manufacturers and many of us > are more likely to have access to a shared memory, multi core/multi processor > machine, than a properly built cluster with MPI in mind. Sure they are here to stay. > 1. How feasible would it be to implement OpenMP in PETSc so that > multi core CPUs could be properly used? > 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting > multi core up on is. 
To that end, what is the feasibility of merging MPI and > OpenMP so that between nodes, we use MPI, but within each node, OpenMP is > used to take advantage of the multiple cores. You are missing the point of the previous e-mails on this topic. The point was: when understanding the performance one gets on single vs dual core, one should investigate memory bandwidth behavior. With sparse matrix operations, memory bandwidth is the primary determining factor. So if you split up the same amount of memory bandwidth between 2 processors, you split up performance between them as well. Memory bandwidth affects both OpenMP & MPI. It's not as if memory bandwidth is an MPI-only issue [and OpenMP somehow avoids this problem]. So the inference "MPI is not suitable for multi-core, but OpenMP is suitable" is incorrect [if performance is limited by memory bandwidth]. So our suggestion is: be aware of this issue when analysing the performance you get. One way to look at this is: performance per dollar. Since the second core is practically free, even a 5% improvement [in a 1 vs 2 node run] is a good investment. [There could be other parts of the application that are not memory-bandwidth limited, and those benefit from the extra core.] Note-1: when folks compare MPI performance vs OpenMP, or when referring to mixed OpenMP/MPI code, they are sometimes mixing 2 things: - implementation difference [OpenMP communication could be implemented better than MPI communication on some machines] - algorithmic difference [for e.g.: if you have a 4-way SMP, an MPI impl using bjacobi with num_blocks=4, vs OpenMP which just unrolled a DirectSolver Fortran subroutine] We feel that the first one is an implementation issue, and MPI should do the right thing. Wrt the second one, OpenMP/MPI mixed mode is more of an algorithmic issue [generally a 2-level algorithm]. The same 2-level algorithm implemented with MPI/MPI should have similar behavior. PETSc currently has some support for this with "-pc_type openmp". Note-2: So multi-core hardware is the future; how does one fully utilize it? I guess one has to look at alternative algorithms that are not memory bandwidth limited, perhaps ones that can somehow reduce the memory bandwidth requirement by doing extra computation. [perhaps new research work? sorry, I don't know more on this topic..] Satish From timothy.stitt at ichec.ie Tue Nov 20 11:45:31 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Tue, 20 Nov 2007 17:45:31 +0000 Subject: Load Balancing and KSPSolve Message-ID: <47431D3B.5000309@ichec.ie> Hi all (again), I finally got some data back from the KSP PETSc code that I put together to solve this sparse inverse matrix problem I was looking into. Ideally I am aiming for an O(N) (time complexity) approach to getting the first 'k' columns of the inverse of a sparse matrix. To recap the method: I have my solver which uses KSPSolve in a loop that iterates over the first k columns of an identity matrix B and computes the corresponding x vector. I am just a bit curious about some of the timings I am obtaining...which I hope someone can explain. Here are the timings I obtained for a global sparse matrix (4704 x 4704) and solving for the first 1176 columns in the identity using P processes (processors) on our cluster. (Timings are given in seconds for each process performing work in the loop and were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic.
The MUMPS package was requested for factorisation/solving, although similar timings were obtained for both the native solver and SUPERLU) P=1 [30.92] P=2 [15.47, 15.54] P=4 [4.68, 5.49, 4.67, 5.07] P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] Firstly, I notice very good scalability up to 16 processes...is this expected (by those people who use these solvers regularly)? Also I notice that the timings per process vary as we scale up. Is this a load-balancing problem related to more non-zero values being on a given processor than others? Once again is this expected? Please excuse my ignorance of matters relating to these solvers and their operation...as it really isn't my field of expertise. Regards, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Tue Nov 20 12:34:09 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 20 Nov 2007 12:34:09 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <47431D3B.5000309@ichec.ie> References: <47431D3B.5000309@ichec.ie> Message-ID: Can you send the -log_summary for your runs [say p=1, p=8] Satish On Tue, 20 Nov 2007, Tim Stitt wrote: > Hi all (again), > > I finally got some data back from the KSP PETSc code that I put together to > solve this sparse inverse matrix problem I was looking into. Ideally I am > aiming for a O(N) (time complexity) approach to getting the first 'k' columns > of the inverse of a sparse matrix. > > To recap the method: I have my solver which uses KSPSolve in a loop that > iterates over the first k columns of an identity matrix B and computes the > corresponding x vector. > > I am just a bit curious about some of the timings I am obtaining...which I > hope someone can explain. Here are the timings I obtained for a global sparse > matrix (4704 x 4704) and solving for the first 1176 columns in the identity > using P processes (processors) on our cluster. > > (Timings are given in seconds for each process performing work in the loop and > were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic. > The MUMPS package was requested for factorisation/solving, although similar > timings were obtained for both the native solver and SUPERLU) > > P=1 [30.92] > P=2 [15.47, 15.54] > P=4 [4.68, 5.49, 4.67, 5.07] > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, > 0.43, 1.09, 1.08, 1.1] > > Firstly, I notice very good scalability up to 16 processes...is this expected > (by those people who use these solvers regularly)? > > Also I notice that the timings per process vary as we scale up. Is this a > load-balancing problem related to more non-zero values being on a given > processor than others? Once again is this expected? > > Please excuse my ignorance of matters relating to these solvers and their > operation...as it really isn't my field of expertise. > > Regards, > > Tim. > > From bsmith at mcs.anl.gov Tue Nov 20 12:43:33 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 20 Nov 2007 12:43:33 -0600 Subject: Load Balancing and KSPSolve In-Reply-To: <47431D3B.5000309@ichec.ie> References: <47431D3B.5000309@ichec.ie> Message-ID: Tim, This is an unrelated comment, but may help you with scaling to many processes. 
Since the matrix is so SMALL it will be hard to get good scaling on the linear solves for a large number of processes, but since you need MANY right hand sides you might consider having different groups of processes (MPI_Comms) handle collections of right hand sides. For example if you have 64 processes you might use 4 MPI_Comm's each of size 16, or even 8 MPI_Comm's each of size 8. Coding this is easy simply use MPI to generate the appropriate communicator (for the subsets of processes) and then create the Mat, the KSP etc on that communicator instead of MPI_COMM_WORLD Barry On Nov 20, 2007, at 11:45 AM, Tim Stitt wrote: > Hi all (again), > > I finally got some data back from the KSP PETSc code that I put > together to solve this sparse inverse matrix problem I was looking > into. Ideally I am aiming for a O(N) (time complexity) approach to > getting the first 'k' columns of the inverse of a sparse matrix. > > To recap the method: I have my solver which uses KSPSolve in a loop > that iterates over the first k columns of an identity matrix B and > computes the corresponding x vector. > > I am just a bit curious about some of the timings I am > obtaining...which I hope someone can explain. Here are the timings I > obtained for a global sparse matrix (4704 x 4704) and solving for > the first 1176 columns in the identity using P processes > (processors) on our cluster. > > (Timings are given in seconds for each process performing work in > the loop and were obtained by encapsulating the loop with the > cpu_time() Fortran intrinsic. The MUMPS package was requested for > factorisation/solving, although similar timings were obtained for > both the native solver and SUPERLU) > > P=1 [30.92] > P=2 [15.47, 15.54] > P=4 [4.68, 5.49, 4.67, 5.07] > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, > 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] > > Firstly, I notice very good scalability up to 16 processes...is this > expected (by those people who use these solvers regularly)? > > Also I notice that the timings per process vary as we scale up. Is > this a load-balancing problem related to more non-zero values being > on a given processor than others? Once again is this expected? > > Please excuse my ignorance of matters relating to these solvers and > their operation...as it really isn't my field of expertise. > > Regards, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Tue Nov 20 14:34:59 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Tue, 20 Nov 2007 20:34:59 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> Message-ID: <474344F3.2000405@ichec.ie> Satish, Logs attached...hope they help. Thanks, Tim. Satish Balay wrote: > Can you send the -log_summary for your runs [say p=1, p=8] > > Satish > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > >> Hi all (again), >> >> I finally got some data back from the KSP PETSc code that I put together to >> solve this sparse inverse matrix problem I was looking into. Ideally I am >> aiming for a O(N) (time complexity) approach to getting the first 'k' columns >> of the inverse of a sparse matrix. 
>> >> To recap the method: I have my solver which uses KSPSolve in a loop that >> iterates over the first k columns of an identity matrix B and computes the >> corresponding x vector. >> >> I am just a bit curious about some of the timings I am obtaining...which I >> hope someone can explain. Here are the timings I obtained for a global sparse >> matrix (4704 x 4704) and solving for the first 1176 columns in the identity >> using P processes (processors) on our cluster. >> >> (Timings are given in seconds for each process performing work in the loop and >> were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic. >> The MUMPS package was requested for factorisation/solving, although similar >> timings were obtained for both the native solver and SUPERLU) >> >> P=1 [30.92] >> P=2 [15.47, 15.54] >> P=4 [4.68, 5.49, 4.67, 5.07] >> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, >> 0.43, 1.09, 1.08, 1.1] >> >> Firstly, I notice very good scalability up to 16 processes...is this expected >> (by those people who use these solvers regularly)? >> >> Also I notice that the timings per process vary as we scale up. Is this a >> load-balancing problem related to more non-zero values being on a given >> processor than others? Once again is this expected? >> >> Please excuse my ignorance of matters relating to these solvers and their >> operation...as it really isn't my field of expertise. >> >> Regards, >> >> Tim. >> >> >> > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.1 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.8 URL: From balay at mcs.anl.gov Tue Nov 20 20:17:27 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 20 Nov 2007 20:17:27 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <474344F3.2000405@ichec.ie> References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> Message-ID: a couple of comments: Looks like most of the time is spent in MatSolve(). [90% for np=1] However on np=8 run, you have MatSolve() taking 42% time, whereas VecAssemblyBegin() taking 32% time. Depending upon whats beeing done with VecSetValues()/VecAssembly() - you might be able to reduce this time considerably. [ If you can generate values locally - then no communication is required. If you need to communicate values - then you can explore VecScatters() for more efficient communication] Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would probably be taking 6 sec.]. The max/min ratio of flops across procs is 1.8. So there is indeed a load balance issue that is contributing to different times on different processors [I guess the slowest proc is doing almost twice the amount of work as the fastest proc]. Satish On Tue, 20 Nov 2007, Tim Stitt wrote: > Satish, > > Logs attached...hope they help. > > Thanks, > > Tim. 
> > Satish Balay wrote: > > Can you send the -log_summary for your runs [say p=1, p=8] > > > > Satish > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > Hi all (again), > > > > > > I finally got some data back from the KSP PETSc code that I put together > > > to > > > solve this sparse inverse matrix problem I was looking into. Ideally I am > > > aiming for a O(N) (time complexity) approach to getting the first 'k' > > > columns > > > of the inverse of a sparse matrix. > > > > > > To recap the method: I have my solver which uses KSPSolve in a loop that > > > iterates over the first k columns of an identity matrix B and computes the > > > corresponding x vector. > > > > > > I am just a bit curious about some of the timings I am obtaining...which I > > > hope someone can explain. Here are the timings I obtained for a global > > > sparse > > > matrix (4704 x 4704) and solving for the first 1176 columns in the > > > identity > > > using P processes (processors) on our cluster. > > > > > > (Timings are given in seconds for each process performing work in the loop > > > and > > > were obtained by encapsulating the loop with the cpu_time() Fortran > > > intrinsic. > > > The MUMPS package was requested for factorisation/solving, although > > > similar > > > timings were obtained for both the native solver and SUPERLU) > > > > > > P=1 [30.92] > > > P=2 [15.47, 15.54] > > > P=4 [4.68, 5.49, 4.67, 5.07] > > > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > > > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, > > > 0.25, > > > 0.43, 1.09, 1.08, 1.1] > > > > > > Firstly, I notice very good scalability up to 16 processes...is this > > > expected > > > (by those people who use these solvers regularly)? > > > > > > Also I notice that the timings per process vary as we scale up. Is this a > > > load-balancing problem related to more non-zero values being on a given > > > processor than others? Once again is this expected? > > > > > > Please excuse my ignorance of matters relating to these solvers and their > > > operation...as it really isn't my field of expertise. > > > > > > Regards, > > > > > > Tim. > > > > > > > > > > > > > > > > From timothy.stitt at ichec.ie Wed Nov 21 04:52:05 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 21 Nov 2007 10:52:05 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> Message-ID: <47440DD5.7050901@ichec.ie> Satish, Thanks for your helpful comments. I am unsure why the VecAssembyBegin() routine is taking a high percentage of the wall-clock when modifications to the parallel vector should be local (all I am doing is working out which element in the RHS b vector should be 1 and setting it). Here is my loop for iterating through the RHS Identity matrix and setting the relevant element to 1...prior to the call to KSPSolve. I then reset that value to 0 after the Solve in preparation for the next iteration. ! Get vector index range per process call VecGetOwnershipRange(B,firstElement,lastElement,error); do column=0,rhs-1 ! Loop over RHS columns in Identity Matrix if ((column.ge.firstElement).and.(column.lt.lastElement)) then call VecSetValue(B,column,one,INSERT_VALUES,error) end if call VecAssemblyBegin(B,error) call VecAssemblyEnd(B,error) ! 
Solve Ax=b call KSPSolve(ksp,b,x,error);!CHKERRQ(error) if ((column.ge.firstElement).and.(column.lt.lastElement)) then call VecSetValue(B,column,zero,INSERT_VALUES,error) end if end do Can you identify if I am doing something stupid which could be compromising the efficiency of the Assembly routine? Thanks again, Tim. Satish Balay wrote: > a couple of comments: > > Looks like most of the time is spent in MatSolve(). [90% for np=1] > > However on np=8 run, you have MatSolve() taking 42% time, whereas > VecAssemblyBegin() taking 32% time. Depending upon whats beeing done > with VecSetValues()/VecAssembly() - you might be able to reduce this > time considerably. [ If you can generate values locally - then no > communication is required. If you need to communicate values - then > you can explore VecScatters() for more efficient communication] > > Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is > 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would > probably be taking 6 sec.]. The max/min ratio of flops across procs is > 1.8. So there is indeed a load balance issue that is contributing to > different times on different processors [I guess the slowest proc is > doing almost twice the amount of work as the fastest proc]. > > Satish > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > >> Satish, >> >> Logs attached...hope they help. >> >> Thanks, >> >> Tim. >> >> Satish Balay wrote: >> >>> Can you send the -log_summary for your runs [say p=1, p=8] >>> >>> Satish >>> >>> On Tue, 20 Nov 2007, Tim Stitt wrote: >>> >>> >>> >>>> Hi all (again), >>>> >>>> I finally got some data back from the KSP PETSc code that I put together >>>> to >>>> solve this sparse inverse matrix problem I was looking into. Ideally I am >>>> aiming for a O(N) (time complexity) approach to getting the first 'k' >>>> columns >>>> of the inverse of a sparse matrix. >>>> >>>> To recap the method: I have my solver which uses KSPSolve in a loop that >>>> iterates over the first k columns of an identity matrix B and computes the >>>> corresponding x vector. >>>> >>>> I am just a bit curious about some of the timings I am obtaining...which I >>>> hope someone can explain. Here are the timings I obtained for a global >>>> sparse >>>> matrix (4704 x 4704) and solving for the first 1176 columns in the >>>> identity >>>> using P processes (processors) on our cluster. >>>> >>>> (Timings are given in seconds for each process performing work in the loop >>>> and >>>> were obtained by encapsulating the loop with the cpu_time() Fortran >>>> intrinsic. >>>> The MUMPS package was requested for factorisation/solving, although >>>> similar >>>> timings were obtained for both the native solver and SUPERLU) >>>> >>>> P=1 [30.92] >>>> P=2 [15.47, 15.54] >>>> P=4 [4.68, 5.49, 4.67, 5.07] >>>> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >>>> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, >>>> 0.25, >>>> 0.43, 1.09, 1.08, 1.1] >>>> >>>> Firstly, I notice very good scalability up to 16 processes...is this >>>> expected >>>> (by those people who use these solvers regularly)? >>>> >>>> Also I notice that the timings per process vary as we scale up. Is this a >>>> load-balancing problem related to more non-zero values being on a given >>>> processor than others? Once again is this expected? >>>> >>>> Please excuse my ignorance of matters relating to these solvers and their >>>> operation...as it really isn't my field of expertise. >>>> >>>> Regards, >>>> >>>> Tim. 
>>>> >>>> >>>> >>>> >>> >>> >> >> > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Wed Nov 21 04:56:29 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 21 Nov 2007 10:56:29 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> Message-ID: <47440EDD.5030304@ichec.ie> Definitely will do that Barry...thanks for the tip. We actually intend to use the code for larger matrices (of order 10^6) so your method would be very beneficial at that stage, on our cluster. As you say, multiple communicators are easy to implement which allows another level of parallelism on top of the parallel solver itself. Best, Tim. Barry Smith wrote: > > Tim, > > This is an unrelated comment, but may help you with scaling to > many processes. > Since the matrix is so SMALL it will be hard to get good scaling on > the linear solves > for a large number of processes, but since you need MANY right hand > sides you > might consider having different groups of processes (MPI_Comms) handle > collections > of right hand sides. For example if you have 64 processes you might > use 4 MPI_Comm's > each of size 16, or even 8 MPI_Comm's each of size 8. Coding this is easy > simply use MPI to generate the appropriate communicator (for the > subsets of processes) > and then create the Mat, the KSP etc on that communicator instead of > MPI_COMM_WORLD > > > Barry > > > > On Nov 20, 2007, at 11:45 AM, Tim Stitt wrote: > >> Hi all (again), >> >> I finally got some data back from the KSP PETSc code that I put >> together to solve this sparse inverse matrix problem I was looking >> into. Ideally I am aiming for a O(N) (time complexity) approach to >> getting the first 'k' columns of the inverse of a sparse matrix. >> >> To recap the method: I have my solver which uses KSPSolve in a loop >> that iterates over the first k columns of an identity matrix B and >> computes the corresponding x vector. >> >> I am just a bit curious about some of the timings I am >> obtaining...which I hope someone can explain. Here are the timings I >> obtained for a global sparse matrix (4704 x 4704) and solving for the >> first 1176 columns in the identity using P processes (processors) on >> our cluster. >> >> (Timings are given in seconds for each process performing work in the >> loop and were obtained by encapsulating the loop with the cpu_time() >> Fortran intrinsic. The MUMPS package was requested for >> factorisation/solving, although similar timings were obtained for >> both the native solver and SUPERLU) >> >> P=1 [30.92] >> P=2 [15.47, 15.54] >> P=4 [4.68, 5.49, 4.67, 5.07] >> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, >> 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] >> >> Firstly, I notice very good scalability up to 16 processes...is this >> expected (by those people who use these solvers regularly)? >> >> Also I notice that the timings per process vary as we scale up. Is >> this a load-balancing problem related to more non-zero values being >> on a given processor than others? Once again is this expected? >> >> Please excuse my ignorance of matters relating to these solvers and >> their operation...as it really isn't my field of expertise. >> >> Regards, >> >> Tim. >> >> --Dr. 
Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Wed Nov 21 10:16:25 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 Nov 2007 10:16:25 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <47440DD5.7050901@ichec.ie> References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> <47440DD5.7050901@ichec.ie> Message-ID: If you are just setting local values, then its best to avoid calls to VecAssembyBegin()/VecAssemblyEnd(). These have calls to MPI_Allreduce() - eventhough there might not be any communication. [so with a MPI_Barrier time of 0.00820498sec, 4704 calls to MPI_Allreduce(), which is similar to a barrier - would add up to many seconds. In this case it could be most of the 12sec time taken by VecAssemblyBegin()] Normally local assembly of a vec is done by accessing the local vector data, directly and modifying the values. VecGetArray(vec,&ptr) ptr[local-dim]= val VecRestoreArray(vec) With fortran77, since pointer usageis not possible - there is a workarround. [check vec/vec/examples/tutorials/ex4f.F for VecGetArray() usage from F77]. But with F90, you can use VecGetArrayF90()/VecRestoreArrayF90() [as in ex4f90.F]. However in your case - you might be able to continuing using VecSetValue(), by just commenting out the calls to VecAssemblyBegin()/End(). [you might first want to run with -info, to make sure there is no communiation in VecAssembly] Satish On Wed, 21 Nov 2007, Tim Stitt wrote: > Satish, > > Thanks for your helpful comments. I am unsure why the VecAssembyBegin() > routine is taking a high percentage of the wall-clock when modifications to > the parallel vector should be local (all I am doing is working out which > element in the RHS b vector should be 1 and setting it). > > Here is my loop for iterating through the RHS Identity matrix and setting the > relevant element to 1...prior to the call to KSPSolve. I then reset that value > to 0 after the Solve in preparation for the next iteration. > > ! Get vector index range per process > call VecGetOwnershipRange(B,firstElement,lastElement,error); > > do column=0,rhs-1 ! Loop over RHS columns in Identity Matrix > > if ((column.ge.firstElement).and.(column.lt.lastElement)) then > call VecSetValue(B,column,one,INSERT_VALUES,error) > end if > > call VecAssemblyBegin(B,error) > call VecAssemblyEnd(B,error) > > ! Solve Ax=b > call KSPSolve(ksp,b,x,error);!CHKERRQ(error) > > if ((column.ge.firstElement).and.(column.lt.lastElement)) then > call VecSetValue(B,column,zero,INSERT_VALUES,error) > end if > > end do > > Can you identify if I am doing something stupid which could be compromising > the efficiency of the Assembly routine? > > Thanks again, > > Tim. > > Satish Balay wrote: > > a couple of comments: > > > > Looks like most of the time is spent in MatSolve(). [90% for np=1] > > > > However on np=8 run, you have MatSolve() taking 42% time, whereas > > VecAssemblyBegin() taking 32% time. Depending upon whats beeing done > > with VecSetValues()/VecAssembly() - you might be able to reduce this > > time considerably. [ If you can generate values locally - then no > > communication is required. 
If you need to communicate values - then > > you can explore VecScatters() for more efficient communication] > > > > Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is > > 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would > > probably be taking 6 sec.]. The max/min ratio of flops across procs is > > 1.8. So there is indeed a load balance issue that is contributing to > > different times on different processors [I guess the slowest proc is > > doing almost twice the amount of work as the fastest proc]. > > > > Satish > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > Satish, > > > > > > Logs attached...hope they help. > > > > > > Thanks, > > > > > > Tim. > > > > > > Satish Balay wrote: > > > > > > > Can you send the -log_summary for your runs [say p=1, p=8] > > > > > > > > Satish > > > > > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > > > > > > > Hi all (again), > > > > > > > > > > I finally got some data back from the KSP PETSc code that I put > > > > > together > > > > > to > > > > > solve this sparse inverse matrix problem I was looking into. Ideally I > > > > > am > > > > > aiming for a O(N) (time complexity) approach to getting the first 'k' > > > > > columns > > > > > of the inverse of a sparse matrix. > > > > > > > > > > To recap the method: I have my solver which uses KSPSolve in a loop > > > > > that > > > > > iterates over the first k columns of an identity matrix B and computes > > > > > the > > > > > corresponding x vector. > > > > > > > > > > I am just a bit curious about some of the timings I am > > > > > obtaining...which I > > > > > hope someone can explain. Here are the timings I obtained for a global > > > > > sparse > > > > > matrix (4704 x 4704) and solving for the first 1176 columns in the > > > > > identity > > > > > using P processes (processors) on our cluster. > > > > > > > > > > (Timings are given in seconds for each process performing work in the > > > > > loop > > > > > and > > > > > were obtained by encapsulating the loop with the cpu_time() Fortran > > > > > intrinsic. > > > > > The MUMPS package was requested for factorisation/solving, although > > > > > similar > > > > > timings were obtained for both the native solver and SUPERLU) > > > > > > > > > > P=1 [30.92] > > > > > P=2 [15.47, 15.54] > > > > > >>>> P=4 [4.68, 5.49, 4.67, 5.07] > > > > > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > > > > > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, > > > > > 0.25, > > > > > 0.43, 1.09, 1.08, 1.1] > > > > > > > > > > Firstly, I notice very good scalability up to 16 processes...is this > > > > > expected > > > > > (by those people who use these solvers regularly)? > > > > > > > > > > Also I notice that the timings per process vary as we scale up. Is > > > > > this a > > > > > load-balancing problem related to more non-zero values being on a > > > > > given > > > > > processor than others? Once again is this expected? > > > > > > > > > > Please excuse my ignorance of matters relating to these solvers and > > > > > their > > > > > operation...as it really isn't my field of expertise. > > > > > > > > > > Regards, > > > > > > > > > > Tim. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From w_subber at yahoo.com Thu Nov 22 05:46:33 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 03:46:33 -0800 (PST) Subject: pc_factor_fill Message-ID: <447770.46513.qm@web38207.mail.mud.yahoo.com> Hello PETSc Users, I have a serial code to solve multiple matrices. 
I am using LU factorization. When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From z.sheng at ewi.tudelft.nl Thu Nov 22 07:17:27 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 22 Nov 2007 14:17:27 +0100 Subject: pc_factor_fill In-Reply-To: <447770.46513.qm@web38207.mail.mud.yahoo.com> References: <447770.46513.qm@web38207.mail.mud.yahoo.com> Message-ID: <47458167.5060308@ewi.tudelft.nl> Waad Subber wrote: > Hello PETSc Users, > > I have a serial code to solve multiple matrices. I am using LU > factorization. When I run the code with the (-info) option, it gives > me different values for (pc_factor_fill) depending on the input > matrix. I am wondering if I can set these values for the > (pc_factor_fill) inside the code instead of running it with runtime > option, for it is one code with multiple inputs. > > Thanks a lot > > Waad > > > For Matrix No.1 > > [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 2.96 > [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use > [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); > [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No.2 > > [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 2.42069 > [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use > [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); > [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No. n > > [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 1.87742 > [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use > [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); > [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > ------------------------------------------------------------------------ > Be a better sports nut! Let your teams follow you with Yahoo Mobile. > Try it now. 
> try *-pc_factor_fill * it works for me:) From z.sheng at ewi.tudelft.nl Thu Nov 22 09:10:07 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 22 Nov 2007 16:10:07 +0100 Subject: Matrix reuse Message-ID: <47459BCF.3070409@ewi.tudelft.nl> Dear all In my application, I have small matrices that are created and destroied, those matrices are of the same nozero pattern. I wonder if there is a way to reuse that matrices instead of destroy them every time. Thank you Best regards Zhifeng Sheng From timothy.stitt at ichec.ie Thu Nov 22 11:16:01 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 22 Nov 2007 17:16:01 +0000 Subject: Banded Tridiagonal Matrices in PETSc Message-ID: <4745B951.3000407@ichec.ie> Hi, I was just wondering if PETSc has any special provision for banded tridiagonal complex matrices when used in conjunction with KSPSolve(). Are there any special PETSc matrix types or factorisation/solver methods that benefit more from this matrix form? Currently I am just using standard AIJ representation in my serial/parallel codes. I would be grateful for any thoughts. Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From dalcinl at gmail.com Thu Nov 22 12:22:24 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 22 Nov 2007 15:22:24 -0300 Subject: Matrix reuse In-Reply-To: <47459BCF.3070409@ewi.tudelft.nl> References: <47459BCF.3070409@ewi.tudelft.nl> Message-ID: On 11/22/07, Zhifeng Sheng wrote: > In my application, I have small matrices that are created and destroied, > those matrices are of the same nozero pattern. > I wonder if there is a way to reuse that matrices instead of destroy > them every time. Just do not destroy them! Use the matrices, and then start again your loop with MatSetValues(), calling MatAssemblyBegin() and MatAssemblyEnd() after the loop. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Nov 22 12:38:12 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 12:38:12 -0600 Subject: pc_factor_fill In-Reply-To: <447770.46513.qm@web38207.mail.mud.yahoo.com> References: <447770.46513.qm@web38207.mail.mud.yahoo.com> Message-ID: <0779CEF7-E610-4ACE-B7CB-9241CD5F437D@mcs.anl.gov> PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: > Hello PETSc Users, > > I have a serial code to solve multiple matrices. I am using LU > factorization. When I run the code with the (-info) option, it > gives me different values for (pc_factor_fill) depending on the > input matrix. I am wondering if I can set these values for the > (pc_factor_fill) inside the code instead of running it with runtime > option, for it is one code with multiple inputs. 
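A short C sketch of the suggestion above, setting the fill through the PC attached to the KSP and giving each solver its own options prefix; the fill value 2.96 simply echoes the -info output quoted in this thread, and the prefix "sys1_" is made up for illustration:

    KSP ksp;
    PC  pc;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);  /* 2.3.x-era signature */
    KSPSetType(ksp, KSPPREONLY);
    KSPGetPC(ksp, &pc);                   /* fetch the PC this KSP will use    */
    PCSetType(pc, PCLU);
    PCFactorSetFill(pc, 2.96);            /* expected nnz(factor)/nnz(matrix)  */

    /* With several solves in one code, a prefix lets each be tuned separately
       from the command line, e.g. -sys1_pc_factor_fill 2.96                   */
    KSPSetOptionsPrefix(ksp, "sys1_");
    KSPSetFromOptions(ksp);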
> > Thanks a lot > > Waad > > > For Matrix No.1 > > [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.96 > [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use > [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); > [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No.2 > > [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.42069 > [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 > or use > [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); > [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No. n > > [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 1.87742 > [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 > or use > [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); > [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > Be a better sports nut! Let your teams follow you with Yahoo Mobile. > Try it now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Nov 22 12:41:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 12:41:17 -0600 Subject: Banded Tridiagonal Matrices in PETSc In-Reply-To: <4745B951.3000407@ichec.ie> References: <4745B951.3000407@ichec.ie> Message-ID: There is a format Bdiag that stores by "banded diagonal". You will find that this performs slower than then the AIJ format. If your matrix has constant values along the "diagaonals" then you will benefit from using a MatShell and writing custom code. if the values along the diagonals are not constant you will not do any better than AIJ anyways. Barry On Nov 22, 2007, at 11:16 AM, Tim Stitt wrote: > Hi, > > I was just wondering if PETSc has any special provision for banded > tridiagonal complex matrices when used in conjunction with > KSPSolve(). Are there any special PETSc matrix types or > factorisation/solver methods that benefit more from this matrix form? > > Currently I am just using standard AIJ representation in my serial/ > parallel codes. > > I would be grateful for any thoughts. > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From w_subber at yahoo.com Thu Nov 22 12:55:38 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 10:55:38 -0800 (PST) Subject: pc_factor_fill In-Reply-To: <0779CEF7-E610-4ACE-B7CB-9241CD5F437D@mcs.anl.gov> Message-ID: <22056.18433.qm@web38209.mail.mud.yahoo.com> Thanks Barry One more question please: Should I get and set the fill factor like this: call MatGetInfo(A,MAT_LOCAL,info,ierr) FACTFILL = info(MAT_INFO_fill_ratio_needed) call PCFactorSetFill(pc,FACTFILL,ierr) call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) Thanks Waad Barry Smith wrote: PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: Hello PETSc Users, I have a serial code to solve multiple matrices. I am using LU factorization. 
When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. --------------------------------- Be a better pen pal. Text or chat with friends inside Yahoo! Mail. See how. -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.stitt at ichec.ie Thu Nov 22 12:55:54 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 22 Nov 2007 18:55:54 +0000 Subject: Banded Tridiagonal Matrices in PETSc In-Reply-To: References: <4745B951.3000407@ichec.ie> Message-ID: <4745D0BA.8030208@ichec.ie> Thanks Barry, that is all I need to know. I can stick with my current implementation then...great stuff. Barry Smith wrote: > > There is a format Bdiag that stores by "banded diagonal". You will > find that > this performs slower than then the AIJ format. > > If your matrix has constant values along the "diagaonals" then you > will benefit from > using a MatShell and writing custom code. if the values along the > diagonals are > not constant you will not do any better than AIJ anyways. > > Barry > > On Nov 22, 2007, at 11:16 AM, Tim Stitt wrote: > >> Hi, >> >> I was just wondering if PETSc has any special provision for banded >> tridiagonal complex matrices when used in conjunction with >> KSPSolve(). Are there any special PETSc matrix types or >> factorisation/solver methods that benefit more from this matrix form? >> >> Currently I am just using standard AIJ representation in my >> serial/parallel codes. >> >> I would be grateful for any thoughts. >> >> Thanks, >> >> Tim. >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From bsmith at mcs.anl.gov Thu Nov 22 18:54:47 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 18:54:47 -0600 Subject: pc_factor_fill In-Reply-To: <22056.18433.qm@web38209.mail.mud.yahoo.com> References: <22056.18433.qm@web38209.mail.mud.yahoo.com> Message-ID: <7FA47E14-B041-4423-8D95-7164874E36CD@mcs.anl.gov> If you are using KSP then you should NEVER call MatGetInfo(), MatLUFactor.... just call the PCFactorSetFill(). If you are not using the KSP then you need to declare a MatFactorInfo and fill it up the way you want. Note that MatInfo is different from MatFactorInfo you need to set the value in MatFactorInfo that you pass to the factorization routines. Barry On Nov 22, 2007, at 12:55 PM, Waad Subber wrote: > Thanks Barry > > One more question please: > > Should I get and set the fill factor like this: > > call MatGetInfo(A,MAT_LOCAL,info,ierr) > FACTFILL = info(MAT_INFO_fill_ratio_needed) > call PCFactorSetFill(pc,FACTFILL,ierr) > call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) > call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... > call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) > > Thanks > > Waad > > Barry Smith wrote: > > PCFactorSetFill() after calling KSPSetPC() > > Barry > > If you are using multiple different KSP's you might look at > KSPSetOptionsPrefix() > to allow using command line options to set different values for > different solvers. > > On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: > >> Hello PETSc Users, >> >> I have a serial code to solve multiple matrices. I am using LU >> factorization. When I run the code with the (-info) option, it >> gives me different values for (pc_factor_fill) depending on the >> input matrix. I am wondering if I can set these values for the >> (pc_factor_fill) inside the code instead of running it with runtime >> option, for it is one code with multiple inputs. >> >> Thanks a lot >> >> Waad >> >> >> For Matrix No.1 >> >> [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.96 >> [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or >> use >> [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); >> [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> For Matrix No.2 >> >> [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.42069 >> [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 >> or use >> [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); >> [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> For Matrix No. n >> >> [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 1.87742 >> [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 >> or use >> [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); >> [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> Be a better sports nut! Let your teams follow you with Yahoo >> Mobile. Try it now. > > > > Be a better pen pal. Text or chat with friends inside Yahoo! Mail. > See how. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From w_subber at yahoo.com Thu Nov 22 19:29:04 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 17:29:04 -0800 (PST) Subject: pc_factor_fill In-Reply-To: <7FA47E14-B041-4423-8D95-7164874E36CD@mcs.anl.gov> Message-ID: <760027.34004.qm@web38212.mail.mud.yahoo.com> Thank you Barry :o) Waad Barry Smith wrote: If you are using KSP then you should NEVER call MatGetInfo(), MatLUFactor....just call the PCFactorSetFill(). If you are not using the KSP then you need to declare a MatFactorInfo and fill it up the way you want. Note that MatInfo is different from MatFactorInfo you need to set the value in MatFactorInfo that you pass to the factorization routines. Barry On Nov 22, 2007, at 12:55 PM, Waad Subber wrote: Thanks Barry One more question please: Should I get and set the fill factor like this: call MatGetInfo(A,MAT_LOCAL,info,ierr) FACTFILL = info(MAT_INFO_fill_ratio_needed) call PCFactorSetFill(pc,FACTFILL,ierr) call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) Thanks Waad Barry Smith wrote: PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: Hello PETSc Users, I have a serial code to solve multiple matrices. I am using LU factorization. When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. --------------------------------- Be a better pen pal. Text or chat with friends inside Yahoo! Mail. See how. --------------------------------- Never miss a thing. Make Yahoo your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend at chalmers.se Fri Nov 23 03:04:15 2007 From: berend at chalmers.se (Berend van Wachem) Date: Fri, 23 Nov 2007 10:04:15 +0100 Subject: Matrix reuse In-Reply-To: <47459BCF.3070409@ewi.tudelft.nl> References: <47459BCF.3070409@ewi.tudelft.nl> Message-ID: <4746978F.1080901@chalmers.se> Hi, I use MatZeroEntries on the matrix, and then re-use it. I'm not sure what the gain in time is, though. Berend. 
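A small C sketch of the reuse pattern being discussed here: one preallocated matrix that is refilled and reassembled every cycle instead of being destroyed and recreated. The sizes and the toy 1-D Laplacian fill are made up; the point is the MatZeroEntries/MatSetValues/MatAssembly sequence over an unchanged nonzero pattern:

    Mat         A;
    PetscInt    n = 100, i, cycle, ncols;
    PetscInt    cols[3];
    PetscScalar vals[3];

    MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, PETSC_NULL, &A);   /* preallocate 3 nonzeros per row */

    for (cycle = 0; cycle < 10; cycle++) {
      MatZeroEntries(A);                      /* zeros the values, keeps the nonzero pattern */
      for (i = 0; i < n; i++) {               /* refill: here a toy 1-D Laplacian row        */
        ncols = 0;
        if (i > 0)     { cols[ncols] = i - 1; vals[ncols++] = -1.0; }
        cols[ncols] = i; vals[ncols++] = 2.0;
        if (i < n - 1) { cols[ncols] = i + 1; vals[ncols++] = -1.0; }
        MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);
      }
      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
      /* ... use A here (e.g. KSPSolve), then loop around and refill it ... */
    }
    MatDestroy(A);    /* destroy once, at the very end (later PETSc versions take &A) */

Because the pattern never changes after the first assembly, the later MatSetValues() calls do no extra mallocs, which is where the time savings come from.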
Zhifeng Sheng wrote: > Dear all > > In my application, I have small matrices that are created and destroied, > those matrices are of the same nozero pattern. > > I wonder if there is a way to reuse that matrices instead of destroy > them every time. > > Thank you > > Best regards > Zhifeng Sheng > From amjad11 at gmail.com Fri Nov 23 07:10:55 2007 From: amjad11 at gmail.com (amjad ali) Date: Fri, 23 Nov 2007 18:10:55 +0500 Subject: Establishing Fast Eth Beowulf Cluster for Using PETSc on that Message-ID: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> Hello everybody at PETSc-maint and PETSc-users, Being new in cluster world what I have learnt to establish a simple PC cluster is to do 1)setup network (making master node as NFS server, compute nodes as NFS clients) 2)setup login system (RSH/SSH) 3)setup parallel environment (MPI). Now I want to setup a new Beowulf Cluster of few PCs to run PETSc program on that. please tell me steps of how to build it. I do not want to install MPICH2 separately on the cluster (just want to install/use as a part of PETSc). I have successfully tested PETSc parallel programs on my PC (definately with your kind guidence). Regards, Amjad Ali. From dalcinl at gmail.com Fri Nov 23 09:01:06 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 23 Nov 2007 12:01:06 -0300 Subject: Matrix reuse In-Reply-To: <4746978F.1080901@chalmers.se> References: <47459BCF.3070409@ewi.tudelft.nl> <4746978F.1080901@chalmers.se> Message-ID: On 11/23/07, Berend van Wachem wrote: > Hi, > I use MatZeroEntries on the matrix, and then re-use it. I'm not sure > what the gain in time is, though. MatZeroEntries just zero-out in the scalar entries, but it retains the nonzero structure of the matrix. Thus, the next time you make a loop calling MatSetValues(), PETSc will fastly put the new values in the right location, avoiding any memory allocation or data movement inside the sparse structure. Spase matrix assembly is not actually a trivial task, and in the parallel case, it is even far harder. But PETSc make it really easy and it is optimized for data reuse. Then, to get good performace, you have to create and preallocate your matrix, and next reuse it as much as you can. You can gain a lot, unless your matrices are really small. > Zhifeng Sheng wrote: > > Dear all > > > > In my application, I have small matrices that are created and destroied, > > those matrices are of the same nozero pattern. > > > > I wonder if there is a way to reuse that matrices instead of destroy > > them every time. > > > > Thank you > > > > Best regards > > Zhifeng Sheng > > > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Nov 23 09:38:25 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 23 Nov 2007 12:38:25 -0300 Subject: Establishing Fast Eth Beowulf Cluster for Using PETSc on that In-Reply-To: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> References: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> Message-ID: On 11/23/07, amjad ali wrote: > Now I want to setup a new Beowulf Cluster of few PCs to run PETSc > program on that. please tell me steps of how to build it. 
> I do not > want to install MPICH2 separately on the cluster (just want to > install/use as a part of PETSc). I use a similar setup. I would recommed you to do the following: * Make '/usr/local' availabe at your nodes using NFS. * Build and install MPICH2 using prefix '/usr/local/mpich2' $ ./configure --prefix=/usr/local/mpich2 ... $ make $ su -c 'make install' # or: sudo make install * Now modify your path: $ export PATH=/usr/local/mpich2/bin:$PATH * Now build PETSc, unpacking sources on '/usr/local' $ su -l $ cd /usr/local $ tar -zxf petsc-xxx.tar.gz $ cd petsc-xxx $ export PETSC_DIR=`pwd` $ export PETSC_ARCH=linux-gnu $ python config/configure.py $ make And this should be all you have to do. You will have PETSc built at '/usr/local/petsc-xxx', and all your nodes would be able to see and use it via NFS. If you have any problem, feel free to ask me again. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From ps at cs.caltech.edu Fri Nov 23 18:58:33 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Fri, 23 Nov 2007 16:58:33 -0800 Subject: dropping columns/rows from matrix? Message-ID: <47477739.3010103@cs.caltech.edu> As part of a greedy basis pursuit algorithm I drop/undrop columns/rows from a matrix and resolve. I don't want to rebuild the matrix each time. Is there a quick way to do this? Basically the setup is this. Consider a 2-manifold triangle mesh and a discretization (piecewise linear FE) of the Laplace-Beltrami operator over this mesh (symmetric positive (semi-)definite [constant vector is the only null space vector]). Fix boundary conditions (zero Dirichlet in my case). Solve for a given rhs (I am using CG and absolute Jacobi as a precon with good success). Based on the solution, take out a column (and the same row) and resolve. Repeat this a few times until, say, 10 variables are dropped. Now pick one of them, say, i, and reintroduce it. Based on the solution replace i with i_new. Now visit another variable of the original 10 and "move" it. Etc. Each one of the solves is quick (and I need to do hundreds for matrices with hundreds of thousands to millions of variables). I'd rather not rebuild the matrix each time... Any suggestions? Thanks much! Peter From bsmith at mcs.anl.gov Fri Nov 23 19:28:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Nov 2007 19:28:52 -0600 Subject: dropping columns/rows from matrix? In-Reply-To: <47477739.3010103@cs.caltech.edu> References: <47477739.3010103@cs.caltech.edu> Message-ID: Peter, We use MatGetSubMatrix() to do this. Rather than dropping/adding rows and columns just grab the part you want and use it then destroy it and grab another part you want. We've done this with active set methods and the time to do the MatGetSubMatrix() has been surprisingly small percentage of the time. Barry On Nov 23, 2007, at 6:58 PM, Peter Schr?der wrote: > As part of a greedy basis pursuit algorithm I drop/undrop columns/ > rows from a matrix and resolve. I don't want to rebuild the matrix > each time. Is there a quick way to do this? > > Basically the setup is this. 
Consider a 2-manifold triangle mesh and > a discretization (piecewise linear FE) of the Laplace-Beltrami > operator over this mesh (symmetric positive (semi-)definite > [constant vector is the only null space vector]). Fix boundary > conditions (zero Dirichlet in my case). Solve for a given rhs (I am > using CG and absolute Jacobi as a precon with good success). Based > on the solution, take out a column (and the same row) and resolve. > Repeat this a few times until, say, 10 variables are dropped. Now > pick one of them, say, i, and reintroduce it. Based on the solution > replace i with i_new. Now visit another variable of the original 10 > and "move" it. Etc. > > Each one of the solves is quick (and I need to do hundreds for > matrices with hundreds of thousands to millions of variables). I'd > rather not rebuild the matrix each time... Any suggestions? > > Thanks much! > > Peter > From ps at cs.caltech.edu Fri Nov 23 20:13:20 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Fri, 23 Nov 2007 18:13:20 -0800 Subject: dropping columns/rows from matrix? In-Reply-To: References: <47477739.3010103@cs.caltech.edu> Message-ID: <474788C0.8050903@cs.caltech.edu> Barry Smith wrote: > We use MatGetSubMatrix() to do this. Rather than dropping/adding > rows and columns just grab the > part you want and use it then destroy it and grab another part you > want. We've done this with active set > methods and the time to do the MatGetSubMatrix() has been surprisingly > small percentage of the time. Aaaah. Ok. Even if the submatrix is everyone but one column/row? Very well. The one other way I had thought might be to fix one of the variables (some entry of the x vector is forced to zero) and ignore the corresponding part of the residual. But this may be too hacky an approach given the overall structure of Petsc. I'll use MatGetSubMatrix() for now. peter From bsmith at mcs.anl.gov Fri Nov 23 20:38:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Nov 2007 20:38:52 -0600 Subject: dropping columns/rows from matrix? In-Reply-To: <474788C0.8050903@cs.caltech.edu> References: <47477739.3010103@cs.caltech.edu> <474788C0.8050903@cs.caltech.edu> Message-ID: <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> On Nov 23, 2007, at 8:13 PM, Peter Schr?der wrote: > Barry Smith wrote: >> We use MatGetSubMatrix() to do this. Rather than dropping/adding >> rows and columns just grab the >> part you want and use it then destroy it and grab another part you >> want. We've done this with active set >> methods and the time to do the MatGetSubMatrix() has been >> surprisingly small percentage of the time. > Aaaah. Ok. Even if the submatrix is everyone but one column/row? > Very well. > Yup :-). Only a couple of percent of the active set solver was spent in the GetSubMatrix() > The one other way I had thought might be to fix one of the variables > (some entry of the x vector is forced to zero) and ignore the > corresponding part of the residual. But this may be too hacky an > approach given the overall structure of Petsc. > > I'll use MatGetSubMatrix() for now. > > peter > From dalcinl at gmail.com Sat Nov 24 14:39:20 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 24 Nov 2007 17:39:20 -0300 Subject: dropping columns/rows from matrix? 
In-Reply-To: <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> References: <47477739.3010103@cs.caltech.edu> <474788C0.8050903@cs.caltech.edu> <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> Message-ID: On 11/23/07, Barry Smith wrote: > > Barry Smith wrote: > >> We use MatGetSubMatrix() to do this. >the time to do the MatGetSubMatrix() has been > >> surprisingly small percentage of the time. Indeed, MatGetSubMatrix is surprisingly fast in my personal experience. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From timothy.stitt at ichec.ie Sun Nov 25 07:18:16 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 13:18:16 +0000 Subject: Diagonal Elements of an Inverse Matrix Message-ID: <47497618.8010004@ichec.ie> Hi PETSc Users/Developers, I was just wondering if anyone knew of any O(N) methods for obtaining the diagonal elements of the inverse of a block tridiagonal matrix,without computing all the off-diagonal values at the same time? Actually, the general case would be most useful were selected elements in the inverse could be obtained in O(N) time. I would be grateful if anyone could shed any light on this... Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Sun Nov 25 12:54:54 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 18:54:54 +0000 Subject: Zero Pivot Row in LU Factorization Message-ID: <4749C4FE.7070602@ichec.ie> Hi all, Can anyone suggest ways of overcoming the following pivot error I keep receiving in my PETSc code during a KSPSolve(). [1]PETSC ERROR: Detected zero pivot in LU factorization see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance 0.00165189 * rowsum 1.65189e+09! From checking the documentation....the error is in row 1801, which means it is most likely not a matrix assembly issue? I tried the following prior to the solve with no luck either..... call KSPGetPC(ksp,pc,error) call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) Is there anything else I can try? Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 25 13:02:10 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Nov 2007 13:02:10 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C4FE.7070602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> Message-ID: On Nov 25, 2007 12:54 PM, Tim Stitt wrote: > Hi all, > > Can anyone suggest ways of overcoming the following pivot error I keep > receiving in my PETSc code during a KSPSolve(). > > [1]PETSC ERROR: Detected zero pivot in LU factorization > see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > 0.00165189 * rowsum 1.65189e+09! > > From checking the documentation....the error is in row 1801, which > means it is most likely not a matrix assembly issue? 
> > I tried the following prior to the solve with no luck either..... > > call KSPGetPC(ksp,pc,error) > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) I bet you are shifting the Block-Jacobi PC, not the LU PC which is the subsolver. Matt > Is there anything else I can try? > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 25 13:10:12 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 19:10:12 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C4FE.7070602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> Message-ID: <4749C894.5090602@ichec.ie> I should also add that the code executes without this error when using 1 processor...but then displays the error when running in parallel with more than one process. Tim Stitt wrote: > Hi all, > > Can anyone suggest ways of overcoming the following pivot error I keep > receiving in my PETSc code during a KSPSolve(). > > [1]PETSC ERROR: Detected zero pivot in LU factorization > see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > 0.00165189 * rowsum 1.65189e+09! > > From checking the documentation....the error is in row 1801, which > means it is most likely not a matrix assembly issue? > > I tried the following prior to the solve with no luck either..... > > call KSPGetPC(ksp,pc,error) > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > > Is there anything else I can try? > > Thanks, > > Tim. > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 25 13:13:09 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Nov 2007 13:13:09 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C894.5090602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> Message-ID: This is because on many processes I believe you are running Block-Jacobi with LU on the diagonal blocks. It is easy for one of these blocks to be singular. Matt On Nov 25, 2007 1:10 PM, Tim Stitt wrote: > I should also add that the code executes without this error when using 1 > processor...but then displays the error when running in parallel with > more than one process. > > Tim Stitt wrote: > > Hi all, > > > > Can anyone suggest ways of overcoming the following pivot error I keep > > receiving in my PETSc code during a KSPSolve(). > > > > [1]PETSC ERROR: Detected zero pivot in LU factorization > > see > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > > > > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > > 0.00165189 * rowsum 1.65189e+09! > > > > From checking the documentation....the error is in row 1801, which > > means it is most likely not a matrix assembly issue? > > > > I tried the following prior to the solve with no luck either..... > > > > call KSPGetPC(ksp,pc,error) > > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > > > > > Is there anything else I can try? 
> > > > Thanks, > > > > Tim. > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From ps at cs.caltech.edu Sun Nov 25 18:01:51 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Sun, 25 Nov 2007 16:01:51 -0800 Subject: running solvers on submatrices Message-ID: <474A0CEF.8030101@cs.caltech.edu> Following advice from Barry yesterday I am now using IS to subselect parts of my matrix. So far so good. I am having problems with the PC though on the second round of invoking the solver. Basically the PC still has the old number of variables. Here is the basic flow: KSPCreate( PETSC_COMM_SELF, &ksp ); KSPSetType( ksp, KSPCG ); KSPGetPC( ksp, &m_pc ); PCSetType( pc, PCJACOBI ); Loop over decreasing numbers of variables mess with the index set to get the correct columns/rows Mat A; MatGetSubMatrix( K, is, is, PETSC_DECIDE, MAT_INITIAL_MATRIX, &A ); KSPSetOperators( ksp, A, A, DIFFERENT_NONZERO_PATTERN); KSPSetUp( ksp ); KSPSolve( ksp, b, x ); MatDestroy( A ); (Mat K contains the entire matrix from which I am subselecting.) The first solve works fine. Then I kill one variable (A is rebuilt from K) and now I die in KSPSetUp with the Jacobi precon finding that the working array diag is still the old size while mat (A) is the new size (one variable less). What's the proper way to deal with this? I would prefer not to destroy and recreate the ksp and pc each time through the loop as this implies file I/O to read .petscrc (which I use to control what type of solver is used in this section). Peter From bsmith at mcs.anl.gov Sun Nov 25 18:12:06 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Nov 2007 18:12:06 -0600 Subject: running solvers on submatrices In-Reply-To: <474A0CEF.8030101@cs.caltech.edu> References: <474A0CEF.8030101@cs.caltech.edu> Message-ID: <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> Peter, The design of PETSc requires destroying the KSP and creating a new one. A mechanism to allow resizing would require a major redo of PETSc. The .petscrc is actually only read ONCE in PetscInitialize() so there is no extra IO for each KSP creation. Also, the time to create and destroy the KSP each time is not noticable; this is what we do in the active set code. Barry On Nov 25, 2007, at 6:01 PM, Peter Schr?der wrote: > Following advice from Barry yesterday I am now using IS to subselect > parts of my matrix. So far so good. I am having problems with the PC > though on the second round of invoking the solver. Basically the PC > still has the old number of variables. Here is the basic flow: > > KSPCreate( PETSC_COMM_SELF, &ksp ); > KSPSetType( ksp, KSPCG ); > KSPGetPC( ksp, &m_pc ); > PCSetType( pc, PCJACOBI ); > > Loop over decreasing numbers of variables > mess with the index set to get the correct columns/rows > Mat A; > MatGetSubMatrix( K, is, is, PETSC_DECIDE, MAT_INITIAL_MATRIX, &A ); > KSPSetOperators( ksp, A, A, DIFFERENT_NONZERO_PATTERN); > KSPSetUp( ksp ); > KSPSolve( ksp, b, x ); > MatDestroy( A ); > > (Mat K contains the entire matrix from which I am subselecting.) The > first solve works fine. 
Then I kill one variable (A is rebuilt from > K) and now I die in KSPSetUp with the Jacobi precon finding that the > working array diag is still the old size while mat (A) is the new > size (one variable less). > > What's the proper way to deal with this? I would prefer not to > destroy and recreate the ksp and pc each time through the loop as > this implies file I/O to read .petscrc (which I use to control what > type of solver is used in this section). > > Peter > From bsmith at mcs.anl.gov Sun Nov 25 18:15:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Nov 2007 18:15:17 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C894.5090602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> Message-ID: <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> KSP *subksp; KSPGetPC(ksp,pc) PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) KSPGetPC(subksp[0],&subpc); PCFactorSetxxxxxx(subpc, .... Barry On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > I should also add that the code executes without this error when > using 1 processor...but then displays the error when running in > parallel with more than one process. > > Tim Stitt wrote: >> Hi all, >> >> Can anyone suggest ways of overcoming the following pivot error I >> keep receiving in my PETSc code during a KSPSolve(). >> >> [1]PETSC ERROR: Detected zero pivot in LU factorization >> see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot >> ! >> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >> 0.00165189 * rowsum 1.65189e+09! >> >> From checking the documentation....the error is in row 1801, which >> means it is most likely not a matrix assembly issue? >> >> I tried the following prior to the solve with no luck either..... >> >> call KSPGetPC(ksp,pc,error) >> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >> >> Is there anything else I can try? >> >> Thanks, >> >> Tim. >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From ps at cs.caltech.edu Sun Nov 25 18:26:15 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Sun, 25 Nov 2007 16:26:15 -0800 Subject: running solvers on submatrices In-Reply-To: <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> References: <474A0CEF.8030101@cs.caltech.edu> <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> Message-ID: <474A12A7.1070204@cs.caltech.edu> Barry Smith wrote: > Also, the time to create and destroy the KSP each time is not > noticable; this is what we do in the > active set code. Aye. Thanks. From z.sheng at ewi.tudelft.nl Mon Nov 26 03:49:16 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Mon, 26 Nov 2007 10:49:16 +0100 Subject: Diagonal Elements of an Inverse Matrix In-Reply-To: <47497618.8010004@ichec.ie> References: <47497618.8010004@ichec.ie> Message-ID: <474A969C.3000905@ewi.tudelft.nl> Tim Stitt wrote: > Hi PETSc Users/Developers, > > I was just wondering if anyone knew of any O(N) methods for obtaining > the diagonal elements of the inverse of a block tridiagonal > matrix,without computing all the off-diagonal values at the same time? > > Actually, the general case would be most useful were selected elements > in the inverse could be obtained in O(N) time. > > I would be grateful if anyone could shed any light on this... > > Thanks, > > Tim. 
> There is one method call approximate schur inverse . It computes the exact number of inverse matrix on diagonals without knowing the number on off-diagonal blocks. best regards Zhifeng From amjad11 at gmail.com Tue Nov 27 07:56:43 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 27 Nov 2007 18:56:43 +0500 Subject: Better C2D or Quadcore Message-ID: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> Hello, I planned to buy 9 PCs each having one Core2Duo E6600 (networked with GiGE) to make cluster for running PETSc based applications. I got an advice that because the prices of Xeon Quadcore is going to drop next month, so I should buy 9 PCs each having one Quadcore Xeon (networked with GiGE) to make cluster for running PETSc based applications. Which is better for me to get better performance/speedup? My question is due to following as given in PETSc-FAQ: *What kind of parallel computers or clusters are needed to use PETSc?* PETSc can be used with any kind of parallel system that supports MPI. BUT for any decent performance one needs - a fast, low-latency interconnect; any ethernet, even 10 gigE simply cannot provide the needed performance. - high per-CPU memory performance. Each CPU (core in dual core systems) needs to have its own memory bandwith of roughly 2 or more gigabytes. For example, standard dual processor "PC's" will notprovide better performance when the second processor is used, that is, you will not see speed-up when you using the second processor. This is because the speed of sparse matrix computations is almost totally determined by the speed of the memory, not the speed of the CPU. regards, Amjad Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 27 08:26:26 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Nov 2007 08:26:26 -0600 Subject: [Beowulf] Better C2D or Quadcore In-Reply-To: <474C2740.5050109@cse.ucdavis.edu> References: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> <474C2740.5050109@cse.ucdavis.edu> Message-ID: On Nov 27, 2007 8:18 AM, Bill Broadley wrote: > > - high per-CPU memory performance. Each CPU (core in dual core > > systems) needs to have its own memory bandwith of roughly 2 or more > > gigabytes. > > Er, presumably thats 2 or more GB/sec. > > > For example, standard dual processor "PC's" will > > notprovide better performance when the second processor is used, that > > Er, standard dual processor PCs can hit 4GB/sec. Even my $750 desktop from > dell, lousy memory, 1.8 GHz cpu gets 4GB/sec at stream add and triad: > Function Rate (MB/s) Avg time Min time Max time > Add: 3945.7460 0.0124 0.0122 0.0126 > Triad: 3951.5930 0.0124 0.0121 0.0129 FAQ can't always have a precise explanation. The issue here is balance. Machine & Peak (MF/s) & Triad (MB/s) & MF/MW & Eq. MF/s \\ Matt's Laptop & 1700 & 1122.4 & 12.1 & 93.5 (5.5\%) \\ Intel Core2 Quad & 38400 & 5312.0 & 57.8 & 442.7 (1.2\%) \\ So, yes the bandwidth goes up, but not at anywhere near the rate to keep a bandwidth hungry matvec satisfied. The first numbers are for my laptop, and the second are from the STREAMS site. Obviously those are not good percentages of peak, so yo ucan really get away with slower, cheaper processors. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From amjad11 at gmail.com Tue Nov 27 09:26:32 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 27 Nov 2007 20:26:32 +0500 Subject: [Beowulf] Better C2D or Quadcore In-Reply-To: <474C310E.9010602@charter.net> References: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> <474C310E.9010602@charter.net> Message-ID: <428810f20711270726o433d7639v543b33654c2bba5c@mail.gmail.com> Hello, What will you be doing with PETSc? It has lots of options and capability. What type of problems will you be solving? I want to solve some CFD problems using PETSc. BTW - I'm still getting some information, but if you drop back to 8 > systems, I > know of an inexpensive 8-port Infiniband switch (SDR). You can also find > inexpensive IB SDR NICs. When you go above 8 ports you have to either > switch to a large switch or start looking at using several tiers of > small switches > (the guys at aggregate.org can help there). Yes I can drop to 8 systems. So please tell me about that. My basic question was that whether C2D E6600 nodes will be better than a Quadcore which would be somehow nearer in price? regards to all. Amjad Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amjad11 at gmail.com Wed Nov 28 00:24:14 2007 From: amjad11 at gmail.com (amjad ali) Date: Wed, 28 Nov 2007 11:24:14 +0500 Subject: which MPI can we use Message-ID: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Hello, Please name the MPI libraries (other than MPICH2) which can be used efficiently with PETSc? reagards, Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Wed Nov 28 09:49:56 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 28 Nov 2007 12:49:56 -0300 Subject: which MPI can we use In-Reply-To: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> References: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Message-ID: On 11/28/07, amjad ali wrote: > Please name the MPI libraries (other than MPICH2) which can be used > efficiently with PETSc? On Linux/GNU, surelly Open-MPI. You also have Intel-MPI (actually, it is based on MPICH2). -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Nov 29 07:38:57 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Nov 2007 07:38:57 -0600 Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? References: Message-ID: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> Begin forwarded message: > From: Tom Cortese > Date: November 29, 2007 7:36:06 AM CST > To: petsc-maint at mcs.anl.gov > Cc: petsc-maint at mcs.anl.gov > Subject: [PETSC #17089] PETSc on XT3 with CNL? > > > Hello, > > Has anyone there tried installing PETSc on a Cray XT3 running > compute-node linux instead of catamount on the compute nodes? > > Any recommendations? > > Thanx, > > -Tom Cortese > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbostandoust at yahoo.com Thu Nov 29 07:46:14 2007 From: mbostandoust at yahoo.com (Mehdi Bostandoost) Date: Thu, 29 Nov 2007 05:46:14 -0800 (PST) Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? 
In-Reply-To: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> Message-ID: <434667.64427.qm@web33502.mail.mud.yahoo.com> we have petsc on XT3 at PSC,but it is on catamount. Barry Smith wrote: Begin forwarded message: From: Tom Cortese Date: November 29, 2007 7:36:06 AM CST To: petsc-maint at mcs.anl.gov Cc: petsc-maint at mcs.anl.gov Subject: [PETSC #17089] PETSc on XT3 with CNL? Hello, Has anyone there tried installing PETSc on a Cray XT3 running compute-node linux instead of catamount on the compute nodes? Any recommendations? Thanx, -Tom Cortese --------------------------------- Get easy, one-click access to your favorites. Make Yahoo! your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From keita at cray.com Thu Nov 29 08:55:07 2007 From: keita at cray.com (Keita Teranishi) Date: Thu, 29 Nov 2007 08:55:07 -0600 Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? In-Reply-To: <434667.64427.qm@web33502.mail.mud.yahoo.com> References: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> <434667.64427.qm@web33502.mail.mud.yahoo.com> Message-ID: <925346A443D4E340BEB20248BAFCDBDF032C37B0@CFEVS1-IP.americas.cray.com> Tom, PETSc is available for XT series and runs both on Catamount and CNL. On Jaguar at ORNL, you can run PETSc on CNL. Could you tell me which XT3 you are using now? Thank you, ================================ Keita Teranishi Math Software Group Cray Inc. keita at cray.com ================================ ________________________________ From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Mehdi Bostandoost Sent: Thursday, November 29, 2007 7:46 AM To: petsc-users at mcs.anl.gov Subject: Re: Fwd: [PETSC #17089] PETSc on XT3 with CNL? we have petsc on XT3 at PSC,but it is on catamount. Barry Smith wrote: Begin forwarded message: From: Tom Cortese Date: November 29, 2007 7:36:06 AM CST To: petsc-maint at mcs.anl.gov Cc: petsc-maint at mcs.anl.gov Subject: [PETSC #17089] PETSc on XT3 with CNL? Hello, Has anyone there tried installing PETSc on a Cray XT3 running compute-node linux instead of catamount on the compute nodes? Any recommendations? Thanx, -Tom Cortese ________________________________ Get easy, one-click access to your favorites. Make Yahoo! your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwicks at cs.brown.edu Thu Nov 29 09:03:11 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 29 Nov 2007 10:03:11 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000c01c83298$f66bd920$0201a8c0@jwickslptp> The documentation for PCGetFactoredMatrix is not clear. What does this return for ILU(0), for example? Does it return the product LU or the in place factorization? From knepley at gmail.com Thu Nov 29 11:04:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Nov 2007 11:04:21 -0600 Subject: PCGetFactoredMatrix In-Reply-To: <000c01c83298$f66bd920$0201a8c0@jwickslptp> References: <000c01c83298$f66bd920$0201a8c0@jwickslptp> Message-ID: It depends on the package, but the petsc stuff stores L and U in one matrix. Matt On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > The documentation for PCGetFactoredMatrix is not clear. What does this > return for ILU(0), for example? > Does it return the product LU or the in place factorization? > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From jwicks at cs.brown.edu Thu Nov 29 12:07:24 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 29 Nov 2007 13:07:24 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000201c832b2$b29277d0$0201a8c0@jwickslptp> I would like to compute the residual A - LU, where LU is the ILU factorization of A. What is the most convenient way of doing so? > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Thursday, November 29, 2007 12:04 PM > To: petsc-users at mcs.anl.gov > Subject: Re: PCGetFactoredMatrix > > > It depends on the package, but the petsc stuff stores L and U > in one matrix. > > Matt > > On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > > The documentation for PCGetFactoredMatrix is not clear. What does > > this return for ILU(0), for example? Does it return the > product LU or > > the in place factorization? > > > > > > > > -- > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu Nov 29 14:43:25 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Nov 2007 14:43:25 -0600 Subject: PCGetFactoredMatrix In-Reply-To: <000201c832b2$b29277d0$0201a8c0@jwickslptp> References: <000201c832b2$b29277d0$0201a8c0@jwickslptp> Message-ID: John, There is no immediate way to do this. For the SeqAIJ format, we store both the LU in a single CSR format. with for each row first the part of L (below the diagonal) then 1/D_i then the part of U for that row. You can see how the triangular solves are done by looking at src/mat/impls/aij/seq/aijfact.c the routine MatSolve_SeqAIJ() Note that it is actually more complicated due to the row and column permutations (the factored matrix is stored in the ordering of the permutations). For BAIJ matrix the storage is similar except it is stored by block instead of point and the inverse of the block diagonal is stored. One could take the MatSolve_SeqAIJ() routine and modify it to do the matrix vector product without too much difficulty. If you decide to do this we would gladly include it in our distribution. Barry One can ask why we don't provide this functionality in PETSc since computing A - LU is a reasonable thing to do if one wants to understand the convergence of the method. The answer is two-fold 1) time and energy and 2) though we like everyone to use PETSc we driven more by people who are not interested in the solution algorithms etc but only in getting the answer easily and relatively efficiently. On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > I would like to compute the residual A - LU, where LU is the ILU > factorization of A. What is the most convenient way of doing so? > >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley >> Sent: Thursday, November 29, 2007 12:04 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: PCGetFactoredMatrix >> >> >> It depends on the package, but the petsc stuff stores L and U >> in one matrix. >> >> Matt >> >> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: >>> The documentation for PCGetFactoredMatrix is not clear. What does >>> this return for ILU(0), for example? Does it return the >> product LU or >>> the in place factorization? 
>>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin >> their experiments is infinitely more interesting than any >> results to which their experiments lead. >> -- Norbert Wiener >> > From hzhang at mcs.anl.gov Thu Nov 29 16:49:54 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 Nov 2007 16:49:54 -0600 (CST) Subject: PCGetFactoredMatrix In-Reply-To: References: <000201c832b2$b29277d0$0201a8c0@jwickslptp> Message-ID: John, You may look at ~petsc/src/mat/examples/tests/ex74.c in which we use || y - x || as an indicator for || A - ICC ||. In ex74.c, x is a randomly generated vector, b=A*x, and ICC*y = b. If you uncomment line 318 printf("lf: %d, error: %G\n", lf,norm2); and run ex74, you get lf: -1, error: 3.33036E-15 lf: 0, error: 4.44135 lf: 1, error: 4.40183 lf: 2, error: 3.13597 lf: 3, error: 2.39443 lf: 4, error: 1.79942 lf: 5, error: 1.4183 lf: 6, error: 1.11197 lf: 7, error: 0.877789 lf: 8, error: 0.750784 lf: 9, error: 0.571567 which shows the error || y - x || for ICC(lf), lf=level of fill. Hong On Thu, 29 Nov 2007, Barry Smith wrote: > > John, > > There is no immediate way to do this. > For the SeqAIJ format, we store both the LU in a single CSR format. > with for each row first the part of L (below the diagonal) then 1/D_i > then the part of U for that row. You can see how the triangular solves > are done by looking at src/mat/impls/aij/seq/aijfact.c the routine > MatSolve_SeqAIJ() > Note that it is actually more complicated due to the row and column > permutations > (the factored matrix is stored in the ordering of the permutations). > For BAIJ matrix the storage is similar except it is stored by block instead > of point > and the inverse of the block diagonal is stored. > > One could take the MatSolve_SeqAIJ() routine and modify it to do the matrix > vector product without too much difficulty. > > If you decide to do this we would gladly include it in our distribution. > > Barry > > One can ask why we don't provide this functionality in PETSc since computing > A - LU is a reasonable thing to do if one wants to understand the convergence > of the method. The answer is two-fold 1) time and energy and 2) though we > like everyone to use PETSc we driven more by people who are not interested > in the solution algorithms etc but only in getting the answer easily and > relatively > efficiently. > > > On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > >> I would like to compute the residual A - LU, where LU is the ILU >> factorization of A. What is the most convenient way of doing so? >> >>> -----Original Message----- >>> From: owner-petsc-users at mcs.anl.gov >>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley >>> Sent: Thursday, November 29, 2007 12:04 PM >>> To: petsc-users at mcs.anl.gov >>> Subject: Re: PCGetFactoredMatrix >>> >>> >>> It depends on the package, but the petsc stuff stores L and U >>> in one matrix. >>> >>> Matt >>> >>> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: >>>> The documentation for PCGetFactoredMatrix is not clear. What does >>>> this return for ILU(0), for example? Does it return the >>> product LU or >>>> the in place factorization? >>>> >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. 
>>> -- Norbert Wiener >>> >> > From gdiso at ustc.edu Fri Nov 30 03:09:03 2007 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 30 Nov 2007 17:09:03 +0800 Subject: Help: The SNES test result of my jacobian matrix Message-ID: Hi all, I use -snes_type test to check the Jacobian matrix of my semiconductor code. And get the result as follows: Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. Run with -snes_test_display to show difference of hand-coded and finite difference Jacobian. Norm of matrix ratio 0.00277812 difference 0.0152703 Norm of matrix ratio 1.82658e-09 difference 1.01895e-08 Norm of matrix ratio 1.82964e-09 difference 1.02066e-08 [0]PETSC ERROR: SNESSolve() line 1871 in src/snes/interface/snes.c It seems PETSC check hand writing Jacobian for 3 times. The Norm is something large in the first time check but keeps small in other two. Dose this mean my Jacobian implementation correct or not? However, my code seems work well. Yours GONG DING Last mail seems lost. I send it again. From bsmith at mcs.anl.gov Fri Nov 30 08:10:36 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Nov 2007 08:10:36 -0600 Subject: Help: The SNES test result of my jacobian matrix In-Reply-To: References: Message-ID: <37B6C889-190D-41B4-8119-72385A2C8725@mcs.anl.gov> The testing of the Jacobian is done for three different input vectors. Since the first number is fairly large this indicates the Jacobian computed is not correct. You can run with -snes_test_display ALSO and it will show you both your Jacobian and the one computed with differencing so you can see exactly what entries are wrong. Good luck, Barry On Nov 30, 2007, at 3:09 AM, Gong Ding wrote: > Hi all, > I use -snes_type test to check the Jacobian matrix of my > semiconductor code. > And get the result as follows: > > Testing hand-coded Jacobian, if the ratio is > O(1.e-8), the hand-coded Jacobian is probably correct. > Run with -snes_test_display to show difference > of hand-coded and finite difference Jacobian. > Norm of matrix ratio 0.00277812 difference 0.0152703 > Norm of matrix ratio 1.82658e-09 difference 1.01895e-08 > Norm of matrix ratio 1.82964e-09 difference 1.02066e-08 > [0]PETSC ERROR: SNESSolve() line 1871 in src/snes/interface/snes.c > > It seems PETSC check hand writing Jacobian for 3 times. > The Norm is something large in the first time check but keeps small > in other two. > Dose this mean my Jacobian implementation correct or not? > However, my code seems work well. > > Yours > GONG DING > > Last mail seems lost. I send it again. >
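The ratio reported for the first state above is the one to chase down; a quick way to see the offending entries, as suggested, is

  ./mycode -snes_type test -snes_test_display

A rough programmatic cross-check at a single state is sketched below; FormJacobian and ctx are placeholder names for the user's own Jacobian routine and context, snes, x, J and n are assumed to come from the existing setup, the call signatures assume a 2.3.x-era PETSc, and the dense finite-difference Jacobian makes this a debugging aid for small problems only.

/* Compare the hand-coded Jacobian with a finite-difference one at x. */
Mat          J,Jfd;
MatStructure flag;
PetscReal    nrmfd,nrmd;
MatCreateSeqDense(PETSC_COMM_SELF,n,n,PETSC_NULL,&Jfd);
FormJacobian(snes,x,&J,&J,&flag,ctx);                    /* hand-coded (user routine)  */
SNESDefaultComputeJacobian(snes,x,&Jfd,&Jfd,&flag,ctx);  /* finite-difference Jacobian */
MatNorm(Jfd,NORM_FROBENIUS,&nrmfd);                      /* || J_fd ||                 */
MatAXPY(Jfd,-1.0,J,DIFFERENT_NONZERO_PATTERN);           /* Jfd <- Jfd - J             */
MatNorm(Jfd,NORM_FROBENIUS,&nrmd);                       /* || J_fd - J ||             */
PetscPrintf(PETSC_COMM_SELF,"difference %G ratio %G\n",nrmd,nrmd/nrmfd);
MatDestroy(Jfd);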