From u.tabak at tudelft.nl Sun Nov 1 05:01:12 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Sun, 1 Nov 2009 12:01:12 +0100 Subject: set a column of a matrix Message-ID: <20091101110112.GA11973@dutw689> Dear all, I would like to set a column of a matrix, I read through the manual pages a bit... Since the matrices are row oriented in PETSc and there is a function MatSetValuesRow, I guess transposing my original matrix and then using this function is the best option I could see for the moment. Could you comment on this? BR, Umut -- Quote: I love the name of honor, more than I fear death. Author: Julius Caesar 101-44 BC, Roman Emperor From knepley at gmail.com Sun Nov 1 07:19:36 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 1 Nov 2009 08:19:36 -0500 Subject: set a column of a matrix In-Reply-To: <20091101110112.GA11973@dutw689> References: <20091101110112.GA11973@dutw689> Message-ID: This is not an efficient parallel operation. You should probably rework your algorithm so that it is not necessary. Matt On Sun, Nov 1, 2009 at 6:01 AM, Umut Tabak wrote: > Dear all, > > I would like to set a column of a matrix, I read through the manual > pages a bit... Since the matrices are row oriented in PETSc and > there is a function MatSetValuesRow, I guess transposing my > original matrix and then using this function is the best option I > could see for the moment. Could you comment on this? > > BR, > Umut > -- > Quote: I love the name of honor, more than I fear death. > Author: Julius Caesar 101-44 BC, Roman Emperor > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 1 07:44:58 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 1 Nov 2009 07:44:58 -0600 Subject: set a column of a matrix In-Reply-To: <20091101110112.GA11973@dutw689> References: <20091101110112.GA11973@dutw689> Message-ID: <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> MatSetValues() so long as you have good matrix preallocation this will work fine. Doing a transpose is very expensive. Barry On Nov 1, 2009, at 5:01 AM, Umut Tabak wrote: > Dear all, > > I would like to set a column of a matrix, I read through the > manual > pages a bit... Since the matrices are row oriented in PETSc and > there is a function MatSetValuesRow, I guess transposing my > original matrix and then using this function is the best option I > could see for the moment. Could you comment on this? > > BR, > Umut > -- > Quote: I love the name of honor, more than I fear death. > Author: Julius Caesar 101-44 BC, Roman Emperor From u.tabak at tudelft.nl Sun Nov 1 10:57:23 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Sun, 1 Nov 2009 17:57:23 +0100 Subject: set a column of a matrix In-Reply-To: <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> Message-ID: <20091101165723.GA24933@dutw689> On Sun, Nov 01, 2009 at 07:44:58AM -0600, Barry Smith wrote: > > MatSetValues() so long as you have good matrix preallocation this > will work fine. > > Doing a transpose is very expensive. > Dear Barry and Matt, Thanks for the replies, I am not doing anything in parallel. I should use MatSetValues with appropriate column and row indices. 
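A minimal sketch of that approach (Fortran; A, jcol, and colvals are illustrative names, error checking omitted, and A is assumed to be preallocated so that every locally owned row has room for an entry in column jcol):

      PetscInt i, rstart, rend
      call MatGetOwnershipRange(A, rstart, rend, ierr)
      do i = rstart, rend-1
         ! one entry per locally owned row, all in global column jcol
         call MatSetValues(A, 1, i, 1, jcol, colvals(i-rstart+1), INSERT_VALUES, ierr)
      enddo
      call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
      call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)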
Actually, what I would like to do is set up a set of vectors (a rectangular block) and assign it to a block in a matrix. Is there a vector-set object that I could put these vectors into directly, and then use to set some part of a matrix? Thanks for the advice in advance. BR, Umut -- Quote: Coming together is a beginning, staying together is progress, and working together is success. Author: Henry Ford 1863-1947, American Industrialist, Founder of Ford Motor Company From balay at mcs.anl.gov Sun Nov 1 11:14:03 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 1 Nov 2009 11:14:03 -0600 (CST) Subject: set a column of a matrix In-Reply-To: <20091101165723.GA24933@dutw689> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> Message-ID: On Sun, 1 Nov 2009, Umut Tabak wrote: > On Sun, Nov 01, 2009 at 07:44:58AM -0600, Barry Smith wrote: > > > > MatSetValues() so long as you have good matrix preallocation this > > will work fine. > > > > Doing a transpose is very expensive. > > > Dear Barry and Matt, > > Thanks for the replies, I am not doing anything in parallel. I > should use MatSetValues with appropriate column and row indices. > > Actually, what I would like to do is set up a set of vectors (a > rectangular block) and assign it to a block in a matrix. Is > there a vector-set object that I could put these vectors into > directly, and then use to set some part of a matrix? You can set a block of values at a time with MatSetValues(). However you should first get preallocation correct - and then time the MatSetValues() code - before attempting additional optimization. Once the preallocation is perfect - the primary savings with the setting block of values is the reduction in the number of calls of MatSetValues(). The other optimization you can do with setting block of values - is to have the col indices [of the block of values set] be sorted. This saves a bit with searches [during insertion]. Satish From bsmith at mcs.anl.gov Sun Nov 1 11:16:26 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 1 Nov 2009 11:16:26 -0600 Subject: set a column of a matrix In-Reply-To: <20091101165723.GA24933@dutw689> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> Message-ID: Is your matrix dense? If it is sparse then it doesn't make sense to take values from a vector (which is always dense) to a sparse matrix. Barry On Nov 1, 2009, at 10:57 AM, Umut Tabak wrote: > On Sun, Nov 01, 2009 at 07:44:58AM -0600, Barry Smith wrote: >> >> MatSetValues() so long as you have good matrix preallocation this >> will work fine. >> >> Doing a transpose is very expensive. >> > Dear Barry and Matt, > > Thanks for the replies, I am not doing anything in parallel. I > should use MatSetValues with appropriate column and row indices. > > Actually, what I would like to do is set up a set of vectors (a > rectangular block) and assign it to a block in a matrix. Is > there a vector-set object that I could put these vectors into > directly, and then use to set some part of a matrix? > > Thanks for the advice in advance. > > BR, > Umut > -- > Quote: Coming together is a beginning, staying together is progress, and working together is success. 
> Author: Henry Ford 1863-1947, American Industrialist, Founder of > Ford Motor Company From u.tabak at tudelft.nl Sun Nov 1 11:27:04 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Sun, 01 Nov 2009 18:27:04 +0100 Subject: set a column of a matrix In-Reply-To: References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> Message-ID: <4AEDC4E8.5030901@tudelft.nl> Barry Smith wrote: > > Is your matrix dense? If it is sparse then it doesn't make sense > to take values from a vector (which is always dense) to a sparse matrix. > > Right, the matrix is dense. Filled with eigenvectors, which are also dense... Thx, Umut From bsmith at mcs.anl.gov Sun Nov 1 11:31:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 1 Nov 2009 11:31:20 -0600 Subject: set a column of a matrix In-Reply-To: <4AEDC4E8.5030901@tudelft.nl> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> Message-ID: <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> Then I would just use MatGetArray() and stick the values directly into the matrix. Barry On Nov 1, 2009, at 11:27 AM, Umut Tabak wrote: > Barry Smith wrote: >> >> Is your matrix dense? If it is sparse then it doesn't make sense >> to take values from a vector (which is always dense) to a sparse >> matrix. >> >> > Right, the matrix is dense. Filled with eigenvectors, which are also > dense... > Thx, > Umut From jarunan at ascomp.ch Tue Nov 3 04:26:11 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Tue, 03 Nov 2009 11:26:11 +0100 Subject: -malloc_log In-Reply-To: <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> Message-ID: <20091103112611.z3l1nypca6o848k4@webmail.ascomp.ch> Hello, When I use option -malloc_log, it prints the information at the end of running as below. The question is: What are the 2 numbers in front of the function? e.g. 2 3216 ClassPerfLogCreate() What is 2 and what is 3216? 
Thank you Jarunan 0: [0] Maximum memory PetscMalloc()ed 2713976 maximum size of entire process 30131728 0: [0] Memory usage sorted by function 0: [0] 2 3216 ClassPerfLogCreate() 0: [0] 2 1616 ClassRegLogCreate() 0: [0] 2 6416 EventPerfLogCreate() 0: [0] 1 12800 EventPerfLogEnsureSize() 0: [0] 2 1616 EventRegLogCreate() 0: [0] 1 3200 EventRegLogRegister() 0: [0] 40 5760 ISCreateBlock() 0: [0] 160 701184 ISCreateGeneral() 0: [0] 96 12096 ISCreateStride() 0: [0] 24 342400 ISGetIndices_Stride() 0: [0] 8 171200 ISInvertPermutation_General() 0: [0] 48 13312 KSPCreate() 0: [0] 8 1280 KSPCreate_GMRES() 0: [0] 16 256 KSPDefaultConvergedCreate() 0: [0] 8 2048 KSPGMRESClassicalGramSchmidtOrthogonalization() From jed at 59A2.org Tue Nov 3 04:55:58 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 03 Nov 2009 11:55:58 +0100 Subject: -malloc_log In-Reply-To: <20091103112611.z3l1nypca6o848k4@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091103112611.z3l1nypca6o848k4@webmail.ascomp.ch> Message-ID: <4AF00C3E.5040902@59A2.org> jarunan at ascomp.ch wrote: > > Hello, > > When I use option -malloc_log, it prints the information at the end of > running as below. The question is: What are the 2 numbers in front of > the function? e.g. > 2 3216 ClassPerfLogCreate() > > What is 2 and what is 3216? 2 : the total number of allocations from this function 3216 : total number of bytes allocated from this function Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From jarunan at ascomp.ch Thu Nov 5 03:32:17 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Thu, 05 Nov 2009 10:32:17 +0100 Subject: Reuse matrix and vector In-Reply-To: <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> Message-ID: <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> Hello, I would like to reuse matrix and vector to save computing time. As the result in last iterations are not similar to the one from my old solver, so I am not sure if I program with PETSc the right way, especially, resetting values to matrix with MatMPIAIJSetPreallocationCSR(). Please take a look, here is how I program it: At the beginning of the program I create the vector and matrix. call VecCreateMPI(PETSC_COMM_WORLD,istorf_no_ovcell,PETSC_DETERMINE,rhs,ierr) call VecDuplicate(rhs,sol,ierr) call MatCreate(PETSC_COMM_WORLD,Ap,ierr) call MatSetSizes(Ap,istorf_no_ovcell,istorf_no_ovcell,PETSC_DETERMINE,PETSC_DETERMINE,ierr) call MatSetType(Ap,MATMPIAIJ,ierr) Then, in each loop I reset values in the vector and the matrix. 
do niter = 1,maxiter call VecSetValues(rhs,w,gindex_issu(1:w),f_issu(1:w),INSERT_VALUES,ierr) call VecAssemblyBegin(rhs,ierr) call VecAssemblyEnd(rhs,ierr) call MatMPIAIJSetPreallocationCSR(Ap,rowind,columnind,A,ierr) call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) call solve_system call update_right_hand_side enddo call MatDestroy(Ap,ierr) call VecDestroy(sol,ierr) call VecDestroy(rhs,ierr) Regards, Jarunan -- Jarunan Panyasantisuk Development Engineer ASCOMP GmbH, Technoparkstr. 1 CH-8005 Zurich, Switzerland Phone : +41 44 445 4072 Fax : +41 44 445 4075 E-mail: jarunan at ascomp.ch www.ascomp.ch From thomas.witkowski at tu-dresden.de Thu Nov 5 03:43:56 2009 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 05 Nov 2009 10:43:56 +0100 Subject: Solving a singular matrix with BoomerAMG Message-ID: <4AF29E5C.6030108@tu-dresden.de> I want to solve a system with a singular matrix (just the laplace discretized with the fem using pure Neumann boundary conditions) with BoomerAMG. Is there any way to do it directly in BoomerAMG/PETSc, or must I change the matrix to make it nonsingular? Thomas From jed at 59A2.org Thu Nov 5 03:56:27 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 05 Nov 2009 10:56:27 +0100 Subject: Reuse matrix and vector In-Reply-To: <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> Message-ID: <4AF2A14B.8070409@59A2.org> jarunan at ascomp.ch wrote: > > Hello, > > I would like to reuse matrix and vector to save computing time. As the > result in last iterations are not similar to the one from my old solver, > so I am not sure if I program with PETSc the right way, especially, > resetting values to matrix with MatMPIAIJSetPreallocationCSR(). I suspect you are using a stale preconditioner, but MatMPIAIJSetPreallocationCSR should not be called every iteration. > Please take a look, here is how I program it: > > At the beginning of the program I create the vector and matrix. > > call > VecCreateMPI(PETSC_COMM_WORLD,istorf_no_ovcell,PETSC_DETERMINE,rhs,ierr) > call VecDuplicate(rhs,sol,ierr) > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > call > MatSetSizes(Ap,istorf_no_ovcell,istorf_no_ovcell,PETSC_DETERMINE,PETSC_DETERMINE,ierr) > > call MatSetType(Ap,MATMPIAIJ,ierr) > > Then, in each loop I reset values in the vector and the matrix. > > do niter = 1,maxiter > > call > VecSetValues(rhs,w,gindex_issu(1:w),f_issu(1:w),INSERT_VALUES,ierr) > > call VecAssemblyBegin(rhs,ierr) > call VecAssemblyEnd(rhs,ierr) > > call MatMPIAIJSetPreallocationCSR(Ap,rowind,columnind,A,ierr) It's better to call this once at the beginning and update with MatSetValues (insert one row at a time, it's very little code). MatMPIAIJSetPreallocationCSR reallocates memory because it doesn't check to see if the nonzero pattern has changed (because that's not what it's for). > call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) > > call solve_system Where are you calling KSPSetOperators? You have to call this every time the matrix changes. Jed
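A minimal sketch of the loop restructured along the lines Jed describes (Fortran; nrows_local, grow, ncols, cols, and vals are illustrative names for the locally owned rows and their stored entries):

      ! preallocate once, before the iteration loop
      call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, nrows_local, nrows_local, &
           PETSC_DETERMINE, PETSC_DETERMINE, rowind, columnind, A, Ap, ierr)
      do niter = 1, maxiter
         do i = 1, nrows_local
            ! insert one full row at a time; grow(i) is the global row number
            call MatSetValues(Ap, 1, grow(i), ncols(i), cols(1,i), vals(1,i), &
                 INSERT_VALUES, ierr)
         enddo
         call MatAssemblyBegin(Ap, MAT_FINAL_ASSEMBLY, ierr)
         call MatAssemblyEnd(Ap, MAT_FINAL_ASSEMBLY, ierr)
         ! the nonzero pattern is unchanged, only the values are new
         call KSPSetOperators(ksp, Ap, Ap, SAME_NONZERO_PATTERN, ierr)
         call KSPSolve(ksp, rhs, sol, ierr)
      enddo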
From jed at 59A2.org Thu Nov 5 04:05:26 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 05 Nov 2009 11:05:26 +0100 Subject: Solving a singular matrix with BoomerAMG In-Reply-To: <4AF29E5C.6030108@tu-dresden.de> References: <4AF29E5C.6030108@tu-dresden.de> Message-ID: <4AF2A366.30304@59A2.org> Thomas Witkowski wrote: > I want to solve a system with a singular matrix (just the laplace > discretized with the fem using pure Neumann boundary conditions) with > BoomerAMG. Is there any way to do it directly in BoomerAMG/PETSc, or > must I change the matrix to make it nonsingular? The user's manual has a section on solving singular systems, you just need to tell KSP about the null space (either with KSPSetNullSpace or -ksp_constant_null_space). Small known null spaces are rarely an issue with iterative methods. With multigrid, there is some risk that the coarse operator is also singular; this can cause trouble since it is usually solved with a direct solver, but should not happen in your case. Jed From jed at 59A2.org Thu Nov 5 04:58:53 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 05 Nov 2009 11:58:53 +0100 Subject: Reuse matrix and vector In-Reply-To: <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> Message-ID: <4AF2AFED.2090706@59A2.org> jarunan at ascomp.ch wrote: > >> I suspect you are using a stale preconditioner, but >> MatMPIAIJSetPreallocationCSR should not be called every iteration. > > What do you mean 'stale preconditioner'? I use Additive Schwarz. That the preconditioner was not being updated when you change the matrix. If you reset everything inside the loop, then that isn't the problem. >> It's better to call this once at the beginning and update with >> MatSetValues (insert one row at a time, it's very little code). >> MatMPIAIJSetPreallocationCSR reallocates memory because it doesn't check >> to see if the nonzero pattern has changed (because that's not what it's >> for). >> > > Yes, thank you for the advice. I will modify this. But MatSetValues() is > not efficient with a big problem. It takes much time. No, either you are inserting values that have not been preallocated (check with -info | grep mallocs) or you are inserting single values. You should insert a full row every time you call MatSetValues. >> Where are you calling KSPSetOperators? You have to call this every time >> the matrix changes. > > I did shortcut of the code, actually...just to put it here. After > creating and setting Matrix and vector. In each loop I create the KSP: > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > call KSPSetOperators(ksp,Ap,Ap,DIFFERENT_NONZERO_PATTERN,ierr) > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc,pct,ierr) > > if (pct == PCASM) then > call PCASMSetTotalSubdomains(pc,glob_nblocks_psc,PETSC_NULL_OBJECT,ierr) > call PCASMSetLocalSubdomains(pc,nblocks_psc,PETSC_NULL_OBJECT,ierr) Only call one of these. 
> call PetscOptionsSetValue('-pc_asm_overlap','1',ierr) > call PetscOptionsSetValue('-sub_pc_type','lu',ierr) > call PetscOptionsSetValue('-sub_pc_factor_zeropivot','0.0',ierr) > endif > > call KSPSetTolerances(ksp,resin,1.e-20, & > PETSC_DEFAULT_DOUBLE_PRECISION,nswp_psc,ierr) > > call KSPSetType(ksp,kspt,ierr) > call KSPSetFromOptions(ksp,ierr) Of the above, only KSPSetOperators() should be called inside the loop, everything else is setup that should happen before your loop. > call KSPSolve(ksp,rhs,sol,ierr) > > call KSPDestroy(ksp,ierr) How was your code, and the convergence different before? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From aja2111 at columbia.edu Thu Nov 5 03:10:51 2009 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Thu, 5 Nov 2009 12:10:51 +0300 Subject: petsc function wrapper In-Reply-To: <1ca0b8b40911041640o5b6d61d5t429734e25c241ed4@mail.gmail.com> References: <1ca0b8b40911041640o5b6d61d5t429734e25c241ed4@mail.gmail.com> Message-ID: <37604ab40911050110k21373c8ci65ecb501b1787b0a@mail.gmail.com> Hi Braxton, I don't think there's an explicit manual page in PETSc for doing it. You would need to do: VecGetArray VecGetOwnershipRange (iterate over range on data from array) VecRestoreArray I cc the PETSc user's list in case anyone else has a brighter idea. Cheers, A On Thu, Nov 5, 2009 at 3:40 AM, Braxton Osting wrote: > aron, > > last year you showed abby and I a slick way to apply a function to a > petsc vec pointwise. i can't seem to find that manual page now. do you > remember it? > > thanks, > b > From knepley at gmail.com Thu Nov 5 10:43:57 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 5 Nov 2009 10:43:57 -0600 Subject: petsc function wrapper In-Reply-To: <37604ab40911050110k21373c8ci65ecb501b1787b0a@mail.gmail.com> References: <1ca0b8b40911041640o5b6d61d5t429734e25c241ed4@mail.gmail.com> <37604ab40911050110k21373c8ci65ecb501b1787b0a@mail.gmail.com> Message-ID: You can look at VecSqrt() to see us do it. Matt On Thu, Nov 5, 2009 at 3:10 AM, Aron Ahmadia wrote: > Hi Braxton, > > I don't think there's an explicit manual page in PETSc for doing it. > You would need to do: > > VecGetArray > VecGetOwnershipRange > > (iterate over range on data from array) > > VecRestoreArray > > I cc the PETSc user's list in case anyone else has a brighter idea. > > Cheers, > A > > On Thu, Nov 5, 2009 at 3:40 AM, Braxton Osting wrote: > > aron, > > > > last year you showed abby and I a slick way to apply a function to a > > petsc vec pointwise. i can't seem to find that manual page now. do you > > remember it? > > > > thanks, > > b > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From w.drenth at gmail.com Fri Nov 6 12:00:35 2009 From: w.drenth at gmail.com (Wienand Drenth) Date: Fri, 6 Nov 2009 19:00:35 +0100 Subject: use of VecPlaceArray in parallel with fortran Message-ID: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> Hello all, In my research code I solve a linear system of equations, and (of course) I use PetSc routines for that. However, in the code we have our own data arrays for the right handside vector B, and solution vector X. 
Only just prior to the call to KSPSolve, we use the routine VecPlaceArray to synchronize the Fortran array B and X with their PetSc counterparts (M_B and M_X, for example, respectively). I was wondering if this would work in parallel as well? I have adapted one of the tutorial examples (ex2f from the ksp tutorials) to utilize the VecPlaceArray mechanism. I encountered no problems, except when I want to run the program in parallel. When I do that, and print my own vector X afterwards, different processors show different parts of the solution. For example, for a vector of length 10, and with two processors, processor one will have values for the first five elements (remainder is zero), and processor two will have values for the last five elements in the array. >From the same ksp tutorials, I have tried ex13 as well, the c program. Here I do not get partial outputs for different processors. I wonder whether one cannot use VecPlaceArray in a parralel setting in Fortran, except by doing extra bookkeeping? I hope someone can enlighten me, and indicate where I missed something in my programming or otherwise. Thanks in advance, Wienand Drenth -- Wienand Drenth PhD Eindhoven, the Netherlands From bsmith at mcs.anl.gov Fri Nov 6 12:48:00 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 6 Nov 2009 12:48:00 -0600 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> Message-ID: VecPlaceArray() gives to the vector its local (on process) part of the array, not the whole array (and requires no communication). If you want the entire array of the vector on one or all processes you can use VecScatterCreateToAll() or VecScatterCreateToZero() and then use the VecScatter created to move the values to where you want them. Barry On Nov 6, 2009, at 12:00 PM, Wienand Drenth wrote: > Hello all, > > In my research code I solve a linear system of equations, and (of > course) I use PetSc routines for that. However, in the code we have > our own data arrays for the right handside vector B, and solution > vector X. Only just prior to the call to KSPSolve, we use the routine > VecPlaceArray to synchronize the Fortran array B and X with their > PetSc counterparts (M_B and M_X, for example, respectively). > > I was wondering if this would work in parallel as well? I have adapted > one of the tutorial examples (ex2f from the ksp tutorials) to utilize > the VecPlaceArray mechanism. I encountered no problems, except when I > want to run the program in parallel. > > When I do that, and print my own vector X afterwards, different > processors show different parts of the solution. For example, for a > vector of length 10, and with two processors, processor one will have > values for the first five elements (remainder is zero), and processor > two will have values for the last five elements in the array. > >> From the same ksp tutorials, I have tried ex13 as well, the c >> program. > Here I do not get partial outputs for different processors. > > I wonder whether one cannot use VecPlaceArray in a parralel setting in > Fortran, except by doing extra bookkeeping? I hope someone can > enlighten me, and indicate where I missed something in my programming > or otherwise. 
> > Thanks in advance, > > Wienand Drenth > > > > -- > Wienand Drenth PhD > Eindhoven, the Netherlands From vyan2000 at gmail.com Sat Nov 7 13:57:28 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sat, 7 Nov 2009 14:57:28 -0500 Subject: Can I use MatSetBlockSize() for MPIAIJ Message-ID: Hi All, I have a question as follows: In order to use MatSetValuesBlocked() for a MPIAIJ matrix. I need to call MatSetBlockSize() when I create the matrix. so I did the following. Here the blocksize = 5; Mat *A; .... MatCreate(MPI_COMM_WORLD,A); MatSetSizes(*A,m*blocksize,n*blocksize,M*blocksize,N*blocksize); MatSetType(*A,MATMPIAIJ); MatSetBlockSize(*A,blocksize); ierr=MatMPIAIJSetPreallocation(*A,0,ourlens_ptws,0,offlens_ptws); CHKERRQ(ierr); ierr = MatAssemblyBegin(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); PetscPrintf(PETSC_COMM_WORLD,"the bs BEFORE is %d\n", bs); MatGetBlockSize(*A,&bs); PetscPrintf(PETSC_COMM_WORLD,"the bs is %d\n", bs); PetscPrintf(PETSC_COMM_WORLD,"the blocksize is %d\n", blocksize); ... The output I get is: the bs BEFORE is 0 the bs is 1 the blocksize is 5 It seems like the Mat A does not absorb the information blocksize=5 at all. How should I make the function-call sequence correct, if I want to set a blocksize for the MPIAIJ. Thanks for any suggestions in advance, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sat Nov 7 14:01:46 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sat, 7 Nov 2009 15:01:46 -0500 Subject: Can I use MatSetBlockSize() for MPIAIJ In-Reply-To: References: Message-ID: Sorry a typo: MatCreate(MPI_COMM_WORLD,*A); On Sat, Nov 7, 2009 at 2:57 PM, Ryan Yan wrote: > Hi All, > I have a question as follows: > > In order to use MatSetValuesBlocked() for a MPIAIJ matrix. I need to call > MatSetBlockSize() when I create the matrix. > > so I did the following. Here the blocksize = 5; > > > Mat *A; > .... > MatCreate(MPI_COMM_WORLD,A); > MatSetSizes(*A,m*blocksize,n*blocksize,M*blocksize,N*blocksize); > MatSetType(*A,MATMPIAIJ); > MatSetBlockSize(*A,blocksize); > ierr=MatMPIAIJSetPreallocation(*A,0,ourlens_ptws,0,offlens_ptws); > CHKERRQ(ierr); > > ierr = MatAssemblyBegin(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD,"the bs BEFORE is %d\n", bs); > MatGetBlockSize(*A,&bs); > > PetscPrintf(PETSC_COMM_WORLD,"the bs is %d\n", bs); > PetscPrintf(PETSC_COMM_WORLD,"the blocksize is %d\n", blocksize); > ... > > The output I get is: > > the bs BEFORE is 0 > the bs is 1 > the blocksize is 5 > > > It seems like the Mat A does not absorb the information blocksize=5 at all. > How should I make the function-call sequence correct, if I want to set a > blocksize for the MPIAIJ. > > Thanks for any suggestions in advance, > > Yan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Nov 7 14:12:40 2009 From: jed at 59A2.org (Jed Brown) Date: Sat, 07 Nov 2009 21:12:40 +0100 Subject: Can I use MatSetBlockSize() for MPIAIJ In-Reply-To: References: Message-ID: <4AF5D4B8.1030901@59A2.org> Ryan Yan wrote: > Hi All, > I have a question as follows: > > In order to use MatSetValuesBlocked() for a MPIAIJ matrix. I need to > call MatSetBlockSize() when I create the matrix. Call MatSetBlockSize *after* preallocation. Jed -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From vyan2000 at gmail.com Sat Nov 7 14:17:01 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sat, 7 Nov 2009 15:17:01 -0500 Subject: Can I use MatSetBlockSize() for MPIAIJ In-Reply-To: <4AF5D4B8.1030901@59A2.org> References: <4AF5D4B8.1030901@59A2.org> Message-ID: Hi Jed, Thanks, So the following is very confusing: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBlocked.html Notes: The m and n count the NUMBER of blocks in the row direction and column direction, NOT the total number of rows/columns; for example, if the block size is 2 and you are passing in values for rows 2,3,4,5 then m would be 2 (not 4). The values in idxm would be 1 2; that is the first index for each block divided by the block size. Note that you must call MatSetBlockSize() when constructing this matrix (and before preallocating it)... On Sat, Nov 7, 2009 at 3:12 PM, Jed Brown wrote: > Ryan Yan wrote: > > Hi All, > > I have a question as follows: > > > > In order to use MatSetValuesBlocked() for a MPIAIJ matrix. I need to > > call MatSetBlockSize() when I create the matrix. > > Call MatSetBlockSize *after* preallocation. > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sat Nov 7 14:19:30 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sat, 7 Nov 2009 15:19:30 -0500 Subject: Can I use MatSetBlockSize() for MPIAIJ In-Reply-To: References: <4AF5D4B8.1030901@59A2.org> Message-ID: It works. Thanks again. Yan On Sat, Nov 7, 2009 at 3:17 PM, Ryan Yan wrote: > Hi Jed, > Thanks, > So the following is very confusing: > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatSetValuesBlocked.html > > Notes: The m and n count the NUMBER of blocks in the row direction and > column direction, NOT the total number of rows/columns; for example, if the > block size is 2 and you are passing in values for rows 2,3,4,5 then m would be 2 (not > 4). The values in idxm would be 1 2; that is the first index for each block > divided by the block size. > > > Note that you must call MatSetBlockSize() > when constructing this matrix (and before preallocating it)... > > > > > On Sat, Nov 7, 2009 at 3:12 PM, Jed Brown wrote: > >> Ryan Yan wrote: >> > Hi All, >> > I have a question as follows: >> > >> > In order to use MatSetValuesBlocked() for a MPIAIJ matrix. I need to >> > call MatSetBlockSize() when I create the matrix. >> >> Call MatSetBlockSize *after* preallocation. >> >> Jed >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Nov 7 14:39:41 2009 From: jed at 59A2.org (Jed Brown) Date: Sat, 07 Nov 2009 21:39:41 +0100 Subject: Can I use MatSetBlockSize() for MPIAIJ In-Reply-To: References: <4AF5D4B8.1030901@59A2.org> Message-ID: <4AF5DB0D.9030506@59A2.org> Ryan Yan wrote: > Note that you must call MatSetBlockSize() > when constructing this matrix (and before preallocating it)... Indeed, thanks for pointing it out. I have fixed the documentation in petsc-dev and also made MatSetBlockSize() work for BAIJ (it just checks that the block size agrees with the way the matrix was allocated). Jed -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From jarunan at ascomp.ch Mon Nov 9 06:03:57 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Mon, 09 Nov 2009 13:03:57 +0100 Subject: Create vectors In-Reply-To: References: Message-ID: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> Hello, Is there an equivalent way to allocating array pointer for creating vectors (or vector pointers)? Jarunan From knepley at gmail.com Mon Nov 9 06:06:30 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Nov 2009 06:06:30 -0600 Subject: Create vectors In-Reply-To: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> References: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> Message-ID: Is this what you want? http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Vec/VecCreateMPIWithArray.html Matt On Mon, Nov 9, 2009 at 6:03 AM, wrote: > > Hello, > > Is there an equivalent way to allocating array pointer for creating vectors > (or vector pointers)? > > > Jarunan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Nov 9 06:07:53 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 09 Nov 2009 13:07:53 +0100 Subject: Create vectors In-Reply-To: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> References: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> Message-ID: <4AF80619.8080601@59A2.org> jarunan at ascomp.ch wrote: > > Hello, > > Is there an equivalent way to allocating array pointer for creating > vectors (or vector pointers)? What do you want to do? Maybe you're looking for one of the VecCreateXXWithArray() variants. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From jarunan at ascomp.ch Mon Nov 9 07:29:07 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Mon, 09 Nov 2009 14:29:07 +0100 Subject: Create vectors In-Reply-To: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> References: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> Message-ID: <20091109142907.vcdgmk3sao0k0800@webmail.ascomp.ch> I am solving multi-level grid (similar to multi grid but not the same). In each iteration, each level is solved separately but solutions are mapped to eachother. Each level has different size of matrix and vector. And each test case has different numbers of grid level. I have a difficulty to create vectors and matrices for each level, preparing for the computation, as I do not want to create and destroy them in every iteration. I am thinking of something similar to array pointer (the code is in fortran) e.g., Type(real_array), Dimension(:), allocatable:: pointername Allocate(pointername(level_numbers)) do i = 1, level_numbers allocate(pointername(i)%p(size)) enddo Is it possible to create pointer to vectors? Thank you Jarunan Quoting jarunan at ascomp.ch: > > Hello, > > Is there an equivalent way to allocating array pointer for creating > vectors (or vector pointers)? > > > Jarunan -- Jarunan Panyasantisuk Development Engineer ASCOMP GmbH, Technoparkstr. 
1 CH-8005 Zurich, Switzerland Phone : +41 44 445 4072 Fax : +41 44 445 4075 E-mail: jarunan at ascomp.ch www.ascomp.ch From knepley at gmail.com Mon Nov 9 07:32:36 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 9 Nov 2009 07:32:36 -0600 Subject: Create vectors In-Reply-To: <20091109142907.vcdgmk3sao0k0800@webmail.ascomp.ch> References: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> <20091109142907.vcdgmk3sao0k0800@webmail.ascomp.ch> Message-ID: Yes, Vec is just a regular type. Matt On Mon, Nov 9, 2009 at 7:29 AM, wrote: > > I am solving multi-level grid (similar to multi grid but not the same). In > each iteration, each level is solved separately but solutions are mapped to > eachother. Each level has different size of matrix and vector. And each test > case has different numbers of grid level. > > I have a difficulty to create vectors and matrices for each level, > preparing for the computation, as I do not want to create and destroy them > in every iteration. I am thinking of something similar to array pointer (the > code is in fortran) e.g., > > Type(real_array), Dimension(:), allocatable:: pointername > Allocate(pointername(level_numbers)) > > do i = 1, level_numbers > allocate(pointername(i)%p(size)) > enddo > > > Is it possible to create pointer to vectors? > > > Thank you > Jarunan > > > Quoting jarunan at ascomp.ch: > > >> Hello, >> >> Is there an equivalent way to allocating array pointer for creating >> vectors (or vector pointers)? >> >> >> Jarunan >> > > > > -- > Jarunan Panyasantisuk > Development Engineer > ASCOMP GmbH, Technoparkstr. 1 > CH-8005 Zurich, Switzerland > Phone : +41 44 445 4072 > Fax : +41 44 445 4075 > E-mail: jarunan at ascomp.ch > www.ascomp.ch > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From w.drenth at gmail.com Mon Nov 9 10:30:16 2009 From: w.drenth at gmail.com (Wienand Drenth) Date: Mon, 9 Nov 2009 17:30:16 +0100 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> Message-ID: <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> Hello Barry, Thank you for that. Just another question. As I wrote in my first email, in the current code, we utilize a local non-PetSc arrays and using VecPlaceArray we "give" this array to PetSc vectors to do the KSPSolve. Afterwards, we can just continue with our local non-PetSc arrays. If I understand you correctly, and for my knowledge, this approach will not be possible in a parallel setting? When I do, with for example two processors, and with local array being blocal = 1, 2, .... , 10 then for the zeroth processor I have also values 1, 2, ... , 10 and not just half (i.e., 1,2,3,4,5,0,0,0,0,0). for the first processor I have only part of the values, but they start with the first entry of my array, and not half-way: 0,0,0,0,0, 1,2,3,4,5 instead of 0,0,0,0,0, 6,7,8,9,10 Regards, Wienand On Fri, Nov 6, 2009 at 7:48 PM, Barry Smith wrote: > > VecPlaceArray() gives to the vector its local (on process) part of the > array, not the whole array (and requires no communication). If you want the > entire array of the vector on one or all processes you can use > VecScatterCreateToAll() or VecScatterCreateToZero() and then use the > VecScatter created to move the values to where you want them. 
> > Barry > > On Nov 6, 2009, at 12:00 PM, Wienand Drenth wrote: > >> Hello all, >> >> In my research code I solve a linear system of equations, and (of >> course) I use PetSc routines for that. However, in the code we have >> our own data arrays for the right handside vector B, and solution >> vector X. Only just prior to the call to KSPSolve, we use the routine >> VecPlaceArray to synchronize the Fortran array B and X with their >> PetSc counterparts (M_B and M_X, for example, respectively). >> >> I was wondering if this would work in parallel as well? I have adapted >> one of the tutorial examples (ex2f from the ksp tutorials) to utilize >> the VecPlaceArray mechanism. I encountered no problems, except when I >> want to run the program in parallel. >> >> When I do that, and print my own vector X afterwards, different >> processors show different parts of the solution. For example, for a >> vector of length 10, and with two processors, processor one will have >> values for the first five elements (remainder is zero), and processor >> two will have values for the last five elements in the array. >> >>> From the same ksp tutorials, I have tried ex13 as well, the c program. >> >> Here I do not get partial outputs for different processors. >> >> I wonder whether one cannot use VecPlaceArray in a parralel setting in >> Fortran, except by doing extra bookkeeping? I hope someone can >> enlighten me, and indicate where I missed something in my programming >> or otherwise. >> >> Thanks in advance, >> >> Wienand Drenth >> >> >> >> -- >> Wienand Drenth PhD >> Eindhoven, the Netherlands > > -- Wienand Drenth PhD Eindhoven, the Netherlands From jed at 59A2.org Mon Nov 9 10:45:38 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 09 Nov 2009 17:45:38 +0100 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> Message-ID: <4AF84732.3060604@59A2.org> Wienand Drenth wrote: > Hello Barry, > > Thank you for that. > > Just another question. As I wrote in my first email, in the current > code, we utilize a local non-PetSc arrays and using VecPlaceArray we > "give" this array to PetSc vectors to do the KSPSolve. Afterwards, we > can just continue with our local non-PetSc arrays. If I understand you > correctly, and for my knowledge, this approach will not be possible in > a parallel setting? > > When I do, with for example two processors, and with local array being > blocal = 1, 2, .... , 10 > then for the zeroth processor I have also values 1, 2, ... , 10 and > not just half (i.e., 1,2,3,4,5,0,0,0,0,0). > for the first processor I have only part of the values, but they start > with the first entry of my array, and not half-way: > 0,0,0,0,0, 1,2,3,4,5 instead of 0,0,0,0,0, 6,7,8,9,10 If it is this simple, you could still use VecPlaceArray, but you would be responsible for updating ghost values (of your arrays, KSPSolve will only put the solution in the contiguous owned segment). In 2D or 3D, the owned segment that you want to "give" to the KSP is likely to not be contiguous. BUT, you should just make a copy, it will not be a significant amount of memory or time. Look at VecScatterCreateToAll(), this can be used to update the copy that the rest of your code works with. Jed -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Mon Nov 9 14:54:59 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Nov 2009 14:54:59 -0600 Subject: Create vectors In-Reply-To: <20091109142907.vcdgmk3sao0k0800@webmail.ascomp.ch> References: <20091109130357.j12u8p5y8kwosks0@webmail.ascomp.ch> <20091109142907.vcdgmk3sao0k0800@webmail.ascomp.ch> Message-ID: <0B12D2AA-1B77-4FD3-92A4-9E237114B6FE@mcs.anl.gov> Yes, for example integer localn(level_numbers) Vec myvecs(level_numbers) do i=1,level_numbers call VecCreateMPI(PETSC_COMM_WORLD,localn(i),PETSC_DETERMINE,myvecs(i),ierr) enddo Of course, myvecs() can also be made allocatable and you can set at run time the number of levels. Barry On Nov 9, 2009, at 7:29 AM, jarunan at ascomp.ch wrote: > > I am solving multi-level grid (similar to multi grid but not the > same). In each iteration, each level is solved separately but > solutions are mapped to eachother. Each level has different size of > matrix and vector. And each test case has different numbers of grid > level. > > I have a difficulty to create vectors and matrices for each level, > preparing for the computation, as I do not want to create and > destroy them in every iteration. I am thinking of something similar > to array pointer (the code is in fortran) e.g., > > Type(real_array), Dimension(:), allocatable:: pointername > Allocate(pointername(level_numbers)) > > do i = 1, level_numbers > allocate(pointername(i)%p(size)) > enddo > > > Is it possible to create pointer to vectors? > > > Thank you > Jarunan > > > Quoting jarunan at ascomp.ch: > >> >> Hello, >> >> Is there an equivalent way to allocating array pointer for creating >> vectors (or vector pointers)? >> >> >> Jarunan > > > > -- > Jarunan Panyasantisuk > Development Engineer > ASCOMP GmbH, Technoparkstr. 1 > CH-8005 Zurich, Switzerland > Phone : +41 44 445 4072 > Fax : +41 44 445 4075 > E-mail: jarunan at ascomp.ch > www.ascomp.ch From bsmith at mcs.anl.gov Mon Nov 9 15:00:53 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 9 Nov 2009 15:00:53 -0600 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> Message-ID: <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> On Nov 9, 2009, at 10:30 AM, Wienand Drenth wrote: > Hello Barry, > > Thank you for that. > > Just another question. As I wrote in my first email, in the current > code, we utilize a local non-PetSc arrays and using VecPlaceArray we > "give" this array to PetSc vectors to do the KSPSolve. Afterwards, we 
If you have parts stored on each process and you want ghost points filled in on each process then you need to set up a scatter with VecScatterCreate(). Barry > can just continue with our local non-PetSc arrays. If I understand you > correctly, and for my knowledge, this approach will not be possible in > a parallel setting? > > When I do, with for example two processors, and with local array being > blocal = 1, 2, .... , 10 > then for the zeroth processor I have also values 1, 2, ... , 10 and > not just half (i.e., 1,2,3,4,5,0,0,0,0,0). > for the first processor I have only part of the values, but they start > with the first entry of my array, and not half-way: > 0,0,0,0,0, 1,2,3,4,5 instead of 0,0,0,0,0, 6,7,8,9,10 > > > Regards, > > Wienand > > On Fri, Nov 6, 2009 at 7:48 PM, Barry Smith > wrote: >> >> VecPlaceArray() gives to the vector its local (on process) part of >> the >> array, not the whole array (and requires no communication). If you >> want the >> entire array of the vector on one or all processes you can use >> VecScatterCreateToAll() or VecScatterCreateToZero() and then use the >> VecScatter created to move the values to where you want them. >> >> Barry >> >> On Nov 6, 2009, at 12:00 PM, Wienand Drenth wrote: >> >>> Hello all, >>> >>> In my research code I solve a linear system of equations, and (of >>> course) I use PetSc routines for that. However, in the code we have >>> our own data arrays for the right handside vector B, and solution >>> vector X. Only just prior to the call to KSPSolve, we use the >>> routine >>> VecPlaceArray to synchronize the Fortran array B and X with their >>> PetSc counterparts (M_B and M_X, for example, respectively). >>> >>> I was wondering if this would work in parallel as well? I have >>> adapted >>> one of the tutorial examples (ex2f from the ksp tutorials) to >>> utilize >>> the VecPlaceArray mechanism. I encountered no problems, except >>> when I >>> want to run the program in parallel. >>> >>> When I do that, and print my own vector X afterwards, different >>> processors show different parts of the solution. For example, for a >>> vector of length 10, and with two processors, processor one will >>> have >>> values for the first five elements (remainder is zero), and >>> processor >>> two will have values for the last five elements in the array. >>> >>>> From the same ksp tutorials, I have tried ex13 as well, the c >>>> program. >>> >>> Here I do not get partial outputs for different processors. >>> >>> I wonder whether one cannot use VecPlaceArray in a parralel >>> setting in >>> Fortran, except by doing extra bookkeeping? I hope someone can >>> enlighten me, and indicate where I missed something in my >>> programming >>> or otherwise. 
>>> >>> Thanks in advance, >>> >>> Wienand Drenth >>> >>> >>> >>> -- >>> Wienand Drenth PhD >>> Eindhoven, the Netherlands >> >> > > > > -- > Wienand Drenth PhD > Eindhoven, the Netherlands From jarunan at ascomp.ch Tue Nov 10 02:28:56 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Tue, 10 Nov 2009 09:28:56 +0100 Subject: Reuse matrix and vector In-Reply-To: <4AF2AFED.2090706@59A2.org> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> Message-ID: <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> >> >> Yes, thank you for the advice. I will modify this. But MatSetValues() is >> not efficient with a big problem. It takes much time. > > No, either you are inserting values that have not been preallocated > (check with -info | grep mallocs) or you are inserting single values. > You should insert a full row every time you call MatSetValues. > Hi, I have tried as you suggested: First allocate with MatCreateMPIAIJWithArrays() then use MatSetValues() to reset the matrix. MatSetValues() has great performance but MatCreateMPIAIJWithArrays() needs a really long time to allocate the matrix in the first iteration with more than 1 processor (With one processor it is very fast). Total number of cells is 744872, divided into 40 blocks. In one processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with 4 processors. Usually, this routine has no problem with small test case. It works the same for one or more than one processors. in the first iteration. Mat Ap call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, istorf_no_ovcell, & istorf_no_ovcell, PETSC_DETERMINE, PETSC_DETERMINE, rowind, columnind, & A, Ap, ierr) call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) After the first iteration, looping over the rows, the whole row is set at a time. call MatSetValues(Ap,1,row_impl,7,col_impl,a_impl,INSERT_VALUES,ierr) call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) Does the communication of MatCreateMPIAIJWithArrays() in parallel computation cost a lot? What could be the cause that MatCreateMPIAIJWithArrays() so slow in the first iteration? Best regards, Jarunan From w.drenth at gmail.com Tue Nov 10 03:21:48 2009 From: w.drenth at gmail.com (Wienand Drenth) Date: Tue, 10 Nov 2009 10:21:48 +0100 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> Message-ID: <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> > We are having some difficulty understanding your question and what > exactly you want to do? Hello Barry, Apologies I am a bit unclear. I am relatively new to PetSc and the proper terminology, so thank you for your time and help. Currently we have two Fortran arrays B and X, being the right-hand side and solution vectors. There are no special considerations to have these arrays on just one processor. 
So, if I am not mistaken, when I run the program on multiple processors, each processor will have the entire Fortran array and not just part of it. In order to solve the system iteratively, we make calls VecPlaceArray(M_X, X, ierr) to place the Fortran array into the PetSc Vector M_X. Then we call KSPSolve. After the solve, we don't care for the PetSc vectors anymore, but continue with the Fortran arrays (X and B) in our further calculations. Right know, when running in the above setting it will not function correctly when run on multiple processors. Henceforth my question on how to tackle this and adapt the code to run it in parallel. Would the following procedure lead to a correct and working solution: Suppose I have a Fortran array X, and I create on processor zero a sequential PetSc vector MS_X and place the array X into MS_X using VecPlaceArray. With VecScatterCreaterToZero, and SCATTER_REVERSE as scatter mode I can spread it onto the global (parallel) vector M_X. After my calculations, I can do the same to scatter the parallel solution onto my sequential vector MS_X (now with SCATTER_FORWARD), and continue afterwards with X. Regards, Wienand -- Wienand Drenth PhD Eindhoven, the Netherlands From jed at 59A2.org Tue Nov 10 04:51:05 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 10 Nov 2009 11:51:05 +0100 Subject: Reuse matrix and vector In-Reply-To: <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> Message-ID: <4AF94599.7030309@59A2.org> jarunan at ascomp.ch wrote: > Total number of cells is 744872, divided into 40 blocks. In one > processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with > 4 processors. Usually, this routine has no problem with small test case. > It works the same for one or more than one processors. This sounds like incorrect preallocation. Is your PETSc built with debugging? Debug does some extra integrity checks that don't add significantly to the time (although other Debug checks do), but it would be useful to know that they pass. In particular, it checks that your rows are sorted. If they are not sorted then PETSc's preallocation would be wrong. (I actually don't think this requirement enables significantly faster implementation, so I'm tempted to change it to work correctly with unsorted rows.) You can also run with -info |grep malloc, there should be no mallocs in MatSetValues(). > in the first iteration. > Mat Ap > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, istorf_no_ovcell, & > istorf_no_ovcell, PETSC_DETERMINE, PETSC_DETERMINE, rowind, columnind, & > A, Ap, ierr) > > call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) This assembly is superfluous (but harmless). > Does the communication of MatCreateMPIAIJWithArrays() in parallel > computation cost a lot? What could be the cause that > MatCreateMPIAIJWithArrays() so slow in the first iteration? There is no significant communication, it has to be preallocation. Jed -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From jed at 59A2.org Tue Nov 10 05:40:39 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 10 Nov 2009 12:40:39 +0100 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> Message-ID: <4AF95137.7030808@59A2.org> Wienand Drenth wrote: > Would the following procedure lead to a correct and working solution: > > Suppose I have a Fortran array X, and I create on processor zero a > sequential PetSc vector MS_X and place the array X into MS_X using > VecPlaceArray. With VecScatterCreaterToZero, and SCATTER_REVERSE as > scatter mode I can spread it onto the global (parallel) vector M_X. > > After my calculations, I can do the same to scatter the parallel > solution onto my sequential vector MS_X (now with SCATTER_FORWARD), > and continue afterwards with X. With this last part, you are responsible for broadcasting X before your code can continue. VecScatterCreateToAll() would get PETSc to do it for you, *but* these may be too restrictive for what you want. It will only work if the local portions are contiguous (it is an issue of natural versus "PETSc" ordering, see Figure 9 of the user's manual). Presumably your code uses the natural ordering, but solvers will perform better if they can use the PETSc ordering. Therefore you will probably have to make your own scatter. Assembling the matrix is more tricky because it will be a major bottleneck if process 0 has to do all of it (unless you solve many problems with the same matrix) and it is expensive to assemble it on the wrong process (i.e. assemble in the natural ordering and let PETSc send the entries to the correct process). I don't know how how your code is organized, but I highly recommend using a decomposition like is done by DA (and preferably also use the DA, even if it means you have to do more copies -- cheap compared to the shenanigans we are talking about here). This should involve *less* modification to your existing serial code, and will offer much better scalability. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Tue Nov 10 05:50:22 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Nov 2009 05:50:22 -0600 Subject: Reuse matrix and vector In-Reply-To: <4AF94599.7030309@59A2.org> References: <20091101110112.GA11973@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> Message-ID: On Tue, Nov 10, 2009 at 4:51 AM, Jed Brown wrote: > jarunan at ascomp.ch wrote: > > Total number of cells is 744872, divided into 40 blocks. In one > > processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with > > 4 processors. Usually, this routine has no problem with small test case. > > It works the same for one or more than one processors. 
Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Tue Nov 10 05:50:22 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Nov 2009 05:50:22 -0600 Subject: Reuse matrix and vector In-Reply-To: <4AF94599.7030309@59A2.org> References: <20091101110112.GA11973@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> Message-ID: On Tue, Nov 10, 2009 at 4:51 AM, Jed Brown wrote: > jarunan at ascomp.ch wrote: > > Total number of cells is 744872, divided into 40 blocks. In one > > processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with > > 4 processors. Usually, this routine has no problem with small test case. > > It works the same for one or more than one processors. > > This sounds like incorrect preallocation. Is your PETSc built with > debugging? Debug does some extra integrity checks that don't add > significantly to the time (although other Debug checks do), but it would > be useful to know that they pass. In particular, it checks that your > rows are sorted. If they are not sorted then PETSc's preallocation > would be wrong. (I actually don't think this requirement enables > significantly faster implementation, so I'm tempted to change it to work > correctly with unsorted rows.) > I do not think it's preallocation per se, since 1 proc is fast. I think that your partition of rows fed to the MatCreate() call does not match what you provide to MatSetValues() and thus you do a lot of communication in MatAssemblyEnd(). There are 2 ways to debug this: 1) -log_summary to see where the time is spent 2) MatSetOption(A, MAT_NEW_NONZERO_LOCATION_ERR)
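That option makes any insertion outside the preallocated pattern an immediate error at the offending MatSetValues() call instead of a silent malloc; as a one-line sketch (against the petsc-3.0 calling sequence, if I remember it right, and using the matrix Ap from your code):

  ierr = MatSetOption(Ap, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);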
Matt You can also run with -info |grep malloc, there should be no mallocs in > MatSetValues(). > > > in the first iteration. > > Mat Ap > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, istorf_no_ovcell, > & > > istorf_no_ovcell, PETSC_DETERMINE, PETSC_DETERMINE, rowind, > columnind, & > > A, Ap, ierr) > > > > call MatAssemblyBegin(Ap,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyEnd(Ap,MAT_FINAL_ASSEMBLY,ierr) > > This assembly is superfluous (but harmless). > > > Does the communication of MatCreateMPIAIJWithArrays() in parallel > > computation cost a lot? What could be the cause that > > MatCreateMPIAIJWithArrays() is so slow in the first iteration? > > There is no significant communication, it has to be preallocation. > > Jed > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Nov 10 05:58:05 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 10 Nov 2009 12:58:05 +0100 Subject: Reuse matrix and vector In-Reply-To: References: <20091101110112.GA11973@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> Message-ID: <4AF9554D.4060002@59A2.org> Matthew Knepley wrote: > On Tue, Nov 10, 2009 at 4:51 AM, Jed Brown > wrote: > > jarunan at ascomp.ch wrote: > > Total number of cells is 744872, divided into 40 blocks. In one > > processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec > with > > 4 processors. Usually, this routine has no problem with small test > case. > > It works the same for one or more than one processors. > > This sounds like incorrect preallocation. Is your PETSc built with > debugging? Debug does some extra integrity checks that don't add > significantly to the time (although other Debug checks do), but it would > be useful to know that they pass. In particular, it checks that your > rows are sorted. If they are not sorted then PETSc's preallocation > would be wrong. (I actually don't think this requirement enables > significantly faster implementation, so I'm tempted to change it to work > correctly with unsorted rows.) > > > I do not think it's preallocation per se, since 1 proc is fast. I think > that your partition of rows fed to the MatCreate() call does not match > what you provide to MatSetValues() and thus you do a lot of > communication in MatAssemblyEnd(). There are 2 ways to debug this: Matt, he says MatSetValues() is fast, but MatCreateMPIAIJWithArrays() is slow. Look how preallocation is done (mpiaij.c:3263). This would do the correct thing in serial, but under-allocate the diagonal part when the rows are not sorted. It looks like a silly "optimization" to me. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From jarunan at ascomp.ch Tue Nov 10 06:02:12 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Tue, 10 Nov 2009 13:02:12 +0100 Subject: Reuse matrix and vector In-Reply-To: <4AF94599.7030309@59A2.org> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> Message-ID: <20091110130212.9ivjxj3a2oogckkw@webmail.ascomp.ch> Quoting Jed Brown : > jarunan at ascomp.ch wrote: >> Total number of cells is 744872, divided into 40 blocks. In one >> processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with >> 4 processors. Usually, this routine has no problem with small test case. >> It works the same for one or more than one processors. > > This sounds like incorrect preallocation. Is your PETSc built with > debugging? Debug does some extra integrity checks that don't add > significantly to the time (although other Debug checks do), but it would > be useful to know that they pass. In particular, it checks that your > rows are sorted. If they are not sorted then PETSc's preallocation > would be wrong. (I actually don't think this requirement enables > significantly faster implementation, so I'm tempted to change it to work > correctly with unsorted rows.) The code is compiled with the optimized version of PETSc. The row indices and column indices for each row are sorted. Well, they are not sorted for diagonal or off-diagonal part. > > You can also run with -info |grep malloc, there should be no mallocs in > MatSetValues(). > Here are the output. The first serie of MatAssemblyEnd_SeqAIJ() should be in MatCreateMPIAIJWithArrays(), which mallocs in MatSetValues() are not zero. Would MatCreateMPIAIJWithSplitArrays() be better in preallocation? [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 4379 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 13859 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 20286 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 4592 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 15042 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 11715 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatIncreaseOverlap_MPIAIJ_Receive(): Allocated 0 bytes, required 3 bytes, no of mallocs = 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 From knepley at gmail.com Tue Nov 10 06:08:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Nov 2009 06:08:11 -0600 Subject: Reuse matrix and vector In-Reply-To: <20091110130212.9ivjxj3a2oogckkw@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> <20091110130212.9ivjxj3a2oogckkw@webmail.ascomp.ch> Message-ID: On Tue, Nov 10, 2009 at 6:02 AM, wrote: > Quoting Jed Brown : > > jarunan at ascomp.ch wrote: >> >>> Total number of cells is 744872, divided into 40 blocks. In one >>> processor, MatCreateMPIAIJWithArrays() takes 0.097 sec but 280 sec with >>> 4 processors. Usually, this routine has no problem with small test case. >>> It works the same for one or more than one processors. >>> >> >> This sounds like incorrect preallocation. Is your PETSc built with >> debugging? Debug does some extra integrity checks that don't add >> significantly to the time (although other Debug checks do), but it would >> be useful to know that they pass. In particular, it checks that your >> rows are sorted. If they are not sorted then PETSc's preallocation >> would be wrong. (I actually don't think this requirement enables >> significantly faster implementation, so I'm tempted to change it to work >> correctly with unsorted rows.) >> > > The code is compiled with the optimized version of PETSc. The row indices > and column indices for each row are sorted. Well, they are not sorted for > diagonal or off-diagonal part. > Actually, what Jed says is the likely culprit. Please check that the column indices in each row are sorted. It is clear that the preallocation for multiple procs does not match what you feed to MatSetValues(). Jed: I agree. I would have just written the loop to check for membership in the diagonal block (as I do elsewhere). Maybe we should change petsc-dev? Matt >> You can also run with -info |grep malloc, there should be no mallocs in >> MatSetValues(). >> >> > Here are the output. > The first serie of MatAssemblyEnd_SeqAIJ() should be in > MatCreateMPIAIJWithArrays(), which mallocs in MatSetValues() are not zero. > Would MatCreateMPIAIJWithSplitArrays() be better in preallocation? > > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. 
> [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 4379 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 13859 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 20286 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 4592 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 15042 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is > 11715 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatIncreaseOverlap_MPIAIJ_Receive(): Allocated 0 bytes, required 3 > bytes, no of mallocs = 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs 
during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Nov 10 06:08:45 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 10 Nov 2009 13:08:45 +0100 Subject: Reuse matrix and vector In-Reply-To: <20091110130212.9ivjxj3a2oogckkw@webmail.ascomp.ch> References: <20091101110112.GA11973@dutw689> <6265635E-33E3-44FC-9496-D9E0E86BEB3F@mcs.anl.gov> <20091101165723.GA24933@dutw689> <4AEDC4E8.5030901@tudelft.nl> <0BC6D60B-9608-4E80-8174-E2B29BDF0609@mcs.anl.gov> <20091105103217.8vknqnkehwc0ww4g@webmail.ascomp.ch> <4AF2A14B.8070409@59A2.org> <20091105111152.a6vy4mhjo8wo0kcg@webmail.ascomp.ch> <4AF2AFED.2090706@59A2.org> <20091110092856.rsg3y2fdq8wscgwg@webmail.ascomp.ch> <4AF94599.7030309@59A2.org> <20091110130212.9ivjxj3a2oogckkw@webmail.ascomp.ch> Message-ID: <4AF957CD.5050204@59A2.org> jarunan at ascomp.ch wrote: > The code is compiled with the optimized version of PETSc. The row > indices and column indices for each row are sorted. Well, they are not > sorted for diagonal or off-diagonal part. I recommend using a debug version for all testing and only the optimized for production/scalability. You should use MatCreateMPIAIJWithSplitArrays() if you have these available separately. But it sounds like you have one big array where each row has the diagonal part followed by the off-diagonal part? This format can't be used directly by PETSc, just preallocate with MatMPIAIJSetPreallocation() and use MatSetValues(). Jed -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From w.drenth at gmail.com Tue Nov 10 06:32:27 2009 From: w.drenth at gmail.com (Wienand Drenth) Date: Tue, 10 Nov 2009 13:32:27 +0100 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> Message-ID: <4a718f330911100432ufa2523ew8d08761cd544b9b9@mail.gmail.com> As a follow-up of my previous email, I have tried the following: bvec is a Fortran array bseq2 is a sequential PETSc vector created only on processor 0 b is the parallel PETSc vector I use in KSPSolve I fill bvec with bvec(i) = i, and b gets initialized to all ones. then I do if (rank.eq.0) then call VecPlaceArray(bseq2, bvec, ierr) do my_i=1,m*n II=my_i-1 call VecGetValues(bseq2,1,II,v,ierr) write(*,*) "bseq2 rhs: ", II, ": ", v, " for rank ", rank enddo endif this prints a nice 1, 2, etc. Next I want my parallel vector b to be filled. So I use VecScatterCreateToZero. call VecScatterCreateToZero(b,vscat,bseq2,ierr) call VecScatterBegin(vscat, bseq2, b, INSERT_VALUES, SCATTER_REVERSE, ierr) call VecScatterEnd(vscat, bseq2, b, INSERT_VALUES, SCATTER_REVERSE, ierr) call VecScatterDestroy(vscat, ierr) However, now b is filled with zeros. (If I would change the order in the VecScatterBegin and VecScatterEnd (first b, then bseq2), possibly in combination with SCATTER_FORWARD, then b will still have its original ones.) I cannot immediately see what I did wrong here, so I hope somebody could give a further hint. Wienand From knepley at gmail.com Tue Nov 10 06:36:53 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 10 Nov 2009 06:36:53 -0600 Subject: use of VecPlaceArray in parallel with fortran In-Reply-To: <4a718f330911100432ufa2523ew8d08761cd544b9b9@mail.gmail.com> References: <4a718f330911061000y53dcf231x83a4e41cf7b25097@mail.gmail.com> <4a718f330911090830p3dc946e9qba2b3530dc26458e@mail.gmail.com> <04702B19-55F3-4470-95F8-DE22B4716650@mcs.anl.gov> <4a718f330911100121w2346a3e8u7beb5b095ea39f6f@mail.gmail.com> <4a718f330911100432ufa2523ew8d08761cd544b9b9@mail.gmail.com> Message-ID: On Tue, Nov 10, 2009 at 6:32 AM, Wienand Drenth wrote: > As a follow-up of my previous email, I have tried the following: > > bvec is a Fortran array > bseq2 is a sequential PETSc vector created only on processor 0 > b is the parallel PETSc vector I use in KSPSolve > > I fill bvec with bvec(i) = i, and b gets initialized to all ones. > > then I do > if (rank.eq.0) then > call VecPlaceArray(bseq2, bvec, ierr) > do my_i=1,m*n > II=my_i-1 > call VecGetValues(bseq2,1,II,v,ierr) > write(*,*) "bseq2 rhs: ", II, ": ", v, " for rank ", rank > enddo > endif > > this prints a nice 1, 2, etc. > > Next I want my parallel vector b to be filled. So I use > VecScatterCreateToZero. > > call VecScatterCreateToZero(b,vscat,bseq2,ierr) > This call creates bseq2, so it will wipe out the values. Put them in afterwards.
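The corrected sequence, as an untested C sketch (your Fortran calls mirror it one-for-one; bvec must be an array of PetscScalar, and rank comes from MPI_Comm_rank as in your code):

  /* Let the scatter create bseq2 first, then hand it your array. */
  ierr = VecScatterCreateToZero(b,&vscat,&bseq2);CHKERRQ(ierr);
  if (!rank) {
    ierr = VecPlaceArray(bseq2,bvec);CHKERRQ(ierr); /* values now survive */
  }
  ierr = VecScatterBegin(vscat,bseq2,b,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecScatterEnd(vscat,bseq2,b,INSERT_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecScatterDestroy(vscat);CHKERRQ(ierr);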
Matt > call VecScatterBegin(vscat, bseq2, b, INSERT_VALUES, SCATTER_REVERSE, ierr) > call VecScatterEnd(vscat, bseq2, b, INSERT_VALUES, SCATTER_REVERSE, ierr) > call VecScatterDestroy(vscat, ierr) > > However, now b is filled with zeros. (If I would change the order in > the VecScatterBegin and VecScatterEnd (first b, then bseq2), possibly > in combination with SCATTER_FORWARD, then b will still have its > original ones.) > > I cannot immediately see what I did wrong here, so I hope somebody > could give a further hint. > > Wienand > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From yfeng1 at tigers.lsu.edu Tue Nov 10 21:53:52 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Tue, 10 Nov 2009 21:53:52 -0600 Subject: A problem on compiling a petsc code. Message-ID: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> When I compiled a PETSc code, I encountered the following error; the compiler is mpicc: /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/crt1.o(.text+0x21): In function `_start': : undefined reference to `main' collect2: ld returned 1 exit status Thank you in advance! -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Nov 11 00:49:28 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Nov 2009 00:49:28 -0600 (CST) Subject: A problem on compiling a petsc code.
In-Reply-To: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> References: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> Message-ID: Please send the complete compile command that gave this error. Also do PETSc examples compile/run correctly? What do you get for: cd src/ksp/ksp/examples/tutorials make ex2 make ex2f Satish On Tue, 10 Nov 2009, Yin Feng wrote: > When I compiled a PETSc code, I encountered the following error; > the compiler is mpicc: > > /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/crt1.o(.text+0x21): > In function `_start': > : undefined reference to `main' > collect2: ld returned 1 exit status > > Thank you in advance! > From yfeng1 at tigers.lsu.edu Wed Nov 11 12:24:10 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Wed, 11 Nov 2009 12:24:10 -0600 Subject: A problem on compiling a petsc code. In-Reply-To: References: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> Message-ID: <1e8c69dc0911111024t6397ff3ai79830c069aad621f@mail.gmail.com> include ${PETSC_DIR}/conf/base v: a.o b.o -${CLINKER} -g3 -O0 -o v a.o b.o ${PETSC_SNES_LIB} and I compiled ex2 successfully. Thank you! On Wed, Nov 11, 2009 at 12:49 AM, Satish Balay wrote: > Please send the complete compile command that gave this error. > > Also do PETSc examples compile/run correctly? What do you get for: > > cd src/ksp/ksp/examples/tutorials > make ex2 > make ex2f > > Satish > > > On Tue, 10 Nov 2009, Yin Feng wrote: > > > When I compiled a PETSc code, I encountered the following error; > > the compiler is mpicc: > > > > > /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/crt1.o(.text+0x21): > > In function `_start': > > : undefined reference to `main' > > collect2: ld returned 1 exit status > > > > Thank you in advance! > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Nov 11 12:35:50 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 11 Nov 2009 12:35:50 -0600 (CST) Subject: A problem on compiling a petsc code. In-Reply-To: <1e8c69dc0911111024t6397ff3ai79830c069aad621f@mail.gmail.com> References: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> <1e8c69dc0911111024t6397ff3ai79830c069aad621f@mail.gmail.com> Message-ID: I need the *'complete make output'* from the PETSc example - and your code. This output has the compile commands used [and associated errors]. Satish On Wed, 11 Nov 2009, Yin Feng wrote: > include ${PETSC_DIR}/conf/base > v: a.o b.o > -${CLINKER} -g3 -O0 -o v a.o b.o ${PETSC_SNES_LIB} > > and I compiled ex2 successfully. > > Thank you! > > > On Wed, Nov 11, 2009 at 12:49 AM, Satish Balay wrote: > > > Please send the complete compile command that gave this error. > > > > Also do PETSc examples compile/run correctly? What do you get for: > > > > cd src/ksp/ksp/examples/tutorials > > make ex2 > > make ex2f > > > > Satish > > > > > > On Tue, 10 Nov 2009, Yin Feng wrote: > > > > > When I compiled a PETSc code, I encountered the following error; > > > the compiler is mpicc: > > > > > > > > /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/crt1.o(.text+0x21): > > > In function `_start': > > > : undefined reference to `main' > > > collect2: ld returned 1 exit status > > > > > > Thank you in advance! > > > > > > > > From jarunan at ascomp.ch Thu Nov 12 05:21:43 2009 From: jarunan at ascomp.ch (jarunan at ascomp.ch) Date: Thu, 12 Nov 2009 12:21:43 +0100 Subject: PetscScalar, PetscReal In-Reply-To: References: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> <1e8c69dc0911111024t6397ff3ai79830c069aad621f@mail.gmail.com> Message-ID: <20091112122143.s3eiipe0co4o08o8@webmail.ascomp.ch> Hello, I would like to ask about PetscScalar. - What is PetscScalar by default? (A double-precision real number?) - Does PetscReal represent a double-precision real number? - Does PetscScalar need more memory than PetscReal? By the way, in the PETSc commands I cannot use an integer or a normal real array from Fortran. I have to convert every variable I pass to a PETSc command to PetscInt, PetscReal or PetscScalar. Is it meant to be so? In the last version, I used to be able to use common integers or reals in PETSc code. Best regards, Jarunan From jed at 59A2.org Thu Nov 12 05:33:45 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 12 Nov 2009 12:33:45 +0100 Subject: PetscScalar, PetscReal In-Reply-To: <20091112122143.s3eiipe0co4o08o8@webmail.ascomp.ch> References: <1e8c69dc0911101953p5f495222j1771b431432c5a51@mail.gmail.com> <1e8c69dc0911111024t6397ff3ai79830c069aad621f@mail.gmail.com> <20091112122143.s3eiipe0co4o08o8@webmail.ascomp.ch> Message-ID: <4AFBF299.8090107@59A2.org> jarunan at ascomp.ch wrote: > > Hello, > > I would like to ask about PetscScalar. > - What is PetscScalar by default? (A double-precision real number?) > - Does PetscReal represent a double-precision real number? > - Does PetscScalar need more memory than PetscReal? By default they are both double precision reals and thus use the same amount of memory. When built with complex, PetscScalar is complex, hence takes twice as much space. You should use Scalar for all "state" variables; Real is for parameters or time that only make sense as reals.
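To make the distinction concrete, a small C illustration (assuming an ordinary real, double-precision build; in a complex build dot becomes complex while norm and dt stay real):

  PetscScalar    dot;      /* state-like quantity: may become complex */
  PetscReal      norm, dt; /* intrinsically real: norms, tolerances, time */
  PetscErrorCode ierr;

  ierr = VecDot(x,y,&dot);CHKERRQ(ierr);       /* result is a PetscScalar */
  ierr = VecNorm(x,NORM_2,&norm);CHKERRQ(ierr); /* a norm is always a PetscReal */
  dt   = 0.1;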
> By the way, in the PETSc commands I cannot use an integer or a normal real > array from Fortran. I have to convert every variable I pass to a PETSc > command to PetscInt, PetscReal or PetscScalar. Is it meant to be so? In > the last version, I used to be able to use common integers or reals in > PETSc code. Can you be more specific about what no longer works? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From dominik at itis.ethz.ch Sat Nov 14 14:42:01 2009 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 14 Nov 2009 21:42:01 +0100 Subject: write_line error Message-ID: <4AFF1619.6060006@itis.ethz.ch> My application exits with the last displayed lines: [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 key=P1-businesscard : system msg for write_line failure : Bad file descriptor [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 key=P1-businesscard : system msg for write_line failure : Bad file descriptor leaving me with no clue whatsoever what goes on here. Any directions are highly appreciated. Dominik From jed at 59A2.org Sat Nov 14 14:50:27 2009 From: jed at 59A2.org (Jed Brown) Date: Sat, 14 Nov 2009 21:50:27 +0100 Subject: write_line error In-Reply-To: <4AFF1619.6060006@itis.ethz.ch> References: <4AFF1619.6060006@itis.ethz.ch> Message-ID: <4AFF1813.8030504@59A2.org> Dominik Szczerba wrote: > My application exits with the last displayed lines: > > [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 > key=P1-businesscard > : > system msg for write_line failure : Bad file descriptor > [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 > key=P1-businesscard > : > system msg for write_line failure : Bad file descriptor > > > > leaving me with no clue whatsoever what goes on here. Any directions are > highly appreciated. Maybe this is related? http://trac.mcs.anl.gov/projects/mpich2/ticket/907 Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From dominik at itis.ethz.ch Sat Nov 14 14:57:01 2009 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 14 Nov 2009 21:57:01 +0100 Subject: write_line error In-Reply-To: <4AFF1813.8030504@59A2.org> References: <4AFF1619.6060006@itis.ethz.ch> <4AFF1813.8030504@59A2.org> Message-ID: <4AFF199D.5030509@itis.ethz.ch> Yes I saw that post in google before but I was not able to conclude much for myself... Interesting is it happens after all the usual (success) messages, e.g.
------------------------------------------ Using C linker: /home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 Using Fortran linker: /home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/bin/mpif90 -Wall -Wno-unused-variable -g Using libraries: -Wl,-rpath,/home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/lib -L/home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/lib -L/home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/lib -lHYPRE -lmpichcxx -lstdc++ -lflapack -lfblas -lnsl -lrt -L/home/domel/pack/petsc-3.0.0-p9/linux-gnu-c-debug/lib -L/usr/lib/gcc/i486-linux-gnu/4.3.3 -ldl -lmpich -lpthread -lrt -lgcc_s -lmpichf90 -lgfortranbegin -lgfortran -lm -L/usr/lib/gcc/i486-linux-gnu -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lpthread -lrt -lgcc_s -ldl ------------------------------------------ Jed Brown wrote: > Dominik Szczerba wrote: >> My application exits with the last displayed lines: >> >> [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 >> key=P1-businesscard >> : >> system msg for write_line failure : Bad file descriptor >> [cli_0]: write_line error; fd=8 buf=:cmd=get kvsname=kvs_7282_0 >> key=P1-businesscard >> : >> system msg for write_line failure : Bad file descriptor >> >> >> >> leaving me with no clue whatsoever what goes on here. Any directions are >> highly appreciated. > > Maybe this is related? > > http://trac.mcs.anl.gov/projects/mpich2/ticket/907 > > Jed > From jed at 59A2.org Sat Nov 14 15:00:42 2009 From: jed at 59A2.org (Jed Brown) Date: Sat, 14 Nov 2009 22:00:42 +0100 Subject: write_line error In-Reply-To: <4AFF199D.5030509@itis.ethz.ch> References: <4AFF1619.6060006@itis.ethz.ch> <4AFF1813.8030504@59A2.org> <4AFF199D.5030509@itis.ethz.ch> Message-ID: <4AFF1A7A.6020405@59A2.org> Dominik Szczerba wrote: > Yes I saw that post in google before but I was not able to conclude much > for myself... > > Interesting is it happens after all the usual (success) messages, e.g. That is because it is produced in MPI_Finalize() which PETSc only calls after it is done with everything else (including the output you cite). You can initialize MPI yourself or set a breakpoint on MPI_Finalize to confirm, but none of this output is produced by PETSc. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 261 bytes Desc: OpenPGP digital signature URL: From dominik at itis.ethz.ch Sat Nov 14 15:14:45 2009 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 14 Nov 2009 22:14:45 +0100 Subject: write_line error In-Reply-To: <4AFF1A7A.6020405@59A2.org> References: <4AFF1619.6060006@itis.ethz.ch> <4AFF1813.8030504@59A2.org> <4AFF199D.5030509@itis.ethz.ch> <4AFF1A7A.6020405@59A2.org> Message-ID: <4AFF1DC5.9020303@itis.ethz.ch> OK that was very useful: due to some hasty changes I was calling MPI_Finalize without MPI_Init (calling only PetscInitialize and PetscFinalize). Thanks a lot! Jed Brown wrote: > Dominik Szczerba wrote: >> Yes I saw that post in google before but I was not able to conclude much >> for myself... >> >> Interesting is it happens after all the usual (success) messages, e.g. > > That is because it is produced in MPI_Finalize() which PETSc only calls > after it is done with everything else (including the output you cite). 
> You can initialize MPI yourself or set a breakpoint on MPI_Finalize to > confirm, but none of this output is produced by PETSc. > > Jed > From dominik at itis.ethz.ch Sat Nov 14 15:32:29 2009 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 14 Nov 2009 22:32:29 +0100 Subject: malloc(): memory corruption: Message-ID: <4AFF21ED.5080106@itis.ethz.ch> Now for something more serious: I get a crash like this one: Starting KSPSolve (1/2) 0 KSP Residual norm 2.964538623545e-06 *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi: malloc(): memory corruption: 0x09258008 *** ======= Backtrace: ========= /lib/tls/i686/cmov/libc.so.6[0x5f9ff1] /lib/tls/i686/cmov/libc.so.6[0x5fcbb3] /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009] /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28] /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b] (and so on) gdb invoked as: mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0 does not display any backtrace after the crash. Any hints how to debug are highly appreciated. Dominik From knepley at gmail.com Sat Nov 14 15:45:00 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 14 Nov 2009 15:45:00 -0600 Subject: malloc(): memory corruption: In-Reply-To: <4AFF21ED.5080106@itis.ethz.ch> References: <4AFF21ED.5080106@itis.ethz.ch> Message-ID: Try valgrind. Matt On Sat, Nov 14, 2009 at 3:32 PM, Dominik Szczerba wrote: > Now for something more serious: I get a crash like this one: > > Starting KSPSolve (1/2) > 0 KSP Residual norm 2.964538623545e-06 > *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi: malloc(): > memory corruption: 0x09258008 *** > ======= Backtrace: ========= > /lib/tls/i686/cmov/libc.so.6[0x5f9ff1] > /lib/tls/i686/cmov/libc.so.6[0x5fcbb3] > /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009] > /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28] > > /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b] > (and so on) > > gdb invoked as: > > mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0 > > does not display any backtrace after the crash. > > Any hints how to debug are highly appreciated. > > Dominik > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominik at itis.ethz.ch Sat Nov 14 15:51:32 2009 From: dominik at itis.ethz.ch (Dominik Szczerba) Date: Sat, 14 Nov 2009 22:51:32 +0100 Subject: malloc(): memory corruption: In-Reply-To: References: <4AFF21ED.5080106@itis.ethz.ch> Message-ID: <4AFF2664.20202@itis.ethz.ch> run onlu in single, he says things like below - but does not crash. Also, the program run with -np 1 does not crash. No clear idea though about valgrind's output, please advise if this tells you anything... 
Call from NS3T10::createSolverContexts() referenced therein is: ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr); ==2605== Conditional jump or move depends on uninitialised value(s) ==2605== at 0x8AE720F: hypre_BoomerAMGSetPlotFileName (par_amg.c:2115) ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) ==2605== by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980) ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) ==2605== ==2605== Conditional jump or move depends on uninitialised value(s) ==2605== at 0x8AE7244: hypre_BoomerAMGSetPlotFileName (par_amg.c:2120) ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) ==2605== by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980) ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) ==2605== ==2605== Conditional jump or move depends on uninitialised value(s) ==2605== at 0x4025C16: strcpy (mc_replace_strmem.c:303) ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123) ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) ==2605== by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980) ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) ==2605== ==2605== Conditional jump or move depends on uninitialised value(s) ==2605== at 0x4025C35: strcpy (mc_replace_strmem.c:303) ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123) ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) ==2605== by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980) ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) ==2605== Solver contexts created in 2.520000 s Starting KSPSolve (0/1) 0 KSP Residual norm 8.368803253774e-06 ==2605== Invalid read of size 8 ==2605== at 0x8B23B5A: hypre_BoomerAMGCreateS (par_strength.c:223) ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741) ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) ==2605== Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) 
==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200) ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272) ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== ==2605== Invalid write of size 4 ==2605== at 0x8B23E0C: hypre_BoomerAMGCreateS (par_strength.c:301) ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741) ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) ==2605== Address 0xb12a050 is 0 bytes after a block of size 46,744 alloc'd ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) ==2605== by 0x8B23980: hypre_BoomerAMGCreateS (par_strength.c:163) ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741) ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== ... 
==2605== Invalid read of size 8 ==2605== at 0x8B1ACE8: hypre_BoomerAMGRelax (par_relax.c:182) ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF (par_relax_interface.c:110) ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741) ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200) ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272) ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== ... 0 KSP Residual norm 8.368803253774e-06 ==2605== Invalid read of size 8 ==2605== at 0x8B1ADC0: hypre_BoomerAMGRelax (par_relax.c:196) ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF (par_relax_interface.c:110) ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741) ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) ==2605== by 0x862074E: PCApply (precon.c:357) ==2605== Address 0xcded820 is 0 bytes after a block of size 93,488 alloc'd ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200) ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272) ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) ==2605== by 0x86256A9: PCSetUp (precon.c:794) ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) ==2605== Matthew Knepley wrote: > Try valgrind. 
> > Matt > > On Sat, Nov 14, 2009 at 3:32 PM, Dominik Szczerba > wrote: > > Now for something more serious: I get a crash like this one: > > Starting KSPSolve (1/2) > 0 KSP Residual norm 2.964538623545e-06 > *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi: > malloc(): memory corruption: 0x09258008 *** > ======= Backtrace: ========= > /lib/tls/i686/cmov/libc.so.6[0x5f9ff1] > /lib/tls/i686/cmov/libc.so.6[0x5fcbb3] > /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009] > /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28] > /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b] > (and so on) > > gdb invoked as: > > mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0 > > does not display any backtrace after the crash. > > Any hints how to debug are highly appreciated. > > Dominik > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From knepley at gmail.com Sat Nov 14 16:21:39 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 14 Nov 2009 16:21:39 -0600 Subject: malloc(): memory corruption: In-Reply-To: <4AFF2664.20202@itis.ethz.ch> References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> Message-ID: This is already bad. You had an Invalid Read and Invalid Write in your Hypre. Did you build it yourself? If so, let us build it. If not, please try your matrix on KSP ex10 and see if you get a crash on 2 procs. Thanks, Matt On Sat, Nov 14, 2009 at 3:51 PM, Dominik Szczerba wrote: > run onlu in single, he says things like below - but does not crash. Also, > the program run with -np 1 does not crash. No clear idea though about > valgrind's output, please advise if this tells you anything... 
> > Call from NS3T10::createSolverContexts() referenced therein is: > > ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr); > > > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x8AE720F: hypre_BoomerAMGSetPlotFileName (par_amg.c:2115) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x8AE7244: hypre_BoomerAMGSetPlotFileName (par_amg.c:2120) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x4025C16: strcpy (mc_replace_strmem.c:303) > ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x4025C35: strcpy (mc_replace_strmem.c:303) > ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > Solver contexts created in 2.520000 s > Starting KSPSolve (0/1) > 0 KSP Residual norm 8.368803253774e-06 > ==2605== Invalid read of size 8 > ==2605== at 0x8B23B5A: hypre_BoomerAMGCreateS (par_strength.c:223) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) > ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) > ==2605== Address 0xafae5d0 is 0 bytes after a block 
of size 93,488 alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > ==2605== Invalid write of size 4 > ==2605== at 0x8B23E0C: hypre_BoomerAMGCreateS (par_strength.c:301) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) > ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) > ==2605== Address 0xb12a050 is 0 bytes after a block of size 46,744 alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B23980: hypre_BoomerAMGCreateS (par_strength.c:163) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== > ... 
> ==2605== Invalid read of size 8 > ==2605== at 0x8B1ACE8: hypre_BoomerAMGRelax (par_relax.c:182) > ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF > (par_relax_interface.c:110) > ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) > ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) > ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) > ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) > ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > ... > 0 KSP Residual norm 8.368803253774e-06 > ==2605== Invalid read of size 8 > ==2605== at 0x8B1ADC0: hypre_BoomerAMGRelax (par_relax.c:196) > ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF > (par_relax_interface.c:110) > ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) > ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) > ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) > ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) > ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== Address 0xcded820 is 0 bytes after a block of size 93,488 alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > > > > > Matthew Knepley wrote: > >> Try valgrind. 
>>
>> Matt
>>
>> On Sat, Nov 14, 2009 at 3:32 PM, Dominik Szczerba <dominik at itis.ethz.ch> wrote:
>>
>> Now for something more serious: I get a crash like this one:
>>
>> Starting KSPSolve (1/2)
>> 0 KSP Residual norm 2.964538623545e-06
>> *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi:
>> malloc(): memory corruption: 0x09258008 ***
>> ======= Backtrace: =========
>> /lib/tls/i686/cmov/libc.so.6[0x5f9ff1]
>> /lib/tls/i686/cmov/libc.so.6[0x5fcbb3]
>> /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009]
>> /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28]
>> /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b]
>> (and so on)
>>
>> gdb invoked as:
>>
>> mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0
>>
>> does not display any backtrace after the crash.
>>
>> Any hints how to debug are highly appreciated.
>>
>> Dominik
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dominik at itis.ethz.ch  Sat Nov 14 16:41:28 2009
From: dominik at itis.ethz.ch (Dominik Szczerba)
Date: Sat, 14 Nov 2009 23:41:28 +0100
Subject: malloc(): memory corruption:
In-Reply-To:
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch>
Message-ID: <4AFF3218.3070303@itis.ethz.ch>

No, I am using Hypre built automatically along with petsc...
I will try ex10, thanks...

Matthew Knepley wrote:
> This is already bad. You had an Invalid Read and Invalid Write in your
> Hypre. Did you build it yourself? If so, let us build it. If not, please
> try your matrix on KSP ex10 and see if you get a crash on 2 procs.
>
> Thanks,
>
> Matt
>
> On Sat, Nov 14, 2009 at 3:51 PM, Dominik Szczerba <dominik at itis.ethz.ch> wrote:
>
> run only in single, he says things like below - but does not crash.
> Also, the program run with -np 1 does not crash. No clear idea
> though about valgrind's output, please advise if this tells you
> anything...
> > Call from NS3T10::createSolverContexts() referenced therein is: > > ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr); > > > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x8AE720F: hypre_BoomerAMGSetPlotFileName > (par_amg.c:2115) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x8AE7244: hypre_BoomerAMGSetPlotFileName > (par_amg.c:2120) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x4025C16: strcpy (mc_replace_strmem.c:303) > ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName > (par_amg.c:2123) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > ==2605== Conditional jump or move depends on uninitialised value(s) > ==2605== at 0x4025C35: strcpy (mc_replace_strmem.c:303) > ==2605== by 0x8AE727A: hypre_BoomerAMGSetPlotFileName > (par_amg.c:2123) > ==2605== by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276) > ==2605== by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31) > ==2605== by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850) > ==2605== by 0x8563068: PCHYPRESetType (hypre.c:964) > ==2605== by 0x80E67BB: NS3T10::createSolverContexts() > (NS3T10mpi.cxx:1980) > ==2605== by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306) > ==2605== by 0x8104860: main (ns3t10mpi_main.cxx:1516) > ==2605== > Solver contexts created in 2.520000 s > Starting KSPSolve (0/1) > 0 KSP Residual norm 8.368803253774e-06 > ==2605== Invalid read of size 8 > ==2605== at 0x8B23B5A: hypre_BoomerAMGCreateS (par_strength.c:223) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) > ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) > ==2605== Address 0xafae5d0 is 0 bytes after a 
block of size 93,488 > alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize > (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > ==2605== Invalid write of size 4 > ==2605== at 0x8B23E0C: hypre_BoomerAMGCreateS (par_strength.c:301) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x863AC4C: KSPInitialResidual (itres.c:64) > ==2605== by 0x85EB09A: KSPSolve_GMRES (gmres.c:241) > ==2605== Address 0xb12a050 is 0 bytes after a block of size 46,744 > alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B23980: hypre_BoomerAMGCreateS (par_strength.c:163) > ==2605== by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630) > ==2605== by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58) > ==2605== by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== > ... 
> ==2605== Invalid read of size 8 > ==2605== at 0x8B1ACE8: hypre_BoomerAMGRelax (par_relax.c:182) > ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF > (par_relax_interface.c:110) > ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) > ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) > ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) > ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) > ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== Address 0xafae5d0 is 0 bytes after a block of size 93,488 > alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize > (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > ... > 0 KSP Residual norm 8.368803253774e-06 > ==2605== Invalid read of size 8 > ==2605== at 0x8B1ADC0: hypre_BoomerAMGRelax (par_relax.c:196) > ==2605== by 0x8B1DFBF: hypre_BoomerAMGRelaxIF > (par_relax_interface.c:110) > ==2605== by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386) > ==2605== by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252) > ==2605== by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76) > ==2605== by 0x855AAA4: PCApply_HYPRE (hypre.c:172) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== by 0x8606095: KSPSolve_PREONLY (preonly.c:29) > ==2605== by 0x85A85D3: KSPSolve (itfunc.c:385) > ==2605== by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) > (NS3T10mpi.cxx:3741) > ==2605== by 0x851C47E: PCApply_Shell (shellpc.c:129) > ==2605== by 0x862074E: PCApply (precon.c:357) > ==2605== Address 0xcded820 is 0 bytes after a block of size 93,488 > alloc'd > ==2605== at 0x4023F5B: calloc (vg_replace_malloc.c:418) > ==2605== by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121) > ==2605== by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91) > ==2605== by 0x8B32EC8: hypre_ParCSRMatrixInitialize > (par_csr_matrix.c:200) > ==2605== by 0x8AE0C44: hypre_IJMatrixInitializeParCSR > (IJMatrix_parcsr.c:272) > ==2605== by 0x8ADBE09: HYPRE_IJMatrixInitialize > (HYPRE_IJMatrix.c:302) > ==2605== by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174) > ==2605== by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131) > ==2605== by 0x855A445: PCSetUp_HYPRE (hypre.c:130) > ==2605== by 0x86256A9: PCSetUp (precon.c:794) > ==2605== by 0x85A6E62: KSPSetUp (itfunc.c:237) > ==2605== by 0x85A7EAB: KSPSolve (itfunc.c:353) > ==2605== > > > > > Matthew Knepley wrote: > > Try valgrind. 
> > Matt
> > ...

From bsmith at mcs.anl.gov  Sat Nov 14 21:22:28 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 14 Nov 2009 21:22:28 -0600
Subject: malloc(): memory corruption:
In-Reply-To: <4AFF3218.3070303@itis.ethz.ch>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch>
Message-ID:

   If you run under valgrind without the hypre preconditioner but with, say,
bjacobi instead, do you get any valgrind errors?

   The problem you are having could be due to (1) some memory corruption in
your code that is messing up hypre or (2) some bug in hypre that we don't
see with our simple test codes.

   Barry

On Nov 14, 2009, at 4:41 PM, Dominik Szczerba wrote:

> No, I am using Hypre built automatically along with petsc...
> I will try ex10, thanks...
>
> Matthew Knepley wrote:
>> This is already bad. You had an Invalid Read and Invalid Write in
>> your Hypre. Did you build it yourself? If so, let us build it. If not,
>> please try your matrix on KSP ex10 and see if you get a crash on 2 procs.
>> Thanks,
>> Matt
>> On Sat, Nov 14, 2009 at 3:51 PM, Dominik Szczerba <dominik at itis.ethz.ch> wrote:
>> run only in single, he says things like below - but does not
>> crash.
>> Also, the program run with -np 1 does not crash. No clear idea
>> though about valgrind's output, please advise if this tells you
>> anything...
>> Call from NS3T10::createSolverContexts() referenced therein is:
>>
>> ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr);
>>
>> [... valgrind output identical to that quoted earlier in this thread ...]
>>
>> Matthew Knepley wrote:
>> Try valgrind.
>> ...
From dominik at itis.ethz.ch  Sun Nov 15 02:24:49 2009
From: dominik at itis.ethz.ch (Dominik Szczerba)
Date: Sun, 15 Nov 2009 09:24:49 +0100
Subject: malloc(): memory corruption:
In-Reply-To:
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch>
Message-ID: <4AFFBAD1.4050107@itis.ethz.ch>

Yes, I have found an error in my matrix... Thank you all for the useful
hints! Still, I wonder if there are more efficient ways to set up bug traps
to get the backtrace pointing at the real problem and not at innocent
parts...

With regards,
Dominik

Barry Smith wrote:
>  If you run under valgrind without the hypre preconditioner but with,
> say, bjacobi instead, do you get any valgrind errors?
>
>  The problem you are having could be due to (1) some memory corruption
> in your code that is messing up hypre or (2) some bug in hypre that we
> don't see with our simple test codes.
>
>  Barry
>
> On Nov 14, 2009, at 4:41 PM, Dominik Szczerba wrote:
>
>> No, I am using Hypre built automatically along with petsc...
>> I will try ex10, thanks...
>> ...
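Dominik's closing question has a generic answer worth sketching here (the
executable name and options below are placeholders for the real ones): run
every MPI rank under valgrind,

    mpiexec -n 2 valgrind --tool=memcheck --leak-check=yes --num-callers=20 \
        ./ns3t10mpi -malloc_debug <usual options>

and, in a PETSc debugging build, place a CHKMEMQ; statement after each
suspect assembly call so the PETSc heap is validated at that point rather
than at the eventual crash. Note that -malloc_debug guards only PETSc's own
allocations; overruns inside hypre, like the ones above, still need valgrind.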
From Chun.SUN at 3ds.com  Mon Nov 16 10:08:29 2009
From: Chun.SUN at 3ds.com (SUN Chun)
Date: Mon, 16 Nov 2009 11:08:29 -0500
Subject: MatMult with nonconforming error
In-Reply-To: <4AFFBAD1.4050107@itis.ethz.ch>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch>
Message-ID: <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds>

Hi,

I was trying to do MatMult with a non-square matrix and a vector. They have
different local dimensions.

For this particular case, my Mat is 12x48 and my Vec is 48x1. When I run in
parallel with 2 cores, I have the Mat partitioned by row in 12+0, and I have
the Vec partitioned by row in 18+30. When I perform MatMult, I get:

[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Nonconforming object sizes!
[0]PETSC ERROR: Incompatible partition of A (24) and xx (18)!
[0]PETSC ERROR: ------------------------------------------------------------------------

Is it required to partition the Vec and Mat such that my Vec's row partition
agrees with my Mat's column partition? If so, is there any way to get around
this?

Thanks,
Chun

From bsmith at mcs.anl.gov  Mon Nov 16 10:35:19 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 16 Nov 2009 10:35:19 -0600
Subject: MatMult with nonconforming error
In-Reply-To: <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds>
Message-ID:

   With y = A x the row partition of A MUST match the row partition of y
and the column partition of A MUST match the row partition of x.

   There is no avoiding this,

   Barry

On Nov 16, 2009, at 10:08 AM, SUN Chun wrote:

> Hi,
>
> I was trying to do MatMult with a non-square matrix and a vector. They
> have different local dimensions.
> ...
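A sketch of the mechanical way to satisfy this, assuming a Mat A that is
already assembled (the routine is MatGetVecs() in PETSc releases of this
period; later releases rename it MatCreateVecs()):

    Vec x, y;
    /* x inherits the column layout of A, y inherits the row layout */
    ierr = MatGetVecs(A,&x,&y);CHKERRQ(ierr);
    /* ... fill x with VecSetValues()/VecAssemblyBegin()/End() ... */
    ierr = MatMult(A,x,y);CHKERRQ(ierr);

Because both vectors take their parallel layouts from A, the partitions
conform by construction and the "Nonconforming object sizes" error cannot
occur.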
From dalcinl at gmail.com  Mon Nov 16 11:00:28 2009
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 16 Nov 2009 15:00:28 -0200
Subject: MatMult with nonconforming error
In-Reply-To:
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds>
Message-ID:

On Mon, Nov 16, 2009 at 2:35 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>  With y = A x the row partition of A MUST match the row partition of y
> and the column partition of A MUST match the row partition of x.
>
>  There is no avoiding this,
>

But it can be worked around using a VecScatter, right?

> ...

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From bsmith at mcs.anl.gov  Mon Nov 16 12:00:31 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 16 Nov 2009 12:00:31 -0600
Subject: MatMult with nonconforming error
In-Reply-To:
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds>
Message-ID: <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov>

On Nov 16, 2009, at 11:00 AM, Lisandro Dalcin wrote:

> But it can be worked around using a VecScatter, right?

   Well since PETSc is just a library of routines one can do anything they
want with it. So yes, one could include a VecScatter to get things into the
right shape, but it would be a bit cumbersome.

   Barry

> ...
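For completeness, a sketch of the workaround Barry calls cumbersome. Here
x_user is a hypothetical vector whose layout does not conform to A's
columns, y is the result vector from the previous sketch, and the argument
order follows petsc-3.0:

    Vec        xconf;
    VecScatter scatter;
    IS         is;
    PetscInt   rstart,rend;

    ierr = MatGetVecs(A,&xconf,PETSC_NULL);CHKERRQ(ierr);  /* column layout of A */
    ierr = VecGetOwnershipRange(xconf,&rstart,&rend);CHKERRQ(ierr);
    /* identity map: global entry i of x_user -> global entry i of xconf */
    ierr = ISCreateStride(PETSC_COMM_WORLD,rend-rstart,rstart,1,&is);CHKERRQ(ierr);
    ierr = VecScatterCreate(x_user,is,xconf,is,&scatter);CHKERRQ(ierr);
    ierr = VecScatterBegin(scatter,x_user,xconf,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecScatterEnd(scatter,x_user,xconf,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = MatMult(A,xconf,y);CHKERRQ(ierr);

The extra vector, index set, and communication phase are exactly the
cumbersome parts Barry mentions.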
From Chun.SUN at 3ds.com  Tue Nov 17 11:00:48 2009
From: Chun.SUN at 3ds.com (SUN Chun)
Date: Tue, 17 Nov 2009 12:00:48 -0500
Subject: matlab viewer variable name
In-Reply-To: <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov>
Message-ID: <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds>

Hi,

When I MatView or VecView with the matlab format, I am automatically
assigned a variable name like "Mat_0" or "Vec_1". Which function can I use
to change this value?

Sorry, I had a feeling I had seen this question asked before. I spent 30 min
searching and didn't find it...

Thanks,
Chun

From jed at 59A2.org  Tue Nov 17 11:06:50 2009
From: jed at 59A2.org (Jed Brown)
Date: Tue, 17 Nov 2009 18:06:50 +0100
Subject: matlab viewer variable name
In-Reply-To: <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds>
Message-ID: <4B02D82A.1080207@59A2.org>

SUN Chun wrote:
> Hi,
>
> When I MatView or VecView with the matlab format, I am automatically
> assigned a variable name like "Mat_0" or "Vec_1". Which function can I
> use to change this value?

PetscObjectSetName((PetscObject)yourvec,"Name_of_your_vec");

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 261 bytes
Desc: OpenPGP digital signature
URL:
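A sketch of the whole pattern (identifier and file names here are
illustrative; PetscViewerDestroy() takes the viewer by value in PETSc
releases of this period):

    Vec         x;
    PetscViewer viewer;
    /* ... create and fill x ... */
    ierr = PetscObjectSetName((PetscObject)x,"pressure");CHKERRQ(ierr);
    ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"pressure.m",&viewer);CHKERRQ(ierr);
    ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr);
    ierr = VecView(x,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

Running pressure.m in MATLAB then defines a variable named pressure instead
of Vec_0. The same call works for a Mat.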
From bsmith at mcs.anl.gov  Tue Nov 17 11:06:57 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 17 Nov 2009 11:06:57 -0600
Subject: matlab viewer variable name
In-Reply-To: <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds>
Message-ID: <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov>

   PetscObjectSetName((PetscObject)x, "myname");

On Nov 17, 2009, at 11:00 AM, SUN Chun wrote:

> Hi,
>
> When I MatView or VecView with the matlab format, I am automatically
> assigned a variable name like "Mat_0" or "Vec_1". Which function can I
> use to change this value?
> ...

From jarunan at ascomp.ch  Wed Nov 18 02:27:30 2009
From: jarunan at ascomp.ch (jarunan at ascomp.ch)
Date: Wed, 18 Nov 2009 09:27:30 +0100
Subject: scaling in 4-core machine
In-Reply-To: <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov>
Message-ID: <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch>

Hello,

I have read the topic about the performance of a machine with 2 dual-core
chips, where it is written that with -np 2 it should scale the best. I would
like to ask about a 4-core machine.

I ran the test on a quad-core machine with mpiexec -n 1, 2 and 4 to see the
parallel scaling. The CPU times of the test are:

Solver/Precond/Sub_Precond

gmres/bjacobi/ilu

-n 1, 1917.5730 sec,
-n 2, 1699.9490 sec, efficiency = 56.40%
-n 4, 1661.6810 sec, efficiency = 28.86%

bicgstab/asm/ilu

-n 1, 1800.8380 sec,
-n 2, 1415.0170 sec, efficiency = 63.63%
-n 4, 1119.3480 sec, efficiency = 40.22%

Why is the scaling so low, especially with option -n 4? Would it be expected
to be better running with 4 real CPUs instead of one quad-core chip?

Regards,
Jarunan

--
Jarunan Panyasantisuk
Development Engineer
ASCOMP GmbH, Technoparkstr. 1
CH-8005 Zurich, Switzerland
Phone : +41 44 445 4072
Fax   : +41 44 445 4075
E-mail: jarunan at ascomp.ch
www.ascomp.ch

From zonexo at gmail.com  Wed Nov 18 02:45:12 2009
From: zonexo at gmail.com (Wee-Beng Tay)
Date: Wed, 18 Nov 2009 16:45:12 +0800
Subject: Using PETSc with Silverfrost FTN95 for windows
Message-ID: <804ab5d40911180045u1ee90bc2o3a1c69b40fe7080e@mail.gmail.com>

Hi,

Has anyone managed to use PETSc successfully with Silverfrost FTN95 for
Windows? I didn't manage to google any info regarding this. Silverfrost
FTN95 seems like a good free alternative Fortran compiler on Windows.

Are these possible:

1. Using a pre-compiled PETSc library with Silverfrost FTN95 - supposing
that the library is precompiled using CVF, Intel, or Visual Studio.

2. Compiling the PETSc library using Silverfrost FTN95 and another C
compiler for Windows.

Thanks a lot!

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From jed at 59A2.org  Wed Nov 18 04:13:05 2009
From: jed at 59A2.org (Jed Brown)
Date: Wed, 18 Nov 2009 11:13:05 +0100
Subject: scaling in 4-core machine
In-Reply-To: <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov> <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch>
Message-ID: <4B03C8B1.2080305@59A2.org>

jarunan at ascomp.ch wrote:
>
> Hello,
>
> I have read the topic about the performance of a machine with 2 dual-core
> chips, where it is written that with -np 2 it should scale the best. I
> would like to ask about a 4-core machine.
>
> I ran the test on a quad-core machine with mpiexec -n 1, 2 and 4 to see
> the parallel scaling. The CPU times of the test are:
>
> Solver/Precond/Sub_Precond
>
> gmres/bjacobi/ilu
>
> -n 1, 1917.5730 sec,
> -n 2, 1699.9490 sec, efficiency = 56.40%
> -n 4, 1661.6810 sec, efficiency = 28.86%
>
> bicgstab/asm/ilu
>
> -n 1, 1800.8380 sec,
> -n 2, 1415.0170 sec, efficiency = 63.63%
> -n 4, 1119.3480 sec, efficiency = 40.22%

These numbers are worthless without at least knowing iteration counts.

> Why is the scaling so low, especially with option -n 4? Would it be
> expected to be better running with 4 real CPUs instead of one quad-core
> chip?

4 sockets using a single core each (4x1) will generally do better than
2x2 or 1x4, but 4x4 costs about the same as 4x1 these days. This is a
very common question; the answer is that a single floating point unit is
about 10 times faster than memory for the sort of operations that we do
when solving PDEs. You don't get another memory bus every time you add a
core, so the ratio becomes worse. More cores are not a complete loss
because at least you get an extra L1 cache for each core, but sparse
matrix and vector kernels are atrocious at reusing cache (there's not
much to reuse because most values are only needed to perform one
operation).

Getting better multicore performance requires changing the algorithms to
better reuse L1 cache. This means moving away from assembled matrices
where possible and of course finding good preconditioners. High-order
and fast multipole methods are good for this. But it's very much an
open problem, and unless you want to do research in the field, you have
to live with poor multicore performance.

When buying hardware, remember that you are buying memory bandwidth (and
a low-latency network) instead of floating point units.

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 261 bytes
Desc: OpenPGP digital signature
URL:
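Jed's caveat about iteration counts can be checked without code changes;
these are standard runtime options in PETSc of this period:

    mpiexec -n 4 ./app -ksp_monitor -ksp_converged_reason -log_summary

Block Jacobi and ASM with ILU both weaken as subdomains are added, so the
iteration count typically grows with -n; time per iteration, read off the
-log_summary output, is a fairer basis for a scaling comparison than total
solve time.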
From balay at mcs.anl.gov  Wed Nov 18 09:09:18 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 18 Nov 2009 09:09:18 -0600 (CST)
Subject: Using PETSc with Silverfrost FTN95 for windows
In-Reply-To: <804ab5d40911180045u1ee90bc2o3a1c69b40fe7080e@mail.gmail.com>
References: <804ab5d40911180045u1ee90bc2o3a1c69b40fe7080e@mail.gmail.com>
Message-ID:

Nope - this compiler won't work with PETSc. Someone would need to fix
win32fe to work with this compiler.

Satish

On Wed, 18 Nov 2009, Wee-Beng Tay wrote:

> Hi,
>
> Has anyone managed to use PETSc successfully with Silverfrost FTN95 for
> Windows? I didn't manage to google any info regarding this. Silverfrost
> FTN95 seems like a good free alternative Fortran compiler on Windows.
> ...

From balay at mcs.anl.gov  Wed Nov 18 09:14:19 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 18 Nov 2009 09:14:19 -0600 (CST)
Subject: scaling in 4-core machine
In-Reply-To: <4B03C8B1.2080305@59A2.org>
References: <4AFF21ED.5080106@itis.ethz.ch> <4AFF2664.20202@itis.ethz.ch> <4AFF3218.3070303@itis.ethz.ch> <4AFFBAD1.4050107@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov> <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch> <4B03C8B1.2080305@59A2.org>
Message-ID:

Just want to add one more point to this.

Most multicore machines do not provide scalable hardware. [yeah - the
FPU cores are scalable - but the memory subsystem is not]. So one
should not expect scalable performance out of them. You should take
the 'max' performance you can get out of them - and then look for
scalability with multiple nodes.

Satish

On Wed, 18 Nov 2009, Jed Brown wrote:

> jarunan at ascomp.ch wrote:
> ...
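Satish's point about the memory subsystem is easy to demonstrate outside
PETSc. A rough sketch of a STREAM-style triad (not the real benchmark;
the array size is arbitrary):

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    int main(int argc,char **argv)
    {
      const int n = 5000000;                 /* 3 arrays x 40 MB per process */
      double *a = malloc(n*sizeof(double));
      double *b = malloc(n*sizeof(double));
      double *c = malloc(n*sizeof(double));
      double t0,t1;
      int i;
      MPI_Init(&argc,&argv);
      for (i=0; i<n; i++) { b[i] = 1.0; c[i] = 2.0; }
      MPI_Barrier(MPI_COMM_WORLD);
      t0 = MPI_Wtime();
      for (i=0; i<n; i++) a[i] = b[i] + 3.0*c[i];   /* 2 flops, 24 bytes moved */
      t1 = MPI_Wtime();
      printf("per-process triad bandwidth: %.0f MB/s\n",24.0*n/(t1-t0)/1.0e6);
      free(a); free(b); free(c);
      MPI_Finalize();
      return 0;
    }

Run it with -n 1, 2, and 4 on the quad-core box: the per-process bandwidth
drops as processes are added because they share one memory bus, which is
the same ceiling the KSP solves hit.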
From aron.ahmadia at kaust.edu.sa  Wed Nov 18 09:26:48 2009
From: aron.ahmadia at kaust.edu.sa (Aron Ahmadia)
Date: Wed, 18 Nov 2009 18:26:48 +0300
Subject: scaling in 4-core machine
In-Reply-To:
References: <4AFF21ED.5080106@itis.ethz.ch> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E8@CORP-CLT-EXB01.ds> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov> <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch> <4B03C8B1.2080305@59A2.org>
Message-ID: <74e91d510911180726w56df1797p7984f6f9f95d7035@mail.gmail.com>

Does anybody have good references in the literature analyzing the memory
access patterns for sparse solvers and how they scale? I remember seeing
Barry's talk about multigrid memory access patterns, but I'm not sure if
I've ever seen a good paper reference.

Cheers,
Aron

On Wed, Nov 18, 2009 at 6:14 PM, Satish Balay <balay at mcs.anl.gov> wrote:

> Just want to add one more point to this.
>
> Most multicore machines do not provide scalable hardware. [yeah - the
> FPU cores are scalable - but the memory subsystem is not]. So one
> should not expect scalable performance out of them.
> ...
But it's very much an > > open problem and unless you want to do research in the field, you have > > to live with poor multicore performance. > > > > When buying hardware, remember that you are buying memory bandwidth (and > > a low-latency network) instead of floating point units. > > > > Jed > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amjad11 at gmail.com Wed Nov 18 11:47:07 2009 From: amjad11 at gmail.com (amjad ali) Date: Wed, 18 Nov 2009 12:47:07 -0500 Subject: scaling in 4-core machine In-Reply-To: <74e91d510911180726w56df1797p7984f6f9f95d7035@mail.gmail.com> References: <4AFF21ED.5080106@itis.ethz.ch> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov> <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch> <4B03C8B1.2080305@59A2.org> <74e91d510911180726w56df1797p7984f6f9f95d7035@mail.gmail.com> Message-ID: <428810f20911180947i723b4a24vcabf72b12adc266e@mail.gmail.com> Hi, Aron, Can you please give link of Barry's talk about multigrid memory access patterns (u just mentioned)? thanks On Wed, Nov 18, 2009 at 10:26 AM, Aron Ahmadia wrote: > Does anybody have good references in the literature analyzing the memory > access patterns for sparse solvers and how they scale? I remember seeing > Barry's talk about multigrid memory access patterns, but I'm not sure if > I've ever seen a good paper reference. > > Cheers, > Aron > > > On Wed, Nov 18, 2009 at 6:14 PM, Satish Balay wrote: > >> Just want to add one more point to this. >> >> Most multicore machines do not provide scalable hardware. [yeah - the >> FPUs cores are scalable - but the memory subsystem is not]. So one >> should not expect scalable performance out of them. You should take >> the 'max' performance you can get out out them - and then look for >> scalability with multiple nodes. >> >> Satish >> >> On Wed, 18 Nov 2009, Jed Brown wrote: >> >> > jarunan at ascomp.ch wrote: >> > > >> > > Hello, >> > > >> > > I have read the topic about performance of a machine with 2 dual-core >> > > chips, and it is written that with -np 2 it should scale the best. I >> > > would like to ask about 4-core machine. >> > > >> > > I run the test on a quad core machine with mpiexec -n 1, 2 and 4 to >> see >> > > the parallel scaling. The cpu times of the test are: >> > > >> > > Solver/Precond/Sub_Precond >> > > >> > > gmres/bjacobi/ilu >> > > >> > > -n 1, 1917.5730 sec, >> > > -n 2, 1699.9490 sec, efficiency = 56.40% >> > > -n 4, 1661.6810 sec, efficiency = 28.86% >> > > >> > > bicgstab/asm/ilu >> > > >> > > -n 1, 1800.8380 sec, >> > > -n 2, 1415.0170 sec, efficiency = 63.63% >> > > -n 4, 1119.3480 sec, efficiency = 40.22% >> > >> > These numbers are worthless without at least knowing iteration counts. >> > >> > > Why is the scaling so low, especially with option -n 4? >> > > Would it be expected to be better running with real 4 CPU's instead of >> a >> > > quad core ship? >> > >> > 4 sockets using a single core each (4x1) will generally do better than >> > 2x2 or 1x4, but 4x4 costs about the same as 4x1 these days. This is a >> > very common question, the answer is that a single floating point unit is >> > about 10 times faster than memory for the sort of operations that we do >> > when solving PDE. You don't get another memory bus every time you add a >> > core so the ratio becomes worse. 
More cores are not a complete loss >> > because at least you get an extra L1 cache for each core, but sparse >> > matrix and vector kernels are atrocious at reusing cache (there's not >> > much to reuse because most values are only needed to perform one >> > operation). >> > >> > Getting better multicore performance requires changing the algorithms to >> > better reuse L1 cache. This means moving away from assembled matrices >> > where possible and of course finding good preconditioners. High-order >> > and fast multipole methods are good for this. But it's very much an >> > open problem and unless you want to do research in the field, you have >> > to live with poor multicore performance. >> > >> > When buying hardware, remember that you are buying memory bandwidth (and >> > a low-latency network) instead of floating point units. >> > >> > Jed >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 18 14:18:27 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Nov 2009 14:18:27 -0600 Subject: scaling in 4-core machine In-Reply-To: <74e91d510911180726w56df1797p7984f6f9f95d7035@mail.gmail.com> References: <4AFF21ED.5080106@itis.ethz.ch> <16E85BAA-87E4-4D27-96EF-24B77EBE0859@mcs.anl.gov> <2545DC7A42DF804AAAB2ADA5043D57DA28E4E9@CORP-CLT-EXB01.ds> <8E0BED3A-55F7-4F6E-BAAA-4EDADB1C9ACA@mcs.anl.gov> <20091118092730.zxzn2ru7nogk8g8c@webmail.ascomp.ch> <4B03C8B1.2080305@59A2.org> <74e91d510911180726w56df1797p7984f6f9f95d7035@mail.gmail.com> Message-ID: There is also the paper by Barry, Bill, David, and Dinesh about SpMV. Its very good. That is what I base my slides on. You can see the punchline in the tutorial slides. Matt On Wed, Nov 18, 2009 at 9:26 AM, Aron Ahmadia wrote: > Does anybody have good references in the literature analyzing the memory > access patterns for sparse solvers and how they scale? I remember seeing > Barry's talk about multigrid memory access patterns, but I'm not sure if > I've ever seen a good paper reference. > > Cheers, > Aron > > > On Wed, Nov 18, 2009 at 6:14 PM, Satish Balay wrote: > >> Just want to add one more point to this. >> >> Most multicore machines do not provide scalable hardware. [yeah - the >> FPUs cores are scalable - but the memory subsystem is not]. So one >> should not expect scalable performance out of them. You should take >> the 'max' performance you can get out out them - and then look for >> scalability with multiple nodes. >> >> Satish >> >> On Wed, 18 Nov 2009, Jed Brown wrote: >> >> > jarunan at ascomp.ch wrote: >> > > >> > > Hello, >> > > >> > > I have read the topic about performance of a machine with 2 dual-core >> > > chips, and it is written that with -np 2 it should scale the best. I >> > > would like to ask about 4-core machine. >> > > >> > > I run the test on a quad core machine with mpiexec -n 1, 2 and 4 to >> see >> > > the parallel scaling. The cpu times of the test are: >> > > >> > > Solver/Precond/Sub_Precond >> > > >> > > gmres/bjacobi/ilu >> > > >> > > -n 1, 1917.5730 sec, >> > > -n 2, 1699.9490 sec, efficiency = 56.40% >> > > -n 4, 1661.6810 sec, efficiency = 28.86% >> > > >> > > bicgstab/asm/ilu >> > > >> > > -n 1, 1800.8380 sec, >> > > -n 2, 1415.0170 sec, efficiency = 63.63% >> > > -n 4, 1119.3480 sec, efficiency = 40.22% >> > >> > These numbers are worthless without at least knowing iteration counts. >> > >> > > Why is the scaling so low, especially with option -n 4? 
>> > > Would it be expected to be better running with real 4 CPU's instead of >> a >> > > quad core ship? >> > >> > 4 sockets using a single core each (4x1) will generally do better than >> > 2x2 or 1x4, but 4x4 costs about the same as 4x1 these days. This is a >> > very common question, the answer is that a single floating point unit is >> > about 10 times faster than memory for the sort of operations that we do >> > when solving PDE. You don't get another memory bus every time you add a >> > core so the ratio becomes worse. More cores are not a complete loss >> > because at least you get an extra L1 cache for each core, but sparse >> > matrix and vector kernels are atrocious at reusing cache (there's not >> > much to reuse because most values are only needed to perform one >> > operation). >> > >> > Getting better multicore performance requires changing the algorithms to >> > better reuse L1 cache. This means moving away from assembled matrices >> > where possible and of course finding good preconditioners. High-order >> > and fast multipole methods are good for this. But it's very much an >> > open problem and unless you want to do research in the field, you have >> > to live with poor multicore performance. >> > >> > When buying hardware, remember that you are buying memory bandwidth (and >> > a low-latency network) instead of floating point units. >> > >> > Jed >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hxie at umn.edu Wed Nov 18 14:19:59 2009 From: hxie at umn.edu (hxie at umn.edu) Date: 18 Nov 2009 14:19:59 -0600 Subject: petsc-users Digest, Vol 11, Issue 17 In-Reply-To: References: Message-ID: Hi, I want to output the true residual norm after calling KSPSolve in a fortran program. I know there is a command option '-ksp_final_residual', but I cannot do this using the PBS job submission. What is the related fortran subroutine for this? Thanks for your help. Bests, Hui From knepley at gmail.com Wed Nov 18 14:21:06 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 18 Nov 2009 14:21:06 -0600 Subject: petsc-users Digest, Vol 11, Issue 17 In-Reply-To: References: Message-ID: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPGetResidualNorm.html Matt On Wed, Nov 18, 2009 at 2:19 PM, wrote: > Hi, > > I want to output the true residual norm after calling KSPSolve in a fortran > program. I know there is a command option '-ksp_final_residual', but I > cannot do this using the PBS job submission. What is the related fortran > subroutine for this? Thanks for your help. > > Bests, > Hui > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
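From Fortran the call sequence would look something like the following
sketch (variable names are illustrative; note the caveat in Barry's reply
below about which residual this returns):

      PetscReal      rnorm
      PetscErrorCode ierr

      call KSPSolve(ksp,b,x,ierr)
      call KSPGetResidualNorm(ksp,rnorm,ierr)
      write(*,*) 'final residual norm = ',rnorm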
From jed at 59A2.org  Wed Nov 18 14:27:19 2009
From: jed at 59A2.org (Jed Brown)
Date: Wed, 18 Nov 2009 21:27:19 +0100
Subject: scaling in 4-core machine
In-Reply-To: 
Message-ID: <4B0458A7.90004@59A2.org>

Matthew Knepley wrote:
> There is also the paper by Barry, Bill, David, and Dinesh about SpMV.
> It's very good. That is what I base my slides on.
> You can see the punchline in the tutorial slides.

I like this one

  http://portal.acm.org/ft_gateway.cfm?id=370405&type=pdf

Recent work on multicore/auto-tuning

  http://crd.lbl.gov/~oliker/

Jed

From bsmith at mcs.anl.gov  Wed Nov 18 14:53:52 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 18 Nov 2009 14:53:52 -0600
Subject: petsc-users Digest, Vol 11, Issue 17
Message-ID: 

   This only gives the final residual norm as computed by the Krylov
method, so depending on your options it may be a preconditioned residual
norm. If you want the true residual norm then you should compute it via
the formula ||b - A*x|| in your Fortran code after the KSPSolve().

   Barry

On Nov 18, 2009, at 2:21 PM, Matthew Knepley wrote:

> http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPGetResidualNorm.html
>
> Matt
> [...]
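A minimal Fortran sketch of the ||b - A*x|| computation Barry describes
(assuming ksp, A, x, and b are the solver, operator, solution, and
right-hand side; names are illustrative):

      Vec            r
      PetscScalar    mone
      PetscReal      truenorm
      PetscErrorCode ierr

      mone = -1.0
      call VecDuplicate(b,r,ierr)
      call MatMult(A,x,r,ierr)          ! r = A*x
      call VecAYPX(r,mone,b,ierr)       ! r = b - A*x
      call VecNorm(r,NORM_2,truenorm,ierr)
      write(*,*) 'true residual norm = ',truenorm
      call VecDestroy(r,ierr)

Unlike KSPGetResidualNorm(), this is independent of whether the Krylov
method tracks the preconditioned residual.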
From jarunan at ascomp.ch  Thu Nov 19 02:13:55 2009
From: jarunan at ascomp.ch (jarunan at ascomp.ch)
Date: Thu, 19 Nov 2009 09:13:55 +0100
Subject: scaling in 4-core machine
In-Reply-To: <4B03C8B1.2080305@59A2.org>
Message-ID: <20091119091355.3pnblyhjf4ss0wko@webmail.ascomp.ch>

>> I run the test on a quad core machine with mpiexec -n 1, 2 and 4 to see
>> the parallel scaling. The cpu times of the test are:
>>
>> Solver/Precond/Sub_Precond
>>
>> gmres/bjacobi/ilu
>>
>> -n 1, 1917.5730 sec,
>> -n 2, 1699.9490 sec, efficiency = 56.40%
>> -n 4, 1661.6810 sec, efficiency = 28.86%
>>
>> bicgstab/asm/ilu
>>
>> -n 1, 1800.8380 sec,
>> -n 2, 1415.0170 sec, efficiency = 63.63%
>> -n 4, 1119.3480 sec, efficiency = 40.22%
>
> These numbers are worthless without at least knowing iteration counts.

I cannot show the iteration counts, as I run 50 time steps with a maximum
of 10 iterations per time step and 200 sweeps (the maxit argument of
KSPSetTolerances()).

I also tested our in-house solver on the same machine; it is much slower
than the PETSc solver but scales better.

-n 1, 10022.8360 sec,
-n 2, 5684.1490 sec, efficiency = 88%
-n 4, 4067.0480 sec, efficiency = 61.61%

efficiency = 100*Speedup/nproc
Speedup = (cpu time of -n 1)/(cpu time of nproc)

If anyone has done parallel scaling tests, it would be very kind of you
to share the results.

Regards,
Jarunan

From hxie at umn.edu  Thu Nov 19 13:14:57 2009
From: hxie at umn.edu (hxie at umn.edu)
Date: 19 Nov 2009 13:14:57 -0600
Subject: Change orthogonalization option in fortran?
Message-ID: 

Hi,

I want to change the orthogonalization method for the default ksp solver
in Fortran. I added the following in my code:
--------------
call KSPGMRESSetOrthogonalization(ksp,KSPGMRESClassicalGramSchmidtOrthogonalization,pterr)
call KSPGMRESSetCGSRefinementType(ksp,KSP_GMRES_CGS_REFINEMENT_IFNEEDED,pterr)
--------------

And I got the following error messages when compiling.
--------------
error #6404: This name does not have a type, and must have an explicit type.   [KSPGMRESCLASSICALGRAMSCHMIDTORTHOGONALIZATI]
call KSPGMRESSetOrthogonalization(ksp,KSPGMRESClassicalGramSchmidtOrthogonalization,pterr)

error #6404: This name does not have a type, and must have an explicit type.   [KSP_GMRES_CGS_REFINEMENT_IFNEEDED]
call KSPGMRESSetCGSRefinementType(ksp,KSP_GMRES_CGS_REFINEMENT_IFNEEDED,pterr)
--------------

If I comment out these two lines, the code compiles fine. Any idea what
is wrong? Thanks.

Bests,
Hui

From jarunan at ascomp.ch  Fri Nov 20 03:26:02 2009
From: jarunan at ascomp.ch (jarunan at ascomp.ch)
Date: Fri, 20 Nov 2009 10:26:02 +0100
Subject: scaling in 4-core machine: unassembled structured
In-Reply-To: <4B03C8B1.2080305@59A2.org>
Message-ID: <20091120102602.nwotyi9tc8o4kwks@webmail.ascomp.ch>

>
> Getting better multicore performance requires changing the algorithms to
> better reuse L1 cache.  This means moving away from assembled matrices
> where possible and of course finding good preconditioners.

I do not know how to move away from an assembled matrix. Since I have to
reset the values of the matrix in each iteration, I am obliged to call
MatAssemblyBegin() and MatAssemblyEnd(). Is there another option to create
the matrix and set its values?

> High-order and fast multipole methods are good for this.

For example, please?

Jarunan

-- 
Jarunan Panyasantisuk
Development Engineer
ASCOMP GmbH, Technoparkstr. 1
CH-8005 Zurich, Switzerland
Phone : +41 44 445 4072
Fax   : +41 44 445 4075
E-mail: jarunan at ascomp.ch
www.ascomp.ch
From jed at 59A2.org  Fri Nov 20 04:24:29 2009
From: jed at 59A2.org (Jed Brown)
Date: Fri, 20 Nov 2009 11:24:29 +0100
Subject: scaling in 4-core machine: unassembled structured
In-Reply-To: <20091120102602.nwotyi9tc8o4kwks@webmail.ascomp.ch>
Message-ID: <4B066E5D.7050709@59A2.org>

jarunan at ascomp.ch wrote:
>> Getting better multicore performance requires changing the algorithms to
>> better reuse L1 cache.  This means moving away from assembled matrices
>> where possible and of course finding good preconditioners.
>
> I do not know how to move away from an assembled matrix. [...]

A matrix is just a linear operation.  What I mean by not assembling is
that you no longer define that operation in terms of matrix entries.  A
DFT is a famous example of a linear operation that should not be
represented in terms of matrix entries; instead it should be implemented
by FFT.  How to do this is highly dependent on discretization and
physics; good preconditioners almost always require assembled matrices
somewhere, but it's often possible to assemble something cheaper than
the real Jacobian.

>> High-order and fast multipole methods are good for this.
>
> For example, please?

Spectral element methods implement certain operations by exploiting a
tensor product structure which turns O(p^6) memory, O(p^6) flops into
O(p^3) memory, O(p^4) flops (with larger constants).  Matt has been
doing some work with FMM.  The key is to choose algorithms that do more
work on the CPU for each value loaded from memory.  I have some slides
on the subject,

  http://59A2.org/files/Labs09-Dohp.pdf

you could also take a look at slides from this mini-course (we did
high-order methods on the last day)

  http://59A2.org/newton-krylov

I can send you more technical references if you would like.  Finally,
if you are in Zürich, we can talk about it sometime (I'm at ETH).

Jed
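One concrete way to "not assemble" in PETSc is a shell matrix, where you
supply only the action of the operator. A minimal C sketch (the sizes,
context, and the assembled preconditioning matrix Pmat are placeholders,
and petsc-3.0 calling sequences are assumed):

  #include "petscksp.h"

  /* apply y = A*x without storing the entries of A */
  PetscErrorCode MyMatMult(Mat A,Vec x,Vec y)
  {
    void           *ctx;
    PetscErrorCode ierr;

    ierr = MatShellGetContext(A,&ctx);CHKERRQ(ierr);
    /* ... apply the discrete operator to x, writing the result into y ... */
    return 0;
  }

  Mat A;
  ierr = MatCreateShell(PETSC_COMM_WORLD,m,n,M,N,userctx,&A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(A,MATOP_MULT,(void(*)(void))MyMatMult);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,Pmat,SAME_NONZERO_PATTERN);CHKERRQ(ierr);

As Jed notes, the preconditioner still usually wants some assembled
matrix somewhere; that is the role of the (possibly cheaper) Pmat here.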
From likask at civil.gla.ac.uk  Fri Nov 20 11:22:11 2009
From: likask at civil.gla.ac.uk (Lukasz Kaczmarczyk)
Date: Fri, 20 Nov 2009 17:22:11 +0000
Subject: suprelu
Message-ID: 

Hello,
I am trying to configure petsc (p9) with SuperLU and I get this error
message:

*********************************************************************************
         UNABLE to CONFIGURE with GIVEN OPTIONS    (see configure.log for details):
---------------------------------------------------------------------------------------
Error unzipping _d_SuperLU_DIST.tar.gz: Could not execute 'cd /Users/likask/MyBuild/src/petsc-3.0.0-p9/externalpackages; gunzip _d_SuperLU_DIST.tar.gz':

gzip: _d_SuperLU_DIST.tar.gz: not in gzip format
*********************************************************************************

Regards,
Lukasz

From balay at mcs.anl.gov  Fri Nov 20 11:31:30 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 20 Nov 2009 11:31:30 -0600 (CST)
Subject: suprelu
Message-ID: 

Works fine for me. Perhaps you can retry after 'rm -rf externalpackages'.

If the problem persists - send configure.log to petsc-maint at mcs.anl.gov

Satish

On Fri, 20 Nov 2009, Lukasz Kaczmarczyk wrote:

> Hello
> I am trying to configure petsc (p9) with SuperLU and I get this error
> message [...]

From bsmith at mcs.anl.gov  Fri Nov 20 11:33:26 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 20 Nov 2009 11:33:26 -0600
Subject: suprelu
Message-ID: 

   Please send configure.log to petsc-maint at mcs.anl.gov

   It looks like you have a corrupt tar.gz file; I recommend deleting it
and getting it again.

   Barry

On Nov 20, 2009, at 11:22 AM, Lukasz Kaczmarczyk wrote:

> Hello
> I am trying to configure petsc (p9) with SuperLU and I get this error
> message [...]

From bsmith at mcs.anl.gov  Fri Nov 20 13:07:35 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 20 Nov 2009 13:07:35 -0600
Subject: Change orthogonalization option in fortran?
Message-ID: <765E16C6-1159-4FA1-9626-B284ED6FD87B@mcs.anl.gov>

   Sorry, we don't have the Fortran interfaces for these operations.
You can use

   call PetscOptionsSetValue("-ksp_gmres_classicalgramschmidt",PETSC_NULL_CHARACTER,ierr)
   call PetscOptionsSetValue("-ksp_gmres_cgs_refinement_type","REFINE_IFNEEDED",ierr)

before creating the KSP object.

   Barry

On Nov 19, 2009, at 1:14 PM, hxie at umn.edu wrote:

> Hi,
>
> I want to change the orthogonalization method for the default ksp
> solver in Fortran. [...]
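In a Fortran code these calls would sit before the KSP is created and
configured, since KSPSetFromOptions() is what reads the options database;
a minimal sketch (error checking omitted):

      KSP            ksp
      PetscErrorCode ierr

      call PetscOptionsSetValue('-ksp_gmres_classicalgramschmidt',PETSC_NULL_CHARACTER,ierr)
      call PetscOptionsSetValue('-ksp_gmres_cgs_refinement_type','REFINE_IFNEEDED',ierr)
      call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
      call KSPSetFromOptions(ksp,ierr)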
From likask at civil.gla.ac.uk  Fri Nov 20 13:43:29 2009
From: likask at civil.gla.ac.uk (Lukasz Kaczmarczyk)
Date: Fri, 20 Nov 2009 19:43:29 +0000
Subject: suprelu
Message-ID: 

Thanks for the help. Strangely enough, it is working now.

Regards,
Lukasz

On 20 Nov 2009, at 17:33, Barry Smith wrote:

> Please send configure.log to petsc-maint at mcs.anl.gov
>
> It looks like you have a corrupt tar.gz file [...]

Lukasz Kaczmarczyk
Lecturer
Department of Civil Engineering, University of Glasgow, GLASGOW, G12 8LT
Tel: +44 141 3305348
email: likask at civil.gla.ac.uk
web: http://www.civil.gla.ac.uk/~kaczmarczyk/
web: http://code.google.com/p/yaffems/

From ajs at craft-tech.com  Fri Nov 20 15:25:52 2009
From: ajs at craft-tech.com (Srinivasan Arunajatesan)
Date: Fri, 20 Nov 2009 16:25:52 -0500
Subject: Petsc-Fun3d
Message-ID: <4B070960.9040801@craft-tech.com>

Is it possible to get hold of the source code for the Petsc-fun3d code?

-- 
Srinivasan Arunajatesan, PhD.
Senior Research Scientist
Combustion Research and Flow Technology, Inc.
6210 Keller's Church Road,
Pipersville, PA 18947.

email : ajs at craft-tech.com
Tel.  : 215 766 1520
Fax.  : 215 766 1524

From bsmith at mcs.anl.gov  Fri Nov 20 15:31:44 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 20 Nov 2009 15:31:44 -0600
Subject: Petsc-Fun3d
In-Reply-To: <4B070960.9040801@craft-tech.com>
Message-ID: <21CED7DE-6BBA-46C4-89BF-A2D73D3C145D@mcs.anl.gov>

   It is in src/contrib/fun3d. The difficulty is that we cannot give you
the grids for the large problems that we have used; those were created by
NASA and it doesn't want them handed out.
You are, of course, free to use your own grids; you just need to figure
out the structure of the grid files, which you can do by looking at the
source code.

   Barry

On Nov 20, 2009, at 3:25 PM, Srinivasan Arunajatesan wrote:

> Is it possible to get hold of the source code for the Petsc-fun3d code?
> [...]

From denist at al.com.au  Sun Nov 22 20:29:05 2009
From: denist at al.com.au (Denis Teplyashin)
Date: Mon, 23 Nov 2009 13:29:05 +1100
Subject: DA memory consumption
Message-ID: <4B09F371.3000600@al.com.au>

Hi guys,

I'm a bit confused by distributed array memory consumption. I did a
simple test like this one:

  ierr = DACreate3d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_BOX,
                    1000, 1000, 1000, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                    1, 1, PETSC_NULL, PETSC_NULL, PETSC_NULL, &da);

and then checked memory with PetscMemoryGetCurrentUsage and
PetscMemoryGetMaximumUsage. Running this test using mpi on one core gives
me this result: current usage 3818Mb and maximum usage 7633Mb. And this
is the result after creating just a DA, without actual vectors. Running
the same test on two cores gives me an even more interesting result:
rank 0 - 9552/11463Mb and rank 1 - 5735/5732Mb.

Is this what I should expect in general, or am I doing something wrong?
Is there a simple formula which could show how much memory I would need
to allocate an array with a given resolution?

Thanks in advance,
Denis

From knepley at gmail.com  Sun Nov 22 20:41:24 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 22 Nov 2009 20:41:24 -0600
Subject: DA memory consumption
In-Reply-To: <4B09F371.3000600@al.com.au>
Message-ID: 

It is not simple, but it is scalable, meaning in the limit of large N the
memory will be constant on each processor. When it is created, the
VecScatter objects mapping global to local vecs are created.

   Matt

On Sun, Nov 22, 2009 at 8:29 PM, Denis Teplyashin wrote:

> I'm a bit confused by distributed array memory consumption. [...]

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
From denist at al.com.au  Sun Nov 22 21:34:32 2009
From: denist at al.com.au (Denis Teplyashin)
Date: Mon, 23 Nov 2009 14:34:32 +1100
Subject: DA memory consumption
Message-ID: <4B0A02C8.8050709@al.com.au>

[An HTML attachment was scrubbed; Denis's text is quoted in the reply below.]

From bsmith at mcs.anl.gov  Sun Nov 22 21:47:53 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 22 Nov 2009 21:47:53 -0600
Subject: DA memory consumption
In-Reply-To: <4B0A02C8.8050709@al.com.au>
Message-ID: 

   Sometimes computing can be an experimental science. Run the same size
DA on 1, 2, 4, 8, 16, 32 processes, gather the information about memory
usage, and make a little table. Here is what you should find. The amount
of memory depends on the local size of the array, which for your example
below is 1000*1000*1000 on one process. Thus you will see that as you
increase the number of processes, the space needed per process for the DA
decreases. It increases from 1 process to 2 because it needs all the
ghost point data and the VecScatter that are not needed on 1.

   Note also that on one process a SINGLE vector for this size mesh is 8
gigabytes, so the DA is really not much of a pig, since it is less than
one vector.

   Barry

On Nov 22, 2009, at 9:34 PM, Denis Teplyashin wrote:

> So this sort of memory consumption is expected? Is it possible to
> reduce it somehow? I'm not sure about the underlying petsc objects, but
> it looks like these additional objects require more memory than the
> actual vector itself.
>
> Cheers,
> Denis
>
> Matthew Knepley wrote:
>> It is not simple, but it is scalable [...]
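A self-contained sketch of that experiment (petsc-3.0-era DA calls
assumed; run with mpiexec -n 1, 2, 4, ... and tabulate the per-rank
numbers):

  #include "petscda.h"

  int main(int argc,char **argv)
  {
    DA             da;
    PetscLogDouble cur,max;
    PetscMPIInt    rank;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,0,0);CHKERRQ(ierr);
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
    ierr = DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,
                      1000,1000,1000,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
                      1,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&da);CHKERRQ(ierr);
    ierr = PetscMemoryGetCurrentUsage(&cur);CHKERRQ(ierr);
    ierr = PetscMemoryGetMaximumUsage(&max);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
             "[%d] current %g Mb, max %g Mb\n",rank,cur/1.e6,max/1.e6);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
    ierr = DADestroy(da);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }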
From craig-tanis at utc.edu  Mon Nov 23 15:12:02 2009
From: craig-tanis at utc.edu (Craig Tanis)
Date: Mon, 23 Nov 2009 16:12:02 -0500
Subject: pre-decomposed domains
Message-ID: 

I have an existing MPI code that builds a linear system corresponding to
an unstructured mesh. I'm hoping that I can change my code to work with
PETSc, but I'm not sure the domain decomposition scheme is compatible.

The big problem seems to be that my domains are not guaranteed to have
contiguous global node ids. How can I specify explicitly which processor
owns which node/vector element (for the purposes of ghost-node
synchronization)?

Thanks for your help,
Craig Tanis

From jed at 59A2.org  Mon Nov 23 15:25:29 2009
From: jed at 59A2.org (Jed Brown)
Date: Mon, 23 Nov 2009 22:25:29 +0100
Subject: pre-decomposed domains
Message-ID: <878wdxj5me.fsf@59A2.org>

On Mon, 23 Nov 2009 16:12:02 -0500, Craig Tanis wrote:
> The big problem seems to be that my domains are not guaranteed to have
> contiguous global node ids. How can I specify explicitly which
> processor owns which node/vector element (for the purposes of
> ghost-node synchronization)?

PETSc matrices require that each process has contiguous rows, so your
numbering scheme will have to be changed for matrix insertion.  But the
code that handles your physics does not need to be changed, you just
encode the different numbering in a scatter.

Jed

From bsmith at mcs.anl.gov  Mon Nov 23 15:35:44 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 23 Nov 2009 15:35:44 -0600
Subject: pre-decomposed domains
In-Reply-To: <878wdxj5me.fsf@59A2.org>
Message-ID: 

   Take a look at the manual page for AO. This provides a mechanism for
renumbering the nodes (and references to the nodes) into what PETSc
needs. Then you just assemble the matrix and vectors using the new PETSc
numbering. Or you can do the renumbering yourself.

   Note that renumbering doesn't mean moving any data between processes;
you use the data layout you already have, you just change the "names" of
the nodes.

   Barry

On Nov 23, 2009, at 3:25 PM, Jed Brown wrote:

> PETSc matrices require that each process has contiguous rows, so your
> numbering scheme will have to be changed for matrix insertion. [...]
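A minimal C sketch of the AO approach (the application IDs are made up
for illustration; passing PETSC_NULL for the PETSc indices asks
AOCreateBasic to pair the given IDs with the natural contiguous ordering,
an assumption to verify against the man page):

  AO             ao;
  PetscInt       app[2],rows[2];
  PetscErrorCode ierr;

  /* the global application IDs of the two nodes this process owns */
  app[0] = 7; app[1] = 3;
  ierr = AOCreateBasic(PETSC_COMM_WORLD,2,app,PETSC_NULL,&ao);CHKERRQ(ierr);

  /* translate application indices in place before MatSetValues() etc. */
  rows[0] = 7; rows[1] = 3;
  ierr = AOApplicationToPetsc(ao,2,rows);CHKERRQ(ierr);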
From achatter at cse.psu.edu  Tue Nov 24 09:12:36 2009
From: achatter at cse.psu.edu (Anirban Chatterjee)
Date: Tue, 24 Nov 2009 10:12:36 -0500
Subject: cannot convert '_p_Vec* const' to '_p_VecScatter*' for argument '1' to 'PetscErrorCode
Message-ID: <4B0BF7E4.8030208@cse.psu.edu>

Hi,

I am trying to use Steve Wright's OOQP with PETSc support and get this
error:

  cannot convert '_p_Vec* const' to '_p_VecScatter*' for argument '1' to 'PetscErrorCode'

Can anyone tell me why I am getting this error? I am getting it in the
VecScatterBegin function, where the first argument type is Vec.

Thanks,
Anirban

From balay at mcs.anl.gov  Tue Nov 24 09:21:23 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 24 Nov 2009 09:21:23 -0600 (CST)
Subject: cannot convert '_p_Vec* const' to '_p_VecScatter*' for argument '1' to 'PetscErrorCode
In-Reply-To: <4B0BF7E4.8030208@cse.psu.edu>
Message-ID: 

I guess OOQP is not updated to use the latest petsc version - where the
first argument is VecScatter.

Is this the only error you get? If so - it's easy to just fix the code to
use petsc-3:

  VecScatterBegin(Vec,Vec,InsertMode,ScatterMode,VecScatter);

changed to:

  VecScatterBegin(VecScatter,Vec,Vec,InsertMode,ScatterMode);

Satish

On Tue, 24 Nov 2009, Anirban Chatterjee wrote:

> I am trying to use Steve Wright's OOQP with PETSc support and get this
> error [...]

From achatter at cse.psu.edu  Tue Nov 24 09:43:43 2009
From: achatter at cse.psu.edu (Anirban Chatterjee)
Date: Tue, 24 Nov 2009 10:43:43 -0500
Subject: cannot convert '_p_Vec* const' to '_p_VecScatter*' for argument '1' to 'PetscErrorCode
Message-ID: <4B0BFF2F.8020005@cse.psu.edu>

Hi Satish,

Yes, that fixes it. I can install OOQP with petsc-2.3.0 without a problem.
But after fixing the VecScatter problem, if I try petsc-3.0 I end up with
a linker error in src/QpBound/QpBoundPetsc.o: "undefined reference to
PetscIterativeSolver::PetscIterativeSolver". I cannot understand this. I
have to check it.

--Anirban

Satish Balay wrote:
> I guess OOQP is not updated to use the latest petsc version - where
> the first argument is VecScatter. [...]
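For reference, a call site would change along these lines (a sketch with
illustrative variable names; VecScatterEnd changed the same way):

  /* petsc-2.3.x style */
  ierr = VecScatterBegin(xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD,scatter);
  ierr = VecScatterEnd(xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD,scatter);

  /* petsc-3.0 style: the VecScatter moves to the first argument */
  ierr = VecScatterBegin(scatter,xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD);
  ierr = VecScatterEnd(scatter,xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD);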
From fpacull at fluorem.com  Tue Nov 24 10:46:01 2009
From: fpacull at fluorem.com (francois pacull)
Date: Tue, 24 Nov 2009 17:46:01 +0100
Subject: PCFieldSplit and Schur
Message-ID: <4B0C0DC9.3060602@fluorem.com>

Dear PETSc team,

I am having a little bit of trouble with the Schur option of
PCFieldSplit: I would like to apply different treatments to the first
five and the last two of the seven equations at each grid node of a CFD
linear system...

In the present case, the KSP is "gmres" (or "fgmres" if the PC is
changing), the PC is "asm" (the subdomain IS is defined with
PCASMSetLocalSubdomains in order to always include all the fields
associated with the grid points of the overlap), the SUBKSP is "preonly"
and the SUBPC is "fieldsplit".

So far the FieldSplit options PC_COMPOSITE_ADDITIVE and
PC_COMPOSITE_MULTIPLICATIVE work well with fieldsplit_0 and fieldsplit_1
set to KSPPREONLY and PCILU in the following way:

  PCFieldSplitGetSubKSP(SubPC,number_of_split,&FieldSplitKSP);
  ... [for the fields 0,1,2,3,4 (fieldsplit_0)]
  KSPSetType(FieldSplitKSP[0],KSPPREONLY);
  KSPGetPC(FieldSplitKSP[0],&FieldSplitPC0);
  PCSetType(FieldSplitPC0,PCILU);
  ... [for the fields 5 and 6 (fieldsplit_1)]
  KSPSetType(FieldSplitKSP[1],KSPPREONLY);
  KSPGetPC(FieldSplitKSP[1],&FieldSplitPC1);
  PCSetType(FieldSplitPC1,PCILU);
  ...

They also work well when I set FieldSplitKSP[0] and/or FieldSplitKSP[1]
to KSPGMRES. However, I always get a segmentation violation when I try
to use the FieldSplit option PC_COMPOSITE_SCHUR... I guess that I am
missing something: are there more options to set in this Schur case?

So far I did not turn on the Schur complement preconditioner option:
PCFieldSplitSchurPrecondition(subpc,PETSC_FALSE);

Is there a book or article describing the methods implemented in this
FieldSplit option?

The version is petsc-3.0.0-p7...

Thank you for your help,
Regards,
francois pacull.

From knepley at gmail.com  Tue Nov 24 11:10:51 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 24 Nov 2009 11:10:51 -0600
Subject: PCFieldSplit and Schur
In-Reply-To: <4B0C0DC9.3060602@fluorem.com>
Message-ID: 

Can you use the debugger to get a stack trace?

   Matt

On Tue, Nov 24, 2009 at 10:46 AM, francois pacull wrote:

> I am having a little bit of trouble with the Schur option of
> PCFieldSplit [...]

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

From jed at 59A2.org  Tue Nov 24 11:35:31 2009
From: jed at 59A2.org (Jed Brown)
Date: Tue, 24 Nov 2009 18:35:31 +0100
Subject: PCFieldSplit and Schur
In-Reply-To: <4B0C0DC9.3060602@fluorem.com>
Message-ID: <87y6lvygf0.fsf@59A2.org>

On Tue, 24 Nov 2009 17:46:01 +0100, francois pacull wrote:

> PCFieldSplitGetSubKSP(SubPC,number_of_split,&FieldSplitKSP);

Note that Schur currently only works when number_of_split=2.

I think this is the culprit:

> PCSetType(FieldSplitPC1,PCILU);
                          ^^^^^^
Here you are trying to precondition the Schur complement, but ...
> So far I did not turn on the Schur complement preconditioner option:
> PCFieldSplitSchurPrecondition(subpc,PETSC_FALSE);

There should be a check for this so that you get a better error.  When
not PCFieldSplitSchurPrecondition, we set PCNONE for the Schur
complement.

> Is there a book or article describing the methods implemented in this
> FieldSplit option?

Not really, it implements "physics-based" relaxation
(additive/multiplicative) and factorization (Schur); the former is a
fairly traditional idea, for the latter you generally have to look at
references for your application area.  We might be able to refer you to
specific references if you explain the physics you are working with.

There are two classes of problems for which the factorization option is
to be preferred (because usually nothing else works):

* indefinite systems such as incompressible flow or LNK optimization

* stiff wave problems like shallow water, low-Mach gas dynamics, MHD

A while ago, I started extending PCFieldSplit to do hybrid factorization
and relaxation with no restriction on the number of splits, and with the
appropriate hooks to get preconditioners in all the places you might
want them.  Sadly, I have not yet gotten it cleaned up enough to push to
PETSc-dev, but it is still on my to do list.

Jed

From fpacull at fluorem.com  Tue Nov 24 11:38:26 2009
From: fpacull at fluorem.com (francois pacull)
Date: Tue, 24 Nov 2009 18:38:26 +0100
Subject: PCFieldSplit and Schur
Message-ID: <4B0C1A12.40108@fluorem.com>

Yes, here is what I am getting.
francois.

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[1]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors
[1]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[1]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR: INSTEAD the line number of the start of the function is given.
[1]PETSC ERROR: INSTEAD the line number of the start of the function is given.
[0]PETSC ERROR: [0] PCFieldSplitGetSubKSP_FieldSplit_Schur line 751 src/ksp/pc/impls/fieldsplit/fieldsplit.c
[1]PETSC ERROR: [1] PCFieldSplitGetSubKSP_FieldSplit_Schur line 751 src/ksp/pc/impls/fieldsplit/fieldsplit.c
[0]PETSC ERROR: [0] PCFieldSplitGetSubKSP line 951 src/ksp/pc/impls/fieldsplit/fieldsplit.c
[1]PETSC ERROR: [1] PCFieldSplitGetSubKSP line 951 src/ksp/pc/impls/fieldsplit/fieldsplit.c
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[1]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[1]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul  6 11:33:34 CDT 2009
[1]PETSC ERROR: Petsc Release Version 3.0.0, Patch 7, Mon Jul  6 11:33:34 CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[1]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[1]PETSC ERROR: See docs/index.html for manual pages.
...

Matthew Knepley wrote:
> Can you use the debugger to get a stack trace?
>
> Matt
>
> On Tue, Nov 24, 2009 at 10:46 AM, francois pacull wrote:
>
>> I am having a little bit of trouble with the Schur option of
>> PCFieldSplit [...]
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
From fpacull at fluorem.com  Tue Nov 24 12:24:48 2009
From: fpacull at fluorem.com (francois pacull)
Date: Tue, 24 Nov 2009 19:24:48 +0100
Subject: PCFieldSplit and Schur
In-Reply-To: <87y6lvygf0.fsf@59A2.org>
Message-ID: <4B0C24F0.2070608@fluorem.com>

Thank you for your email Jed, this is very helpful.

Just in case you would know some specific references: the physics we are
working with are very common, turbulent low-Mach gas dynamics. Since part
of the system's stiffness is created by the turbulent variables, I am
trying to couple a first system with 5 fields (density, momentum, energy)
and a second one with 2 fields (turbulence), within the preconditioner.
The motivation is to reduce the factorization memory requirement,
compared to when the fields are not split.

Regards,
francois pacull.

Jed Brown wrote:
> Note that Schur currently only works when number_of_split=2. [...]

From jed at 59A2.org  Tue Nov 24 13:30:38 2009
From: jed at 59A2.org (Jed Brown)
Date: Tue, 24 Nov 2009 20:30:38 +0100
Subject: PCFieldSplit and Schur
In-Reply-To: <4B0C24F0.2070608@fluorem.com>
Message-ID: <87vdgzyb35.fsf@59A2.org>

On Tue, 24 Nov 2009 19:24:48 +0100, francois pacull wrote:

> the physics we are working with are very common, turbulent low-Mach gas
> dynamics. Since part of the system's stiffness is created by the
> turbulent variables,

So this depends somewhat on the turbulence model, but it is normally
some sort of advection-diffusion system.
> I am trying to couple a first system with 5 fields (density, momentum,
> energy) and a second one with 2 fields (turbulence), within the
> preconditioner.

Ah, if you split the turbulence variables then you still have to deal
with acoustics in the bs=5 split.

As an experimental heuristic (one I'm trying to formulate) for how to
split a stiff wave system (at least with a conservative formulation),
try this.  Identify the characteristic of your largest eigenvector.  If
this involves a single subsystem, say it is [0, 0, 0, 1, -1], then you
should be able to do one multiplicative split ([a,b,c], [d,e]) as long
as you can find a good preconditioner for the [d,e] part.  In this last
part, the fastest wave has inter-field coupling, so you could split this
with factorization.  Note that you could also have done just one split
([a,b,c,d],[e]), and this would have little impact on the Schur
complement in the [e] block.  If instead the characteristic looked like
[0.3, 0.4, 0.8, 0.4, -1] then all the fields participate in the fastest
wave, and the natural (and I hypothesize, mandatory) split would be
([a,b,c,d],[e]).

So in light of this framework, let's consider your system.  Since the
velocity is much slower than acoustics, you should be able to split off
the turbulence model using relaxation (the advection is only as fast as
the velocity; if the turbulence model is as stiff as acoustics then the
diffusive part must be the culprit, but that is self-coupling).

Here is one recent paper on JFNK applied to low-Mach compressible flow

@article{park2008jacobian,
  title={{Jacobian-free Newton Krylov discontinuous Galerkin method and physics-based preconditioning for nuclear reactor simulations}},
  author={Park, H.K. and Nourgaliev, R.R. and Martineau, R.C. and Knoll, D.A.},
  year={2008},
  publisher={INL/CON-08-14243, Idaho National Laboratory (INL)},
  url={http://www.osti.gov/bridge/servlets/purl/940059-ITD6J2/940059.pdf}
}

> The motivation is to reduce the factorization memory requirement,
> compared to when the fields are not split.

Incomplete or full factorization?  Does ASM actually work?  If so, then
I'm skeptical that PCFieldSplit-Schur will gain you anything; you should
be fine to split off the turbulence variables (multigrid might even work
fine for them), and use a cheaper preconditioner on the rest of the
system.

Finally, I would be interested to know how you fare.  In particular,
whether you find that the heuristic arguments above actually correlate
with efficient algorithms.

Jed
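For the 5+2 grouping itself, defining the splits would look something
like this C sketch (petsc-3.0 signatures assumed; the variable names
follow the earlier message, and the grouping into two splits is the only
essential part):

  PetscInt flow[] = {0,1,2,3,4};   /* density, momentum, energy */
  PetscInt turb[] = {5,6};         /* turbulence variables */

  ierr = PCSetType(SubPC,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetBlockSize(SubPC,7);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(SubPC,5,flow);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(SubPC,2,turb);CHKERRQ(ierr);
  ierr = PCFieldSplitSetType(SubPC,PC_COMPOSITE_MULTIPLICATIVE);CHKERRQ(ierr);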
From irfan.khan at gatech.edu  Tue Nov 24 18:59:23 2009
From: irfan.khan at gatech.edu (irfan.khan at gatech.edu)
Date: Tue, 24 Nov 2009 19:59:23 -0500 (EST)
Subject: Multiple communicators
Message-ID: <2011669276.1459301259110763060.JavaMail.root@mail8.gatech.edu>

Hello,

Does the procedure for providing command line options and obtaining
profiling data through -log_summary change if there are multiple MPI
communicators in a petsc code?

If so, can somebody point me to information or references on how to
provide command line options when there are multiple communicators?

Thank you
Irfan

-- 
PhD Candidate
G.W. Woodruff School of Mechanical Engineering
Georgia Institute of Technology
Atlanta, GA (30332)

From knepley at gmail.com  Tue Nov 24 19:03:00 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 24 Nov 2009 19:03:00 -0600
Subject: Multiple communicators
In-Reply-To: <2011669276.1459301259110763060.JavaMail.root@mail8.gatech.edu>
Message-ID: 

There is always WORLD, so it should be fine unless I misunderstand your
question.

   Matt

On Tue, Nov 24, 2009 at 6:59 PM, wrote:

> Does the procedure for providing command line options and obtaining
> profiling data through -log_summary change if there are multiple MPI
> communicators in a petsc code? [...]

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

From irfan.khan at gatech.edu  Tue Nov 24 19:13:36 2009
From: irfan.khan at gatech.edu (irfan.khan at gatech.edu)
Date: Tue, 24 Nov 2009 20:13:36 -0500 (EST)
Subject: Multiple communicators
In-Reply-To: <84104561.1459831259111212973.JavaMail.root@mail8.gatech.edu>
Message-ID: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu>

I use separate communicators for the fluid and solid ranks in a
fluid-structure interaction code. I use PETSc tools (KSP) to solve for
the solid phase (FEA), which is carried out under a different
communicator (FEA_Comm).

Unless I am doing something wrong, I have found that the command line
options -ksp_type and -mat_partitioning_type don't work if there are 2
communicators, but for a single communicator they work. Please see the
attached files containing the output of -log_summary for 2 different
codes (scroll down to the end), one with a single communicator and the
other with two communicators. Both outputs have been generated with the
same PETSc compilation.

Further, -log_summary is not able to detect all the events that have
been registered if they are not in the WORLD communicator.

Thank you
Irfan

[Attachments scrubbed: 1communicator.out, 2communicators.out]

From denist at al.com.au  Tue Nov 24 19:21:34 2009
From: denist at al.com.au (Denis Teplyashin)
Date: Wed, 25 Nov 2009 12:21:34 +1100
Subject: DA memory consumption
Message-ID: <4B0C869E.3000907@al.com.au>

[An HTML attachment was scrubbed.]
Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta, GA (30332) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1communicator.out Type: application/octet-stream Size: 13888 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 2communicators.out Type: application/octet-stream Size: 20399 bytes Desc: not available URL: From denist at al.com.au Tue Nov 24 19:21:34 2009 From: denist at al.com.au (Denis Teplyashin) Date: Wed, 25 Nov 2009 12:21:34 +1100 Subject: DA memory consumption In-Reply-To: References: <4B09F371.3000600@al.com.au> <4B0A02C8.8050709@al.com.au> Message-ID: <4B0C869E.3000907@al.com.au> An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 24 19:28:54 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 24 Nov 2009 19:28:54 -0600 Subject: Multiple communicators In-Reply-To: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> References: <84104561.1459831259111212973.JavaMail.root@mail8.gatech.edu> <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Tue, Nov 24, 2009 at 7:13 PM, wrote: > I use seperete communicators for fluid and solid ranks in a fluid-structure > interaction code. I use PETSc tools (KSP) to solve for the solid phase (FEA) > which is carried out under a different communicator (FEA_Comm). > > Unless I am doing something wrong, I have found that the command line > options: -ksp_type; -mat_partitioning_type, don't work if there are 2 > communicators. But for single communicator they work. Please see the > attached files containing output of -log_summary for 2 different codes > (scroll down to the end). One with single communicator and the other with > two communicators. Both the output have been generated with the same PETSc > compilation. > > Futher -log_summary results are not able to detect all the events that have > been registered if they are not in the WORLD communicator. > I am not sure what you are doing, but something is wrong in this code. Using different communicators does not have much to do with options. Any object (like a KSP) can be created using a subcomm of WORLD. Are you setting PETSC_COMM_WORLD to something different? That is not necessary here. Matt > Thank you > Irfan > > ----- Original Message ----- > From: "Matthew Knepley" > To: "PETSc users list" > Sent: Tuesday, November 24, 2009 8:03:00 PM GMT -05:00 US/Canada Eastern > Subject: Re: Multiple communicators > > On Tue, Nov 24, 2009 at 6:59 PM, wrote: > >> Hello >> >> Does the procedure for providing command line options and to obtain >> profiling data through -log_summary change if there are multiple MPI >> communicators in a petsc code? >> > > There is always WORLD, so it should be fine unless I misunderstand your > question. > > Matt > > >> If so, can somebody point me to information, references on how to provide >> command line options if there are multiple communicators? >> >> Thank you >> Irfan >> >> -- >> PhD Candidate >> G.W. Woodruff School of Mechanical Engineering >> Georgia Institute of Technology >> Atlanta, GA (30332) >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- > PhD Candidate > G.W. 
Woodruff School of Mechanical Engineering > Georgia Institute of Technology > Atlanta, GA (30332) > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From irfan.khan at gatech.edu Tue Nov 24 19:56:12 2009 From: irfan.khan at gatech.edu (irfan.khan at gatech.edu) Date: Tue, 24 Nov 2009 20:56:12 -0500 (EST) Subject: Multiple communicators In-Reply-To: <1774530298.1463411259114157468.JavaMail.root@mail8.gatech.edu> Message-ID: <252444921.1463431259114172374.JavaMail.root@mail8.gatech.edu> I don't explicitly use the communicator PETSC_COMM_WORLD. Instead I used MPI_COMM_WORLD, which according to the manual should be the same. However, I create two new communicator from MPI_COMM_WORLD using MPI_Comm_split(). LBM_Comm and FEA_Comm. Only the ranks in FEA_Comm make use of KSP solver and PARMETIS. When creating objects for mesh partitioning (MatPartitioningCreate()) and solver (KSPCreate()), I use the new communicator FEA_Comm. Would this have any effect on the command line options? Regards Irfan I am not sure what you are doing, but something is wrong in this code. Using different communicators does not have much to do with options. Any object (like a KSP) can be created using a subcomm of WORLD. Are you setting PETSC_COMM_WORLD to something different? That is not necessary here. Matt Thank you Irfan ----- Original Message ----- From: "Matthew Knepley" < knepley at gmail.com > To: "PETSc users list" < petsc-users at mcs.anl.gov > Sent: Tuesday, November 24, 2009 8:03:00 PM GMT -05:00 US/Canada Eastern Subject: Re: Multiple communicators On Tue, Nov 24, 2009 at 6:59 PM, < irfan.khan at gatech.edu > wrote: Hello Does the procedure for providing command line options and to obtain profiling data through -log_summary change if there are multiple MPI communicators in a petsc code? There is always WORLD, so it should be fine unless I misunderstand your question. Matt If so, can somebody point me to information, references on how to provide command line options if there are multiple communicators? Thank you Irfan -- PhD Candidate G.W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta, GA (30332) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- PhD Candidate G.W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta, GA (30332) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- PhD Candidate G.W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta, GA (30332) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 24 20:51:43 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 24 Nov 2009 20:51:43 -0600 Subject: Multiple communicators In-Reply-To: <252444921.1463431259114172374.JavaMail.root@mail8.gatech.edu> References: <1774530298.1463411259114157468.JavaMail.root@mail8.gatech.edu> <252444921.1463431259114172374.JavaMail.root@mail8.gatech.edu> Message-ID: On Tue, Nov 24, 2009 at 7:56 PM, wrote: > I don't explicitly use the communicator PETSC_COMM_WORLD. 
Instead I used > MPI_COMM_WORLD, which according to the manual should be the same. > > However, I create two new communicator from MPI_COMM_WORLD using > MPI_Comm_split(). LBM_Comm and FEA_Comm. Only the ranks in FEA_Comm make use > of KSP solver and PARMETIS. When creating objects for mesh partitioning > (MatPartitioningCreate()) and solver (KSPCreate()), I use the new > communicator FEA_Comm. Would this have any effect on the command line > options? > No. Matt > Regards > Irfan > > > > I am not sure what you are doing, but something is wrong in this code. > Using different communicators does not > have much to do with options. Any object (like a KSP) can be created using > a subcomm of WORLD. Are you > setting PETSC_COMM_WORLD to something different? That is not necessary > here. > > Matt > > > > >> Thank you >> Irfan >> >> ----- Original Message ----- >> From: "Matthew Knepley" >> To: "PETSc users list" >> Sent: Tuesday, November 24, 2009 8:03:00 PM GMT -05:00 US/Canada Eastern >> Subject: Re: Multiple communicators >> >> On Tue, Nov 24, 2009 at 6:59 PM, wrote: >> >>> Hello >>> >>> Does the procedure for providing command line options and to obtain >>> profiling data through -log_summary change if there are multiple MPI >>> communicators in a petsc code? >>> >> >> There is always WORLD, so it should be fine unless I misunderstand your >> question. >> >> Matt >> >> >>> If so, can somebody point me to information, references on how to provide >>> command line options if there are multiple communicators? >>> >>> Thank you >>> Irfan >>> >>> -- >>> PhD Candidate >>> G.W. Woodruff School of Mechanical Engineering >>> Georgia Institute of Technology >>> Atlanta, GA (30332) >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> -- >> PhD Candidate >> G.W. Woodruff School of Mechanical Engineering >> Georgia Institute of Technology >> Atlanta, GA (30332) >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- > PhD Candidate > G.W. Woodruff School of Mechanical Engineering > Georgia Institute of Technology > Atlanta, GA (30332) > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 24 22:50:02 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 24 Nov 2009 22:50:02 -0600 Subject: Multiple communicators In-Reply-To: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> References: <58356696.1460341259111616362.JavaMail.root@mail8.gatech.edu> Message-ID: On Nov 24, 2009, at 7:13 PM, irfan.khan at gatech.edu wrote: > I use seperete communicators for fluid and solid ranks in a fluid- > structure interaction code. I use PETSc tools (KSP) to solve for the > solid phase (FEA) which is carried out under a different > communicator (FEA_Comm). > > Unless I am doing something wrong, I have found that the command > line options: -ksp_type; -mat_partitioning_type, don't work if there > are 2 communicators. Are you sure that they don't work? Or is it just printing the message? WARNING! There are options you set that were not used! WARNING! 
could be spelling mistake, etc! Option left: name:-ksp_type value: cg Option left: name:-mat_partitioning_type value: parmetis Any option that is not accessed on PROCESS ZERO will be listed here as unused, even if it may be used on some process. I suppose we could/should fix this by adding some complicated communication that determines just the options that are never used. I never liked this warning, only added it after pressure from people who couldn't type. > But for single communicator they work. Please see the attached files > containing output of -log_summary for 2 different codes (scroll down > to the end). One with single communicator and the other with two > communicators. Both the output have been generated with the same > PETSc compilation. > > Futher -log_summary results are not able to detect all the events > that have been registered if they are not in the WORLD communicator. Is your concern here that it does not print the name of the event, it leaves the name blank? If so this is likely the type of problem as the print of options; we don't have a mechanism to get them to process 0 to print. In summary, yes these are some weaknesses of PETSc. If you switch the two communicators you create and use the one that has process 0 of MPI_COMM_WORLD for the PETSc stuff then it will look nicer and hide these two problems. Barry > > > Thank you > Irfan > ----- Original Message ----- > From: "Matthew Knepley" > To: "PETSc users list" > Sent: Tuesday, November 24, 2009 8:03:00 PM GMT -05:00 US/Canada > Eastern > Subject: Re: Multiple communicators > > On Tue, Nov 24, 2009 at 6:59 PM, wrote: > Hello > > Does the procedure for providing command line options and to obtain > profiling data through -log_summary change if there are multiple MPI > communicators in a petsc code? > > There is always WORLD, so it should be fine unless I misunderstand > your question. > > Matt > > If so, can somebody point me to information, references on how to > provide command line options if there are multiple communicators? > > Thank you > Irfan > > -- > PhD Candidate > G.W. Woodruff School of Mechanical Engineering > Georgia Institute of Technology > Atlanta, GA (30332) > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > > -- > PhD Candidate > G.W. Woodruff School of Mechanical Engineering > Georgia Institute of Technology > Atlanta, GA (30332) > <1communicator.out><2communicators.out> From fpacull at fluorem.com Wed Nov 25 10:15:04 2009 From: fpacull at fluorem.com (francois pacull) Date: Wed, 25 Nov 2009 17:15:04 +0100 Subject: PCFieldSplit and Schur In-Reply-To: <87vdgzyb35.fsf@59A2.org> References: <4B0C0DC9.3060602@fluorem.com> <87y6lvygf0.fsf@59A2.org> <4B0C24F0.2070608@fluorem.com> <87vdgzyb35.fsf@59A2.org> Message-ID: <4B0D5808.2080400@fluorem.com> Thanks again for your message. It is true that in the case of a single split of the seven fields between the 5 aerodynamic variables and the two turbulent variables (classical k-omega in our case), the blocksize=5 system still has some kind of stiffness due to the large difference of speed between the fluid motion and the acoustic waves. However, we use some block diagonal scaling with a local preconditioning blocksize=5 matrix (Weiss and Smith preconditioner). We do observe a better convergence rate when using this scaling in order to improve the local condition number. 
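Concretely, the two-way split I have in mind would be driven by options along the following lines. This is only a sketch based on my reading of the PCFieldSplit manual page; the exact option names and the choice of sub-preconditioners are assumptions I still have to test:

  -pc_type fieldsplit
  -pc_fieldsplit_block_size 7
  -pc_fieldsplit_0_fields 0,1,2,3,4    # the 5 aerodynamic variables
  -pc_fieldsplit_1_fields 5,6          # the 2 turbulent variables
  -pc_fieldsplit_type multiplicative
  -fieldsplit_0_pc_type asm            # ASM with ILU on the aerodynamic block
  -fieldsplit_1_pc_type ilu            # cheaper treatment of the bs=2 block

Whether this multiplicative composition or a Schur complement formulation is the right choice is exactly what I hope to find out.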
In general, when we remove the rows and columns corresponding to the two turbulent variables of a system, we get a rather nice behavior of the solver. But when we keep them, a high level of incomplete factorization within the PCASM (which works well with full factorization) is required in order to avoid a complete stagnation of the GMRES. This is why i am trying to split off this bs=2 matrix and find the best way to deal with it, as well as its coupling with the aerodynamic variables matrix. I will definitely give you the feedback of your arguments and the results of this fieldsplit when i get more work done regarding this issue. francois pacull. Jed Brown a ?crit : > On Tue, 24 Nov 2009 19:24:48 +0100, francois pacull wrote: > >> Thank you for your email Jed, this is very helpfull. >> Just in case you would know some specific references: >> the physics we are working with are very common: turbulent low-Mach gas >> dynamics; since part of the systems stiffness is created by the >> turbulent variables, >> > > So this depends somewhat on the turbulence model, but it is normally > some sort of advection-diffusion system. > > >> i am trying to couple a first system with 5 fields (density, momentum, >> energy) and a second one with 2 fields (turbulence), within the >> preconditioner. >> > > Ah, if you split the turbulence variables then you still have to deal > with acoustics in the bs=5 split. > > As an experimental heuristic (one I'm trying to formulate) for how to > split a stiff wave system (at least with a conservative formulation) try > this. Identify the characteristic of your largest eigenvector. If this > involves a single subsystem, say it is > > [0, 0, 0, 1, -1] > > then you should be able to do one multiplicative split ([a,b,c], [d,e]) > as long as you can find a good preconditioner for the [d,e] part. In > this last part, the fastest wave has inter-field coupling, so you could > split this with factorization. Note that you could also have done just > one split ([a,b,c,d],[e]), and this would have little impact on the > Schur complement in the [e] block. > > If instead the characteristic looked like > > [0.3, 0.4, 0.8, 0.4, -1] > > then all the fields participate in the fastest wave, and the natural > (and I hypothesize, mandatory) split would be ([a,b,c,d],[e]). > > > So in light of this framework, let's consider your system. Since the > velocity is much slower than acoustics, you should be able to split off > the turbulence model using relaxation (the advection is only as fast as > the velocity, if the turbulence model is as stiff as acoustics then the > diffusive part must be the culprit, but that is self-coupling). > > Here is one recent paper on JFNK applied to low-mach compressible flow > > @article{park2008jacobian, > title={{Jacobian-free Newton Krylov discontinuous Galerkin method and > physics-based preconditioning for nuclear reactor simulations}}, > author={Park, H.K. and Nourgaliev, R.R. and Martineau, R.C. and Knoll, D.A.}, > year={2008}, > publisher={INL/CON-08-14243, Idaho National Laboratory (INL)}, > url={http://www.osti.gov/bridge/servlets/purl/940059-ITD6J2/940059.pdf} > } > > > >> The motivation is the diminution of the factorization memory >> requirement, compare to when the fields are not split. >> > > Incomplete or full factorization? Does ASM actually work? 
If so, then > I'm skeptical that PCFieldSplit-Schur will gain you anything, you should > be fine to split off the turbelence variables (multigrid might even work > fine for them), and use a cheaper preconditioner on the rest of the > system. > > > Finally, I would be interested to know how you fare. In particular, > whether you find that the heuristic arguments above actually correlate > with efficient algorithms. > > Jed > > From nicolas.aunai at gmail.com Sat Nov 28 03:18:04 2009 From: nicolas.aunai at gmail.com (nicolas aunai) Date: Sat, 28 Nov 2009 10:18:04 +0100 Subject: VecDestroy and memory leak Message-ID: Hi, I have a memory leak in my code, and I think I may have located it (if it is the only one). Unfortunatly I can't find out what to do to fix it. I think I have identified the procedure where the leak is because the current memory usage at the end is not the same at it is at the begining of the procedure, as it should be. This procedure creates some vectors, that are supposed to be destroyed before the end is reached. I've checked that all VecDestroy() functions are called. However, looking at the memory current usage after ALL instructions of the procedure, I've noticed that one of the VecDestroy() I call is not changing the memory usage, meaning that it is not deallocating the vector created before. The corresponding vector is created with the routine : DACreateNaturalVector(), which is successful since a/ I can use without problem the vector created, b/ the level of memory does increase right after it creation. I've looked at the function VecDestroy() definition, it seems to check a PetscObject member called 'refct' before actually free the memory, I've printed this 'refct' for my natural vector and its value is '2', while it is '1' for a vector correctly freed. What should I do ? thx Nico From jed at 59A2.org Sat Nov 28 07:23:31 2009 From: jed at 59A2.org (Jed Brown) Date: Sat, 28 Nov 2009 14:23:31 +0100 Subject: VecDestroy and memory leak In-Reply-To: References: Message-ID: <87ocmmeqb0.fsf@59A2.org> On Sat, 28 Nov 2009 10:18:04 +0100, nicolas aunai wrote: > I've looked at the function VecDestroy() definition, it seems to check > a PetscObject member called 'refct' before actually free the memory, > I've printed this 'refct' for my natural vector and its value is '2', > while it is '1' for a vector correctly freed. The DA holds a reference to this vector so it can give it away the next time you call DACreateNaturalVector(). It's not a leak because the DA will destroy it's reference in DADestroy(). You can run with -malloc_dump to confirm that all PETSc objects have been destroyed. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: From nicolas.aunai at gmail.com Sat Nov 28 08:44:19 2009 From: nicolas.aunai at gmail.com (nicolas aunai) Date: Sat, 28 Nov 2009 15:44:19 +0100 Subject: VecDestroy and memory leak In-Reply-To: <87ocmmeqb0.fsf@59A2.org> References: <87ocmmeqb0.fsf@59A2.org> Message-ID: Hi, ah ok I thought the natural vector was destroyed with VecDestroy each time it is called. I could indeed check what you said in the memory log I'm writing. 
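If I have understood this caching correctly, then a reduced test like the following (only a sketch, with the DA calls written the way I use them in my code) should show no leftover allocation once the DA itself is destroyed:

  Vec n1, n2;
  DACreateNaturalVector(da, &n1);
  VecDestroy(n1);                  /* only drops my reference, the DA keeps its own */
  DACreateNaturalVector(da, &n2);  /* hands back the cached vector, no new allocation */
  VecDestroy(n2);
  DADestroy(da);                   /* the natural vector is really freed here */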
Here is what -malloc_dump is writing at the end of the execution:

[0]Total space allocated 26648 bytes
[ 0]8 bytes PetscStrallocpy() line 82 in src/sys/utils/str.c
      [0] PetscObjectChangeTypeName() line 107 in src/sys/objects/pname.c
      [0] VecCreate_Seq_Private() line 714 in src/vec/vec/impls/seq/bvec2.c
      [0] VecCreate_Seq() line 804 in src/vec/vec/impls/seq/bvec2.c
      [0] VecSetType() line 39 in src/vec/vec/interface/vecreg.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]8 bytes PetscMapSetUp() line 140 in src/vec/vec/impls/mpi/pmap.c
      [0] VecCreate_Seq_Private() line 714 in src/vec/vec/impls/seq/bvec2.c
      [0] VecCreate_Seq() line 804 in src/vec/vec/impls/seq/bvec2.c
      [0] VecSetType() line 39 in src/vec/vec/interface/vecreg.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]24 bytes VecCreate_Seq_Private() line 715 in src/vec/vec/impls/seq/bvec2.c
      [0] VecCreate_Seq() line 804 in src/vec/vec/impls/seq/bvec2.c
      [0] VecSetType() line 39 in src/vec/vec/interface/vecreg.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]25344 bytes VecCreate_Seq() line 809 in src/vec/vec/impls/seq/bvec2.c
      [0] VecSetType() line 39 in src/vec/vec/interface/vecreg.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]40 bytes VecCreate() line 42 in src/vec/vec/interface/veccreate.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]496 bytes VecCreate() line 39 in src/vec/vec/interface/veccreate.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]64 bytes VecCreate() line 39 in src/vec/vec/interface/veccreate.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]656 bytes VecCreate() line 39 in src/vec/vec/interface/veccreate.c
      [0] VecCreateSeq() line 37 in src/vec/vec/impls/seq/vseqcr.c
[ 0]8 bytes PetscCommDuplicate() line 221 in src/sys/objects/tagm.c
      [0] PetscHeaderCreate_Private() line 26 in src/sys/objects/inherit.c
      [0] VecCreate() line 32 in src/vec/vec/interface/veccreate.c
      [0] VecCreateSeqWithArray() line 771 in src/vec/vec/impls/seq/bvec2.c
      [0] DACreate2d() line 338 in src/dm/da/src/da2.c

All this disappears when I comment out the call to the function I suspect to be responsible for my leak. However, I still can't see where I don't free what I've created.
here is the function:

void getseqsol(void)
{
   DA             da;
   Vec            b[3], nsol, bx, by, bz, ssol;
   PetscInt       mx, my, i, j, ij, dof, ija;
   VecScatter     ctx;
   PetscScalar    *bxa, *bya, *bza;
   Vec            sol;
   struct UserCtx *uc = (struct UserCtx *) (*(solv.dmmg))->user;

   sol = DMMGGetx(solv.dmmg);
   PetscObjectQuery((PetscObject) sol, "DA", (PetscObject *) &da);
   DAGetInfo(da, PETSC_IGNORE, &mx, &my, PETSC_IGNORE, PETSC_IGNORE,
             PETSC_IGNORE, PETSC_IGNORE, &dof, PETSC_IGNORE,
             PETSC_IGNORE, PETSC_IGNORE);

   DACreateNaturalVector(da, &nsol);
   DAGlobalToNaturalBegin(da, sol, INSERT_VALUES, nsol);
   DAGlobalToNaturalEnd(da, sol, INSERT_VALUES, nsol);

   VecCreateSeq(PETSC_COMM_SELF, mx*my*dof, &ssol);
   VecScatterCreateToAll(nsol, &ctx, &ssol);
   VecScatterBegin(ctx, nsol, ssol, INSERT_VALUES, SCATTER_FORWARD);
   VecScatterEnd(ctx, nsol, ssol, INSERT_VALUES, SCATTER_FORWARD);
   VecDestroy(nsol);

   VecCreateSeq(PETSC_COMM_SELF, mx*my, &bx);
   VecCreateSeq(PETSC_COMM_SELF, mx*my, &by);
   VecCreateSeq(PETSC_COMM_SELF, mx*my, &bz);
   b[0] = bx; b[1] = by; b[2] = bz;

   VecSetBlockSize(ssol, 3);
   VecStrideGatherAll(ssol, b, INSERT_VALUES);
   VecGetArray(bx, &bxa);
   VecGetArray(by, &bya);
   VecGetArray(bz, &bza);

   if (uc->ipc == 0) {
      for (i = 0; i < uc->nx; i++) {
         for (j = 0; j < uc->ny+1; j++) {
            ij  = i + j*(uc->nx+1);
            ija = i + j*uc->nx;
            uc->s1[ij].c[0] = bxa[ija];
            uc->s1[ij].c[1] = bya[ija];
            uc->s1[ij].c[2] = bza[ija];
         }
      }
      for (j = 0; j < uc->ny+1; j++) {
         ij  = uc->nx + j*(uc->nx+1);
         ija = 0 + j*(uc->nx+1);
         uc->s1[ij].c[0] = uc->s1[ija].c[0];
         uc->s1[ij].c[1] = uc->s1[ija].c[1];
         uc->s1[ij].c[2] = uc->s1[ija].c[2];
      }
   } else if (uc->ipc == 1) {
      for (i = 0; i < uc->nx; i++) {
         for (j = 0; j < uc->ny+1; j++) {
            ij  = i + j*(uc->nx+1);
            ija = i + j*uc->nx;
            uc->s1[ij].b[0] = bxa[ija];
            uc->s1[ij].b[1] = bya[ija];
            uc->s1[ij].b[2] = bza[ija];
         }
      }
      for (j = 0; j < uc->ny+1; j++) {
         ij  = uc->nx + j*(uc->nx+1);
         ija = 0 + j*(uc->nx+1);
         uc->s1[ij].b[0] = uc->s1[ija].b[0];
         uc->s1[ij].b[1] = uc->s1[ija].b[1];
         uc->s1[ij].b[2] = uc->s1[ija].b[2];
      }
   }

   VecRestoreArray(bx, &bxa);
   VecRestoreArray(by, &bya);
   VecRestoreArray(bz, &bza);
   VecDestroy(bx);
   VecDestroy(by);
   VecDestroy(bz);
   VecScatterDestroy(ctx);
   VecDestroy(ssol);
}

The malloc_dump seems to refer to sequential vectors... well I think I'm freeing everything I've created, unless I misunderstood something?

Nico

2009/11/28 Jed Brown :
> On Sat, 28 Nov 2009 10:18:04 +0100, nicolas aunai wrote:
>> I've looked at the function VecDestroy() definition, it seems to check
>> a PetscObject member called 'refct' before actually free the memory,
>> I've printed this 'refct' for my natural vector and its value is '2',
>> while it is '1' for a vector correctly freed.
>
> The DA holds a reference to this vector so it can give it away the next
> time you call DACreateNaturalVector(). It's not a leak because the DA
> will destroy it's reference in DADestroy(). You can run with
> -malloc_dump to confirm that all PETSc objects have been destroyed.
>
> Jed

From jed at 59A2.org Sat Nov 28 09:27:34 2009
From: jed at 59A2.org (Jed Brown)
Date: Sat, 28 Nov 2009 16:53:23 +0100
Subject: VecDestroy and memory leak
In-Reply-To: 
References: <87ocmmeqb0.fsf@59A2.org>
Message-ID: <87aay6vfdl.fsf@59A2.org>

On Sat, 28 Nov 2009 15:44:19 +0100, nicolas aunai wrote:
> VecCreateSeq(PETSC_COMM_SELF, mx*my*dof, &ssol);
> VecScatterCreateToAll(nsol, &ctx, &ssol);
                                   ^^^^^

This is a new vector, the one you create on the line above is lost.
From the man page:

| Do NOT create a vector and then pass it in as the final argument vout! vout is created by this routine
| automatically (unless you pass PETSC_NULL in for that argument if you do not need it).

Jed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: 

From nicolas.aunai at gmail.com Sat Nov 28 09:53:23 2009
From: nicolas.aunai at gmail.com (nicolas aunai)
Date: Sat, 28 Nov 2009 16:53:23 +0100
Subject: VecDestroy and memory leak
In-Reply-To: <87aay6vfdl.fsf@59A2.org>
References: <87ocmmeqb0.fsf@59A2.org> <87aay6vfdl.fsf@59A2.org>
Message-ID: 

ah yes, this is it! thanks a lot.

Nico

2009/11/28 Jed Brown :
> On Sat, 28 Nov 2009 15:44:19 +0100, nicolas aunai wrote:
>> VecCreateSeq(PETSC_COMM_SELF, mx*my*dof, &ssol);
>> VecScatterCreateToAll(nsol, &ctx, &ssol);
>                                    ^^^^^
>
> This is a new vector, the one you create on the line above is lost.
> From the man page:
>
> |     Do NOT create a vector and then pass it in as the final argument vout! vout is created by this routine
> |   automatically (unless you pass PETSC_NULL in for that argument if you do not need it).
>
>
> Jed

From sekikawa at msi.co.jp Sun Nov 29 19:12:11 2009
From: sekikawa at msi.co.jp (Takuya Sekikawa)
Date: Mon, 30 Nov 2009 10:12:11 +0900
Subject: Can SLEPc handle 1mx1m matrix?
Message-ID: <20091130095750.87DB.SEKIKAWA@msi.co.jp>

Dear SLEPc/PETSc team,

In the past I have made an eigenvalue solving program with SLEPc.
At that time the matrix size was at most 10000x10000.
(that program is running with no problem)

Now I need to extend it to handle a 1,000,000x1,000,000 matrix.
(100 times larger than before, in row and column each)
Also, over 99% of the matrix values are zero (sparse matrix),
and I don't need all the eigenvalues/vectors; I only need
several of large magnitude.

I think I could handle it with an indirect method like KRYLOVSCHUR or
LANCZOS, plus using MPI, but I'm not sure if I can.

Can SLEPc handle such a large matrix?

Thanks,
Takuya
---------------------------------------------------------------
    Takuya Sekikawa
    Mathematical Systems, Inc
    sekikawa at msi.co.jp
---------------------------------------------------------------

From bsmith at mcs.anl.gov Sun Nov 29 19:36:19 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 29 Nov 2009 19:36:19 -0600
Subject: Can SLEPc handle 1mx1m matrix?
In-Reply-To: <20091130095750.87DB.SEKIKAWA@msi.co.jp>
References: <20091130095750.87DB.SEKIKAWA@msi.co.jp>
Message-ID: <5F8D8D0B-F0B9-4CF2-9452-6875513E137A@mcs.anl.gov>

Yes,

On Nov 29, 2009, at 7:12 PM, Takuya Sekikawa wrote:

> Dear SLEPc/PETSc team,
>
> In the past I have made an eigenvalue solving program with SLEPc.
> At that time the matrix size was at most 10000x10000.
> (that program is running with no problem)
>
> Now I need to extend it to handle a 1,000,000x1,000,000 matrix.
> (100 times larger than before, in row and column each)
> Also, over 99% of the matrix values are zero (sparse matrix),
> and I don't need all the eigenvalues/vectors; I only need
> several of large magnitude.
>
> I think I could handle it with an indirect method like KRYLOVSCHUR or
> LANCZOS, plus using MPI, but I'm not sure if I can.
>
> Can SLEPc handle such a large matrix?
>
> Thanks,
> Takuya
> ---------------------------------------------------------------
>     Takuya Sekikawa
>     Mathematical Systems, Inc
>     sekikawa at msi.co.jp
> ---------------------------------------------------------------

From foolishzhu at yahoo.com.cn Sun Nov 29 19:41:51 2009
From: foolishzhu at yahoo.com.cn (ming zhu)
Date: Mon, 30 Nov 2009 09:41:51 +0800 (CST)
Subject: How to fill a matrix with a vector parallelly
Message-ID: <432981.94747.qm@web15804.mail.cnb.yahoo.com>

HI
I am trying to solve an eigenvalue problem, A = U * LAMBDA * U'. While SLEPc provides a solver to get an eigenvector and the related eigenvalue, I have to fill the eigenvectors into a matrix to form the eigen matrix. However, there seems to be no function to do this directly. I was trying to use VecGetValues and filled the matrix one by one. However, it cannot do that in parallel.

Is there anyone who can help me?

Thank you

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bsmith at mcs.anl.gov Sun Nov 29 19:53:13 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 29 Nov 2009 19:53:13 -0600
Subject: How to fill a matrix with a vector parallelly
In-Reply-To: <432981.94747.qm@web15804.mail.cnb.yahoo.com>
References: <432981.94747.qm@web15804.mail.cnb.yahoo.com>
Message-ID: 

Since eigenvectors will (almost always) be dense vectors you will want to use an MPIDENSE matrix to store them. So create an MPIDENSE matrix with the same row layout as the eigenvectors, then on each process use VecGetArray() to access that process's part of the vector and MatGetArray() to access that part of the matrix, and copy the values over from the vector array to the matrix array.

Barry

On Nov 29, 2009, at 7:41 PM, ming zhu wrote:

> HI
> I am trying to solve an eigenvalue problem, A = U * LAMBDA * U'. While SLEPc provides a solver to get an eigenvector and the related
> eigenvalue, I have to fill the eigenvectors into a matrix to form the eigen matrix.
However, there seems no such function to do this directly. I was trying to use VecGetValues and filled the matrix one by one. Howevevr, it can not do that in parallel. > >? ? Is there anyone can help me ? > > Thank you > > > ????????????????? ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 29 21:12:59 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Nov 2009 21:12:59 -0600 Subject: How to fill a matrix with a vector parallelly In-Reply-To: <522271.28044.qm@web15805.mail.cnb.yahoo.com> References: <522271.28044.qm@web15805.mail.cnb.yahoo.com> Message-ID: On Nov 29, 2009, at 9:09 PM, ming zhu wrote: > Thank you for your quick reply. > Two questions: > one, how do I know which part belongs to the present processor? Call VecGetOwnershipRange() to know what rows of the vector belong to the process. Then create the MPIDense matrix with the same number of local rows. > second, is there any examples for this kind of question? > > Thank you > > --- 09?11?30????, Barry Smith ? > ?? > > ???: Barry Smith > ??: Re: How to fill a matrix with a vector parallelly > ???: "PETSc users list" > ??: 2009?11?30?,??,??9:53 > > > Since evenvectors will (almost always) be dense vectors you will > want to use a MPIDENSE matrix to store them. So create a MPIDENSE > matrix with the same row layout as the eigenvector then on each > process use VecGetArray() to access that processes part of the > vector and MatGetArray() to access that part of the matrix and copy > the values over from the vector array to the matrix array. > > Barry > > On Nov 29, 2009, at 7:41 PM, ming zhu wrote: > > > HI > > I am trying to solve an eigenvalue problem. A = U * LAMBDA*U'. > While SLEPC proivdes a solver to get an eigenvector and the related > eigenvalue, I have to fill the eigenvector to form the eigen matrix. > However, there seems no such function to do this directly. I was > trying to use VecGetValues and filled the matrix one by one. > Howevevr, it can not do that in parallel. > > > > Is there anyone can help me ? > > > > Thank you > > > > > > ????????????????? > > > ????????????????? From foolishzhu at yahoo.com.cn Sun Nov 29 21:21:12 2009 From: foolishzhu at yahoo.com.cn (ming zhu) Date: Mon, 30 Nov 2009 11:21:12 +0800 (CST) Subject: How to fill a matrix with a vector parallelly In-Reply-To: Message-ID: <196195.59784.qm@web15806.mail.cnb.yahoo.com> OK.may I ask , I think Petsc matrix is row -based. That means,?MatGetArray(Mat mat,PetscScalar *v[])v[0] is the pointer to the first row. v[0][0] is the value of mat[0][0].Am I right?And do you mean that, each time I got a vector, I create a sub matrix to store it.Later on, the merging will be a problem. --- 09?11?30????, Barry Smith ??? ???: Barry Smith ??: Re: How to fill a matrix with a vector parallelly ???: "PETSc users list" ??: 2009?11?30?,??,??11:12 On Nov 29, 2009, at 9:09 PM, ming zhu wrote: > Thank you for your quick reply. > Two questions: > one,? ???how do I know which part belongs to the present processor? ? Call VecGetOwnershipRange() to know what rows of the vector belong to the process. Then create the MPIDense matrix with the same number of local rows. > second, is there any examples for this kind of question? > > Thank you > > --- 09?11?30????, Barry Smith ??? 
> > ???: Barry Smith > ??: Re: How to fill a matrix with a vector parallelly > ???: "PETSc users list" > ??: 2009?11?30?,??,??9:53 > > >???Since evenvectors will (almost always) be dense vectors you will want to use a MPIDENSE matrix to store them. So create a MPIDENSE matrix with the same row layout as the eigenvector then on each process use VecGetArray() to access that processes part of the vector and MatGetArray() to access that part of the matrix and copy the values over from the vector array to the matrix array. > >? ? Barry > > On Nov 29, 2009, at 7:41 PM, ming zhu wrote: > > > HI > > I am trying to solve an eigenvalue problem. A = U * LAMBDA*U'.? While SLEPC proivdes a solver to get an eigenvector and the related eigenvalue, I have to fill the eigenvector to form the eigen matrix. However, there seems no such function to do this directly. I was trying to use VecGetValues and filled the matrix one by one. Howevevr, it can not do that in parallel. > > > >? ? Is there anyone can help me ? > > > > Thank you > > > > > > ????????????????? > > > ????????????????? ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 29 21:33:32 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Nov 2009 21:33:32 -0600 Subject: How to fill a matrix with a vector parallelly In-Reply-To: <196195.59784.qm@web15806.mail.cnb.yahoo.com> References: <196195.59784.qm@web15806.mail.cnb.yahoo.com> Message-ID: <0A98534E-5001-431D-9B82-2429DDB91BF4@mcs.anl.gov> On Nov 29, 2009, at 9:21 PM, ming zhu wrote: > OK. > may I ask , I think Petsc matrix is row -based. That means, > MatGetArray(Mat mat,PetscScalar *v[]) > v[0] is the pointer to the first row. v[0][0] is the value of mat[0] > [0]. > Am I right? No. It is column based and there is just one array for the who thing so v[0:m-1] is the first column v[m:2m-1] is the next column etc > And do you mean that, each time I got a vector, I create a sub > matrix to store it. No. Just create the full matrix and copy over each vector as you get it. Barry > Later on, the merging will be a problem. > > --- 09?11?30????, Barry Smith ? > ?? > > ???: Barry Smith > ??: Re: How to fill a matrix with a vector parallelly > ???: "PETSc users list" > ??: 2009?11?30?,??,??11:12 > > > On Nov 29, 2009, at 9:09 PM, ming zhu wrote: > > > Thank you for your quick reply. > > Two questions: > > one, how do I know which part belongs to the present processor? > > Call VecGetOwnershipRange() to know what rows of the vector belong > to the process. Then create the MPIDense matrix with the same number > of local rows. > > > second, is there any examples for this kind of question? > > > > Thank you > > > > --- 09?11?30????, Barry Smith ? > ?? > > > > ???: Barry Smith > > ??: Re: How to fill a matrix with a vector parallelly > > ???: "PETSc users list" > > ??: 2009?11?30?,??,??9:53 > > > > > > Since evenvectors will (almost always) be dense vectors you will > want to use a MPIDENSE matrix to store them. So create a MPIDENSE > matrix with the same row layout as the eigenvector then on each > process use VecGetArray() to access that processes part of the > vector and MatGetArray() to access that part of the matrix and copy > the values over from the vector array to the matrix array. > > > > Barry > > > > On Nov 29, 2009, at 7:41 PM, ming zhu wrote: > > > > > HI > > > I am trying to solve an eigenvalue problem. 
A = U * LAMBDA*U'. > While SLEPC proivdes a solver to get an eigenvector and the related > eigenvalue, I have to fill the eigenvector to form the eigen matrix. > However, there seems no such function to do this directly. I was > trying to use VecGetValues and filled the matrix one by one. > Howevevr, it can not do that in parallel. > > > > > > Is there anyone can help me ? > > > > > > Thank you > > > > > > > > > ????????????????? > > > > > > ????????????????? > > > ????????????????? From foolishzhu at yahoo.com.cn Sun Nov 29 21:49:52 2009 From: foolishzhu at yahoo.com.cn (ming zhu) Date: Mon, 30 Nov 2009 11:49:52 +0800 (CST) Subject: How to fill a matrix with a vector parallelly In-Reply-To: <0A98534E-5001-431D-9B82-2429DDB91BF4@mcs.anl.gov> Message-ID: <234597.60826.qm@web15802.mail.cnb.yahoo.com> Thank you?I know that you are trying to let me know to match the vector element with the matrix.But it seems, the total number of rows (suppose m) has to satisfy??? m = N * r,where N is the number of process and r is the local range.Am I right? --- 09?11?30????, Barry Smith ??? ???: Barry Smith ??: Re: How to fill a matrix with a vector parallelly ???: "PETSc users list" ??: 2009?11?30?,??,??11:33 On Nov 29, 2009, at 9:21 PM, ming zhu wrote: > OK. > may I ask , I think Petsc matrix is row -based. That means, > MatGetArray(Mat mat,PetscScalar *v[]) > v[0] is the pointer to the first row. v[0][0] is the value of mat[0][0]. > Am I right? ???No. It is column based and there is just one array for the who thing so v[0:m-1] is the first column v[m:2m-1] is the next column etc > And do you mean that, each time I got a vector, I create a sub matrix to store it. No. Just create the full matrix and copy over each vector as you get it. ? Barry > Later on, the merging will be a problem. > > --- 09?11?30????, Barry Smith ??? > > ???: Barry Smith > ??: Re: How to fill a matrix with a vector parallelly > ???: "PETSc users list" > ??: 2009?11?30?,??,??11:12 > > > On Nov 29, 2009, at 9:09 PM, ming zhu wrote: > > > Thank you for your quick reply. > > Two questions: > > one,? ???how do I know which part belongs to the present processor? > >???Call VecGetOwnershipRange() to know what rows of the vector belong to the process. Then create the MPIDense matrix with the same number of local rows. > > > second, is there any examples for this kind of question? > > > > Thank you > > > > --- 09?11?30????, Barry Smith ??? > > > > ???: Barry Smith > > ??: Re: How to fill a matrix with a vector parallelly > > ???: "PETSc users list" > > ??: 2009?11?30?,??,??9:53 > > > > > >???Since evenvectors will (almost always) be dense vectors you will want to use a MPIDENSE matrix to store them. So create a MPIDENSE matrix with the same row layout as the eigenvector then on each process use VecGetArray() to access that processes part of the vector and MatGetArray() to access that part of the matrix and copy the values over from the vector array to the matrix array. > > > >? ? Barry > > > > On Nov 29, 2009, at 7:41 PM, ming zhu wrote: > > > > > HI > > > I am trying to solve an eigenvalue problem. A = U * LAMBDA*U'.? While SLEPC proivdes a solver to get an eigenvector and the related eigenvalue, I have to fill the eigenvector to form the eigen matrix. However, there seems no such function to do this directly. I was trying to use VecGetValues and filled the matrix one by one. Howevevr, it can not do that in parallel. > > > > > >? ? Is there anyone can help me ? > > > > > > Thank you > > > > > > > > > ????????????????? 
> > > > > > ????????????????? > > > ????????????????? ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From foolishzhu at yahoo.com.cn Sun Nov 29 23:36:18 2009 From: foolishzhu at yahoo.com.cn (ming zhu) Date: Mon, 30 Nov 2009 13:36:18 +0800 (CST) Subject: What is the format of binary file ? Message-ID: <751647.13038.qm@web15805.mail.cnb.yahoo.com> HII am trying to convert a sparse matrix from matlab data file to PETSC binary file. While PETSC offer an example ?(\src\mat\examples\tests\ex72.c) to convert Matmarket file to binary. It is too slow for me. My matrix is 1.7 M * 1.7 M with 62 M?non zeros. So, is there any faster way to convert it? Or is it possible to write to it one line by one line ?Thank you ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 29 23:42:58 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Nov 2009 23:42:58 -0600 Subject: What is the format of binary file ? In-Reply-To: <751647.13038.qm@web15805.mail.cnb.yahoo.com> References: <751647.13038.qm@web15805.mail.cnb.yahoo.com> Message-ID: <9F21A19B-7C54-44C3-8CC2-CB8046CC5614@mcs.anl.gov> Look in bin/matlab/ for Matlab scripts that save sparse matrices directly and quickly to PETSc binary format. On Nov 29, 2009, at 11:36 PM, ming zhu wrote: > HI > I am trying to convert a sparse matrix from matlab data file to > PETSC binary file. While PETSC offer an example (\src\mat\examples > \tests\ex72.c) to convert Matmarket file to binary. It is too slow > for me. My matrix is 1.7 M * 1.7 M with 62 M non zeros. So, is there > any faster way to convert it? Or is it possible to write to it one > line by one line ? > Thank you > > > ????????????????? From bsmith at mcs.anl.gov Sun Nov 29 23:44:00 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 29 Nov 2009 23:44:00 -0600 Subject: How to fill a matrix with a vector parallelly In-Reply-To: <234597.60826.qm@web15802.mail.cnb.yahoo.com> References: <234597.60826.qm@web15802.mail.cnb.yahoo.com> Message-ID: <870B06B1-4DDB-4E09-95B9-C2ECC9A4F01F@mcs.anl.gov> On Nov 29, 2009, at 9:49 PM, ming zhu wrote: > Thank you > I know that you are trying to let me know to match the vector > element with the matrix. > But it seems, the total number of rows (suppose m) has to satisfy > m = N * r,where N is the number of process and r is the local > range. > Am I right? Yes, if all the local ranges are the same. In general different processors could have a different number of rows. Barry > > > > --- 09?11?30????, Barry Smith ? > ?? > > ???: Barry Smith > ??: Re: How to fill a matrix with a vector parallelly > ???: "PETSc users list" > ??: 2009?11?30?,??,??11:33 > > > On Nov 29, 2009, at 9:21 PM, ming zhu wrote: > > > OK. > > may I ask , I think Petsc matrix is row -based. That means, > > MatGetArray(Mat mat,PetscScalar *v[]) > > v[0] is the pointer to the first row. v[0][0] is the value of > mat[0][0]. > > Am I right? > > No. It is column based and there is just one array for the who > thing so v[0:m-1] is the first column v[m:2m-1] is the next column etc > > And do you mean that, each time I got a vector, I create a sub > matrix to store it. > > No. Just create the full matrix and copy over each vector as you get > it.. 
> > Barry > > > Later on, the merging will be a problem. > > > > --- 09?11?30????, Barry Smith ? > ?? > > > > ???: Barry Smith > > ??: Re: How to fill a matrix with a vector parallelly > > ???: "PETSc users list" > > ??: 2009?11?30?,??,??11:12 > > > > > > On Nov 29, 2009, at 9:09 PM, ming zhu wrote: > > > > > Thank you for your quick reply. > > > Two questions: > > > one, how do I know which part belongs to the present > processor? > > > > Call VecGetOwnershipRange() to know what rows of the vector > belong to the process. Then create the MPIDense matrix with the same > number of local rows. > > > > > second, is there any examples for this kind of question? > > > > > > Thank you > > > > > > --- 09?11?30????, Barry Smith ? > ?? > > > > > > ???: Barry Smith > > > ??: Re: How to fill a matrix with a vector parallelly > > > ???: "PETSc users list" > > > ??: 2009?11?30?,??,??9:53 > > > > > > > > > Since evenvectors will (almost always) be dense vectors you > will want to use a MPIDENSE matrix to store them. So create a > MPIDENSE matrix with the same row layout as the eigenvector then on > each process use VecGetArray() to access that processes part of the > vector and MatGetArray() to access that part of the matrix and copy > the values over from the vector array to the matrix array. > > > > > > Barry > > > > > > On Nov 29, 2009, at 7:41 PM, ming zhu wrote: > > > > > > > HI > > > > I am trying to solve an eigenvalue problem.. A = U * > LAMBDA*U'. While SLEPC proivdes a solver to get an eigenvector and > the related eigenvalue, I have to fill the eigenvector to form the > eigen matrix. However, there seems no such function to do this > directly. I was trying to use VecGetValues and filled the matrix one > by one. Howevevr, it can not do that in parallel. > > > > > > > > Is there anyone can help me ? > > > > > > > > Thank you > > > > > > > > > > > > ????????????????? > > > > > > > > > ????????????????? > > > > > > ????????????????? > > > ????????????????? From foolishzhu at yahoo.com.cn Mon Nov 30 00:09:06 2009 From: foolishzhu at yahoo.com.cn (ming zhu) Date: Mon, 30 Nov 2009 14:09:06 +0800 (CST) Subject: What is the format of binary file ? In-Reply-To: <9F21A19B-7C54-44C3-8CC2-CB8046CC5614@mcs.anl.gov> Message-ID: <157406.81598.qm@web15808.mail.cnb.yahoo.com> That is great. It is very quick. Thank you --- 09?11?30????, Barry Smith ??? ???: Barry Smith ??: Re: What is the format of binary file ? ???: "PETSc users list" ??: 2009?11?30?,??,??1:42 ? Look in bin/matlab/ for Matlab scripts that save sparse matrices directly and quickly to PETSc binary format. On Nov 29, 2009, at 11:36 PM, ming zhu wrote: > HI > I am trying to convert a sparse matrix from matlab data file to PETSC binary file. While PETSC offer an example? (\src\mat\examples\tests\ex72.c) to convert Matmarket file to binary. It is too slow for me. My matrix is 1.7 M * 1.7 M with 62 M non zeros. So, is there any faster way to convert it? Or is it possible to write to it one line by one line ? > Thank you > > > ????????????????? ___________________________________________________________ ????????????????? http://card.mail.cn.yahoo.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From foolishzhu at yahoo.com.cn Mon Nov 30 23:21:54 2009 From: foolishzhu at yahoo.com.cn (ming zhu) Date: Tue, 1 Dec 2009 13:21:54 +0800 (CST) Subject: How to copy global data to local ? Message-ID: <650026.6302.qm@web15802.mail.cnb.yahoo.com> HiI have a huge matrix U for all processors (PETSC_COMM_WORLD). 
I only want to copy two rows (i,j) to each local processor; i and j are different for each processor and not related to the rank. I was trying to use a vector filter (like 00000010000). However, if the filter vector lives on PETSC_COMM_WORLD, it will be changed by the different processors. So, how can I make this work? Thank you.

My original code is:

Vec filter;
VecCreate(PETSC_COMM_WORLD,&filter);
VecSetSizes(filter,PETSC_DECIDE,m);
VecSetFromOptions(filter);
VecSet(filter,0);
VecSetValue(filter,i,1,INSERT_VALUES);
...
MatMult(U,filter,ui);

If I create filter with PETSC_COMM_SELF, it is impossible to multiply U and filter.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
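One possible approach, sketched here for illustration only (it assumes U is a parallel AIJ matrix with m columns, and that i and j are the two desired global row indices on this process), is MatGetSubMatrices(), which is collective on U's communicator but lets each process request its own rows:

IS       rows, cols;
Mat      *Usub;                /* Usub[0] will be a sequential 2 x m matrix */
PetscInt idx[2];

idx[0] = i; idx[1] = j;        /* different on every process; keep them sorted */
ISCreateGeneral(PETSC_COMM_SELF, 2, idx, &rows);
ISCreateStride(PETSC_COMM_SELF, m, 0, 1, &cols);   /* all columns */
MatGetSubMatrices(U, 1, &rows, &cols, MAT_INITIAL_MATRIX, &Usub);
/* rows i and j of U are now local to this process in Usub[0] */
ISDestroy(rows);
ISDestroy(cols);

Every process must make the MatGetSubMatrices() call, since it is collective, even though each process asks for different rows.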